Use deterministic impl of index_put and index backward CPU when `torch.are_deterministic_algorithms_enabled() == True` (#51388)

Summary:
Fixes #51366

Pull Request resolved: #51388

Reviewed By: zou3519

Differential Revision: D26235290

Pulled By: ngimel

fbshipit-source-id: 64cce1a5e75d8a9ce9807c28d641da82ede666e2
kurtamohler authored and facebook-github-bot committed Feb 4, 2021
1 parent f1a63b7 commit c41678f
Showing 2 changed files with 11 additions and 2 deletions.
9 changes: 7 additions & 2 deletions aten/src/ATen/native/cpu/IndexKernel.cpp
```diff
@@ -111,8 +111,13 @@ void index_put_kernel(TensorIterator& iter, IntArrayRef index_size, IntArrayRef
   AT_DISPATCH_ALL_TYPES_AND_COMPLEX_AND3(ScalarType::Half, ScalarType::Bool, ScalarType::BFloat16,
     iter.dtype(), "index_put", [&] {
     if (accumulate) {
-      bool use_parallel_for = ((iter.numel() >= internal::GRAIN_SIZE) && (at::get_num_threads() > 1));
-      if (iter.dtype() == ScalarType::Float && use_parallel_for) {
+      // See Note [Enabling Deterministic Operations]
+      // Parallel cpu_index_kernel with accumulation is nondeterministic, so we
+      // must enable serial execution if deterministic algorithms are enabled.
+      bool is_deterministic = at::globalContext().deterministicAlgorithms();
+      bool use_parallel_for = (!is_deterministic) && (
+        (iter.numel() >= internal::GRAIN_SIZE) && (at::get_num_threads() > 1));
+      if (use_parallel_for && iter.dtype() == ScalarType::Float) {
         cpu_index_kernel<float>(iter, index_size, index_stride, [](char* dst, char* src, int64_t offset) {
           cpu_atomic_add_float((float*)(dst + offset), *(float*)src);
         });
```
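For illustration, here is a minimal Python sketch (not part of the commit) of the behavior this kernel change governs: with deterministic algorithms enabled, CPU `index_put` with ``accumulate=True`` takes the serial path instead of the parallel atomic-float-add path, so repeated runs give bit-identical results. The tensor values below are made up for the example.

```python
import torch

# Opt in to deterministic algorithms; per this commit, the CPU
# index_put kernel with accumulate=True then skips the parallel
# atomic-add path and executes serially.
torch.use_deterministic_algorithms(True)
assert torch.are_deterministic_algorithms_enabled()

x = torch.zeros(4)
indices = (torch.tensor([0, 0, 1, 3]),)  # duplicate index 0 makes accumulation order matter
values = torch.tensor([0.1, 0.2, 0.3, 0.4])

# With parallel atomic adds, the order of the two additions into x[0]
# could vary between runs; the serial kernel makes it reproducible.
out = x.index_put(indices, values, accumulate=True)
print(out)  # tensor([0.3000, 0.3000, 0.0000, 0.4000])
```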
4 changes: 4 additions & 0 deletions torch/__init__.py
```diff
@@ -353,6 +353,10 @@ def use_deterministic_algorithms(d):
         * :class:`torch.nn.ConvTranspose2d` when called on CUDA tensor
         * :class:`torch.nn.ConvTranspose3d` when called on CUDA tensor
         * :func:`torch.bmm` when called on sparse-dense CUDA tensors
+        * :func:`torch.__getitem__` backward when `self` is a CPU tensor and
+          ``indices`` is a list of tensors
+        * :func:`torch.index_put` with ``accumulate=True`` when called on a CPU
+          tensor

     The following normally-nondeterministic operations will throw a
     :class:`RuntimeError` when `d=True`:
```
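A hedged sketch of the other newly documented case: the backward of advanced indexing (`torch.__getitem__`) on a CPU tensor accumulates gradients through the same index_put path, so it too becomes deterministic under the flag. The shape and indices here are illustrative only.

```python
import torch

torch.use_deterministic_algorithms(True)

# Advanced indexing with a tensor index; its backward scatters
# gradients into x with accumulation, which hits the CPU index_put
# code path this commit makes serial under the flag.
x = torch.randn(8, requires_grad=True)
idx = torch.tensor([1, 1, 5])  # duplicate indices trigger accumulation
x[idx].sum().backward()
print(x.grad)  # grad[1] == 2.0, grad[5] == 1.0, all other entries 0
```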
