
Conversation

yf225 (Contributor) commented Mar 28, 2019

After the Variable/Tensor merge, code paths in ATen need to be able to check whether a tensor requires gradient, and throw errors in places where a `requires_grad=true` tensor is not allowed (such as https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/Utils.h#L76-L78 and https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/SparseTensorImpl.cpp#L86). Since the `GradMode` thread-local variable controls whether a tensor should accumulate gradients, we need to be able to check this variable from ATen when we determine whether a tensor requires gradient, hence this PR to move `GradMode` / `AutoGradMode` / `NoGradGuard` to ATen.
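For illustration, here is a minimal sketch of the kind of check such call sites need once `at::GradMode` is available from ATen: a tensor effectively requires gradient only when its `requires_grad` flag is set *and* gradient mode is enabled. The helper name and header path below are assumptions, not code from the linked files.

```cpp
// Minimal sketch, assuming the post-move header lands at ATen/core/grad_mode.h.
// check_no_requires_grad is a hypothetical helper; the real checks in
// ATen/Utils.h and ATen/SparseTensorImpl.cpp may be written differently.
#include <ATen/ATen.h>
#include <ATen/core/grad_mode.h>

void check_no_requires_grad(const at::Tensor& t, const char* name) {
  // A tensor truly requires gradient only if requires_grad is set AND the
  // thread-local gradient mode is on (i.e. we are not under NoGradGuard /
  // torch.no_grad()).
  if (t.requires_grad() && at::GradMode::is_enabled()) {
    AT_ERROR(name, " is not allowed to require grad here");
  }
}
```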

Note that we intentionally don't merge `at::GradMode` and `at::NonVariableTypeMode`, with the following reasoning:
Semantically, `at::GradMode` and `at::NonVariableTypeMode` actually mean different things: `at::GradMode` controls whether a tensor should accumulate gradients, and `at::NonVariableTypeMode` controls whether a Variable should be treated as a non-Variable tensor in type dispatches. There are places where we *don't* want the tensor to accumulate gradients, but *still* want the Variable to be treated as a Variable. Here is one example:

```python
#  torch/tensor.py
with torch.no_grad():
   ...
   new_tensor = self.new()    # `at::GradMode` is false at this point
   ...
```
```cpp
// tools/autograd/templates/python_variable_methods.cpp
static PyObject * THPVariable_new(PyObject* self, PyObject* args, PyObject* kwargs)
{
  ...
  // If we merged `at::GradMode` and `at::NonVariableTypeMode`, then since `at::GradMode`
  // is false here and `self_.type()` would check `at::GradMode` to decide whether to
  // return a non-Variable type, this call would get a non-Variable type, which is not
  // what we want (and it throws a "Tensor that was converted to Variable was not
  // actually a Variable" error).
  return THPVariable_Wrap(torch::utils::legacy_tensor_new(self_.type(), args, kwargs));
  ...
}
```

For the above reason, we cannot merge `at::GradMode` and `at::NonVariableTypeMode`, as they have different purposes.
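To make the distinction concrete, here is an illustrative sketch (not code from this PR) showing the two thread-local flags toggling independently. It assumes the post-move header `ATen/core/grad_mode.h` and that `at::NonVariableTypeMode` is reachable via `ATen/core/LegacyTypeDispatch.h`.

```cpp
// Illustrative sketch only; header locations are assumptions (see lead-in).
#include <ATen/core/grad_mode.h>
#include <ATen/core/LegacyTypeDispatch.h>
#include <iostream>

int main() {
  {
    at::NoGradGuard no_grad;  // turns the thread-local at::GradMode off in this scope
    std::cout << at::GradMode::is_enabled() << "\n";            // 0: no gradient tracking
    std::cout << at::NonVariableTypeMode::is_enabled() << "\n"; // 0: tensors still dispatch as Variables
  }
  // AutoGradMode is the RAII guard underlying no_grad/enable_grad; here it
  // re-enables gradient mode explicitly after the NoGradGuard scope ends.
  at::AutoGradMode grad_on(/*enabled=*/true);
  std::cout << at::GradMode::is_enabled() << "\n";              // 1
  return 0;
}
```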

@yf225 yf225 requested review from ezyang and gchanan March 28, 2019 17:57
yf225 (Contributor, Author) commented Mar 28, 2019

Note from offline discussion with @gchanan : we should look into how to avoid moving GradMode to ATen, and change places where this move is expected.

ezyang (Contributor) commented Mar 29, 2019

> Note that we intentionally don't merge `at::GradMode` and `at::NonVariableTypeMode`, with the following reasoning

This would make a really good Note in the source code

ezyang (Contributor) left a comment

okey dokey

gchanan (Contributor) left a comment

the note on the offline discussion says we don't actually need this, right?

@yf225 yf225 closed this Apr 16, 2019
@yf225 yf225 reopened this Jul 5, 2019
@yf225 yf225 force-pushed the move_grad_mode_to_aten branch from 1fe8022 to b0f096d Compare July 5, 2019 18:34
@pytorchbot pytorchbot added caffe2 module: autograd Related to torch.autograd, and the autograd engine in general module: build Build system issues module: cpp Related to C++ API module: internals Related to internal abstractions in c10 and ATen labels Jul 5, 2019
@yf225 yf225 changed the title Move GradMode / AutoGradMode / NoGradGuard to ATen Move GradMode / AutoGradMode / NoGradGuard to ATen core Jul 5, 2019
yf225 (Contributor, Author) commented Jul 5, 2019

#22473 requires that we check `torch::autograd::GradMode::is_enabled()` in the Caffe2 tensor's `enforce_invariants()` method in `caffe2/core/tensor.cc`. Since `caffe2/core/tensor.cc` is in Caffe2 core, @ljk53 suggested that we move `torch::autograd::GradMode` from `torch/csrc/autograd` to ATen core, so that core code doesn't need to depend on non-core code (i.e. `torch/csrc/autograd`), which avoids build-structure headaches.
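As a rough sketch of that dependency direction (the function below is hypothetical; the real `enforce_invariants()` in `caffe2/core/tensor.cc` may look different), core code can include the ATen-core header rather than anything under `torch/csrc/autograd`:

```cpp
// Sketch of the include/dependency direction only, not the actual Caffe2 code.
#include <ATen/ATen.h>
#include <ATen/core/grad_mode.h>            // OK: ATen core, usable from Caffe2 core
// #include <torch/csrc/autograd/grad_mode.h>  // avoided: core would depend on non-core
#include <c10/util/Exception.h>

void enforce_invariants_sketch(const at::Tensor& t) {
  // Reject tensors that would record autograd history if Caffe2 mutated them.
  TORCH_CHECK(
      !(t.requires_grad() && at::GradMode::is_enabled()),
      "Caffe2 cannot operate on a tensor that requires grad");
}
```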

facebook-github-bot (Contributor) left a comment: @yf225 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


@yf225 yf225 force-pushed the move_grad_mode_to_aten branch from f8739c0 to d6adf6d Compare July 6, 2019 00:52
facebook-github-bot (Contributor) left a comment: @yf225 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot (Contributor) commented: @yf225 merged this pull request in 221af09.

zdevito pushed a commit to zdevito/ATen that referenced this pull request Jul 6, 2019
Pull Request resolved: pytorch/pytorch#18573

Differential Revision: D16134413

Pulled By: yf225

fbshipit-source-id: 6140347e78bc54206506499c264818eb693cdb8a

