Relax some limitations of InferenceMode. #54403
Conversation
doesn't bump version. [ghstack-poisoned]
💊 CI failures summary and remediations (as of commit 972e766; more details on the Dr. CI page):

🕵️ 2 new failures recognized by patterns. The following CI failures do not appear to be due to upstream breakages.
… long as it" doesn't bump version. [ghstack-poisoned]
I think there's a correctness problem with erroring on bumping, versus erroring on read. From correspondence:
Also, zeroing out the VC in the TensorImpl constructor seems a bit smelly to me, because that's not where the final setting of the VC gets done anyway for views (it can't be, because we need to share the VC in that situation).
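For illustration, a rough sketch of the two behaviors being weighed here, assuming inference tensors carry a disabled (empty) version counter. This is not the PR's code: the free function bump_version_checked and its error message are made up for the example; TensorImpl::is_inference_tensor(), TensorImpl::bump_version() and c10::InferenceMode::is_enabled() are the existing accessors this stack works with.

```
#include <c10/core/InferenceMode.h>
#include <c10/core/TensorImpl.h>
#include <c10/util/Exception.h>

// Option A ("error on bump"): an in-place op fails the moment it tries to
// bump the version counter of an inference tensor outside InferenceMode.
void bump_version_checked(c10::TensorImpl* impl) {
  TORCH_CHECK(
      !impl->is_inference_tensor() || c10::InferenceMode::is_enabled(),
      "Inplace update to inference tensor outside InferenceMode is not allowed.");
  if (!impl->is_inference_tensor()) {
    impl->bump_version();  // normal tensors bump their version counter as usual
  }
}

// Option B ("error on read"): the bump silently no-ops for inference tensors,
// and the error is raised only later, when autograd reads the counter to
// check whether a saved tensor was mutated.
```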
A few important points about InferenceMode behavior:

1. Inference tensors can only be created inside InferenceMode. But not all tensors created in InferenceMode are inference tensors: a view of a normal tensor inside InferenceMode produces a normal tensor, which is exactly the same as creating a view inside NoGradMode.
2. All ops are allowed inside InferenceMode, and run faster than in normal mode.
3. An inference tensor cannot be in-place updated outside InferenceMode.
4. It's not allowed to take views of an inference tensor outside InferenceMode. (This can be relaxed if needed. See the comment in the generated InplaceOrView_x.cpp.)

```
// Theoretically we can allow view ops on inference tensors in normal mode
// as long as we also mark the output as an inference tensor in this kernel.
// But it'll break an invariant we currently have: inference tensors can
// only be created inside InferenceMode.
// This invariant makes inference tensors easier for users to understand,
// so we should only break it when there's a valid use case.
TORCH_CHECK(!self.unsafeGetTensorImpl()->is_inference_tensor(),
    "Calling view ops on inference tensor outside InferenceMode is not allowed, ",
    "consider doing it inside inference mode to work around this error. ",
    "If you have a valid use case, please make a feature request to PyTorch.");
```

[ghstack-poisoned]
A few important points about InferenceMode behavior:

1. Inference tensors can only be created inside InferenceMode. But not all tensors created in InferenceMode are inference tensors: a view of a normal tensor inside InferenceMode produces a normal tensor, which is exactly the same as creating a view inside NoGradMode.
2. All ops are allowed inside InferenceMode, and run faster than in normal mode.
3. An inference tensor cannot be in-place updated outside InferenceMode.
4. It's not allowed to take views of an inference tensor outside InferenceMode. (This can be relaxed if needed. See the comment in the as_view() function in VariableTypeUtils.h.)

```
// Theoretically we can allow view ops on inference tensors in normal mode
// as long as we also mark the output as an inference tensor in this kernel.
// But it'll break an invariant we currently have: inference tensors can
// only be created inside InferenceMode.
// This invariant makes inference tensors easier for users to understand,
// so we should only break it when there's a valid use case.
TORCH_CHECK(!self.unsafeGetTensorImpl()->is_inference_tensor(),
    "Calling view ops on inference tensor outside InferenceMode is not allowed, ",
    "consider doing it inside inference mode to work around this error. ",
    "If you have a valid use case, please make a feature request to PyTorch.");
```

ghstack-source-id: c520df2
Pull Request resolved: #54403

[ghstack-poisoned]
A few important points about InferenceMode behavior:

1. All tensors created in InferenceMode are inference tensors, except for the outputs of view ops.
   - View ops produce outputs with the same is_inference_tensor property as their input. Namely, a view of a normal tensor inside InferenceMode produces a normal tensor, which is exactly the same as creating a view inside NoGradMode, and a view of an inference tensor outside InferenceMode produces an inference tensor.
2. All ops are allowed inside InferenceMode, and run faster than in normal mode.
3. Inference tensors cannot be saved for backward.

[ghstack-poisoned]
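For concreteness, a minimal usage sketch of the points above, not part of the PR. It is written against the libtorch C++ API and uses the TensorImpl-level is_inference_tensor() accessor this PR's code refers to (later releases expose the same bit as Tensor::is_inference()); the printed values follow the rules as stated, assuming this snapshot behaves as described.

```
#include <c10/core/InferenceMode.h>
#include <torch/torch.h>
#include <iostream>

int main() {
  // A normal tensor, created outside InferenceMode.
  torch::Tensor normal = torch::ones({2, 2});

  torch::Tensor inf, view_of_normal;
  {
    c10::InferenceMode guard;           // RAII guard enabling InferenceMode
    inf = torch::ones({2, 2});          // point 1: freshly allocated -> inference tensor
    view_of_normal = normal.view({4});  // point 1: view of a normal tensor stays normal
  }

  // Point 1, second half: a view of an inference tensor taken outside
  // InferenceMode is itself an inference tensor.
  torch::Tensor view_of_inf = inf.view({4});

  std::cout << inf.unsafeGetTensorImpl()->is_inference_tensor() << "\n";            // 1
  std::cout << view_of_normal.unsafeGetTensorImpl()->is_inference_tensor() << "\n"; // 0
  std::cout << view_of_inf.unsafeGetTensorImpl()->is_inference_tensor() << "\n";    // 1

  // Point 3: using `inf` in an op that needs to save it for backward is an error.
  return 0;
}
```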
```
@@ -401,7 +415,9 @@ void TensorImpl::copy_tensor_metadata(
     const c10::VariableVersion& version_counter,
     bool allow_tensor_metadata_change) {
   copy_tensor_metadata_except_version_counter(src_impl, dest_impl, allow_tensor_metadata_change);
-  dest_impl->set_version_counter(version_counter);
+  if (!dest_impl->is_inference_tensor()) {
+    dest_impl->set_version_counter(version_counter);
```
Perhaps this is another one for subsequent refactoring, but this also looks a bit suspicious, because if I'm copying the VC from source to dest, then if source was an inference tensor the VC is empty and it's harmless to set version counter here. This should be a consequence of https://github.com/pytorch/pytorch/pull/54403/files#r607472380
I think one reason this might not hold is this indirect call of copy_tensor_metadata:
```
Tensor VariableHooks::variable_data(const Tensor& self) const {
  TORCH_CHECK(self.defined(), "cannot call variable_data() on undefined tensor");
  auto self_impl_copy = self.unsafeGetTensorImpl()->shallow_copy_and_detach(
      /*version_counter=*/0,
      /*allow_tensor_metadata_change=*/false);
```

so this is freshly creating a new version counter and would trigger the error here. This is kind of suspicious anyway, because we don't actually want to allocate a version counter and then throw it out, so maybe just testing whether self is an inference tensor in this function would solve the problem.
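A rough sketch of that suggestion, purely illustrative rather than this PR's code: allocate a fresh version counter only when self is not an inference tensor, and otherwise reuse its (disabled) counter. The tail of the function, set_autograd_meta(nullptr) and the return, is filled in from the upstream implementation and may differ at this commit.

```
Tensor VariableHooks::variable_data(const Tensor& self) const {
  TORCH_CHECK(self.defined(), "cannot call variable_data() on undefined tensor");
  // Inference tensors keep their disabled (empty) version counter; normal
  // tensors get a fresh counter starting at version 0.
  c10::VariableVersion vc = self.unsafeGetTensorImpl()->is_inference_tensor()
      ? self.unsafeGetTensorImpl()->version_counter()
      : c10::VariableVersion(/*version=*/0);
  auto self_impl_copy = self.unsafeGetTensorImpl()->shallow_copy_and_detach(
      /*version_counter=*/vc,
      /*allow_tensor_metadata_change=*/false);
  self_impl_copy->set_autograd_meta(nullptr);
  return at::Tensor(self_impl_copy);
}
```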
Yea, this one will be considered in the subsequent refactor. Essentially we have some code calling shallow_copy_and_detach and it's all suspicious. E.g. https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/variable.h#L688 is called when you jit load a model inside InferenceMode, which causes this check to fail.
In as_view we also call shallow_copy_and_detach, so the follow-up refactor will focus on rationalizing the uses of shallow_copy_and_detach and the VariableVersions passed from there.
```
@@ -401,7 +415,9 @@ void TensorImpl::copy_tensor_metadata(
     const c10::VariableVersion& version_counter,
     bool allow_tensor_metadata_change) {
   copy_tensor_metadata_except_version_counter(src_impl, dest_impl, allow_tensor_metadata_change);
-  dest_impl->set_version_counter(version_counter);
+  if (!dest_impl->is_inference_tensor()) {
```
It is surprising to have some inference-specific logic in here, no?
Why can't this setter just be a no-op for such Tensors?
Yea, the reason I keep it here is that this is the only callsite left that calls set_version_counter() on an inference tensor with an enabled target version counter, and it requires a larger refactor to get rid of. In the ideal end state, set_version_counter on an inference tensor is valid as long as the target version_counter is also disabled. I'll add a TODO here!
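For illustration, a sketch of that ideal end state, not the code in this PR: set_version_counter rejects only an enabled target counter on an inference tensor, so passing a disabled VariableVersion is fine and the callsite above would need no special case.

```
void TensorImpl::set_version_counter(const c10::VariableVersion& version_counter) {
  // Installing an *enabled* counter on an inference tensor is still an error;
  // a disabled (empty) VariableVersion is accepted as a harmless assignment.
  TORCH_CHECK(
      !(is_inference_tensor() && version_counter.enabled()),
      "Cannot set an enabled version_counter on an inference tensor");
  version_counter_ = version_counter;
}
```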
```
torch::Tensor out = view_op(inference_tensor);  // go through kernels: InplaceOrView, CPU
ASSERT_TRUE(is_inference_tensor(out));
ASSERT_FALSE(out.requires_grad());
ASSERT_FALSE(out.is_view());
```
Why is out not a view? Didn't we go through the InplaceOrView kernel?
We do go through the InplaceOrView kernel, but note it's a no-op if the base is an inference tensor.
I think this is better since it matches what you get from taking a view of an inference tensor inside InferenceMode (is_view=false). Since the output is an inference tensor we don't use the ViewMeta anyway, so it's much cleaner to just not create it.
(Another reason is that if we had a ViewMeta in autograd_meta, it would suddenly make the tensor requires_grad, which is super confusing to users.)
I guess it's ok to skip all the view logic for these.
This change is starting to get very intrusive and hard to think about...

(Another reason is that if we had a ViewMeta in autograd_meta, it would suddenly make the tensor requires_grad, which is super confusing to users.)

Not sure about that. This is only true if the base does require gradients.
A few important points about InferenceMode behavior:

1. All tensors created in InferenceMode are inference tensors, except for views of normal tensors.
   - View ops produce outputs with the same is_inference_tensor property as their input. Namely, a view of a normal tensor inside InferenceMode produces a normal tensor, which is exactly the same as creating a view inside NoGradMode, and a view of an inference tensor outside InferenceMode produces an inference tensor.
   - All inference tensors have requires_grad=False and is_leaf=True.
2. All ops are allowed inside InferenceMode, and run faster than in normal mode.
3. Inference tensors cannot be saved for backward.
4. Inference tensors don't have a version counter.
5. There's no way to change an existing tensor from normal to inference, or vice versa.
6. A leaf inference tensor with requires_grad=true can still have gradients.

ghstack-source-id: f74582a
Pull Request resolved: #54403

Differential Revision: [D27316483](https://our.internmc.facebook.com/intern/diff/D27316483)

[ghstack-poisoned]
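A small complementary sketch, again illustrative rather than from the PR, exercising points 1 and 4 and the requires_grad/is_leaf bullet via the TensorImpl-level accessors named above:

```
#include <c10/core/InferenceMode.h>
#include <torch/torch.h>
#include <cassert>

int main() {
  torch::Tensor t;
  {
    c10::InferenceMode guard;
    t = torch::randn({3});                      // point 1: inference tensor
  }
  auto* impl = t.unsafeGetTensorImpl();
  assert(impl->is_inference_tensor());          // created under InferenceMode
  assert(!t.requires_grad() && t.is_leaf());    // requires_grad=False, is_leaf=True
  assert(!impl->version_counter().enabled());   // point 4: no enabled version counter
  return 0;
}
```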
Upcoming refactors will make this better, but I think this is good enough to go today
Summary: Pull Request resolved: pytorch#54403

A few important points about InferenceMode behavior:

1. All tensors created in InferenceMode are inference tensors, except for the outputs of view ops.
   - View ops produce outputs with the same is_inference_tensor property as their input. Namely, a view of a normal tensor inside InferenceMode produces a normal tensor, which is exactly the same as creating a view inside NoGradMode, and a view of an inference tensor outside InferenceMode produces an inference tensor.
2. All ops are allowed inside InferenceMode, and run faster than in normal mode.
3. Inference tensors cannot be saved for backward.

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D27316483

Pulled By: ailzhang

fbshipit-source-id: e03248a66d42e2d43cfe7ccb61e49cc4afb2923b
Stack from ghstack:
Differential Revision: D27316483