Specialize optional tensor inputs to graphs in the JIT #18360
Conversation
This specializes optional tensor inputs to either a DimensionedTensorType or, when None is passed, UndefinedTensor (aka AutogradZeroTensorType). This works because we already have different specs, and thus separate execution plans, for the two cases. It enhances shape analysis, because unwrapped optional tensors will now have a DimensionedTensorType with the appropriate shape, requires-grad flag, etc. Also, when combined with "if-pruning" (which I understand #18259 works towards), we actually get much nicer concrete graphs, too.
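For a concrete picture of what "different specs, thus separate plans" means, here is a minimal sketch (illustrative only, loosely mirroring the test added in this PR; the shapes and return values are made up):

```python
import torch
from typing import Optional

@torch.jit.script
def fn(x: Optional[torch.Tensor]) -> int:
    if x is None:
        return 1
    return 0

# Each distinct argument spec gets its own specialized graph/plan:
fn(None)                    # one plan for the None case
fn(torch.randn(2, 3))       # a separate plan for a real tensor
print(fn.graph_for(None))   # graph specialized for the None input
print(fn.graph_for(torch.randn(2, 3)))  # graph with a DimensionedTensorType input
```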
Looks great! I'm going to wait for @wanchaol to review since I have less knowledge of how UndefinedTensor works. I think maybe the mustBeNone() API should check for the UndefinedTensor case now, but I'm not completely sure.
Well, I'll follow up if our patches don't achieve what I have in mind. :)
Also, I'm not completely sure on this either, but Constant Propagation / Constant Pooling might need to be updated to account for this case now too.
Adding the input shapes happens right at the beginning, so I think it would be before the various passes. |
I think this looks good. I will update my other PR so that it handles the Undefined Tensor case. Maybe wait to land until @wanchaol comments
test/test_jit.py (outdated):

        return 0

fn(None)
g = fn.graph_for(None)
Add a check that the output of the function is 1 / 0 ?
Well, I only added a return value so that it wouldn't tell me the function does nothing... :)
Yeah, it's more just a sanity check that the optimizations I'm adding in the other PR don't break this, although I'll wait to land that PR until this one is in, to be safe.
What I would suggest is that we amend the check to see that the graph is only a constant return for each case. :)
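Something along these lines, perhaps (a rough sketch rather than the exact test that landed; `fn` is the scripted function from the sketch above, and this assumes the executor folds the pruned branch away):

```python
import torch

# Sanity-check the concrete outputs for both specializations...
assert fn(None) == 1
assert fn(torch.randn(2, 3)) == 0

# ...and check that each specialized graph has been reduced to a constant
# return, i.e. the branch on the optional input was pruned away.
for arg in (None, torch.randn(2, 3)):
    g = fn.graph_for(arg)
    kinds = [n.kind() for n in g.nodes()]
    assert all(k == "prim::Constant" for k in kinds), kinds
```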
LGTM. Have a quick conversation with @eellison; this might need to deal with other types for shape analysis.
@@ -513,6 +513,12 @@ class ShapePropagator {
        }
        return;
      }
+     case prim::unchecked_unwrap_optional: {
+       if (node->input()->type()->isSubtypeOf(TensorType::get())) {
The more general case here is to set the output to the unwrapped type of the input if it's an optional input, and to the input type if it's not an optional.
I think we need to be a bit more detailed than that about None. So I'd use the following:

- if it's an `IValue` and it `isNone`, leave the output alone,
- if it's an `OptionalType`, return the `elementType`,
- otherwise return the Type.
Well, for unchecked, we could always pass on the type, because it should only be inserted where we know it's not None, so I'll skip the first case there.
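A rough Python rendering of that decision procedure (purely illustrative; the real implementation is the C++ in ShapePropagator, and every helper name below is made up):

```python
def propagate_unwrap_type(node, checked: bool) -> None:
    # checked=False models aten::_unwrap_optional (may see a None at runtime);
    # checked=True models prim::unchecked_unwrap_optional, where the compiler
    # already proved the input is not None, so the None case is skipped.
    if not checked and is_statically_none(node.input):  # hypothetical helper
        return                      # known-None IValue: leave the output alone
    t = input_type(node)            # hypothetical helper
    if is_optional(t):              # Optional[T] -> T
        set_output_type(node, element_type(t))
    else:                           # already unwrapped: pass the type through
        set_output_type(node, t)
```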
@@ -529,6 +535,10 @@ class ShapePropagator {
        return;
      }
+     case aten::_unwrap_optional: {
+       if (node->input()->type()->isSubtypeOf(TensorType::get())) {
Since _unwrap_optional will throw an exception if the input is None, we can do the same thing as with prim::unchecked_unwrap_optional and set the output to the unwrapped type of the input if it's an optional input, and to the input type if it's not an optional.
Same as above. Here, we need the None handling and can use the form that is already in that case.
Conditions as discussed in the review
With correct indenting, the test actually tests...
The CI failures seem unrelated, so if you're happy with it, I'd suggest landing this.
@soumith is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
So Adam thinks that it is preferable to use None instead of UndefinedTensor. I'll follow up with an improvement.
In pytorch#18360, we used an undefined tensor (aka AutogradZeroTensor), but this can be error-prone when the type or value is compared to None, e.g. as seen when combined with the (not yet landed) […]. For this to work, we must allow None to be passed to functions taking Tensor?.
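A hypothetical illustration of the pitfall (not code from the patch): if a passed-in None is represented inside the graph as an undefined tensor, value comparisons against None stop meaning what they do in eager Python.

```python
import torch
from typing import Optional

@torch.jit.script
def has_bias(bias: Optional[torch.Tensor]) -> bool:
    # In eager mode this is a plain identity check. If the JIT specialized
    # a None argument to an undefined *tensor* instead, passes reasoning
    # about this comparison would have to special-case undefined tensors
    # to keep the answer consistent with Python.
    return bias is not None
```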
I'm very confused by what has changed here
@@ -153,7 +153,8 @@ struct ArgumentSpec {
 private:
  TypePtr fillType(TypePtr original, size_t& offset) const {
-   if (original->isSubtypeOf(TensorType::get())) {
+   if (original->isSubtypeOf(TensorType::get())
+       || original->isSubtypeOf(OptionalType::ofTensor())) {
This is quite surprising. Do we seriously translate `None` to an undefined tensor when you call a script function? How can that even work?
Oh, is it because all args are zero-initialized, so you're abusing the fact that this arg is not a tensor, but it will report being undefined? Bleh
if (auto ot = node->input()->type()->cast<OptionalType>()) {
  node->output()->setType(ot->getElementType());
} else {
  node->output()->setType(node->input()->type());
Why would we have those calls on non-optional inputs?
Also, when is that op used?
These calls would be inserted before input types have been specialized. If we have an optional tensor as an input to the graph and we specialize the graph to a non-optional tensor, then these calls will no longer be needed.
When does an unchecked_unwrap_optional ever have a different type than ot->getElementType()? This code looks like a no-op to me. Why is it needed?
  node->output()->setType(ot->getElementType());
} else {
  node->output()->setType(node->input()->type());
}
Why did we add this? How is `_unwrap_optional` different from `unchecked_unwrap_optional`? This code looks really weird, because we're trying to see if we can infer statically that this thing gets a `None` as an argument (which it never should??), and otherwise we assume it doesn't?
Ok. I think we should revert this and probably also abandon the newer patch.
_unwrap_optional is user code: it throws an error if the input is None, and otherwise returns the input. unchecked_unwrap_optional is inserted by the compiler when we can statically reason that the input will never be None (and so it doesn't check whether the input is None).
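Roughly, on the user-code side (a sketch, assuming `torch.jit._unwrap_optional` is exposed under that name):

```python
import torch
from typing import Optional

@torch.jit.script
def add_one(x: Optional[torch.Tensor]) -> torch.Tensor:
    # User-level unwrap: raises at runtime if x is None, and refines the
    # static type from Optional[Tensor] to Tensor for the rest of the body.
    y = torch.jit._unwrap_optional(x)
    return y + 1

add_one(torch.zeros(3))  # fine
# add_one(None)          # would raise: unwrapping a None optional
```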