
Specialize Optional[T] to T (or subtype for Tensor) or None when executing graph #18407

Open · wants to merge 19 commits into base: master
Conversation

t-vi (Collaborator) commented Mar 24, 2019

This patch specializes Optional[Tensor] graph inputs to either a DimensionedTensorType (if a Tensor is passed) or NoneType. Other Optional[T] inputs are specialized to T or None.
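As a rough illustration of what gets specialized (a minimal, hypothetical sketch; the function and calls below are made up for exposition and are not taken from the patch or its tests):

import torch
from typing import Optional

@torch.jit.script
def masked_sum(x: torch.Tensor, mask: Optional[torch.Tensor], dim: Optional[int]) -> torch.Tensor:
    # Both optional arguments enter the graph typed Optional[Tensor] / Optional[int].
    if mask is not None:
        x = x * mask
    if dim is not None:
        x = x.sum(dim)
    return x

x = torch.randn(2, 3)

# With this patch, the ArgumentSpec records per call whether each optional is
# present, and the corresponding graph inputs are specialized:
#   mask -> DimensionedTensorType (a 2-dim float tensor here) or NoneType
#   dim  -> int or NoneType
masked_sum(x, torch.ones(2, 3), 1)
masked_sum(x, None, None)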

Functional changes:

  • For unwrapping (checked and unchecked) we need to keep the output type, since IR code that follows the unwrap may not work with NoneType (just as it doesn't deal with Optional). While that code would not be hit during execution, it would run against the (legitimate) assumptions of the analysis passes (a small sketch follows this list).
  • Function lookup currently will not match NoneType when it expects Optional (I'm not entirely sure why this doesn't lead to unhappiness currently, but hey). I amend this at the level of the function-matching code (operator.cpp), but see Adam's comments. We would run into trouble if we needed to select between functions whose signatures differ only in Optional types with different subtypes, but we would have the same problem when calling them directly, so I think this is OK.
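A rough sketch of where such an unwrap comes from (hypothetical example; the exact node the compiler emits for the refinement may differ):

import torch
from typing import Optional

@torch.jit.script
def unwrap_example(x: Optional[torch.Tensor]) -> torch.Tensor:
    if x is not None:
        # Inside this branch x is refined from Optional[Tensor] to Tensor; in
        # the IR this refinement is an unwrap whose output type must stay
        # Tensor (not NoneType), so the ops below keep seeing the type they expect.
        return x + 1
    return torch.zeros(1)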

Advantages:

  • It would enable throwing away branches we can't hit. This also reduces the "blockiness" of the graph, so it may be easier to apply optimizations (e.g. fuse things inside if t is None: ... and outside the if; see the sketch after the disadvantages list).
  • Tensors passed to Optional[Tensor] arguments will get shape information, which is very handy.
  • It gets rid of the problem that tensors passed into Optional arguments get requires_grad set erroneously (#18270), though that also affects lists, which aren't fixed here.
  • Optional[List[int]] is needed for #18697.

Disadvantages:

  • We're changing typing in a more subtle way than the TensorType->DimensionedTensorType specialization.
  • In particular, specializing to NoneType loses the Type information captured in the OptionalType element type.
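A sketch of the branch-pruning opportunity from the first advantage above (hypothetical example; whether a given pass actually removes the dead arm is up to the optimizer):

import torch
from typing import Optional

@torch.jit.script
def scale(x: torch.Tensor, alpha: Optional[torch.Tensor]) -> torch.Tensor:
    if alpha is None:
        y = x * 2.0
    else:
        y = x * alpha
    # Once the graph input for `alpha` is specialized to NoneType (or to a
    # tensor type), only one arm of the prim::If above is reachable, so the If
    # can in principle be dropped and the surviving multiplication fused with
    # the one below instead of sitting in a separate block.
    return y * 3.0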
Specialize Optional (Tensor) to None when executing graph
In #18360, we used an undefined Tensor (aka AutogradZeroTensor),
but this can be error-prone when the type or value is compared
to None, e.g. as seen when combined with the (not yet landed)

For this to work, we must allow None to be passed to functions
taking Tensor?.
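For illustration (hypothetical snippet, not part of the patch): aten ops such as conv2d declare their bias as Tensor? in the schema, so a graph input specialized to None must still match such arguments:

import torch

@torch.jit.script
def conv_no_bias(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    # conv2d's schema takes `Tensor? bias`; a None value flowing in here has
    # to be accepted by schema matching even when it is no longer typed Optional.
    return torch.conv2d(x, weight, None)

conv_no_bias(torch.randn(1, 3, 8, 8), torch.randn(4, 3, 3, 3))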
Resolved review threads: torch/csrc/jit/fuser/fallback.cpp, torch/csrc/jit/operator.cpp, torch/csrc/jit/passes/shape_analysis.cpp, torch/csrc/jit/passes/specialize_autogradzero.cpp

t-vi changed the title from "Specialize Optional (Tensor) to None when executing graph" to "[WIP] Specialize Optional (Tensor) to None when executing graph" on Mar 24, 2019

t-vi (Collaborator, Author) commented Mar 24, 2019

Given the rather fundamental nature of the shortcomings of this approach, I'd have a tendency to drop it.

facebook-github-bot added a commit that referenced this pull request Mar 25, 2019

Revert "Specialize optional tensor inputs to graphs in the JIT (#18360)" (#18411)

Summary:
This reverts commit 7cc7ed1.

I think it's better to sort out the issues raised in #18407 first. I'm sorry for not stopping it earlier.
Pull Request resolved: #18411

Differential Revision: D14594937

Pulled By: soumith

fbshipit-source-id: 3c90b7fa7694e2f59e55607acecde4a47af801ea

t-vi changed the title from "[WIP] Specialize Optional (Tensor) to None when executing graph" to "Specialize Optional (Tensor) to None when executing graph" on Mar 25, 2019

eellison (Contributor) commented Apr 2, 2019

Is this the relevant PR for specializing Optional tensors as graph input?

t-vi (Collaborator, Author) commented Apr 2, 2019

eellison (Contributor) left a comment

Looks good to me, just a few minor nits. I think someone with better knowledge of the specialize_autogradzero pass should review before we land. @apaszke @zdevito

Resolved review threads: test/test_jit.py, torch/csrc/jit/argument_spec.h, torch/csrc/jit/ir.cpp, torch/csrc/jit/passes/peephole.cpp, torch/csrc/jit/passes/shape_analysis.cpp, torch/csrc/jit/operator.cpp
eellison (Contributor) commented Apr 2, 2019

I'm not sure that this case will necessarily be hit, but just to be safe, could you modify the operator for aten::_unwrap_optional so that it checks for the undefined-tensor case?

Edit: if we only encounter undefined tensors in the backward pass, then maybe it's not needed.

apaszke (Member) left a comment

I think I finally understand where a lot of confusion is coming from, and why AD has regressed in weird ways. See my comment in argument_spec.h.

Resolved review threads: torch/csrc/jit/operator.cpp, torch/csrc/jit/passes/shape_analysis.cpp, torch/csrc/jit/argument_spec.h, torch/csrc/jit/ir.cpp
t-vi (Collaborator, Author) commented Apr 3, 2019

So after merging master (including the ArgumentSpecCreator), I distinguish between an undefined tensor and None by adding an extra flag is_none_ instead of abusing defined_. As is_tensor_ was dropped, I added a new is_nontensor_optional_ (so that zeroing the struct still covers the tensor case).
I actually have the non-tensor case here now (it used to be part of #18697); maybe it fits better here.

Note that before this patch, Optional[Tensor] wasn't specialized at all; now we specialize into cases for the various dimensions and None. On the other hand, Optional[Tuple[...]] and Optional[Class[...]] are only looked at in terms of None / value, not in terms of their contents (but that wasn't done before either, and I don't want to complicate things even more).
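A sketch of the non-tensor case (illustrative only; the Optional[List[int]] shape is the pattern #18697 needs, and the function name is made up):

import torch
from typing import List, Optional

@torch.jit.script
def mean_over(x: torch.Tensor, dims: Optional[List[int]]) -> torch.Tensor:
    # Under this patch, the `dims` input is specialized per call to List[int]
    # or NoneType. An Optional[Tuple[...]] or Optional[Class[...]] input would
    # only be split into the None / non-None cases, without looking inside.
    if dims is None:
        return x.mean()
    return x.mean(dims)

mean_over(torch.randn(2, 3), [0])
mean_over(torch.randn(2, 3), None)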

t-vi changed the title from "Specialize Optional (Tensor) to None when executing graph" to "Specialize Optional[T] to T or None when executing graph" on Apr 4, 2019

t-vi changed the title from "Specialize Optional[T] to T or None when executing graph" to "Specialize Optional[T] to T (or subtype for Tensor) or None when executing graph" on Apr 4, 2019

t-vi removed the "needs discussion" label on Apr 7, 2019

t-vi (Collaborator, Author) commented Apr 7, 2019

So I think we end up with the same thing in _jit_pass_complete_shape_analysis. There was a (pre-existing) bug in the ArgumentSpec test: it compared a std::moved spec to a fresh one.
I could split out the removal of CompleteArgumentSpec if that's preferable. It's not strictly needed, but I felt bad having two differently working ArgumentSpec/CompleteArgumentSpec implementations around when I needed to touch the first.

eellison requested a review from zdevito on Apr 9, 2019

zdevito (Contributor) left a comment

This is a really cool PR, but there are a couple of changes that introduce subtle bugs in the type system. See inline comments for how to avoid them.

Resolved review threads: test/cpp/jit/test_argument_spec.h, torch/csrc/jit/argument_spec.cpp, torch/csrc/jit/argument_spec.h, torch/csrc/jit/operator.cpp, torch/csrc/jit/passes/peephole.cpp
@@ -518,6 +518,19 @@ class ShapePropagator {
        }
        return;
      }
      case prim::unchecked_unwrap_optional: {

zdevito (Contributor) commented Apr 10, 2019

Input to unchecked_unwrap_optional should always be an optional type. Things will break if it isn't

t-vi (Collaborator, Author) commented Apr 10, 2019

Well, no. If we specialize optionals, it might be the element type instead.

t-vi (Collaborator, Author) commented Apr 10, 2019

I changed it to test whether the input is a subtype of the output and, if so, take the input type. This allows Optional[Tensor]s that have been specialized to DimensionedTensorType to get the more precise output type.

Compared to previous versions, I eliminated the None checking and purely check that if we have a subtype of T as input to y : T = unwrap(x : Optional[T]), then we set the output type to the actual type of x. I formulated it that way to be sure that Optional[Optional[T]] isn't dangerous here (whether or not it makes sense); it's just an isSubtypeOf instead of checking for false on cast.
I hope this case is OK.

@@ -540,10 +553,17 @@ class ShapePropagator {
        return;
      }
      case aten::_unwrap_optional: {
        auto input_ivalue = toIValue(node->input());
        if (input_ivalue && input_ivalue->isNone()) {
          // During analysis we don't want to pass None through here

zdevito (Contributor) commented Apr 10, 2019
same here

t-vi (Collaborator, Author) commented Apr 10, 2019
Again, it might be the element (sub-)type after specialization.

      auto ot = (*input_stack.back()++)->expect<OptionalType>();
      auto& arg = spec.at(arg_spec_offset++);
      if (arg.isNone()) {
        result_stack.back().emplace_back(NoneType::get());

zdevito (Contributor) commented Apr 10, 2019
This is the source of the problems with NoneType. Explicit Nones are represented as:

a : Optional[int] = prim::Constant()

in the graph. NoneType is only used as an intermediate in the compiler. So instead of trying to change the input to NoneType, I suggest disconnecting the input that must be None from its uses and replacing it with the appropriate None constant. This will require changing the API here to ArgumentSpecCreator::specializeTypes(Graph& graph, ...) rather than simply returning new types. This change is fine to make, and will end up helping changes that I may need to make to argspec in the future.
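(As an aside, a minimal scripted example of such an explicit None, purely for illustration; the function names are made up and the exact printed IR may differ by version:)

import torch
from typing import Optional

@torch.jit.script
def add_opt(x: torch.Tensor, y: Optional[torch.Tensor]) -> torch.Tensor:
    if y is None:
        return x
    return x + y

@torch.jit.script
def caller(x: torch.Tensor) -> torch.Tensor:
    # The explicit None literal below ends up as a prim::Constant() in
    # caller's graph whose output carries an optional type, as described above.
    return add_opt(x, None)

print(caller.graph)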

t-vi (Collaborator, Author) commented Apr 10, 2019
This has been the thing I've been missing. Thank you so much!

The remaining not-so-nice thing here is that I don't know how to specialize Optionals to a typed None constant inside tuples and objects, because I cannot use replaceAllUses there.
My tentative conclusion is that I cannot do this currently, unless I can add some attribute to the OptionalType to indicate that it is actually a typed None constant.
What could be done is to look at which uses are unpacking it, but that's really ugly and would probably remain incomplete.

t-vi (Collaborator, Author) commented Apr 10, 2019

Thanks @zdevito, it'll make a much shinier patch once I implement your advice.

So one thing I'd need a yes/no on is: Do you want _jit_pass_complete_shape_analysis to move to ArgumentSpecCreator (the "no" option being to continue to use CompleteArgumentSpec)?

t-vi added some commits Apr 10, 2019

t-vi (Collaborator, Author) commented Apr 10, 2019

So I think this is much better now. I left open the two things where I think I might not be completely following the advice to the letter (in the unwrapping) or where I see a limitation.

I left _jit_pass_complete_shape_analysis using the new mechanism and left CompleteArgumentSpec unchanged (except for fixing one test).

zdevito self-requested a review on Apr 16, 2019

zdevito (Contributor) commented Apr 18, 2019
zdevito commented Apr 18, 2019

@t-vi I've been swamped this week, so I haven't gotten to look at this change again, but I just wanted to let you know I haven't forgotten about it and will try to get to it as soon as I can.
