
Long term plan for adding operators #36

Closed
ezyang opened this issue Aug 3, 2017 · 2 comments

ezyang commented Aug 3, 2017

Right now, we have been adding operators to the AST one by one to support the various operations we are interested in. This doesn't scale long term: every new operation you add is another case that must be handled in every IR_IF case statement. IR_IF works well if you don't have too many forms in your language, but not if you need a separate subkind for every single one of PyTorch's operations!

The object-oriented way to solve this problem is to keep subclasses, stop writing IR_IFs, and move methods into the class definition. In GHC, we solved this problem differently, by maintaining an environment of "known" ops and recording with each of them the information necessary to do analyses and optimizations (e.g., their types, strictness, unfoldings, etc.). I think this latter approach scales better with lots of ops (and is similar to how NNVM structures its operators: https://github.com/dmlc/nnvm/blob/master/docs/overview.md), but we should make a decision one way or the other before we start adding tons and tons of operator definitions to fill out our PyTorch support.
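For concreteness, the "environment of known ops" approach might look roughly like the sketch below: a single registry maps operator names to metadata, and analyses consult the registry instead of switching on node subkind. This is a minimal illustration only; the names `OpInfo`, `KNOWN_OPS`, and `can_differentiate` are assumptions for the example, not actual project APIs.

```python
from dataclasses import dataclass

# Hypothetical per-op metadata record: the kind of information an
# optimizer would look up instead of hard-coding per-operator cases.
@dataclass
class OpInfo:
    arity: int            # number of inputs the op takes
    differentiable: bool  # whether a derivative is known for it

# The "environment" of known ops: op name -> metadata.
# Entries here are illustrative.
KNOWN_OPS = {
    "sigmoid": OpInfo(arity=1, differentiable=True),
    "add":     OpInfo(arity=2, differentiable=True),
}

def is_known_op(name: str) -> bool:
    return name in KNOWN_OPS

# An analysis consults the registry rather than an IR_IF case statement,
# so adding a new op means adding a registry entry, not editing every pass.
def can_differentiate(name: str) -> bool:
    info = KNOWN_OPS.get(name)
    return info is not None and info.differentiable
```

With this shape, adding PyTorch's many operators grows the table rather than every case statement.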

Unlike NNVM, one thing I do NOT advocate is making this information public (at least for now). So, no user-written ops that support fusion. That's a very tricky API to get right, and there should be some concerted design before we let people rely on it.


ezyang commented Aug 3, 2017

One thing to note: until we start porting operators into direct C++, translation of an operator like Sigmoid into a plain IR node is a pessimization, because the only way we can actually run it is by reconstructing the appropriate Python function and then running it.


ezyang commented Aug 14, 2017

@zdevito, @killeent, and I have decided that we will solve this problem by introducing a new Primitive node type, which corresponds directly to ToffeeIR nodes (semantics defined by ToffeeIR). User-defined autograd Functions which correspond to a ToffeeIR node will define how they convert into ToffeeIR, and the optimizer will work directly with ToffeeIR nodes.

See also https://fb.quip.com/bg3MAJe0MStd
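The Primitive idea described above can be sketched roughly as follows: one generic node kind whose semantics are given by the ToffeeIR operator it names, plus a hook on an autograd Function that defines its own lowering. This is purely illustrative; the names `Primitive` and `to_toffee_ir` are assumptions for the example, not actual project APIs.

```python
class Primitive:
    """A single generic IR node kind; its semantics are defined by the
    ToffeeIR operator it names, not by a per-operator subclass."""
    def __init__(self, op_name, inputs):
        self.op_name = op_name  # e.g. "Sigmoid", interpreted by ToffeeIR
        self.inputs = inputs

class SigmoidFunction:
    """A user-defined autograd Function that corresponds to a ToffeeIR
    node declares how it converts; the optimizer then works directly
    with the resulting Primitive nodes."""
    @staticmethod
    def to_toffee_ir(inputs):
        return Primitive("Sigmoid", inputs)
```

Under this scheme the optimizer only ever needs to understand Primitive, and per-operator knowledge lives in the ToffeeIR definition and the Function's conversion hook.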

ezyang closed this as completed Sep 11, 2017