Re-enabled mm+add tree batching in the JIT #13228
torch/csrc/jit/passes/batch_mm.cpp
Outdated
batch_mm->insertBefore(root.node);
root.node->output()->replaceAllUsesWith(batch_mm->output());
auto matmuls = root.removeTransposesAndGatherMatmuls();
WithInsertPoint insert_gurad {root.node};
RegisterOperators mm_tree_reduction_reg({
  Operator(
    Symbol::prim("MMTreeReduce"),
lgtm!
@zou3519 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
}

bool shape_is_fast(const at::Tensor& lhs, const at::Tensor& rhs) {
  size_t l = lhs.size(0);
I've had to generously increase the range of the CreateADSubgraphs pass, because even though it collapses the RNN loop to a single differentiable subgraph and a few other nodes, the range uses the distances in the original graph...
cc @zdevito @zou3519
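For readers new to the optimization named in the title, here is a minimal sketch of the identity that mm+add tree batching relies on: a sum of matrix products A0*B0 + A1*B1 + ... equals one product of the left operands concatenated along columns with the right operands concatenated along rows. This is not the code from batch_mm.cpp. `Matrix`, `matmul`, `add`, `cat_cols`, `cat_rows`, and `tree_batching_matches` are hypothetical names that use plain `std::vector` as a stand-in for `at::Tensor`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major dense matrix; an assumed stand-in for at::Tensor in this sketch.
using Matrix = std::vector<std::vector<long>>;

// Plain triple-loop matrix product; a.cols must equal b.rows.
Matrix matmul(const Matrix& a, const Matrix& b) {
  const std::size_t n = a.size(), k = b.size(), m = b[0].size();
  Matrix out(n, std::vector<long>(m, 0));
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t p = 0; p < k; ++p)
      for (std::size_t j = 0; j < m; ++j)
        out[i][j] += a[i][p] * b[p][j];
  return out;
}

// Elementwise sum of two matrices of the same shape.
Matrix add(const Matrix& a, const Matrix& b) {
  Matrix out = a;
  for (std::size_t i = 0; i < a.size(); ++i)
    for (std::size_t j = 0; j < a[i].size(); ++j)
      out[i][j] += b[i][j];
  return out;
}

// Concatenate along columns (dim 1): [A0 A1 ...] becomes one wide matrix.
Matrix cat_cols(const std::vector<Matrix>& ms) {
  Matrix out(ms[0].size());
  for (const Matrix& m : ms)
    for (std::size_t i = 0; i < m.size(); ++i)
      out[i].insert(out[i].end(), m[i].begin(), m[i].end());
  return out;
}

// Concatenate along rows (dim 0): [B0; B1; ...] becomes one tall matrix.
Matrix cat_rows(const std::vector<Matrix>& ms) {
  Matrix out;
  for (const Matrix& m : ms)
    out.insert(out.end(), m.begin(), m.end());
  return out;
}

// Compares the unbatched add-of-mm tree against the single batched mm.
bool tree_batching_matches() {
  const Matrix a0 = {{1, 2}, {3, 4}}, a1 = {{0, 1}, {1, 0}}, a2 = {{2, 0}, {0, 2}};
  const Matrix b0 = {{1, 0}, {0, 1}}, b1 = {{5, 6}, {7, 8}}, b2 = {{1, 1}, {1, 1}};

  // Unbatched tree: (A0*B0 + A1*B1) + A2*B2, i.e. three mm nodes and two adds.
  const Matrix tree = add(add(matmul(a0, b0), matmul(a1, b1)), matmul(a2, b2));

  // Batched form: two concatenations followed by a single mm.
  const Matrix batched = matmul(cat_cols({a0, a1, a2}), cat_rows({b0, b1, b2}));

  return tree == batched;
}
```

Calling `tree_batching_matches()` checks that both forms agree on a small integer example. Whether the batched form is actually faster depends on the operand shapes, which is presumably what the `shape_is_fast` check in the diff is meant to decide.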