Re-using tensor storage when possible #664
Merged
Conversation
coreylowman force-pushed the phantom-tensor branch from 085ec54 to c8e6b7c on April 5, 2023 13:44

coreylowman changed the title from [WIP] Re-using tensor storage when possible to Re-using tensor storage when possible on Apr 5, 2023
For operations that are made up of many sub-operations, like batchnorm, we currently allocate new data for each sub-operation. This notably doesn't use the ownership model of Rust. For instance, calling `t.add(0.1).sqrt()` can modify `t`'s buffer in place: the backward call for scalar addition doesn't even need to keep a reference to `t`, and `sqrt` can keep a reference to its output.

Summary

- Adds `GhostTensor`, which holds id/length/shape/strides, but importantly NOT a reference to the data.
- Updates `Gradients` to use `GhostTensor` for all internal methods.
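To make this concrete, here is a minimal sketch of the idea. The field types and the `UniqueId` stand-in are assumptions for illustration; the real dfdx type is generic over shape and dtype:

```rust
// Hypothetical stand-in for a unique tensor id.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct UniqueId(u64);

/// Holds id/length/shape/strides, but importantly NOT a reference to the
/// data, so storing a GhostTensor never keeps an allocation alive.
#[derive(Clone, Debug)]
struct GhostTensor {
    id: UniqueId,        // which gradient slot this tensor maps to
    len: usize,          // number of elements in the (absent) data buffer
    shape: Vec<usize>,   // logical dimensions
    strides: Vec<usize>, // per-dimension strides into the buffer
}

fn main() {
    // A ghost for a contiguous 2x3 tensor: gradient bookkeeping can key off
    // `id` and allocate `len` elements later, without touching the data now.
    let ghost = GhostTensor {
        id: UniqueId(0),
        len: 6,
        shape: vec![2, 3],
        strides: vec![3, 1],
    };
    println!("{ghost:?}");
}
```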
Unary Operations
In general there are three cases for unary operations:

1. The derivative doesn't need the input or the output (e.g. scalar addition).
2. The derivative only needs the output (e.g. `sqrt`).
3. The derivative needs the input.

For case 1 we can re-use the input data and not keep a reference to the output. For case 2, we can re-use the input buffer and keep a reference to the output. Case 3 is what happens now: we allocate & keep a reference to the input.
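To illustrate how unique ownership enables the reuse in cases 1 and 2, here is a rough sketch. The helper name `unary_reuse_or_alloc` and the `Arc<Vec<f32>>` storage are assumptions for illustration, not dfdx's actual API:

```rust
use std::sync::Arc;

/// Hypothetical helper: apply a unary op, reusing the input allocation when
/// we are the sole owner of the buffer, and falling back to a fresh
/// allocation otherwise (which is case 3's behavior).
fn unary_reuse_or_alloc(buf: Arc<Vec<f32>>, f: impl Fn(f32) -> f32) -> Arc<Vec<f32>> {
    match Arc::try_unwrap(buf) {
        // Sole owner: mutate in place, no new allocation.
        Ok(mut data) => {
            for x in data.iter_mut() {
                *x = f(*x);
            }
            Arc::new(data)
        }
        // Someone else (e.g. a gradient tape) still holds the buffer,
        // so we must allocate a new output.
        Err(shared) => Arc::new(shared.iter().map(|&x| f(x)).collect()),
    }
}

fn main() {
    let t = Arc::new(vec![0.9f32, 3.9]);
    // t.add(0.1).sqrt() from the description, written against the helper:
    let out = unary_reuse_or_alloc(t, |x| (x + 0.1).sqrt());
    println!("{out:?}"); // ~[1.0, 2.0], computed without a second allocation
}
```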
Binary Operations
There are only two cases for binary operations:

1. The derivative doesn't need the input data (e.g. `add` and `sub`).
2. The derivative needs the input data.

For case 1, we can re-use the input buffers and not keep a reference to the output. This actually saves a lot, because `add` and `sub` are very common operations. We do have to be careful about broadcasts, so we can actually only reuse an input buffer if it's contiguous.

There is also a heuristic of trying to use an input buffer with only one reference, since we can pick between either the lhs or the rhs; a sketch of that selection follows.
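A sketch of that selection logic, under the same assumed `Arc<Vec<f32>>` storage (the enum and function names here are hypothetical, not dfdx's actual code):

```rust
use std::sync::Arc;

#[derive(Debug, PartialEq)]
enum Reuse {
    Lhs,
    Rhs,
    Allocate,
}

/// Reuse an input buffer for the output only if it is contiguous
/// (broadcasted inputs are not) and has exactly one reference.
fn pick_output_buffer(
    lhs: &Arc<Vec<f32>>,
    lhs_contiguous: bool,
    rhs: &Arc<Vec<f32>>,
    rhs_contiguous: bool,
) -> Reuse {
    if lhs_contiguous && Arc::strong_count(lhs) == 1 {
        Reuse::Lhs
    } else if rhs_contiguous && Arc::strong_count(rhs) == 1 {
        Reuse::Rhs
    } else {
        Reuse::Allocate
    }
}

fn main() {
    let lhs = Arc::new(vec![1.0f32; 4]);
    let rhs = Arc::new(vec![2.0f32; 4]);
    let rhs_alias = Arc::clone(&rhs); // rhs now has 2 references
    // lhs is uniquely owned and contiguous, so it wins.
    assert_eq!(pick_output_buffer(&lhs, true, &rhs, true), Reuse::Lhs);
    drop(rhs_alias);
}
```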
Results
Here are some current results on my dev CPU for batchnorm2d & softmax.
CPU
cargo bench --bench batchnorm2d
:cargo bench --bench softmax
:A10 GPU
cargo bench -F cuda --bench batchnorm2d
:cargo bench -F cuda --bench softmax
: