[discussion] Smarter version of torch.reshape (can avoid realloc in some cases) #28090
Comments
As far as I understand, part of the question is the semantics of view_as (does it take strides into account or not), and the accompanying question of the treatment of -1 (whether C-contiguity of dimensions is enforced regardless of strides).
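A small sketch of the current behaviour this touches on, assuming a transposed (2, 3, 4) tensor like the one used later in the thread:
import torch
a = torch.rand(2, 3, 4).transpose(-1, -2)  # shape (2, 4, 3), strides (12, 1, 4): not C-contiguous
a.reshape(2, -1)  # succeeds, but has to copy the data
a.view(2, -1)     # raises a RuntimeError: the flattened dims span non-contiguous memory for these strides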
Ah, I see your point. The related attempt that exists is the new memory format system, which more and more functions support and which is preserved by some operations. And I'm going to go back on what I said: in your case, the reshape here cannot be done without a copy.
I get your point that if you only do this to perform an element-wise in-place operation on it, that is sub-optimal.
I think we are saying the same thing: the existing reshape / reshape_as have semantics that do not use strides. If one uses strides to be smart and avoid the realloc, the semantics are different (so maybe a new name, or at least a kwarg, would be needed). In practice I stumbled upon this in https://github.com/pytorch/pytorch/pull/27621/files#diff-7541a840cb6478a343d371533cd95a7aL678, which does a reshape in order to do element-wise ops later (and then reshapes back at the very end). It does the reshape to simplify the code and always deal with only one dimension. If the input is transposed and is not logically contiguous in the dimensions that are being flattened, a copy will happen. If this new semantics existed, some methods that use reshape for convenience (or for generalising 1d/2d/3d ops) would probably support the other memory format with fewer reallocations.
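Roughly, the pattern in question looks like the hypothetical sketch below (not the actual PR code; relu stands in for the element-wise work):
import torch
def eltwise_via_flatten(x):
    flat = x.reshape(x.shape[0], -1)  # copies when x is transposed and not logically contiguous here
    out = torch.relu(flat)            # the element-wise work, done over a single flattened dimension
    return out.reshape(x.shape)       # reshape back at the very end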
Is the following what you want?
a = torch.rand(2, 3, 4)
a_ = a.transpose(-1, -2)
a_strided_view = a_.as_strided((2, 12), (12, 1))  # note that the data inside a_strided_view is impossible to interpret except element-wise!
b_ = someltwiseinplacefunc_(a_strided_view)  # this has to be an in-place operation;
# if it returns a new contiguous Tensor, the next line will output nonsense
b = b_.as_strided(a_.size(), a_.stride())  # make these data sensible again by re-using the proper strides
The problem with this approach is that you need to make sure that the intermediary function does not do any copy, otherwise the result won't be correct.
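For concreteness, a minimal runnable version of that round trip, with mul_ assumed as the element-wise in-place op:
import torch
a = torch.rand(2, 3, 4)
a_ = a.transpose(-1, -2)                     # shape (2, 4, 3), strides (12, 1, 4)
expected = a_ * 2                            # reference result, computed before the in-place op
flat = a_.as_strided((2, 12), (12, 1))       # element-wise-only walk over the same storage
flat.mul_(2)                                 # in-place, so no reallocation; the storage is updated
b = flat.as_strided(a_.size(), a_.stride())  # re-apply the transposed size/strides
print(torch.equal(b, expected))              # True
print(b.data_ptr() == a.data_ptr())          # True: still the original storage, no copy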
A scope reduction: I only mean the stride consideration for the case where the flattened area is in fact contiguous in memory. If we can check that it is contiguous, then only the order of the dimensions matters (not the actual stride values), i.e. we can sort the dimensions by stride first, then do a conventional view, then apply whatever operation (including a copying one) we wish, then reshape back, then permute back. Bad pseudo-code:
# if the dimensions that we want to flatten out with -1 are in fact contiguous in memory,
# we can permute them, and then view should succeed
b_ = a_.permute(0, ...dimensions sorted by stride...)
b__ = b_.view(2, -1)
c = somefuncionincludingcopying(b__)
c_ = c.view_as(b_).permute(...order restoring the original dimension order...)
I'll try to write a better snippet.
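A runnable sketch of this permute / view / op / view_as / permute round trip for the (2, 3, 4) example above (the sorted dimension order is hard-coded here rather than computed from the strides):
import torch
a = torch.rand(2, 3, 4)
a_ = a.transpose(-1, -2)             # shape (2, 4, 3), strides (12, 1, 4): view(2, -1) would fail
b_ = a_.permute(0, 2, 1)             # dims sorted by decreasing stride -> strides (12, 4, 1), contiguous
b = b_.view(2, -1)                   # succeeds, no copy
c = b * 2                            # any operation, even a copying one, is fine at this point
c_ = c.view_as(b_).permute(0, 2, 1)  # undo the view and the permute (this permutation is its own inverse)
print(torch.equal(c_, a_ * 2))       # True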
Imagine the following situation: a_ is a transposed view of a contiguous tensor a (e.g. a = torch.rand(2, 3, 4); a_ = a.transpose(-1, -2), as in the snippets above). Currently
a_.reshape(2, -1)
seems to reallocate, but that is not necessary for all operations, especially element-wise ones (given that the result is frequently reshaped back). These things happen in the C++ code sometimes. I guess the current solution is manual handling of strides. If it doesn't reallocate, the semantics seem different, but it may still be worthwhile to allow flattening of contiguous chunks of memory without reallocation, ignoring the stride-induced dimension order.
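A quick check of the reallocation (a sketch using the transposed example above):
import torch
a = torch.rand(2, 3, 4)
a_ = a.transpose(-1, -2)              # strides (12, 1, 4): dims 1 and 2 cannot be flattened by a view
r = a_.reshape(2, -1)
print(r.data_ptr() == a_.data_ptr())  # False: reshape fell back to a copy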