[doc] Add note about torch.flip returning new tensor and not view. #50041
Conversation
💊 CI failures summary and remediations

As of commit f6eb4cd (more details on the Dr. CI page):

🕵️ 2 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:
- pytorch_linux_xenial_py3_clang7_onnx_ort_test2 (1/2)
  Step: "Run tests" (full log | diagnosis details | 🔁 rerun)
Codecov Report

@@           Coverage Diff           @@
##           master   #50041   +/-   ##
=======================================
  Coverage   80.70%   80.70%
=======================================
  Files        1904     1904
  Lines      206597   206597
=======================================
  Hits       166742   166742
  Misses      39855    39855
torch/_torch_docs.py
Outdated
@@ -8332,6 +8332,11 @@ def merge_dicts(*dicts):

 Reverse the order of a n-D tensor along given axis in dims.

 .. note::
     Unlike `np.flip`, `torch.flip` returns a new Tensor.
Let's tweak and simplify this just a bit.
"`torch.flip` makes a copy of :attr:`input`'s data. This is different from `np.flip`, which returns a view."
There are a couple reasons I want to suggest this change:
- it starts with torch.flip, no knowledge of NumPy required
- the phrase "a new tensor" may be ambiguous; a view appears like "a new tensor," too. We want to be explicit that torch.flip copies the data.
- it removes the time complexity discussion. That discussion is correct and relevant, but a little too detailed since it's not the only implication of this behavior. For example, the write semantics also change, but the level of detail of discussing the time complexity suggests that a change like that would be mentioned, too. Best simplify and avoid it.
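The write-semantics difference mentioned above can be illustrated from the NumPy side; a minimal sketch using only NumPy, with torch.flip's copying behavior described in comments rather than executed:

```python
import numpy as np

a = np.arange(6)
v = np.flip(a)           # np.flip returns a view; no data is copied
assert np.shares_memory(a, v)

v[0] = 99                # writing through the view mutates the original
assert a[-1] == 99
# torch.flip, by contrast, copies the data, so writing to its result
# leaves the original tensor unchanged.
```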
Then this can be a template to substitute flip{ud, lr} into. Does that sound good to you, @kshitij12345?
The first two points make sense.
I think it would be good to mention the time complexity, as it is a major gotcha: #16424 (comment)
OK that's a good point.
Big O notation seems like a lot of machinery to bring to this documentation, though. Could we sneak in the performance discussion with something like (I'm also going to add an explicit reference to NumPy since "np" in the above is not defined):
"This is different from NumPy's np.flip, which quickly returns a view."

If that's too subtle, what about avoiding big O by doing:

"This is different from NumPy's np.flip, which returns a view. Since copying a tensor's data is more work than viewing that data, torch.flip is expected to be slower than np.flip."
I believe flip would be used heavily on the CV side in augmentation pipelines, so it might be better to mention the time complexity.
Big O notation seems like a lot of machinery to bring to this documentation, though.
Agreed.
I would prefer

"This is different from NumPy's np.flip, which returns a view. Since copying a tensor's data is more work than viewing that data, torch.flip is expected to be slower than np.flip."

since "which quickly returns a view" seems a bit ambiguous as to what is quick and what isn't.
So change would be
.. note:: This is different from NumPy's np.flip, which returns a view. Since copying a tensor's data is more work than viewing that data, torch.flip is expected to be slower than np.flip.
Just a side note: the numpy.flip documentation does mention that it is a constant-time operation. (Maybe we can say torch.flip is linear-time with respect to the number of elements in the tensor.)
https://numpy.org/doc/stable/reference/generated/numpy.flip.html
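To make the constant-time claim concrete, a small NumPy-only sketch: the reversal in np.flip is encoded purely in the array's strides, so no elements are copied regardless of size (the torch.flip comparison is described in comments, not executed):

```python
import numpy as np

a = np.arange(1_000_000)
v = np.flip(a)
# The reversal lives entirely in the (negative) stride; no element
# is touched, so the cost is independent of the array's size.
assert v.strides == (-a.strides[0],)
assert np.shares_memory(a, v)
# torch.flip has no such shortcut: it materializes a new tensor,
# which is linear in the number of elements.
```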
".. note:: torch.flip makes a copy of :attr:`input`'s data. This is different from NumPy's np.flip, which returns a view in constant time. Since copying a tensor's data is more work than viewing that data, torch.flip is expected to be slower than np.flip."

Merging your suggestion about constant time and adding back the first sentence. How does this look?
Yup. This feels better. Thanks!
@mruberry PTAL :) Failure is irrelevant.
torch/_torch_docs.py
Outdated
@@ -8365,6 +8370,11 @@ def merge_dicts(*dicts):

 Note:
     Equivalent to input[:,::-1]. Requires the array to be at least 2-D.
I'm now noticing these notes. I think the advanced-indexing suggestions in both the fliplr and flipud notes will actually throw errors in PyTorch? Would you check? If so, it seems like we should remove these notes.
You are correct indeed.
>>> import torch
>>> t = torch.ones((3,2))
>>> t[:, ::-1]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: step must be greater than zero
>>> t[::-1, ...]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: step must be greater than zero
>>> t[::1, ...]
tensor([[1., 1.],
[1., 1.],
[1., 1.]])
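For reference, the NumPy side of the old notes' equivalence does hold; it is only the negative-step slice that PyTorch rejects. A small NumPy-only check (the torch equivalents are noted in comments, not executed here):

```python
import numpy as np

m = np.arange(6).reshape(3, 2)
# In NumPy, the notes' claims are accurate: fliplr/flipud are the
# ::-1 column/row slices.
assert (np.fliplr(m) == m[:, ::-1]).all()
assert (np.flipud(m) == m[::-1, ...]).all()
# PyTorch raises "step must be greater than zero" for these slices
# (as shown above), so the working equivalents there are
# torch.fliplr(t) / torch.flipud(t), i.e. torch.flip(t, dims=[1]) /
# torch.flip(t, dims=[0]).
```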
Updating it to
Note:
Requires the array to be at least 2-D.
On second look should we make it,
Note:
Requires the tensor to be at least 2-D.
That would be more appropriate. Sure.
Cool! Thanks @kshitij12345, and thanks for fixing the old and incorrect notes.
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Reference: #38271