[doc] Add note about torch.flip returning new tensor and not view. #50041

Closed

15 changes: 15 additions & 0 deletions torch/_torch_docs.py
@@ -8332,6 +8332,11 @@ def merge_dicts(*dicts):

Reverse the order of an n-D tensor along given axis in dims.

.. note::
    Unlike `np.flip`, `torch.flip` returns a new Tensor.
Collaborator

Let's tweak and simplify this just a bit.

"`torch.flip` makes a copy of :attr:`input`'s data. This is different from `np.flip`, which returns a view."

There are a couple of reasons I want to suggest this change:

  • it starts with torch.flip, so no knowledge of NumPy is required
  • the phrase "a new tensor" may be ambiguous; a view appears like "a new tensor," too. We want to be explicit that torch.flip copies the data.
  • it removes the time complexity discussion. That discussion is correct and relevant, but a little too detailed, since it isn't the only implication of this behavior. For example, the write semantics also change (see the sketch below), and discussing the time complexity in that much detail suggests a change like that would be mentioned, too. Best to simplify and avoid it.

Then this can be a template to substitute flip{ud, lr} into. Does that sound good to you, @kshitij12345?
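For readers following along, a minimal sketch of the write-semantics difference mentioned above (illustrative only, not part of the proposed docstring):

import numpy as np
import torch

a = np.arange(6).reshape(2, 3)
a_flipped = np.flip(a, axis=1)        # view: shares memory with `a`
a_flipped[0, 0] = 100
print(a[0, 2])                        # 100 -- writing through the view modified `a`

t = torch.arange(6).reshape(2, 3)
t_flipped = torch.flip(t, dims=[1])   # copy: independent storage
t_flipped[0, 0] = 100
print(t[0, 2])                        # tensor(2) -- `t` is untouched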

Collaborator Author

The first two points make sense.

I think it would be good to mention the time complexity, as it is a major gotcha: #16424 (comment)

Collaborator

OK that's a good point.

Big O notation seems like a lot of machinery to bring to this documentation, though. Could we sneak in the performance discussion with something like (I'm also going to add an explicit reference to NumPy since "np" in the above is not defined):

"This is different from NumPy's np.flip, which quickly returns a view."

If that's too subtle, what about avoiding big O by doing:

"This is different from NumPy's np.flip, which returns a view. Since copying a tensor's data is more work than viewing that data, torch.flip is expected to be slower than np.flip."

Collaborator Author

I believe flip would be used heavily on the CV side in augmentation pipelines, so it might be better to mention the time complexity.
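For context, a hypothetical random-horizontal-flip augmentation of the kind such pipelines run per sample; the function name and the (C, H, W) layout are illustrative assumptions, not anything from this PR:

import torch

def random_hflip(img: torch.Tensor, p: float = 0.5) -> torch.Tensor:
    # Flip a (C, H, W) image tensor left-right with probability p.
    # torch.flip copies the data, so this cost is paid on every flipped sample.
    if torch.rand(()).item() < p:
        return torch.flip(img, dims=[-1])
    return img

img = torch.rand(3, 224, 224)
augmented = random_hflip(img)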

Collaborator Author

"Big O notation seems like a lot of machinery to bring to this documentation, though."

Agreed.

I would prefer

"This is different from NumPy's np.flip, which returns a view. Since copying a tensor's data is more work than viewing that data, torch.flip is expected to be slower than np.flip."

as "which quickly returns a view" seems a bit ambiguous as to what is quick and what isn't.

So the change would be

.. note:: This is different from NumPy's np.flip, which returns a view. Since copying a tensor's data is more work than viewing that data, torch.flip is expected to be slower than np.flip.
   

Just a side note: the numpy.flip documentation does mention that it is a constant-time operation. (Maybe we can say torch.flip is linear-time w.r.t. the number of elements in the tensor.)

https://numpy.org/doc/stable/reference/generated/numpy.flip.html
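A rough timing sketch one could run to see the constant-time vs. linear-time gap being discussed; the array size is arbitrary and no specific numbers are claimed, since they depend on hardware:

import timeit
import numpy as np
import torch

a = np.arange(10_000_000)
t = torch.arange(10_000_000)

np_time = timeit.timeit(lambda: np.flip(a, axis=0), number=100)          # returns a view
torch_time = timeit.timeit(lambda: torch.flip(t, dims=[0]), number=100)  # copies all elements
print(f"np.flip: {np_time:.4f}s   torch.flip: {torch_time:.4f}s")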

Collaborator

".. note:: torch.flip makes a copy of :attr:input's data. This is different from NumPy's np.flip, which returns a view in constant time. Since copying a tensor's data is more work than viewing that data, torch.flip is expected to be slower than np.flip."

Merging your suggestion about constant time + adding back the first sentence. How does this look?

Collaborator Author

Yup. This feels better. Thanks!

    Due to this behavior, `torch.flip` has time complexity
    of `O(N)` while `np.flip` is `O(1)`.

Args:
    {input}
    dims (a list or tuple): axis to flip on
@@ -8365,6 +8370,11 @@ def merge_dicts(*dicts):
Note:
    Equivalent to input[:,::-1]. Requires the array to be at least 2-D.
Collaborator

I'm now noticing these notes. I think both the fliplr and flipud advanced indexing suggestions will actually throw errors in PyTorch? Would you check? If so, it seems like we should remove these notes?

Collaborator Author

You are indeed correct.

>>> import torch
>>> t = torch.ones((3,2))
>>> t[:, ::-1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: step must be greater than zero
>>> t[::-1, ...]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: step must be greater than zero
>>> t[::1, ...]
tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])
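
As a side note for readers (not part of the diff), the torch-native way to get the results those NumPy-style slices would produce is the flip family itself:

import torch

t = torch.ones((3, 2))
lr = torch.fliplr(t)                # what t[:, ::-1] gives in NumPy
ud = torch.flipud(t)                # what t[::-1, ...] gives in NumPy
both = torch.flip(t, dims=[0, 1])   # flip along both dimensions at once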

Collaborator Author

Updating it to

Note:
    Requires the array to be at least 2-D.

Collaborator Author

On second look, should we make it,

Note:
    Requires the tensor to be at least 2-D.

Collaborator

That would be more appropriate. Sure.


.. note::
    Unlike `np.fliplr`, `torch.fliplr` returns a new Tensor.
    Due to this behavior, `torch.fliplr` has time complexity
    of `O(N)` while `np.fliplr` is `O(1)`.

Args:
    input (Tensor): Must be at least 2-dimensional.

@@ -8391,6 +8401,11 @@ def merge_dicts(*dicts):
Note:
    Equivalent to input[::-1,...]. Requires the array to be at least 1-D.

.. note::
    Unlike `np.flipud`, `torch.flipud` returns a new Tensor.
    Due to this behavior, `torch.flipud` has time complexity
    of `O(N)` while `np.flipud` is `O(1)`.

Args:
    input (Tensor): Must be at least 1-dimensional.
