
Conversation

@wanchaol (Collaborator) commented Jun 6, 2024

As titled: skip the autograd function and directly return the
local_tensor if we are under a no_grad context; this avoids creating
views.

[ghstack-poisoned]
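A minimal sketch of the fast path this PR adds (the signature, `_local_tensor`, and `_ToTorchTensor` here are simplified stand-ins for the real DTensor internals, not the exact diff):

```python
import torch

class _ToTorchTensor(torch.autograd.Function):
    # Stand-in for DTensor's internal autograd function (an assumption
    # for this sketch; the real one also handles grad placements).
    @staticmethod
    def forward(ctx, dtensor, grad_placements):
        # Going through an autograd.Function returns a view of the
        # local tensor, which the fast path below avoids.
        return dtensor._local_tensor.view_as(dtensor._local_tensor)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None

def to_local(self, *, grad_placements=None):
    # Fast path added by this PR: with autograd disabled, skip the
    # autograd.Function and hand back the local shard directly.
    if not torch.is_grad_enabled():
        return self._local_tensor
    # Autograd path: route through the autograd.Function so gradients
    # can flow from the returned local tensor back to the DTensor.
    return _ToTorchTensor.apply(self, grad_placements)
```

Under `torch.no_grad()`, callers now get the local shard itself rather than a view over it.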
pytorch-bot bot commented Jun 6, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/128145

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (6 Unrelated Failures)

As of commit 7a1cfdf with merge base 2fc9079:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

UNSTABLE - The following jobs failed but were likely due to flakiness present on trunk and have been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot bot added the ciflow/inductor and oncall: distributed (Add this issue/PR to distributed oncall triage queue) labels Jun 6, 2024
The inline review thread below is on the new check in `to_local`:

    .. note:: `to_local` is differentiable, the `requires_grad` of the local tensor returned
        will depend on whether the `DTensor` requires grad or not.
    """
    if not torch.is_grad_enabled():
Collaborator

Now the question is, how expensive is torch.is_grad_enabled() 😆

Collaborator Author

I hope it's pretty cheap, given that we use it pervasively in PyTorch ;)

Collaborator Author

Tested and got a trace locally; I didn't even see the to_local or is_grad_enabled calls in the CPU trace :)
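(For reference, a sketch of one way to capture such a CPU trace; the matmul below is a stand-in for the actual DTensor workload, which isn't shown in this thread:)

```python
import torch
from torch.profiler import profile, ProfilerActivity

x = torch.randn(1024, 1024)

with torch.no_grad():
    with profile(activities=[ProfilerActivity.CPU], with_stack=True) as prof:
        y = x @ x  # stand-in for the DTensor workload that calls to_local()

# Pure-Python calls like torch.is_grad_enabled() dispatch no aten op,
# so they do not show up as events in the table below.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
```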

Collaborator

I guess that's because there is no aten op in that call stack, so the profiler cannot see it? (but this could still contribute to gaps between other ops in the profiler trace?)
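(A back-of-the-envelope way to put a number on the check's cost; illustrative only, results vary by build and machine:)

```python
import timeit
import torch

# Time the raw flag read; this measures only torch.is_grad_enabled(),
# not the rest of the to_local() fast path.
n = 1_000_000
t = timeit.timeit(torch.is_grad_enabled, number=n)
print(f"torch.is_grad_enabled: {t / n * 1e9:.1f} ns per call")
```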

@wanchaol requested a review from albanD June 6, 2024 20:16
@wanchaol (Collaborator Author) commented Jun 6, 2024

@pytorchbot merge

@pytorch-bot bot added the ciflow/trunk (Trigger trunk jobs on your pull request) label Jun 6, 2024
@pytorchmergebot (Collaborator)

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Details for Dev Infra team: raised by workflow job

@wanchaol (Collaborator Author) commented Jun 7, 2024

@pytorchbot merge

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here

@wanchaol (Collaborator Author) commented Jun 7, 2024

@pytorchbot merge -f "queued job"

@pytorchmergebot (Collaborator)

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information, see the pytorch-bot wiki.

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as a last resort; instead, consider -i/--ignore-current to continue the merge while ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here

TharinduRusira pushed a commit to TharinduRusira/pytorch that referenced this pull request Jun 14, 2024
As titled: skip the autograd function and directly return the
local_tensor if we are under a no_grad context; this avoids creating
views.

Pull Request resolved: pytorch#128145
Approved by: https://github.com/awgu
ghstack dependencies: pytorch#128112
@github-actions bot deleted the gh/wanchaol/482/head branch July 8, 2024 01:57

Labels

ciflow/inductor
ciflow/trunk (Trigger trunk jobs on your pull request)
Merged
oncall: distributed (Add this issue/PR to distributed oncall triage queue)
release notes: distributed (dtensor) (release notes category)
