
Conversation

@wz337 (Contributor) commented Oct 2, 2024

Stack from ghstack (oldest at bottom):

This unblocks Llama 3.2 Vision's use case of resizing positional embeddings for fine-tuning. Context is in the internal Workplace post.
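For context, a minimal usage sketch of what this enables: calling F.interpolate on a DTensor without hitting a missing-sharding-strategy error. The mesh setup, shapes, sharding, and interpolate arguments below are illustrative assumptions, not taken from this PR:

```python
import torch
import torch.nn.functional as F
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import Shard, distribute_tensor

# Assumes a distributed run (e.g. launched with torchrun) with one GPU per rank.
mesh = init_device_mesh("cuda", (torch.cuda.device_count(),))

# A positional-embedding-like tensor (N, C, H, W), sharded on the channel dim.
pos_embed = torch.randn(1, 64, 16, 16, device="cuda")
dt = distribute_tensor(pos_embed, mesh, [Shard(1)])

# With a replication strategy registered for the upsampling ops, this call
# dispatches through DTensor; sharded inputs get redistributed to Replicate.
resized = F.interpolate(dt, size=(32, 32), mode="bilinear")
```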

cc @XilunWu @H-Huang @awgu @kwen2501 @wanchaol @fegin @fduwjj @wconstab @d4l3k @c-p-i-o @tianyu-l

@pytorch-bot bot commented Oct 2, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/137201

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 4abdf2a with merge base 8962610:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot bot added the oncall: distributed (Add this issue/PR to distributed oncall triage queue) label Oct 2, 2024
@wz337 added the module: dtensor (distributed tensor tag) and topic: not user facing (topic category) labels Oct 2, 2024
@wz337 changed the title from "add interpoloate" to "[DTensor] Register replication strategy for a few upsampling interpolate ops" Oct 2, 2024
@wz337 requested review from XilunWu and weifengpy October 2, 2024 18:48
@XilunWu (Contributor) left a comment


Replicate strategy should be fine for now.

wz337 added a commit that referenced this pull request Oct 2, 2024
ghstack-source-id: e7dd90a
Pull Request resolved: #137201
@wz337 (Contributor, Author) commented Oct 2, 2024

Replicate strategy should be fine for now.

Yes. I looked at these ops, and it doesn't seem we can provide a better sharding strategy for them, so they will always require redistributing to replicate.
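As an illustration (not code from this PR; the setup below is assumed), the replicate-only strategy behaves conceptually like gathering the shards and running the op on the full tensor on every rank:

```python
import torch
import torch.nn.functional as F
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import Replicate, Shard, distribute_tensor

# Hypothetical setup: a 1-D mesh, input sharded on the channel dim.
mesh = init_device_mesh("cuda", (torch.cuda.device_count(),))
dt = distribute_tensor(torch.randn(1, 64, 16, 16, device="cuda"), mesh, [Shard(1)])

# Conceptual equivalent of the replicate-only strategy: all-gather the shards
# into a replicated DTensor, then run the op on the full local tensor.
full_local = dt.redistribute(placements=[Replicate()]).to_local()
out_local = F.interpolate(full_local, size=(32, 32), mode="bilinear")
```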

@awgu (Collaborator) commented Oct 2, 2024

How would the replicated DTensor get converted back to sharded in the state dict load flow?

@wz337 (Contributor, Author) commented Oct 2, 2024

How would the replicated DTensor get converted back to sharded in the state dict load flow?

Ah. Thanks for raising it. We won't be able to define the layout for the output, since it is determined by the next op.

To shard it back, users would have to call dtensor.redistribute(placements=[Shard(0)]). Done this way, there is no communication, unlike going through distribute_tensor().
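A minimal sketch of that shard-back step; replicated_dt is a hypothetical stand-in for the replicated output (e.g. of the interpolate call, or a full tensor loaded during state dict load):

```python
import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import Replicate, Shard, distribute_tensor

mesh = init_device_mesh("cuda", (torch.cuda.device_count(),))

# Stand-in for the replicated output of the op (or a loaded full tensor).
replicated_dt = distribute_tensor(torch.randn(8, 64, device="cuda"), mesh, [Replicate()])

# Replicate -> Shard(0) only slices the local tensor on each rank, so it
# issues no communication, unlike building the sharded DTensor from a full
# tensor with distribute_tensor().
sharded_dt = replicated_dt.redistribute(placements=[Shard(0)])
```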

@wz337 (Contributor, Author) commented Oct 2, 2024

@pytorchmergebot merge

@pytorch-bot bot added the ciflow/trunk (Trigger trunk jobs on your pull request) label Oct 2, 2024
@pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status.

@github-actions bot deleted the gh/wz337/35/head branch November 6, 2024 02:08

Labels

ciflow/trunk (Trigger trunk jobs on your pull request), Merged, module: dtensor (distributed tensor tag), oncall: distributed (Add this issue/PR to distributed oncall triage queue), topic: not user facing (topic category)
