switch dtensor and functional collective to use optree #110670
Conversation
optree recently landed and provides quite good perf; conditionally import the new optree-backed pytree if optree is installed.
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/110670
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 73754cb with merge base 2aa3064.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
optree recently landed and provides quite good perf; conditionally import the new optree if optree is installed.
ghstack-source-id: 28ac266eab5e4b34690bbf8aae0cfe7619451e12
Pull Request resolved: #110670
Wow, do we know how much perf we can get from switching to optree?
```python
try:
    from torch.utils._cxx_pytree import tree_flatten, tree_unflatten
except ImportError:
    tree_flatten = torch.utils._pytree.tree_flatten  # type: ignore[assignment]
```
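As a standalone illustration of the fallback pattern in this diff, here is a minimal sketch: `_fast_pytree` is a hypothetical stand-in for `torch.utils._cxx_pytree`, and the toy flatten in the `except` branch stands in for the pure-Python `torch.utils._pytree` fallback.

```python
# Sketch of the conditional-import fallback pattern from this PR.
# "_fast_pytree" is a hypothetical stand-in for torch.utils._cxx_pytree;
# when it isn't installed, a pure-Python implementation is used instead.
try:
    from _fast_pytree import tree_flatten  # type: ignore[import-not-found]
except ImportError:
    def tree_flatten(tree):
        # Tiny pure-Python fallback: flatten nested lists/tuples into leaves.
        # Returns (leaves, spec-like marker), loosely mirroring the real API.
        if isinstance(tree, (list, tuple)):
            leaves = []
            for child in tree:
                sub, _ = tree_flatten(child)
                leaves.extend(sub)
            return leaves, type(tree)
        return [tree], None

leaves, _ = tree_flatten([1, (2, 3), [4]])
print(leaves)  # [1, 2, 3, 4]
```

Because the fallback is bound at import time, call sites pay no per-call dispatch cost for the check.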
Curious, why the assignment version here instead of doing an import like above?
I followed a similar approach taken in #109684; another import would probably also work (in both cases we need to ignore the type annotation, lmk your preference :)
```python
try:
    from torch.utils._cxx_pytree import tree_map_only
except ImportError:
    tree_map_only = torch.utils._pytree.tree_map_only  # type: ignore[assignment]
```
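For context on what `tree_map_only` does, here is a toy pure-Python sketch of its semantics (map a function only over leaves of a given type). The real torch helpers also handle registered pytree node types; this stand-in covers only lists, tuples, and dicts.

```python
# Toy sketch of tree_map_only semantics: apply fn only to leaves that are
# instances of ty, leaving other leaves and the container structure intact.
# This is an illustrative stand-in, not the actual torch implementation.
def tree_map_only(ty, fn, tree):
    if isinstance(tree, ty):
        return fn(tree)
    if isinstance(tree, list):
        return [tree_map_only(ty, fn, x) for x in tree]
    if isinstance(tree, tuple):
        return tuple(tree_map_only(ty, fn, x) for x in tree)
    if isinstance(tree, dict):
        return {k: tree_map_only(ty, fn, v) for k, v in tree.items()}
    return tree  # non-matching leaf: returned unchanged

print(tree_map_only(int, lambda x: x * 2, {"a": [1, 2], "b": ("x", 3)}))
# {'a': [2, 4], 'b': ('x', 6)}
```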
do we know under what builds/setups this import won't exist? Should we issue a warning if we're in a known slow mode?
I believe certain CI setups don't have access to the pip channel and only have the conda channel, where optree is not yet available. @XuehaiPan might have more details.
I tried it and there's not much perf gain for large-scale 2D, probably because we already hide almost all of the CPU overhead. This could potentially benefit smaller models; we can try. See some updates in the summary: local testing shows around a 10% CPU overhead reduction.
Nice
optree recently landed and provides quite good perf; conditionally import the new optree if optree is installed. Some numbers testing an MLP layer with TP + func collective: before this PR, 10.390ms; after this PR, 9.189ms. So around a 10% e2e CPU overhead reduction.
optree recently landed and provides quite good perf; conditionally import the new optree if optree is installed.
ghstack-source-id: cdaf58eaacd93bf32f2edc2060d0b54b68470d0d
Pull Request resolved: #110670
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Stack from ghstack (oldest at bottom):
optree recently landed and provides quite good perf; conditionally import the new optree if optree is installed.

Some numbers testing an MLP layer with TP + func collective:
before this PR: 10.390ms
after this PR: 9.189ms
So around a 10% e2e CPU overhead reduction.
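For readers who want to reproduce this kind of comparison, here is a hypothetical micro-benchmark harness using `timeit`. Note that the numbers above came from a real TP + functional-collective MLP run, not from a toy like this; the flatten function below is an illustrative stand-in, and any optree-backed variant would be timed the same way.

```python
# Hypothetical micro-benchmark sketch: time a pytree-flatten routine over a
# fixed nested structure with timeit. Swap in the optree-backed flatten to
# compare, using the same tree and call count for a fair measurement.
import timeit

def flatten_py(tree):
    # Naive recursive flatten over lists/tuples (illustrative stand-in).
    if isinstance(tree, (list, tuple)):
        out = []
        for child in tree:
            out.extend(flatten_py(child))
        return out
    return [tree]

tree = [[1, 2], (3, [4, 5]), 6] * 100

t = timeit.timeit(lambda: flatten_py(tree), number=1000)
print(f"pure-Python flatten: {t:.4f}s for 1000 calls")
```

Micro-benchmarks like this capture only the flatten/unflatten cost itself; the e2e numbers in the PR also include everything else in the forward pass, which is why a large per-call speedup can translate to a smaller end-to-end reduction.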