
Make input casting in root module only in default #91365

Closed. Wants to merge 1 commit.

Conversation

zhaojuanmao (Contributor)

Make input casting happen in the root module only by default, while allowing different mixed precision settings for different submodules.
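To illustrate the default behavior this PR describes, here is a minimal, torch-free sketch (all names are illustrative, not the real FSDP implementation): by default only the root wrapper casts forward inputs to its mixed-precision dtype, and a submodule wrapper casts only if it opts in via `cast_forward_inputs=True`.

```python
# Hypothetical sketch of the casting semantics: only the root wrapper
# casts forward inputs by default; submodules must opt in explicitly.

class Wrapper:
    def __init__(self, module, dtype, cast_forward_inputs=False, is_root=False):
        self.module = module
        self.dtype = dtype
        self.cast_forward_inputs = cast_forward_inputs
        self.is_root = is_root

    def forward(self, x_dtype):
        # Cast if this is the root (the new default) or the wrapper
        # explicitly opted in to casting its own forward inputs.
        if self.is_root or self.cast_forward_inputs:
            x_dtype = self.dtype
        return self.module(x_dtype)

# Root casts to bfloat16; the inner wrapper opted in, so it re-casts
# its own inputs to float16.
inner = Wrapper(lambda d: d, "float16", cast_forward_inputs=True)
root = Wrapper(inner.forward, "bfloat16", is_root=True)
print(root.forward("float32"))   # "float16"

# A non-root wrapper that did not opt in leaves its inputs alone.
plain = Wrapper(lambda d: d, "float16")
print(plain.forward("float32"))  # "float32"
```

This mirrors the PR's intent: submodules keep their inputs untouched unless they carry their own `cast_forward_inputs=True` mixed-precision config.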

@pytorch-bot commented Dec 23, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/91365

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 Failures

As of commit 8fa7531:

FLAKY - The following jobs failed but were likely due to flakiness present on master:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the release notes: distributed (fsdp) release notes category label Dec 23, 2022
Resolved review threads (now outdated) on torch/distributed/fsdp/_runtime_utils.py and torch/distributed/fsdp/api.py.
@awgu (Contributor) left a comment

Overall, the PR looks good to me. I left a few nits about clarifying comments, and we should also fix the cast_root_foward_inputs typo by renaming it to cast_root_forward_inputs.

Feel free to re-request review when ready.

Resolved review threads (now outdated) on test/distributed/fsdp/test_fsdp_mixed_precision.py.
@awgu awgu self-requested a review December 27, 2022 22:46
@zhaojuanmao zhaojuanmao force-pushed the mixedPrcesionConversion branch 2 times, most recently from 0a861cc to 0b0a7dd Compare December 28, 2022 04:21
@facebook-github-bot (Contributor)

@zhaojuanmao has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@awgu (Contributor) left a comment

LGTM!

FSDP(FSDP(model.c2, MixedPrecision(param_dtype=torch.float16, cast_forward_inputs=True)),
     model.c1, MixedPrecision(param_dtype=torch.bfloat16, cast_forward_inputs=True))

model.c1 should be the first one executed, so that its inputs can be cast as expected inside the root FSDP instance. See examples in the unit tests.
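The execution-order point above can be shown with a small, torch-free sketch (hypothetical names, not the real FSDP code): the root wrapper casts its forward inputs once, to bfloat16; c1 relies on that root cast, while c2's own wrapper re-casts to float16 when c2 executes. If c2 ran first, c1 would receive float16 activations instead of the bfloat16 the root cast intended.

```python
# Toy simulation of why c1 (managed directly by the root FSDP instance)
# should execute before c2 (which has its own wrapper and dtype).

def pipeline(order):
    x = "float32"
    x = "bfloat16"           # root wrapper casts its forward inputs once
    dtypes_used = {}
    for name in order:
        if name == "c2":
            x = "float16"    # c2's own wrapper casts forward inputs
        dtypes_used[name] = x
    return dtypes_used

print(pipeline(["c1", "c2"]))  # {'c1': 'bfloat16', 'c2': 'float16'}
print(pipeline(["c2", "c1"]))  # c1 would see float16, not bfloat16
```

With the intended order, each submodule runs in its configured dtype; with the reversed order, c1's inputs have already been downcast by c2's wrapper.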

I am not sure if we want to reference the unit tests like this from public docs since (1) users should not need to dig into our unit tests to understand the note and (2) the unit test may change without us remembering to change this public note.

We do not need to address this in this PR. I can submit a follow-up if you do not mind, or you can do it as well.

@zhaojuanmao (Contributor, Author)

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Dec 28, 2022
@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


@pytorchmergebot (Collaborator)

Merge failed

Reason: 2 additional jobs have failed; the first few are: trunk, trunk / linux-focal-rocm5.3-py3.8 / test (default, 1, 2, linux.rocm.gpu)

@zhaojuanmao (Contributor, Author)

@pytorchbot merge -f "failures are not related"

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).


Labels: ciflow/trunk (Trigger trunk jobs on your pull request), Merged, release notes: distributed (fsdp)
4 participants