Cudagraphs support for compiled optimizers #107504
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/107504
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 283cb16 with merge base ad17e5e. This comment was automatically generated by Dr. CI and updates every 15 minutes.
cool! A few questions:
- Should grads be marked as static? I don't know that they will be; if they're not, it will just lead to endless recompilation.
- The cleanup stuff is cool; I don't know if it could be made more generic. Also, it wouldn't be sufficient to clean up cudagraph memory usage: we'll need a way to invalidate a particular compilation within cudagraphs. I can submit that PR separately; it's not a blocker for this.
- You might want someone other than me to help with the dynamo review.
I will let @eellison handle the review on this one
@pytorchbot merge

This PR needs to be approved by an authorized maintainer before merge.

@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
This removes extra copies introduced by torch.compile when compiling the optimizer. These copies would tank perf and make cudagraphs unusable.
Marks all params/optimizer state as static addresses and registers a finalizer that cleans up the graph attributes when the optimizer goes out of scope.
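The finalizer pattern can be sketched in plain Python (all names here are hypothetical stand-ins, not the actual PR internals):

```python
import weakref


class CompiledOptimizer:
    """Stand-in for an optimizer that caches compiled-graph state."""

    def __init__(self, cache):
        self.cache = cache
        # When this optimizer is garbage-collected, clear the cached graph
        # attributes so the associated memory can be reclaimed.
        weakref.finalize(self, cache.clear)


cache = {"compiled_step": object()}
opt = CompiledOptimizer(cache)
del opt  # CPython refcounting runs the finalizer immediately
print(len(cache))  # 0: the finalizer cleared the cached graph state
```

`weakref.finalize` is used rather than `__del__` so cleanup runs reliably even during interpreter shutdown and does not resurrect the object.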
**Note:** this does not mark grads as static, because doing so would increase memory usage significantly.
There are two cases:
1. Grads are `None` when the optimizer is compiled.
2. Grads are set to `None` in `zero_grad()`.

There is a PR (#107853) in flight to throw an error if `zero_grad` attempts to set static grads to `None`.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @ngimel @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @anijain2305
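The `zero_grad()` interaction can be illustrated with a small sketch (`set_to_none` is the real `torch.optim.Optimizer.zero_grad` parameter; the model is hypothetical):

```python
import torch

model = torch.nn.Linear(4, 4)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
model(torch.randn(1, 4)).sum().backward()

# set_to_none=False zeroes the existing grad tensors in place, keeping their
# storage addresses stable -- compatible with grads marked as static.
opt.zero_grad(set_to_none=False)
assert model.weight.grad is not None

# set_to_none=True (the default) frees the grad tensors; the next backward
# allocates new storage, which would invalidate static grad addresses.
opt.zero_grad(set_to_none=True)
assert model.weight.grad is None
```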