[export] Skip guard propagation for export only. #112685
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/112685
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV: there is 1 currently active SEV; if your PR is affected, please view it below.
✅ No failures as of commit cba8099 with merge base 2337d8d.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D50914819

Commit updates: df7129f → 0716db8 → 6d65663 → 83c41be
Commit updates (continued): 83c41be → 52d1ba5 → cba8099
Added some comments inline about why we skip guard propagation for export.
I think this will be made obsolete by #111726. Does that PR fix the performance issues you are seeing? If so, maybe we don't need to add an extra mode.
I will do a test with that PR.
Summary:
IIUC, from the beginning of torch.export to the current moment, we have never really found a way to meaningfully use the guards generated by torch dynamo, since export by design must produce an artifact that does not depend on Python. Therefore I think that, for the foreseeable future, we still don't need dynamo guards to be present.
Recently we ran into some slowness in dynamo for internal models and observed that a substantial amount of time is spent in the guard propagation logic, especially when dealing with list VTs. This PR detects whether we're exporting and, if so, skips the entire guard propagation logic.
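Roughly, the shape of the change is the following (a minimal, self-contained sketch; `TracingContext`, its `export` flag, and `wrap_value` are illustrative stand-ins, not the actual torch._dynamo internals):

```python
# Illustrative sketch only; names do not mirror torch._dynamo internals.

class TracingContext:
    def __init__(self, export: bool):
        self.export = export  # True when torch.export is driving tracing

class VT:  # stand-in for a VariableTracker
    def __init__(self):
        self.guards = set()

def wrap_value(name: str, ctx: TracingContext) -> VT:
    vt = VT()
    if not ctx.export:
        # Guards only matter when the compiled code is re-entered from
        # Python; export emits a standalone artifact, so this bookkeeping
        # can be skipped entirely.
        vt.guards.add(f"check({name})")
    return vt

# Exporting skips guard bookkeeping; regular compile keeps it.
assert wrap_value("x", TracingContext(export=True)).guards == set()
assert wrap_value("x", TracingContext(export=False)).guards == {"check(x)"}
```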
This doesn't resolve the fundamental issue of the O(n^2) algorithm used in VTs, but it greatly reduces its constant factor, which still gives us a large speedup in export time. We've discussed a proper resolution in the PT2 core meeting, but as a quick unblocker in the short term I think we could also skip guards to unblock several internal models.
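To make that cost concrete, here is a toy model (purely illustrative, not dynamo code) of why repeatedly unioning per-VT guard sets over a large list behaves quadratically:

```python
def guard_union_cost(n: int) -> int:
    """Toy model (not dynamo code): building a list VT of n elements by
    repeatedly unioning per-element guard sets does ~n*(n+1)/2 units of
    work, i.e. quadratic in n, even though each element adds one guard."""
    accumulated, work = set(), 0
    for i in range(n):
        accumulated |= {f"guard_{i}"}
        work += len(accumulated)  # proxy for the cost of touching the merged set
    return work

assert guard_union_cost(100) == 100 * 101 // 2  # 5050
```

Skipping the union entirely in export mode removes this term, which is why export time drops so much even though the underlying algorithm is unchanged.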
Some experiments run on the internal FM model:
local_fm (training): 15min -> 4min
ads launcher ir generation (training, D50847630): >1hr -> 30min
Test Plan: CI
Differential Revision: D50914819
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @aakhundov @kadeng