[onnx.export] Avoid linear look up in env for exist_in_env #124909

gustavla · 2024-04-25T04:18:58Z

This PR is part of a series of PRs to significantly speed up torch.onnx.export for models with many nodes (e.g. LLM). See #121422 for more analysis.

As part of torch.onnx.export, a reverse look-up is made in env. This is done for each node, and the cost of each look-up is proportional to the graph size, which incurs an overall O(N^2) time complexity.
A pragmatic solution is simply to keep a separate data structure to make this de facto constant time. So, this introduces a set containing all the values of env. Open to other ideas. Ideally exist_in_env wouldn't be needed at all, but to preserve current behavior exactly I'm not sure how that can be done.
Resolves (4) in ONNX export is unnecessarily slow (O(N^2)) #121422.
This code change and the choice of py::set looks a bit more natural on top of [onnx.export] Avoid dict <-> unordered_map implicit copies #123063, where the env is changed from a std::unordered_map to a py::dict.

Partially fixes #121422
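
To make the trade-off described above concrete, here is a minimal, self-contained sketch of the idea. It is not the code from this PR: `Value` is a hypothetical stand-in for the JIT value type, `TrackedEnv` and `check_consistent` are illustrative names, and `std::unordered_set` is used in place of `py::set` so the example compiles without pybind11. It also assumes entries are only ever added to env, never overwritten or erased.

```cpp
#include <cassert>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for the JIT value type; only its address matters here.
struct Value {};

using Env = std::unordered_map<Value*, Value*>;

// Reverse look-up by scanning every entry: O(N) per call, so calling it
// once per node gives O(N^2) overall.
bool exist_in_env_linear(const Env& env, const Value* v) {
  for (const auto& kv : env) {
    if (kv.second == v) {
      return true;
    }
  }
  return false;
}

// Illustrative wrapper (not from the PR): keeps a set of all mapped values
// next to env so the reverse look-up is constant time on average.
struct TrackedEnv {
  Env env;
  std::unordered_set<const Value*> values_in_env;

  // Assumption: keys are never overwritten or erased. If they were, the
  // set would also need the stale value removed.
  void insert(Value* old_value, Value* new_value) {
    env[old_value] = new_value;
    values_in_env.insert(new_value);
  }

  bool exist_in_env(const Value* v) const {
    return values_in_env.count(v) > 0;
  }
};

// Debug helper: every value stored in env must also be present in the set.
void check_consistent(const TrackedEnv& t) {
  for (const auto& kv : t.env) {
    assert(t.values_in_env.count(kv.second) == 1);
  }
}
```

The cost of this approach is one extra set insertion per env update and the memory for the set, in exchange for removing the per-node scan.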

pytorch-bot · 2024-04-25T04:19:01Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/124909

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

Too many GPU instances requested - any *.nvidia.gpu and ephemeral instances should experience significant queue times

✅ No Failures

As of commit 0726246 with merge base 6ea226b:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

linux-foundation-easycla · 2024-04-25T04:19:02Z

The committers listed above are authorized under a signed CLA.

✅ login: gustavla / name: Gustav Larsson (0726246)

justinchuby · 2024-05-09T04:29:46Z

torch/csrc/jit/passes/onnx.cpp

@@ -169,8 +169,14 @@ std::shared_ptr<Graph> ToONNX(
  ConstantValueMap::ClearMaps();
  auto new_graph = std::make_shared<Graph>(graph->current_scope());
  std::unordered_map<Value*, Value*> env;
+  py::set values_in_env;


Could you add the rationale as a comment here for future readers? Thanks!

Added a comment, thanks! I also pushed a rebase now that #123063 has merged, since it touches many of the same lines.

- As part of torch.onnx.export, a reverse look-up is made in env. This is done for each node, and the cost of each look-up is proportional to the graph size, which incurs an overall O(N^2) time complexity.
- A pragmatic solution is simply to keep a separate data structure to make this de facto constant time. So, this introduces a set containing all the values of env.
- Resolves (4) in pytorch#121422.

justinchuby · 2024-05-09T19:07:11Z

@pytorchbot merge

pytorchmergebot · 2024-05-09T19:09:05Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

gustavla requested review from BowenBao, thiagocrepaldi and wschin as code owners April 25, 2024 04:18

pytorch-bot bot added the release notes: onnx torch.onnx related changes that should show up in the release notes label Apr 25, 2024

gustavla force-pushed the dev/gustav/onnx_export_speedup_121422_D_4_real branch from 6e80e7f to ab070ac Compare April 25, 2024 04:20

gustavla mentioned this pull request Apr 25, 2024

ONNX export is unnecessarily slow (O(N^2)) #121422

Closed

9 tasks

gustavla changed the title ~~Avoid linear look up in env for exist_in_env~~ [onnx.export] Avoid linear look up in env for exist_in_env Apr 25, 2024

pytorchbot added the open source label Apr 25, 2024

gustavla force-pushed the dev/gustav/onnx_export_speedup_121422_D_4_real branch from ab070ac to b59ab82 Compare April 25, 2024 15:57

ezyang added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Apr 25, 2024

srikris-sridhar approved these changes May 4, 2024


justinchuby reviewed May 9, 2024


justinchuby approved these changes May 9, 2024


gustavla force-pushed the dev/gustav/onnx_export_speedup_121422_D_4_real branch from b59ab82 to 0726246 Compare May 9, 2024 18:28

pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label May 9, 2024

pytorchmergebot added the merging label May 9, 2024

pytorchmergebot added the Merged label May 9, 2024

pytorchmergebot closed this in 52fad83 May 9, 2024

pytorchmergebot removed the merging label May 9, 2024

linshokaku mentioned this pull request May 13, 2024

follow new graph context I/F pfnet/pytorch-pfn-extras#819

Open

titaiwangms added the topic: improvements topic category label Jul 10, 2024
