
Conversation

BowenBao
Collaborator

@BowenBao commented Apr 24, 2023

Stack from ghstack (oldest at bottom):

Summary

  • Do not call fx_graph_module.print_readable when recording fx.GraphModule function argument diagnostics.
  • Cache inspect.getsourcelines results.
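
A hedged sketch of the caching bullet, assuming a plain functools.lru_cache wrapper (illustrative, not the exact PR diff): inspect.getsourcelines re-reads and re-parses the source file on every call, so memoizing it by function/module object lets repeated diagnostics lookups hit the cache instead.

    import functools
    import inspect

    @functools.lru_cache(maxsize=None)
    def _cached_getsourcelines(obj):
        # Same contract as inspect.getsourcelines: returns
        # (source_lines, starting_line_number). Function and module objects
        # are hashable, so they work directly as cache keys.
        return inspect.getsourcelines(obj)

Callers then swap inspect.getsourcelines(fn) for _cached_getsourcelines(fn) on the hot diagnostics path.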

@pytorch-bot bot added the release notes: onnx label Apr 24, 2023
@pytorch-bot

pytorch-bot bot commented Apr 24, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/99936

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ No Failures

As of commit a8c6282:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@BowenBao
Collaborator Author

I'm inclined to include a sanity test like the one below, but didn't add it due to concerns over its flakiness. Ideas are welcome.

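    # Assumed context for this snippet: `import time`, `import torch`,
    # `import transformers`, and `common_utils` from torch.testing._internal.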
    def test_export_remains_efficient_with_diagnostics(self):
        model_name = "gpt2"
        # Download pytorch model
        model = transformers.AutoModel.from_pretrained(model_name)
        tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
        inputs = tokenizer("Hello world!", return_tensors="pt")

        start_time = time.time()
        with common_utils.TemporaryFileName() as path:
            torch.onnx.dynamo_export(model, **inputs).save(path)
        elapsed_time = time.time() - start_time
        time_threshold_in_seconds = 15.0
        self.assertTrue(
            elapsed_time < time_threshold_in_seconds,
            (
                f"Exporting GPT2 model took too long! "
                f"{elapsed_time} seconds > {time_threshold_in_seconds} seconds."
                f"This is a sanity check that `torch.onnx.dynamo_export` remains "
                f"reasonably efficient with all the diagnostics and analysis enabled. "
                f"The time constraint is loosely set such that the test should pass "
                f"on most machines."
            ),
        )

@BowenBao marked this pull request as ready for review April 24, 2023 22:40
@BowenBao requested a review from abock as a code owner April 24, 2023 22:40
@justinchuby
Collaborator

Curious about the speed gain?

@abock
Contributor

abock commented Apr 24, 2023

I was just about to suggest we include GPT-2 as some kind of nodes/s baseline test with a large tolerance.

   Uncached stack:     36 nodes/s  (1x)    ← Baseline
        LRU stack:  1,208 nodes/s  (33.5x) ← This PR  🎉
    Omitted stack:  2,827 nodes/s  (78.5x) ← Not helpful

This PR is a good balance and we should merge it as-is (unless you want to add the test), but we may want to gather stack info only at trace/debug level for another ~2x bump later?

@abock
Contributor

abock commented Apr 24, 2023

@justinchuby for context, I stuck tqdm around the node loop in _export_fx_node_to_onnxscript for my own amusement.
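
For reference, a self-contained sketch of that trick (export_nodes and the empty loop body are stand-ins; the real loop lives in _export_fx_node_to_onnxscript): tqdm's live it/s readout doubles as a nodes/s counter when there is one iteration per FX node.

    import torch
    from tqdm import tqdm

    def export_nodes(graph_module: torch.fx.GraphModule) -> None:
        # One iteration per FX node, so tqdm's it/s rate reads as nodes/s.
        for node in tqdm(graph_module.graph.nodes, unit="node"):
            pass  # per-node export work would happen here

    gm = torch.fx.symbolic_trace(torch.nn.Linear(4, 4))
    export_nodes(gm)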

@abock
Contributor

abock commented Apr 24, 2023

import torch
import transformers

torch.onnx.dynamo_export(
    transformers.GPT2Model.from_pretrained("gpt2"),
    **transformers.GPT2Tokenizer.from_pretrained("gpt2")(
        "Tokenize me",
        return_tensors="pt",
    ),
).save("gpt2.onnx")

@BowenBao
Collaborator Author

@abock thanks for posting the speed gains. Yes, I think it should be configurable through the API: we'd want export to be fast by default, so any perf-heavy diagnosing should hide behind that switch. Merging after adding comments per @justinchuby's suggestion.
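
A minimal sketch of that idea, with a hypothetical helper (format_graph_module_arg is not an API from this PR): gate the expensive pretty-print behind a debug check so the default export path never pays for it.

    import logging

    import torch

    logger = logging.getLogger("torch.onnx")

    def format_graph_module_arg(graph_module: torch.fx.GraphModule) -> str:
        # Hypothetical helper: only call the costly GraphModule.print_readable
        # when debug-level diagnostics are enabled; otherwise record a cheap
        # one-line placeholder.
        if logger.isEnabledFor(logging.DEBUG):
            return graph_module.print_readable(print_output=False)
        return f"<{type(graph_module).__name__} with {len(graph_module.graph.nodes)} nodes>"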

@BowenBao
Collaborator Author

@pytorchbot merge

@pytorch-bot bot added the ciflow/trunk label Apr 24, 2023
@BowenBao added the module: onnx and topic: performance labels Apr 24, 2023
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team: raised by workflow job.

Failing merge rule: Core Maintainers

@BowenBao
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

@facebook-github-bot deleted the gh/BowenBao/237/head branch June 8, 2023 14:28