
Conversation

BowenBao
Collaborator

@BowenBao commented Apr 24, 2023

Stack from ghstack (oldest at bottom):

Summary

  • Do not call fx_graph_module.print_readable when recording fx.GraphModule function argument diagnostics.
  • Cache inspect.getsourcelines results.
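
A hedged sketch of the caching bullet, assuming a plain functools.lru_cache wrapper (illustrative, not the exact PR diff): inspect.getsourcelines re-reads and re-parses the source file on every call, so memoizing it by function/module object lets repeated diagnostics lookups hit the cache instead.

    import functools
    import inspect

    @functools.lru_cache(maxsize=None)
    def _cached_getsourcelines(obj):
        # Same contract as inspect.getsourcelines: returns
        # (source_lines, starting_line_number). Function and module objects
        # are hashable, so they work directly as cache keys.
        return inspect.getsourcelines(obj)

Callers then swap inspect.getsourcelines(fn) for _cached_getsourcelines(fn) on the hot diagnostics path.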

@pytorch-bot bot added the release notes: onnx label Apr 24, 2023
@pytorch-bot

pytorch-bot bot commented Apr 24, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/99936

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ No Failures

As of commit a8c6282:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@BowenBao
Collaborator Author

I'm inclined to include a sanity test like the one below, but didn't add it due to concerns over its flakiness. Ideas are welcome.

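    # Assumed context for this snippet: `import time`, `import torch`,
    # `import transformers`, and `common_utils` from torch.testing._internal.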
    def test_export_remains_efficient_with_diagnostics(self):
        model_name = "gpt2"
        # Download pytorch model
        model = transformers.AutoModel.from_pretrained(model_name)
        tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
        inputs = tokenizer("Hello world!", return_tensors="pt")

        start_time = time.time()
        with common_utils.TemporaryFileName() as path:
            torch.onnx.dynamo_export(model, **inputs).save(path)
        elapsed_time = time.time() - start_time
        time_threshold_in_seconds = 15.0
        self.assertTrue(
            elapsed_time < time_threshold_in_seconds,
            (
                f"Exporting GPT2 model took too long! "
                f"{elapsed_time} seconds > {time_threshold_in_seconds} seconds."
                f"This is a sanity check that `torch.onnx.dynamo_export` remains "
                f"reasonably efficient with all the diagnostics and analysis enabled. "
                f"The time constraint is loosely set such that the test should pass "
                f"on most machines."
            ),
        )

@BowenBao marked this pull request as ready for review April 24, 2023 22:40
@BowenBao requested a review from abock as a code owner April 24, 2023 22:40
@justinchuby
Collaborator

Curious about the speed gain?

@abock
Contributor

abock commented Apr 24, 2023

I was just about to suggest we include GPT-2 as some kind of nodes/s baseline test with a large tolerance.

   Uncached stack:     36 nodes/s  (1x)    ← Baseline
        LRU stack:  1,208 nodes/s  (33.5x) ← This PR  🎉
    Omitted stack:  2,827 nodes/s  (78.5x) ← Not helpful

This PR is a good balance and we should merge it as-is (unless you want to add the test), but we may want to gather stack info only at trace/debug level for another ~2x bump later?

@abock
Contributor

abock commented Apr 24, 2023

@justinchuby for context, I stuck tqdm around the node loop in _export_fx_node_to_onnxscript for my own amusement.
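
For reference, a self-contained sketch of that trick (export_nodes and the empty loop body are stand-ins; the real loop lives in _export_fx_node_to_onnxscript): tqdm's live it/s readout doubles as a nodes/s counter when there is one iteration per FX node.

    import torch
    from tqdm import tqdm

    def export_nodes(graph_module: torch.fx.GraphModule) -> None:
        # One iteration per FX node, so tqdm's it/s rate reads as nodes/s.
        for node in tqdm(graph_module.graph.nodes, unit="node"):
            pass  # per-node export work would happen here

    gm = torch.fx.symbolic_trace(torch.nn.Linear(4, 4))
    export_nodes(gm)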

@abock
Contributor

abock commented Apr 24, 2023

import torch
import transformers

torch.onnx.dynamo_export(
    transformers.GPT2Model.from_pretrained("gpt2"),
    **transformers.GPT2Tokenizer.from_pretrained("gpt2")(
        "Tokenize me",
        return_tensors="pt",
    ),
).save("gpt2.onnx")

@BowenBao
Collaborator Author

@abock thanks for posting the speed gains. Yes, I think it should be configurable through the API: we'd want export to be fast by default, so any perf-heavy diagnosing should hide behind that switch. Merging after adding comments per @justinchuby's suggestion.
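
A minimal sketch of that idea, with a hypothetical helper (format_graph_module_arg is not an API from this PR): gate the expensive pretty-print behind a debug check so the default export path never pays for it.

    import logging

    import torch

    logger = logging.getLogger("torch.onnx")

    def format_graph_module_arg(graph_module: torch.fx.GraphModule) -> str:
        # Hypothetical helper: only call the costly GraphModule.print_readable
        # when debug-level diagnostics are enabled; otherwise record a cheap
        # one-line placeholder.
        if logger.isEnabledFor(logging.DEBUG):
            return graph_module.print_readable(print_output=False)
        return f"<{type(graph_module).__name__} with {len(graph_module.graph.nodes)} nodes>"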

@BowenBao
Collaborator Author

@pytorchbot merge

@pytorch-bot bot added the ciflow/trunk label Apr 24, 2023
@BowenBao added the module: onnx and topic: performance labels Apr 24, 2023
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team: raised by workflow job.

Failing merge rule: Core Maintainers

@BowenBao
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

@facebook-github-bot deleted the gh/BowenBao/237/head branch June 8, 2023 14:28