davidchencsl
Contributor

@davidchencsl davidchencsl commented Jul 14, 2022

Stack from ghstack (oldest at bottom):

Summary: The main idea is that we can run some baseline benchmarks after we are done matching the events. This gives us the ability to accurately measure the speed gain, because system performance varies from machine to machine.

Test Plan: I did some manual testing on all the models in torchbench, and added a simple test in test_profiler.py

Differential Revision: D37894566
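
A minimal sketch of the idea in the summary, not the code in this PR: once events have been matched, time the flagged variant and the suggested variant on the current machine and report the ratio. Everything here is an assumption made for illustration: `MatchedEvent`, `measure_speedup`, and the `slow`/`fast` callables are hypothetical names, and the standard-library `timeit` module stands in for whatever benchmarking utility the actual change uses on CUDA tensors.

```python
import timeit
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class MatchedEvent:
    """Hypothetical stand-in for a profiler event flagged by a pattern."""
    name: str
    size: int  # problem size recorded for the event


def measure_speedup(
    events: List[MatchedEvent],
    slow: Callable[[int], object],
    fast: Callable[[int], object],
    repeat: int = 5,
    number: int = 50,
) -> Dict[int, float]:
    """Benchmark both variants once per distinct size, on this machine.

    Returns a mapping from size to (slow time / fast time).
    """
    speedups: Dict[int, float] = {}
    # Benchmark each distinct size only once, however many events share it.
    for size in sorted({event.size for event in events}):
        # The minimum over repeats is the least noisy figure timeit provides.
        t_slow = min(timeit.repeat(lambda: slow(size), repeat=repeat, number=number))
        t_fast = min(timeit.repeat(lambda: fast(size), repeat=repeat, number=number))
        speedups[size] = t_slow / t_fast
    return speedups
```

For example, `measure_speedup(events, slow=lambda n: list(list(range(n))), fast=lambda n: list(range(n)))` compares "build, then copy" against "build once" for every size seen in `events`. Because the timings are taken locally, the reported ratio reflects the machine the analysis runs on rather than a fixed estimate.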

@facebook-github-bot
Contributor

facebook-github-bot commented Jul 14, 2022

✅ No Failures (0 Pending)

As of commit 364e9a0 (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚



davidchencsl added a commit that referenced this pull request Jul 14, 2022
@davidchencsl davidchencsl requested a review from robieta July 14, 2022 20:49
… CUDA Copy Pattern"


Summary: The main idea is that we can run some baseline benchmarks after we are done matching the events. This gives us the ability to accurately measure the speed gain, because system performance varies from machine to machine.

Test Plan: I did some manual testing on all the models in torchbench, and added a simple test in test_profiler.py

[ghstack-poisoned]
@davidchencsl
Contributor Author

@davidchencsl has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

davidchencsl added a commit that referenced this pull request Jul 15, 2022
@davidchencsl davidchencsl requested a review from chaekit July 18, 2022 18:03
davidchencsl added a commit that referenced this pull request Jul 18, 2022
@robieta robieta left a comment

Overall looks good. The only thing I would say is that report_all_anti_patterns should take should_benchmark: bool = False as an argument and plumb it through. Benchmarking can add a lot of time to the analysis, so we want users to opt into it. (At some point TorchTidy might be sophisticated enough to pick an appropriate subset to benchmark, but that's a long way off.)
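
A rough sketch of the opt-in plumbing suggested above, under stated assumptions: the `Pattern` class and its methods are hypothetical, and this `report_all_anti_patterns` is a simplified stand-in that takes a list of patterns, not the real function's signature. The point it illustrates is only that the expensive benchmark step is skipped unless the caller passes `should_benchmark=True`.

```python
from typing import Callable, List


class Pattern:
    """Hypothetical pattern: matching is cheap, benchmarking is expensive."""

    def __init__(self, name: str, benchmark_fn: Callable[[], float]):
        self.name = name
        self._benchmark_fn = benchmark_fn

    def matched_events(self) -> List[str]:
        # Placeholder: a real pattern would walk the profiler's event tree.
        return [f"{self.name}-event"]

    def benchmark(self) -> float:
        # Placeholder for the slow step; returns an estimated speedup factor.
        return self._benchmark_fn()


def report_all_anti_patterns(
    patterns: List[Pattern], should_benchmark: bool = False
) -> List[str]:
    """Simplified stand-in: one report line per pattern.

    Benchmarks run only when the caller opts in via should_benchmark.
    """
    lines = []
    for pattern in patterns:
        events = pattern.matched_events()
        line = f"{pattern.name}: {len(events)} match(es)"
        if should_benchmark and events:
            line += f", estimated speedup {pattern.benchmark():.2f}x"
        lines.append(line)
    return lines
```

Defaulting the flag to `False` keeps the plain report fast; callers who want measured numbers pay for the benchmark explicitly.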

@robieta robieta left a comment

LGTM

@davidchencsl
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

@pytorchbot successfully started a merge job. Check the current status here

@github-actions
Contributor

Hey @davidchencsl.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

facebook-github-bot pushed a commit that referenced this pull request Jul 26, 2022
…Pattern (#81501) (#81501)

Summary:
The main idea is that we can run some baseline benchmarks after we are done matching the events. This gives us the ability to accurately measure the speed gain, because system performance varies from machine to machine.

Pull Request resolved: #81501
Approved by: https://github.com/robieta

Test Plan:
contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/64c6387c0ff82d49a5bfdcae579b522ae830c2c8

Test plan from GitHub:
I did some manual testing on all the models in torchbench, and added a simple test in test_profiler.py

Original Phabricator Test Plan:
I did some manual testing on all the models in torchbench, and added a simple test in test_profiler.py

Reviewed By: robieta

Differential Revision: D37894566

Pulled By: davidchencsl

fbshipit-source-id: 3e7adcf9b647d02cfad28772cf72fe08da2c6f93
@facebook-github-bot facebook-github-bot deleted the gh/davidchencsl/21/head branch July 26, 2022 14:20