[WIP] Upstream push 0627 #80355
Conversation
❌ 1 new failure as of commit bb73c20 (more details on the Dr. CI page).
🕵️ 1 new failure recognized by patterns. The following CI failure does not appear to be due to upstream breakage:
TorchBench CI (pytorch-linux-py3.7-cu102) / run-torchbench (1/1), step "Run TorchBench".
Force-pushed from f48d7b3 to 5caf4a2.
Hmmm, the Windows error looks nasty. https://pipelines.actions.githubusercontent.com/serviceHosts/7d146c05-69c3-4c20-a0e7-818111670117/_apis/pipelines/1/runs/2090862/signedlogcontent/40?urlExpires=2022-06-27T17%3A32%3A49.9990256Z&urlSigningMethod=HMACV1&urlSignature=ZEbsHmbmDmbJ%2B8x8xw6QAV1HmLCDqH3R38WG5ENxSWU%3D
We have recently simplified the CIFlow labels and
Hmmm, expand is patched here: csarofeen#1790.
@davidberard98 is there something wrong with the torchbench run? The log seems to be complaining about the commit name.
Unfortunately I think you don't have permissions to run torchbench on PRs; I'll see if I can get it to run.
@jjsjann123 can you comment on
@davidberard98 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@jjsjann123 we probably need another rebase in order to skip over the macos-11-py3-x86-64 test failure.
Apparently viable/strict is too old... I'll rebase again.
@jjsjann123 looks like viable/strict is only 5 hrs old on https://hud.pytorch.org/metrics ?
I saw these warnings, but looking at the full log, I also noticed this (below). So it looked like a real issue. Let me fix it.
@davidberard98 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
This one looks promising. @davidberard98 should we start importing it?
@davidberard98 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
bumping threads~
@jjsjann123 5-6% regression on autogen-40 and autogen-43, see https://github.com/pytorch/pytorch/runs/7223997510?check_suite_focus=true. Not sure if this is significant or just random, but they do both look like pretty similar benchmarks (https://github.com/pytorch/benchmark/blob/main/userbenchmark/nvfuser/ir.py). Also, depending on your judgment of whether or not this regression is okay, could you rebase one more time?
Neat! I mean, regression is bad, but it's nice that we are catching them with easy repros! Let me see if I can repro them locally and open issues internally to track it, so our next perf tuning would hopefully fix those.
@pytorchbot rebase |
* Fix div(scalar, tensor)
* lintrunner: clang-format
* … on move assignment operator
* Caching strides along with sizes; this is to support the current expand, which introduces non-contiguous output tensors (see the sketch below)
* This reverts commit e0ebfc1.
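A note on the stride-caching commit above: in PyTorch, expand() broadcasts a size-1 dimension by giving it stride 0 instead of copying data, so the output aliases the input and is non-contiguous; sizes alone no longer describe the memory layout, hence caching strides too. A minimal sketch in plain PyTorch (nothing nvFuser-specific; the shapes are arbitrary):

```
import torch

# expand() broadcasts a size-1 dimension without copying: the expanded
# dimension gets stride 0, so the result aliases the input and is
# non-contiguous. Sizes alone can't tell you this; you need the strides.
x = torch.randn(3, 1)
y = x.expand(3, 4)

print(x.stride())                    # (1, 1)
print(y.stride())                    # (1, 0)  <- stride-0 broadcast dimension
print(y.is_contiguous())             # False
print(y.data_ptr() == x.data_ptr())  # True: same storage, no copy
```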
Successfully rebased
Force-pushed from 5b25761 to bb73c20.
XLA failure seems unrelated. Is this one good to be merged? @davidberard98 re: the regression on the microbenchmark, I haven't forgotten about it, but it will be easier for me to repro after the merge (comparing perf across a single commit is easier) 😉 Will do it after the merge.
@davidberard98 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
LGTM! But I'm going to merge it via the internal workflow, so don't use pytorchbot to merge.
@pytorchbot merge (Initiating merge automatically since Phabricator Diff has merged)
@pytorchbot successfully started a merge job. Check the current status here
Hey @jjsjann123.
Summary: Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Code changes include:
- TransformPropagator refactor: switched to Dijkstra instead of exhaustive enumeration on all possible paths, to reduce compilation time on transform propagation;
- Indexing refactor: remove reference tensor creation in all tensor indexing logic (#1690);
- (more) generic grouped grid reduction kernel;
- Minor parser/fuser patches:
  1. zero-dim tensor reduction support
  2. no-op binary removal within fused graph
  3. expand supported in fusion

Squashed commits to WAR the GitHub API. Commits that are actually in this PR from the devel branch:

```
a054b3e Refactor TransormPropagator to allow specifying a position and propagating to part of the DAG (#1775)
d67e1cd Indexing refactor stage 1: remove reference tensor creation in all tensor indexing logic (#1690)
1b65299 Issue 1770 (#1774)
35b0427 Avoid compilation errors like below: (#1773)
452c773 Ignore reductions of zero-dim tensors per PyTorch conventions (#1771)
31d6c56 TransformPropagator refactor (#1769)
570c5a8 Merge pull request #1767 from csarofeen/upstream_merge_0621
9d6c3d8 merging upstream 61305cd
0ed815f New TransformPropagator algorithm (#1763)
6c19520 no-op binary removal (#1764)
ec7fa41 Proper propagation of IterType (#1762)
b263562 Fix dimensionality check (#1759)
2d6343f More generic grouped grid reduction kernel (#1740)
64e2b56 [nvfuser] prevent spamming warning message (#77777) (#1758)
0c43162 [nvFuser] Improving bitwise ops support (#77158) (#1757)
b93a147 Parser expand (#1754)
```

RUN_TORCHBENCH: nvfuser
Pull Request resolved: #80355
Reviewed By: qihqi
Differential Revision: D37573400
Pulled By: davidberard98
fbshipit-source-id: 52ab68d89ec01ef61f69f5abeb18c9d3a312aa64
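For readers skimming the summary above: the TransformPropagator refactor replaces exhaustive enumeration of all propagation paths (exponential in the worst case) with Dijkstra, which settles each tensor in the DAG once along its cheapest path. Below is a toy sketch of that idea in Python; the graph, the cost model, and the tensor names are illustrative assumptions, not the actual nvFuser C++ implementation:

```
import heapq

def dijkstra(graph, source):
    """Cheapest propagation cost from `source` to every reachable node.

    `graph` maps node -> list of (neighbor, cost) pairs. Each node is
    settled exactly once (roughly O(E log V)), instead of enumerating
    every path through the DAG, which is exponential in the worst case.
    """
    dist = {source: 0}
    heap = [(0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        for neighbor, cost in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

# Toy producer/consumer DAG of tensors; edge weights model how costly it is
# to propagate a transform across that edge (hypothetical cost model).
graph = {
    "tv0": [("tv1", 1), ("tv2", 4)],
    "tv1": [("tv3", 2)],
    "tv2": [("tv3", 1)],
}
print(dijkstra(graph, "tv0"))  # {'tv0': 0, 'tv1': 1, 'tv2': 4, 'tv3': 3}
```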
FYI, running the microbenchmark locally seems to indicate that the benchmark results are a bit flaky. Our kernel time varies a little, which is not surprising for a small kernel running < 20 us.
Also, autogen-40 & autogen-43 are just pointwise (PW) kernels (even though there's batch_norm in them, it's running in inference mode).
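To unpack that parenthetical: in inference mode batch_norm uses fixed running statistics, so no cross-batch reduction remains and the op is purely elementwise, which is why these benchmarks count as pointwise kernels. A quick sanity check in plain PyTorch (shapes arbitrary):

```
import torch

# In eval mode batch_norm uses stored running statistics, so there is no
# reduction over the batch left; the op is purely elementwise:
#   y = (x - running_mean) / sqrt(running_var + eps) * weight + bias
x = torch.randn(8, 4)
bn = torch.nn.BatchNorm1d(4).eval()

with torch.no_grad():
    y = bn(x)
    manual = ((x - bn.running_mean) / torch.sqrt(bn.running_var + bn.eps)
              * bn.weight + bn.bias)
    print(torch.allclose(y, manual, atol=1e-6))  # True
```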
@jjsjann123 thanks for investigating! Figured it was worth flagging since both microbenchmarks looked similar and both were performing badly... but if your tests don't repro any issue or show any kernel changes, then it's probably fine.
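For anyone wanting to reproduce this kind of noise locally: timing a sub-20 us kernel is dominated by launch overhead and clock jitter, so repeated trials of the same op can easily swing a few percent. A sketch using torch.utils.benchmark; the op, tensor size, and iteration counts are assumptions for illustration, not the actual TorchBench autogen-40/43 harness:

```
import torch
from torch.utils import benchmark

# Stand-in for a tiny pointwise kernel; on GPU its runtime is well under
# 20 us, so measurement noise dominates.
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(1024, 1024, device=device)

timer = benchmark.Timer(stmt="x.mul(2).add(1)", globals={"x": x})

# Run the whole measurement several times and compare medians: the spread
# across trials is the "flakiness" being discussed, not a real regression.
for trial in range(3):
    m = timer.timeit(1000)  # Timer handles CUDA synchronization internally
    print(f"trial {trial}: median {m.median * 1e6:.2f} us")
```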
upstream fixes cherry-picked from pytorch#80355