Migrate fuse_chunk_reshape_concat_pass to PT2 #134026
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/134026
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure)
As of commit 556a012 with merge base 2a73ba2.
BROKEN TRUNK - The following job failed but was present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D60789747
Force-pushed from cbd230b to 0339249.
Summary:
Pull Request resolved: pytorch#134026

This is part of the dper pass migration work: https://fburl.com/gdoc/wxwykxns. This pass has a ~2.4% perf impact for adfinder_reels_ctr_model.

Test Plan: Still in test.

Differential Revision: D60789747
lgtm
Summary:
Pull Request resolved: pytorch#134026

This is part of the dper pass migration work: https://fburl.com/gdoc/wxwykxns. This pass has a ~2.4% perf impact for adfinder_reels_ctr_model.

Test Plan:
For unit tests:
```
%run ~/fbsource/fbcode/caffe2/test/inductor/fb/test_fuse_chunk_reshape_unsqueeze_concat_pass.py
```
For the pass itself, we can run
```
buck2 run mode/opt-split-dwarf mode/inplace -c fbcode.platform010_cuda_version=12 -c fbcode.nvcc_arch=a100 caffe2/torch/fb/model_transform/experimental/benchmark:mts_gpu_benchmark -- --model-path=manifold://ads_storage_fblearner/tree/user/facebook/fblearner/predictor/565872608/1881/gpu_lowering/input.predictor.disagg.gpu.merge --lower-backend=AOT_INDUCTOR --remove-passes= --disable-acc-tracer=True
```
and verify that the QPS increases by ~2% after the diff is applied.

Reviewed By: huxintong

Differential Revision: D60789747
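For context, here is a minimal sketch of the kind of chunk → unsqueeze/reshape → concat chain a pass with this name targets. The exact pattern, shapes, and op order matched by fuse_chunk_reshape_concat_pass are not spelled out in this PR, so the module below is an illustrative assumption rather than the pass's implementation:

```python
# Illustrative sketch only: the concrete graph pattern matched by the migrated
# pass is not shown in this PR; this module just demonstrates the op chain
# named by the pass (chunk -> unsqueeze/reshape -> concat).
import torch


class ChunkReshapeConcat(torch.nn.Module):
    def forward(self, x):
        # Split the last dimension into 4 contiguous, equally sized chunks.
        chunks = torch.chunk(x, chunks=4, dim=-1)
        # Give each chunk a new "group" dimension.
        pieces = [c.unsqueeze(1) for c in chunks]
        # Stitch the chunks back together along that new dimension.
        return torch.cat(pieces, dim=1)


if __name__ == "__main__":
    m = ChunkReshapeConcat()
    x = torch.randn(8, 16)
    # For this particular chain the result equals a single view-like reshape,
    # which is the sort of simplification a fusion pass can exploit instead of
    # materializing every chunk separately.
    expected = x.reshape(8, 4, 4)
    torch.testing.assert_close(m(x), expected)
    # Under torch.compile / AOT Inductor, graph passes such as the one
    # migrated here run over the captured FX graph of this forward().
    compiled = torch.compile(m)
    torch.testing.assert_close(compiled(x), expected)
```

Whether the production pattern uses an unsqueeze (as the unit-test name test_fuse_chunk_reshape_unsqueeze_concat_pass suggests) or a plain reshape is an internal detail not visible from this PR.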
Force-pushed from 0339249 to 556a012.
This pull request was exported from Phabricator. Differential Revision: D60789747
@pytorchbot merge -f 'Landed internally'

(Initiating merge automatically since the Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Summary: This is part of the dper pass migration work: https://fburl.com/gdoc/wxwykxns. This pass has a ~2.4% perf impact for adfinder_reels_ctr_model.

Test Plan: Still in test.

Differential Revision: D60789747
Pull Request resolved: pytorch#134026
Approved by: https://github.com/huxintong
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang