[PT2][Optimus] Add missing example value for introduced nodes #132297
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/132297
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (8 Unrelated Failures)
As of commit 6254680 with merge base 9853c04:
BROKEN TRUNK - The following jobs failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D60364165
4b1c691 to 8e69b7c
…h#132297)
Summary:
Pull Request resolved: pytorch#132297
We observed that many of the nodes introduced during split-cat and batch-fusion pattern optimization did not have example value metadata, which causes problems in our follow-up pattern optimizations, so we add all the missing values.
We also fix bugs in some meta updates and a corner-case bug in the old pattern, which caused problems in the follow-up pattern optimization.
We delete the merge_stack_tahn_unbind_pass pattern, which was designed for the cmf model. It can be replaced by the more advanced pattern we added, so we remove it for easier maintenance.
Test Plan:
# unit test
```
buck2 test //caffe2/test/inductor:split_cat_fx_passes
```
Test UI: https://www.internalfb.com/intern/testinfra/testrun/3940649915381395
Network: Up: 90KiB Down: 118KiB (reSessionID-a4a438a0-b043-4ae0-8cf4-a83a5150d0fa)
Jobs completed: 28. Time elapsed: 4:57.9s.
Cache hits: 0%. Commands: 2 (cached: 0, remote: 0, local: 2)
Tests finished: Pass 10. Fail 0. Fatal 0. Skip 1. Build failure 0
```
buck2 test mode/opt pytorch/diff_train_tests/ads/optimus:local_pt2_runner
```
Network: Up: 1.3GiB Down: 84MiB (reSessionID-ff135cdd-e42c-4ab5-8217-907ada465f01)
Jobs completed: 61. Time elapsed: 21:56.5s.
Cache hits: 0%. Commands: 39 (cached: 0, remote: 0, local: 39)
Tests finished: Pass 8. Fail 0. Fatal 0. Skip 0. Build failure 0
# benchmark
```
CUDA_VISIBLE_DEVICES=3 OC_CAUSE=1 buck2 run mode/opt //scripts/jackiexu0313/pt2:local_model_with_pt2 -- --test_mode batch-split --model_type "ig_ctr" --flow_id 584880697
```
Counter({'pattern_matcher_nodes': 752, 'pattern_matcher_count': 732, 'normalization_pass': 328, 'normalization_aten_pass': 12, 'scmerge_cat_removed': 5, 'scmerge_cat_added': 4, 'scmerge_split_removed': 3, 'unbind_stack_pass': 3, 'batch_tanh': 2, 'scmerge_split_sections_removed': 2, 'scmerge_split_added': 2, 'optimize_cat_inputs_pass': 1, 'unbind_cat_to_view_pass': 1, 'fxgraph_cache_miss': 1})
Differential Revision: D60364165
8e69b7c to d9f1a03
d9f1a03 to 6f3c775
6f3c775 to ff6ac7f
ff6ac7f to cad8c3c
cad8c3c to 6025dff
…h#132297)
Summary:
Pull Request resolved: pytorch#132297
We observed that many of the nodes introduced during split-cat and batch-fusion pattern optimization did not have example value metadata, which causes problems in our follow-up pattern optimizations, so we add all the missing values.
We also fix bugs in some meta updates and a corner-case bug in the old pattern, which caused problems in the follow-up pattern optimization.
We delete the merge_stack_tahn_unbind_pass pattern, which was designed for the cmf model. It can be replaced by the more advanced pattern we added, so we remove it for easier maintenance.
Test Plan:
# unit test
```
buck2 test //caffe2/test/inductor:split_cat_fx_passes
```
Test UI: https://www.internalfb.com/intern/testinfra/testrun/15481123762720165
Network: Up: 230KiB Down: 702KiB (reSessionID-756346bf-6da3-4fa0-8d03-1b4fd61e0a7a)
Jobs completed: 30. Time elapsed: 7:23.9s.
Cache hits: 20%. Commands: 5 (cached: 1, remote: 0, local: 4)
Tests finished: Pass 9. Fail 0. Fatal 0. Skip 1. Build failure 0
```
buck2 test mode/opt pytorch/diff_train_tests/ads/optimus:local_pt2_runner
```
Network: Up: 1.3GiB Down: 84MiB (reSessionID-ff135cdd-e42c-4ab5-8217-907ada465f01)
Jobs completed: 61. Time elapsed: 21:56.5s.
Cache hits: 0%. Commands: 39 (cached: 0, remote: 0, local: 39)
Tests finished: Pass 8. Fail 0. Fatal 0. Skip 0. Build failure 0
# benchmark
```
CUDA_VISIBLE_DEVICES=3 OC_CAUSE=1 buck2 run mode/opt //scripts/jackiexu0313/pt2:local_model_with_pt2 -- --test_mode batch-split --model_type "ig_ctr" --flow_id 584880697
```
Counter({'pattern_matcher_nodes': 752, 'pattern_matcher_count': 732, 'normalization_pass': 328, 'normalization_aten_pass': 12, 'scmerge_cat_removed': 5, 'scmerge_cat_added': 4, 'scmerge_split_removed': 3, 'unbind_stack_pass': 3, 'batch_tanh': 2, 'scmerge_split_sections_removed': 2, 'scmerge_split_added': 2, 'optimize_cat_inputs_pass': 1, 'unbind_cat_to_view_pass': 1, 'fxgraph_cache_miss': 1})
Reviewed By: jackiexu1992
Differential Revision: D60364165
6025dff to 4ce2875
4ce2875 to 6254680
@pytorchbot merge (Initiating merge automatically since Phabricator Diff has merged)
Merge failed
Reason: This PR has internal changes and must be landed via Phabricator! Please try reimporting/rexporting the PR!
Details for Dev Infra team: raised by workflow job
@pytorchbot merge
Merge failed
Reason: This PR has internal changes and must be landed via Phabricator! Please try reimporting/rexporting the PR!
Details for Dev Infra team: raised by workflow job
@pytorchbot merge -f 'unrelated'
You need to provide a reason for using force merge, in the format @pytorchbot merge -f 'Explanation'.
@pytorchbot merge -f 'bypass flaky tests'
Merge started
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).
Learn more about merging in the wiki.
Questions? Feedback? Please reach out to the PyTorch DevX Team
Summary:
We observed that many of the nodes introduced during split-cat and batch-fusion pattern optimization did not have example value metadata, which causes problems in our follow-up pattern optimizations, so we add all the missing values.
We also fix bugs in some meta updates and a corner-case bug in the old pattern, which caused problems in the follow-up pattern optimization.
We delete the merge_stack_tahn_unbind_pass pattern, which was designed for the cmf model. It can be replaced by the more advanced pattern we added, so we remove it for easier maintenance.
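For readers unfamiliar with the issue, here is a minimal, torch-free sketch of the bookkeeping the summary describes: a node that a pass introduces should carry an example value derived from its inputs' example values, so that later passes can inspect it. The `Node` class and both helper functions are hypothetical stand-ins written for illustration, not code from this PR; the only thing borrowed from PyTorch is the convention that a `torch.fx.Node` has a `meta` dict in which an `"example_value"` entry can be stored.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a graph node with a `meta` dict."""
    name: str
    target: Callable[..., Any]
    args: Tuple["Node", ...] = ()
    meta: Dict[str, Any] = field(default_factory=dict)


def add_node_with_example_value(
    graph: List[Node], name: str, target: Callable[..., Any], args: Tuple[Node, ...]
) -> Node:
    """Append a node and derive its example value from its inputs' example values."""
    inputs = [arg.meta["example_value"] for arg in args]
    node = Node(name, target, tuple(args))
    node.meta["example_value"] = target(*inputs)
    graph.append(node)
    return node


def nodes_missing_example_value(graph: List[Node]) -> List[str]:
    """Return the names of nodes that a follow-up pass could not inspect."""
    return [n.name for n in graph if "example_value" not in n.meta]


def split_in_half(values):
    """Toy operation standing in for a split on a list-valued example."""
    mid = len(values) // 2
    return values[:mid], values[mid:]


graph: List[Node] = []
x = Node("x", target=lambda: None, meta={"example_value": [1.0, 2.0, 3.0, 4.0]})
graph.append(x)

# A node introduced without metadata, as in the problem described above.
bad = Node("split_bad", target=split_in_half, args=(x,))
graph.append(bad)

# The same kind of node introduced with its example value filled in.
good = add_node_with_example_value(graph, "split_good", split_in_half, (x,))

print(nodes_missing_example_value(graph))
```

A check like `nodes_missing_example_value` is one way a test could assert that a pass leaves no introduced node without metadata; whether the PR's unit tests do this is not stated here.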
Test Plan:
unit test
`buck2 test //caffe2/test/inductor:split_cat_fx_passes`
Test UI: https://www.internalfb.com/intern/testinfra/testrun/15481123762720165
Network: Up: 230KiB Down: 702KiB (reSessionID-756346bf-6da3-4fa0-8d03-1b4fd61e0a7a)
Jobs completed: 30. Time elapsed: 7:23.9s.
Cache hits: 20%. Commands: 5 (cached: 1, remote: 0, local: 4)
Tests finished: Pass 9. Fail 0. Fatal 0. Skip 1. Build failure 0
`buck2 test mode/opt pytorch/diff_train_tests/ads/optimus:local_pt2_runner`
Network: Up: 1.3GiB Down: 84MiB (reSessionID-ff135cdd-e42c-4ab5-8217-907ada465f01)
Jobs completed: 61. Time elapsed: 21:56.5s.
Cache hits: 0%. Commands: 39 (cached: 0, remote: 0, local: 39)
Tests finished: Pass 8. Fail 0. Fatal 0. Skip 0. Build failure 0
benchmark
`CUDA_VISIBLE_DEVICES=3 OC_CAUSE=1 buck2 run mode/opt //scripts/jackiexu0313/pt2:local_model_with_pt2 -- --test_mode batch-split --model_type "ig_ctr" --flow_id 584880697`
Counter({'pattern_matcher_nodes': 752, 'pattern_matcher_count': 732, 'normalization_pass': 328, 'normalization_aten_pass': 12, 'scmerge_cat_removed': 5, 'scmerge_cat_added': 4, 'scmerge_split_removed': 3, 'unbind_stack_pass': 3, 'batch_tanh': 2, 'scmerge_split_sections_removed': 2, 'scmerge_split_added': 2, 'optimize_cat_inputs_pass': 1, 'unbind_cat_to_view_pass': 1, 'fxgraph_cache_miss': 1})
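As a reading aid for the counter dump above, the following sketch computes the difference between the `scmerge_*_removed` and `scmerge_*_added` entries. The helper `net_removed` is hypothetical, and treating "removed minus added" as the net reduction in cat or split nodes is an assumption about what these counters mean, not something the PR states.

```python
from collections import Counter

# Counter values copied from the benchmark output above.
counters = Counter({
    'pattern_matcher_nodes': 752, 'pattern_matcher_count': 732,
    'normalization_pass': 328, 'normalization_aten_pass': 12,
    'scmerge_cat_removed': 5, 'scmerge_cat_added': 4,
    'scmerge_split_removed': 3, 'unbind_stack_pass': 3, 'batch_tanh': 2,
    'scmerge_split_sections_removed': 2, 'scmerge_split_added': 2,
    'optimize_cat_inputs_pass': 1, 'unbind_cat_to_view_pass': 1,
    'fxgraph_cache_miss': 1,
})


def net_removed(counts: Counter, kind: str) -> int:
    """Hypothetical helper: removed minus added for one node kind."""
    return counts[f"scmerge_{kind}_removed"] - counts[f"scmerge_{kind}_added"]


print("net cat nodes removed:", net_removed(counters, "cat"))      # 5 - 4
print("net split nodes removed:", net_removed(counters, "split"))  # 3 - 2
```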
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang