Merge pytorch master into lazy_tensor_staging #72894
Merged
Conversation
Summary: Tests under `test/onnx/test_models_onnxruntime.py` complain `AttributeError: 'TestModels' object has no attribute 'onnx_shape_inference'`. This CI failure appeared suddenly, without any code changes to the related files, and is most likely due to a different test-case run order. The test code was written such that if the test class `TestModels_new_jit_API` runs first, it assigns `TestModels.onnx_shape_inference = True`, masking the problem; if `TestModels` runs first, the `AttributeError` is raised.

Fixes #72337

Pull Request resolved: #72350
Reviewed By: jbschlosser, seemethere, janeyx99
Differential Revision: D34010794
Pulled By: malfet
fbshipit-source-id: 816f7bee89ea0251bb5df8f482b68f8dc4823997
(cherry picked from commit b39b23b)
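A minimal sketch of the order dependence described above, with simplified class bodies (the test method here is hypothetical); the fix is to define the attribute on `TestModels` itself so it exists regardless of which class runs first:

```python
import unittest

class TestModels(unittest.TestCase):
    # Fix: declare the attribute on the class so it exists no matter
    # which test class happens to run first.
    onnx_shape_inference = False

    def test_uses_flag(self):  # hypothetical test method
        # Before the fix, this raised AttributeError whenever TestModels
        # ran before TestModels_new_jit_API.
        self.assertIn(self.onnx_shape_inference, (True, False))

class TestModels_new_jit_API(TestModels):
    @classmethod
    def setUpClass(cls):
        # The problematic pattern: mutating the *parent* class as a side
        # effect, so TestModels only worked if this class ran first.
        TestModels.onnx_shape_inference = True

if __name__ == "__main__":
    unittest.main()
```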
Summary: Update the branch location after https://github.com/pytorch/test-infra/pull/182/files. A step for pytorch/test-infra#175.

Pull Request resolved: #72233
Reviewed By: jbschlosser
Differential Revision: D34013014
Pulled By: kit1980
fbshipit-source-id: a05be1870608d5d2dc6cfd270222abd56b89e5fe
(cherry picked from commit 9806e7d)
Summary: Pull Request resolved: #72244

att

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps

Imported from OSS
Reviewed By: albanD
Differential Revision: D33971544
fbshipit-source-id: a6e01d08ae9fdd479bba8abfa0941c555650f84e
(cherry picked from commit fe756cc)
Summary: Export these symbols so they are accessible from other libraries. Also make `REGISTER_ARCH_DISPATCH` export dispatches as `TORCH_API`, so that stubs can be called from libraries other than `torch_cpu`. To satisfy Windows builds, add the same `TORCH_API` to the static member declarations, although these are no-ops on Linux.

Pull Request resolved: #72340
Reviewed By: janeyx99
Differential Revision: D34007756
Pulled By: malfet
fbshipit-source-id: 6dcc4e350920c72f8b1762a5018082f7aeec98e9
(cherry picked from commit 9c1f44d)
Summary: Pull Request resolved: #72349

1. Methods invoked via interface calls need to be registered to the class. Previously all interface calls were inlined, so this problem did not arise.
2. `parseDoubleList` and `parseBoolList` were swapped during refactoring.

Test Plan:
1. Get ASR's test model:
```
mkdir ~/asr1 && cd ~/asr1
fbpkg fetch speech.tuna.milan.ondevice.en_us
```
2. Convert the model:
```
cd ~/fbsource
buck run //xplat/caffe2/fb/lite_predictor:convert_model -- --model=$HOME/asr1/pytorchmodel.pt --output_name=$HOME/asr1/pytorchmodel.ff
```
3. Run lite_predictor_flatbuffer:
```
buck run //xplat/caffe2/fb/lite_predictor:lite_predictor_flatbuffer -- --model=$HOME/asr1/pytorchmodel.ff --method_to_call=encode_src --method_to_generate_input=get_all_bundled_inputs_for_encode_src
```
See the perf metric generated (meaning that loading and inference succeeded).

Reviewed By: gmagogsfm, zhxchen17
Differential Revision: D33959746
fbshipit-source-id: 24671e1189438119f477032eb6c29bd7736e74ca
(cherry picked from commit 5e18809)
Summary: D34018849

Test Plan: D34018849

Reviewed By: shoumikhin
Differential Revision: D34018840
fbshipit-source-id: a78e3ea5b8ac93e9e002e2583961fd3a545a0abd
(cherry picked from commit 57b7c51)
Summary: I think this diff stack broke all the related tasks below.

Test Plan: For our failing tests:
buck test //fbandroid/instrumentation_tests/com/facebook/pytorch/bi_xray:instrumentation_test -c test.external_runner=tpx -- --regex 'testBIXRayModel.*PyTorchBIXRayInstrumentationTest' --force-remote-execution --run-disabled

For the UBN: not really sure what to do; trying to build the app and see if I can use an effect.

Reviewed By: shoumikhin
Differential Revision: D34018849
fbshipit-source-id: 3571718cb6621931af931b494e0a70d6e0164e65
(cherry picked from commit 3cc63cb)
Summary: Pull Request resolved: #71783

Adds a quantized matmul op that naively performs dequantize -> matmul -> quantize. (To be optimized in the future.)

Test Plan: From fbcode:
```
buck test caffe2/test:quantization -- test_qmatmul
```

Reviewed By: kimishpatel
Differential Revision: D33443161
fbshipit-source-id: 7e0a8e45bed1a63f9cd68a70cadbc9a8e35b2faa
(cherry picked from commit 100d8b3)
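A minimal sketch of the naive pattern described above (not the actual kernel; the scales and zero points are illustrative):

```python
import torch

def naive_quantized_matmul(qa, qb, out_scale, out_zero_point):
    # Dequantize both operands, run a float matmul, then requantize the
    # result -- exactly the dequantize -> matmul -> quantize pattern.
    c = torch.matmul(qa.dequantize(), qb.dequantize())
    return torch.quantize_per_tensor(c, out_scale, out_zero_point, torch.quint8)

a, b = torch.randn(2, 3), torch.randn(3, 4)
qa = torch.quantize_per_tensor(a, scale=0.05, zero_point=128, dtype=torch.quint8)
qb = torch.quantize_per_tensor(b, scale=0.05, zero_point=128, dtype=torch.quint8)
qc = naive_quantized_matmul(qa, qb, out_scale=0.1, out_zero_point=128)
print(qc.dequantize())
```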
Summary: Pull Request resolved: #69854
ghstack-source-id: 148315147

Test Plan: Time reported to start up static runtime on the ctr_mobile_feed local_ro net is 8.8s instead of 9.5s.

Reviewed By: suo, d1jang
Differential Revision: D33039733
fbshipit-source-id: 218dc7ff9aa421a352b71952ec77757368095860
(cherry picked from commit 7586712)
Summary: Pull Request resolved: #71520

Specifically, this PR deals with adverse cases such as D33662192. It is possible for someone to export a package, change something in it, and then attempt to repackage it. In that case, dependencies of the package may no longer be interned, and it is not obvious where to look for them, so we throw an error.

Test Plan: Imported from OSS

Reviewed By: bradleyhd
Differential Revision: D33675557
Pulled By: PaliC
fbshipit-source-id: 807962bfb340d30d418617d6e78661a033828314
(cherry picked from commit 1b10c23)
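A hedged sketch of the repackaging scenario, using the public `torch.package` API (the file names and the packaged object are made up for illustration):

```python
import torch
from torch.package import PackageExporter, PackageImporter, sys_importer

# First export: save some state into a package; torch stays external.
with PackageExporter("pkg.pt") as exporter:
    exporter.extern("torch.**")
    exporter.save_pickle("data", "weights.pkl", {"w": torch.randn(3)})

# Re-import the package, then repackage its contents.
importer = PackageImporter("pkg.pt")
weights = importer.load_pickle("data", "weights.pkl")

# Passing the original importer lets the exporter resolve modules that
# live inside the package. If the package was modified so that its
# interned dependencies are gone, the export raises an error instead of
# guessing where the dependencies should come from.
with PackageExporter("repkg.pt", importer=(importer, sys_importer)) as exporter:
    exporter.extern("torch.**")
    exporter.save_pickle("data", "weights.pkl", weights)
```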
…d always 64 bug (#72375)

Summary: Pull Request resolved: #72375

The current MHA does not have `num_head` as a param; add it.

Test Plan: In the following diff.

Reviewed By: swolchok
Differential Revision: D33972168
fbshipit-source-id: 6b31bd6a516354d781e6dd5eea347a31d6cea272
(cherry picked from commit 3d07066)
Reviewed By: zertosh
Differential Revision: D34028007
fbshipit-source-id: 6bb39fab7232baae3769d68857964a2a4e581e7d
(cherry picked from commit 3a05821)
Summary: This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM).

New submodule commit: pytorch/FBGEMM@49fe829

Pull Request resolved: #72158

Test Plan: Ensure that CI jobs succeed on GitHub before landing.

Reviewed By: jspark1105
Differential Revision: D33932905
Pulled By: jasonjk-park
fbshipit-source-id: 8b40ba08de2880f374e5b2dc09a7a451262385a7
(cherry picked from commit 67944c2)
Summary: Pull Request resolved: #71312

Renames `seen_op_info` to `seen_q_op_info` and `SeenOpInfo` to `SeenQOpInfo`, to make it clear that the op here is a quantizeable op. This is useful for a future PR in which we will start recording the DAG of non-quantizeable ops, which will be needed to properly support function fusion.

Test Plan: CI and mypy

Reviewed By: albanD
Differential Revision: D33584751
Pulled By: vkuzo
fbshipit-source-id: 0b659d4ecefc96d532c451abac410c638e457dcb
(cherry picked from commit 6d85745)
Summary: Pull Request resolved: #71324

In a future PR we need to start recording the DAG of non-quantizeable ops, in order to properly reason about function fusion. This PR refactors the prepare hooks so that the first_call versions are separate functions. This ensures that the logic to record the DAG (the first_call functions) is separate from the logic used at inference.

Test Plan:
```
python test/test_quantization.py -k DBR
```

Reviewed By: albanD
Differential Revision: D33588558
Pulled By: vkuzo
fbshipit-source-id: 7b27ee4e5b64f26bfb082ca2bbf2c04894bd2a97
(cherry picked from commit 21fa635)
Summary: Pull Request resolved: #71551

This PR makes DBR quant record the DAG of non-quantizeable ops. Having this will enable us to analyze the entire traced graph of pytorch ops, regardless of whether they support quantization or not. That, in turn, will enable analysis of the uses of each op, allowing us to safely determine whether a subgraph of ops can be fused or not. In future PRs, this functionality will be used to implement function fusion.

Test Plan:
```
python test/test_quantization.py -k DBR
```

Reviewed By: jerryzh168
Differential Revision: D33684130
Pulled By: vkuzo
fbshipit-source-id: 497d9882f0670a36eef2a0900ea2517c82addf66
(cherry picked from commit b0d48a8)
Summary: Pull Request resolved: #71764

For DBR quant, adds the code for matching seen ops to function fusion patterns. Once we have the full DAG, we do a separate pass over it and add matched fusion patterns to the seen-op data structure. This is the first PR in the stack, implementing matching and recording the match results. Future PRs in this stack will use the match results to modify observer insertion and inference.

Test Plan:
```
python test/test_quantization.py TestQuantizeDBR.test_fusion_functions
```

Reviewed By: jerryzh168
Differential Revision: D33775098
Pulled By: vkuzo
fbshipit-source-id: 488aac902bf568d41c863ee49248990411ed9c53
(cherry picked from commit 4ad1ca1)
Summary: Pull Request resolved: #71780

Adds support for matching operator.add -> torch.relu in FX graph mode quantization. It would be nice to support torch.relu better in general, but that is saved for a future PR to keep PRs small. This is useful for DBR quant because we have some test cases in DBR quant which use add-relu, and we'd like to match them to FX.

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_add_relu
python test/test_quantization.py TestQuantizeFxOps.test_mul_relu
```

Reviewed By: jerryzh168
Differential Revision: D33775096
Pulled By: vkuzo
fbshipit-source-id: 889d9b41d3758ecbbb6d7eab67f64ce3d4892d24
(cherry picked from commit c1f9f38)
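A hedged sketch of exercising the add -> relu pattern through FX graph mode quantization (the exact entry points have shifted across PyTorch releases; this follows the `torch.ao.quantization.quantize_fx` variant that takes `example_inputs`):

```python
import operator
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx

class AddRelu(torch.nn.Module):
    def forward(self, x, y):
        # operator.add followed by torch.relu -- the pattern being matched.
        return torch.relu(operator.add(x, y))

model = AddRelu().eval()
example_inputs = (torch.randn(1, 4), torch.randn(1, 4))

prepared = prepare_fx(model, get_default_qconfig_mapping("fbgemm"), example_inputs)
prepared(*example_inputs)   # calibrate with representative data
quantized = convert_fx(prepared)
print(quantized.graph)      # add + relu should appear as one quantized pattern
```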
…ence (#71781)

Summary: Pull Request resolved: #71781

The previous PR added information about fusions found in the subgraphs. This PR uses that information for:
1. inserting observers at the end of fusions and not in the middle;
2. during inference, replacing the original op with the fused op. The way this is implemented is that the base op is replaced with the fused op, and all other ops are replaced with identity functions.

Test Plan:
```
python test/test_quantization.py TestQuantizeDBR.test_fusion_functions
```

Reviewed By: jerryzh168
Differential Revision: D33775097
Pulled By: vkuzo
fbshipit-source-id: 12249b85b2f7ba7545a54872aeb5f1ff2fc928cf
(cherry picked from commit 0db4324)
Summary: Pull Request resolved: #72344

ATen core is mostly compliant already, so we can just add the flag to the build system. The only exception is interned strings, which include symbols like `aten::add` generated for each operator.

Test Plan: Imported from OSS

Reviewed By: jbschlosser
Differential Revision: D34010820
Pulled By: albanD
fbshipit-source-id: ef1a625d96f30457b5e6beffc5e630516e54f9b4
(cherry picked from commit b90c262)
Summary: Pull Request resolved: #72284

This update adds the prod op to the fx2trt tool, which is used to create a TensorRT engine for a PyTorch model.

Test Plan: A new unit test was added to check that the op was added to the acc tracer. This test can be run using the following command:
buck test --debug //caffe2/test:test_fx_acc_tracer -- --exact 'caffe2/test:test_fx_acc_tracer - test_prod (fx_acc.test_acc_tracer.AccTracerTest)'

A new suite of unit tests was also added for the conversion to TensorRT and can be run using the following command:
buck test mode/dev-nosan //caffe2/test/fx2trt/converters:test_prod

Please note that, unlike other PyTorch reduce ops such as sum, the PyTorch prod function does not support reducing more than one dimension at a time (the dim arg cannot be a tuple; only a single int is acceptable for prod). Therefore prod cannot reuse all of the reduce_op code.

https://pxl.cl/1Xpn8
https://pxl.cl/1Xpn9

Reviewed By: 842974287
Differential Revision: D33875336
fbshipit-source-id: f9340db3685d681b1cf4ffc3b9fd25d16914e231
(cherry picked from commit cfe48d3)
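A small illustration of the limitation noted above: `torch.sum` accepts a tuple of dims, while `torch.prod` reduces a single dim per call, so a multi-dimension reduction must be chained:

```python
import torch

x = torch.randn(2, 3, 4)

# torch.sum can reduce several dimensions in one call.
print(torch.sum(x, dim=(0, 1)).shape)   # torch.Size([4])

# torch.prod only accepts a single int dim, so reducing two dimensions
# requires chaining calls -- as the fx2trt converter must also do.
print(torch.prod(torch.prod(x, dim=0), dim=0).shape)  # torch.Size([4])
```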
Pull Request resolved: #72372
Summary: Pull Request resolved: #71129

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7

Test Plan: Imported from OSS

Reviewed By: jbschlosser
Differential Revision: D34012577
Pulled By: anjali411
fbshipit-source-id: 02d2f2d761f7c9332e2f3cc529e8f1c6b60d7da2
(cherry picked from commit 87318a2)
Summary: This reverts the previous PR and adds some comments to make the intent clear. It also removes some extra static_asserts that are not needed (at least for the compilers I tried).

Pull Request resolved: #72336
Reviewed By: r-barnes
Differential Revision: D34006722
Pulled By: albanD
fbshipit-source-id: 290fb89a2d2c66a0d1c3651198b31d21216ec230
(cherry picked from commit 76f0aaa)
Summary: Pull Request resolved: #70030

`range_push` and `range_pop` do not support multi-threading: they only work when the push and the pop happen on the same thread. For process-level ranges, we should use `range_start` and `range_end`. This matters because PyTorch's forward pass runs on one thread while autograd runs on a different thread. See the NVIDIA implementation documentation: https://github.com/nvpro-samples/shared_external/blob/cab2dec7608ebc9d36fb086a07ce5112700b089d/NSight/nvToolsExt.h#L397-L407

Test Plan:
```
buck test caffe2/test:cuda
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/8162774391483460
✓ ListingSuccess: caffe2/test:cuda - main (19.640)
Summary
ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/8162774391483460
```

Reviewed By: malfet
Differential Revision: D33155244
fbshipit-source-id: c7d5143f6da9b6ef0e0811e2fcae03a3e76f24de
(cherry picked from commit 22134e9)
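A hedged sketch of the cross-thread pattern this enables, assuming `torch.cuda.nvtx` exposes the `range_start`/`range_end` pair described in the summary (requires a CUDA build; the range is visible in Nsight): the handle returned on the forward thread can be closed from a backward hook running on the autograd thread, which `range_push`/`range_pop` cannot express.

```python
import torch

model = torch.nn.Linear(8, 8).cuda()
x = torch.randn(4, 8, device="cuda")

# range_start returns a handle that range_end may consume on a different
# thread -- here, the autograd engine's thread via a backward hook.
handle = torch.cuda.nvtx.range_start("forward+backward")

out = model(x).sum()
out.register_hook(lambda grad: torch.cuda.nvtx.range_end(handle))
out.backward()  # the hook fires on the autograd thread and ends the range
```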
💊 CI failures summary and remediations. As of commit 8aabaa5 (more details on the Dr. CI page):

🕵️ 1 new failure recognized by patterns. The following CI failures do not appear to be due to upstream breakages:
| Job | Step | Action |
|---|---|---|
| | Assert that regenerating the workflows didn't change them | 🔁 rerun |
| | Run mypy | 🔁 rerun |
Krovatkin approved these changes on Feb 15, 2022.
Labels
- cla signed
- module: fx
- module: rocm (AMD GPU support for Pytorch)
- oncall: distributed (Add this issue/PR to distributed oncall triage queue)
- oncall: jit (Add this issue/PR to JIT oncall triage queue)
No description provided.