
[quant][graphmode][fx] Add support for dynamic quant for RNN and RNNCell #49126

Status: Closed (wants to merge 3 commits)

Conversation

jerryzh168 (Contributor) commented Dec 9, 2020

Stack from ghstack:

Summary:
Add support for dynamic quantization of RNN and RNNCell modules in FX graph mode quantization.

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_rnn
python test/test_quantization.py TestQuantizeFxOps.test_rnn_cell

Differential Revision: [D25449047](https://our.internmc.facebook.com/intern/diff/D25449047)
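
For readers unfamiliar with the flow this PR extends, here is a minimal sketch of dynamic quantization of a recurrent module through the FX graph mode API. It is not code from this PR: nn.LSTM, the shapes, and the qconfig_dict layout are illustrative assumptions, and it targets the PyTorch version contemporaneous with this PR (newer releases also require an example_inputs argument to prepare_fx).

# Minimal sketch (not from this PR): dynamic quantization of a recurrent
# module with the FX graph mode API. This PR extends the same flow to the
# RNN and RNNCell variants; nn.LSTM is used here only as an illustration.
import torch
import torch.nn as nn
from torch.quantization import default_dynamic_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=8, hidden_size=8, num_layers=1)

    def forward(self, x):
        out, _ = self.lstm(x)
        return out

model = Model().eval()
# Apply the dynamic qconfig only to the recurrent module type.
qconfig_dict = {"object_type": [(nn.LSTM, default_dynamic_qconfig)]}
prepared = prepare_fx(model, qconfig_dict)   # trace the model and mark the LSTM for dynamic quant
quantized = convert_fx(prepared)             # swap in the dynamically quantized module
print(quantized(torch.randn(5, 3, 8)).shape)

The reviewed test excerpt further down in this conversation follows the same prepare_fx/convert_fx pattern.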

jerryzh168 added a commit that referenced this pull request Dec 9, 2020
ghstack-source-id: 810a87cfb19308fa036220936e456df46c5714f5
Pull Request resolved: #49126
dr-ci bot commented Dec 9, 2020

💊 CI failures summary and remediations

As of commit ff4353e (more details on the Dr. CI page):


  • 2/4 failures possibly introduced in this PR
    • 1/2 non-CircleCI failure(s)
  • 2/4 broken upstream at merge base e5a98c5 on Dec 09 from 8:37am to 6:41pm

🕵️ 3 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test1 (1/3)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Dec 10 22:07:54 [E request_callback_no_python.cpp:636] Received error while processing request type 258: RuntimeError: Can not pickle torch.futures.Future
Dec 10 22:07:54 At:
Dec 10 22:07:54   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(120): serialize
Dec 10 22:07:54   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(172): serialize
(the same error is logged twice more)
Dec 10 22:07:54 [W tensorpipe_agent.cpp:547] RPC agent for worker2 encountered error when reading incoming request from worker0: EOF: end of file (this is expected to happen during shutdown)
Dec 10 22:07:54 [W tensorpipe_agent.cpp:547] RPC agent for worker0 encountered error when reading incoming request from worker3: EOF: end of file (this is expected to happen during shutdown)
Dec 10 22:07:54 [W tensorpipe_agent.cpp:547] RPC agent for worker0 encountered error when reading incoming request from worker2: EOF: end of file (this is expected to happen during shutdown)
Dec 10 22:07:55 ok (1.227s)
Dec 10 22:07:56   test_return_future_remote (__main__.TensorPipeRpcTestWithSpawn) ... [W tensorpipe_agent.cpp:547] RPC agent for worker1 encountered error when reading incoming request from worker0: EOF: end of file (this is expected to happen during shutdown)

See CircleCI build pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test2 (2/3)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Dec 10 21:38:14 [E request_callback_no_python.cpp:636] Received error while processing request type 258: RuntimeError: Can not pickle torch.futures.Future
Dec 10 21:38:14 At:
Dec 10 21:38:14   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(120): serialize
Dec 10 21:38:14   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(172): serialize
(the same error is logged twice more)
Dec 10 21:38:14 ok (1.026s)
Dec 10 21:38:15   test_return_future_remote (__main__.ProcessGroupRpcTestWithSpawn) ... RPC was initialized with the PROCESS_GROUP backend which is deprecated and slated to be removed and superseded by the TENSORPIPE backend. It is recommended to migrate to the TENSORPIPE backend.
(the same deprecation warning is logged three more times)

See CircleCI build pytorch_linux_xenial_py3_clang7_onnx_build (3/3)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

Dec 10 19:31:13 sccache: error: couldn't connect to server
Dec 10 19:31:13 +++ eval 'extract_trap_cmd '
Dec 10 19:31:13 ++++ extract_trap_cmd
Dec 10 19:31:13 ++++ printf '%s\n' ''
Dec 10 19:31:13 +++ printf '%s\n' cleanup
Dec 10 19:31:13 ++ trap -- '
Dec 10 19:31:13 cleanup' EXIT
Dec 10 19:31:13 ++ [[ pytorch-linux-xenial-py3-clang7-onnx-build != *pytorch-win-* ]]
Dec 10 19:31:13 ++ which sccache
Dec 10 19:31:13 ++ sccache --stop-server
Dec 10 19:31:13 Stopping sccache server...
Dec 10 19:31:13 sccache: error: couldn't connect to server
Dec 10 19:31:13 sccache: caused by: Connection refused (os error 111)
Dec 10 19:31:13 ++ true
Dec 10 19:31:13 ++ rm /var/lib/jenkins/sccache_error.log
Dec 10 19:31:13 rm: cannot remove '/var/lib/jenkins/sccache_error.log': No such file or directory
Dec 10 19:31:13 ++ true
Dec 10 19:31:13 ++ [[ pytorch-linux-xenial-py3-clang7-onnx-build == *rocm* ]]
Dec 10 19:31:13 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log
Dec 10 19:31:13 ++ SCCACHE_IDLE_TIMEOUT=1200
Dec 10 19:31:13 ++ RUST_LOG=sccache::server=error
🚧 2 fixed upstream failures: These were probably caused by upstream breakages that were already fixed.
Please rebase on the viable/strict branch.

If your commit is older than viable/strict, run these commands:

git fetch https://github.com/pytorch/pytorch viable/strict
git rebase FETCH_HEAD

Check out the recency history of this "viable master" tracking branch.



Review thread on the new test code (excerpt):

}
model_graph = prepare_fx(model_graph, graph_qconfig_dict)
model_graph = convert_fx(model_graph)
self.assertEqual(model_eager(sample_input), model_graph(sample_input))
Contributor:

Do we need to test for serialization here?

jerryzh168 (author):

This should be the same as the eager mode module. I'm not very familiar with it; are we using state_dict?

jerryzh168 (author):

Or are you referring to checkScriptable?

jerryzh168 (author):

Added checkScriptable here, but in general we'll do the end-to-end tests in TestQuantizeFxModels.
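
For context on the check being discussed, here is a small self-contained sketch of scripting a converted FX model. checkScriptable is PyTorch's test-utility helper and its exact signature is not shown in this thread, so torch.jit.script is used directly, with a plain dynamically quantized nn.Linear standing in for the RNN case (and the era's two-argument prepare_fx).

# Rough sketch of the scriptability check discussed above (checkScriptable is
# a test-utility helper; torch.jit.script is used directly here instead).
import torch
import torch.nn as nn
from torch.quantization import default_dynamic_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

m = nn.Sequential(nn.Linear(4, 4)).eval()
qm = convert_fx(prepare_fx(m, {"": default_dynamic_qconfig}))
scripted = torch.jit.script(qm)   # raises if the converted GraphModule is not scriptable
x = torch.randn(2, 4)
print(torch.allclose(scripted(x), qm(x)))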

@@ -124,6 +124,19 @@ def get_static_quant_module_class(float_module_class, additional_static_quant_ma
" does not have a corresponding quantized module class"
return static_quant_module_class

def get_dynamic_quant_module_class(float_module_class, additional_dynamic_quant_mapping=None):
Contributor:

It would be great to add types to the function I/O.

jerryzh168 (author):

I think we can add it in a separate PR; none of the other functions in this file are typed yet.

Contributor:

It's not blocking this PR, but it would be great if we started adding these as we go, at least for function I/O. We don't have to wait for a file to already have type annotations before adding more. This also distributes the cost of adding them across everyone, rather than putting it on one person.

jerryzh168 (author):

Yeah, I fully agree that we should add types as we change code. I'm saying I plan to add them in a separate PR; or are you suggesting adding the type annotations for the functions in this file in this PR?
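
To make the suggestion concrete, here is a hypothetical illustration of typed function I/O for the helper shown in the hunk above; the specific annotation choices are assumptions, not the PR's actual code.

# Hypothetical typed signature for illustration only; the real function body
# and exact types live in torch/quantization and are not reproduced here.
from typing import Any, Callable, Dict, Optional

def get_dynamic_quant_module_class(
    float_module_class: Callable[..., Any],
    additional_dynamic_quant_mapping: Optional[Dict[Callable[..., Any], Any]] = None,
) -> Any:
    ...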

facebook-github-bot (Contributor):

This pull request has been merged in 882eb0f.

facebook-github-bot deleted the gh/jerryzh168/518/head branch December 14, 2020 15:17