@durumu
Contributor

@durumu durumu commented Jun 24, 2020

This diff adds FakeQuantizeWithBackward. It behaves like the regular FakeQuantize module, enabling quantization-aware training (QAT) by fake quantizing in the forward pass, but takes an additional quantize_backward parameter. When quantize_backward is enabled, the gradients are fake quantized as well (dynamically, using hard-coded values). This lets the user check whether quantizing the gradients would cause a significant loss of accuracy in their model.
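As a rough illustration of what "dynamically fake quantizing" a gradient could mean, here is a minimal pure-Python sketch. It is not the code from this diff: the helper name `fake_quantize`, the signed 8-bit range, and the min/max range selection are all assumptions made for the example, and the actual hard-coded values used by FakeQuantizeWithBackward are not stated here.

```python
def fake_quantize(values, quant_min=-128, quant_max=127):
    """Quantize to an integer grid, then dequantize back to floats.

    Hypothetical helper for illustration only. The range is chosen
    "dynamically" from the min/max of the values passed in, and the
    int8 limits are an assumption, not taken from the PR.
    """
    # Make sure the representable range always contains zero.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:
        # All values are zero, so there is nothing to quantize.
        return list(values)

    scale = (hi - lo) / (quant_max - quant_min)
    zero_point = round(quant_min - lo / scale)
    zero_point = max(quant_min, min(quant_max, zero_point))

    out = []
    for v in values:
        q = round(v / scale) + zero_point          # map to the integer grid
        q = max(quant_min, min(quant_max, q))      # clamp to the int8 range
        out.append((q - zero_point) * scale)       # map back to float
    return out


# Stand-in for a gradient tensor flowing through the backward pass.
grad = [-1.0, 0.0, 0.5, 1.0]
quantized_grad = fake_quantize(grad)
print(quantized_grad)
```

The output stays in floating point but only takes values on the quantization grid, so comparing training runs with and without this step gives an estimate of the accuracy cost of quantized gradients.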

Stack from ghstack:

Differential Revision: D22217029

durumu added a commit that referenced this pull request Jun 24, 2020
ghstack-source-id: dae1f4e
Pull Request resolved: #40532
@dr-ci

dr-ci bot commented Jun 24, 2020

💊 CI failures summary and remediations

As of commit b9993f4 (more details on the Dr. CI page):


  • 8/8 failures possibly* introduced in this PR
    • 1/8 non-CircleCI failure(s)

🕵️ 7 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages (reran 1 job to discount flakiness):

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_ge_config_simple_test (1/7)

Step: "Run tests" <confirmed not flaky by 2 failures>

Jul 29 22:04:01 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function \'int main()\':\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected \';\' before \'}\' token\n int main() { return 0 }\n ^\n" }
Jul 29 22:04:01 Traceback (most recent call last): 
Jul 29 22:04:01   File "test/run_test.py", line 752, in <module> 
Jul 29 22:04:01     main() 
Jul 29 22:04:01   File "test/run_test.py", line 741, in main 
Jul 29 22:04:01     raise RuntimeError(err) 
Jul 29 22:04:01 RuntimeError: test_type_hints failed! 
Jul 29 22:04:01 + cleanup 
Jul 29 22:04:01 + retcode=1 
Jul 29 22:04:01 + set +x 
Jul 29 22:04:01 =================== sccache compilation log =================== 
Jul 29 22:04:01 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function \'int main()\':\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected \';\' before \'}\' token\n int main() { return 0 }\n                       ^\n" } 
Jul 29 22:04:01  
Jul 29 22:04:01 =========== If your build fails, please take a look at the log above for possible reasons =========== 
Jul 29 22:04:01 Compile requests                 0 
Jul 29 22:04:01 Compile requests executed        0 
Jul 29 22:04:01 Cache hits                       0 
Jul 29 22:04:01 Cache misses                     0 
Jul 29 22:04:01 Cache timeouts                   0 
Jul 29 22:04:01 Cache read errors                0 
Jul 29 22:04:01 Forced recaches                  0 
Jul 29 22:04:01 Cache write errors               0 

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_test (2/7)

Step: "Run tests" <confirmed not flaky by 2 failures>

Jul 29 22:06:58 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function \'int main()\':\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected \';\' before \'}\' token\n int main() { return 0 }\n ^\n" }
Jul 29 22:06:58 Traceback (most recent call last): 
Jul 29 22:06:58   File "test/run_test.py", line 752, in <module> 
Jul 29 22:06:58     main() 
Jul 29 22:06:58   File "test/run_test.py", line 741, in main 
Jul 29 22:06:58     raise RuntimeError(err) 
Jul 29 22:06:58 RuntimeError: test_type_hints failed! 
Jul 29 22:06:58 =================== sccache compilation log =================== 
Jul 29 22:06:58 + cleanup 
Jul 29 22:06:58 + retcode=1 
Jul 29 22:06:58 + set +x 
Jul 29 22:06:58 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function \'int main()\':\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected \';\' before \'}\' token\n int main() { return 0 }\n                       ^\n" } 
Jul 29 22:06:58  
Jul 29 22:06:58 =========== If your build fails, please take a look at the log above for possible reasons =========== 
Jul 29 22:06:58 Compile requests                 0 
Jul 29 22:06:58 Compile requests executed        0 
Jul 29 22:06:58 Cache hits                       0 
Jul 29 22:06:58 Cache misses                     0 
Jul 29 22:06:58 Cache timeouts                   0 
Jul 29 22:06:58 Cache read errors                0 
Jul 29 22:06:58 Forced recaches                  0 
Jul 29 22:06:58 Cache write errors               0 

See CircleCI build pytorch_linux_bionic_py3_8_gcc9_test (3/7)

Step: "Run tests" <confirmed not flaky by 2 failures>

Jul 29 22:01:16 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function ‘int main()’:\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:22: error: expected ‘;’ before ‘}’ token\n 2 | int main() { return 0 }\n | ^~\n | ;\n" }
Jul 29 22:01:16     raise RuntimeError(err) 
Jul 29 22:01:16 RuntimeError: test_type_hints failed! 
Jul 29 22:01:16  
Jul 29 22:01:16 real	28m22.749s 
Jul 29 22:01:16 user	38m6.077s 
Jul 29 22:01:16 sys	1m20.367s 
Jul 29 22:01:16 + cleanup 
Jul 29 22:01:16 + retcode=1 
Jul 29 22:01:16 + set +x 
Jul 29 22:01:16 =================== sccache compilation log =================== 
Jul 29 22:01:16 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function ‘int main()’:\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:22: error: expected ‘;’ before ‘}’ token\n    2 | int main() { return 0 }\n      |                      ^~\n      |                      ;\n" } 
Jul 29 22:01:16  
Jul 29 22:01:16 =========== If your build fails, please take a look at the log above for possible reasons =========== 
Jul 29 22:01:16 Compile requests                 65 
Jul 29 22:01:16 Compile requests executed        35 
Jul 29 22:01:16 Cache hits                       27 
Jul 29 22:01:16 Cache misses                      7 
Jul 29 22:01:16 Cache timeouts                    0 
Jul 29 22:01:16 Cache read errors                 0 
Jul 29 22:01:16 Forced recaches                   0 
Jul 29 22:01:16 Cache write errors                0 

See CircleCI build pytorch_linux_xenial_py3_clang5_asan_test2 (4/7)

Step: "Run tests" <confirmed not flaky by 2 failures>

Jul 29 21:30:06 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/ATen/Utils.cpp:11:3 in
Jul 29 21:30:06     #7 0x55bdf8f0f7eb in PyEval_EvalCode /tmp/build/80754af9/python_1588903631989/work/Python/ceval.c:731 
Jul 29 21:30:06     #8 0x55bdf8f8fe73 in run_mod /tmp/build/80754af9/python_1588903631989/work/Python/pythonrun.c:1025 
Jul 29 21:30:06     #9 0x55bdf8f8ff0c in PyRun_StringFlags /tmp/build/80754af9/python_1588903631989/work/Python/pythonrun.c:949 
Jul 29 21:30:06     #10 0x55bdf8f8ff6e in PyRun_SimpleStringFlags /tmp/build/80754af9/python_1588903631989/work/Python/pythonrun.c:445 
Jul 29 21:30:06     #11 0x55bdf8f93d72 in run_command /tmp/build/80754af9/python_1588903631989/work/Modules/main.c:301 
Jul 29 21:30:06     #12 0x55bdf8f93d72 in Py_Main /tmp/build/80754af9/python_1588903631989/work/Modules/main.c:749 
Jul 29 21:30:06     #13 0x55bdf8e5df2d in main /tmp/build/80754af9/python_1588903631989/work/Programs/python.c:69 
Jul 29 21:30:06     #14 0x7f2559a2483f in __libc_start_main /build/glibc-e6zv40/glibc-2.23/csu/../csu/libc-start.c:291 
Jul 29 21:30:06     #15 0x55bdf8f3d27e in _start /home/rdonnelly/mc/conda-bld/compilers_linux-64_1534865402226/work/.build/src/glibc-2.12.2/csu/../sysdeps/x86_64/elf/start.S:103 
Jul 29 21:30:06  
Jul 29 21:30:06 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/ATen/Utils.cpp:11:3 in  
Jul 29 21:30:06 + retcode=1 
Jul 29 21:30:06 + set -e 
Jul 29 21:30:06 + return 1 
Jul 29 21:30:06 + [[ pytorch-linux-xenial-py3-clang5-asan-test2 == *-NO_AVX-* ]] 
Jul 29 21:30:06 + [[ pytorch-linux-xenial-py3-clang5-asan-test2 == *-NO_AVX2-* ]] 
Jul 29 21:30:06 + '[' -n '' ']' 
Jul 29 21:30:06 + [[ pytorch-linux-xenial-py3-clang5-asan-test2 == *tbb* ]] 
Jul 29 21:30:06 + [[ pytorch-linux-xenial-py3-clang5-asan-test2 == *libtorch* ]] 
Jul 29 21:30:06 + [[ pytorch-linux-xenial-py3-clang5-asan-test2 == *-bazel-* ]] 
Jul 29 21:30:06 + cd test 

See CircleCI build pytorch_linux_bionic_py3_7_conda_test (5/7)

Step: "Run tests" <confirmed not flaky by 2 failures>

Jul 29 21:59:08 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function ‘int main()’:\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected ‘;’ before ‘}’ token\n int main() { return 0 }\n ^\n" }
Jul 29 21:59:07     raise RuntimeError(err) 
Jul 29 21:59:07 RuntimeError: test_type_hints failed! 
Jul 29 21:59:08  
Jul 29 21:59:08 real	29m6.833s 
Jul 29 21:59:08 user	38m25.963s 
Jul 29 21:59:08 sys	1m19.189s 
Jul 29 21:59:08 + cleanup 
Jul 29 21:59:08 + retcode=1 
Jul 29 21:59:08 + set +x 
Jul 29 21:59:08 =================== sccache compilation log =================== 
Jul 29 21:59:08 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function ‘int main()’:\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected ‘;’ before ‘}’ token\n int main() { return 0 }\n                       ^\n" } 
Jul 29 21:59:08  
Jul 29 21:59:08 =========== If your build fails, please take a look at the log above for possible reasons =========== 
Jul 29 21:59:08 Compile requests                 0 
Jul 29 21:59:08 Compile requests executed        0 
Jul 29 21:59:08 Cache hits                       0 
Jul 29 21:59:08 Cache misses                     0 
Jul 29 21:59:08 Cache timeouts                   0 
Jul 29 21:59:08 Cache read errors                0 
Jul 29 21:59:08 Forced recaches                  0 
Jul 29 21:59:08 Cache write errors               0 

See CircleCI build pytorch_linux_bionic_py3_6_clang9_test (6/7)

Step: "Run tests" <confirmed not flaky by 2 failures>

Jul 29 21:57:41 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function \'int main()\':\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected \';\' before \'}\' token\n int main() { return 0 }\n ^\n" }
Jul 29 21:57:41     raise RuntimeError(err) 
Jul 29 21:57:41 RuntimeError: test_type_hints failed! 
Jul 29 21:57:41  
Jul 29 21:57:41 real	28m34.372s 
Jul 29 21:57:41 user	41m11.053s 
Jul 29 21:57:41 sys	2m31.110s 
Jul 29 21:57:41 + cleanup 
Jul 29 21:57:41 + retcode=1 
Jul 29 21:57:41 + set +x 
Jul 29 21:57:41 =================== sccache compilation log =================== 
Jul 29 21:57:41 ERROR:sccache::server: Compilation failed: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp: In function \'int main()\':\n/var/lib/jenkins/.cache/torch_extensions/test_compilation_error_formatting/main.cpp:2:23: error: expected \';\' before \'}\' token\n int main() { return 0 }\n                       ^\n" } 
Jul 29 21:57:41  
Jul 29 21:57:41 =========== If your build fails, please take a look at the log above for possible reasons =========== 
Jul 29 21:57:41 Compile requests                 65 
Jul 29 21:57:41 Compile requests executed        35 
Jul 29 21:57:41 Cache hits                       27 
Jul 29 21:57:41 Cache misses                      7 
Jul 29 21:57:41 Cache timeouts                    0 
Jul 29 21:57:41 Cache read errors                 0 
Jul 29 21:57:41 Forced recaches                   0 
Jul 29 21:57:41 Cache write errors                0 

See CircleCI build pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test (7/7)

Step: "Run tests" <confirmed not flaky by 2 failures>

Jul 29 22:23:02 ConnectionResetError: [Errno 104] Connection reset by peer
Jul 29 22:23:02   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 494, in Client 
Jul 29 22:23:02     deliver_challenge(c, authkey) 
Jul 29 22:23:02   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 722, in deliver_challenge 
Jul 29 22:23:02     response = connection.recv_bytes(256)        # reject large message 
Jul 29 22:23:02   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes 
Jul 29 22:23:02     buf = self._recv_bytes(maxlength) 
Jul 29 22:23:02   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes 
Jul 29 22:23:02     buf = self._recv(4) 
Jul 29 22:23:02   File "/opt/conda/lib/python3.6/multiprocessing/connection.py", line 379, in _recv 
Jul 29 22:23:02     chunk = read(handle, remaining) 
Jul 29 22:23:02 ConnectionResetError: [Errno 104] Connection reset by peer 
Jul 29 22:23:02  
Jul 29 22:23:02 Process ErrorTrackingProcess-150: 
Jul 29 22:23:02 Traceback (most recent call last): 
Jul 29 22:23:02   File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap 
Jul 29 22:23:02     self.run() 
Jul 29 22:23:02   File "/var/lib/jenkins/workspace/test/test_dataloader.py", line 361, in run 
Jul 29 22:23:02     super(ErrorTrackingProcess, self).run() 
Jul 29 22:23:02   File "/opt/conda/lib/python3.6/multiprocessing/process.py", line 93, in run 
Jul 29 22:23:02     self._target(*self._args, **self._kwargs) 
Jul 29 22:23:02   File "/var/lib/jenkins/workspace/test/test_dataloader.py", line 629, in _test_proper_exit 

ci.pytorch.org: 1 failed


This comment has been revised 44 times.

durumu added a commit that referenced this pull request Jun 26, 2020
ghstack-source-id: 9a8d614
Pull Request resolved: #40532
durumu added a commit that referenced this pull request Jul 14, 2020
ghstack-source-id: f7c00c9
Pull Request resolved: #40532

Adds unit test for validating correctness

ghstack-source-id: f7c00c9
Pull Request resolved: #40715
durumu added a commit that referenced this pull request Jul 21, 2020
ghstack-source-id: f185ce9
Pull Request resolved: #40532

Adds unit test for validating correctness

ghstack-source-id: f185ce9
Pull Request resolved: #40715
@facebook-github-bot
Contributor

@durumu merged this pull request in 48e978b.

@mruberry
Collaborator

mruberry commented Aug 8, 2020

Unlanding. This broke multiple OSS CI builds. Example failure:

Aug 08 01:52:40 FAIL [52.052s]: test_run_mypy (__main__.TestTypeHints)
Aug 08 01:52:40 Runs mypy over all files specified in mypy.ini
Aug 08 01:52:40 ----------------------------------------------------------------------
Aug 08 01:52:40 Traceback (most recent call last):
Aug 08 01:52:40   File "test_type_hints.py", line 218, in test_run_mypy
Aug 08 01:52:40     self.fail("mypy failed: {}".format(stdout))
Aug 08 01:52:40 AssertionError: mypy failed: torch/quantization/qconfig.py:128: error: Module has no attribute "per_tensor_symmetric"  [attr-defined]
Aug 08 01:52:40 Found 1 error in 1 file (checked 1037 source files)
Aug 08 01:52:40 
Aug 08 01:52:40 
Aug 08 01:52:40 ----------------------------------------------------------------------
Aug 08 01:52:40 Ran 4 tests in 85.186s
Aug 08 01:52:40 
Aug 08 01:52:40 FAILED (failures=1)

@facebook-github-bot facebook-github-bot deleted the gh/durumu/6/head branch August 11, 2020 14:16
MauiDesign pushed a commit to MauiDesign/PyTorchPyTorch that referenced this pull request Aug 16, 2020
ghstack-source-id: 331009d
Pull Request resolved: pytorch/pytorch#40532

Adds unit test for validating correctness

ghstack-source-id: 331009d
Pull Request resolved: pytorch/pytorch#40715