[quant][core][gpu][bug fix] Added clone and contiguous() to broadcasted_bias tensor in quantized cudnn linear op #75944
Conversation
CI failures summary and remediations, as of commit 18948a3 (more details on the Dr. CI page):
1 new failure recognized by patterns. The following CI failures do not appear to be due to upstream breakages.
@dzdang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
broadcast_to seems to be the same as expand (https://pytorch.org/docs/stable/generated/torch.broadcast_to.html?highlight=broadcast_to#torch.broadcast_to). I feel expand might be slightly more popular than broadcast_to; maybe we can use that.
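For context on why the two calls are interchangeable here: both broadcast_to and expand return a zero-stride view over the original storage rather than copying data. A minimal sketch of that stride trick, using NumPy as a stand-in for the torch semantics (the shapes here are illustrative, not from the PR):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

bias = np.arange(4, dtype=np.float32)            # pretend 1-D bias, shape (4,)

# broadcast_to: view of bias replicated along a new leading dim
b1 = np.broadcast_to(bias, (3, 4))

# expand-style equivalent: same effect, done explicitly with a zero stride
# along the replicated dimension, so every "row" aliases the same memory
b2 = as_strided(bias, shape=(3, 4), strides=(0, bias.strides[0]))

assert (b1 == b2).all()
assert b1.strides == b2.strides == (0, 4)        # zero stride = no copy
```

Either spelling produces the same aliasing view, which is exactly why the PR needs clone/contiguous afterwards regardless of which one is used.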
@dzdang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@pytorchbot merge this (Initiating merge automatically since Phabricator Diff has merged)
Pull Request resolved: #75944 Reviewed By: jerryzh168 Differential Revision: D35717355 Pulled By: dzdang fbshipit-source-id: bc5e47666e4d0a8e1a544a094008520a290e5d25
Pull Request resolved: #75944 Approved by: https://github.com/jerryzh168 (cherry picked from commit 381e725)
Stack from ghstack (oldest at bottom):
Summary:
The previous implementation of broadcasted_bias in the quantized cudnn linear op has two issues:
1) broadcasted_bias is a view of the input bias tensor. This is not desired, as any modification to broadcasted_bias is also applied to the input bias. To remedy this, we clone the input bias tensor.
2) Calling broadcast_to doesn't affect the underlying storage, which is problematic for the cudnn operations. We need a fully materialized broadcasted tensor, rather than a view (which is what broadcast_to returns). To remedy this, we call contiguous().
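Both issues come down to broadcast views sharing storage with their base tensor. The effect can be sketched with NumPy, whose broadcast_to has analogous view semantics to the torch call (the shapes and values here are illustrative, not taken from the op):

```python
import numpy as np

bias = np.arange(4, dtype=np.float32)        # pretend 1-D bias, shape (4,)
view = np.broadcast_to(bias, (3, 4))         # view, no new storage allocated

# Issue 2: the broadcast dim has stride 0, so the result is not contiguous
# and all "rows" point at the same underlying memory.
assert view.strides[0] == 0
assert not view.flags['C_CONTIGUOUS']

# Issue 1: the view aliases the base, so changes to bias show through it.
bias[0] = 100.0
assert view[2, 0] == 100.0

# The fix: materialize a real (3, 4) buffer, analogous to the PR's
# clone() + contiguous(). NumPy's .copy() does both at once here.
materialized = np.broadcast_to(bias, (3, 4)).copy()
bias[0] = -1.0                               # later edits no longer leak in
assert materialized[2, 0] == 100.0
assert materialized.flags['C_CONTIGUOUS']
```

After the copy, the buffer is dense and independent, which is what the cudnn kernels require of the bias operand.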
Test plan:
python test/test_quantization.py -k test_linear_cudnn
Differential Revision: D35717355