[reland][quant] QuantizedCUDA implementation #36936
Conversation
Summary: Closes #30813
1. Tensor quantization logic (quantize_*) is moved to aten/native/quantized. Previously, all tensor quantization logic lived in aten/quantized/Quantizer.cpp, which had become complicated and hard to read; that should be addressed in a follow-up refactoring PR. Still, I reworked this partially because I had to add tensor quantization logic for CUDA, and it was natural to move everything to aten/native/quantized.
2. The requirements to run CUDA_tensor_apply* were relaxed so it can process any tensor that lives on a CUDA device (QuantizedCUDA included).
3. All quantized data types now have a default constructor; NVCC refuses to compile any gpu_kernel or CUDA_tensor_apply* without them.
4. Minor changes in many files to register the QuantizedCUDA backend.
5. test_quantized_tensor is extended to cover the QuantizedCUDA backend where possible.
[ghstack-poisoned]
lg
@jerryzh168 Sorry, I haven't visited GitHub for some time. Unfortunately, I don't have Windows machines at hand. I can set up everything from scratch and investigate the problem, but that would take a day or so. Is it worth it?
I think I have fixed it, I'll land again.
@jerryzh168 merged this pull request in 97d3a84.
Unlanding. This appears to have broken CUDA 10 and, in some cases, CUDA 10.1 builds of PyTorch. While breaking CUDA 10 may be acceptable, how that's done should likely go through review, with notice provided to engineers still using CUDA 10 that they'll need to upgrade. The CUDA 10.1 issue is reported as blocking development by some engineers using devfairs. The cause and scope of this build issue are unclear at this time.
Summary: Closes #30813
Relanding of #35463
1. Tensor quantization logic (quantize_*) is moved to aten/native/quantized. Previously, all tensor quantization logic lived in aten/quantized/Quantizer.cpp, which had become complicated and hard to read; that should be addressed in a follow-up refactoring PR. Still, I reworked this partially because I had to add tensor quantization logic for CUDA, and it was natural to move everything to aten/native/quantized.
2. The requirements to run CUDA_tensor_apply* were relaxed so it can process any tensor that lives on a CUDA device (QuantizedCUDA included).
3. All quantized data types now have a default constructor; NVCC refuses to compile any gpu_kernel or CUDA_tensor_apply* without them.
4. Minor changes in many files to register the QuantizedCUDA backend.
5. test_quantized_tensor is extended to cover the QuantizedCUDA backend where possible.
[ghstack-poisoned]
Summary: Pull Request resolved: #37081
Closes #30813
Relanding of #35463
1. Tensor quantization logic (quantize_*) is moved to aten/native/quantized. Previously, all tensor quantization logic lived in aten/quantized/Quantizer.cpp, which had become complicated and hard to read; that should be addressed in a follow-up refactoring PR. Still, I reworked this partially because I had to add tensor quantization logic for CUDA, and it was natural to move everything to aten/native/quantized.
2. The requirements to run CUDA_tensor_apply* were relaxed so it can process any tensor that lives on a CUDA device (QuantizedCUDA included).
3. All quantized data types now have a default constructor; NVCC refuses to compile any gpu_kernel or CUDA_tensor_apply* without them.
4. Minor changes in many files to register the QuantizedCUDA backend.
5. test_quantized_tensor is extended to cover the QuantizedCUDA backend where possible.
Test Plan: Imported from OSS
Differential Revision: D21206694
Pulled By: jerryzh168
fbshipit-source-id: c7433aad9c095a34c57e6dddd128b5c5d9292373
Stack from ghstack:
Summary:
Closes #30813
Relanding of #35463
Differential Revision: D21143025