Change model transform fp8 linear op to fbgemm quantize ops
Summary: Switch the callers of the fp8/int8 quantized GEMM ops to the fbgemm versions.

Reviewed By: jiawenliu64

Differential Revision: D56685840

fbshipit-source-id: 466de7adc4d36a6d0d2005a6b9f2cdea724ca63b
jianyuh authored and facebook-github-bot committed Apr 29, 2024
1 parent 53e7da5 commit dff5bc2
Showing 1 changed file with 1 addition and 1 deletion.
fbgemm_gpu/experimental/gen_ai/src/quantize/quantize.cpp (1 addition, 1 deletion)

@@ -79,7 +79,7 @@ at::Tensor get_fp8_per_tensor_scale(
     c10::optional<at::Tensor> bs,
     c10::optional<at::Tensor> scale_ub); // scale upperbound

-TORCH_LIBRARY(fbgemm, m) {
+TORCH_LIBRARY_FRAGMENT(fbgemm, m) {
 #ifndef USE_ROCM
   // TODO: on AMD this throws "Undefined symbol" when loading
   // quantize_ops with
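
For context on the one-line change: TORCH_LIBRARY(ns, m) may define a given operator namespace only once in the whole program and fails at load time if the namespace is already registered, while TORCH_LIBRARY_FRAGMENT(ns, m) extends a namespace that may be defined elsewhere. Switching to the fragment form lets this file coexist with other translation units that also register ops under fbgemm, which is presumably why the caller switch in this diff required it. Below is a minimal sketch of the pattern; the second file and the operator schemas (the *_example names) are hypothetical, for illustration only, not the real fbgemm_gpu signatures.

    #include <torch/library.h>

    // quantize.cpp -- after this diff, registers its ops as a *fragment*
    // of the "fbgemm" namespace rather than claiming sole ownership of it.
    TORCH_LIBRARY_FRAGMENT(fbgemm, m) {
      // Illustrative schema only.
      m.def("f8f8bf16_example(Tensor XQ, Tensor WQ, Tensor scale) -> Tensor");
    }

    // other_ops.cpp -- a hypothetical second translation unit can now add
    // more ops to the same namespace without a duplicate-namespace error.
    TORCH_LIBRARY_FRAGMENT(fbgemm, m) {
      m.def("i8i8bf16_example(Tensor XQ, Tensor WQ, float scale) -> Tensor");
    }

Either way, the registered ops surface identically to callers (e.g. as torch.ops.fbgemm.* from Python); only the registration side changes. Had the file kept TORCH_LIBRARY while another library also defined the fbgemm namespace, loading both would abort with a duplicate-registration error.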
