NotImplementedError: Could not run 'quantized::linear' with arguments from the 'CPU' backend. 'quantized::linear' is only available for these backends: [QuantizedCPU, QuantizedCUDA, BackendSelect, Python,....] #128578
Labels
oncall: quantization
🐛 Describe the bug
After QAT training, the following error is raised during inference:
NotImplementedError: Could not run 'quantized::linear' with arguments from the 'CPU' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'quantized::linear' is only available for these backends: [QuantizedCPU, QuantizedCUDA, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy, AutogradMeta, Tracer, AutocastCPU, AutocastCUDA, FuncTorchBatched, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].
Training code:

```python
model.modulexx.ar_predict_layer.qconfig = torch.ao.quantization.get_default_qat_qconfig('x86')
# (QuantStub and DeQuantStub are also inserted around the layer)
model_quant = torch.ao.quantization.prepare_qat(model, inplace=True)
model.train()
# ... training loop ...

# save:
torch.save(torch.ao.quantization.convert(model.cpu().eval(), inplace=False).state_dict(), filename)
```
Inference code:

```python
state_dict = torch.load(filename)
model.ar_predict_layer.qconfig = torch.ao.quantization.get_default_qat_qconfig('x86')
model_fp32_prepared = torch.quantization.prepare_qat(model_ar, inplace=True)
model_int8 = torch.quantization.convert(model_fp32_prepared.eval(), inplace=False)
model_int8.load_state_dict(state_dict['model'])
model_ar = model_int8
model_ar.eval()
```
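For comparison, here is a hypothetical minimal sketch (not the reporter's actual model; `TinyModel` and `build_int8` are invented for illustration) of an eager-mode QAT save/reload round trip that avoids this error. The point it illustrates: the inference side must rebuild the same prepared-and-converted quantized skeleton before calling `load_state_dict`, and inputs must pass through a `QuantStub` so `quantized::linear` receives quantized tensors rather than fp32 CPU tensors.

```python
# Hedged sketch: a single Linear wrapped in QuantStub/DeQuantStub,
# prepared for QAT, converted to int8, saved, and reloaded.
import torch
import torch.nn as nn


class TinyModel(nn.Module):  # hypothetical stand-in for the reporter's model
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()      # fp32 -> quint8
        self.fc = nn.Linear(4, 2)                           # becomes quantized::linear
        self.dequant = torch.ao.quantization.DeQuantStub()  # quint8 -> fp32

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))


def build_int8(seed_state=None):
    m = TinyModel()
    m.qconfig = torch.ao.quantization.get_default_qat_qconfig('x86')
    m_prep = torch.ao.quantization.prepare_qat(m.train(), inplace=False)
    m_prep(torch.randn(8, 4))  # stand-in for the training loop; calibrates observers
    m_int8 = torch.ao.quantization.convert(m_prep.cpu().eval(), inplace=False)
    if seed_state is not None:
        # Skeleton now matches the saved model, so the load succeeds.
        m_int8.load_state_dict(seed_state)
    return m_int8


trained = build_int8()
state = trained.state_dict()

# Inference side: rebuild the same quantized structure, then load weights.
restored = build_int8(seed_state=state)
out = restored(torch.randn(1, 4))
print(out.shape)
```

If the inference-side model is converted without the matching qconfig or without the quant/dequant stubs in the forward path, the first quantized layer receives a plain fp32 CPU tensor, which produces exactly the `NotImplementedError` above.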
Versions
torch version: 2.1.2+cu121
python: 3.10
cc @jerryzh168 @jianyuh @raghuramank100 @jamesr66a @vkuzo @jgong5 @Xia-Weiwen @leslie-fang-intel @msaroufim