Current status
|          | FP32 | BF16 | INT8 |
|----------|------|------|------|
| PyTorch  | Y    | N    | Y    |
| ONNX     | Y    | N    | Y    |
| OpenVINO | Y    | N    | N    |

- `Trainer.compile(…, onnx=T/F, quantize=T/F, openvino=T/F)` – bind relevant methods/variables
- `Trainer.quantize(…)` – generate quantized model (PyTorch/ONNX)
- `Model.eval(quantize=T/F)` – forward using (quantized) PyTorch model
- `Model.eval_onnx(quantize=T/F)` / `eval_openvino()` / `exit_onnx()` / `exit_openvino()` – forward using (quantized) ONNX/OpenVINO model
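To make the current call flow concrete, here is a minimal stub sketching how these methods fit together today. The class and method names (`compile`, `quantize`, `eval_onnx`, `exit_onnx`, …) come from the list above; the bodies and the `mode` attribute are hypothetical placeholders, not the real library implementation.

```python
# Stub sketch of the *current* API surface; bodies are illustrative only.

class Model:
    def __init__(self):
        self.mode = "pytorch_fp32"   # hypothetical: tracks the active forward path

    def eval(self, quantize=False):
        # forward using the (quantized) PyTorch model
        self.mode = "pytorch_int8" if quantize else "pytorch_fp32"

    def eval_onnx(self, quantize=False):
        # forward using the (quantized) ONNX model
        self.mode = "onnx_int8" if quantize else "onnx_fp32"

    def eval_openvino(self):
        # forward using the OpenVINO FP32 model (INT8 not supported per the table)
        self.mode = "openvino_fp32"

    def exit_onnx(self):
        # return to the default PyTorch forward path
        self.mode = "pytorch_fp32"

    def exit_openvino(self):
        self.mode = "pytorch_fp32"


class Trainer:
    def compile(self, model, onnx=False, quantize=False, openvino=False):
        # bind the relevant methods/variables onto the model
        model.onnx_enabled = onnx
        model.quantize_enabled = quantize
        model.openvino_enabled = openvino
        return model

    def quantize(self, model):
        # generate the quantized model (PyTorch/ONNX)
        model.quantized = True
        return model


trainer = Trainer()
model = trainer.compile(Model(), onnx=True, quantize=True)
model = trainer.quantize(model)
model.eval_onnx(quantize=True)   # switch the forward path to quantized ONNX
model.exit_onnx()                # back to PyTorch FP32
```

Note how the current design spreads mode switching across several `eval_*`/`exit_*` pairs, which is part of what the redesign below tries to unify.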
Desired status
- Support all combinations of the above table
- Compile: `Trainer.compile()` – just bind all methods/variables?
- Quantize: `Trainer.quantize(precision=…, accelerator=…)`
- Forward: `model.eval(precision=…, accelerator=…)`? – need to call `quantize()` first?
- Export/save: `Trainer.openvino.export(precision=…)`? – how about onnx/quantized? needs to be consistent
- Load: `model.load()` / `model.load_quantized_state_dict()`??? – need consistent APIs
- Status: `model.eval_status()`? – every model should maintain a current/default mode and report it here
- What are the interactions of these methods? Any other methods needed?
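The unified `(precision, accelerator)` design above can be sketched as a single dispatch validated against the support matrix. This is a hypothetical illustration of the proposed shape, not an existing implementation; the `SUPPORTED` table and internal attributes are assumptions, while `eval`, `quantize`, and `eval_status` follow the names in the list.

```python
# Hypothetical sketch of the *desired* API: one (precision, accelerator) pair
# drives everything, checked against the support matrix from "Current status".

SUPPORTED = {
    # (accelerator, precision) -> supported?  None means plain PyTorch.
    (None, "fp32"): True,       (None, "bf16"): False,       (None, "int8"): True,
    ("onnx", "fp32"): True,     ("onnx", "bf16"): False,     ("onnx", "int8"): True,
    ("openvino", "fp32"): True, ("openvino", "bf16"): False, ("openvino", "int8"): False,
}

class Model:
    def __init__(self):
        # every model maintains a current/default mode
        self._precision, self._accelerator = "fp32", None

    def eval(self, precision="fp32", accelerator=None):
        # reject combinations the table marks as unsupported
        if not SUPPORTED.get((accelerator, precision), False):
            raise ValueError(
                f"{precision} on {accelerator or 'pytorch'} is not supported")
        self._precision, self._accelerator = precision, accelerator

    def eval_status(self):
        # report the current mode, answering the "Status" question above
        return {"precision": self._precision,
                "accelerator": self._accelerator or "pytorch"}


m = Model()
m.eval(precision="int8", accelerator="onnx")
print(m.eval_status())   # {'precision': 'int8', 'accelerator': 'onnx'}
```

One design consequence worth noting: with a single validated entry point, the separate `eval_onnx`/`eval_openvino`/`exit_*` methods and the "call `quantize()` first?" ambiguity can collapse into precondition checks inside `eval`.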