ResNet50 ONNX int8 model inference results differ between ONNX Runtime and IREE-compiled CPU runs #430

Open
kumardeepakamd opened this issue Feb 12, 2024 · 0 comments
There is only an 11% match between the ONNX Runtime run and the IREE-compiled inference for the VAI-quantized int8 ResNet50 model.
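For reference, one way such an element-wise match percentage can be computed is to count the output elements that agree within a tolerance. Below is a minimal pure-Python sketch; the arrays and tolerances are illustrative assumptions, not the test suite's actual comparison code:

```python
import math

def percent_match(a, b, rel_tol=1e-3, abs_tol=1e-5):
    """Percentage of elements of two flattened outputs that agree within tolerance."""
    assert len(a) == len(b), "outputs must have the same number of elements"
    hits = sum(math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
               for x, y in zip(a, b))
    return 100.0 * hits / len(a)

# Hypothetical flattened outputs from the two runs.
onnxrt_out = [0.10, 0.55, 0.20, 0.15]
iree_out   = [0.10, 0.71, 0.20, 0.33]
print(percent_match(onnxrt_out, iree_out))  # 50.0: 2 of 4 elements agree
```

A figure like "11% match" for an int8-quantized model usually points at a real numerical divergence rather than tolerance noise, which is why a full repro is worth running.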

To reproduce the issue:

git clone https://github.com/nod-ai/SHARK-TestSuite.git
cd SHARK-TestSuite/e2eshark

(You may need to read https://github.com/nod-ai/SHARK-TestSuite/blob/main/e2eshark/README.md and make sure you have the required pip installs, etc.)

Replace the paths for --hfhome, -c, and -i to point to your Hugging Face home, your torch-mlir build, and your IREE build respectively, then run:

python ./run.py --hfhome /proj/gdba/kumar/HF_HOME -c ../../torch-mlir/build -i ../../mainireee/iree-build --torchtolinalg --tests onnx/models/resnet50_vaiq_int8

Then cd test-run/onnx/models/resnet50_vaiq_int8 and examine the logs: 'cat commands.log' lists each step that was run so you can rerun any of them, and failedinference.log shows the difference in results.
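Raw element mismatch can be hard to interpret for a classifier, so when reading the mismatching outputs a useful additional check is whether the two runs at least agree on the top-1 class. A minimal sketch follows; the flattened logits lists are hypothetical stand-ins for the tensors the two runs actually produce:

```python
def top1(logits):
    """Index of the largest logit, i.e. the predicted class."""
    return max(range(len(logits)), key=logits.__getitem__)

# Hypothetical flattened logits from the ONNX Runtime and IREE runs.
onnxrt_logits = [0.1, 2.3, 0.7, 1.9]
iree_logits   = [0.2, 1.1, 0.6, 2.8]

print(top1(onnxrt_logits), top1(iree_logits))  # 1 3 -> the runs disagree on top-1
```

If the top-1 predictions diverge like this on real inputs, the 11% element match reflects a genuine accuracy regression rather than harmless quantization jitter.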
