
Fine tuned Keras VGGNet16 shows no performance advantages. #638

Open
INF800 opened this issue Oct 20, 2020 · 5 comments

Comments

@INF800

INF800 commented Oct 20, 2020

[Image: inference-time comparison of the raw Keras VGG16 model and the ONNX Runtime version]

This is a comparison of the raw Keras VGG16 model's inference time and the same model on ONNX Runtime. Why don't I see any performance advantages?

There is only an extremely small improvement.

Replicate the results by running this notebook on a Colab CPU.

@Narasimha1997

Whenever measuring the performance of AI models, please note the following:

  1. More CPU cores will not make the model faster unless the framework supports concurrent execution of layers. On a CPU-only machine, you can improve overall inference throughput by loading more model instances across multiple cores (see the sketch at the end of this comment). A single model can use at most 100% of one core; even if you have 15 remaining cores, they will not be used.
  2. GPUs, on the other hand, can make model inference faster because they can parallelize layer operations and matrix multiplications across CUDA cores. The more CUDA cores you have, the faster the inference will be. This holds irrespective of the DL framework, since they all use cuDNN bindings. GPUs can also execute batches of inputs at once because of the nature of GPU hardware design.

So the rule of thumb => CPU : Concurrency :: GPU : Batching

All these optimizations will obviously not make a single model faster on CPU, because a single model's utilisation will never exploit multiple cores.
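
A minimal sketch of the CPU-concurrency approach described above (not from the original comment): it assumes a converted model file named `vgg16.onnx` and a dummy 1x224x224x3 float input; adjust the path and shape to the actual model.

```python
# Hypothetical sketch: one onnxruntime session per worker process, so several
# inferences run concurrently on a CPU-only machine (roughly one core each).
# The model path and input shape are assumptions for illustration only.
import numpy as np
from multiprocessing import Pool

MODEL_PATH = "vgg16.onnx"  # assumed path to the converted model
_session = None            # per-process session, created in the initializer


def _init_worker():
    global _session
    import onnxruntime as ort
    opts = ort.SessionOptions()
    opts.intra_op_num_threads = 1  # limit each worker to roughly one core
    _session = ort.InferenceSession(MODEL_PATH, sess_options=opts)


def _run_inference(batch):
    input_name = _session.get_inputs()[0].name
    return _session.run(None, {input_name: batch})[0]


if __name__ == "__main__":
    # Four dummy inputs, one per worker (Keras VGG16 expects NHWC 224x224x3).
    batches = [np.random.rand(1, 224, 224, 3).astype(np.float32) for _ in range(4)]
    with Pool(processes=4, initializer=_init_worker) as pool:
        outputs = pool.map(_run_inference, batches)
    print(f"ran {len(outputs)} inferences across 4 worker processes")
```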

@INF800
Author

INF800 commented Oct 21, 2020

Hey @Narasimha1997, I do not understand why ONNX does not make models faster. Huggingface uses ONNX to run large pretrained networks on CPU. So, can't I replicate the same using keras-onnx? Or do I have to use ONNX models converted from PyTorch models?
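
For reference, a rough sketch of the workflow the question refers to: converting the Keras VGG16 to ONNX with keras2onnx and running it on CPU with onnxruntime. The output path and dummy input are illustrative assumptions, and exact APIs may vary across keras2onnx / TensorFlow versions.

```python
# Hypothetical sketch: convert Keras VGG16 with keras2onnx, then run it on CPU
# with onnxruntime. File name and dummy input are assumptions.
import numpy as np
import onnxruntime as ort
import keras2onnx
from tensorflow.keras.applications import VGG16

keras_model = VGG16(weights="imagenet")
onnx_model = keras2onnx.convert_keras(keras_model, keras_model.name)
keras2onnx.save_model(onnx_model, "vgg16.onnx")

sess = ort.InferenceSession("vgg16.onnx")
x = np.random.rand(1, 224, 224, 3).astype(np.float32)
preds = sess.run(None, {sess.get_inputs()[0].name: x})[0]
print(preds.shape)  # expected (1, 1000) class scores
```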

@jiafatom
Collaborator

When you use onnxruntime to evaluate performance (say, 100 runs), please skip the first few runs (for example, 10) of the evaluation. This matters especially for the first run: onnxruntime needs to do some extra work, so it costs much more time than usual.
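
A minimal timing sketch of that advice (assuming a converted `vgg16.onnx` and a dummy input): the first few runs are discarded as warm-up before averaging.

```python
# Hypothetical benchmark sketch: discard warm-up runs before timing, as
# suggested above. Model path and input shape are illustrative assumptions.
import time
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("vgg16.onnx")
input_name = sess.get_inputs()[0].name
x = np.random.rand(1, 224, 224, 3).astype(np.float32)

WARMUP, RUNS = 10, 100
for _ in range(WARMUP):          # warm-up runs, not timed
    sess.run(None, {input_name: x})

start = time.perf_counter()
for _ in range(RUNS):            # timed runs
    sess.run(None, {input_name: x})
elapsed = time.perf_counter() - start
print(f"mean latency over {RUNS} runs: {elapsed / RUNS * 1000:.2f} ms")
```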

@INF800
Author

INF800 commented Oct 23, 2020

Hey @jiafatom, the results were smashing for a LeNet-type architecture (up to 177 times faster) using your method, but VGGNet shows NO improvement. I have updated the notebook.

@jiafatom
Collaborator

For this perf issue, I feel that the converter already does its job well, and this is an onnxruntime issue. You may need to reach out to the onnxruntime repo and post the question there.
