
Does the tvm based CPU get accelerated? #54

Closed
lucasjinreal opened this issue Feb 9, 2021 · 5 comments · Fixed by #63
Labels
deployment Inference acceleration for production enhancement New feature or request help wanted Extra attention is needed

Comments

@lucasjinreal

Hi, just wondering whether a TVM-deployed model actually gets accelerated or not?
How does it compare with vanilla PyTorch on CPU, onnxruntime, or OpenVINO?

@lucasjinreal lucasjinreal added the enhancement New feature or request label Feb 9, 2021
@zhiqwang
Owner

zhiqwang commented Feb 9, 2021

Hi @jinfagang ,

Thanks for your attention here!

At that time, my first plan was to get each backend to compile and run inference successfully. Because of limited time, I haven't yet carefully compared the timings of libtorch, onnxruntime, and tvm. (I also plan to support OpenVINO, but for the same reason I haven't done that yet.)

Right now I'm writing and refactoring the trainer; once that is done, I will add documentation with the timing comparisons.

BTW, contributions of any kind are welcome.

@zhiqwang zhiqwang added the good first issue Good for newcomers label Feb 9, 2021
@zhiqwang
Owner

zhiqwang commented Feb 9, 2021

And I've uploaded the notebooks for TVM compilation and inference here. In my experience so far, the compilation step consumes much more time than loading a model with libtorch or onnxruntime, while the inference step itself seems normal (Edit: more experiments needed here).

This repo is my first look at TVM, and I will do more experiments with it.
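For reference, the kind of compile-and-infer flow discussed above might look roughly like the sketch below. This is only a minimal illustration, not the notebook's actual code: it uses a plain torchvision classifier as a stand-in for the detection model, the input name `input0` and the shapes are placeholders, and it assumes a TVM version that ships `tvm.contrib.graph_executor` (older releases used `graph_runtime` instead).

```python
import torch
import torchvision
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# Trace a model so TVM's PyTorch frontend can consume it.
# (A torchvision classifier stands in for the real detection model here.)
model = torchvision.models.resnet18().eval()
dummy = torch.rand(1, 3, 224, 224)
scripted = torch.jit.trace(model, dummy)

# Convert the TorchScript module to Relay IR.
shape_list = [("input0", list(dummy.shape))]
mod, params = relay.frontend.from_pytorch(scripted, shape_list)

# Compile for CPU -- this is the step that takes noticeably longer
# than simply loading a libtorch or onnxruntime model.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)

# Run inference with the graph executor.
dev = tvm.cpu(0)
rt = graph_executor.GraphModule(lib["default"](dev))
rt.set_input("input0", tvm.nd.array(dummy.numpy()))
rt.run()
out = rt.get_output(0)
```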

@lucasjinreal
Author

@zhiqwang That's weird; TVM should be faster when compared on the same CPU device, shouldn't it? At least it should be faster than onnxruntime with the CPU provider.

@zhiqwang
Owner

TVM should be faster when compared on the same CPU device, shouldn't it? At least it should be faster than onnxruntime with the CPU provider.

Hi @jinfagang , I agree with you on this point, and that is my goal. The current implementation of the TVM backend is only an initial attempt; let's put in more effort to reach that goal!

@zhiqwang zhiqwang added the help wanted Extra attention is needed label Feb 12, 2021
@zhiqwang
Owner

zhiqwang commented Feb 16, 2021

Hi @jinfagang

I've added a rough comparison of the inference times, measured in a Jupyter notebook (IPython).

  • On the ONNXRuntime backend,

    CPU times: user 2.04 s, sys: 0 ns, total: 2.04 s
    Wall time: 55.8 ms
  • On the TorchScript backend,

    CPU times: user 2.03 s, sys: 32 ms, total: 2.06 s
    Wall time: 60.5 ms
  • On the PyTorch backend,

    CPU times: user 3.87 s, sys: 60 ms, total: 3.93 s
    Wall time: 116 ms
  • On the TVM backend,

    CPU times: user 528 ms, sys: 364 ms, total: 892 ms
    Wall time: 22.3 ms

You could check the latest updated notebook for more details.

BTW, the time displayed in the onnxruntime notebook is on GPU; I just tested the CPU case locally.

Although this comparison is a bit rough, we can conclude that TVM does speed up inference on the CPU.
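(If anyone wants to reproduce a rough number like the ones above outside of Jupyter's %%time magic, a minimal sketch could look like the following. The model path "yolov5.onnx", the input shape, and the iteration count are placeholders rather than the notebook's actual settings, and the providers keyword assumes a reasonably recent onnxruntime.)

```python
import time

import numpy as np
import onnxruntime as ort

# Placeholder input; the real notebooks use the exported model's own input shape.
x = np.random.rand(1, 3, 640, 640).astype(np.float32)

# Force the CPU execution provider so the comparison stays on CPU.
sess = ort.InferenceSession("yolov5.onnx", providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name

# Warm up once so one-time initialization doesn't skew the measurement.
sess.run(None, {input_name: x})

runs = 10
start = time.perf_counter()
for _ in range(runs):
    sess.run(None, {input_name: x})
elapsed = (time.perf_counter() - start) / runs
print(f"onnxruntime CPU: {elapsed * 1000:.1f} ms per run")
```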

So I'll close this issue. If you have more concerns, please let me know.

@zhiqwang zhiqwang mentioned this issue Feb 16, 2021
@zhiqwang zhiqwang added deployment Inference acceleration for production and removed good first issue Good for newcomers labels Feb 17, 2021