can quantized onnx t5 run on GPU #8

gyin94 · 2021-04-08T09:30:34Z

Just wondering whether this quantized ONNX T5 model can run on a GPU.

Ki6an · 2021-04-08T12:24:29Z

Unfortunately, no. Quantized ONNX models can only be run on CPU, and onnxruntime-gpu does not support quantization.
If you want more details on this question, you should create an issue in onnxruntime.

mblank5 · 2021-12-15T03:21:52Z

@Ki6an Hi, is onnxruntime-gpu supported on a V100 or T4 if we don't use quantization?

I see that with plain onnxruntime we can get a 2x speedup?

Ki6an · 2021-12-18T06:39:52Z

@mblank5 No. This library uses onnxruntime, and to run on GPUs you need to have onnxruntime-gpu installed.

But you can uninstall onnxruntime after the fastt5 library is installed, then install onnxruntime-gpu and try running the model. I'm not sure you'll get a speedup, though. For more info, refer to this issue.
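After swapping the packages, a quick way to check whether the GPU build is actually being picked up is to look at the execution providers onnxruntime reports. This is only a minimal sketch: `pick_providers` is a hypothetical helper (not part of fastt5), and `model.onnx` in the comment is a placeholder path, not a file this library is known to produce under that name.

```python
def pick_providers(available):
    """Return the preferred execution providers that are available, GPU first."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]


try:
    import onnxruntime as ort
except ImportError:
    # Neither onnxruntime nor onnxruntime-gpu is installed.
    ort = None

if ort is not None:
    available = ort.get_available_providers()
    print("available providers:", available)
    print("would use:", pick_providers(available))
    # With a non-quantized model exported to disk (placeholder path), a session
    # could then be created like this:
    # session = ort.InferenceSession("model.onnx", providers=pick_providers(available))
```

If `CUDAExecutionProvider` is missing from the printed list, the session will run on CPU even with onnxruntime-gpu installed, which usually points to a CUDA/cuDNN setup problem.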

With onnxruntime, you'll get a speedup if you are using a modern CPU with more cores. Refer to the benchmark section of the README.

gyin94 closed this as completed Apr 9, 2021
