Support GPU INT-8 quantization #15

pommedeterresautee · 2021-12-01T22:08:12Z

fix some stupid bugs; use opset 13 (onnx)

# Conflicts:
#	src/transformer_deploy/convert.py

pommedeterresautee added 9 commits December 1, 2021 22:16

support quantization

8a501d5

fix some stupid bugs; use opset 13 (onnx)

add quantization demo

287bab0

add dependency

ea74d9b

qdqroberta

d173186

update quantization notebook

2d21bc4

update quantization notebook

dc3fc19

update quantization notebook

ac81a23

bump VERSION

fc29af8

delete old script

5630c1e

pommedeterresautee marked this pull request as ready for review December 8, 2021 14:28

pommedeterresautee added the Hugging Face, performance (improve performance), and TensorRT labels Dec 8, 2021

pommedeterresautee self-assigned this Dec 8, 2021

pommedeterresautee changed the title ~~Quantization~~ Support GPU INT-8 quantization Dec 8, 2021

pommedeterresautee added 8 commits December 8, 2021 16:15

cleaning

a8fa397

Merge branch 'main' into quantization

f5a5e15

Merge branch 'main' into quantization

945c181

# Conflicts:
#	src/transformer_deploy/convert.py

pin ORT to 1.9.0; 1.10.0 seems to be buggy

dc175d8

modify text

8cf0e0d

update tuto

200b4d3

update tuto

1d12dcf

update tuto

524ff6b

pommedeterresautee merged commit ad837a9 into main Dec 8, 2021

pommedeterresautee deleted the quantization branch December 8, 2021 22:45
