Add QAT support to more models #29

pommedeterresautee · 2021-12-21T13:50:18Z

QAT (quantization-aware training) is now applied through monkey patching, which limits the amount of code (LOC) to maintain.
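As a rough illustration of the monkey patching idea, here is a minimal sketch in plain Python. It is not the code from this PR: `TinyLinear`, `fake_quantize`, `patch_with_qdq` and `unpatch` are hypothetical names, and the symmetric int8 quantize/dequantize (QDQ) step is an assumed, simplified stand-in for whatever quantizer the library really uses. The point is only that the layer class is patched at runtime, so no copy of the model source has to be kept.

```python
# Illustrative sketch only: TinyLinear, fake_quantize, patch_with_qdq and
# unpatch are hypothetical names, not taken from this pull request.
from typing import Callable, List


def fake_quantize(values: List[float], num_bits: int = 8) -> List[float]:
    """Quantize to a symmetric signed integer grid, then dequantize (QDQ)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    amax = max((abs(v) for v in values), default=0.0)
    if amax == 0.0:
        return list(values)  # nothing to scale
    scale = amax / qmax
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


class TinyLinear:
    """Stand-in for a layer defined in third-party model code."""

    def __init__(self, weights: List[float]) -> None:
        self.weights = weights

    def forward(self, inputs: List[float]) -> float:
        return sum(w * x for w, x in zip(self.weights, inputs))


def patch_with_qdq(cls: type) -> Callable:
    """Swap cls.forward for a version that fake-quantizes inputs and weights.

    Returns the original method so the patch can be reverted later.
    """
    original = cls.forward

    def forward_qdq(self, inputs: List[float]) -> float:
        q_inputs = fake_quantize(inputs)
        q_weights = fake_quantize(self.weights)
        return sum(w * x for w, x in zip(q_weights, q_inputs))

    cls.forward = forward_qdq  # the monkey patch: applied at class level
    return original


def unpatch(cls: type, original: Callable) -> None:
    """Restore the method saved by patch_with_qdq."""
    cls.forward = original
```

Because the patch is applied to the class, existing instances pick up the quantized behavior as well, and calling `unpatch` with the saved method restores the original one.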

Added QAT support to:

pommedeterresautee added 2 commits December 20, 2021 23:11

first version of QDQ monkey patching

bc53ff4

add Albert, Electra and Distilbert QAT support

358e0fd

pommedeterresautee self-assigned this Dec 21, 2021

pommedeterresautee added the performance (improve performance) and quantization (GPU/CPU quantization support) labels Dec 21, 2021

pommedeterresautee mentioned this pull request Dec 21, 2021

GPU Optimization Ki6an/fastT5#34

Open

pommedeterresautee added 11 commits December 21, 2021 19:02

add QDQDeberta V1

cab7e60

fix distilbert

f826ef7

add ast patch

1e11156

add quant onnx export

simplify quantization process

8ce306d

fix qdq deberta

828ae6f

quantization refactoring

0b5d29e

add documentation

fe9cb1f

add quantization tests add deberta v2

add quant of layernorm

8679530

refactor ast modif add tests

add operator name in quantizer name

57c9207

update notebook

4d7d12b

update notebook

f08a4bf

pommedeterresautee marked this pull request as ready for review December 28, 2021 22:25

pommedeterresautee merged commit 404c5ee into main Dec 28, 2021

pommedeterresautee deleted the monkey_patch branch December 28, 2021 22:28
