
v1.16.0: Transformers 4.36 compatibility, extended ONNX support, Mixtral GPTQ

@fxmarty released this 13 Dec 18:23 · 108 commits to main since this release

Transformers 4.36 compatibility

Notably, the ONNX export now handles aten::scaled_dot_product_attention in a standardized way for the compatible models.
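
As a rough illustration, a plain export call such as the one below goes through the updated exporter; this is a minimal sketch, and the model id, task, and output directory are arbitrary placeholders rather than values from the release notes.

```python
# Minimal sketch of an ONNX export with the updated exporter.
# "gpt2", the task, and the output directory are illustrative choices.
from optimum.exporters.onnx import main_export

main_export(
    "gpt2",                  # any compatible Transformers model id
    output="gpt2_onnx",      # directory where the ONNX model is written
    task="text-generation",  # explicit task; "auto" also works
)
```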

Extended ONNX support: timm, sentence-transformers, Phi, ESM
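
For example, a sentence-transformers checkpoint can be exported and run with ONNX Runtime directly through the ORTModel classes. This is a sketch under the assumption that feature extraction is the desired task; the model id is an arbitrary example.

```python
# Sketch: export a sentence-transformers model to ONNX on the fly and run it
# with ONNX Runtime. The checkpoint below is only an example.
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForFeatureExtraction

model_id = "sentence-transformers/all-MiniLM-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = ORTModelForFeatureExtraction.from_pretrained(model_id, export=True)

inputs = tokenizer(["ONNX export example"], return_tensors="pt")
embeddings = model(**inputs).last_hidden_state  # token embeddings from the ONNX model
```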

GPTQ for Mixtral

Work in progress.

  • add modules_in_block_to_quantize arg for gptq by @SunMarc in #1585 (see the sketch below)
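
The new modules_in_block_to_quantize argument lets you restrict GPTQ to an explicit, ordered subset of submodules inside each decoder block, which is what the Mixtral work relies on. A hedged sketch follows; the module names, dataset, and bit width are illustrative assumptions, not values from the release notes.

```python
# Hedged sketch of the modules_in_block_to_quantize argument from #1585.
# The Mixtral module names listed here are illustrative, not a verified layout.
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.gptq import GPTQQuantizer

model_id = "mistralai/Mixtral-8x7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

quantizer = GPTQQuantizer(
    bits=4,
    dataset="c4",
    # Quantize only these submodules in each decoder block, group by group
    # (names are placeholders for illustration).
    modules_in_block_to_quantize=[
        ["self_attn.q_proj", "self_attn.k_proj", "self_attn.v_proj"],
        ["self_attn.o_proj"],
    ],
)
quantized_model = quantizer.quantize_model(model, tokenizer)
```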

What's Changed

Full Changelog: v1.15.0...v1.16.0