CTranslate2 3.17.1

@guillaumekln released this 20 Jul 18:18

Fixes and improvements

  • Fix an error when running models with the new int8_bfloat16 computation type
  • Fix a vocabulary error when converting Llama 2 models with the Transformers converter
  • Update the Transformers converter to correctly convert Llama models using grouped-query attention (GQA)
  • Stop decoding when the generator returned by the generate_tokens method is closed
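
The last item concerns what happens when a caller stops consuming a token stream early. The sketch below does not use CTranslate2 itself: `fake_generate_tokens` is a hypothetical stand-in written in plain Python, and it only illustrates the general mechanism, assuming the producer reacts to the generator being closed. Closing a generator raises `GeneratorExit` inside it, so cleanup code in a `finally` clause gets a chance to stop the underlying work.

```python
def fake_generate_tokens(max_steps, log):
    """Hypothetical stand-in for a token-streaming generator.

    It is NOT the CTranslate2 implementation; it only mimics the idea
    that each yielded token corresponds to one decoding step.
    """
    try:
        for step in range(max_steps):
            log.append(f"decoded step {step}")
            yield f"tok{step}"
    finally:
        # Runs when the generator is exhausted or closed by the caller.
        log.append("decoding stopped")


log = []
stream = fake_generate_tokens(1000, log)

received = []
for token in stream:
    received.append(token)
    if len(received) == 3:
        break  # the caller has seen enough tokens

# Explicitly close the generator so the producer can stop early
# instead of running all 1000 steps.
stream.close()
```

In this sketch only three steps are decoded before the stream is closed; the remaining steps never run. Calling `close()` explicitly (or using `contextlib.closing`) avoids relying on garbage collection to close the generator.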