CTranslate2 3.8.0

guillaumekln released this 06 Mar 15:39

New features

  • Experimental support for AVX512 in manually vectorized functions: this code path is disabled by default and can be enabled by setting the environment variable CT2_FORCE_CPU_ISA=AVX512
  • Add Transformers converter option copy_files to copy any files from the Hugging Face model to the converted model directory
  • Expose some Whisper parameters:
    • max_initial_timestamp_index
    • suppress_blank
    • suppress_tokens
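
The two runtime-facing features above can be sketched in Python as follows. This is a minimal illustration, not the project's reference usage: the helper name `generate_with_new_options` is hypothetical, and it assumes the newly exposed Whisper parameters are passed as keyword arguments to the model's `generate` method. It also assumes that setting CT2_FORCE_CPU_ISA before the library is first imported is early enough for the value to be picked up, and that the CPU actually supports AVX512.

```python
import os

# Assumption: the variable must be set before ctranslate2 is first imported
# or used, so the library sees it when it selects a CPU code path.
# Only meaningful on CPUs that actually support AVX512.
os.environ["CT2_FORCE_CPU_ISA"] = "AVX512"


def generate_with_new_options(
    model,
    features,
    prompts,
    *,
    max_initial_timestamp_index,
    suppress_blank,
    suppress_tokens,
):
    """Forward the newly exposed Whisper options to `model.generate`.

    Hypothetical helper: `model` is expected to be a Whisper model object
    from ctranslate2, but any object with a compatible `generate` method
    works for this sketch.
    """
    return model.generate(
        features,
        prompts,
        max_initial_timestamp_index=max_initial_timestamp_index,
        suppress_blank=suppress_blank,
        suppress_tokens=suppress_tokens,
    )
```

Taking the three options as required keyword arguments keeps the caller explicit about the values in use; suitable values depend on the model and the audio, so none are suggested here.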

Fixes and improvements

  • Reduce conversion time for large models by skipping some weight comparisons
  • Reduce maximum memory usage when converting Transformers models with --quantization float16
  • Set FP32 compute type for FP16 convolutions to match the PyTorch behavior and accuracy
  • Update oneDNN to 3.0.1
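
As a rough sketch of how the converter options mentioned in these notes fit together, the snippet below assembles a `ct2-transformers-converter` command line that combines `--copy_files` with `--quantization float16`. The helper `build_converter_command`, the model name, the output directory, and the copied file name are illustrative assumptions, not values taken from this release.

```python
import shlex


def build_converter_command(model, output_dir, copy_files=None, quantization=None):
    """Build the argument list for the ct2-transformers-converter CLI.

    Hypothetical helper for illustration; it only assembles arguments and
    does not run the conversion.
    """
    command = [
        "ct2-transformers-converter",
        "--model", model,
        "--output_dir", output_dir,
    ]
    if copy_files:
        # Extra files to copy from the Hugging Face model into output_dir.
        command += ["--copy_files", *copy_files]
    if quantization:
        command += ["--quantization", quantization]
    return command


# Illustrative values only.
command = build_converter_command(
    "openai/whisper-tiny",
    "whisper-tiny-ct2",
    copy_files=["tokenizer.json"],
    quantization="float16",
)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` in an environment where CTranslate2 and Transformers are installed.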