CTranslate2 3.6.0

@guillaumekln released this 16 Feb 16:23

New features

  • Build the Windows Python wheels with cuDNN to enable GPU execution of Whisper models
  • Add the model attribute Whisper.is_multilingual
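
As an illustration of where the new attribute can be useful, the sketch below builds the Whisper start-of-transcript prompt differently for multilingual and English-only models. The helper `build_whisper_prompt` and the model path in the comment are hypothetical and not part of CTranslate2; the helper only expects the boolean exposed by `Whisper.is_multilingual`.

```python
def build_whisper_prompt(is_multilingual, language="en", task="transcribe"):
    """Return prompt tokens for a Whisper model (hypothetical helper).

    Multilingual checkpoints expect language and task tokens after
    <|startoftranscript|>; English-only checkpoints do not have them.
    """
    prompt = ["<|startoftranscript|>"]
    if is_multilingual:
        prompt += [f"<|{language}|>", f"<|{task}|>"]
    prompt.append("<|notimestamps|>")
    return prompt


# Intended usage, assuming a converted model directory (path is hypothetical):
#   import ctranslate2
#   model = ctranslate2.models.Whisper("whisper-tiny-ct2")
#   prompt = build_whisper_prompt(model.is_multilingual)
```

Keeping the helper independent of the model object makes it easy to check without loading any weights.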

Fixes and improvements

  • Reduce the beam search memory usage by not duplicating the decoder states that are the same in each beam (e.g. the projected memory keys and values)
  • Optimize the dot product attention during beam search by moving the query beam dimension to the time dimension
  • Fix support of English-only Whisper models
  • Include the prefix tokens (if they exist) in the output of Whisper.generate
  • Log a warning when the model weights are implicitly converted to another type
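
The two beam search items above rest on the same observation: the projected memory keys are identical for every beam of a batch item, so they can be stored once and shared. The following pure-Python sketch is not the CTranslate2 implementation; it only checks, on toy data, that attention scores computed against a single shared copy of the keys match those computed against keys replicated per beam. All function names are illustrative.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def scores_replicated(queries, keys, beam_size):
    # queries: [batch * beam][depth], one decoding step per beam
    # keys:    [batch][time][depth]
    # Repeat each batch item's keys once per beam. In a tensor library
    # this tiling would materialize beam_size copies in memory.
    tiled = [k for k in keys for _ in range(beam_size)]
    return [[dot(q, k_t) for k_t in k] for q, k in zip(queries, tiled)]


def scores_shared(queries, keys, beam_size):
    # Regroup queries as [batch][beam][depth]: the beam axis plays the
    # role of the query time axis, so one copy of the keys serves all beams.
    grouped = [queries[i:i + beam_size]
               for i in range(0, len(queries), beam_size)]
    out = []
    for q_beams, k in zip(grouped, keys):
        for q in q_beams:
            out.append([dot(q, k_t) for k_t in k])
    return out


# Toy data: batch of 2, beam size 2, depth 2, 3 memory positions.
queries = [[1, 0], [0, 1], [2, 1], [1, 2]]
keys = [[[1, 2], [3, 4], [5, 6]],
        [[1, 0], [0, 1], [1, 1]]]
same = scores_replicated(queries, keys, 2) == scores_shared(queries, keys, 2)
```

With real tensors, the shared form would correspond to a single batched matrix product of shape `[batch, beam, depth] x [batch, depth, time]`, which avoids both the replicated storage and the many single-row products.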