v1.3.1

AbrahamSanders released this 17 Mar 06:22
· 11 commits to main since this release

Updates to codec_bpe.audio_to_codes:

  • Fix incorrect framerate being written to codec_info.json
  • New argument --codec_info_only skips audio encoding and writes only codec_info.json to the output codes path

Updates to codec_bpe.train_tokenizer:

  • Allow --max_token_codebook_ngrams to be set to 0, which skips tokenizer training and outputs a tokenizer containing only the base codebook vocabulary. Setting --max_token_codebook_ngrams to 1 when --num_codebooks is also 1 has the same effect.
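
The rule above can be sketched as a small predicate. This is an illustration only, not code from codec_bpe: the function name `yields_base_vocab_only` is hypothetical, and it encodes just the two cases stated in this note, treating every other combination as one that goes through training.

```python
def yields_base_vocab_only(max_token_codebook_ngrams: int, num_codebooks: int) -> bool:
    """Hypothetical helper, not part of codec_bpe.

    Returns True for the two argument combinations that this release note
    says produce a tokenizer with only the base codebook vocabulary.
    """
    # Case 1: 0 explicitly skips tokenizer training.
    if max_token_codebook_ngrams == 0:
        return True
    # Case 2: 1 n-gram with a single codebook is stated to have the same effect.
    return max_token_codebook_ngrams == 1 and num_codebooks == 1


if __name__ == "__main__":
    # Assumed example settings, chosen only to exercise both cases.
    for ngrams, codebooks in [(0, 4), (1, 1), (1, 4)]:
        print(ngrams, codebooks, yields_base_vocab_only(ngrams, codebooks))
```

A check like this could be used in a wrapper script to decide whether a training corpus needs to be prepared before calling codec_bpe.train_tokenizer.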

Other:

  • Remove unused --use_special_token_format argument from all modules and functions