v1.3.1

AbrahamSanders released this 17 Mar 06:22

d5c356a

Updates to codec_bpe.audio_to_codes:

Fix incorrect framerate being written to codec_info.json
New argument --codec_info_only skips audio encoding and writes only codec_info.json to the output codes path

Updates to codec_bpe.train_tokenizer:

Allow --max_token_codebook_ngrams to be set to 0, which will skip tokenizer training and output a tokenizer with just the base codebook vocabulary. Setting --max_token_codebook_ngrams to 1 while --num_codebooks is also 1 has the same effect.
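
The skip condition described above can be sketched as a small predicate. This is an illustration only, not the library's actual code: the function name `should_skip_training` is hypothetical, and its two parameters simply mirror the `--max_token_codebook_ngrams` and `--num_codebooks` arguments.

```python
def should_skip_training(max_token_codebook_ngrams: int, num_codebooks: int) -> bool:
    """Hypothetical helper: return True when tokenizer training would be skipped.

    Mirrors the two cases named in the release note; the real check inside
    codec_bpe.train_tokenizer may be written differently.
    """
    # Case 1: explicitly disabled by setting the n-gram limit to 0.
    if max_token_codebook_ngrams == 0:
        return True
    # Case 2: a limit of 1 with a single codebook has the same effect,
    # according to the release note.
    if max_token_codebook_ngrams == 1 and num_codebooks == 1:
        return True
    return False


if __name__ == "__main__":
    for ngrams, codebooks in [(0, 4), (1, 1), (1, 4), (2, 1)]:
        print(ngrams, codebooks, should_skip_training(ngrams, codebooks))
```

In both skip cases the resulting tokenizer would contain just the base codebook vocabulary.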

Other:

Remove unused --use_special_token_format argument from all modules and functions
