v1.3.1
Updates to codec_bpe.audio_to_codes:
- Fix incorrect framerate being written to codec_info.json
- New argument --codec_info_only to skip audio encoding and only output codec_info.json to the output codes path
Updates to codec_bpe.train_tokenizer:
- Allow --max_token_codebook_ngrams to be set to 0, which skips tokenizer training and outputs a tokenizer with just the base codebook vocabulary. Setting --max_token_codebook_ngrams to 1 while --num_codebooks is also 1 has the same effect.
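
The two cases described above can be summarized in a small sketch. The helper name `skips_tokenizer_training` is hypothetical and is not part of codec_bpe; it encodes only the two combinations stated in this note and assumes every other combination trains as usual.

```python
def skips_tokenizer_training(max_token_codebook_ngrams: int, num_codebooks: int) -> bool:
    """Hypothetical helper (not part of codec_bpe).

    Returns True for the two argument combinations that this release note
    says produce a tokenizer with just the base codebook vocabulary.
    """
    if max_token_codebook_ngrams < 0 or num_codebooks < 1:
        raise ValueError("expected max_token_codebook_ngrams >= 0 and num_codebooks >= 1")

    # Case 1: explicitly set to 0, so training is skipped.
    if max_token_codebook_ngrams == 0:
        return True

    # Case 2: 1 ngram with a single codebook has the same effect.
    # Assumption: all other combinations run tokenizer training.
    return max_token_codebook_ngrams == 1 and num_codebooks == 1


if __name__ == "__main__":
    for ngrams, codebooks in [(0, 4), (1, 1), (1, 4)]:
        print(ngrams, codebooks, skips_tokenizer_training(ngrams, codebooks))
```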
Other:
- Remove unused --use_special_token_format argument from all modules and functions