v1.3.5

@AbrahamSanders AbrahamSanders released this 17 Jun 06:00
· 5 commits to main since this release
  • Added support for WavTokenizer and SimVQ. Both are single-level codecs that share the same architecture but differ in their vector quantization (VQ) strategy. WavTokenizer comes in 40 Hz and 75 Hz variants with a vocabulary size of 4096. The SimVQ variants run at a 75 Hz frame rate with vocabulary sizes ranging from 4096 to 262144 codes. SimVQ also features a causal encoder and a partially causal decoder, making it suitable for streaming use cases.
    • Use --codec_model WavTokenizer-large-320-24k-4096 (or any other name from the Model column of this table) with codec_bpe.audio_to_codes to encode audio using WavTokenizer.
    • Use --codec_model simvq_4k (or any other name from the Model column of this table) with codec_bpe.audio_to_codes to encode audio using SimVQ.
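
To put the frame rates and vocabulary sizes above in perspective, here is a rough sketch of the token rate and nominal bitrate they imply for a single-level codec. The helper `tokens_and_bits` and the variant labels are hypothetical and exist only for this illustration; they are not part of codec_bpe. The figures assume one code per frame and treat log2(vocabulary size) as an upper bound on bits per code.

```python
import math


def tokens_and_bits(frame_rate_hz, vocab_size, seconds=1.0):
    """Return (tokens, bits_per_token, total_bits) for a single-level codec.

    Assumes exactly one code per frame and uses log2(vocab_size) as the
    nominal (upper-bound) number of bits carried by each code.
    """
    tokens = frame_rate_hz * seconds
    bits_per_token = math.log2(vocab_size)
    return tokens, bits_per_token, tokens * bits_per_token


# Illustrative labels only; these are NOT --codec_model names.
variants = {
    "WavTokenizer, 40 Hz, 4096 codes": (40, 4096),
    "WavTokenizer, 75 Hz, 4096 codes": (75, 4096),
    "SimVQ, 75 Hz, 4096 codes": (75, 4096),
    "SimVQ, 75 Hz, 262144 codes": (75, 262144),
}

for label, (rate, vocab) in variants.items():
    tokens, bits, total = tokens_and_bits(rate, vocab)
    print(f"{label}: {tokens:.0f} tokens/s, {bits:.0f} bits/token, {total:.0f} bits/s")
```

Under these assumptions, ten seconds of audio at 75 Hz yields 750 codes before any BPE merging, and the largest SimVQ vocabulary carries 18 bits per code compared with 12 for a 4096-code vocabulary.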