What's Changed
- New multi-resolution (+multi-band) discriminator adopted from Descript Audio Codec.
- Updated recommended hyperparameters for the AdamW optimizer: `lr=5e-4`, `betas=(0.8, 0.9)`
- Pre-trained models on Hugging Face have been updated: `charactr/vocos-mel-24khz`, `charactr/vocos-encodec-24khz`
💡 Note: Previous checkpoints have been tagged for easy reference. To load one, pass its tag as the `revision` argument:
Vocos.from_pretrained("charactr/vocos-encodec-24khz", revision="v0.0.4")
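
For anyone training from scratch, the recommended AdamW settings above could be applied roughly as follows. This is a minimal sketch, not the project's training code: `ADAMW_KWARGS` and `build_optimizer` are hypothetical names, and every other optimizer option (such as weight decay) is assumed to stay at the PyTorch default.

```python
# Recommended AdamW hyperparameters from these release notes.
ADAMW_KWARGS = {"lr": 5e-4, "betas": (0.8, 0.9)}


def build_optimizer(params):
    """Return a torch.optim.AdamW configured with the recommended values.

    `params` is any iterable of parameters, e.g. `model.parameters()`.
    Hypothetical helper for illustration; torch is imported lazily so the
    constants above can be reused without PyTorch installed.
    """
    import torch

    return torch.optim.AdamW(params, **ADAMW_KWARGS)


# Example usage (requires PyTorch):
#   model = torch.nn.Linear(4, 4)          # placeholder model
#   optimizer = build_optimizer(model.parameters())
```

If your training setup defines the optimizer in a config file rather than in code, the same two values would go wherever `lr` and `betas` are set.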