Replace Encodec decoding with Vocos #34

v0xie · 2023-08-27T20:44:56Z

This PR replaces the Encodec decoding with Vocos (https://github.com/charactr-platform/vocos).

In the "utils/generation.py" file, generate_audio and generate_audio_from_long_text now both use Vocos decoding.
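For readers unfamiliar with Vocos, here is a minimal sketch of decoding Encodec codes with it. This is not the exact code from this PR: the helper names `bandwidth_id_for` and `decode_with_vocos` are hypothetical, and the sketch assumes the pretrained "charactr/vocos-encodec-24khz" checkpoint, whose bandwidths are indexed in the order 1.5, 3, 6, 12 kbps as in the Vocos README. It also assumes `torch` and `vocos` are installed.

```python
# Bandwidths (kbps) of the pretrained "charactr/vocos-encodec-24khz" checkpoint,
# in the order the Vocos README indexes them (assumption: unchanged upstream).
VOCOS_BANDWIDTHS_KBPS = [1.5, 3.0, 6.0, 12.0]


def bandwidth_id_for(kbps):
    """Return the index Vocos expects for a given Encodec bandwidth."""
    return VOCOS_BANDWIDTHS_KBPS.index(kbps)


def decode_with_vocos(codes, kbps=6.0, device="cpu"):
    """Decode Encodec codes of shape (n_codebooks, frames) into a waveform.

    Hypothetical helper for illustration; needs `torch` and `vocos` installed.
    """
    # Imported lazily so the module loads even without the heavy dependencies.
    import torch
    from vocos import Vocos

    vocos = Vocos.from_pretrained("charactr/vocos-encodec-24khz").to(device)
    # Turn the discrete codes into the continuous features the decoder expects.
    features = vocos.codes_to_features(codes.to(device))
    bandwidth_id = torch.tensor([bandwidth_id_for(kbps)], device=device)
    return vocos.decode(features, bandwidth_id=bandwidth_id)
```

With 8 codebooks at 24 kHz the matching bandwidth is 6 kbps, so `bandwidth_id_for(6.0)` gives the index to pass. In real code the model should be loaded once and reused rather than on every call.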

The Encodec dependency cannot be removed entirely, because Vocos doesn't have a way to encode.

My limited testing on a 4090 indicates that Vocos decoding is only marginally slower than Encodec decoding, but I think the increase in quality more than makes up for it.

I am attaching some generated audio samples so you can hear the difference the new decoder makes. This PR does not implement upsampling from 24 kHz to 44.1 kHz as recommended by the Vocos examples, but I have included a few generations that were upsampled to 44.1 kHz in the samples zip.
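To make the 24 kHz to 44.1 kHz step concrete, here is a small standard-library sketch. It is only an illustration of the sample-rate arithmetic, not how the attached samples were produced: `resample_ratio` and `upsample_linear` are hypothetical names, and plain linear interpolation stands in for the band-limited resampler a real pipeline would normally use.

```python
from fractions import Fraction


def resample_ratio(src_hz, dst_hz):
    """Exact output/input sample-count ratio, reduced to lowest terms."""
    return Fraction(dst_hz, src_hz)


def upsample_linear(samples, src_hz=24000, dst_hz=44100):
    """Upsample a list of floats by linear interpolation (illustrative only)."""
    if len(samples) < 2:
        return list(samples)
    ratio = resample_ratio(src_hz, dst_hz)
    n_out = int(len(samples) * ratio)
    out = []
    for i in range(n_out):
        pos = Fraction(i) / ratio            # position in source samples
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)   # clamp at the last sample
        frac = float(pos - lo)
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```

The ratio reduces to 147/80, so every 80 input samples become 147 output samples.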

samples.zip

Replace Encodec decoding with Vocos

067888a

Plachtaa merged commit 6db2af0 into Plachtaa:master Aug 28, 2023

zhou20120904 mentioned this pull request Oct 31, 2023

It's useless on mac #129

Open

liucr mentioned this pull request Dec 21, 2023

Unsupported type byte size: ComplexFloat #109

Open
