[BUG] the vq result is not well #210

didadida-r · 2024-05-15T03:58:50Z


Describe the bug

Hi,

I debugged the GPT output and found that the problems originate in the VQ stage:

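`mispronounce`: after self-reconstruction in the VQ stage, the pronunciation is wrong. For example:
新的轮回，便会开始 ("a new cycle will then begin") -> 新的轮回，定会开始 ("a new cycle will surely begin"), i.e. 便 (biàn) is reproduced as 定 (dìng).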

`timbre change`: the audio reconstructed by the VQ has a different timbre from the original audio.

Can you share some ideas about these cases and how to improve them? Would using ar4 or ar8 help, or switching to a VQ trained with a VQ loss?

Thanks

To Reproduce
Steps to reproduce the behavior:

Clone the latest fish code and download the latest official model.

Download the audio from the demo page and use it as input_audio_path.

python tools/vqgan/inference.py \
        -i "$input_audio_path" \
        -o "$vq_restore_wav_path" \
        -ckpt "$vqgan_path"


leng-yue · 2024-05-15T12:18:54Z

That's why we introduced the VITS decoder, which greatly reduced mispronunciation and increased timbre similarity.

didadida-r · 2024-05-16T02:15:03Z

Thanks. I'm curious why the VQ and VITS modules are separated. Is there a specific reason not to combine them into a single module, similar to the approach taken in GPT-SoVITS, and to add a HuBERT module?

didadida-r · 2024-05-16T02:19:13Z

> That's why we introduced the VITS decoder, which greatly reduced mispronunciation and increased timbre similarity.

Are the mispronunciation and timbre change issues attributable to a low token rate? These problems do not seem to be as apparent in codec and DAC.
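For context, the token rate in question follows from the codec configuration: frames per second is the sample rate divided by the hop length, and each frame carries one token per codebook. The sketch below only shows that arithmetic. The function name and both configurations are hypothetical and are not taken from fish-speech, DAC, or any other released model.

```python
import math


def token_stats(sample_rate, hop_length, num_codebooks, codebook_size):
    """Return (frames per second, tokens per second, bits per second)."""
    frames_per_sec = sample_rate / hop_length
    tokens_per_sec = frames_per_sec * num_codebooks
    # Each token selects one of `codebook_size` entries, i.e. log2(size) bits.
    bits_per_sec = tokens_per_sec * math.log2(codebook_size)
    return frames_per_sec, tokens_per_sec, bits_per_sec


# Hypothetical configurations, chosen only to illustrate the arithmetic.
configs = {
    "low-rate, single codebook": dict(
        sample_rate=44100, hop_length=2048, num_codebooks=1, codebook_size=1024
    ),
    "high-rate, residual codebooks": dict(
        sample_rate=44100, hop_length=512, num_codebooks=8, codebook_size=1024
    ),
}

for name, cfg in configs.items():
    fps, tps, bps = token_stats(**cfg)
    print(f"{name}: {fps:.1f} frames/s, {tps:.1f} tokens/s, {bps / 1000:.2f} kbit/s")
```

With far fewer bits per second available, a low-rate quantizer has less room to preserve fine phonetic and speaker detail, which is one plausible reading of the question above rather than a confirmed cause.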

didadida-r added the "bug" label (Something isn't working) on May 15, 2024
