Male voice #11

loretoparisi · 2020-07-28T15:58:06Z

First, thank you: I have solved the issue I opened thanks to your support. As I understand it, both MelGAN (which I have tried) and WaveGlow (which I have not run yet) produce a female voice. To get a male voice, is it necessary to train the model from scratch, or to add support for a specific vocoder?

Thank you.

ming024 · 2020-07-29T00:54:27Z

@loretoparisi I think that Universal Vocoding meets your requirement. I plan to support multi-speaker TTS, but it is not my top priority right now. You are welcome to fork this repository for a male-TTS implementation. I think both the f0_min for the PyWorld vocoder and some preprocessing parameters have to be modified for male speakers.

shoegazerstella · 2020-07-29T10:31:57Z

Hi @ming024,
I am trying what you suggested, to use Universal Vocoder for synthesizing the output of FastSpeech2.

The output of FS2 has size:
torch.Size([1, 80, x])

while UV wants something like this as input:
torch.Size([1, x, 80])

I tried swapping the axes, but of course that didn't work: all I got was noise or silence.
So this confirms what you suggested, that maybe it is worth retraining UV with the same parameters as FS2.
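
Swapping axes only fixes the tensor layout. The vocoder also has to have been trained on mel spectrograms extracted with the same settings as the acoustic model. Below is a minimal sketch of how the two configurations could be compared before deciding to retrain. All values in both dictionaries are hypothetical placeholders, not the actual settings of FastSpeech2 or Universal Vocoding, and `mel_config_mismatches` is an illustrative helper, not part of either project.

```python
# Hypothetical mel-spectrogram settings (placeholders only).
# The real values must be read from each project's own config files.
fs2_mel = {
    "sampling_rate": 22050,
    "n_fft": 1024,
    "hop_length": 256,
    "win_length": 1024,
    "n_mels": 80,
    "fmin": 0,
    "fmax": 8000,
}
uv_mel = {
    "sampling_rate": 16000,
    "n_fft": 2048,
    "hop_length": 200,
    "win_length": 800,
    "n_mels": 80,
    "fmin": 50,
    "fmax": None,
}


def mel_config_mismatches(a, b):
    """Return {key: (a_value, b_value)} for every setting that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Print every setting on which the two front ends disagree.
for key, (fs2_value, uv_value) in mel_config_mismatches(fs2_mel, uv_mel).items():
    print(f"{key}: FastSpeech2={fs2_value} vs vocoder={uv_value}")
```

If any setting differs, matching `n_mels` alone is not enough: the vocoder receives features it was never trained on, which would be consistent with the noise or silence described above.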

Another thing I tried was adjusting the UV parameters, using the mel spectrogram computed from the wav file generated by FS2. I got some interesting changes in pitch, but nothing I can really use for my purpose of changing the speaker's voice.

If you have any other advice I could use please let me know, thanks a lot!

[EDIT]
Also, would you have any thoughts on which approach fits best with FastSpeech, voice cloning or voice conversion, either integrated into the model or used as a post-processing step?

ming024 · 2020-07-30T12:37:46Z

@shoegazerstella Actually, I haven't tried Universal Vocoding before, so I am not sure where the error comes from.

I think the decoder of FastSpeech is similar to the decoder of a voice conversion model. Some VC models use vector quantization or other tricks to learn a discrete embedding space, and find that it contains phonetic information. Maybe there is some way to combine the training of both tasks, whether through joint training, pretraining, or something else.

carankt · 2020-12-01T05:22:48Z

@ming024 Any idea what specific changes one needs to make for male voice cloning?

joseluismoreira · 2021-02-15T10:25:03Z

@carankt Any updates on male voice support, or on how to do it, please?

ming024 · 2021-02-26T04:05:34Z

Guys, multi-speaker synthesis is supported now.

ming024 · 2021-05-26T08:37:40Z

closed #11

