French Tacotron2 DDC release. #539

Closed
erogol opened this issue Oct 12, 2020 · 9 comments
Labels
model-release explanation for new model releases

Comments

@erogol
Contributor

erogol commented Oct 12, 2020

Colab notebook: https://colab.research.google.com/drive/16T5avz3zOUNcIbF_dwfxnkZDENowx-tZ?usp=sharing
Model files: https://drive.google.com/drive/folders/1LpsUx08Z3-JgvNLPQY67y8OjE4IlP4f1?usp=sharing

This release uses Tacotron2 DDC in combination with the Universal Fullband-MelGAN vocoder. The model is trained on a subset of the M-AILABS dataset (fr_FR/by_book/female/ezwa/monsieur_lecoq).

The Tacotron2 model was trained for 100K steps, starting from a pre-trained LJSpeech model.

Tacotron2 and the vocoder model have different sampling rates (16 kHz vs 24 kHz). This is resolved by interpolating the Tacotron2 output before feeding it into the vocoder, as in the sample Colab notebook above.
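
To make the interpolation step concrete, here is a minimal sketch in plain Python. It assumes both models use the same hop length in samples, so the 24 kHz vocoder expects 1.5 times as many frames per second as the 16 kHz Tacotron2 produces. The function name `stretch_time_axis` is hypothetical, and the Colab notebook may well use a different interpolation routine (for example one from a tensor library).

```python
def stretch_time_axis(mel, src_rate=16000, dst_rate=24000):
    """Linearly interpolate a mel spectrogram along the time axis.

    `mel` is a list of frames; each frame is a list of mel-bin values.
    The number of frames is scaled by dst_rate / src_rate.
    """
    if not mel:
        return []
    n_src = len(mel)
    n_dst = max(1, round(n_src * dst_rate / src_rate))
    if n_src == 1 or n_dst == 1:
        # Nothing to interpolate between: repeat the first frame.
        return [list(mel[0]) for _ in range(n_dst)]
    out = []
    for i in range(n_dst):
        # Position of output frame i on the input time axis.
        pos = i * (n_src - 1) / (n_dst - 1)
        lo = int(pos)
        hi = min(lo + 1, n_src - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(mel[lo], mel[hi])])
    return out


# Four input frames become six output frames (ratio 24000 / 16000 = 1.5).
frames = [[0.0], [1.0], [2.0], [3.0]]
print(len(stretch_time_axis(frames)))
```

Only the time axis is stretched here; the mel bins of each frame are left untouched.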

@erogol erogol added the model-release explanation for new model releases label Oct 12, 2020
@Gaet81

Gaet81 commented Nov 15, 2020

Hi @erogol Thanks for this.

Can you elaborate on why you use the phoneme_cleaners over the french_cleaners in your config file?

@erogol
Contributor Author

erogol commented Nov 16, 2020

> Hi @erogol Thanks for this.
>
> Can you elaborate on why you use the phoneme_cleaners over the french_cleaners in your config file?

Because the model uses phonemes.

@Gaet81

Gaet81 commented Nov 17, 2020

If I understand correctly:

  • it will transform the text to ASCII, which seems to be an issue for French because there will then be no difference between e, é and (è, ê or ë)
  • the abbreviations and symbols will be expanded into their English forms rather than their French ones...

Am I misunderstanding something?
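
The concern about ASCII conversion can be illustrated with a short sketch. It uses the standard library's `unicodedata` as a stand-in for an ASCII transliteration step (the helper name `strip_accents` is hypothetical and is not the project's `convert_to_ascii`): once accents are removed, distinct French words become identical strings.

```python
import unicodedata


def strip_accents(text):
    """Remove diacritics by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Three different French words that differ only in their accents.
words = ["pêche", "péché", "pêché"]
folded = {strip_accents(w) for w in words}
print(folded)  # a single entry: the words are no longer distinguishable
```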

@erogol
Contributor Author

erogol commented Nov 18, 2020

I used french phonemes as dictated by phoneme_language in config.json. So I don't think there is any issue.

@WeberJulian
Contributor

> If I understand correctly:
>
>   • it will transform the text to ASCII, which seems to be an issue for French because there will then be no difference between e, é and (è, ê or ë)
>   • the abbreviations and symbols will be expanded into their English forms rather than their French ones...
>
> Am I misunderstanding something?

I did the implementation of french_cleaners myself:

def french_cleaners(text):
    '''Pipeline for French text. There is no need to expand numbers, phonemizer already does that'''
    text = lowercase(text)
    text = expand_abbreviations(text, lang='fr')
    text = replace_symbols(text, lang='fr')
    text = remove_aux_symbols(text)
    text = collapse_whitespace(text)
    return text

As you can see, the text does not go through the convert_to_ascii method.
I also added support for French abbreviations here.
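
For readers who want to try the pipeline shape outside the project, here is a self-contained sketch. Everything in it is an assumption made for illustration: the abbreviation table, the `&` replacement, and the set of auxiliary symbols are stand-ins, not the project's actual helpers. The point it shows is that accented characters survive, because no ASCII conversion is applied.

```python
import re

# Illustrative abbreviation table (hypothetical, not the project's list).
_ABBREVIATIONS_FR = [
    (re.compile(r"\bmme\b\.?"), "madame"),
    (re.compile(r"\bdr\b\.?"), "docteur"),
    (re.compile(r"\bm\."), "monsieur"),
]


def french_cleaners_sketch(text):
    """Lowercase, expand abbreviations, replace symbols, tidy whitespace."""
    text = text.lower()
    for pattern, expansion in _ABBREVIATIONS_FR:
        text = pattern.sub(expansion, text)
    text = text.replace("&", " et ")            # stand-in for replace_symbols
    text = re.sub(r"[<>()\[\]\"]+", "", text)   # stand-in for remove_aux_symbols
    text = re.sub(r"\s+", " ", text).strip()    # stand-in for collapse_whitespace
    return text


print(french_cleaners_sketch("M. Lecoq  & Mme Élise (née Dupré)"))
# monsieur lecoq et madame élise née dupré
```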

@srijan14

srijan14 commented Jan 16, 2021

@erogol The config in your shared folder seems to differ from the one in 72a6ac5 (the commit ID listed for Tacotron2 DDC in the wiki). The difference is in the fields present in the JSON, not in the values. I am trying to do the same sort of thing for a different language. Can you confirm which one I should use as the starting point for training the Tacotron2 part?

Thanks

@erogol
Contributor Author

erogol commented Jan 17, 2021

Use the one in the shared folder.

@erogol erogol closed this as completed Feb 17, 2021
@lpierron

I'm trying to improve the French Tacotron2 DDC model, because it produces some noise that the English Tacotron 2 synthesizer does not have. There are also pronunciation defects on nasal vowels, probably because some phonemes are missing (ɑ̃, ɛ̃), as in œ̃n ɔ̃ɡl də ma tɑ̃t ɛt ɛ̃kaʁne (Un ongle de ma tante est incarné, i.e. "One of my aunt's nails is ingrown.")

I started training text2feat from scratch on a French corpus (MAI_ezwa), but after 10k to 15k steps the loss increases drastically. The result is not bad with your vocoder, but you said you trained the model for only 100k steps, about 10 times as many as I reach. How can I get to 100k steps and beyond?

I attach the config file (config.json) in a zip file. It also contains the config file for the vocoder (the same as yours, I suppose) and an example file of French sentences (sentences.txt) with the phonemize-espeak phonetizations (sentences.pho).

Thanks
tts_fr_config_sentences.zip
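
One way to check the "missing phonemes" hypothesis is to compare a phonemized sentence against the symbol inventory character by character. The sketch below uses a hypothetical inventory (`SYMBOLS` is made up for illustration, not copied from the project's symbols.py). It relies on the fact that IPA nasal vowels such as ɑ̃ are written as a base vowel followed by a combining tilde (U+0303), so an inventory without that code point cannot represent them. Whether the front end silently drops unknown characters is an assumption here.

```python
import unicodedata

# Hypothetical symbol inventory, for illustration only.
SYMBOLS = set("abdefgiklmnoprstuvyzøœɑɔəɛɡʁʃʒ ")


def missing_symbols(phonemes, symbols):
    """Return the sorted characters of `phonemes` that are not in `symbols`."""
    return sorted({ch for ch in phonemes if ch not in symbols})


# The phonemized example from the comment above; \u0303 is the combining tilde.
sentence = "œ\u0303n ɔ\u0303ɡl də ma tɑ\u0303t ɛt ɛ\u0303kaʁne"

missing = missing_symbols(sentence, SYMBOLS)
print([unicodedata.name(ch) for ch in missing])

# What remains if unknown characters are dropped: nasal vowels become oral ones.
kept = "".join(ch for ch in sentence if ch in SYMBOLS)
print(kept)
```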

@lpierron

@WeberJulian There is a big bug in your French expand_abbreviations. I am sending you a fixed version.
I am also sending a new symbols.py that includes the missing nasal vowels.
abbreviations_symbols.zip
