This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Sound was lost in french word rez-de-chaussée #5

Open
alt131 opened this issue Apr 1, 2021 · 31 comments

Comments

@alt131

alt131 commented Apr 1, 2021

I tried to get audio for the French word rez-de-chaussée.
Here's the command line:
cat << EOF |
fr|rez-de-chaussée.
EOF
/usr/local/bin/larynx \
--debug \
--csv \
--glow-tts /path/fr-fr/siwis-glow_tts \
--hifi-gan /path/hifi_gan/universal_large \
--output-dir /mnt/d/99/voices/ \
--language fr-fr \
--denoiser-strength 0.001

Debug data:
DEBUG:larynx:Words for 'rez-de-chaussée': ['rez-de-chaussée']
DEBUG:larynx:Phonemes for 'rez-de-chaussée': ['#', 'ʁ', 'e', 'd', 'ʃ', 'o', 's', 'e', '#', '‖', '‖']
The phonemes are OK for this word, but there is no 'd' sound in the output audio.

@alt131
Author

alt131 commented Apr 1, 2021

The same happens with the word "banc":
DEBUG:larynx:Words for 'banc': ['banc']
DEBUG:larynx:Phonemes for 'banc': ['#', 'b', 'ɑ̃', '#', '‖', '‖']
The sound 'b' was lost.

gomme
DEBUG:larynx:Words for 'gomme': ['gomme']
DEBUG:larynx:Phonemes for 'gomme': ['#', 'ɡ', 'ɔ', 'm', '#', '‖', '‖']
The sound 'ɡ' was lost.

A different problem with 'fille':
DEBUG:larynx:Words for 'fille': ['fille']
DEBUG:larynx:Phonemes for 'fille': ['#', 'f', 'i', 'j', '#', '‖', '‖']
The phonemes are OK, but larynx adds an extra 'e' sound at the end. It's strange.

@alt131
Author

alt131 commented Apr 1, 2021

DEBUG:larynx:Words for 'livre': ['livre']
DEBUG:larynx:Phonemes for 'livre': ['#', 'l', 'i', 'v', 'ʁ', '#', '‖', '‖']
'ʁ' sounds like 'a', but it should sound like 'r'.

@alt131
Author

alt131 commented Apr 2, 2021

DEBUG:larynx:Words for 'table': ['table']
DEBUG:larynx:Phonemes for 'table': ['#', 't', 'a', 'b', 'l', '#', '‖', '‖']

DEBUG:larynx:Words for 'stylo': ['stylo']
DEBUG:larynx:Phonemes for 'stylo': ['#', 's', 't', 'i', 'l', 'o', '#', '‖', '‖']

In both cases the sound 't' was lost.

@alt131
Author

alt131 commented Apr 2, 2021

I think I know where the problem is: I took the phonemes.txt for siwis-glow_tts from kathleen-glow_tts, and more words now sound correct. Please check it.
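
One way to test the phonemes.txt hypothesis is to compare the phonemes from the debug log against a voice's phoneme inventory. The following is a minimal sketch, assuming that phonemes.txt holds one `<id> <phoneme>` pair per line; that file format, the helper names, and the sample ids are assumptions for illustration, not taken from larynx itself.

```python
def load_phoneme_ids(text):
    """Parse '<id> <phoneme>' lines into a {phoneme: id} map (assumed format)."""
    phoneme_to_id = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        phoneme_id, phoneme = parts
        phoneme_to_id[phoneme.strip()] = int(phoneme_id)
    return phoneme_to_id


def missing_phonemes(pronunciation, phoneme_to_id):
    """Return the phonemes of a pronunciation that the inventory does not know."""
    return [p for p in pronunciation if p not in phoneme_to_id]


# Hypothetical, deliberately incomplete inventory (ids are made up).
SAMPLE_INVENTORY = """\
0 _
1 #
2 ‖
3 b
4 ɑ̃
"""

ids = load_phoneme_ids(SAMPLE_INVENTORY)

# Pronunciations copied from the debug output in this thread.
banc = ["#", "b", "ɑ̃", "#", "‖", "‖"]
gomme = ["#", "ɡ", "ɔ", "m", "#", "‖", "‖"]

print(missing_phonemes(banc, ids))
print(missing_phonemes(gomme, ids))
```

If a phoneme that gruut produces is missing from the inventory, or is mapped to a different id than the one used in training, that would be one plausible explanation for a sound being dropped or replaced.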

@synesthesiam
Contributor

It seems related to surrounding words. If you have it say "la table" or "le banc", then the "t" and "b" sounds come through. I'm not sure about "rez-de-chaussée", though. If I modify the lexicon to have the pronunciation "ʁ e d d ʃ o s e", then I hear the "d" sound.

@alt131
Author

alt131 commented Apr 12, 2021

I don't think the surrounding words should affect pronunciation if there is a pause between words. For "table" or "banc", I think it's some global bug in the neural network training algorithm.
About the word "rez-de-chaussée": there are not many words with a hyphen in French, but there are a lot of phrases like "qu'est-ce que c'est", "Y a-t-il", "êtes-vous?", "sont-ils?", etc.
Doubling the 'd' solves that problem, but it may cause problems in some other phrases.

@synesthesiam
Contributor

This might be partially related to #7

Without doing between-word stuff like liaisons explicitly in gruut, the model is forced to figure out how to blend across word boundaries (#). I may need some help from a native speaker to understand what really needs to be done here.

@alt131
Author

alt131 commented Apr 12, 2021

I don't agree here. I think liaison can be fixed in gruut, but this issue is not related to gruut, because gruut gave the right pronunciation for all my examples.

@ddavout

ddavout commented May 19, 2021

I don't agree with the "fact" that there are not many words with a hyphen. Since hyphenation is a way of creating new words, they should not be neglected.

The pronunciation of some of them may seem "weird", as the liaison is made despite the hyphen.
I am working on a Festival Siwis voice ... I have quite a few examples on hand to test my voice.
The first one regarding hyphens is 'porc-épic',

and your 'porc-épic' is wrong:

DEBUG:larynx:Phonemes for 'porc-épic': ['#', 'p', 'ɔ', 'ʁ', 's', 'e', 'e', 'p', 'i', 'k', '#', '‖', '‖']

We have a lot of proper nouns ("noms propres") in this case. Two examples:
"Pont-à-Mousson", with liaison despite the "-" and the normal POS of "à"
"Bourg-en-Bresse", with liaison despite the "-", where the liaison (LIA) is often pronounced as a 'k'

@alt131
Author

alt131 commented May 20, 2021

@ddavout, I suggested using liaison for all words with a hyphen if there is no other solution. You don't agree with that, do you?

@ddavout

ddavout commented May 21, 2021

Good to ask ;) .. there is a misunderstanding here (my fault, I have lost command of my English :( )
I wanted to say that you should not overlook the importance of hyphens when you study liaison!

I am on your side! I was once maybe extreme in my thinking: I thought that you could train the voice to make it understand what a liaison is. (I use Festival and LTS rules.)
I was putting, between every suitable pair of words, an "artificial" word with the POS "LIA" (for liaison), say "tflo" between 'dit' and 'on', and I put dit-on in my lexicon: "t" is the ending of dit, followed by a UTF-8 character we never use (normally), and "o" is the initial of the second word.

It was not so bad. But as my POS became more reliable (and my understanding of the Token and POS modules a little less weak), I thought I could work out another strategy.
But mind, unlike you (I'm sure :) ), I am an old French lady with a lot of time ... don't expect me to talk about neural science ..
I am happy to make my neurons work (even if they are slower now..), to use my still good ears ... etc.

My methods are time-consuming .. I will enforce the so-called compulsory liaison rules and propose safe ones.
Between us, there is no such thing as a liaison rule .. French people love exceptions! That's why they tolerate pseudo-rules.
Would you say "prix extrême" with a z phoneme? No. Even if you are used to saying it when prix is plural.

I'm tracking every liaison in the Siwis prompts to check my "rules"; I have not yet finished the job ...

Euh .. to come back to the hyphen matter.
I have a list of what I call "locutions" needing one hyphen, or that could logically use one, and a list of those with two or three.
locution = ready-made expression, with or without hyphen
ex: nuit et jour: liaison t
ex: the expression "curriculum vitae" would be read properly with a single entry, the same as the one for "curriculum-vitae", in case somebody else (than me) writes it with a hyphen ..

I am not sure if I'm clear, so I'll stop now.
But before that, just a word .. Yes, I think I should have said nothing more in fact ..
To have some success with my clustergen voice, I didn't follow the "English/American" diktat :) but I have not yet convinced anybody :)

"hyphen is not just a punctuation sign, it's a letter" ..

just as the French apostrophe is not a whitespace

@alt131
Author

alt131 commented May 21, 2021

I think checking every word sequence is a very time-consuming approach.
I described some pseudo-rules for liaison here:
#7 (comment)
and here:
#7 (comment)
Additional rules were described by @tjiho.
I think it's possible to add some extra rules plus black and white lists and reach an accuracy of about 95-99%, I believe.

And I now agree that always using a liaison when a hyphen is encountered, without pseudo-rules, is not a very good idea.

PS. The hyphen and the apostrophe are symbols, like whitespace. They are not letters.

@ddavout

ddavout commented May 25, 2021

It was an experiment to see whether a voice can be trained to understand something of this "nebulous" matter, the French liaison; an experiment done at a time when my POS module didn't give "satisfactory" results (at all).

Now I have come back to rules, which I call exceptions and exclusions, probably corresponding to your pseudo-rules with black and white lists.

For the running part of the voice, I will only enable compulsory liaisons (without forgetting the case of locutions, marked or not by a hyphen).

I have not yet finalized this to make it simpler. For now my rules have several parameters (the written form of the word leading to the liaison and of its follower, and their respective POS). So it's just compilation work.

I don't know your technology, but if it allows you to get good phonemes without knowing very precisely the liaisons made in the prompts (compulsory, optional or wrong), the impressive work https://github.com/juliacarbajal/french_phonologizer/blob/master/phonologize.py should be, IMHO, a better guide.

Tell me how to use your French model https://github.com/rhasspy/gruut/releases/tag/v0.10.0, then I may be able to test your liaison solution; with different POS detection, our respective lists are likely to be very different.

Mine is not very good, I must say, at spotting inversions (particularly inversions for stylistic effect), and there is still work to do to take into account the frequent spelling mistakes (ex: confusion between hyphen and apostrophe, missing hyphens, etc.)

@alt131
Author

alt131 commented May 25, 2021

Sorry, I'm not the author of gruut, I'm just a user like you. You need to ask the developer.

@ddavout

ddavout commented May 26, 2021

Reassure me, you do use it... without POS, you can't apply your liaison rules...

> If last symbol of first word is 's', 'x', 'z', 't', 'd' and first symbol of second word is h or any vowel

I am not sure what you call a symbol, and you don't agree with my concept of letter (you are not the only one...), but ...
your list looks incomplete to me; personally I consider 'c' 'q' 'k' 'g' 'd' 'x' 's' 'z' 'n' 'r' 'p' 'y' 'f' ?

It may look excessive to you, but I've met examples in all these cases, and I will know later whether it is worth treating some of them as exceptions.
Do you include 'y' in your list of vowels?

@alt131
Author

alt131 commented May 26, 2021

Yes, "first letter" would be more accurate.
It's not a final list, and I am thinking of using it only for phrases like "word1_space_word2". For phrases like "c'est" and any others with an apostrophe, I thought about always using liaison. I also haven't decided how to process "neuf heures", where f->v.
Yes, I'd like to see some exceptions.
Yes, 'y' is included in list of vowels.
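
The pseudo-rule discussed above can be sketched as a small predicate. This is only an illustration under stated assumptions: the final-letter list and the treatment of 'y' as a vowel come from the thread, while the function name, the accented-vowel set, and the black/white list mechanism are hypothetical and not taken from gruut.

```python
# Final letters from the pseudo-rule quoted in this thread.
LIAISON_FINALS = set("sxztd")
# 'y' is counted as a vowel, as stated above; the accented vowels are an assumption.
VOWELS = set("aeiouyàâäéèêëîïôöùûü")


def is_liaison_candidate(word1, word2, blacklist=frozenset(), whitelist=frozenset()):
    """Decide whether the pair word1 + space + word2 is a liaison candidate.

    whitelist: pairs that always get a liaison, even if the letter rule says no.
    blacklist: pairs that never get one, even if the letter rule says yes.
    """
    pair = (word1.lower(), word2.lower())
    if pair in whitelist:
        return True
    if pair in blacklist:
        return False
    if not pair[0] or not pair[1]:
        return False
    last, first = pair[0][-1], pair[1][0]
    return last in LIAISON_FINALS and (first == "h" or first in VOWELS)


print(is_liaison_candidate("nuit", "et"))
print(is_liaison_candidate("la", "table"))
print(is_liaison_candidate("prix", "extrême", blacklist={("prix", "extrême")}))
print(is_liaison_candidate("neuf", "heures", whitelist={("neuf", "heures")}))
```

The letter rule alone accepts "prix extrême", which is the kind of exception a blacklist would have to catch, and it rejects "neuf heures", which a whitelist (or a separate f->v rule) would have to handle.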

@ddavout

ddavout commented May 27, 2021

> For phrases like "c'est" and any others with an apostrophe, I thought about always using liaison.

I am not sure I follow you on that point.

> ...word1_space_word2...

I distinguish what I call, probably wrongly, a locution: an association of up to 3 words that can, at least in theory, be replaced by a single word, i.e. I can without doubt attribute a POS. I declare them in my PosLex, and they have an entry in my lexbook if, as often, they don't follow the ordinary rules of pronunciation or liaison, or can be personalized (ex: the Latin locution curriculum vitae, without liaison).
It's true that the transformation f->v is not so common, and at run time it raises a very bearable mistake. Personally, I am not sure whether I am wrong to say 'neuf années' without the phoneme v, and I have not even thought to check what our 'Académiciens' have ruled :)

@alt131
Author

alt131 commented May 27, 2021

> ...word1_space_word2...

The two words must be separated only by a space (not an apostrophe, not a hyphen, etc.). The rules are only for that case. For hyphens there are other rules, etc. Maybe they can be combined, I don't know yet.

If possible, I prefer to work with 2 words at once, because 3 words will give many more options.

> I can without doubt attribute a POS, I declare them in my PosLex

Are you sure PosLex is 100% accurate?

@ddavout

ddavout commented May 29, 2021

My Poslex is not 100% accurate, far from it :)
From time to time we disagree, and I need to make some corrections. The latest example, which I have not yet solved:

"à moins d'être..." 'être' is seen as a verb, not a big deal... but for "à moins d'interviewer" the fault is really audible

@ddavout

ddavout commented May 30, 2021

To come back to the apostrophe and the hyphen ..
@alt131, you said

> PS. The hyphen and the apostrophe are symbols, like whitespace. They are not letters.

but don't you feel the need to follow your own tokenizer?

echo "l'amour rend aveugle" | python3 -m gruut fr-fr tokenize | python3 -m gruut fr-fr phonemize
{"id": "", "raw_text": "l'amour rend aveugle", "raw_words": ["l'amour", "rend", "aveugle"], "clean_words": ["l'amour", "rend", "aveugle"], "tokens": [{"text": "l'amour", "pos": "NOUN"}, {"text": "rend", "pos": "VERB"}, {"text": "aveugle", "pos": "ADJ"}], "clean_text": "l'amour rend aveugle", "sentences": [{"raw_text": "l'amour rend aveugle", "raw_words": ["l'amour", "rend", "aveugle"], "clean_words": ["l'amour", "rend", "aveugle"], "tokens": [{"text": "l'amour", "pos": "NOUN"}, {"text": "rend", "pos": "VERB"}, {"text": "aveugle", "pos": "ADJ"}]}], "pronunciations": [["l", "a", "m", "u", "ʁ"], ["ʁ", "ɑ̃"], ["a", "v", "œ", "ɡ", "l"]], "pronunciation": [["l", "a", "m", "u", "ʁ"], ["ʁ", "ɑ̃"], ["a", "v", "œ", "ɡ", "l"]], "pronunciation_text": "l a m u ʁ ʁ ɑ̃ a v œ ɡ l", "mapped_phonemes": {}}

"l'amour" is seen as a "clean_word", and a word is *composed* of letters, isn't it?

By the way, I doubt you will be able to apply fine rules using POS without working on the tokenizer beforehand.
Just an example: in the sentence 'non-désiré par sa mère, il est resté le mal-aimé',
you get
{"text": "non-désiré", "pos": "PROPN"}

without the hyphen, you get (IMHO a better result)

{"text": "désiré", "pos": "VERB"}
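
To inspect how tokens, POS tags and pronunciations line up, the JSON output quoted in this comment can be read with the standard `json` module. This is a small sketch that uses a trimmed copy of that output (only the `tokens` and `pronunciations` fields); the variable names are illustrative.

```python
import json

# Trimmed copy of the gruut output quoted above (only the fields used here).
GRUUT_LINE = """{"tokens": [{"text": "l'amour", "pos": "NOUN"}, {"text": "rend", "pos": "VERB"}, {"text": "aveugle", "pos": "ADJ"}], "pronunciations": [["l", "a", "m", "u", "ʁ"], ["ʁ", "ɑ̃"], ["a", "v", "œ", "ɡ", "l"]]}"""

record = json.loads(GRUUT_LINE)

# Pair each token with its POS tag and its phonemes, in order.
rows = [
    (token["text"], token["pos"], " ".join(phonemes))
    for token, phonemes in zip(record["tokens"], record["pronunciations"])
]

for text, pos, phonemes in rows:
    print(f"{text}\t{pos}\t{phonemes}")
```

Laid out this way, it is easy to see that "l'amour" is one token with one POS tag and one pronunciation, which is the point being made about the tokenizer.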

@alt131
Author

alt131 commented May 31, 2021

It depends. You can look at it from 2 points of view: linguistics and NLP (natural language processing).
From the NLP position, "l'amour" and "non-désiré" are each 3 words (article + apostrophe + amour, and non + hyphen + désiré). After parsing the original sentence you can use post-processing to combine "non-désiré" into a single word, but I will not do that for "l'amour".

The author of gruut may have his own opinion about it. And we have different goals. He wants to turn text into speech, and for him working with "l'amour" or "I've" (I have) as one word is the easy way. I need to separate them into single words and maybe later combine them into phrases, because that is more convenient from the point of view of translation.

PS. "l'amour" and "I've" are not very good examples here. There are a lot of such word combinations in French, so the dictionary would become huge. Like these: je t'ouvre; la légende s'écrire; vous n'allez pas m'envoyer au bagne...
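
The split-then-recombine approach described in this comment could look roughly like the sketch below. It is an assumption-laden illustration, not code from gruut or from any participant: the regular expression and both function names are hypothetical, and whitespace is discarded, so "non - désiré" would be merged as well.

```python
import re

# Letters (including accented ones) form words; apostrophes and hyphens are
# emitted as separate tokens. Everything else (spaces, digits, other
# punctuation) is dropped in this simplified sketch.
TOKEN_RE = re.compile(r"[^\W\d_]+|['’\-]")


def split_tokens(text):
    """Split text so that apostrophes and hyphens become their own tokens."""
    return TOKEN_RE.findall(text)


def merge_hyphenated(tokens):
    """Post-process: glue 'a', '-', 'b' back into 'a-b'; leave apostrophes split."""
    merged = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "-" and merged and i + 1 < len(tokens):
            merged[-1] = merged[-1] + "-" + tokens[i + 1]
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


tokens = split_tokens("l'amour non-désiré")
print(tokens)
print(merge_hyphenated(tokens))
```

With this scheme "non-désiré" ends up as a single word again, while "l'amour" stays as three tokens, matching the distinction made above.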

@ddavout

ddavout commented May 31, 2021

The size of the dictionary doesn't frighten me; once my LTS is trained accordingly, the lexbook will dramatically shrink.
I am more concerned about the POSlex, but I will *help* it to recognize "l'amour" the way I want ..
but everything has a price.
If I want a reliable POS ... to have simpler run-time liaison rules .. but I understand your point of view .. I am often excessive...
and may become more reasonable with "l'amour"... I will think about it, but for "s'" "m'" "n'" etc... I will not move.

The point of view of foreign languages helped me to take this decision. Why can I use a single word in English, when I have to use 2 in French?
And I am happy when I get something like this straight away:

id _5 ; name il ;  pos PRO:per ; pbreak NB ; liaisonvocalic no ;
id _6 ; name s ; pos CON ; pbreak NB ; liaisonvocalic no ;
id _7 ; name s_en ; pos PRO:ind ; pbreak NB ; liaisonvocalic yes ;
id _8 ; name amuse ; pos VER ; pbreak BB ;

(name s ; pos CON is mute; I keep it so as not to disturb the POSlex I trained a long time ago... gruut doesn't have this legacy problem).

I would have thought that, from the point of view of translation, seeing "n'allez" as a verb would help... but apparently I'm wrong.

@alt131
Author

alt131 commented May 31, 2021

You work only with French, but I work with several languages, so I always separate words at apostrophes, hyphens, etc., and then I use post-processing to treat phrases like these

"tout de suite", "salle de bains", "salle à manger"

as a single entity, because they often translate as a single word or a similar phrase.
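
The post-processing step described here, where known multi-word phrases are grouped into a single entity, can be sketched as a longest-match scan over the word list. The phrase table uses the three examples from this comment; the function name and the greedy matching strategy are assumptions made for illustration.

```python
# Phrase table built from the examples given above.
PHRASES = {
    ("tout", "de", "suite"),
    ("salle", "de", "bains"),
    ("salle", "à", "manger"),
}


def merge_phrases(words, phrases=PHRASES):
    """Greedily replace known word sequences with a single space-joined entity."""
    max_len = max(map(len, phrases), default=1)
    merged = []
    i = 0
    while i < len(words):
        # Try the longest possible phrase first, down to two words.
        for n in range(max_len, 1, -1):
            chunk = tuple(w.lower() for w in words[i:i + n])
            if len(chunk) == n and chunk in phrases:
                merged.append(" ".join(words[i:i + n]))
                i += n
                break
        else:
            # No phrase starts here: keep the word as it is.
            merged.append(words[i])
            i += 1
    return merged


print(merge_phrases("la salle à manger est grande".split()))
print(merge_phrases("viens tout de suite".split()))
```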

@synesthesiam
Contributor

Sorry to have been out of this discussion for a while. I'm getting close to a refactored release of gruut (in the refactor branch) as version 1.0. The tokenizer/phonemizer code has been simplified, and a lot more tests have been added.

The French liaison code is here. It only handles a handful of cases, but it will hopefully provide a good start.

Regarding apostrophes and hyphens: gruut's tokenizer has a set of "punctuations" that vary by language (here is the French set). Text is split into tokens by whitespace first, and then further split by punctuation characters (except for some special cases like numbers). The goal is for the final token to be something present in the lexicon.

The final set of tokens is run through a POS tagger model that was trained on the Universal Dependencies CONLLU files for each language (my French model was trained on the upos label). If there's a misalignment between the tokenizer and this model, it could definitely cause problems for the liaison code.

Maybe at least the hyphen should be a "punctuation" character for French, so that those words get split into multiple tokens?
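
The whitespace-then-punctuation splitting described in this comment can be illustrated with a short sketch. This is not gruut's actual tokenizer: the punctuation set below is a hypothetical stand-in for the French set, the function name is made up, and special cases such as numbers are ignored.

```python
import re

# Hypothetical punctuation set; the real French set in gruut may differ.
FRENCH_PUNCTUATION = {".", ",", ";", ":", "!", "?", "«", "»", '"', "(", ")"}


def tokenize(text, punctuations=FRENCH_PUNCTUATION):
    """Split on whitespace first, then split each chunk on punctuation characters."""
    pattern = "(" + "|".join(re.escape(p) for p in sorted(punctuations)) + ")"
    tokens = []
    for chunk in text.split():
        # The capture group keeps the punctuation characters as tokens.
        tokens.extend(piece for piece in re.split(pattern, chunk) if piece)
    return tokens


# Hyphen not in the set: the word stays whole and can match a lexicon entry.
print(tokenize("rez-de-chaussée."))

# Hyphen treated as punctuation: the word is split into several tokens.
print(tokenize("rez-de-chaussée.", FRENCH_PUNCTUATION | {"-"}))
```

The two calls show the trade-off under discussion: adding the hyphen to the punctuation set yields separate tokens for the POS tagger, but the hyphenated word then no longer reaches the lexicon as one entry.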

@alt131
Author

alt131 commented Jun 2, 2021

> Maybe at least the hyphen should be a "punctuation" character for French, so that those words get split into multiple tokens?

I don't know. Even if a word with a hyphen doesn't have a liaison, it will be pronounced a little faster than 2 separate words.

@synesthesiam
Contributor

Actually, I take that back. The hyphen shows up in the lexicon as part of words, so it needs to be left in.

@alt131
Author

alt131 commented Jun 8, 2021

If you process a hyphen as a punctuation character, how do you plan to process a liaison in this case?

@synesthesiam
Contributor

I decided not to process the hyphen as a punctuation character. It would make things too difficult.

@ddavout

ddavout commented Jun 16, 2021

Will you go so far as to say
"hyphen is not just a punctuation sign, it's a letter" .. ;)

@synesthesiam
Contributor

From gruut's point of view, yes, a French hyphen is a letter ;)

@ddavout

ddavout commented Jun 16, 2021

It's a pity that we use the same sign to break words at the end of a line.
