I'm trying to use epitran to obtain the correct phonetic pronunciations of French words. I eventually got it working with the fra-Latn preprocessor, but its performance is lackluster. It seems to give very literal transliterations that never use the uvular "ʁ" sound or the syllable separator ".":
"acteur" ("actor") comes out as "atyr" (should be "ak.tœʁ")
"actrice" ("actress") comes out as "aktriz" (should be "ak.tʁis")
"chat" ("cat") comes out as "ʃa", which is correct, but at least one time when I tried it I got trailing symbols, like "ʃat"
"chien" ("dog") comes out as "ʃjâ" when it should be "ʃjɛ̃"
So after having mixed performance with that, I looked at the documentation and noticed there was a more phonetic translator "fra-Latn-np". Upon attempting to use this to translate any given word, I get the following error:
Traceback (most recent call last):
  File "main.py", line 6, in <module>
    epi = epitran.Epitran('fra-Latn-np')
  File "/home/callum/.local/share/virtualenvs/first625-xxpZk1TH/lib/python3.8/site-packages/epitran/_epitran.py", line 46, in __init__
    self.epi = SimpleEpitran(code, preproc, postproc, ligatures, rev, rev_preproc, rev_postproc, tones=tones)
  File "/home/callum/.local/share/virtualenvs/first625-xxpZk1TH/lib/python3.8/site-packages/epitran/simple.py", line 43, in __init__
    self.g2p = self._load_g2p_map(code, False)
  File "/home/callum/.local/share/virtualenvs/first625-xxpZk1TH/lib/python3.8/site-packages/epitran/simple.py", line 100, in _load_g2p_map
    raise DatafileError('Header is ["{}", "{}"] instead of ["Orth", "Phon"].'.format(orth, phon))
epitran.exceptions.DatafileError: Header is ["Prth", "Phon"] instead of ["Orth", "Phon"].
I'm not sure what causes it, but looking in that directory there is also an undocumented "fra-Latn-p" preprocessor, which does better in some cases and worse in others. Could you please explain what is going on here?
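The error message suggests that the first row of the fra-Latn-np mapping file reads "Prth" where the loader expects "Orth". A minimal sketch for checking that header by hand follows. The helper names are my own, and the file location mentioned in the comments (a `data/map/fra-Latn-np.csv` file inside the installed epitran package) is an assumption that may differ between versions:

```python
import csv

# The header the loader reports it expects, taken from the error message above.
EXPECTED_HEADER = ["Orth", "Phon"]


def read_header(csv_path):
    """Return the first row of a mapping CSV as a list of strings."""
    with open(csv_path, encoding="utf-8", newline="") as f:
        return next(csv.reader(f), [])


def header_is_valid(csv_path):
    """True if the first two header cells match the expected names."""
    return read_header(csv_path)[:2] == EXPECTED_HEADER


# Hypothetical usage; the path below is an assumption, adjust it to your install:
# print(read_header(".../site-packages/epitran/data/map/fra-Latn-np.csv"))
```

If the header really does read "Prth", correcting that one cell in a local copy of the file may be enough to get past the exception, though I have not verified that the rest of the file is sound.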
Here is my code:
import sys
from google_trans_new import google_translator
import epitran

translator = google_translator()
epi = epitran.Epitran('fra-Latn-np')

# Translate the first system argument
#translated_text = translator.translate(sys.argv[1], lang_src='en', lang_tgt='fr')
# Get the IPA pronunciation
#ipa_symbols = epi.transliterate(translated_text)
#print(translated_text)
#print(ipa_symbols)
print(epi.transliterate(sys.argv[1]))
As is noted in the README, support for French is not very good. This is partly due to ambiguities in the French orthography and partly due to insufficient work being devoted to the modules. The use of /r/ rather than /ʁ/ is intentional, however. One use of Epitran, early in its history, was producing representations that were relatively close to etymologically related forms in other languages. Since French was historically /r/ (and still is in some dialects), the distance between French and other languages was reduced by treating it this way.
If you can provide me with more test cases, I can update 'fra-Latn' so it passes them.
I should have read the README file further before attempting what I was attempting the other day. I'm currently working on a test case module for the fra-Latn preprocessor that should be available at https://github.com/ItsSeaJay/epitran-fr-Latn-testcases
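As a rough idea of the shape such test cases could take, here is a minimal sketch that compares expected and actual output for the words from my original post. The function name and the case list layout are my own choices, not anything defined by epitran:

```python
def find_failures(transliterate, cases):
    """Return (word, expected, actual) for every case whose output differs."""
    failures = []
    for word, expected in cases:
        actual = transliterate(word)
        if actual != expected:
            failures.append((word, expected, actual))
    return failures


# Expected values are the ones quoted earlier in this thread.
CASES = [
    ("acteur", "ak.tœʁ"),
    ("actrice", "ak.tʁis"),
    ("chat", "ʃa"),
    ("chien", "ʃjɛ̃"),
]
```

With epitran installed, it could be called as `find_failures(epitran.Epitran('fra-Latn').transliterate, CASES)`. Since fra-Latn deliberately emits /r/ and, judging by the outputs above, no syllable separators, the expected strings would probably need adapting to those conventions before they are useful upstream.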
I should also mention that for my purposes I need to use uvular /ʁ/. The whole reason I'm working with this module is so I can automatically produce language flashcards for Anki, and for that I need correct native pronunciations. I notice that in the README file you mention a downloadable bilingual or monolingual dictionary for Chinese. Is there anything like that for the French language? Cheers.
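In the meantime, one possible workaround for the /r/ convention is to post-process the output on my side. This is only a sketch with a helper name of my own, and it assumes every plain "r" in the French output stands for the rhotic:

```python
def to_uvular(ipa):
    """Replace the alveolar rhotic symbol 'r' with uvular 'ʁ' in an IPA string."""
    # Only the plain ASCII "r" is replaced; other rhotic symbols such as "ɾ"
    # are separate code points and are left untouched.
    return ipa.replace("r", "ʁ")
```

For example, `to_uvular(epi.transliterate(word))` would wrap the existing call from my script. It does nothing about syllable boundaries or the vowel errors listed above.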