The TTS is too chatty and "smart", and in most situations the built-in converter from acronyms and abbreviations to words is not accurate. #1014

gregjozk · 2021-09-14T03:48:53Z

Hello,

I'm raising this issue because more and more issues on NVDA's GitHub site are primarily TTS-related. The default TTS in NVDA is eSpeak NG or OneCore, which seems to inherit most of its code from eSpeak NG, since it makes the same mistakes.

Here is the problem: the synth treats acronyms and abbreviations the same way, so "cm" is always read as "centimeter", which in most cases is not correct.
There are many more cases where this behaviour is incorrect, e.g.:
In Slavic languages, the plural "you" is "vi", but eSpeak always treats it as "roman 6".

Also test how "x64" is spoken: in most cases it is pronounced as "ten sixty four".

Possible solution: do not convert abbreviations to words, because they have different meanings in different situations. Leave the TTS to spell the token, or to speak it as an ordinary word if it contains lowercase letters.
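The proposal above can be sketched roughly as follows. This is only an illustration of the idea, not eSpeak NG code: the function name `render_token` is hypothetical, and the rule "all capitals means spell it out, anything with lowercase is an ordinary word" is one assumed reading of the proposal.

```python
def render_token(token: str) -> str:
    """Return how a token would be handed to the synthesizer under the
    proposed rule: no dictionary expansion of abbreviations at all."""
    # Tokens with digits or punctuation are passed through unchanged here;
    # number handling is out of scope for this sketch.
    if not token.isalpha():
        return token
    # Tokens of two or more letters written entirely in capitals are
    # treated as acronyms and spelled letter by letter.
    if len(token) > 1 and token.isupper():
        return " ".join(token)
    # Anything containing lowercase letters is left as an ordinary word,
    # so "cm" and "vi" are never expanded or read as numerals.
    return token
```

Under this sketch, `render_token("NVDA")` yields the letters separated by spaces, while `render_token("cm")` and `render_token("vi")` come back unchanged.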

thanks.

regards,
Jožef

valdisvi · 2021-09-14T09:14:27Z

Which language are you talking about?

jaacoppi · 2021-09-14T09:15:11Z

All TTS software pronounces words based on assumptions about the language and the context of the sentence. The modern solution would be to pass meta information about the sentence to the TTS; one such approach is part-of-speech (POS) tagging. Since the origins of eSpeak NG are in 1995, these modern solutions are not used. Instead, all choices are rule-based. It would be great to give users more configuration options to choose the preferred behavior. That is a work in progress with the LoadVoice() and LoadConfig() functions. Unfortunately, progress is slow because the original code wasn't designed to support all these languages and situations. Currently, the only option for the user is to edit the config files and rebuild the code. That's probably too difficult for most users.
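To make the POS idea concrete, here is a minimal sketch of how a tag supplied alongside a token could select its reading. Everything in it is an assumption for illustration: the function `choose_reading`, the tag names `NUM` and `PRON`, and the returned labels are hypothetical, and, as noted above, eSpeak NG does not currently accept this kind of meta information.

```python
from typing import Optional


def choose_reading(token: str, pos: Optional[str]) -> str:
    """Pick a reading for an ambiguous token from an optional POS tag.

    The tag is assumed to come from an external tagger; this sketch does
    no tagging itself.
    """
    # Only an explicit numeral tag selects the numeral reading.
    if pos == "NUM":
        return f"numeral:{token}"
    # A pronoun tag, any other tag, or no tag at all falls back to
    # reading the token as an ordinary word.
    return f"word:{token}"
```

With a tagger in front, Slovenian "vi" tagged as a pronoun would take the `word:` reading, and the numeral reading would be chosen only when the context actually marks the token as a number.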

gregjozk · 2021-09-14T09:27:17Z

It is a general observation, but I have seen it mostly with English and Slovenian, which I use daily.


valdisvi · 2021-09-14T09:31:03Z

For the reasons @gregjozk mentioned, Latvian in particular has almost no abbreviation expansions, except for a very few unique ones, because abbreviations differ between domains. Therefore, in most cases there is no acute need to adjust the behavior dynamically; unneeded abbreviation expansions just need to be removed from the rules.

valdisvi · 2021-09-15T19:26:55Z

I can't reproduce the described behavior in the development version of eSpeak NG for Slovenian:

espeak-ng -x -vsl "vi x64 10 cm"
S'e:st 'iks St'i:RiinS'e:zddEsEd dEs'e:t ts,@m'@

For English, only `vi` is pronounced as "roman six":

espeak-ng -x -ven "vi x64 10 cm"
r,oUm@n_ s'Iks 'Eks s'Iksti f'o@ t'En s,i:;'Em

gregjozk · 2021-09-16T00:43:08Z

Hello,

In English, "vi" is spoken as "roman six", and in Slovenian it is spoken as "šest" instead of "vi", meaning "you". I tested this with the latest NVDA using eSpeak NG. To clarify, "vi" should be spoken in Slovenian as "vi", not as "šest" or "rimska šest", etc., because "vi" in Slavic languages means the plural "you". Thanks.


jaacoppi · 2021-09-16T03:30:12Z

Roman numbers are set in tr_languages.c with the keywords NUM_ROMAN, NUM_ROMAN_CAPITALS and NUM_ROMAN_ORDINAL. They can also be set in the language files, such as espeak-ng-data/lang/zls/sl, with the keyword 'numbers', but the documentation is poor.

There are many other tickets that discuss removing Roman number support completely for all languages, since reading a token as a Roman number is often the wrong assumption. No decision has been made yet. Should we do it now?

My goal is to make LoadConfig() call LoadVoice() in such a way that the settings can be updated at runtime instead of at compile time. That way, Roman numbers and many other options could be off by default and switched on by users who need them. Unfortunately, I don't have time right now.
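As a rough illustration of "off by default, switched on by the user", the sketch below gates Roman number interpretation behind runtime options. It is not eSpeak NG code: `NumberOptions`, its fields, and `roman_value` are hypothetical names, only loosely inspired by the idea behind NUM_ROMAN and NUM_ROMAN_CAPITALS, and they do not reproduce the exact semantics of those flags.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NumberOptions:
    # Hypothetical runtime settings; names are illustrative only.
    roman: bool = False               # interpret Roman numbers at all
    roman_capitals_only: bool = True  # only when written in capitals


_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_value(token: str, opts: NumberOptions) -> Optional[int]:
    """Return the value if the token should be read as a Roman number,
    or None if it should be left as an ordinary word."""
    if not opts.roman:
        return None
    if opts.roman_capitals_only and not token.isupper():
        return None
    upper = token.upper()
    if not upper or any(ch not in _VALUES for ch in upper):
        return None
    total = 0
    for i, ch in enumerate(upper):
        value = _VALUES[ch]
        # Subtractive notation: a smaller value before a larger one
        # is subtracted (e.g. the I in IV).
        if i + 1 < len(upper) and value < _VALUES[upper[i + 1]]:
            total -= value
        else:
            total += value
    # Note: this sketch does not reject malformed sequences such as "IIII".
    return total
```

With the defaults, `roman_value("vi", NumberOptions())` returns `None`, so "vi" stays a word; a user who wants Roman numbers would opt in with `NumberOptions(roman=True)`, and lowercase tokens would still be left alone unless `roman_capitals_only` is also turned off.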

valdisvi mentioned this issue Sep 19, 2021

Fix several Microsoft-specific terms #1017

Merged

stllfe mentioned this issue Jun 9, 2023

Stress and pronunciation modifications on-the-fly #1750

Open
