Set number handling on by default #888

jaacoppi · 2021-03-04T19:01:19Z

Can you check my findings and assumptions? This commit will change many languages.

I was instructing someone who is currently adding a new language. They had problems getting numbers working. Nothing in _list was processed. The reason is that number processing is disabled by default. I think it should be on by default so adding a new language is easier. Also see the reasoning in the commit message.

There can be errors if the number definitions are incomplete. The solution would be to fix the number handling code instead of disabling it by default.

I don't speak most of the languages affected here. I used a combination of Google Translate and manually reading the diffs and _list files to figure it out.

If we don't want to change the default behavior, we should turn on number processing for the languages that benefit from it and improve the documentation in docs/add_language.md so that contributors know to enable number processing.

docs: add details about number flags to the documentation.

It's clearly intended to be enabled by default:
- it's defined as the default behaviour in translate.h (NUM_DEFAULT)
- tr_languages.c sets many default values related to number processing that have no meaning unless langopts.numbers == 1.

It is also a more sensible default since most languages will want to have number processing on. This makes adding new languages easier because adding an entry to tr_languages.c is unnecessary.

A negative side effect is that languages with partial number definitions might experience bugs when reading undefined numbers. This is a bug and should be fixed.

This will have the side effect of enabling number processing for languages that currently have it disabled. However, there shouldn't be any. Here's a way to check affected languages:

    for voice in $(ESPEAK_DATA_PATH=`pwd` LD_LIBRARY_PATH=src:${LD_LIBRARY_PATH} src/espeak-ng --voices | grep -v Languages | awk '{print $2}'); do
        OUTPUT=$(ESPEAK_DATA_PATH=`pwd` LD_LIBRARY_PATH=src:${LD_LIBRARY_PATH} src/espeak-ng -qx -v $voice "1 - 2 - 3 - 12 - 123") && echo "$voice: $OUTPUT"
    done

These voices clearly benefit from enabling numbers (they already have number rules in *_list): ba, cmn (zh), hak, haw, ja, kok, nb, nci

Some languages are missing some definitions (like _12) in _list files. It causes the program to skip some numbers. Numbering needs to be turned off explicitly for: jbo, mi, my, piqd, py, qu, quc, th, uz

Languages with no number rules at all: chr, cv, he, nog, tk, ug
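The missing definitions mentioned above (like _12) could also be looked for directly in the _list files, without running the synthesizer. The following is only a minimal sketch, not part of this commit: `missing_numbers` is a hypothetical helper, and it assumes that a number entry is a line that starts with the key (for example `_12`) followed by whitespace.

```shell
#!/bin/sh
# Hypothetical helper (not part of espeak-ng): report which number
# entries are absent from each *_list file in a directory.
# Usage: missing_numbers DIR KEY...
# Assumption: an entry is a line starting with KEY followed by whitespace.
missing_numbers() {
    dir=$1
    shift
    for file in "$dir"/*_list; do
        # Skip the unexpanded glob when the directory has no *_list files.
        [ -f "$file" ] || continue
        lang=$(basename "$file" _list)
        missing=""
        for key in "$@"; do
            if ! grep -Eq "^${key}[[:space:]]" "$file"; then
                missing="$missing $key"
            fi
        done
        # Print only the languages that lack at least one key.
        if [ -n "$missing" ]; then
            echo "$lang:$missing"
        fi
    done
}
```

Assuming the _list files live in a directory named dictsource, it might be called as `missing_numbers dictsource _0 _1 _2 _12`. Each output line names a language and the keys it lacks. A language that is not listed may still have incomplete number rules, so this complements the loop above and does not replace it.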

valdisvi · 2021-03-08T19:02:37Z

I think this is OK. In the future, this and similar settings should be set in the ../espeak-ng-data/lang/.. configuration files, but this change looks like a step in that direction.

jaacoppi added 3 commits March 4, 2021 18:54

add macro L4() needed by Klingon (piqd)

b0a7ef2

Update changelog

1914d39

jaacoppi merged commit 4c3fe18 into espeak-ng:master Mar 12, 2021
