handling of line breaks and clause detection #301

Open
vimalan-sakthivel opened this issue Aug 23, 2017 · 5 comments

@vimalan-sakthivel vimalan-sakthivel commented Aug 23, 2017

When the word NEW is in capitals, eSpeak NG spells it out letter by letter as N.E.W.

For example, the headings below are read out as "N E W release":
<h3>NEW RELEASE</h3>
<h3 style="text-transform: uppercase">new release</h3>

Other synthesizers do not show this behavior, for example Microsoft Speech API version 5.

Note: I am testing this with NVDA 2017.2.

Contributor

@jaacoppi jaacoppi commented Oct 2, 2017

This is not specific to the word NEW. Also, it doesn't happen with espeak-ng -v en "NEW STUFF"; it depends on the next line in the file.

Compare espeak-ng -v en -f FILE, with FILE containing these two lines:

HOT SENTENCE - NEW POSSIBILITIES
I wonder how to make this happen again

(These two lines cause HOT and NEW to be read as H.O.T and N.E.W.)

HOT SENTENCE - NEW POSSIBILITIES
I wonder how to make this happen ag

(These two lines cause hot and new to be read correctly.)

HOT LINE
a b c d e f g h i j

(HOT is read as H.O.T)

HOT SENTENCE
a b c d e f g h i j

(HOT is read as the word hot)

It looks like this has something to do with character count, word count, or word length, but I can't find a pattern.
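
For anyone who wants to try the four samples above, here is a minimal Python sketch that writes each one to a file and builds the matching espeak-ng command. Only -v and -f come from the command shown earlier in this comment. Adding -q (no audio) and -x (print phoneme mnemonics) is an extra choice to make the output easy to compare, and the file labels and helper names are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path

# The four two-line samples from this comment. The keys are hypothetical labels.
SAMPLES = {
    "again": "HOT SENTENCE - NEW POSSIBILITIES\nI wonder how to make this happen again\n",
    "ag": "HOT SENTENCE - NEW POSSIBILITIES\nI wonder how to make this happen ag\n",
    "hot_line": "HOT LINE\na b c d e f g h i j\n",
    "hot_sentence": "HOT SENTENCE\na b c d e f g h i j\n",
}


def write_samples(directory):
    """Write each sample to <directory>/<label>.txt and return {label: path}."""
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    paths = {}
    for label, text in SAMPLES.items():
        path = target / f"{label}.txt"
        path.write_text(text, encoding="utf-8")
        paths[label] = path
    return paths


def phoneme_command(path, voice="en"):
    """Build the argument list for: espeak-ng -v VOICE -q -x -f PATH."""
    return ["espeak-ng", "-v", voice, "-q", "-x", "-f", str(path)]


def print_phonemes(directory):
    """Run espeak-ng on every sample and print the phoneme output per label."""
    if shutil.which("espeak-ng") is None:
        print("espeak-ng not found on PATH")
        return
    for label, path in write_samples(directory).items():
        result = subprocess.run(
            phoneme_command(path), capture_output=True, text=True, check=False
        )
        print(f"{label}: {result.stdout.strip()}")
```

Calling print_phonemes("samples") prints one line of phonemes per sample, so the spelled and unspelled cases can be compared side by side if the behaviour reproduces on your system.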

Member

@valdisvi valdisvi commented Oct 2, 2017

I couldn't reproduce it on Linux using the sample text from a file. For me, eSpeak NG always reads it as a word, the same as on the online testing page. What operating system do you use?

Contributor

@jaacoppi jaacoppi commented Oct 3, 2017

The latest espeak-ng (commit c4cab2e) and espeak version 1.48.03 04.Mar.14 both have the same problem.

Linux Mint 18 Sarah, Xfce edition. Output of uname -a:
Linux (hostname) 4.4.0-45-generic #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Also, this is not specific to English. I tested with 8 different languages, and the same thing happened in each.

Sample file attached.

samplefile.txt

Contributor

@jaacoppi jaacoppi commented Jan 9, 2018

The code causing this issue seems to be here:

    } else if (!found && !(dictionary_flags[0] & FLAG_SKIPWORDS) && (word_length < 4) && (tr->clause_lower_count > 3)
               && (tr->clause_upper_count <= tr->clause_lower_count)) {
        // An upper case word in a lower case clause. This could be an abbreviation.
        spell_word = 1;

The comment says it all: an upper case word in a lower case clause. The code looks like it is meant to detect abbreviations such as BBS, IRC and TLA within a sentence.
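
To check whether this condition can explain the samples earlier in the thread, here is a rough Python sketch of it. It rests on assumptions that the excerpt does not confirm: that clause_upper_count and clause_lower_count are counts of upper and lower case letters over the whole clause, and that the heading line and the line after it end up in the same clause. The found and FLAG_SKIPWORDS checks are left out, and the function names are made up.

```python
def count_case(clause):
    """Return (upper, lower) letter counts for the clause text."""
    upper = sum(1 for ch in clause if ch.isupper())
    lower = sum(1 for ch in clause if ch.islower())
    return upper, lower


def would_spell(word, clause):
    """Approximate the quoted condition for one upper case word in a clause.

    Assumption: the two counters are letter counts over the whole clause.
    The dictionary checks (found, FLAG_SKIPWORDS) are not modelled.
    """
    upper, lower = count_case(clause)
    return word.isupper() and len(word) < 4 and lower > 3 and upper <= lower


# Samples from the thread, each treated as ONE clause (an assumption).
SAMPLES = [
    "HOT SENTENCE - NEW POSSIBILITIES\nI wonder how to make this happen again",
    "HOT SENTENCE - NEW POSSIBILITIES\nI wonder how to make this happen ag",
    "HOT LINE\na b c d e f g h i j",
    "HOT SENTENCE\na b c d e f g h i j",
]

for clause in SAMPLES:
    upper, lower = count_case(clause)
    print(f"upper={upper:2d} lower={lower:2d} spell HOT: {would_spell('HOT', clause)}")
```

Under these assumptions the upper/lower counts come out as 28/30, 28/27, 7/10 and 11/10, so the sketch predicts spelling for the first and third samples only, which matches what was reported above.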

The actual problem is the way espeak-ng handles clauses. In all of the examples above, each line is a clause. It seems that espeak-ng fails to recognise them as such.

What could be the solution? Simply starting a new clause after a line break would probably break a lot of text.
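
The trade-off can be seen in a small Python sketch. It is only an illustration: split_clauses is a made-up helper, not how espeak-ng splits clauses, and the merged mode ignores punctuation entirely.

```python
def split_clauses(text, break_on_newline):
    """Split text into clauses.

    break_on_newline=True: every non-empty line is its own clause.
    break_on_newline=False: all lines are joined into a single clause
    (a simplification; real clause detection also looks at punctuation).
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if break_on_newline:
        return lines
    return [" ".join(lines)]


def lower_count(clause):
    """Number of lower case letters in the clause."""
    return sum(1 for ch in clause if ch.islower())


heading = "HOT LINE\na b c d e f g h i j\n"
wrapped = "I wonder how to make\nthis happen again\n"

for text in (heading, wrapped):
    for break_on_newline in (False, True):
        clauses = split_clauses(text, break_on_newline)
        print(break_on_newline, [(c, lower_count(c)) for c in clauses])
```

With a break at every newline, HOT LINE becomes a clause with no lower case letters, so the abbreviation check quoted above would not fire. The hard-wrapped sentence, however, is cut into two clauses in the middle, which is the kind of breakage mentioned above.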

I suggest this issue be renamed to "handling of line breaks and clause detection" or something similar.

@vimalan-sakthivel vimalan-sakthivel changed the title from "eSpeak reads the word NEW (in CAPs) as N.E.W" to "handling of line breaks and clause detection" on Jun 27, 2019
Author

@vimalan-sakthivel vimalan-sakthivel commented Jun 27, 2019

Issue title updated. We still seem to be facing this issue. Can someone look into this?
