Add CI #1

Merged
merged 7 commits into from
Mar 11, 2022
Conversation

sthibaul
Owner

No description provided.

@sthibaul sthibaul merged commit 433e385 into master Mar 11, 2022
@sthibaul sthibaul deleted the CI branch March 11, 2022 00:43
sthibaul added a commit that referenced this pull request Mar 15, 2022
Otherwise asan reports this during make check:

testing en ibm mit ibms mits IBM MIT APH CES ITX IBMs MIT's APHs CES's ITXs
==3733154==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffe420233ef at pc 0x7f2e8a30aef1 bp 0x7ffe42022c80 sp 0x7ffe42022c78
READ of size 1 at 0x7ffe420233ef thread T0
    #0 0x7f2e8a30aef0 in utf8_in2 src/libespeak-ng/translate.c:281
    #1 0x7f2e8a2a6db1 in MatchRule src/libespeak-ng/dictionary.c:2058
    #2 0x7f2e8a2a89e9 in TranslateRules src/libespeak-ng/dictionary.c:2301
    #3 0x7f2e8a30cc77 in addPluralSuffixes src/libespeak-ng/translate.c:393
    #4 0x7f2e8a30e2c9 in TranslateWord3 src/libespeak-ng/translate.c:684
    #5 0x7f2e8a31210b in TranslateWord src/libespeak-ng/translate.c:1100
    #6 0x7f2e8a313ef2 in TranslateWord2 src/libespeak-ng/translate.c:1361
    #7 0x7f2e8a31f4e2 in TranslateClause src/libespeak-ng/translate.c:2623
    #8 0x7f2e8a305010 in SpeakNextClause src/libespeak-ng/synthesize.c:1569
    #9 0x7f2e8a2e390e in Synthesize src/libespeak-ng/speech.c:457
    #10 0x7f2e8a2e552a in sync_espeak_Synth src/libespeak-ng/speech.c:570
    #11 0x7f2e8a2e5d1f in espeak_ng_Synthesize src/libespeak-ng/speech.c:678
    #12 0x7f2e8a2af2fd in espeak_Synth src/libespeak-ng/espeak_api.c:90
    #13 0x5618104c9137 in main src/espeak-ng.c:691
    #14 0x7f2e8953d7fc in __libc_start_main ../csu/libc-start.c:332
    #15 0x5618104c6569 in _start (/home/samy/ens/projet/1/speech/espeak-ng-git/src/.libs/espeak-ng+0x6569)

Address 0x7ffe420233ef is located in stack of thread T0 at offset 47 in frame
    #0 0x7f2e8a30cb3b in addPluralSuffixes src/libespeak-ng/translate.c:380

  This frame has 3 object(s):
    [32, 36) 'word_zz' (line 381)
    [48, 52) 'word_iz' (line 382) <== Memory access at offset 47 underflows this variable
    [64, 68) 'word_ss' (line 383)

and indeed, RULE_NOVOWELS keeps looking back until it finds a spacing
character, so we have to provide it with one.
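
The idea of giving a backward scan something to stop on can be sketched as follows. This is a minimal illustration, not the code from the commit: `has_vowel_before` and `suffix_with_guard` are hypothetical names, and the real rule matcher is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a rule that scans backwards from `word`
 * until it meets a space. The caller must guarantee that a space
 * exists somewhere before `word` in the same buffer. */
static int has_vowel_before(const char *word)
{
    assert(word != NULL);
    for (const char *p = word - 1; *p != ' '; p--) {
        if (*p == 'a' || *p == 'e' || *p == 'i' || *p == 'o' || *p == 'u')
            return 1;
    }
    return 0;
}

/* Fill a small suffix buffer so that it starts with a space. The
 * backward scan above then always terminates inside the array.
 * Returns a pointer to the first letter of the suffix. */
static const char *suffix_with_guard(char buf[4], char c1, char c2)
{
    buf[0] = ' ';  /* guard character for backward scans */
    buf[1] = c1;
    buf[2] = c2;
    buf[3] = '\0';
    return &buf[1];
}
```

Without the guard byte, a scan starting at the first letter would read whatever happens to sit before the array on the stack, which is the kind of access the report above flags.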
sthibaul added a commit that referenced this pull request Mar 15, 2022
strlen(p) may be arbitrarily long, which would underflow the word, for
instance:

testing fr Latn
=================================================================
==3741805==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffd733c1329 at pc 0x7ff5ffbad2de bp 0x7ffd733bf310 sp 0x7ffd733bf308
READ of size 1 at 0x7ffd733c1329 thread T0
    #0 0x7ff5ffbad2dd in IsLetterGroup src/libespeak-ng/dictionary.c:714
    #1 0x7ff5ffbbe425 in MatchRule src/libespeak-ng/dictionary.c:1979
    #2 0x7ff5ffbc09e9 in TranslateRules src/libespeak-ng/dictionary.c:2301
    #3 0x7ff5ffc26656 in TranslateWord3 src/libespeak-ng/translate.c:733
    #4 0x7ff5ffc2a10b in TranslateWord src/libespeak-ng/translate.c:1100
    #5 0x7ff5ffc2bef2 in TranslateWord2 src/libespeak-ng/translate.c:1361
    #6 0x7ff5ffc374e2 in TranslateClause src/libespeak-ng/translate.c:2623
    #7 0x7ff5ffc1d010 in SpeakNextClause src/libespeak-ng/synthesize.c:1569
    #8 0x7ff5ffbfbd46 in Synthesize src/libespeak-ng/speech.c:492
    #9 0x7ff5ffbfd52a in sync_espeak_Synth src/libespeak-ng/speech.c:570
    #10 0x7ff5ffbfdd1f in espeak_ng_Synthesize src/libespeak-ng/speech.c:678
    #11 0x7ff5ffbc72fd in espeak_Synth src/libespeak-ng/espeak_api.c:90
    #12 0x5627511a3137 in main src/espeak-ng.c:691
    #13 0x7ff5fee557fc in __libc_start_main ../csu/libc-start.c:332
    #14 0x5627511a0569 in _start (/home/samy/ens/projet/1/speech/espeak-ng-git/src/.libs/espeak-ng+0x6569)

Address 0x7ffd733c1329 is located in stack of thread T0 at offset 1177 in frame
    #0 0x7ff5ffc2f760 in TranslateClause src/libespeak-ng/translate.c:1941

  This frame has 16 object(s):
    [48, 52) 'cc' (line 1944)
    [64, 68) 'source_index' (line 1945)
    [80, 84) 'prev_in' (line 1948)
    [96, 100) 'prev_out' (line 1949)
    [112, 116) 'next_in' (line 1952)
    [128, 132) 'char_inserted' (line 1954)
    [144, 148) 'word_flags' (line 1963)
    [160, 164) 'charix_top' (line 1975)
    [176, 180) 'tone' (line 1985)
    [192, 196) 'next2_in' (line 2294)
    [208, 212) 'c_temp' (line 2518)
    [224, 374) 'number_buf' (line 2522)
    [448, 1048) 'num_wtab' (line 2523)
    [1184, 1984) 'sbuf' (line 1982) <== Memory access at offset 1177 underflows this variable
    [2112, 3720) 'charix' (line 1977)
    [3856, 7456) 'words' (line 1978)

sbuf is however properly headed by a '\0', so we can make IsLetterGroup
carefully walk back in the word and report a mismatch if it walks back
too far.

Fixes espeak-ng#1108
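
A bounded walk-back of this kind might look like the sketch below. It assumes, as the message states for sbuf, that the buffer begins with a '\0' byte. `walk_back` and `matches_before` are hypothetical helpers written for illustration; they are not the actual IsLetterGroup code.

```c
#include <assert.h>
#include <string.h>

/* Step back `n` bytes from `word`, one byte at a time. If a '\0' is met
 * on the way, the walk would leave the text, so NULL is returned and the
 * caller treats it as a mismatch. Assumes the buffer starts with '\0'. */
static const char *walk_back(const char *word, size_t n)
{
    assert(word != NULL);
    const char *w = word;
    for (size_t i = 0; i < n; i++) {
        w--;
        if (*w == '\0')
            return NULL;
    }
    return w;
}

/* Returns 1 if the bytes ending at `word` (inclusive) equal `group`,
 * 0 otherwise, without ever reading before the start of the buffer. */
static int matches_before(const char *word, const char *group)
{
    size_t len = strlen(group);
    if (len == 0)
        return 0;
    const char *start = walk_back(word, len - 1);
    return start != NULL && strncmp(start, group, len) == 0;
}
```

Checking each byte during the walk, rather than subtracting the group length in one step, is what keeps an arbitrarily long group string from moving the pointer below the buffer.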
sthibaul added a commit that referenced this pull request Mar 19, 2022
The memory sanitizer would complain:

==4157154==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x7fc191d0a85b in TranslateWord3 /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:1065:7
    #1 0x7fc191d02916 in TranslateWord /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:1100:14
    #2 0x7fc191d1b324 in TranslateWord2 /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:1448:15
    #3 0x7fc191d14ebc in TranslateClause /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:2623:17
    #4 0x7fc191cfbc9b in SpeakNextClause /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/synthesize.c:1569:2
    #5 0x7fc191cd52fc in Synthesize /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/speech.c:457:2
    #6 0x7fc191cd6d7c in sync_espeak_Synth /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/speech.c:570:29
    #7 0x7fc191cd6d7c in espeak_ng_Synthesize /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/speech.c:678:10
    #8 0x7fc191ca0340 in espeak_Synth /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/espeak_api.c:90:32
    #9 0x4a4381 in main /home/samy/brl/speech/espeak-ng-git/src/espeak-ng.c:691:3
    #10 0x7fc19168b7fc in __libc_start_main csu/../csu/libc-start.c:332:16
    #11 0x421449 in _start (/home/samy/ens/projet/1/speech/espeak-ng-git/src/.libs/espeak-ng+0x421449)

  Uninitialized value was created by a heap allocation
    #0 0x45000d in malloc (/home/samy/ens/projet/1/speech/espeak-ng-git/src/.libs/espeak-ng+0x45000d)
    #1 0x7fc191d1ca29 in NewTranslator /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/tr_languages.c:242:26
    #2 0x7fc191d1ca29 in SelectTranslator /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/tr_languages.c:482:7

(and similar for expect_verb_sn expect_noun, expect_past,
clause_upper_count, clause_lower_count)

Indeed, TranslateWord3 doesn't always initialize these fields. Better to
just initialize them directly when the Translator is created.
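
Initializing at creation time can be sketched as below. `MiniTranslator` is a hypothetical, heavily reduced stand-in that only borrows a few of the field names listed above; the real Translator structure and its constructor are not reproduced here, and using calloc is just one way to obtain defined initial values.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reduced translator state (illustration only). */
typedef struct {
    int expect_noun;
    int expect_past;
    int clause_upper_count;
    int clause_lower_count;
} MiniTranslator;

/* calloc zero-fills the allocation, so every field has a defined value
 * before any translation code reads it. Returns NULL on failure. */
static MiniTranslator *mini_translator_new(void)
{
    return calloc(1, sizeof(MiniTranslator));
}

/* Record one more upper-case word in the current clause. This read-modify-
 * write is only well defined because the counter started from zero. */
static void note_upper_word(MiniTranslator *tr)
{
    assert(tr != NULL);
    tr->clause_upper_count++;
}

static void mini_translator_free(MiniTranslator *tr)
{
    free(tr);
}
```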
sthibaul added a commit that referenced this pull request Mar 19, 2022
phonemes_name is only initialized when V_LANGUAGE is met. This is not
necessarily the case, notably with

testing espeak_SetVoiceByName("!v/Annie") (language variant; intonation)
Cannot set intonation: language not set, or is invalid.
Uninitialized bytes in __interceptor_strcmp at offset 0 inside [0x7fff8a875e30, 1)
==4169902==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4c6a49 in LookupPhonemeTable /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/synthdata.c:363:7
    #1 0x4c6a49 in SelectPhonemeTableName /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/synthdata.c:380:12
    #2 0x5098a9 in LoadVoice /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/voices.c:950:34
    #3 0x50edcf in espeak_ng_SetVoiceByName /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/voices.c:1585:7
    #4 0x4aad63 in espeak_SetVoiceByName /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/espeak_api.c:125:32
    #5 0x4a3fe1 in test_espeak_set_voice_by_name_language_variant_intonation_parameter /home/samy/brl/speech/espeak-ng-git/tests/api.c:356:2
    #6 0x4a3fe1 in main /home/samy/brl/speech/espeak-ng-git/tests/api.c:567:2
    #7 0x7f26e88cb7fc in __libc_start_main csu/../csu/libc-start.c:332:16
    #8 0x4213a9 in _start (/home/samy/ens/projet/1/speech/espeak-ng-git/tests/api.test+0x4213a9)

  Uninitialized value was created by an allocation of 'phonemes_name' in the stack frame of function 'LoadVoice'
    #0 0x504290 in LoadVoice /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/voices.c:519

so better catch it properly rather than relying on uninitialized data.
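
One way to catch the missing language explicitly is sketched below. The buffer size, the table contents and the helper names `lookup_table` and `select_table_for_voice` are assumptions made for this example; only the name phonemes_name comes from the message above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table lookup: returns the table index, or -1 if unknown. */
static int lookup_table(const char *name)
{
    static const char *const tables[] = { "en", "fr" };
    assert(name != NULL);
    for (int i = 0; i < 2; i++) {
        if (strcmp(tables[i], name) == 0)
            return i;
    }
    return -1;
}

/* `language` is NULL when the voice description never names a language. */
static int select_table_for_voice(const char *language)
{
    char phonemes_name[40] = "";  /* defined even if no language is seen */

    if (language != NULL)
        snprintf(phonemes_name, sizeof(phonemes_name), "%s", language);

    if (phonemes_name[0] == '\0')
        return -1;  /* report the problem instead of comparing garbage */

    return lookup_table(phonemes_name);
}
```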
sthibaul added a commit that referenced this pull request Mar 20, 2022
LookupDict2 looks forward in the wtab array, but it should still stop at
its end. Otherwise the memory sanitizer reports this:

testing en A. B C, D. E: F.
==65960==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x7ff9d7ef0de8 in LookupDict2 /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/dictionary.c:2676:11
    #1 0x7ff9d7eec2ec in LookupDictList /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/dictionary.c:2899:10
    #2 0x7ff9d802860a in TranslateWord3 /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:588:12
    #3 0x7ff9d80249d4 in TranslateWord /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:1100:14
    #4 0x7ff9d8051fe0 in TranslateWord2 /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:1361:11
    #5 0x7ff9d804885c in TranslateClause /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:2623:17
    #6 0x7ff9d800e4e9 in SpeakNextClause /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/synthesize.c:1569:2
    #7 0x7ff9d7fa50e6 in Synthesize /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/speech.c:457:2
    #8 0x7ff9d7fa41b3 in sync_espeak_Synth /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/speech.c:570:29
    #9 0x7ff9d7fa872f in espeak_ng_Synthesize /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/speech.c:678:10
    #10 0x7ff9d7f06584 in espeak_Synth /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/espeak_api.c:90:32
    #11 0x4a8be3 in main /home/samy/brl/speech/espeak-ng-git/src/espeak-ng.c:691:3
    #12 0x7ff9d78297fc in __libc_start_main csu/../csu/libc-start.c:332:16
    #13 0x421449 in _start (/home/samy/ens/projet/1/speech/espeak-ng-git/src/.libs/espeak-ng+0x421449)

  Uninitialized value was created by an allocation of 'words' in the stack frame of function 'TranslateClause'
    #0 0x7ff9d8035380 in TranslateClause /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:1941
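
A look-ahead that stops at the end of the filled part of the table can be sketched as follows. `MiniWord`, `MINI_FLAG_CAPITAL` and `capitals_after` are hypothetical; the sketch assumes the caller knows how many entries were actually written, and it does not mirror what LookupDict2 really inspects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced word-table entry (illustration only). */
typedef struct {
    unsigned flags;
} MiniWord;

#define MINI_FLAG_CAPITAL 0x1u

/* Count the capitalized words that directly follow wtab[pos].
 * `count` is the number of initialized entries; the loop never reads
 * wtab[count] or anything beyond it. */
static size_t capitals_after(const MiniWord *wtab, size_t count, size_t pos)
{
    assert(wtab != NULL);
    size_t n = 0;
    for (size_t i = pos + 1; i < count; i++) {
        if (!(wtab[i].flags & MINI_FLAG_CAPITAL))
            break;
        n++;
    }
    return n;
}
```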
sthibaul added a commit that referenced this pull request Mar 20, 2022
Special characters such as N, S1, etc. do not actually consume
characters. Their treatment should thus *not* update pre_ptr and post_ptr,
otherwise those would underflow/overflow, e.g. in the case

@) s (_NS1  [z]

this would overflow. This is for instance noticeable with the memory sanitizer:

ESPEAK_DATA_PATH=$PWD ./src/espeak-ng -qX "capitals"
Translate 'capitals'
  1	c        [k]

  1	a        [a]

  1	p        [p]

  1	i        [I]

  1	t        [t]

  1	a        [a]

  1	l        [l]
 20	l (C     [l]

==2837201==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x7f7f4422744b in utf8_in2 /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:281:2
    #1 0x7f7f442281bc in utf8_in /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:332:9
    #2 0x7f7f440e0d31 in MatchRule /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/dictionary.c:1767:21
    #3 0x7f7f440d937f in TranslateRules /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/dictionary.c:2320:6
    #4 0x7f7f44230e5f in TranslateWord3 /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:733:15
    #5 0x7f7f44229844 in TranslateWord /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:1100:14
    #6 0x7f7f44256e50 in TranslateWord2 /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:1361:11
    #7 0x7f7f4424d6cc in TranslateClause /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:2623:17
    #8 0x7f7f44213359 in SpeakNextClause /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/synthesize.c:1569:2
    #9 0x7f7f441a9f56 in Synthesize /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/speech.c:457:2
    #10 0x7f7f441a9023 in sync_espeak_Synth /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/speech.c:570:29
    #11 0x7f7f441ad59f in espeak_ng_Synthesize /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/speech.c:678:10
    #12 0x7f7f4410b3f4 in espeak_Synth /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/espeak_api.c:90:32
    #13 0x4a8be3 in main /home/samy/brl/speech/espeak-ng-git/src/espeak-ng.c:691:3
    #14 0x7f7f43a2e7fc in __libc_start_main csu/../csu/libc-start.c:332:16
    #15 0x421449 in _start (/home/samy/ens/projet/1/speech/espeak-ng-git/src/.libs/espeak-ng+0x421449)

  Uninitialized value was created by an allocation of 'sbuf' in the stack frame of function 'TranslateClause'
    #0 0x7f7f4423a1f0 in TranslateClause /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:1941

While trying to match _NS1, MatchRule is overflowing the buffer.

This had not usually been a problem because rules usually have these
non-consuming special characters last in the rule, so it did not matter
that post_ptr was pointing outside valid text.
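
The distinction between consuming and non-consuming rule tokens can be illustrated with a toy matcher. Everything here is an assumption made for the example: the token encoding (plain 'N' and 'S' bytes), the function name `match_post`, and the simplified matching loop are not how MatchRule encodes or processes rules.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rule tokens: any other byte must match one input byte,
 * while these two only carry flags and consume nothing. */
enum { TOK_NO_SUFFIX = 'N', TOK_SUFFIX = 'S' };

/* Match `rule` against the text starting at `text`. Returns the number
 * of input bytes consumed, or -1 on mismatch. */
static int match_post(const char *rule, const char *text)
{
    assert(rule != NULL && text != NULL);
    const char *post_ptr = text;

    for (; *rule != '\0'; rule++) {
        if (*rule == TOK_NO_SUFFIX || *rule == TOK_SUFFIX)
            continue;  /* flag-only token: post_ptr stays where it is */
        if (*post_ptr != *rule)
            return -1; /* also stops at the text terminator */
        post_ptr++;    /* only a real match advances the pointer */
    }
    return (int)(post_ptr - text);
}
```

Because the flag-only tokens leave post_ptr untouched, their position in the rule does not matter: putting them before a consuming token cannot push the pointer past the end of the text.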
sthibaul pushed a commit that referenced this pull request Aug 25, 2022