Callback commands are not handled correctly. #22

mltony · 2019-10-06T00:07:07Z

Hello, this is mltony. I am working on an NVDA add-on called Phonetic Punctuation:
https://github.com/mltony/nvda-phonetic-punctuation
The IBM TTS driver doesn't seem to handle callback commands correctly in its Python 3 version in some cases. Here are the steps to reproduce:

1. Install the Phonetic Punctuation add-on:
   https://github.com/mltony/nvda-phonetic-punctuation/releases/download/v0.2dev/phoneticPunctuation-0.2dev.nvda-addon
   Note that it requires NVDA 2019.3 alpha.
2. Speak the following phrase:
   Test test!!!

Expected behavior: the ding-ding-ding sounds should start playing after "test test" has been spoken.
Actual behavior: the ding sounds play at the same time as the "test test" utterance is spoken.

Phonetic punctuation converts this utterance into:

["Test test", <CallbackCommand that actually plays those ding sounds>, <BreakCommand for the duration of the sound>]

So I suspect that your driver triggers the callback before the utterance has been fully spoken.
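The expected ordering can be sketched with a small simulation. This is only an illustration: `CallbackCommand` and `BreakCommand` here are minimal stand-ins defined in the snippet, not NVDA's real classes, and `play_in_order` and the 300 ms pause are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class CallbackCommand:
    """Stand-in for a speech command that runs a function when reached."""
    callback: Callable[[], None]


@dataclass
class BreakCommand:
    """Stand-in for a speech command that inserts a pause."""
    time_ms: int


SpeechItem = Union[str, CallbackCommand, BreakCommand]


def play_in_order(sequence: List[SpeechItem]) -> List[str]:
    """Process items strictly in order and return a log of what happened."""
    log: List[str] = []
    for item in sequence:
        if isinstance(item, str):
            # Text must be fully spoken before the next item is handled.
            log.append(f"spoken: {item}")
        elif isinstance(item, CallbackCommand):
            item.callback()
            log.append("callback fired")
        elif isinstance(item, BreakCommand):
            log.append(f"pause: {item.time_ms} ms")
    return log


events: List[str] = []
sequence: List[SpeechItem] = [
    "Test test",
    CallbackCommand(lambda: events.append("ding")),
    BreakCommand(time_ms=300),  # hypothetical sound duration
]
log = play_in_order(sequence)
print(log)
```

In this model the callback can only fire after the preceding text has been logged as spoken, which is the behavior the add-on relies on.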

With other synthesizers Phonetic Punctuation works correctly; I tested with eSpeak, SAPI, OneCore and the other version of Eloquence.

There is also another small problem: it seems that the duration of the pause in a break command must be multiplied by some coefficient, which seems to be equal to 3. If you try to speak this phrase with Phonetic Punctuation:
!!!!!!!!Test
you will hear the word "test" much sooner than the dings end, because of that problem.


Neurrone · 2019-10-12T11:47:07Z

@mltony which other version of Eloquence were you testing with? CodeFactory's?

davidacm · 2019-10-12T13:55:38Z

Thanks. It's urgent to solve, but I can't at this time because I'm outside of my country. I'll fix it when I get back. If someone can fix it, I can review the pull request and accept it.

mltony · 2019-10-12T18:46:14Z

I was testing with this one:
https://github.com/pumper42nickel/eloquence_threshold/
With this one phonetic punctuation works fine.

davidacm · 2019-10-27T11:46:20Z

Hi, I need your collaboration. Please read my entire long comment and let me know if you can find a solution for this issue.

The situation:

1. The other driver that you mentioned has the same crackling issue; I don't want to introduce another issue to solve this one.
2. The issue: the IBMTTS driver uses a stream to buffer a certain quantity of audio. When that buffer is full, the audio is sent to NVDA's player, and all indexes received are sent as well. This way we avoid voice breakage, but index accuracy is lost. In the IBMTTS driver the indexes are sent early; I could change this behavior, but then the indexes would be delayed by the audio buffer.
3. The solution I tried: send the audio stream when the buffer is full or when an index is received.
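The trade-off described above can be illustrated with a simplified model. This is a sketch, not the driver's actual code: `BufferedSender`, its fields, and the tiny buffer size are hypothetical, and the assumption that an index is forwarded immediately in the current mode is only one reading of "sent early".

```python
from typing import List, Tuple, Union

Chunk = Union[bytes, int]  # bytes = audio data, int = index marker


class BufferedSender:
    """Toy model of an audio buffer that forwards audio and index markers."""

    def __init__(self, buffer_size: int, flush_on_index: bool) -> None:
        self.buffer_size = buffer_size
        self.flush_on_index = flush_on_index
        self._audio = bytearray()
        self.sent: List[Tuple[str, int]] = []  # what reached the player, in order

    def _flush(self) -> None:
        # Send whatever audio is buffered, even if the buffer is not full.
        if self._audio:
            self.sent.append(("audio", len(self._audio)))
            self._audio.clear()

    def feed(self, chunk: Chunk) -> None:
        if isinstance(chunk, int):
            if self.flush_on_index:
                # Attempted fix: push out pending audio first, then the index.
                self._flush()
            # Without the fix, the index overtakes audio still in the buffer.
            self.sent.append(("index", chunk))
        else:
            self._audio.extend(chunk)
            if len(self._audio) >= self.buffer_size:
                self._flush()

    def close(self) -> None:
        self._flush()


def run(flush_on_index: bool) -> List[Tuple[str, int]]:
    sender = BufferedSender(buffer_size=4, flush_on_index=flush_on_index)
    for chunk in (b"aa", 1, b"bb"):
        sender.feed(chunk)
    sender.close()
    return sender.sent


early = run(flush_on_index=False)
accurate = run(flush_on_index=True)
print(early)
print(accurate)
```

In the first run the index arrives before any audio. In the second it arrives in the right place, but the player receives two small chunks instead of one full buffer, which is the situation where the breaks were observed.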

results:

The issue from point 2 (voice breakage) appeared for many sentences in Spanish; the cases are different for each language. I tried some English cases for you.

steps to reproduce:

I don't know if this issue depends on hardware specs; on your computer you may need different parameters. Here are my main computer's specs:

environment:

Application: Notepad.
Operating system: Windows 10 Pro (1903), 64-bit.
Computer brand: MSI.
Model: GS65.
CPU: Intel i7-8750H.
RAM: 32 GB Kingston HyperX.
Drive: SSD, 512 GB Samsung 970 Pro.
GPU: NVIDIA 1070.

Steps:

The breaks happen at the end of a string, at specific speeds and with specific sentences. I can mention many cases in Spanish (my language), but in English you need to find them. Here are some that I found in 5 minutes using American English.

1. Set the Eloquence driver to American English.
2. Set the Eloquence driver to the specified rate. You can also run the same test with IBMTTS to confirm that the issue is not present in that driver.
3. Read the following sentences.

rate at 0%:

rate 0

at 10%:

rate 10

at 15%:

this is the number 20

at 20%:

eco
comma
number
papa
alpha

at 30%:

colon
50

at 50%:

rate 50
rate 0

Mohamed00 · 2020-08-02T20:23:05Z

I believe I may have found a solution to this on the eloquence_threshold side of things. Adding buffered=True to the nvwave.WavePlayer constructor, and setting nvwave.WavePlayer.MIN_BUFFER_MS to at least 900 seems to fix the issue.
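To get a feel for what a 900 ms minimum buffer means, here is a small arithmetic sketch. It assumes 16-bit mono PCM at 11025 Hz (the Eloquence sample rate mentioned elsewhere in this thread); the bit depth and channel count are assumptions, and `buffer_size_bytes` is a hypothetical helper, not part of nvwave.

```python
def buffer_size_bytes(buffer_ms: int, samples_per_sec: int,
                      bits_per_sample: int = 16, channels: int = 1) -> int:
    """Return how many bytes of PCM audio correspond to buffer_ms milliseconds."""
    bytes_per_second = samples_per_sec * channels * (bits_per_sample // 8)
    return buffer_ms * bytes_per_second // 1000


ELOQUENCE_SAMPLE_RATE = 11025  # Hz, as stated in this thread

size_900 = buffer_size_bytes(900, ELOQUENCE_SAMPLE_RATE)
print(size_900)  # 19845
```

Under these assumptions, 900 ms corresponds to 19845 bytes of audio. If the player holds back audio until that much has accumulated, short utterances could plausibly start later than they otherwise would; that is a guess about the buffering behavior, not a confirmed description of nvwave.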

davidacm · 2021-05-09T18:28:07Z

Hi, has this issue been fixed?
I can't find the code fix, but I tried the proposed solution and it introduced another issue for me. Sometimes the synth has a lag of some milliseconds if I use buffered=True.

Mohamed00 · 2021-05-09T18:34:52Z

Pretty much. I got a report on another repository that the solution I used worked pretty well for someone. For a time I was considering making an accurate indexing option, noting that it could cause crackles, but I'm not sure how practical that would be.

ultrasound1372 · 2022-05-11T19:03:34Z

I saw a commit a ways up that apparently did something related to this. Do we still have this issue? Have we examined how other synthesizers handle accurate indexing without crackling like that? Could it have to do with the fact that NVWave also has to do some internal resampling as the audio is sent to the output? Eloquence runs at 11025 Hz, while most contemporary synthesizers run at 22050 Hz, and some at 16000 Hz. eSpeak might actually run higher. If your system sample rate is set to 44100 Hz, upsampling from 11025 Hz is easy, as integer ratios always are: just some brief interpolation. But if your system is set to 48000 Hz, perhaps it has to do more work? Or does it pass that off to Windows? Have we looked at the DECtalk access32 drivers to see if they have accurate indexing, and if they do, what settings they use? They are another known set of synths that run at 11025 Hz.
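The ratios in question can be checked quickly. This sketch only computes the resampling ratio for each pair of rates mentioned above; it says nothing about how NVWave or Windows actually resamples, and `resample_ratio` is a hypothetical helper name.

```python
from fractions import Fraction


def resample_ratio(output_rate: int, source_rate: int) -> Fraction:
    """Return the exact output/source sample-rate ratio in lowest terms."""
    return Fraction(output_rate, source_rate)


for source_rate in (11025, 16000, 22050):
    for output_rate in (44100, 48000):
        ratio = resample_ratio(output_rate, source_rate)
        kind = "integer" if ratio.denominator == 1 else "fractional"
        print(f"{source_rate} -> {output_rate}: {ratio} ({kind})")
```

Going from 11025 Hz to 44100 Hz is an exact factor of 4, while 11025 Hz to 48000 Hz is 640/147, which is not an integer ratio.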

ultrasound1372 · 2023-06-04T16:55:45Z

May be worth revisiting this discussion in relation to the NVDA alphas that add WASAPI support and the accompanying refactor of NVWave, as this might mitigate the crackling altogether. We can then either choose to install support for both the current method and a new, more accurate method depending on NVDA's version, or make a release of that add-on after that version is put out as an RC that has it as a minimum. @davidacm What do you think?
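The first option, supporting both methods depending on NVDA's version, could look roughly like this. It is a minimal sketch: `pick_index_strategy` and the strategy names are hypothetical, and the threshold version passed in the usage lines is a placeholder that would need to be checked against the NVDA release that actually ships the WASAPI changes. Inside NVDA the running version would come from NVDA itself rather than being hard-coded.

```python
from typing import Tuple

Version = Tuple[int, int]  # (year, major), e.g. (2019, 3)


def pick_index_strategy(nvda_version: Version,
                        first_wasapi_version: Version) -> str:
    """Choose the indexing method based on the running NVDA version."""
    if nvda_version >= first_wasapi_version:
        # Newer audio backend: try the more accurate index timing.
        return "accurate"
    # Older NVDA: keep the current buffered method to avoid crackling.
    return "buffered"


PLACEHOLDER_THRESHOLD: Version = (2023, 2)  # placeholder, not confirmed here

old_choice = pick_index_strategy((2019, 3), PLACEHOLDER_THRESHOLD)
new_choice = pick_index_strategy((2023, 2), PLACEHOLDER_THRESHOLD)
print(old_choice, new_choice)
```

Comparing tuples keeps the check simple and avoids parsing version strings.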

davidacm closed this as completed Oct 27, 2019

davidacm reopened this Oct 27, 2019

davidacm pushed a commit that referenced this issue Mar 18, 2021: Merge pull request #22 from Mohamed00/personal (9d52c20, Locale fix)

This was referenced Jun 25, 2023:

Fixing timing of index events #94 (Closed)

Fixing timing of index events without causing crackling #96 (Open)
