Texts with multiple line spacings are voiced with NVDA + down arrow and voices crack #11061

SeanTolstoyevski · 2020-04-27T01:55:47Z

Steps to reproduce:

Create the following text in Notepad.

test




test 123 123


test 111




test   ooooooo




test one



test two





teeest


wauw







test




test 123 123


test 111




test   ooooooo




test one



test two





teeest


wauw






test




test 123 123


test 111




test   ooooooo




test one



test two





teeest


wauw

Go to the beginning of the text.
Press NVDA + down arrow.

Actual behavior:

Voices crack in some places.

Expected behavior:

System configuration

NVDA installed/portable/running from source:

installed

NVDA version:

2019.3.1

Windows version:

windows 10 pro, 64 bit, 18363.778

Name and version of other software in use when reproducing the issue:

Other information about your system:

Other questions

Does the issue still occur after restarting your computer?

Yes

Have you tried any other versions of NVDA? If so, please report their behaviors.

Tested with versions after 2019.3.1.
Sounds are cracking.

If add-ons are disabled, is your problem still occurring?

Yes

Did you try to run the COM registry fixing tool in NVDA menu / tools?

Yes

fisher729 · 2020-04-27T11:40:16Z

Hi.

Which synthesizer is in use here? I can confirm a small crackle when using eSpeak, but I'm not sure if continuous reading of long text can cause this in eSpeak under normal circumstances, as I don't use it on the regular.

SeanTolstoyevski · 2020-04-27T16:17:36Z

Synthesizer is Espeak.

Adriani90 · 2020-04-27T22:17:39Z

I can reproduce this only with eSpeak, and even there it depends on which variant is in use. For example, the Quincy variant crackles much less than the "Robert" variant.
I am not sure NVDA can do anything about this. The sound of the voices actually comes from the eSpeak project. I suggest you also open an issue in their GitHub repository at
https://github.com/espeak-ng/espeak-ng

This issue might also be due to the Sonic library that eSpeak uses as its underlying driver. Reporting it to the developer of Sonic might help as well.

amirsol81 · 2020-04-28T09:17:28Z

As much as I tried, I had no luck breaking Eloquence, Windows OneCore, and eSpeak set to Persian which is based on British English, which, IMO, means good luck this time around.

feerrenrut · 2020-04-28T13:05:20Z

This may be improved by #11024, which has a build you can test with. Note that this build still has some issues; the PR is still only a draft.

SeanTolstoyevski · 2020-04-28T16:08:34Z

OK,

I will share this issue on Espeak.
Maybe it's Espeak's problem.

I will write the result here.

SeanTolstoyevski · 2020-05-03T21:24:03Z

Hi NVDA Team, I hope you are fine.

I opened an issue on the eSpeak repository.
There is no problem in eSpeak itself.
But I can't be sure about this, because eSpeak pauses between the lines.
NVDA skips over the blank lines and reads the text as one continuous stream.
Additionally, the problem is more noticeable when rate boost is on: the sounds crack more.
I know almost nothing about eSpeak or synthesizers. My knowledge is very limited.
So I don't think I can say anything definite.

The issue is here.
Maybe you can comment there to arrange a better test:
espeak-ng/espeak-ng#742

Adriani90 · 2020-05-06T22:57:28Z

This might also be related to #7769.

tspivey · 2020-08-06T18:39:36Z

Simpler STR:

Set eSpeak to voice English (Great Britain), variant none, rate 20, rate boost on.
Read this with notepad and say all:

equals

The room

Note that this doesn't happen with 2019.2.1.

SeanTolstoyevski · 2020-08-09T20:32:38Z

hi @tspivey ,

I finished the last test.

The sound still cracks in 2020.2.

Unfortunately, I often have to run 2019.2 as a portable copy.
Otherwise I cannot read a book. The cracking sounds are disturbing.

And English :). Sound, audio, voice...
It's a complex subject for someone whose native language is not English.
I hope you can understand me.

Adriani90 · 2020-08-10T18:31:45Z

Could you please test with the latest alpha version, which was released today?

tspivey · 2020-08-10T18:38:41Z

Test results: My example still breaks.

Adriani90 · 2020-08-10T21:15:29Z

cc: @jcsteh maybe you have some thoughts on this?

jcsteh · 2020-08-10T23:26:35Z

This is related to the indexes (or marks) sent to the synthesiser for cursor tracking, accurate synchronisation of sounds and synth changes, etc. Say all uses these to mark each line for cursor tracking. However, it did this in 2019.2 as well.

There are a couple of possibilities:

1. We're now sending more indexes than we used to and eSpeak chokes on that for some reason. To determine that, we'd need to look at the markup sent to eSpeak between 2019.2 and a current release. We're probably sending at least one more index than we used to.
2. In 2019.2, we sent indexes and then polled periodically to see whether an index had been reached. The problem with that is inaccuracy; it's (mostly) acceptable for say all, but totally unacceptable for synchronisation of sounds, etc. Now, we send the audio in such a way that we get accurate callbacks when an index has been reached. However, it's possible this might lead to a situation where a chunk of audio is too small, so the next chunk isn't ready yet by the time the previous chunk is finished (buffer underrun). That will cause audio glitches.
The question is why that's occurring here. Some of these chunks are probably pretty short, but they shouldn't be short enough to reliably cause buffer underrun.
To figure out whether this has something to do with our audio code and not eSpeak, we could try preventing nvwave.WavePlayer._feedUnbuffered from being called when onDone is provided, though this would completely break indexing and I'm not sure how well SpeechManager would cope with that.
If it is our audio code, I'd start by looking at the sizes of buffers being pushed into nvwave.WavePlayer._feedUnbuffered. If we're getting some really small chunks there, we want to know why, perhaps starting at nvwave.WavePlayer.feed and working backwards from there.
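
The buffer underrun described above can be illustrated with a small timing model. This is only a sketch under stated assumptions, not NVDA's audio code: `underrun_gaps`, `chunk_ms`, and `prep_ms` are hypothetical names, and the model assumes that each chunk takes a fixed time to prepare and submit, and that the producer blocks until the previously queued chunk has finished playing before it starts on the next one.

```python
from typing import List, Optional


def underrun_gaps(chunk_ms: List[float], prep_ms: float) -> List[float]:
    """Return the silent gaps (in ms) heard between chunks.

    Assumed model: preparing and submitting a chunk costs prep_ms, and
    after submitting chunk N the producer waits for chunk N-1 to finish
    before it starts preparing chunk N+1.
    """
    producer_clock = 0.0
    device_idle_at: Optional[float] = None  # when queued audio runs out
    gaps: List[float] = []
    for duration in chunk_ms:
        producer_clock += prep_ms  # time spent preparing this chunk
        if device_idle_at is None:
            start = producer_clock  # first chunk: nothing was playing yet
        else:
            if producer_clock > device_idle_at:
                # The device ran dry before this chunk arrived: an underrun.
                gaps.append(producer_clock - device_idle_at)
            start = max(producer_clock, device_idle_at)
            # Block until the previously queued chunk has finished playing.
            producer_clock = max(producer_clock, device_idle_at)
        device_idle_at = start + duration
    return gaps


# A very short chunk between two long ones leaves too little time to
# prepare the chunk that follows it.
print(underrun_gaps([500, 5, 500], prep_ms=30))
print(underrun_gaps([500, 500, 500], prep_ms=30))
```

In this model a gap appears only when a chunk is shorter than the preparation time of the chunk that follows it, which is the situation the comment suspects.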

jcsteh · 2020-08-10T23:28:23Z

It's also possible it's a combination of 1 and 2; i.e. this isn't an eSpeak bug, but we are sending two indexes very close together for some reason and that causes a buffer underrun. That raises the question of why we're sending indexes so close together. Either way, I think it's worth looking at the markup being sent to eSpeak.
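
A check for indexes that sit very close together could look roughly like the following. It is a sketch only: `Index` is a stand-in class defined here for illustration, not NVDA's own index command, and `close_indexes` and `min_chars` are hypothetical names. The sequence is assumed to be a flat list of text strings and index marks.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class Index:
    """Stand-in for an index mark in a speech sequence (hypothetical)."""
    number: int


def close_indexes(
    sequence: List[Union[str, Index]], min_chars: int = 1
) -> List[Tuple[int, int, int]]:
    """Return (previous index, next index, characters between them) for
    every pair of consecutive indexes separated by fewer than min_chars
    non-whitespace-trimmed characters of text."""
    found: List[Tuple[int, int, int]] = []
    previous: Optional[Index] = None
    chars_between = 0
    for item in sequence:
        if isinstance(item, Index):
            if previous is not None and chars_between < min_chars:
                found.append((previous.number, item.number, chars_between))
            previous = item
            chars_between = 0
        else:
            # Blank lines contribute no speakable text.
            chars_between += len(item.strip())
    return found


# Two lines of text separated by a blank line, one index per line.
sequence = ["equals", Index(1), "\n", Index(2), "The room", Index(3)]
print(close_indexes(sequence))
```

With one index per line, a blank line produces two indexes with no speakable text between them, which is the kind of pairing worth looking for in the real markup.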

josephsl · 2020-08-11T18:32:41Z

Hi, not that I’m opposed to this (we had a long discussion about Cython a while back), but I think it would be better to focus on one thing at a time (mental health takes priority). Thanks.

jcsteh · 2020-08-11T19:17:23Z

Actually, NVDA uses only a single background thread for synths and audio. I very much doubt the GIL is the bottleneck here. Of course, if you can prove otherwise, that's good info to have and solutions can then be considered... but let's work out the root cause before diving into solutions that may well not fix the problem.

SeanTolstoyevski · 2020-08-12T06:46:01Z

I'm not sure about that.

When I test with Python's performance profiler, I can see 30-40 ms delay for some functions.

I will do something. I will inform you about this when I move to the city.
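
For reference, a measurement like this can be taken with the standard library's `cProfile` and `pstats` modules. The sketch below is an assumption about how such a test might be set up, not the test that was actually run: `slow_step` is a hypothetical stand-in for whatever function is under suspicion, and `profile_call` is a helper written for this example.

```python
import cProfile
import io
import pstats
import time
from typing import Callable


def slow_step() -> None:
    """Hypothetical stand-in for a function that takes about 30 ms."""
    time.sleep(0.03)


def profile_call(func: Callable[[], None], top: int = 5) -> str:
    """Profile one call of func and return a report of the slowest entries."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


print(profile_call(slow_step))
```

A report of this kind shows where time is spent, but on its own it does not show whether the delay falls on the thread that feeds audio.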

NVDA's existing audio output code (nvwave) is largely very old and uses WinMM, a very old legacy Windows audio API. It is also written in pure Python, contains quite a few threading locks necessitated by WinMM, and parts of it have become rather difficult to reason about. There are several known stability and audio glitching issues that are difficult to solve with the existing code.

Description of user facing changes

At the very least, this fixes audio glitches at the end of some utterances as described in #10185 and #11061. I haven't noticed a significant improvement in responsiveness on my system, but my system is also very powerful. It's hard to know whether the stability issues (e.g. #11169) are fixed or not. Time will tell as I run with this more.

Description of development approach

1. The bulk of the WASAPI implementation is written in C++. The WASAPI interfaces are easy to access in C++ and difficult to access in Python. In addition, this allows for the best possible performance, given that we regularly and continually stream audio data.
2. The WinMM code fired callbacks by waiting for the previous chunk to finish playing before sending the next chunk, which could result in buffer underruns (glitches) if callbacks were close together (#10185 and #11061). In contrast, the WASAPI code uses the audio playback clock to fire callbacks independent of data buffering, eliminating glitches caused by callbacks.
3. The WinMM WavePlayer class is renamed to WinmmWavePlayer. The WASAPI version is called WasapiWavePlayer. Rather than having a common base class, this relies on duck-typing. I figured it didn't make sense to have a base class given that WasapiWavePlayer will likely replace WinmmWavePlayer altogether at some point.
4. WavePlayer is set to one of these two classes during initialisation based on a new advanced configuration setting. WASAPI defaults to disabled.
5. WasapiWavePlayer.feed can take a ctypes pointer and size instead of a Python bytes object. This avoids the overhead of additional memory copying and Python objects in cases where we are given a direct pointer to memory anyway, which is true for most (if not all) speech synthesisers.
6. For compatibility, WinmmWavePlayer.feed supports a ctypes pointer as well, but it just converts it to a Python bytes object.
7. eSpeak and oneCore have been updated to pass a ctypes pointer to WavePlayer.feed.
8. When playWaveFile is used asynchronously, it now feeds audio on the background thread, rather than calling feed on the current thread. This is necessary because the WASAPI code blocks once the buffer (400 ms) is full, rather than having variable sized buffers. Even with the WinMM code, playWaveFile code could block for a short time (#10413, "nvwave.playWaveFile not fully async"). This should improve that also.
9. WasapiWavePlayer supports associating a stream with a specific audio session, which allows that session to be separately configurable in the system Volume Mixer. NVDA tones and wave files have been split into a separate "NVDA sounds" session. WasapiWavePlayer has a new setSessionVolume method that can be used to set the volume of a session. This at least partially addresses #1409, "Ability to adjust volume of sounds".
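
The idea of firing callbacks from the playback clock rather than from chunk boundaries can be sketched in plain Python. This is a simplified model, not the C++ WASAPI implementation: `ClockCallbacks`, `feed`, and `poll` are hypothetical names used only for this example, and positions are counted in frames.

```python
from collections import deque
from typing import Callable, Deque, Optional, Tuple


class ClockCallbacks:
    """Fire callbacks once the playback position passes a recorded mark.

    Hypothetical model: data can be buffered ahead freely, because a
    callback depends only on how many frames have actually been played.
    """

    def __init__(self) -> None:
        self._frames_fed = 0
        self._pending: Deque[Tuple[int, Callable[[], None]]] = deque()

    def feed(self, frame_count: int,
             on_done: Optional[Callable[[], None]] = None) -> None:
        """Queue audio; remember where it ends if a callback is wanted."""
        self._frames_fed += frame_count
        if on_done is not None:
            self._pending.append((self._frames_fed, on_done))

    def poll(self, played_frames: int) -> None:
        """Call every pending callback whose mark has been played."""
        while self._pending and self._pending[0][0] <= played_frames:
            _, callback = self._pending.popleft()
            callback()


fired = []
player = ClockCallbacks()
player.feed(100, lambda: fired.append("line 1"))
player.feed(2, lambda: fired.append("line 2"))  # a very short chunk
player.feed(300)  # more audio is buffered without waiting for callbacks
player.poll(101)
print(fired)
player.poll(102)
print(fired)
```

Because `feed` never waits for earlier audio to finish, a very short chunk between two callbacks does not stall buffering in this model.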

CyrilleB79 · 2023-07-25T15:33:21Z

Reopening since WASAPI is not enabled by default anymore (#15172).

Reintroduces #14697
Closes #10185
Closes #11061
Closes #11615

Summary of the issue:

WASAPI usage should be re-enabled by default on alpha so that wider testing can occur.

Description of user facing changes

WASAPI is re-enabled; refer to #14697 for benefits.

Description of development approach

Change the feature flag default value to enabled.

SeanTolstoyevski mentioned this issue Apr 30, 2020

Cracks when voicing multi-line text espeak-ng/espeak-ng#742

Closed

jcsteh mentioned this issue Aug 12, 2020

Python 3 versions of NVDA produce a scratch in the speech when finishing the end of a line #10185

Closed

This was referenced Mar 4, 2023

nvWave: Use Windows Audio Session API #11615

Closed

Support for audio output using WASAPI #14697

Merged

michaelDCurran closed this as completed in #14697 Apr 30, 2023

nvaccessAuto added this to the 2023.2 milestone Apr 30, 2023

CyrilleB79 reopened this Jul 25, 2023

seanbudd modified the milestones: 2023.2, 2023.3 Jul 26, 2023

seanbudd mentioned this issue Jul 26, 2023

Re-enable WASAPI by default #15195

Merged

seanbudd closed this as completed in #15195 Jul 26, 2023

CyrilleB79 mentioned this issue Sep 20, 2023

UX fixes for WASAPI GUI and doc #15478

Merged

jcsteh mentioned this issue Dec 10, 2023

NVDA 2023.3 is not reading certain lists appropriately across Microsoft office. #15743

Closed
