ESpeak Voice Sounds Harsher in Master and Next Versions #5868

Closed
dgoldfield opened this issue Apr 7, 2016 · 15 comments

@dgoldfield

The new ESpeak voices sound harsher, particularly in words containing the letter V, such as "level." It sounds like it is saying "lebel," which becomes obvious when moving through headings on the Web and I hear items such as "heading lebel 1." Some words, such as "Internet," almost have a slight pop sound at the beginning, as though effects on my sound card were enabled. This is with English U.S. voices. This happened once before and it was addressed; I could try to locate the ticket if it would help.

@derekriemer
Collaborator

Is this the same as the command line flag that was added a couple of
years ago? I remember that coming across the list. I remember it had to
do with the phoneme data being clipped at the end or something.

@Brian1Gaff

Actually, I cannot hear this on the internal Realtek hardware, but a
Behringer sound card on USB has the start and end clicks and a little more
bass, which does, on some voices, make "level" sound a little like "label."
My guess is that it's a sound card driver issue of some sort. It's not major,
but at times it also tends to hide the end of a piece of speech with a click, so
it sounds truncated.
Brian

@dgoldfield
Author

I'm willing to see if there is a driver update for my sound card, but I can tell you that this is not occurring in 2016.1.

@dgoldfield
Author

#3860 is exactly the same issue as what I'm hearing now, if that is helpful.

@jcsteh
Contributor

jcsteh commented Apr 8, 2016

Any ideas, @michaelDCurran? Seems we have #3860 again. Reading briefly, that was apparently due to badly compiled phoneme data.

@michaelDCurran
Member

@dgoldfield What exact eSpeak settings are you using? I.e. rate, rate boost, variant, pitch. Also, what kind of sound card? Can you confirm that the issue is not seen in 2016.1?

@dgoldfield
Author

  1. I can confirm that the change I've reported is not at all present in
    2016.1.
  2. Using the English United States voice. Variant: male3 (although it
    does not just happen with this variant). No rate boost. Rate: 50. Pitch:
  3. Inflection: 100. Volume: 100.
  4. The only thing listed under sound card in Device Manager is "high
    definition audio device." Same is listed in System Information. If you
    believe that locating a more current driver will help I am willing to
    pursue this but again, this issue has not existed for years and only
    surfaced in the newer master and next branches.

@michaelDCurran
Member

@dgoldfield Any possibility of getting a recording of both 2016.1 and next? Annoyingly, I cannot reproduce the issue yet on at least 2 machines. I remember the old bug, and that was caused by using an incorrect version of eSpeakEdit. However, eSpeak now has the ability to compile the phoneme data itself. It is very possible that there is a bug in the compilation code, but nothing can be done until it can be reproduced.
What kind of an impact does this issue have? I.e. does it make eSpeak unusable for you?

@dgoldfield
Author

I can probably attach two audio samples, both from 2016.1 and a newer
master. I wouldn't say it makes ESpeak unusable but it makes it
unpleasant. I'll work on this but I won't be able to get to it until the
weekend, unfortunately. Thanks for at least trying to track it down.

@michaelDCurran
Member

For me, if I do hear anything, it sounds as if next/master is slightly louder, and perhaps slightly compressed, compared to 2016.1.
@dgoldfield Does the issue lessen if you decrease eSpeak's volume to, say, 95 or 90?
For now I'm thinking that eSpeak is simply producing audio louder than it should, and some sound cards are then compressing the audio.

@michaelDCurran
Member

Next/master certainly also has slightly different EQing. Less treble perhaps. Also some kind of low shelf.

@rhdunn

rhdunn commented Apr 23, 2016

It would be useful to track down what is causing this difference.

On my GitHub espeak branch, I have been able to compile espeak on Linux for a long time (all the tags should be buildable). They will likely require some work to get them to build on Windows, but that could be useful trying to track down the cause of the issue.

Some things I want to test are:

  1. the version of espeak NVDA 2016.1 is using compared with the 1.48.15 version (the latest release from Jonathan);
  2. espeak 1.48.15 compared with the espeak-ng build;
  3. building the phoneme data on Windows and on Linux;
  4. building the phoneme data from the command line vs the espeakedit application.

This should help isolate where the issue is being introduced.
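
For items 3 and 4, a cheap first check is whether the compiled data differs at all between two builds. Below is a minimal sketch, assuming each build's compiled data sits in its own directory. The function names and the directory paths in the usage note are hypothetical, and nothing here is part of espeak or NVDA; it only hashes files and reports which ones differ.

```python
import hashlib
from pathlib import Path


def file_digests(directory):
    """Map each file's path (relative to directory) to the SHA-256 of its contents."""
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def differing_files(dir_a, dir_b):
    """Return sorted relative paths that are missing on one side or differ in content."""
    a = file_digests(dir_a)
    b = file_digests(dir_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

Calling `differing_files("data-from-windows", "data-from-linux")` (hypothetical paths) would list the files worth inspecting further. Identical files would point away from the data build and toward the synthesis code itself.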

@dgoldfield
Author

I have recorded two separate .wav files. This system will not allow me to upload them, saying this type of file is not accepted.
Here is a Dropbox link you can use to get the files.

https://www.dropbox.com/sh/jx7d0kfac0pm2rh/AADyorRuZl3zCRf0Eiq8HQnXa?dl=0

The current build audio file is using 2016.1 and the master build audio file is using a master from April 21. In the file, I alt-tab into the Jarte text editor which contains the following sentence.

Welcome to heading level 1. I am testing this synthesizer as I dialog with all of you about the various NVDA issues.

I have NVDA read the file and then I navigate through some of it word by word and then character by character. I then alt-tab back into Audacity and stop the recording.
Both builds are using U.S. English, the Male 3 variant, rate at 45, pitch 51, volume 100 and inflection 100. Decreasing the volume in the master build does not solve the problem.
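
The earlier suggestion in this thread that the newer build may simply be louder can be checked against two recordings like these by comparing their peak and RMS levels. What follows is a minimal sketch, assuming both files are 16-bit PCM WAV; the function name and the file names in the usage note are hypothetical. It measures overall level only and says nothing about EQ or phoneme changes.

```python
import math
import struct
import wave


def wav_levels(path):
    """Return (peak, rms) as fractions of full scale for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        raw = wav.readframes(wav.getnframes())
    count = len(raw) // 2
    if count == 0:
        return 0.0, 0.0
    # WAV stores 16-bit samples as little-endian signed integers;
    # channels are interleaved and are all measured together here.
    samples = struct.unpack("<%dh" % count, raw[:count * 2])
    peak = max(abs(s) for s in samples) / 32768.0
    rms = math.sqrt(sum(s * s for s in samples) / count) / 32768.0
    return peak, rms
```

Running `wav_levels("current.wav")` and `wav_levels("master.wav")` (hypothetical names) and comparing the pairs gives a rough answer: a peak at or very near 1.0 in only one of the files would be consistent with clipping or compression, while a higher RMS at a similar peak would suggest the audio is denser rather than merely louder.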

@michaelDCurran
Member

michaelDCurran commented Apr 24, 2016

Please try the latest NVDA Next snapshot (13300,9ab71476 or later). This contains the latest update to espeak-ng, which apparently may fix the issue. On my system, the difference I was hearing seems to have gone away.

@dgoldfield
Author

Congratulations to both NV Access and the ESpeak development team for this work. Yes, I believe it is fixed. At first, I wasn't sure as the two versions do have some differences but I think the differences I'm now hearing are changes to some of the phonemes. However, most of that harshness is gone and, in some ways, I think I'm even liking the new version a bit better. Thank you to all of you for your willingness to track this down. ESpeak is actually my preferred synthesizer when using NVDA and it's nice to know I won't need to switch to something else. Many thanks.

@jcsteh jcsteh closed this as completed in 499c3d0 May 6, 2016
@jcsteh jcsteh added this to the 2016.2 milestone May 6, 2016