
Provide support for new Windows 10 One Core voices #6159

Closed
leonardder opened this issue Jul 8, 2016 · 60 comments
Assignees: jcsteh
Labels: p3 (https://github.com/nvaccess/nvda/blob/master/projectDocs/issues/triage.md#priority)
Milestone: 2017.3

@leonardder
Collaborator

Windows 10 has mobile voices which can formally be accessed only from Windows Store apps. However, it is possible to copy the registry keys for these new voices from "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech_OneCore\Voices\Tokens" to "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens". You have to do this in c:\windows\syswow64\regedit.exe in order for NVDA not to throw a COM error upon selection.

Of course, this is the ugly way. However, it might be possible to create a new synth driver which gets the voice tokens from the Speech_OneCore registry location.
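
For illustration, a minimal Python sketch of how such a driver might enumerate the voice tokens from the Speech_OneCore registry location. Only the key path comes from this comment; the function name, the use of each token key's default value as a display name, and the handling of registry redirection are assumptions, not NVDA code.

import winreg

ONECORE_TOKENS = r"SOFTWARE\Microsoft\Speech_OneCore\Voices\Tokens"

def enumerateOneCoreVoices():
    """Yield (tokenKeyName, description) for each registered OneCore voice.
    Whether the description is stored as the key's default value is an assumption."""
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, ONECORE_TOKENS) as tokens:
        index = 0
        while True:
            try:
                tokenName = winreg.EnumKey(tokens, index)
            except OSError:
                break  # No more subkeys.
            description = tokenName
            try:
                with winreg.OpenKey(tokens, tokenName) as token:
                    description = winreg.QueryValueEx(token, "")[0]
            except OSError:
                pass
            yield tokenName, description
            index += 1

if __name__ == "__main__":
    for token, name in enumerateOneCoreVoices():
        print(token, "-", name)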

@leonardder
Collaborator Author

After some investigation, there seems to be a reference to a SAPI_OneCore.SpVoice COM object in the registry. Alas:

>>> comtypes.client.CreateObject("SAPI_OneCore.SpVoice")
<POINTER(IUnknown) ptr=0x899cf80 at 534f440>
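
As an illustration of why that bare pointer is a dead end, here is a hedged sketch of what one might try next with comtypes. Only the class name comes from this comment; the rest is an assumption, not a working driver.

import comtypes
import comtypes.client
from comtypes.automation import IDispatch

# CreateObject only returned POINTER(IUnknown): comtypes found no type
# information for the object, so it couldn't pick a richer interface.
unk = comtypes.client.CreateObject("SAPI_OneCore.SpVoice")
print(unk)

try:
    # Probe for IDispatch to see whether late-bound automation is even possible.
    unk.QueryInterface(IDispatch)
    print("Object supports IDispatch; late-bound automation might work.")
except comtypes.COMError:
    print("No IDispatch either; the object isn't usable like a classic SAPI SpVoice.")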

@leonardder leonardder changed the title Provide support for new Windows 10 mobile voices Provide support for new Windows 10 One Core voices Jul 9, 2016
@josephsl
Collaborator

josephsl commented Jul 11, 2016

Hi,

In next.13417, I get the following traceback.

STR:

  1. Select SAPI Mobile.
  2. Use Control+NVDA+up arrow to select different voices until you arrive at the last one.
  3. Press the command once more.

Traceback:

ERROR - scriptHandler.executeScript (22:10:43):
error executing script: <bound method GlobalCommands.script_increaseSynthSetting of <globalCommands.GlobalCommands object at 0x04DDF230>> with gesture u'ctrl+NVDA+up arrow'
Traceback (most recent call last):
File "scriptHandler.pyc", line 186, in executeScript
File "globalCommands.pyc", line 227, in script_increaseSynthSetting
File "synthSettingsRing.pyc", line 118, in increase
File "synthSettingsRing.pyc", line 17, in increase
File "synthSettingsRing.pyc", line 53, in _set_value
File "synthDriverHandler.pyc", line 29, in changeVoice
File "synthDrivers\sapi5.pyc", line 193, in _set_voice
File "synthDrivers\sapi5.pyc", line 174, in _initTts
COMError: (-2147221164, 'Class not registered', (None, None, None, 0, None))

Setup:

  • NVDA version: next.13417.
  • OS: Windows 10 Insider build 14385
  • Synthesizer: SAPI Mobile.

Future commits may have already fixed this. Thanks.

@michaelDCurran
Member

Most likely the speech data for that language is not installed.

If you know its language, you can go to Windows 10's language settings, choose options on that language, and download the speech pack.

We can't really do much about the error. The voice is registered, but does not contain its data. There is no real way to filter that voice out without actually trying it.

Mick


@leonardder
Collaborator Author

leonardder commented Jul 11, 2016 via email

@josephsl
Collaborator

Hi,

Workaround confirmed.

For the record, the workaround is:

  1. Press Windows+I to open Settings.
  2. Select Time & language/Region & language.
  3. Select the language you need to download the speech pack for, then select options.
  4. Select one of the download buttons (one of them is the speech pack; you need to use object navigation to confirm which).

Issue at hand: I was using Korean, which explains why I got the COM error. Thanks.

@michaelDCurran
Member

I think there is a bit of indirection between voice tokens and filenames. But I'll have a look. It may not be that simple.

On 11/07/2016 3:36 PM, Leonard de Ruijter wrote:

Wouldn't it be possible to check the existence of the voice in c:\windows\system32\speech_onecore?



@MichelSuch
Contributor

Loading this new synth fails when the output device is not Microsoft Sound Mapper.
The problem also occurs with SAPI 5 synths in this next release.

@michaelDCurran
Member

Thanks. Fixed in commit 4810483

@leonardder
Collaborator Author

leonardder commented Jul 11, 2016

I think it is possible to fix the problem reported by @josephsl.

  1. In HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech_OneCore\Voices\Tokens, there is an item "MSTTS_V110_enAU_CatherineM" in my case.
  2. This key has an attribute called VoicePath which contains the path to the voice data file. In my case: %windir%\Speech_OneCore\Engines\TTS\en-AU\M3081Catherine
  3. Checking this path reveals that it doesn't exist. In other words, when the path cannot be found, ignore the voice, since its data is incomplete.

It's a mystery to me why Windows has these voices in the registry though. My Windows is English (UK), so I have no Australian English data installed.
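
A minimal sketch of the check described above, assuming winreg access; whether VoicePath sits on the token key itself or on an Attributes subkey isn't confirmed in this thread, so both are tried. This is illustrative, not NVDA code.

import os
import winreg

ONECORE_TOKENS = r"SOFTWARE\Microsoft\Speech_OneCore\Voices\Tokens"

def _readVoicePath(token):
    try:
        return winreg.QueryValueEx(token, "VoicePath")[0]
    except OSError:
        pass
    try:
        with winreg.OpenKey(token, "Attributes") as attrs:
            return winreg.QueryValueEx(attrs, "VoicePath")[0]
    except OSError:
        return None

def isVoiceComplete(tokenName):
    """Return True if the voice's data path exists, False if the voice should be ignored."""
    keyPath = ONECORE_TOKENS + "\\" + tokenName
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, keyPath) as token:
        voicePath = _readVoicePath(token)
    if not voicePath:
        return False
    # VoicePath may contain environment variables such as %windir%.
    return os.path.exists(os.path.expandvars(voicePath))

print(isVoiceComplete("MSTTS_V110_enAU_CatherineM"))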

@derekriemer
Collaborator

The speech rate isn't preserved when using the synth settings ring.

  1. Select a Microsoft X voice.
  2. Press NVDA+ctrl+down or up arrow while on the voice setting. Observe that your speech rate setting isn't preserved.


@leonardder
Collaborator Author

leonardder commented Jul 11, 2016 via email

@michaelDCurran
Member

Technically that is not a complete path. That is the base name of one of the files (.apm, .ini, etc.). I guess checking for the .ini would be safe enough.
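
So the existence check sketched earlier could be tightened along these lines; a hedged tweak, assuming the .ini file shares the VoicePath base name:

import os

def voiceDataPresent(voicePath):
    # VoicePath is a base name shared by .apm, .ini, etc.; per the comment
    # above, checking for the .ini should be safe enough.
    base = os.path.expandvars(voicePath)  # may contain %windir%
    return os.path.isfile(base + ".ini")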


@mohdshara

Hi.

In the Anniversary Update of Windows 10, Microsoft introduced an Arabic mobile voice. This is the first free Arabic TTS. Fixing this issue will bring NVDA to many Arab users who can't afford to buy a commercial TTS and will give many others a legal voice to use. What would it take for this to be committed?

@dkager
Collaborator

dkager commented Dec 5, 2016

Likewise, it would be great to use the new Dutch voice.

@feerrenrut
Contributor

@michaelDCurran It looks like this is already in progress. I'm going to set this to priority 2 to finish it off.

@feerrenrut feerrenrut added the p3 https://github.com/nvaccess/nvda/blob/master/projectDocs/issues/triage.md#priority label Dec 8, 2016
@jcsteh
Contributor

jcsteh commented Dec 12, 2016 via email

@jcsteh jcsteh self-assigned this Dec 12, 2016
@vortex1024

vortex1024 commented Apr 16, 2017

The Windows 10 Creators Update has brought another wave of mobile languages and voices. Here's the full list:
https://support.microsoft.com/en-gb/help/22797/windows-10-narrator-tts-voices
Among them is my native language. As far as I know, this is the only free, high-quality voice for it. I can't wait for that driver; too bad feature-specific donations don't exist...
It seems using the SAPI mobile voice with the registry hack introduces some lag. I tried to use it from C#, the Microsoft-recommended way, but it still has that lag. The only app I've found not to have that latency is... Narrator. I hope NVDA will be able to do whatever Narrator is doing to obtain the same speed.

@leonardder
Collaborator Author

@jcsteh: It seems that the Windows speech settings in the Settings app influence the speech rate range of NVDA.

STR:

  1. Go to Windows 10 Settings > Time & Language > Speech.
  2. Change the speech rate.

You will see that, when OneCore is selected in NVDA, changing the speech rate in the Windows Settings app allows you to set the rate much faster.

@jcsteh
Contributor

jcsteh commented May 5, 2017 via email

@derekriemer
Collaborator

derekriemer commented May 5, 2017 via email

@Mohamed00

Mohamed00 commented May 5, 2017 via email

@PratikP1

PratikP1 commented May 5, 2017 via email

jcsteh added a commit that referenced this issue May 18, 2017
@PratikP1

PratikP1 commented May 19, 2017 via email

@PratikP1

@jcsteh I found a case where OneCore voices seem to be generating errors. Steps and log below. Tested in Windows 10 Pro Insider 16199, NVDA version next-14051,e102a7e0.

STR:

  1. Open the following URL in Chrome Canary or latest Firefox. http://www.fundbox.com
  2. Once the page is loaded, read the contents by pressing NVDA+a.
  3. Follow steps 1 and 2 in Edge.

NVDA generates frequent error tones. See log for results.
nvda One Core Fundbox.TXT

@jcsteh
Contributor

jcsteh commented May 23, 2017 via email

@jcsteh
Contributor

jcsteh commented May 30, 2017

@PratikP1 commented on May 23, 2017, 12:48 AM GMT+10:

  1. Open the following URL in Chrome Canary or latest Firefox. http://www.fundbox.com
  2. Once the page is loaded, read the contents by pressing NVDA+a.

This should now be fixed in the latest next snapshot (next-14066,9f903895).

@PratikP1

@jcsteh commented

This should now be fixed in the latest next snapshot (next-14066,9f903895).
Thank you. Verified in NVDA version next-14067,067d2d17.

@PratikP1

Here are some additional issues I've noticed.

  • Setting OneCore voices as default and copying settings to the secure desktop does not carry over the OneCore voices' speed and other settings.
  • There are some pronunciation issues that I don't notice when I use the same voice with Narrator. For instance, names starting with Mc should be pronounced as "Mik xxxx", but NVDA consistently pronounces "M C" before the second part of the name.
  • I notice that certain voice dictionary entries seem to make no difference in how a particular thing is pronounced by NVDA. MacBook comes to mind. It's pronounced properly by Narrator, no dictionary entry required.
  • The strange thing about the previous two items is that when traversing the voice dictionary for the voice where the change was made, the entry to be replaced is actually pronounced correctly. Try entering McFail as a correction in the voice dictionary and save the change. Now return to the entry in the list of changes: the actual entry is pronounced correctly. The incorrect pronunciation occurs while reading or navigating.

@jcsteh
Contributor

jcsteh commented Jun 11, 2017

@PratikP1 commented on 11 Jun 2017, 05:54 GMT+10:

  • Setting OneCore voices as default and copying settings to the secure desktop does not carry over the OneCore voices' speed and other settings.

Did you save your settings before copying to secure desktop? Arguably, we should do this automatically, but I don't think we do right now, which means settings you changed since your last save won't apply.

  • There are some pronunciation issues that I don't notice when I use the same voice with Narrator. For instance, names starting with Mc should be pronounced as "Mik xxxx", but NVDA consistently pronounces "M C" before the second part of the name.

NVDA has a built-in dictionary rule to break up camel case words (e.g. CamelCase), since this is reasonably common. Unfortunately, this does break examples like McFail. It's a tricky problem. On one hand, we don't want to break names like McFail. On the other hand, that rule was added precisely because camel case words were causing users problems.
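
To illustrate the behaviour being described (this is not NVDA's actual built-in rule, just a regular expression of the same shape):

import re

# Insert a space before an upper-case letter that follows a lower-case one.
CAMEL_CASE = re.compile(r"(?<=[a-z])(?=[A-Z])")

for word in ("CamelCase", "McFail", "MacBook"):
    print(word, "->", CAMEL_CASE.sub(" ", word))
# CamelCase -> Camel Case   (intended behaviour)
# McFail    -> Mc Fail      (the collateral damage discussed above)
# MacBook   -> Mac Book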

@PratikP1

@jcsteh wrote:

Did you save your settings before copying to secure desktop? Arguably, we should do this automatically, but I don't think we do right now, which means settings you changed since your last save won't apply.

I did save the settings. I had OneCore voices running as default on the desktop for quite a bit before I applied the settings to the secure desktop. I didn't want to rely on an unstable synthesizer on the secure desktop because of its critical role.

NVDA has a built-in dictionary rule to break up camel case words (e.g. CamelCase), since this is reasonably common. Unfortunately, this does break examples like McFail. It's a tricky problem. On one hand, we don't want to break names like McFail. On the other hand, that rule was added precisely because camel case words were causing users problems.

What confuses me about this is that synthesizers such as eSpeak or Code Factory Eloquence don't seem to be affected by this type of issue.

@jcsteh
Contributor

jcsteh commented Jun 11, 2017

@PratikP1 commented on 12 Jun 2017, 09:19 GMT+10:

I did save the settings. I had OneCore voices running as default on the desktop for quite a bit before I applied the settings to the secure desktop.

What settings were lost? Did you lose the voice or just rate and pitch?

Note that rate is affected by the rate you set in system speech settings, which obviously can't be set for secure screens. There's nothing we can do about this right now, I'm afraid. Microsoft are working on exposing an API to make this better.

What confuses me about this [camel case rule] is that synthesizers such as eSpeak or Code Factory Eloquence don't seem to be affected by this type of issue.

eSpeak seems to pronounce "Mc Fail" (with a space) so it sounds the same as "McFail" (no space). I'm guessing Eloquence does the same.

@PratikP1

@jcsteh wrote:

What settings were lost? Did you lose the voice or just rate and pitch?
Note that rate is affected by the rate you set in system speech settings, which obviously can't be set for secure screens. There's nothing we can do about this right now, I'm afraid. Microsoft are working on exposing an API to make this better.

I lost the speed and pitch settings; the voice is kept. A Microsoft team member is following up with feedback I provided to them regarding the exposure of these settings via an API.

@nvaccessAuto nvaccessAuto added this to the 2017.3 milestone Jun 13, 2017
jcsteh added a commit that referenced this issue Jun 13, 2017
… issue #6159)

This uses a C++/CX dll to access the UWP SpeechSynthesizer class. There are other UWP APIs we might like to access in future (e.g. OCR), so rather than making this dll specific to OneCore speech, it's called nvdaHelperLocalWin10. The build system for this dll makes it easy to add other components in future.
In addition, this required code to generate balanced XML from an NVDA speech sequence. Although we use SSML for eSpeak, eSpeak happily accepts unbalanced (malformed) XML. OneCore speech does not. This code is in the speechXml module. This might eventually be reused to replace the ugly balanced XML code in the SAPI5 driver.

Note that NVDA can no longer be built with Visual Studio 2015 Express. You must use Visual Studio 2015 Community, as you also need the Windows 10 Tools and SDK. See the updated dependencies in the readme for details.
Also, we now bundle the VC 2015 runtime, as some systems don't have it and it is needed for OneCore Speech.
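
To illustrate the balancing problem the commit message describes, here is a simplified sketch, not the actual speechXml module: a flat sequence of text and prosody changes is turned into well-formed markup. The command representation and element names are illustrative assumptions.

from xml.sax.saxutils import escape

def toBalancedXml(sequence):
    """sequence: strings and ("pitch", value)-style tuples (illustrative commands).
    eSpeak tolerates unbalanced tags; OneCore does not, so every open tag is
    closed before a new value takes effect."""
    out = []
    open_ = False
    for item in sequence:
        if isinstance(item, tuple):
            if open_:
                out.append("</prosody>")      # close before re-opening
            out.append('<prosody %s="%s">' % (item[0], escape(str(item[1]))))
            open_ = True
        else:
            out.append(escape(item))
    if open_:
        out.append("</prosody>")              # balance any trailing open tag
    return "".join(out)

print(toBalancedXml(["Hello ", ("pitch", "+10%"), "world", ("pitch", "+20%"), "!"]))
# Hello <prosody pitch="+10%">world</prosody><prosody pitch="+20%">!</prosody>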
@PratikP1

I'm not sure whether this is an NVDA issue, but I doubt it. Just for reference, it appears that Windows 10 build 16226 in combination with NVDA version next-14129,104d2ff9 changes the voice name, which changes the voice dictionary file that is used. I had built up a large dictionary with a particular OneCore voice until now. With the latest update, the dictionary no longer refers to the OneCore voice as "Mobile": "oneCore-Microsoft Zira Mobile.dic" is no longer active with the Zira voice; rather, "oneCore-Microsoft Zira.dic" is now active. Anyone testing these voices in NVDA builds will need to rename the old Mobile dictionary to the newer name if they don't wish to lose dictionary changes for a particular OneCore voice.

To do this manually, use the following steps (a scripted sketch follows the list):

  1. Press Windows logo/start.
  2. Explore to the NVDA program folder.
  3. Inside the folder, go to the "explore NVDA User Configuration" item and press enter.
  4. Navigate to the "speech dics" folder and press enter.
  5. Navigate to the particular OneCore Mobile Voice dictionary file.
  6. Press f2 (or choose rename from the context menu.)
  7. Remove the " Mobile" portion of the file name, including the preceding space character, and press enter. Note that if a file with the new name already exists (because you created one before renaming the Mobile voice dictionary file), you will have to delete or rename it first.
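
For anyone comfortable scripting the rename, here is a hedged Python sketch that follows the steps above. The folder location and the "oneCore-* Mobile.dic" file pattern are assumptions based on this comment, so back up your dictionaries first.

import glob
import os

def renameOneCoreDicts(speechDictsDir):
    """Drop the " Mobile" part from OneCore voice dictionary file names."""
    for oldPath in glob.glob(os.path.join(speechDictsDir, "oneCore-* Mobile.dic")):
        newPath = oldPath.replace(" Mobile.dic", ".dic")
        if os.path.exists(newPath):
            print("Skipping, target already exists:", newPath)
            continue
        os.rename(oldPath, newPath)
        print("Renamed", os.path.basename(oldPath), "->", os.path.basename(newPath))

# Example call (path is illustrative; point it at the speech dicts folder
# inside your own NVDA user configuration):
# renameOneCoreDicts(r"C:\Users\you\AppData\Roaming\nvda\speechDicts")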

@mohdshara

I found an official list of Windows OneCore voices. I think it would be very beneficial if these were referenced in the user manual or wiki:
https://support.microsoft.com/en-gb/help/22797/windows-10-narrator-tts-voices

@jcsteh
Contributor

jcsteh commented Jul 10, 2017

@mohdshara commented on 26 Jun 2017, 20:00 GMT+10:

I found an official list of Windows OneCore voices. I think it would be very beneficial if these were referenced in the user manual or wiki.

Nice find; thanks. I updated the User Guide (22602cd #7366) to reference this link instead of the one we were referencing before, as it's more relevant and up to date.

@mohdshara

In the user guide you say: "Please note that the faster rates available with Narrator are not currently available with NVDA." I understand from this that even if you set the rate to 100% in text to speech settings and also set the rate to 100% in NVDA's voice settings, Narrator could still speak faster. IMHO, if my understanding isn't correct, this sentence should be removed. NVDA speaks really, really fast in that case and I don't think Narrator can still speak faster.

@jcsteh
Contributor

jcsteh commented Jul 11, 2017

I have my rate set to 100% in Settings and 100% in NVDA. It is fairly fast, but I can still understand it. In contrast, if I set the rate to 100% in Narrator, it is much faster; I cannot understand it.

@derekriemer
Collaborator

I have my rate set to 100% in Settings and 100% in NVDA. It is fairly fast, but I can still understand it. In contrast, if I set the rate to 100% in Narrator, it is much faster; I cannot understand it.

You can in fact get fast rates with NVDA: go into Settings and change the speech rate slider to 100% there, and then set NVDA to 100%.

@jcsteh
Contributor

jcsteh commented Jul 11, 2017

That's exactly what I'm saying. Even with both of those set, it's not as fast as Narrator's 100%. It's fast, but not "as" fast.

@leonardder
Collaborator Author

leonardder commented Jul 17, 2017

@jcsteh: Is there a particular reason why One Core speech doesn't work on UAC and other secure screens? STR:

  1. Press CTRL+ALT+Delete
  2. Press NVDA+control+S
  3. Select Windows OneCore voices

Result: I get a "could not load" message, even though OneCore is able to talk using Narrator just fine.

@jcsteh
Contributor

jcsteh commented Jul 17, 2017

@leonardder commented on 18 Jul 2017, 05:49 GMT+12:

Is there a particular reason why One Core speech doesn't work on UAC and other secure screens?

I'm not sure. Would you mind filing a follow up issue for this? Thanks.

@leonardder
Collaborator Author

Never mind my previous comment; it turned out that I had an old test driver for the system configuration in my synthDrivers folder.

@fernando-jose-silva

Hello to all.
I would like to thank you for the great work.
I had not used the OneCore voices before, but I decided to use them for testing.
I'm using Windows 10 15063.483 and NVDA next-14215.
On Saturday I used NVDA normally, without any errors, using the Vocalizer 1.1 driver.

On Sunday I started using the OneCore voices, and during the day NVDA stopped speaking twice: once when opening Chrome and again when opening Outlook.
I saw in previous comments that NVDA had shown possible crashes, but those had been corrected. Since I have now seen these errors myself, I would like to report them for possible investigation.
When NVDA loses its voice the machine continues to function (I asked a sighted person to check, and the machine responds normally), but with NVDA mute it is necessary to restart NVDA for speech to return.
I saved a log from when the error occurs; maybe it will help.
falha voz one core nvda-old.log.txt
Thank you one more time.

@TheWookieWay

I really want to use the Catherine Mobile Australian voice for my TTS, since I have to listen to it every day when I stream. The only instructions for this require hacking the registry, and they are outdated. It would be nice NOT to have to hack the registry to do this, but even if I have to, I could do it if I had correct instructions.
Does anyone have any info on using Catherine Mobile as my default voice for a program like the restream.io chat app? (It only sees non-mobile voices to select.)
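
For reference, a hedged Python sketch of the registry workaround from the top of this issue (copying the OneCore voice tokens into the classic SAPI tokens key, so that ordinary SAPI 5 applications can see them). The two key paths come from the first comment; everything else, including the 32-bit view handling that mirrors the syswow64\regedit.exe advice, is an assumption. Run as administrator and edit the registry at your own risk; redirection details may need adjusting on your setup.

import winreg

SRC = r"SOFTWARE\Microsoft\Speech_OneCore\Voices\Tokens"
DST = r"SOFTWARE\Microsoft\Speech\Voices\Tokens"

def copyKey(srcKey, dstKey):
    # Copy values.
    i = 0
    while True:
        try:
            name, value, valueType = winreg.EnumValue(srcKey, i)
        except OSError:
            break
        winreg.SetValueEx(dstKey, name, 0, valueType, value)
        i += 1
    # Recurse into subkeys.
    i = 0
    while True:
        try:
            sub = winreg.EnumKey(srcKey, i)
        except OSError:
            break
        with winreg.OpenKey(srcKey, sub) as s:
            with winreg.CreateKeyEx(dstKey, sub, 0,
                                    winreg.KEY_WRITE | winreg.KEY_WOW64_32KEY) as d:
                copyKey(s, d)
        i += 1

with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, SRC) as src:
    with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, DST, 0,
                            winreg.KEY_WRITE | winreg.KEY_WOW64_32KEY) as dst:
        copyKey(src, dst)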
