Speech Refactor Use Case Notes

Here are some quick notes about how the new speech framework introduced in #7599 can be used to handle some of the trickier use cases. They are by no means complete.

Profile switching use cases

Anything that needs to change the voice, synth, etc. can be implemented using ConfigProfileTriggerCommand in speech sequences. Rather than adding specific, restrictive settings (e.g. a synth for a specific language, a speech rate for math), profile triggers can be used instead. This allows for a lot more flexibility: for example, rather than only being able to set the rate for math, you could also change the voice; anything you can change in a profile is possible. We might want to provide wizards or similar to help set up common use cases, though.

Relevant use cases include:

  • Switch to specific synths for specific languages #279
  • Changing speeds for different languages #4738
  • Voice aliases #4433
  • Introduce a special Math speech rate #7274

The general idea is:

  1. Create an appropriate profile trigger; e.g. a LanguageProfileTrigger which is used whenever a LangChangeCommand is encountered.
  2. Use speech.ConfigProfileTriggerCommand to enter/exit this trigger in a speech sequence when appropriate. For example, we'd probably add commands for LanguageProfileTrigger in speech.speak. For math, we might output commands for MathProfileTrigger in speech._speakTextInfo_addMath. (See the sketch after this list.)
  3. Provide a GUI for users to configure this trigger in the New Profile/Config Profile Triggers dialogs. This is pretty easy for something like math; it can be done similarly to say all. However, we might need some kind of UI to add additional triggers for things with many possible options, such as languages.
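
As a rough illustration of steps 1 and 2, here is a minimal sketch. LanguageProfileTrigger is the hypothetical trigger from step 1, config.ProfileTrigger is NVDA's existing base class for profile triggers, and the exact ConfigProfileTriggerCommand signature is assumed from #7599.

```python
# Minimal sketch of steps 1 and 2. LanguageProfileTrigger is hypothetical;
# the ConfigProfileTriggerCommand signature is assumed from #7599.
import config
import speech

class LanguageProfileTrigger(config.ProfileTrigger):
	"""Hypothetical trigger active while text in a given language is spoken."""

	def __init__(self, lang):
		self.lang = lang

	@property
	def spec(self):
		# Unique identifier used to associate profiles with this trigger.
		return "lang:%s" % self.lang

# Something like speech.speak could then wrap language-specific text in
# enter/exit commands so the associated profile applies while it is spoken:
trigger = LanguageProfileTrigger("fr_FR")
speech.speak([
	"Hello, ",
	speech.ConfigProfileTriggerCommand(trigger, enter=True),
	speech.LangChangeCommand("fr_FR"),
	"bonjour",
	speech.ConfigProfileTriggerCommand(trigger, enter=False),
	speech.LangChangeCommand(None),
	" and welcome.",
])
```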

NVDA Remote

I (@jcsteh) discussed this a little with @tspivey. He suggested an issue be filed against NVDA Remote covering this, but I haven't done that because I won't be able to follow this through. Here are some notes for when we're ready to tackle this.

We probably want NVDA Remote to still support older synths that don't implement the new API. Unfortunately, that means NVDA Remote will still need to use its existing patching/lastIndex polling code for those synths. You can test for those synths using speech.shouldUseCompatCodeForIndexing. This existing code won't work at all for newer synths which use the new framework.
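
For illustration, the check might look something like the following. Whether shouldUseCompatCodeForIndexing takes the synth as an argument is an assumption on my part; treat this as a sketch, not the final API.

```python
# Sketch only: the exact shouldUseCompatCodeForIndexing signature is assumed.
import speech
import synthDriverHandler

synth = synthDriverHandler.getSynth()
if speech.shouldUseCompatCodeForIndexing(synth):
	# Old-style synth: keep using the existing patching/lastIndex polling code.
	pass
else:
	# New framework synth: rely on relayed speech sequences and callbacks.
	pass
```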

Regarding how to implement things for the new framework:

  • For the most part, the slave should just relay speech sequences passed to speech._SpeechManager.speak to the master.
    • An extension point will need to be added for that once we figure out exactly where.
    • Commands like BeepCommand, EndUtteranceCommand and WaveFileCommand can be serialized pretty easily. Custom callbacks can't be serialized, though.
    • This means NVDA Remote will get capital letter indication, since spelling is now one speech sequence which uses BeepCommand, PitchCommand, etc. See NVDARemote/NVDARemote#110.
  • @leonardder noted that we'll need some way to differentiate standalone calls to tones.beep and nvwave.playWaveFile from calls made due to speech commands. Otherwise, they'll double up. This should be pretty trivial with an additional argument.
  • Say all is a bit trickier. We want to sync the cursor with the speech from the master, not the slave, since the master is the one primarily doing the controlling.
    • I think this can be done by having the slave remove say all callbacks from the speech sequence and store them in a map keyed by an identifier; a rough sketch follows this list. We'll need a filter in core for this. Also, we'll need a way to distinguish say all callbacks; right now, they're just CallbackCommands. We could probably create a simple SayAllCallbackCommand subclass.
    • The slave would pass this identifier to the master as part of the speech sequence.
    • The master would wrap this in a callback. When called, the callback would notify the slave that this identifier was reached.
    • The slave would then grab the original callback from the map and call it, thus syncing the cursor, pushing more speech, etc.
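
To make the say all flow above concrete, here is a rough sketch. SayAllCallbackCommand is the proposed subclass; the filter hook and the notify functions are hypothetical names, and the exact CallbackCommand constructor is assumed.

```python
# Rough sketch of the say all relaying idea. The transport plumbing
# (how slaveFilterSequence and notifySlave get wired up) is hypothetical.
import itertools
from speech import CallbackCommand

class SayAllCallbackCommand(CallbackCommand):
	"""Proposed subclass so say all callbacks can be distinguished."""

# Slave side: callbacks can't be serialized, so identifiers stand in for them.
_pendingCommands = {}  # identifier -> original SayAllCallbackCommand
_nextIdent = itertools.count()

def slaveFilterSequence(sequence):
	"""Replace say all callbacks with identifiers before relaying to the master."""
	out = []
	for item in sequence:
		if isinstance(item, SayAllCallbackCommand):
			ident = next(_nextIdent)
			_pendingCommands[ident] = item
			out.append(("sayAllCallback", ident))  # serializable placeholder
		else:
			out.append(item)
	return out

def slaveOnIdentReached(ident):
	"""Called when the master reports that this identifier was reached."""
	command = _pendingCommands.pop(ident)
	command.run()  # sync the cursor, push more speech, etc.

# Master side: wrap each relayed identifier in a local callback which
# notifies the slave when the synth actually reaches that point.
def masterUnpackItem(item, notifySlave):
	if isinstance(item, tuple) and item[0] == "sayAllCallback":
		ident = item[1]
		return CallbackCommand(lambda ident=ident: notifySlave(ident))
	return item
```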

Determine whether NVDA is speaking

As requested by: Add the ability to determine if NVDA is speaking in NVDA Controller.dll #5638

It should be pretty trivial to add a function to do this now. It can check speech._manager._curPriQueue. If it's None, there is no speech in progress.
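
A sketch of such a function, assuming the private _curPriQueue attribute stays as described above:

```python
# Sketch: relies on the private attribute described above.
import speech

def isSpeaking():
	"""Return True while NVDA has speech in progress."""
	return speech._manager._curPriQueue is not None
```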
