
Add the ability to determine if NVDA is speaking in NVDA Controller.dll #5638

Closed
Neurrone opened this issue Dec 29, 2015 · 15 comments · Fixed by #15734

Comments

@Neurrone

See title.

@josephsl
Collaborator

josephsl commented Dec 29, 2015

Hi there,

Can you provide use cases for this (such as parameters you need and how this functionality can benefit other developers)? Thanks.

@Neurrone
Author

Achieving synchronization of speech with something else, for example, sound. A major category of applications that needs this is audio games. For example, during turn-based combat, you would want to synchronize the announcements of actions taken by units with sound playback. There are probably other use cases too.

@nishimotz
Contributor

Actually, the NVDA Japanese team has received this kind of request from application developers.
The major reason I have heard is that commercial screen readers provide such an API for third-party developers.
We did experimental work on an "isSpeaking" API in the NVDA Japanese version.
The following document describes our work:
https://osdn.jp/projects/nvdajp/wiki/ControllerClientEnhancement

However, I am not satisfied with our work so far.

Firstly, it is difficult to support the API across the various speech synthesizers.
We would need methods, properties, or callbacks to meet the demand, and the implementation would have to be done separately for each synthesis API.
I think the audio ducking work could be a good opportunity to find a better way to improve our isSpeaking API.

Secondly, such an API does not work correctly in some cases.
For example, our isSpeaking API sometimes returns the wrong value right after speakText is performed.
Most TTS engines do not respond to the speakText command immediately, so the state transitions should be 'preparing', 'speaking', and 'idle'.
This kind of API should be designed very carefully.
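A minimal sketch of the state-transition pitfall described above (all names here are hypothetical, not NVDA's real internals): a naive isSpeaking that only reports True once audio has actually started returns the wrong answer in the window right after speakText, whereas treating 'preparing' as speaking avoids that.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    PREPARING = "preparing"
    SPEAKING = "speaking"


class Synth:
    """Hypothetical synthesizer stub modeling the three states described above."""

    def __init__(self):
        self.state = State.IDLE

    def speakText(self, text):
        # Most engines do not start producing audio immediately.
        self.state = State.PREPARING

    def _onAudioStarted(self):
        self.state = State.SPEAKING

    def _onAudioDone(self):
        self.state = State.IDLE


def naiveIsSpeaking(synth):
    # Only reports True once audio is actually playing.
    return synth.state is State.SPEAKING


def robustIsSpeaking(synth):
    # Treats 'preparing' as speaking too, so the window right after
    # speakText is reported correctly.
    return synth.state is not State.IDLE


synth = Synth()
synth.speakText("hello")
assert not naiveIsSpeaking(synth)  # wrong answer right after speakText
assert robustIsSpeaking(synth)
```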

The last point is that, from my observation, NVDA developers prefer 'universal solutions' rather than enhancing the NVDA-dependent ControllerClient API.
Some users of the ControllerClient API neglect to consider other solutions, such as using the relevant accessibility API on the application side, or providing an appModule on the NVDA side.

Anyway, I am happy to discuss the issue here.

@jcsteh
Contributor

jcsteh commented Jan 3, 2016

This isn't currently possible. NVDA core does not actually have this information at the moment. #4877 might provide the basis for this. Even if it does, doing this would be extremely low priority for us, but that doesn't stop someone else from taking it up.

@Adriani90
Collaborator

Is there anyone still considering this? I can imagine that it would help a lot in implementing sound- and speech-based support for diagrams.

@derekriemer
Collaborator

Probably not. This is non-trivial.

@Adriani90
Collaborator

@Neurrone could you please fill in the feature request template for this one? Please include use cases, how you imagine this feature working, and which alternatives you have considered. For now, I will close this and hope that you address Joseph's comment above in the template. Thanks.

@lukaszgo1
Contributor

@Adriani90 Have you looked at the entire conversation, or just at the first two comments? The issue is valid, and even though no one has expressed interest in fixing it at the moment, there is no reason to close it, in my opinion. It wasn't created according to the feature request template, but the comments above explain why it is needed by some.

@Adriani90
Collaborator

Oh, sorry, my browser didn't display all the comments for whatever reason. Reopening.

@Adriani90 Adriani90 reopened this Feb 12, 2019
@feerrenrut
Contributor

Some guidance on this is in https://github.com/nvaccess/nvda/wiki/Speech-Refactor-Use-Case-Notes

It should be pretty trivial to add a function to do this: check speech._manager._pendingSequences; if it's empty, there is no speech in progress.
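A minimal sketch of the check described above. The _SpeechManager stub below stands in for NVDA's real speech._manager (whose internals differ); the point is just that an empty pending-sequence list means nothing is queued or in progress.

```python
class _SpeechManager:
    """Hypothetical stand-in for NVDA's speech._manager."""

    def __init__(self):
        self._pendingSequences = []

    def speak(self, sequence):
        # A queued sequence stays pending until the synthesizer finishes it.
        self._pendingSequences.append(sequence)

    def _onSequenceDone(self):
        self._pendingSequences.pop(0)


def isSpeaking(manager):
    """Return True while any speech sequence is queued or in progress."""
    return bool(manager._pendingSequences)


manager = _SpeechManager()
assert not isSpeaking(manager)
manager.speak(["Hello, world"])
assert isSpeaking(manager)
manager._onSequenceDone()
assert not isSpeaking(manager)
```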

@feerrenrut feerrenrut added this to To do in [Project] Speech Refactor via automation Jan 15, 2020
@BarryJRowe

This feature would be very useful in my case. I'm making a service that parses a game screen and outputs a description of where the player is and what's around them (for example: west wall, north town). This is done continuously, so the service needs to know if NVDA has finished speaking what was sent to it before sending the updated game screen.

@mzanm
Contributor

mzanm commented Mar 4, 2021

This feature would be really useful.

@feerrenrut
Contributor

I'd like feedback from people on how they intend to use this. The simplest mechanism would be to add a function such as bool nvdaController_isSpeaking, but this would require client software to poll while waiting for speech to finish.
A more complicated addition would be to allow registration of a callback notifying the client that speech has finished:
void nvdaController_onFinishedSpeaking(void (*onFinishedSpeakingCallback)(void))

Is there a desire to be notified of, or to be able to poll for, some arbitrary "amount" of speech remaining? It isn't clear how we would define an "amount" of speech to external software. I'd prefer not to go down this path without strong justification.
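To contrast the two designs discussed above from the client's side, here is a sketch using a hypothetical stub in place of the real controller client (none of these names are actual nvdaControllerClient APIs). With a polling API the client must spin on the flag; with a callback API it can simply block on an event.

```python
import threading
import time


class _StubController:
    """Hypothetical controller stub; 'speech' finishes after a short delay."""

    def __init__(self, duration=0.05):
        self._speaking = True
        self._callback = None
        threading.Timer(duration, self._finish).start()

    def _finish(self):
        self._speaking = False
        if self._callback is not None:
            self._callback()

    def isSpeaking(self):
        return self._speaking

    def onFinishedSpeaking(self, callback):
        # Guard against speech having already finished before registration.
        if not self._speaking:
            callback()
        else:
            self._callback = callback


# Polling style: the client burns cycles checking the flag.
ctrl = _StubController()
while ctrl.isSpeaking():
    time.sleep(0.01)

# Callback style: the client blocks on an event until notified.
ctrl = _StubController()
finished = threading.Event()
ctrl.onFinishedSpeaking(finished.set)
assert finished.wait(timeout=1.0)
assert not ctrl.isSpeaking()
```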

@LeonarddeR
Collaborator

I'm currently writing a prototype to provide speech to NVDA Controller using SSML. It allows the function to execute in blocking mode, which basically means that the function blocks until speech is done. As SSML supports marks in the provided sequence, you can also register a callback that is called for every mark in the speech. @Neurrone Would that cover your case?
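For illustration, an SSML sequence with marks might look like the following (a hypothetical fragment; exact element support depends on the implementation). A client could synchronize sounds with speech by acting on the callback fired for each mark:

```xml
<speak>
  You attack the goblin.
  <mark name="swordSound"/>
  The goblin falls.
  <mark name="deathSound"/>
</speak>
```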

@Neurrone
Author

Neurrone commented Nov 5, 2023

Yeah, that would be more than sufficient.

[Project] Speech Refactor automation moved this from To do to Done Nov 23, 2023
seanbudd pushed a commit that referenced this issue Nov 23, 2023

Fixes #11028
Fixes #5638

Summary of the issue:
The NVDA Controller client has been stable for a long time, but it lacked support for modern speech features, such as priority and callbacks.

Description of user facing changes
None.

Description of development approach
Added the following functions to the controller client:

- nvdaController_getProcessId: To get the process ID (PID) of the instance of NVDA the controller client is using.
- nvdaController_speakSsml: To instruct NVDA to speak according to the given SSML. This function also supports:
  - Providing the symbol level.
  - Providing the priority of the speech to be spoken.
  - Speaking both synchronously (blocking) and asynchronously (returning instantly).
- nvdaController_setOnSsmlMarkReachedCallback: To register a callback of type onSsmlMarkReachedFuncType that is called in synchronous mode for every <mark /> tag encountered in the SSML sequence provided to nvdaController_speakSsml.
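To illustrate the mark-callback semantics, the sketch below walks an SSML sequence and fires a callback for every <mark /> tag. This is only a simulation: in the real controller client, NVDA invokes the registered callback as speech actually reaches each mark, rather than the client parsing the SSML itself.

```python
import xml.etree.ElementTree as ET

# An example SSML sequence with two marks; names are illustrative.
SSML = (
    '<speak>'
    'You attack the goblin. <mark name="swordSound"/>'
    'The goblin falls. <mark name="deathSound"/>'
    '</speak>'
)

reached = []


def onSsmlMarkReached(name):
    # A real client might trigger a sound effect here, in sync with speech.
    reached.append(name)


# Simulate NVDA reaching each mark, in document order, while speaking.
for elem in ET.fromstring(SSML).iter('mark'):
    onSsmlMarkReached(elem.get('name'))

assert reached == ["swordSound", "deathSound"]
```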
@nvaccessAuto nvaccessAuto added this to the 2024.1 milestone Nov 23, 2023