Optional ducking of other audio #3830

Closed
nvaccessAuto opened this Issue Jan 31, 2014 · 35 comments

7 participants

Reported by jteh on 2014-01-31 01:54
Windows 8 allows us to request that other audio be ducked. This means we can duck the volume of other audio while NVDA is speaking. This should be optional, as it isn't always desirable. Also, I wonder whether it might be useful to have a command to duck audio even if this option is disabled for times when unexpected loud audio clobbers speech.

See the audioDuckingPrototype branch.

Comment 1 by k_kolev1985 on 2014-03-16 18:38
Hi Jamie,

I want to ask a question about the volume ducking feature. Will it duck the volume constantly while NVDA is running, or only when NVDA is speaking? I'm asking because Narrator ducks it constantly, which is not an ideal solution.

Comment 2 by mdcurran on 2014-03-25 00:35
Our prototype ducks only when NVDA is speaking. However, as the fade time that Windows uses when starting to duck is a bit slower than we'd like, we have to slightly delay the start of speech; otherwise, NVDA would start speaking while the background audio is still at full volume.

Comment 3 by zahari_bgr on 2014-03-25 01:39
Hi,
It will be a very nice feature.
Is it possible to detect whether there is actual background audio currently playing?
Because a constant delay in speech would not be nice.
It will be useful if toggling of this option could be bound to an input gesture.
Also, what about Windows 7?

Comment 4 by mdcurran on 2014-03-25 03:16
We only plan to support Windows 8 at this point in time, as the operating system already has the support built in. We have no way of detecting whether there is background audio or not, so we will probably have a gesture to toggle it on and off.

Comment 5 by mdcurran on 2014-12-02 02:56
A prototype of this support can be found in branch t3830.

Comment 6 by mdcurran on 2014-12-03 07:27
A try build, now supporting no ducking, ducking for NVDA speech and sounds, and ducking always (via NVDA+shift+d and a combo box in the synth dialog), can be found at:
http://community.nvda-project.org/try/t3830/nvda_snapshot_try-t3830-10573,7f8c9e1.exe
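
The three modes and the cycling gesture described above can be sketched as follows. This is only an illustration: the names `AudioDuckingMode` and `next_mode` are hypothetical and are not taken from the try build.

```python
from enum import IntEnum


class AudioDuckingMode(IntEnum):
    """Hypothetical names for the three modes mentioned in the comment."""
    NONE = 0        # never duck other audio
    OUTPUTTING = 1  # duck only while NVDA outputs speech and sounds
    ALWAYS = 2      # keep other audio ducked while NVDA is running


def next_mode(current: AudioDuckingMode) -> AudioDuckingMode:
    """Return the mode after `current`, wrapping back to the first one.

    This mirrors what a cycling gesture such as NVDA+shift+d might do.
    """
    return AudioDuckingMode((current + 1) % len(AudioDuckingMode))
```

Pressing the gesture three times from any mode would return to that mode.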

Comment 7 by leonarddr on 2015-04-24 20:40
Is there any idea of when this work could find its way to next? I tried the functionality from source on Windows 8.1, but I am getting the message 'ducking not supported'.

Comment 8 by mdcurran on 2015-04-26 18:56
This functionality can only work for installed copies due to Windows security.
I think there were still some race conditions where audio would accidentally remain ducked. I'll take another look soon.

Comment 9 by camlorn on 2015-05-18 19:33
This is almost certainly blocked by #5096 as, if we move NVWave to C/C++, we'll need to redo some of this there.

Comment 10 by jteh on 2015-10-28 00:36
We need to make sure this cooperates with prevention of our audio being ducked by other audio (#5443).

Incubated in ca7e036.

Collaborator

leonardder commented Nov 23, 2015

It seems that ducking and configuration profiles don't like each other.

To reproduce, enable ducking in a configuration profile with a trigger and disable it in the default configuration. When switching to the application with the ducking-enabled profile, ducking isn't enabled as expected.

Contributor

jcsteh commented Nov 23, 2015

@michaelDCurran: You need to switch ducking modes when config profiles are switched. Introduce a handleConfigProfileSwitch function in nvwave and call it from config.ConfigManager._handleProfileSwitch (config/__init__.py around line 510).
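
A minimal sketch of that suggestion, assuming the active profile exposes the ducking mode under a hypothetical `["audio"]["audioDuckingMode"]` key. The `DuckingController` class is a stand-in for whatever actually applies the mode; it is not NVDA code.

```python
class DuckingController:
    """Hypothetical holder for the currently applied ducking mode."""

    def __init__(self, initial_mode=0):
        self.mode = initial_mode

    def set_mode(self, mode):
        """Apply `mode`; return True only if the mode actually changed."""
        if mode == self.mode:
            return False
        self.mode = mode
        return True


def handle_config_profile_switch(controller, conf):
    """Re-read the ducking mode after a profile switch and apply it.

    `conf` is any mapping shaped like the active configuration; the key
    names used here are assumptions made for illustration.
    """
    return controller.set_mode(conf["audio"]["audioDuckingMode"])
```

The profile-switch handler would call `handle_config_profile_switch` once per switch, so that a profile with ducking enabled takes effect as soon as its trigger fires.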

@jcsteh jcsteh added this to the 2016.1 milestone Nov 24, 2015

I want to report a bug with the current implementation of the audio ducking feature (available in the "next" snapshots). It seems not to work with SAPI5. I've tried with 2 Bulgarian voices and with the Microsoft Zira voice, but it did not work. The option is set to "Duck when outputting speech and sounds". If the option is set to "Always duck", the feature works as expected. Is this a limitation of SAPI5 or a bug in the current implementation of the audio ducking feature? It works with other TTS engines like "Speech Player" and "RHVoice" (the older one - I don't like the newer one very much), which are available as add-ons for NVDA.

Test environment:

  • Operating system: Windows 10 Pro (build 10586.11), 64-bit, in Bulgarian with all locale settings set to "Bulgarian".
  • NVDA version: next-12831,b1f00f4.
  • Processor: Intel Core i5-2320 at 3.00GHz.
  • RAM Memory: 4.00 GB.
  • Sound Card: Realtek ALC662 at Intel Cougar Point PCH - High Definition Audio Controller.
Contributor

jcsteh commented Nov 25, 2015

Incubated in 071a653.

Incubated in fe9c939.

Incubated in 9fa91c2.

Contributor

jcsteh commented Dec 8, 2015

I think it'd be really good if we can not delay the audio if there isn't audio playing (or better still, if audio is quiet). So, I've been looking into this a bit.

It seems there are two APIs you can use to get the peak levels for an audio device:

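  1. The Windows Core Audio APIs (https://msdn.microsoft.com/en-gb/library/windows/desktop/dd370784%28v=vs.85%29.aspx) in Vista and later. These are COM based. Basically, you start with IMMDeviceEnumerator (https://msdn.microsoft.com/en-gb/library/windows/desktop/dd371399%28v=vs.85%29.aspx), get an IMMDevice (https://msdn.microsoft.com/en-gb/library/windows/desktop/dd371395%28v=vs.85%29.aspx), call IMMDevice::Activate (https://msdn.microsoft.com/en-gb/library/windows/desktop/dd371405%28v=vs.85%29.aspx) and request IAudioMeterInformation (https://msdn.microsoft.com/en-gb/library/windows/desktop/dd368227%28v=vs.85%29.aspx).
    • Unfortunately, as nice as being COM based sounds, calling CoCreateInstance for IMMDeviceEnumerator results in a "class not registered" error for some reason. mmdevapi.dll doesn't seem to have a typelib, so I tried generating one from mmdeviceapi.idl. When I tried to use that with comtypes, it threw a weird assertion error.
    • This StackOverflow thread (http://stackoverflow.com/questions/32149809/read-and-or-change-windows-8-master-volume-in-python) shows it's possible to use this API with comtypes, but the interfaces would all have to be written out.
  2. The old Audio Mixer API (https://msdn.microsoft.com/en-us/library/dd756701%28v=vs.85%29.aspx) from winmm. This article (https://support.microsoft.com/en-us/kb/181550) provides details about how this could be used to get peak levels.
    • Unfortunately, there would be a hell of a lot of structs to write out if we wanted to do this in Python.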

Of course, I guess we could write the code in C++ and avoid the Python porting bit. After all, all we want is one damned value.
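
Whichever API ends up supplying the value, the decision built on top of it is small. The sketch below assumes each output device of interest is wrapped in a hypothetical zero-argument callable that returns a normalized peak between 0.0 and 1.0 (the range that IAudioMeterInformation::GetPeakValue reports). The threshold is an arbitrary illustrative value, not a measured one.

```python
from typing import Callable, Iterable

# Assumed cut-off below which other audio is treated as silent.
QUIET_THRESHOLD = 0.01


def is_other_audio_audible(
    peak_readers: Iterable[Callable[[], float]],
    threshold: float = QUIET_THRESHOLD,
) -> bool:
    """Return True if any device currently reports a peak above `threshold`.

    Each reader is a hypothetical wrapper around a per-device peak meter.
    """
    return any(read_peak() > threshold for read_peak in peak_readers)
```

With no readers at all the function returns False, which would mean "do not delay speech".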

Contributor

jcsteh commented Dec 8, 2015

However, the above should perhaps be done separately, as it looks like it's pretty complicated to implement and it is nice-to-have rather than essential. Thoughts, @michaelDCurran?

Contributor

michaelDCurran commented Dec 8, 2015

Certainly sounds cool, but:

Windows ducks audio of all sound cards, not just the default. I assume we'll want to check if there is audio playing on any available card? Or should we just use NVDA's currently configured output device?

Also, depending on how long the peak meters lag, the check may inadvertently detect NVDA's own speech. Though if we do the check at the lowest level (i.e. audioDucking._setDuckingState), this is guaranteed not to be called for at least a second after any speech finishes. _setDuckingState will also need to return True or False based on whether it actually did change state. If asking to duck and _setDuckingState returns False, _requestDucking should not do a time.sleep.

Contributor

michaelDCurran commented Dec 8, 2015

Actually, my last idea was a little over-engineered, and incorrect.

Rather than changing _setDuckingState, _requestDucking simply does not have to do the time.sleep if no other audio is playing. _setDuckingState will still technically duck, in case other audio does start playing while NVDA is speaking.
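
The revised idea could look roughly like this. It is a sketch under stated assumptions: `DuckingRequester`, the injected callables and the 0.15 second delay are all hypothetical, and the reference counting is only one plausible way to handle overlapping requests.

```python
import time

# Assumed value: roughly how long Windows takes to fade other audio down.
INITIAL_DUCKING_DELAY = 0.15


class DuckingRequester:
    def __init__(self, set_ducking_state, is_other_audio_playing, sleep=time.sleep):
        self._set_ducking_state = set_ducking_state            # callable(bool)
        self._is_other_audio_playing = is_other_audio_playing  # callable() -> bool
        self._sleep = sleep
        self._ref_count = 0

    def request_ducking(self):
        """Duck other audio; return True only if we waited for the fade."""
        self._ref_count += 1
        if self._ref_count > 1:
            return False  # already ducked for an earlier request
        # Sample before ducking, so the reading reflects the unducked level.
        audible = self._is_other_audio_playing()
        # Always duck, in case other audio starts while NVDA is speaking.
        self._set_ducking_state(True)
        if audible:
            self._sleep(INITIAL_DUCKING_DELAY)
        return audible

    def release_ducking(self):
        """Drop one request; unduck when the last request is released."""
        if self._ref_count == 0:
            return
        self._ref_count -= 1
        if self._ref_count == 0:
            self._set_ducking_state(False)
```

Injecting `sleep` keeps the delay out of the way when exercising the logic without real audio.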


Contributor

jcsteh commented Dec 8, 2015

Contributor

michaelDCurran commented Dec 9, 2015

Using the winmm technique, no waveOut devices of my two sound cards support the peak metre control. Several posts on the web suggest that cards and/or Windows have not supported this for years.
The Stack Exchange article also mentions that winmm is now pretty much local to each application (i.e. the mixer API is specific to the process calling it).
And the Stack Exchange example itself only dealt with setting/getting volume control levels, not peak levels as such.
I shall have a look at the COM interfaces in more detail to see if any kind of peak metre functionality exists...

Contributor

jcsteh commented Dec 9, 2015

Damn.

The Stack Exchange article also mentions that winmm is now pretty much local to each application (i.e. the mixer API is specific to the process calling it).
And the Stack Exchange example itself only dealt with setting/getting volume control levels, not peak levels as such.

Not that this matters, but out of interest, I assume you're referring to some other article you found (and I probably saw as well)? The article I linked here was from support.microsoft.com and dealt with metering, not volume. The StackOverflow article I linked was for Core Audio. Still, the support article was quite old.

I shall have a look at the COM interfaces in more detail to see if any kind of peak metre functionality exists...

IAudioMeterInformation has this to say:

If the audio device lacks a hardware peak meter, the audio engine automatically implements the peak meter in software, transparently to the client.

So I presume that means we can get what we need.

Incubated in 4b7eeb7.

Contributor

jcsteh commented Dec 10, 2015

Very nice!

Further info on the duck delay not happening sometimes as discussed yesterday. There are two issues.

STR for the first issue:

  1. Play some audio in the background.
  2. Switch to ducking for speech and sounds.
  3. Wait until audio unducks.
  4. Press NVDA+shift+d to switch to always duck.
    • Expected: Sleep before speech.
    • Actual: No sleep.

STR for the second issue:

  1. Play some audio in the background.
  2. Switch to no ducking.
  3. After the audio unducks but before 1 second has elapsed, press NVDA+shift+d to switch to ducking for speech and sounds.
    • Expected: Sleep before speech.
    • Actual: No sleep.
    • Audio unducks immediately, which makes sense, but the callLater still seems to apply, so it doesn't think it has to sleep.
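
Both reports come down to the "should we sleep?" decision drifting out of step with the real ducked state while a delayed unduck is still pending. One way to keep them aligned is to track the applied state explicitly and cancel the pending call whenever the state is changed directly. This is a hypothetical sketch, not the fix that was incubated; `DuckingState` and the injected `schedule` callable (anything returning an object with `cancel()`, such as a wrapper around a delayed-call helper) are assumptions.

```python
class DuckingState:
    def __init__(self, apply_state, schedule):
        self._apply_state = apply_state  # callable(bool): really duck or unduck
        self._schedule = schedule        # callable(fn) -> handle with cancel()
        self.is_ducked = False
        self._pending_unduck = None

    def _cancel_pending(self):
        if self._pending_unduck is not None:
            self._pending_unduck.cancel()
            self._pending_unduck = None

    def duck(self):
        """Duck now; return True if the state changed, so the caller sleeps."""
        self._cancel_pending()
        if self.is_ducked:
            return False
        self._apply_state(True)
        self.is_ducked = True
        return True

    def unduck_later(self):
        """Schedule the unduck that normally follows the end of speech."""
        self._cancel_pending()
        self._pending_unduck = self._schedule(self.unduck_now)

    def unduck_now(self):
        """Unduck immediately, for example when the mode is switched."""
        self._cancel_pending()
        if self.is_ducked:
            self._apply_state(False)
            self.is_ducked = False
```

Because `duck()` reports whether the state really changed, a mode switch that unducks immediately leaves `is_ducked` False, and the next request to duck correctly asks for the sleep.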

Incubated in dfa3261.

Collaborator

leonardder commented Dec 15, 2015

I found a little issue with the current ducking implementation in next.

STR in Windows 10:

  1. Set NVDA to no ducking
  2. Start Narrator with ctrl+win+u. Narrator ducks all other audio, except for NVDA.
  3. Disable Narrator again with ctrl+win+u.
    • Expected: Ducking is entirely disabled again.
    • Actual: NVDA ducks all other audio until closed or ducking is disabled again with NVDA+shift+d.
Contributor

michaelDCurran commented Dec 15, 2015

Annoyingly, this is not a bug we will be able to fix. The issue is how the AccSetRunningUtilityState function in Windows is implemented. Any two assistive technologies that try to control audio ducking at the same time will most likely cause this.

It should also be noted that we do not recommend running another screen reader (including Narrator) at the same time as NVDA. Of course people do it (we certainly do as developers from time to time); however, we cannot promise system stability if doing so.


@jcsteh jcsteh closed this in 796f3f1 Jan 6, 2016

@nvaccessAuto nvaccessAuto removed the incubating label Jan 6, 2016

Settings for controlling this are not present in the General settings dialog. They are present in the Synthesizer dialog. Fix in What's New only.

Contributor

michaelDCurran commented Jan 7, 2016

They are in the Synthesizer dialog.

What led you to believe they were in General Settings?

Collaborator

derekriemer commented Jan 7, 2016

Oddly, I thought this at first as well. Maybe we should state which dialog it is in in the What's New?


@michaelDCurran in the What's New

Contributor

jcsteh commented Jan 8, 2016

@jcsteh jcsteh added a commit that referenced this issue Jan 11, 2016

@jcsteh jcsteh What's New: The Audio ducking mode setting is in the Synthesizer dialog, not the General Settings dialog.


Fixes #5668, fixes #5669. Re #3830.
d44281d

@josephsl josephsl added a commit to josephsl/nvda that referenced this issue Oct 28, 2016

@josephsl josephsl User guide: portable and temporary copies does not support audio ducking in Windows 8 and later. re #6519


#3830: confirmed by some users and perhaps an oversight: audio ducking isn't supported on portable and temporary copies in Windows 8 and later.
86194a2