Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Windows] fix DirectSound buffer underrun with some Bluetooth audio devices #25046

Merged
merged 1 commit into from
Apr 23, 2024

Conversation

thexai
Copy link
Member

@thexai thexai commented Apr 22, 2024

Description

Fix DirectSound buffer underrun with some Bluetooth audio devices

Fixes #20567

Motivation and context

Windows 11 has changed a bit the way DirectSound is handled and in combination with some Bluetooth audio devices / drivers causes buffer underruns.

There was some confusion with this because at the beginning of Windows 11 there were widespread problems with BT audio devices. After some Microsoft patches the situation improved significantly but even today some devices still have problems apparently only with Kodi (even that these same devices works in Kodi with Windows 10).

In any case DirectSound buffers seems very small: only 15ms chunks and 180 ms sink buffer. On forum one of affected users has confirmed issue is fixed increasing buffer size in various combinations:

15 ms * 20 chunks = 300 ms OK
15 ms * 30 chunks = 450 ms OK
50 ms * 8 chunks = 400 ms OK
50 ms * 5 chunks = 250 ms is not enough

https://forum.kodi.tv/showthread.php?tid=376406&pid=3193663#pid3193663

15 ms seems have some issues because 44100 sample rate * 0.015 is not exact number, then no exact number of samples for chunk duration. Good candidates are 20 ms 30 ms or 50 ms but stressing thins --> scrolling very fast in a list of files with GUI sounds I have gotten occasional buffer underruns with 20 ms and even 30 ms and 10 chunks (300 ms buffer). Same is happens with current 15 ms and 12 chunks. Seems 15 ms chunk is too small regardless of buffer size.

50 ms seems best number to obtain stability and works fine in my system with 5 chunks but not enough on these BT devices.

8 chunks seems totally stable and is one of options tested with success on affected BT devices.

NOTE: this problem only happens in DirectSound, WASAPI is not affected because works in other way and system/driver returns optimal buffer duration that is already bigger: 500 ms or 1s.

What is the effect on users?

Fixes DirectSound buffer underruns in Windows 11 in combination with some Bluetooth audio devices.

Types of change

  • Bug fix (non-breaking change which fixes an issue)
  • Clean up (non-breaking change which removes non-working, unmaintained functionality)
  • Improvement (non-breaking change which improves existing functionality)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that will cause existing functionality to change)
  • Cosmetic change (non-breaking change that doesn't touch code)
  • Student submission (PR was done for educational purposes and will be treated as such)
  • None of the above (please explain below)

Checklist:

  • My code follows the Code Guidelines of this project
  • My change requires a change to the documentation, either Doxygen or wiki
  • I have updated the documentation accordingly
  • I have read the Contributing document
  • I have added tests to cover my change
  • All new and existing tests passed

@thexai thexai added Type: Fix non-breaking change which fixes an issue Platform: Windows Component: Audio v22 Piers labels Apr 22, 2024
@thexai thexai added this to the "P" 22.0 Alpha 1 milestone Apr 22, 2024
@thexai thexai requested a review from fritsch April 22, 2024 14:39
@CrystalP
Copy link
Contributor

I understand that the PR helps, but is this the correct knob to turn?

There aren't many debug logs, but at least two can be opened and they show the same pattern.
https://paste.kodi.tv/onixifekiz.kodi
https://fars.ee/nYT6.log

CWin32DirectSound::GetSpace - buffer underrun - W:3672, P:4294950528, O:1872.
CWin32DirectSound::GetSpace - buffer underrun - W:7200, P:4294953392, O:3672.
CWin32DirectSound::GetSpace - buffer underrun - W:7200, P:4294953448, O:1912.
CWin32DirectSound::GetSpace - buffer underrun - W:7200, P:4294953496, O:1912.
CWin32DirectSound::GetSpace - buffer underrun - W:7200, P:4294953544, O:1912.

Play cursor offset is close to the max DWORD value, but:
frame count = 661, frame size = 8
=> with master code, chunk size = 661 * 8 = 5288, then buffersize = chunk size * 12 = 63456 bytes, should be the max play or write position, no?

We don't know if every call to UpdateCacheStatus() receives that weird cursor position or only some of them.

With the buffer size increase of the PR, do the underrun log lines actually go away? Or does the increase only cover up the problem?

https://learn.microsoft.com/en-us/previous-versions/windows/desktop/mt708925(v=vs.85) says that the write cursor is typically 15ms ahead of play cursor. It's very old documentation, but maybe that's a clue and we could check the offset between play and write after the buffer size increase. Compare with Win10 or Win11 + non-bluetooth audio output.

What do you think.

@fritsch
Copy link
Member

fritsch commented Apr 23, 2024

Especially Bluetooth headsets buffer a whole lot by themselves. You can hear that if you press the pause / stop button that they continue to play for "several time", some even in the seconds range. SBC and a potential bad reception in between can cause issues if payload chunks are too small. That was the case why I linked the pulseaudio sink. We had an exact similar issue there with audio stuttering.

While you definitely have a point with the Ringbuffer distance. If that could be used - even a dynamic would be possible.

In general for AE design: Large buffers do not harm, given that GetDelay and Flush / Drain are properly working.

@thexai
Copy link
Member Author

thexai commented Apr 23, 2024

I understand that the PR helps, but is this the correct knob to turn?

I don't think there is any other way to solve this that doesn't include an increase in the buffer size.

We should look for the origin of the issue in an increase of latency and/or jitter due internal changes in Windows 11 at least for Bluetooth audio. The underrun error in logs is only a consequence, not the cause. The main issue is audio glitch/corruption.

Play cursor offset is close to the max DWORD value

These "big" numbers are in reality near to zero:

DWORD is an unsigned 32 bit type (same as uint32_t) and these numbers are easily generated by overflow to (non possible) negative numbers.

Example:

DWORD number = 0;
number = number - 5288;

With the buffer size increase of the PR, do the underrun log lines actually go away?

Yes, this PR fixes the two things: audio corruption and log errors.

@thexai
Copy link
Member Author

thexai commented Apr 23, 2024

Jenkins build this please

@thexai thexai merged commit c92e0b9 into xbmc:master Apr 23, 2024
2 checks passed
@thexai thexai deleted the fix-DirectSound-Win11 branch April 23, 2024 14:41
@CrystalP
Copy link
Contributor

DWORD is an unsigned 32 bit type (same as uint32_t) and these numbers are easily generated by overflow to (non possible) negative numbers.

yes that is my guess as well, the Windows 11 DirectSound emulation may have some sort of bug.

It's probably unrelated but dsound has a constant DSBSIZE_FX_MIN for the minimum size of a buffer that is intended to have dsound effects applier (150ms).

With the buffer size increase of the PR, do the underrun log lines actually go away?

Yes, this PR fixes the two things: audio corruption and log errors.

I only saw users saying that it plays. Not seen any debug logs of that.

@CrystalP
Copy link
Contributor

and you merged already... what's the point of discussing...

@CrystalP
Copy link
Contributor

I'm not OK with a backport until there is more testing. At least low and high sampling rates. It could cause issues on many systems that worked fine until now.

@thexai
Copy link
Member Author

thexai commented Apr 23, 2024

The backport was not my intention either, I have only created a branch to make a build for interested users (precisely because the change will take time to reach Omega)

@fritsch
Copy link
Member

fritsch commented Apr 23, 2024

From this description (not a windows expert): https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ee416820(v=vs.85)#remarks

it seems that they want to handle it internally. Means: Set it to zero and then use GetCaps to check how much buffer it wanted / needed.

dwBufferBytes
Size of the new buffer, in bytes. This value must be 0 when creating a buffer with the DSBCAPS_PRIMARYBUFFER flag. For secondary buffers, the minimum and maximum sizes allowed are specified by DSBSIZE_MIN and DSBSIZE_MAX, defined in Dsound.h.

@thexai
Copy link
Member Author

thexai commented Apr 24, 2024

The details are important here:

it seems that they want to handle it internally. Means: Set it to zero and then use GetCaps to check how much buffer it wanted / needed.

This is only for primary buffers but Kodi (and all applications in general) works with "secondary" buffers only:

CLog::LogF(LOGDEBUG, "secondary buffer created");

Primary buffer is an low level layer only intended to use in special cases as bypass OS sound mixer and more direct access to HW. Standard applications should not use it:

For applications that require specialized mixing or other effects not supported by secondary buffers, DirectSound allows direct access to the primary buffer.

When you obtain write access to the primary buffer, other DirectSound features become unavailable. Secondary buffers are not mixed, so hardware-accelerated mixing is unavailable.

Most applications should use secondary buffers instead of directly accessing the primary buffer. Applications can write to a secondary buffer easily because the larger buffer size provides more time to write the next block of data, thereby minimizing the risk of gaps in the audio. Even if an application has simple audio requirements, such as using one stream of audio data that does not require mixing, it will achieve better performance by using a secondary buffer to play its audio data.

You cannot specify the size of the primary buffer, and you must accept the returned size after the buffer is created. A primary buffer is typically very small, so if your application writes directly to this kind of buffer, it must write blocks of data at short intervals to prevent the previously written data from being replayed.

https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ee419051(v=vs.85)

Actually all DirectSound API is currently emulated in Windows 10 / 11 only to offer backward compatibility, so direct access to the HW is also not possible using DirectSound and use primary buffer, today would have no advantages/usefulness.

@CrystalP
Copy link
Contributor

Maybe we should phase out DirectSound in favor of XAudio2, already coded for xbox and comes pre-installed with a good enough version on Windows 8 and above.

@fritsch
Copy link
Member

fritsch commented Apr 24, 2024

The details are important here:

it seems that they want to handle it internally. Means: Set it to zero and then use GetCaps to check how much buffer it wanted / needed.

This is only for primary buffers but Kodi (and all applications in general) works with "secondary" buffers only:

CLog::LogF(LOGDEBUG, "secondary buffer created");

Primary buffer is an low level layer only intended to use in special cases as bypass OS sound mixer and more direct access to HW. Standard applications should not use it:

For applications that require specialized mixing or other effects not supported by secondary buffers, DirectSound allows direct access to the primary buffer.
When you obtain write access to the primary buffer, other DirectSound features become unavailable. Secondary buffers are not mixed, so hardware-accelerated mixing is unavailable.
Most applications should use secondary buffers instead of directly accessing the primary buffer. Applications can write to a secondary buffer easily because the larger buffer size provides more time to write the next block of data, thereby minimizing the risk of gaps in the audio. Even if an application has simple audio requirements, such as using one stream of audio data that does not require mixing, it will achieve better performance by using a secondary buffer to play its audio data.

You cannot specify the size of the primary buffer, and you must accept the returned size after the buffer is created. A primary buffer is typically very small, so if your application writes directly to this kind of buffer, it must write blocks of data at short intervals to prevent the previously written data from being replayed.

https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ee419051(v=vs.85)

Actually all DirectSound API is currently emulated in Windows 10 / 11 only to offer backward compatibility, so direct access to the HW is also not possible using DirectSound and use primary buffer, today would have no advantages/usefulness.

I actually read the code, especially the comment - but from the usage I was totally not sure if we change the secondary or the primary buffer there. XAudio2 had other issues IIRC.

@CrystalP
Copy link
Contributor

Issues such as? It can't do passthrough afaik but that doesn't matter.

@fritsch
Copy link
Member

fritsch commented Apr 25, 2024

Back at the time, when first implemented, we had massive issues with getting proper delays. Also I found the API totally complicated ...

@thexai
Copy link
Member Author

thexai commented Apr 26, 2024

Created a draft PR to test it: #25068

It works!

works = it has sound

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Audio stutter on win11 due to broken Directsound
3 participants