Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

audio: don't block on lock in ao_read_data #12643

Merged
merged 3 commits into from
Oct 20, 2023

Conversation

t-8ch
Copy link
Contributor

@t-8ch t-8ch commented Oct 15, 2023

ao_read_data() is used by pull AOs potentially from threads managed by external libraries. These threads can be sensitive to blocking. For example the pipewire ao is using a realtime thread for the callbacks.

Note:
I don't have a specific problem this fixes, but it looked like it could be one in theory.

@sfan5
Copy link
Member

sfan5 commented Oct 15, 2023

Is it possible for the thread to be starved of an opportunity to lock?
Do AO drivers react well if the function returns zero a bunch of times?

Seems like the potential issues are more severe than the theoretical fix.

@t-8ch
Copy link
Contributor Author

t-8ch commented Oct 15, 2023

Is it possible for the thread to be starved of an opportunity to lock?

Being totally starved and having a negative impact on realtime threads are two separate questions IMO.

Do AO drivers react well if the function returns zero a bunch of times?

In paused mode it also return 0 so it should.

Seems like the potential issues are more severe than the theoretical fix.

Also fine by me. If it comes up at some point we would have this PR on file.

@t-8ch t-8ch force-pushed the ao/read_data_nonblocking branch 2 times, most recently from 73432fe to 754deef Compare October 15, 2023 18:00
@sfan5
Copy link
Member

sfan5 commented Oct 15, 2023

Being totally starved and having a negative impact on realtime threads are two separate questions IMO.

My assumption is - and maybe I'm wrong - that if you lock some fairness algorithm/scheduling will eventually give you the lock while trylock has no such guarantee.

@t-8ch
Copy link
Contributor Author

t-8ch commented Oct 15, 2023

My assumption is - and maybe I'm wrong - that if you lock some fairness algorithm/scheduling will eventually give you the lock while trylock has no such guarantee.

I don't think pthread_mutex has any fairness guarantees.

The lock shouldn't really be contended for pull AOs.
It's only used the the data callback and control functions.

If it is conteded, because a control function is blocking somewhere, the AO data thread, which actually drives the playback loop in pull AOs, would be stuck to wait for this control function to give up the lock.
With the proposed changed the data thread gets notified that no data is available right away and can trigger timeout and error logic.

@sfan5
Copy link
Member

sfan5 commented Oct 16, 2023

Hmm yeah, you're right.
MinGW is still failing however.

ao_read_data() is used by pull AOs potentially from threads managed by
external libraries.  These threads can be sensitive to blocking.
For example the pipewire ao is using a realtime thread for the
callbacks.
@t-8ch
Copy link
Contributor Author

t-8ch commented Oct 16, 2023

Small update to fix pthread_mutex_trylock() together with thread debugging.

@sfan5 sfan5 merged commit ae908a7 into mpv-player:master Oct 20, 2023
12 of 14 checks passed
@handlerug
Copy link
Contributor

Commit ae908a7 introduced audio crackling on macOS.

Test file: ffmpeg -f lavfi -i "sine=frequency=1000:duration=30:sample_rate=48000" -af 'volume=-18dB' -c:a pcm_f32le -y test-sine.wav

SCR-20231023-dfgg

The Core Audio output handling code, which doesn't have any return value being 0 handling:

static OSStatus render_cb_lpcm(void *ctx, AudioUnitRenderActionFlags *aflags,
const AudioTimeStamp *ts, UInt32 bus,
UInt32 frames, AudioBufferList *buffer_list)
{
struct ao *ao = ctx;
struct priv *p = ao->priv;
void *planes[MP_NUM_CHANNELS] = {0};
for (int n = 0; n < ao->num_planes; n++)
planes[n] = buffer_list->mBuffers[n].mData;
int64_t end = mp_time_ns();
end += p->hw_latency_ns + ca_get_latency(ts) + ca_frames_to_ns(ao, frames);
ao_read_data(ao, planes, frames, end);
return noErr;
}

@t-8ch
Copy link
Contributor Author

t-8ch commented Oct 23, 2023

That's unfortunate.

It seems coreaudio doesn't even have a way to signal partial data.
Maybe it would be better to introduce a new buffer function to do so.

@sfan5
Copy link
Member

sfan5 commented Oct 23, 2023

A function parameter that indicates whether to block sounds easiest.
It's still not guaranteed that ao_read_data returns as many frames as requested so the coreaudio AO is doing it wrong either way.

@kasper93
Copy link
Contributor

kasper93 commented Oct 23, 2023

It seems coreaudio doesn't even have a way to signal partial data.
Maybe it would be better to introduce a new buffer function to do so.

Looks like only ao_pipewire checks the number of samples returned from ao_read_data . So in fact, we should update all other AOs to handle 0 case correctly.

// Read the given amount of samples in the user-provided data buffer. Returns
// the number of samples copied. If there is not enough data (buffer underrun
// or EOF), return the number of samples that could be copied, and fill the
// rest of the user-provided buffer with silence.

ao_read_data documentation is clear that it reads as much as possible and pads the rest with silence. Clearly this PR breaks this transaction. AO that can ignore additional samples should do so and for the rest padding with silence probably will be enough.

A function parameter that indicates whether to block sounds easiest.

It is good to not block here, but we need to remember that other than ao_pipewire exists. And should be updated to new behavior of the ao_read_data and probably should also update the comment.

@t-8ch
Copy link
Contributor Author

t-8ch commented Oct 23, 2023

How about filling the buffer with silence when the lock can't be acquired and 0 is returned?
Can you check if this works for ao_coreaudio?

Unfortunately I only can test this on Linux.

@sfan5
Copy link
Member

sfan5 commented Oct 23, 2023

ao_read_data should probably indeed do that, but won't that still cause gaps in audio output?
Maybe I'm misunderstanding the situation.

@kasper93
Copy link
Contributor

kasper93 commented Oct 23, 2023

ao_read_data should probably indeed do that, but won't that still cause gaps in audio output?

Right, I also have not look in depth. I believe for this trylock change all AOs code need to be reviewed if it is safe to do and how to ensure partial read/writes are supported correctly. Easy way would be to use trylock conditional, but I would prefer more principal solution, just because common code shouldn't be modified depending on which AO is used. There should be separation.

@handlerug
Copy link
Contributor

How about filling the buffer with silence when the lock can't be acquired and 0 is returned?

So, like this:

diff --git a/audio/out/buffer.c b/audio/out/buffer.c
index e23e4cf108..96636ff675 100644
--- a/audio/out/buffer.c
+++ b/audio/out/buffer.c
@@ -184,8 +184,13 @@ int ao_read_data(struct ao *ao, void **data, int samples, int64_t out_time_ns)
     struct buffer_state *p = ao->buffer_state;
     assert(!ao->driver->write);
 
-    if (pthread_mutex_trylock(&p->lock))
+    if (pthread_mutex_trylock(&p->lock)) {
+        // failed to acquire the lock, fill with silence per the contract
+        for (int n = 0; n < ao->num_planes; n++) {
+            af_fill_silence((char *)data[n], samples * ao->sstride, ao->format);
+        }
         return 0;
+    }
 
     int pos = read_buffer(ao, data, samples, &(bool){0});
 

Unfortunately, it still results in more or less the same effect. I mean, it still appears to be the same length to the audio driver, regardless of the actual contents.

This doesn't help either:

diff --git a/audio/out/ao_coreaudio.c b/audio/out/ao_coreaudio.c
index d96b597f6e..1c6f72ffd4 100644
--- a/audio/out/ao_coreaudio.c
+++ b/audio/out/ao_coreaudio.c
@@ -78,7 +78,11 @@ static OSStatus render_cb_lpcm(void *ctx, AudioUnitRenderActionFlags *aflags,
 
     int64_t end = mp_time_ns();
     end += p->hw_latency_ns + ca_get_latency(ts) + ca_frames_to_ns(ao, frames);
-    ao_read_data(ao, planes, frames, end);
+    int samples = ao_read_data(ao, planes, frames, end);
+    if (samples == 0) {
+        *aflags |= kAudioUnitRenderAction_OutputIsSilence;
+    }
+
     return noErr;
 }
 

@handlerug
Copy link
Contributor

I do want to mention though that this hack fixed the issue for me. Been using mpv for two days and haven't noticed any crackling (except rarely during the first second of playback, but it sounds more like underrun which is expected). I don't understand why or how it works, but I'm posting it anyway in case someone might find it useful. Would have been much easier if Core Audio was open-source (lol) so that I could just read their render callback code and check whether their scheduling changes depending on the value of mDataByteSize.

diff --git a/audio/out/ao_coreaudio.c b/audio/out/ao_coreaudio.c
index d96b597f6e..63315a881a 100644
--- a/audio/out/ao_coreaudio.c
+++ b/audio/out/ao_coreaudio.c
@@ -78,7 +78,18 @@ static OSStatus render_cb_lpcm(void *ctx, AudioUnitRenderActionFlags *aflags,
 
     int64_t end = mp_time_ns();
     end += p->hw_latency_ns + ca_get_latency(ts) + ca_frames_to_ns(ao, frames);
-    ao_read_data(ao, planes, frames, end);
+    int samples = ao_read_data(ao, planes, frames, end);
+
+    if (samples < frames) {
+        for (int i = 0; i < buffer_list->mNumberBuffers; i++) {
+            // HACK: I don't think mDataByteSize is supposed to be overwritten.
+            buffer_list->mBuffers[i].mDataByteSize = samples * ao->sstride;
+        }
+        if (samples == 0) {
+            *aflags |= kAudioUnitRenderAction_OutputIsSilence;
+        }
+    }
+
     return noErr;
 }
 

@t-8ch
Copy link
Contributor Author

t-8ch commented Oct 24, 2023

I guess at first we should revert the problematic commit. Unfortunately I don't have access to my dev setup right now.

The "hack" by @handlerug is similar to what pipewire is doing.

The docs mention this:

The mDataByteSize and mNumberChannels
 parameters needs to match the memory layout of
 mData, so update the size and number of channels
 when you update the data field.

If there would be no way for the callback to signal the actual amount of data read, it would be a really crappy API IMO. The callback would need to block forever until all the requested data is present or risk desync.

@handlerug
Copy link
Contributor

Yeah, I was reading that paragraph as well. Now that it’s a new day and I’m a bit refreshed, I guess it makes sense that you can modify the size field, though “memory layout” usually makes me think of stride and stuff and not length.

@sfan5
Copy link
Member

sfan5 commented Oct 24, 2023

I've pushed a revert.

What we need to figure out is whether all AOs have the ability to output less data than requested.
If not some creative workarounds in the audio core might be necessary.
All AOs will have to be fixed to deal with short writes or a zero return value correctly.
Then the trylock change could be reintroduced. Perhaps with a parameter indicating whether blocking is okay, if that helps anywhere.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants