ao: add a new ao "avfoundation" to support spatial audio in macOS #11955
Hey, I don't have a device with spatial audio support, but the feature looks interesting. Initial value of …
@rcombs has a branch with an avfoundation ao, maybe it can help:
(force-pushed from 8c1414a to dcc4f66)
Works now, thanks!
(force-pushed from 5823b7a to accdd3f)
@rcombs Would you please review this PR? This is an independent implementation and fixes all bugs (I think) in previous implementations (like channel configuration, spinning, and timing). And spatialized stereo works (which I find buggy in your branch).
I've tested Dolby Atmos (sample). Currently I can only render Dolby Atmos as multichannel audio ("Multichannel" is displayed in the system UI, instead of "Dolby Atmos"). I've also tried @rcombs's implementation with SPDIF passthru on, but it can only activate "Multichannel" as well. The sample EAC-3 file has a 3-layered data format:
(grabbed from …)

```diff
diff --git a/audio/out/ao_avfoundation.m b/audio/out/ao_avfoundation.m
index 7841a52f0d..a4b8ff8087 100644
--- a/audio/out/ao_avfoundation.m
+++ b/audio/out/ao_avfoundation.m
@@ -85,6 +85,9 @@ static bool enqueue_buf(struct ao *ao, void *buf, int bufsize,
     [p->renderer enqueueSampleBuffer:sBuf];
     p->enqueued += (double)samples / sample_rate;

+    if ([p->renderer status] == AVQueuedSampleBufferRenderingStatusFailed)
+        goto coreaudio_error;
+
     ret = true;

 coreaudio_error:
@@ -155,16 +158,32 @@ static bool enqueue_data(struct ao *ao, void *buf, int samples) API_AVAILABLE(ma
     err = AudioFileOpenWithCallbacks(ao, read_cb, NULL, get_size_cb, NULL, file_type, &file);
     CHECK_CA_ERROR("unable to parse packet properties");

-    AudioStreamBasicDescription asbd;
-    UInt32 io_size = sizeof(asbd);
-    err = AudioFileGetProperty(file, kAudioFilePropertyDataFormat, &io_size, &asbd);
+    AudioFormatListItem afl[8];
+    UInt32 io_size = sizeof(afl);
+    err = AudioFileGetProperty(file, kAudioFilePropertyFormatList, &io_size, &afl);
     AudioFileClose(file);
     CHECK_CA_ERROR("unable to get ASBD for packet");

+    AudioStreamBasicDescription asbd = afl[0].mASBD;
+
     if (p->asbd.mSampleRate != asbd.mSampleRate ||
         p->asbd.mChannelsPerFrame != asbd.mChannelsPerFrame ||
         p->asbd.mFormatID != asbd.mFormatID ||
         p->asbd.mFramesPerPacket != asbd.mFramesPerPacket) {
+        for (int i = 0; i < io_size / sizeof(AudioFormatListItem); ++i) {
+            AudioStreamBasicDescription asbd = afl[i].mASBD;
+            MP_VERBOSE(
+                ao,
+                "asbd[%d]: sample rate: <%f>, channels: <%u>, format id: <%u>, frames per packet: <%u>, tag: <%u>\n",
+                i,
+                asbd.mSampleRate,
+                asbd.mChannelsPerFrame,
+                asbd.mFormatID,
+                asbd.mFramesPerPacket,
+                afl[i].mChannelLayoutTag
+            );
+        }
+
         if (p->desc) {
             CFRelease(p->desc);
             p->desc = NULL;
@@ -183,6 +202,7 @@ static bool enqueue_data(struct ao *ao, void *buf, int samples) API_AVAILABLE(ma
         if (!enqueue_buf(ao, p->pkt->data, p->pkt->size, asbd.mFramesPerPacket, asbd.mSampleRate, true)) {
             success = false;
+            MP_FATAL(ao, "%s\n", cfstr_get_cstr((CFStringRef)[[p->renderer error] localizedFailureReason]));
             goto coreaudio_error;
         }
```

Info of all three layers is obtained, and the first layer is fed to the renderer. However, the playback fails with an error.
you should file a Radar or DTS
I didn't get your point. 😵💫 I don't think Apple supports DTS.
Sorry, I mean:
```c
    real_sample_count = request_sample_count;
}

CMSampleTimingInfo sample_timing_into[] = {(CMSampleTimingInfo) {
```
`sample_timing_info`
this typo wasn't fixed
```diff
-CMSampleTimingInfo sample_timing_into[] = {(CMSampleTimingInfo) {
+CMSampleTimingInfo sample_timing_info[] = {(CMSampleTimingInfo) {
```
lol I didn't notice that.
Not sure what's up, but when stop is called during pause I see `ao->buffer_state->pending->pts` being ~2 seconds further ahead, so on unpause a ~2 second skip happens.
@orion1vi This AO requires 1 second of device buffering, so there is at least 1 second of preroll. mpv's decoder may also read ahead further. After stop, the 1–2 second buffer is cleared. On unpause, mpv does not refill it, so there is silence. I don't think this issue can or should be handled at the AO level.
Same issue here #11488, but not in https://github.com/rcombs/mpv/tree/avfoundation; that one is push-based though.
Yes, there are multiple implementations of avfoundation now, and you can choose the one you like. I'm not interested in implementing a push-based ao, because avfoundation must have a reason to use a large device buffer, and a push-based ao does not feed enough data to it. Anyway, I don't think this issue should be solved at the ao level.
hey @ruihe774, first thanks for your PR and sorry for taking so long. i think we should get this rolling again. first some questions, sorry if they have been answered already; a small summary would be helpful for me.
@Akemi Hi!
One difference is channel configuration. gazsiazasz's implementation hard-codes mono and stereo, and rcombs's implementation constructs a custom CoreAudio channel layout using …. Mine uses a lookup of predefined CoreAudio channel layouts and falls back to a custom channel layout. With my implementation, spatial audio works well not only with stereo audio but also with, e.g., 5.1, 7.1, and any other standard channel layouts. Another difference between rcombs's and mine is that rcombs's is a push-based ao and mine is pull-based. IMO …
CoreAudio is a low-level API and AVFoundation is a high-level API built upon CoreAudio. I don't think a full replacement is feasible. Many low-level features, e.g., low latency, a configurable audio graph, and device selection, are only available in CoreAudio, while high-level features like AirPlay and spatial audio are only available in AVFoundation. In fact, I think Apple actually implements these high-level features in some private CoreAudio audio units (i.e., nodes in the audio processing graph) and exposes these "Apple-designed" audio processing graphs as a whole to the high-level API. It is a pity that Apple does not provide a low-level interface to these advanced features to allow flexible audio graph configuration with them.
One issue is the effect that a very large device buffer (1 second) brings about. There is no way this can be handled in my ao implementation; in fact, if a large device buffer is set in another ao, such silence will also appear. Another issue is that it currently does not support proprietary spatial audio formats like Dolby Atmos, which need metadata passthru. I'm not very interested in implementing this in the near future. Apart from these, the ao works pretty well. I actually use this ao in my everyday watching and have not spotted problems.
Yes, for sure.
okay that sounds good. it would be nice if you could rebase your changes and also squash most/all of your commits into one "initial avfoundation ao support" or similar. if a commit makes sense to keep separate you can/should leave it. i am going to review the code this or next weekend and i think we could get this merged then.
(force-pushed from 186295a to f3d7b22)
I've just looked into it. The reason why the desync happens is that avfoundation itself queues samples. When paused, …. For example, the playing pts is 00:10, and avfoundation has prerolled for 1 second. In this case, the next call to …. This is an issue caused by the fact that mpv uses …. To solve this issue, we need to introduce an explicit API to pause a pull-based ao. I've implemented it in 9abcdd2 and fixed the desync issue. Since this is an API change, I will put it into a separate PR for discussion after this PR is merged.
Adding pause support to pull-based ao is the correct solution 👍🏼
@t-8ch I'm wondering how I can deal with the situation where zero samples are returned from …. Also, I don't think spinning in an RT thread as in …
I think a sleep in the underrun scenario is the most practical solution. But the duration of the sleep should probably be proportional to the amount of data already enqueued into the ao. And ideally the sleep would be interruptible by the ao's …
I don't think it is relevant. (See lines 187 to 192 and lines 674 to 684 in f4a7931.) Instead, we want to be woken up when new data comes into the buffer. The closest one is `p->pt_wakeup`, but it is not broadcast for pull-based ao. Still, `p->pt_wakeup` does not guarantee `p->lock` is unlocked, so there may still be contention.
I think …
@kasper93 So how can the playing pts be properly calculated? I find such code in …:

```c
// Return pts value corresponding to currently playing audio.
double playing_audio_pts(struct MPContext *mpctx)
{
    double pts = written_audio_pts(mpctx);
    if (pts == MP_NOPTS_VALUE || !mpctx->ao)
        return pts;
    return pts - ao_get_delay(mpctx->ao);
}
```

while …:

```c
int64_t end = p->end_time_ns;
int64_t now = mp_time_ns();
driver_delay = MPMAX(0, MP_TIME_NS_TO_S(end - now));
```

I don't think …. BTW, I'm confused about how the av diff is calculated. In …:

```c
double a_pos = written_audio_pts(mpctx);
if (a_pos != MP_NOPTS_VALUE && mpctx->video_pts != MP_NOPTS_VALUE) {
    a_pos -= mpctx->audio_speed * ao_get_delay(mpctx->ao);
    mpctx->last_av_difference = a_pos - mpctx->video_pts
        + opts->audio_delay + offset;
}
```

This is inconsistent with the calculation in …:

```c
a_pos -= 0.5 * 1 + 0.6 * 2
```

What's more, …:

```c
double a_pts = written_audio_pts(mpctx) + opts->audio_delay - mpctx->delay;
```

I'm confused. @Dudemanguy would you please look into this?
Playback speed is not relevant on the ao level.
@Dudemanguy I'm not discussing the ao level, and this is not about this PR. It is a side product of my investigation of how mpv handles …. I think this is the cause of the AV diff when using avfoundation during playback speed changes. avfoundation … a 2 second device buffer, …. So the value of …
That is part of the calculation, yes. The discontinuous jump that occurs when you change speeds could be smoother, but you do have to take the speed into consideration here.
is this something you want to get in before the merge, so you can fix it here, or do you want to fix it in another PR? a PR for it would also be appreciated. also, is there still something missing here?
I'd prefer fixing it in another PR.
I think not.
@Akemi Do I need to squash the commits?
yeah, please squash them. |
@Akemi Time to merge it?
thank you for your contribution and sorry again that it took me so long to get this rolling. [edit]
As in #9252.

A new ao "avfoundation" that utilizes `AVSampleBufferAudioRenderer` in the AVFoundation framework is added, which enables spatial audio for multichannel and stereo. (Ref)

Note that though this ao works both bundled and from the command line for normal playback, spatial audio can be turned on only when mpv is bundled (mpv.app and iina.app) and fails to work from the command line. (coreaudiod crashes if we try to turn on spatial audio for mpv running from the command line.)