
Prototyping an API for getting the supported min and max buffersizes #401

Merged Jul 14, 2020 (18 commits)

Conversation

JoshuaBatty
Contributor

This PR proposes an API for querying the supported min and max buffersizes of the device. I have managed to write successful test cases for ALSA, WASAPI, ASIO, CoreAudio, and Emscripten. I wanted to put this API out there for feedback before I start implementing all of the specific backends.

Note: the current version of winapi has no ability to query min/max buffer sizes when using WASAPI. I have an open PR on winapi, which can be found here, that first needs to land before we can merge this work into CPAL.

This PR also removes the fn from(format: SupportedStreamConfig) -> SupportedStreamConfigRange function. Currently, it's only being used in wasapi/device line 529. I think we should aim to provide a better way of handling the supported formats returned by WASAPI rather than continue to modify this function to handle buffersizes.
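To give a feel for the proposed API, usage would look something along these lines (names are illustrative and may differ from what finally lands):

use cpal::traits::{DeviceTrait, HostTrait};

let host = cpal::default_host();
let device = host.default_output_device().expect("no output device available");
// Each supported config range additionally reports the buffer size limits
// supported by the device for that configuration.
for range in device.supported_output_configs().expect("error querying configs") {
    match range.buffer_size() {
        cpal::SupportedBufferSize::Range { min, max } => {
            println!("supported buffer sizes: {} to {} frames", min, max);
        }
        cpal::SupportedBufferSize::Unknown => {
            println!("supported buffer size range unknown");
        }
    }
}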

Looking forward to hearing your feedback!

@ishitatsuyuki
Collaborator

Given you are adding support for AudioClient3 in winapi, are you going to implement support for the "Shared" audio mode added in Windows 10? I was thinking of working on it so maybe we could work together.

@JoshuaBatty
Contributor Author

Hi @ishitatsuyuki. It wasn't something that I had planned immediately, but I think it could be worth considering. However, it probably makes sense for it to be separate from this work.

@JoshuaBatty
Contributor Author

JoshuaBatty commented May 22, 2020

So I've been making some good progress on this; support for ASIO, CoreAudio, and ALSA is done.

Moving on to WASAPI now. In order to query the buffer size ranges supported by the device, we need to call the GetSharedModeEnginePeriod method supplied by IAudioClient3. As such, I was going to replace IAudioClient with IAudioClient3 in the wasapi backend. The only thing to note is that IAudioClient3 is only supported on Windows 8.1+. That said, Microsoft officially ended support for Windows 7 in January of this year. I think it's probably a stretch for cpal to continue trying to support Windows 7, and we shouldn't be held back from upgrading to a more modern IAudioClient. Happy to hear people's thoughts on the matter before I keep going though!

edit: My bad, IAudioClient3 seems to only be supported on Windows 10. One option would be to give the user a way to enable IAudioClient3 with a feature flag, at least until Windows 8 is officially unsupported...?

@dansgithubuser
Contributor

Happy to drop Win 7.

@raphlinus

I'm not actively working on audio things at the moment, but on druid we have quite a number of Windows 7 users (we always hear quickly when we break things) and are choosing to continue supporting it for at least a while.

@mitchmindtree
Member

Seeing as the only way to reliably get the min and max buffer size via WASAPI requires Windows 10, I think it makes sense to have WASAPI simply report the default buffer size as both the min and max, so that users of WASAPI on older versions of Windows can at least use CPAL at all, even if they can't configure the buffer size.

Perhaps having some way of allowing users to opt-in to using the Windows-10-only IAudioClient3 is the way to go? Originally I mentioned the idea of using a feature flag to @JoshuaBatty, but after thinking a little further on it, perhaps it's better to have a WasapiHost-specific, alternative device constructor that constructs the device with IAudioClient3 rather than IAudioClient? E.g. something along these lines:

let (host, device) = if cfg!(target_os = "windows") {
    let wasapi_host = cpal::WasapiHost::new().unwrap();
    let wasapi_device = wasapi_host.default_output_device_iaudioclient3().unwrap();
    (wasapi_host.into(), wasapi_device.into())
} else {
    let host = cpal::default_host();
    let device = host.default_output_device().unwrap();
    (host, device)
};

// Enumerate supported configs, create streams, etc...

Thoughts?

Also, perhaps it might even be worth leaving this opt-in-windows-10-API stuff for a separate future PR and just focusing on landing the default case for now?

@Ralith
Contributor

Ralith commented May 22, 2020

Perhaps having some way of allowing users to opt-in to using the Windows-10-only IAudioClient3 is the way to go?

Could default_host() dynamically select the best available host, at least as far as functionality it exposes is concerned? It'd be nice to allow people to get good behavior in any given target environment without a bunch of platform-specific special cases.

@mitchmindtree
Member

@Ralith good point! Surely there's a reliable way to check availability of IAudioClient3, or even just check the Windows version. I'll look into this tomorrow with @JoshuaBatty.

@Ralith
Contributor

Ralith commented May 22, 2020

IIRC it's idiomatic on Windows to check for APIs simply by trying to dynamically load the function pointer, but I'm not an expert.
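For reference, a minimal sketch of that approach for IAudioClient3 (this assumes the winapi PR adding the interface has landed; the QueryInterface call simply fails on versions of Windows without it):

use std::ptr;
use winapi::ctypes::c_void;
use winapi::shared::winerror::SUCCEEDED;
use winapi::um::audioclient::{IAudioClient, IAudioClient3};
use winapi::Interface;

// Ask an existing IAudioClient whether it also implements IAudioClient3.
unsafe fn supports_audio_client3(client: *mut IAudioClient) -> bool {
    let mut client3: *mut c_void = ptr::null_mut();
    let hr = (*client).QueryInterface(&IAudioClient3::uuidof(), &mut client3);
    if SUCCEEDED(hr) && !client3.is_null() {
        // Balance the reference added by QueryInterface.
        (*(client3 as *mut IAudioClient3)).Release();
        true
    } else {
        false
    }
}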

@JoshuaBatty
Contributor Author

Thanks for the feedback everyone, some really great ideas. Because we are still waiting on this PR to be merged in order to get IAudioClient3 support landed in winapi I think we should leave user-configurable buffersizes in Windows for a future PR. As such, the current implementation just returns the default buffersize as the supported min and max buffersize range for the time being. At least this allows us to merge this PR and get support for the other hosts.

If someone wanted to implement this in the future, something like the below code would need to be added to the wasapi device.rs file.

// Query the format and engine period currently in use by the audio engine.
let mut current = 0;
let mut format = std::ptr::null_mut();
let _hresult = (*audio_client).GetCurrentSharedModeEnginePeriod(&mut format, &mut current);

// Query the default, fundamental, min and max engine periods (in frames)
// supported for that format.
let mut min = 0;
let mut max = 0;
let mut fundamental = 0;
let mut default = 0;
let hresult = (*audio_client).GetSharedModeEnginePeriod(format, &mut default, &mut fundamental, &mut min, &mut max);
println!("default = {} -- fundamental = {} -- min buffersize = {} -- max buffersize = {}", default, fundamental, min, max);

@mitchmindtree (Member) left a comment

Looks good to me! I've added a note on one small patch that should probably be made for the emscripten host but once that's addressed I'm happy to land this.

-set_timeout(|| audio_callback_fn::<D, E>(user_data_ptr), 330);
+set_timeout(
+    || audio_callback_fn::<D, E>(user_data_ptr, config, sample_format, buffer_size_frames),
+    330,

Now that the buffer size is configurable, this milliseconds value should correspond with the duration of a buffer, so something like buffer_size_frames * 1000 / sample_rate. I think the magic value of 330 from before was a hack so that the timeout would be ready just before the previous buffer completes (which was previously hard-coded to a third of a sec).
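For example, something along these lines (a sketch; it assumes the sample rate is accessible from config at this point):

// Schedule the next callback to fire roughly when the current buffer ends,
// rather than after a hard-coded 330ms.
let timeout_ms = buffer_size_frames * 1_000 / config.sample_rate.0;
set_timeout(
    || audio_callback_fn::<D, E>(user_data_ptr, config, sample_format, buffer_size_frames),
    timeout_ms,
);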

@sniperrifle2004

I've spotted a few problems in the alsa host. I am also wondering whether BufferSize::Default shouldn't have some meaningful definition. It now appears to depend on the backend (at least for the alsa host, and at the moment also on the selected alsa device), which means default behaviour will change across platforms and backends. From a cpal client perspective, I think we should strive for consistent default behaviour.

@sniperrifle2004 left a comment

These are the problems I mentioned.

@@ -890,6 +900,11 @@ fn set_hw_params_from_format<'a>(
hw_params.set_rate(config.sample_rate.0, alsa::ValueOr::Nearest)?;
hw_params.set_channels(config.channels as u32)?;

match config.buffer_size {
BufferSize::Fixed(v) => hw_params.set_buffer_size(v as i64)?,


This should probably be set_buffer_size_near.

Suggested change
BufferSize::Fixed(v) => hw_params.set_buffer_size(v as i64)?,
BufferSize::Fixed(v) => hw_params.set_buffer_size_near(v as i64)?,

@mitchmindtree (Member) replied:

I think using set_buffer_size is the correct approach here. If the user has specified a Fixed value and that exact value is unsupported, then we should return an error to the user to notify them.

Perhaps we could open an issue to discuss adding a Nearest(FrameCount) variant to the BufferSize enum in a follow-up PR that aligns with the semantics of set_buffer_size_near?
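For illustration, such a variant might look roughly like this (a hypothetical sketch; the actual design belongs in that follow-up issue):

pub enum BufferSize {
    Default,
    Fixed(FrameCount),
    // Request the closest supported buffer size to the given frame count,
    // mirroring the semantics of ALSA's set_buffer_size_near.
    Nearest(FrameCount),
}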

@sniperrifle2004 replied:

The reason I suggested set_buffer_size_near is that finding Fixed values that actually work would rely pretty much on trial and error (depending on the device), especially since the possible range of buffer sizes is connected to the other hardware parameters. There are some Fixed values that should work most of the time, but relying on that will come back to bite you. I think the user would benefit more from being able to query the precise buffer size afterwards (if that actually matters; it might). Nearest(FrameCount) would of course also work for that, while leaving Fixed for the case in which the user really has to have that precise buffer size.

BufferSize::Fixed(v) => hw_params.set_buffer_size(v as i64)?,
BufferSize::Default => (),
}

// If this isn't set manually a overlarge buffer may be used causing audio delay
let mut hw_params_copy = hw_params.clone();
if let Err(_) = hw_params.set_buffer_time_near(100_000, alsa::ValueOr::Nearest) {


This will likely clobber the previously set buffer size. As it stands, it should probably be used in the default branch to keep preventing unexpectedly large buffer sizes (And thus latency).
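In other words, something like the following restructure (a sketch of that suggestion):

match config.buffer_size {
    BufferSize::Fixed(v) => hw_params.set_buffer_size(v as i64)?,
    BufferSize::Default => {
        // Only cap the buffer when no fixed size was requested, so the cap
        // cannot clobber an explicitly chosen value.
        hw_params.set_buffer_time_near(100_000, alsa::ValueOr::Nearest)?;
    }
}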

@mitchmindtree (Member) replied:

This will likely clobber the previously set buffer size.

Good spot!

As it stands, it should probably be used in the default branch to keep preventing unexpectedly large buffer sizes (And thus latency).

I'd prefer we omit this entirely in the case that the Default variant is specified, and that the user uses the supported stream configs API to select a suitable buffer size if they're concerned about latency. I do understand the desire to have a lower-latency default that is friendlier to users like game/pro-audio devs (especially as ALSA's default buffer size can be large) but I'm not sure that CPAL should take on the responsibility of deciding what this default should be, particularly when the user is in a better position to choose a suitable option for their project. Thoughts?

As a side note, in the case that we do opt for keeping a lower-latency default like this, I think it's important that it never fails and instead falls back to the system default in the case that whatever value we come up with isn't supported for some reason.

@sniperrifle2004 replied:

Well, I can see where you're coming from, but if I were writing some cross-platform application I would care about consistent behaviour, especially if I did not care about the precise parameters. And this definition of default is far from consistent. Whether the differences would be noticeable, though, I don't know. If the default parameters are extremely poor, they might be.

This is even more important for the period size, since that also affects the default CPU load.

NOTE: I am not arguing for 100ms (that's just what was there), but for some consistent value.

@mitchmindtree (Member) replied:

I've opened #447 to discuss the idea of adding a BufferSize::Near variant.

@sniperrifle2004

sniperrifle2004 commented Jun 18, 2020

A given buffer size must also get a corresponding period size for the alsa host. If this is not directly configurable in a cross-platform way, I suggest using about a quarter of the buffer size, which is a safe value. And the call to set_period_size_near will need to be made before the call to set_buffer_size_near (see the sketch below).

See #431
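A minimal sketch of that ordering in terms of the alsa crate's HwParams API (the quarter-of-the-buffer period is the suggested heuristic, not a hard requirement):

match config.buffer_size {
    BufferSize::Fixed(v) => {
        // Set the period first (about a quarter of the requested buffer),
        // then the buffer size itself; ALSA constrains the two together.
        hw_params.set_period_size_near(v as i64 / 4, alsa::ValueOr::Nearest)?;
        hw_params.set_buffer_size_near(v as i64)?;
    }
    BufferSize::Default => (),
}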

@mitchmindtree (Member) left a comment

I am also wondering if BufferSize::Default shouldn't have some meaningful definition. It now appears to depend on the backend (At least for the alsa host and at the moment also on the selected alsa device), which means default behaviour will change across platforms and backends.

@sniperrifle2004 Yes I think this is the intention and that this seems reasonable. Here, Default implies whatever the default behaviour is on the platform when no specific buffer size is set. Useful for cases where the user does not care or need to know exactly what the buffer size will be. In the case that the user does care (perhaps because they require low-latency), this PR should allow them to determine the valid buffer size range via the supported stream config methods and select a valid buffer size as desired. Perhaps updating the Default variant docs to reflect this meaning might be enough? I've added a comment at the relevant line.

A given buffer size must also get a corresponding period size for the alsa host. If this is not directly configurable in a cross platform way I suggest using about a quarter of the buffer size. A safe value. And the call to set_period_size_near will need to be done before the call to set_buffer_size_near.

See #431

Thanks for pointing this out! I was unaware of the need to set the period size separately - I agree something along the lines of your suggested solution makes sense. Are you happy for this change to be incorporated in this PR rather than landing #431? Btw @JoshuaBatty this stack overflow question has some interesting discussion/links on what "period" means in ALSA if you're interested.


#[derive(Clone, Debug, Eq, PartialEq)]
pub enum BufferSize {
Default,

We should add some docs to this variant explaining that Default refers to whatever the default behaviour is for the host in the case that no specific buffer size is set. It might be useful to also mention that this could result in potentially large buffer sizes, and that if the user cares about latency they should specify a fixed buffer size in accordance with the range produced by the supported stream config API.
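For example, the variant docs could read something like this (wording is illustrative only):

pub enum BufferSize {
    /// The default buffer size for the host or device, used when no specific
    /// buffer size is set. Note that this may be unexpectedly large; users
    /// who care about latency should specify a `Fixed` size within the range
    /// reported by the supported stream config API.
    Default,
    /// A fixed buffer size in frames.
    Fixed(FrameCount),
}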


@sniperrifle2004

sniperrifle2004 commented Jun 24, 2020

@mitchmindtree I wouldn't mind #431 being incorporated in this. In fact it should be anyway. In fairness I have some use cases that would benefit from #431 right now (The low default period size in the pulse plugin is just bad for less powerful devices), however they will benefit even more if the buffer size can actually be set (100ms is a bit lower than is actually needed).

@sniperrifle2004

sniperrifle2004 commented Jun 26, 2020

@mitchmindtree
Looks like leaving the buffer size unrestricted will always result in larg(ish) buffer sizes and the explanation is a lot simpler than I thought: snd_pcm_hw_params

What's worse though is that the minimum period time will also be chosen, so you will wind up with a worst-case high-load/high-latency scenario if the values are not restricted. If you are lucky, the maximum buffer size will adjust to the period time. For what it's worth: actual hardware (often by necessity) and software-level devices often restrict the maximum buffer size to something reasonable even if it doesn't. The (potentially very) low period time will still be a problem though.

So let's at least settle on some (hidden?) default restrictions for the alsa host to avoid that. That way it still depends on the host, but no longer on whatever extremities the underlying device has.

As an alternative suggestion to the lowish 100ms, aplay does this by default: get the maximum buffer time, cap it at half a second, then restrict the period time to near a quarter of this buffer time (that's where I got that metric from; aplay has been immensely useful as both a reference implementation and a tool), and then restrict the buffer time to near the chosen value.
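Expressed against the alsa crate's HwParams API, that strategy would look roughly like this (a sketch, not aplay's literal code):

// aplay-style defaults: cap the device's maximum buffer time at half a
// second, use a quarter of it as the period time, then apply both.
let max_buffer_time = hw_params.get_buffer_time_max()?; // in microseconds
let buffer_time = max_buffer_time.min(500_000);
hw_params.set_period_time_near(buffer_time / 4, alsa::ValueOr::Nearest)?;
hw_params.set_buffer_time_near(buffer_time, alsa::ValueOr::Nearest)?;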

PS The pulse plugin does have a pretty awful maximum buffer size: 4MB. That's seconds to minutes worth of audio (Probably closer to seconds most of the time). However in practice the experienced latency is a lot lower than this advertised value. I suspect it simply dispatches audio data to the pulse server in writei. So it probably instead boils down to how fast the application is able to deliver that initial 4MB of data to pulse and then the additional cost of pulse delivering part of it to the card, which, for 4MB, would likely be low compared to the earlier costs. A little test with the beep example confirms this. It also explains why it's slow to start (Filling up that buffer becomes increasingly noticeable) and it stutters with the low period times on low powered devices: There is overhead involved in communicating with the pulse server and that is probably already enough to cause audible gaps in the playback if the period time is below a certain threshold. It may even be enough to cause stuttering on more powerful machines.

@mitchmindtree
Member

Thanks for sharing more of your thoughts @sniperrifle2004!

PS The pulse plugin does have a pretty awful maximum buffer size: 4MB.

I think you're totally right, in this case the kinds of delays that the default buffer size might cause for users in the common case aren't really practical for any application I can think of. For now I'm happy seeing this land with the original 100ms delay as a default (along with your patch from #431), particularly as that's what we already have. We can address this properly in a follow-up PR.

As an alternative suggestion to the lowish 100ms, aplay does this by default: get the maximum buffer time, cap it at half a second, then restrict the period time to near a quarter of this buffer time (that's where I got that metric from; aplay has been immensely useful as both a reference implementation and a tool), and then restrict the buffer time to near the chosen value.

Yeah something along these lines could make sense. Perhaps we can open a follow up issue along the lines of "Reviewing and deciding upon a default buffer size strategy for ALSA".

@mitchmindtree
Member

Thanks for addressing those issues @JoshuaBatty, this all looks good to me!

I think we can address the default buffer size issue discussed above in a follow-up PR. I've opened up #446 to discuss this further.

Also I think we will likely want something like set_period_time for the BufferSize::Fixed branch in the ALSA host, but we can add this in a follow-up PR.

@PetrGlad left a comment

Currently there is no way to specify the buffer size.
UPDATE: Sorry for the confusion, rodio uses "SupportedStreamConfig::config(&self) -> StreamConfig", which makes it ignore the preferred buffer sizes. So the problem is there.

@@ -297,6 +333,7 @@ impl SupportedStreamConfig {
StreamConfig {
channels: self.channels,
sample_rate: self.sample_rate,
buffer_size: BufferSize::Default,

Buffer size from the format is always ignored...
