Add support for 24-bit (S16LE) 48k audio (perhaps as separate installable package) (humble feature request) #7521

Open
tonsimple opened this issue May 21, 2022 · 18 comments
Labels
C: audio P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. T: enhancement Type: enhancement. A new feature that does not yet exist or improvement of existing functionality.

Comments

@tonsimple

The problem you're addressing (if any)

Availability of "modern" 48k, 24-bit audio will improve compatibility with various audio equipment and with quality-of-life features such as the Pulse equalizer (which really doesn't work very well with the current Qubes 44.1k 16-bit setup).

This can be particularly useful for users who listen on headphones through an external DAC that is "picky" about format modes. More generally, the ability to at least choose between the two most popular formats (48k 24-bit and 44.1k 16-bit) would resolve a lot of subtle but annoying woes for people who (like me) use Qubes as a general-purpose workstation and thus for music listening too.

The solution you'd like

Keep current "no negotiation, just raw PCMs being thrown" architecture.

Separate pacat-simple-vchan-48k and separate module-vchan-sink-48k are made available as optional installs

The original pacat-simple-vchan is slightly modified to check on startup whether pacat-simple-vchan-48k is installed and, if so, to launch the 48k version instead of the regular one.

User has to manually edit their pulseaudio configs in relevant template VMs to load module-vchan-sink-48k instead of regular one (people who care about "44.1k is causing me minor issues" are usually aware of how to do that)

User is also responsible for configuring resampling properly across the path from VM to Dom0/audio-VM and to their playback tract
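
The startup check proposed above could look roughly like the following sketch. The binary names come from the proposal itself. The function name and the injectable `lookup` parameter are hypothetical, and the real pacat-simple-vchan is not written in Python, so this only illustrates the decision, not an actual implementation.

```python
import shutil


def pick_pacat_binary(lookup=shutil.which):
    """Return the name of the vchan audio binary to launch.

    Prefers the optional 48k build when it is found on PATH and
    otherwise falls back to the regular 44.1k one.
    """
    # "pacat-simple-vchan-48k" is the hypothetical optional package
    # from the proposal; it does not exist today.
    if lookup("pacat-simple-vchan-48k") is not None:
        return "pacat-simple-vchan-48k"
    return "pacat-simple-vchan"


print(pick_pacat_binary())
```

Passing the lookup function as a parameter keeps the choice testable without installing anything.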

The value to a user, and who that user might be

While it is true that under normal conditions 44.1k audio is not reliably distinguishable from 48k 24-bit audio of the same content (except perhaps under very particular conditions), the idiosyncrasies of a particular playback tract can turn the format into a source of problems (admittedly minor ones). My current setup results in some content being resampled from 48k (or more) down to 44.1k and then (because of my external headphone equalizer, which doesn't approve of 44.1k on principle, but hey, it was affordable) resampled back to 48k, which does produce a handful of unpleasant minor artifacts even if the "best" resampling methods are used.

Generally one is better off resampling 44.1k up to 48k and then running the 48k content down to playback than the other way around, in my humble experience.
Also, pulseaudio's equalizer tends to work better with 48k 24-bit, which currently makes it rather "un-Qubesly", alas (I have experimented with that extensively since Qubes version 3-ish and ended up buying a hardware equalizer).

The suggested implementation does not add much complexity to the audio virtualization protocol itself:

still no negotiation, no "million switches", does not burden the developers with decisions about when to go 48k or reconfiguring the template VMs

It does however give users who are "Qubes all the way" a little bit more flexibility for slightly better quality of life

I realize it is probably a low-priority consideration but hope one day to see it implemented

@tonsimple tonsimple added P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. T: enhancement Type: enhancement. A new feature that does not yet exist or improvement of existing functionality. labels May 21, 2022
@DemiMarie

The audio virtualization protocol may need some updates for PipeWire, so that would be a good time to make this change.

@DemiMarie DemiMarie self-assigned this May 21, 2022
@andrewdavidwong andrewdavidwong added this to the Release TBD milestone May 22, 2022
@andrewdavidwong andrewdavidwong removed this from the Release TBD milestone Aug 13, 2023
@LindaFerum

Hi! at risk of sounding needy, @DemiMarie what's the status on this ? Is there any hope for us qubes-using audiophiles :) in foreseeable future ?

@DemiMarie

@LindaFerum Turns out the audio virtualization protocol did not need updates for PipeWire, but I can add support for this.

Can you confirm that S16LE at 48000 samples/s is sufficient?

@LindaFerum

LindaFerum commented Oct 17, 2023

Hi @DemiMarie !
Very grateful for you looking into this :)

Us audio weirdos are greedy for sound formats even bats can't tell apart (occasionally for a good reason - like some external equalizers and some DACs being weird about sample rates they get at input) so if it would be possible to implement "more and fancier" formats (S16LE at 96000 samples, maybe the 32LE stuff etc.) that would be very nice and kind.

If that's substantial amount of work S16LE at 48000 samples (and maaaaybe S16LE at 96000 samples) should cover most rational-ish requirements (S16LE at 48000 covers the requirements of my current setup but 96000 may come in handy for some equalizers I've worked with when tweaking my headphones)

Also as a little sidenote:
some people try to avoid resampling and thus pulse ends up using more than one sample rate depending on content, would similar "avoid resampling by using different sample rates for different content" behavior present a problem when using PipeWire in Qubes?

@marmarek
Member

> some people try to avoid resampling and thus pulse ends up using more than one sample rate depending on content, would similar "avoid resampling by using different sample rates for different content" behavior present a problem when using PipeWire in Qubes?

Yes it will be a problem, the audio protocol we use intentionally avoids any kind of format negotiation to reduce attack surface.

As for higher sample rates, while technically it might be possible, it will also make it more susceptible to underruns under higher load (because more data needs to be sent in a short time). This can lead either to higher latency, or to sound crackling when doing something else in that qube - both undesirable, especially when the goal is to have higher sound quality...
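
To put rough numbers on "more data needs to be sent", here is a small sketch of the raw PCM data rate for the formats discussed in this thread. Two channels (stereo) is an assumption, not something stated above, and the function name is only illustrative.

```python
def pcm_bytes_per_second(rate_hz, bits_per_sample, channels=2):
    """Raw PCM data rate in bytes per second.

    channels=2 (stereo) is an assumption made for this illustration.
    """
    return rate_hz * (bits_per_sample // 8) * channels


for label, rate, bits in [
    ("S16LE 44.1k", 44100, 16),
    ("S16LE 48k", 48000, 16),
    ("S16LE 96k", 96000, 16),
]:
    print(label, pcm_bytes_per_second(rate, bits), "bytes/s")
```

Under that assumption, going from 44.1k to 48k adds about 9% more data, while 96k doubles the 48k rate.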

@DemiMarie

> > some people try to avoid resampling and thus pulse ends up using more than one sample rate depending on content, would similar "avoid resampling by using different sample rates for different content" behavior present a problem when using PipeWire in Qubes?

> Yes it will be a problem, the audio protocol we use intentionally avoids any kind of format negotiation to reduce attack surface.

Is the difference in audio quality noticeable to those who are not audio professionals? Professional audio is not a goal of Qubes OS, as it requires real-time guarantees that Qubes OS cannot currently provide. That said, the audio quality should be good enough for voice and video calls, home music and video playback, and other non-professional uses.

> As for higher sample rates, while technically it might be possible, it will also make it more susceptible to underruns under higher load (because more data needs to be sent in a short time). This can lead either to higher latency, or to sound crackling when doing something else in that qube - both undesirable, especially when the goal is to have higher sound quality...

To elaborate on this: real-time scheduling is not actually real-time under Qubes OS, because the underlying hypervisor (Xen) does not provide real-time guarantees and because Qubes does not use PREEMPT_RT kernels. Therefore, it is not possible to support workloads (such as professional audio production) that require hard real-time guarantees and minimal latency. The goal is to work for 99% of users; supporting the remaining 1% would require vastly more effort.

@LindaFerum

LindaFerum commented Oct 17, 2023

> Is the difference in audio quality noticeable to those who are not audio professionals?

I think (claim, since I didn't do a high-quality DBT lol) that I can tell up to about 48k, but everything above that, like telling apart an honest 48k recording from an honest 96k recording, is IMHO not feasible, at least not outside a specialized environment and extremely high-quality gear (my gear is more - mediocre, let's say). I only ever needed 96k sampling rates for a quirky EQ situation (eventually I just got different headphones so I don't have to EQ anything).

S16LE 48k should be quite enough for most listening situations.

If you add support for S16LE 96k as an advanced option that needs to be specified manually throughout the path from the audio source Qube to the domain where the sound card resides, that would be very nice and may eventually come in handy, but I guess that depends on just how much effort an "advanced user only" feature like that is worth.

Fighting underruns in 96k situations can be to some degree left as exercise to the user because it's somewhat hardware dependent and frankly is a known quantity in linux sound (even bare metal), usually with entirely occult solutions that are sure to get occult-ier in Qubes/Xen.

Since automatic format "switching" is a problem I think the capability is safe to ignore, just mention in some documentation that it's not supported along the path from source Qube to the domain of audio card

PipeWire has an IMHO very good resampler and allowed rates configuration options so users should be able to sort this one out depending on their situation and formats on their own

@DemiMarie

> > Is the difference in audio quality noticeable to those who are not audio professionals?

> I think (claim, since I didn't do a high-quality DBT lol) that I can tell up to about 48k, but everything above that, like telling apart an honest 48k recording from an honest 96k recording, is IMHO not feasible, at least not outside a specialized environment and extremely high-quality gear (my gear is more - mediocre, let's say). I only ever needed 96k sampling rates for a quirky EQ situation (eventually I just got different headphones so I don't have to EQ anything).

> S16LE 48k should be quite enough for most listening situations.

Thanks!

> If you add support for S16LE 96k as an advanced option that needs to be specified manually throughout the path from the audio source Qube to the domain where the sound card resides, that would be very nice and may eventually come in handy, but I guess that depends on just how much effort an "advanced user only" feature like that is worth.

> Fighting underruns in 96k situations can be to some degree left as exercise to the user because it's somewhat hardware dependent and frankly is a known quantity in linux sound (even bare metal), usually with entirely occult solutions that are sure to get occult-ier in Qubes/Xen.

Do most professionals in this area use Apple devices? Apple seems to be the only one that can guarantee the absence of xruns, since they control the entire stack all the way down to the hardware.

> Since automatic format "switching" is a problem I think the capability is safe to ignore, just mention in some documentation that it's not supported along the path from source Qube to the domain of audio card

> PipeWire has an IMHO very good resampler and allowed rates configuration options so users should be able to sort this one out depending on their situation and formats on their own

PipeWire’s resampler should make everything work automatically.

@LindaFerum

> Do most professionals in this area use Apple devices? Apple seems to be the only one that can guarantee the absence of xruns, since they control the entire stack all the way down to the hardware.

Well, most peeps I know are Windows-primary. Apple does make it into the "zoo" sometimes, but having several different Windowses (there is good gear that never updated from Windows XP, alas, sigh, etc.) is more typical in my experience.

Anyhow, I've had some very decent experience with Qubes so far, recreational-listening-wise, and I think with 48k support and a bit of luck I could maybe migrate some mixing work here too.

Thanks again, looking forward to testing the new features :)

@DemiMarie

I’m pretty sure that this will not only need to be implemented, but actually become the default, in an R4.2 update. The reason is that PipeWire has much stricter timing requirements than PulseAudio. If the default “dummy” driver drives the graph, audio recording does not work well due to Xen scheduling jitter. This can be fixed if the Qubes module drives the graph, but that requires it to produce samples at 48 kHz, and with the current 44100 Hz audio format this requires the quantum to be a multiple of 160 samples (44100/48000 reduces to 147/160). I am not aware of any way to enforce this requirement on PipeWire.
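
The "multiple of 160" figure can be checked with a few lines of arithmetic. This is only a sketch of the reasoning: the function names are made up for illustration, and 1024 is used simply as an example of a power-of-two quantum.

```python
from fractions import Fraction
from math import gcd

GRAPH_RATE = 48000   # rate the graph would run at, per the comment above
STREAM_RATE = 44100  # current fixed Qubes audio rate


def min_quantum(graph_rate, stream_rate):
    """Smallest number of graph-rate frames that corresponds to a whole
    number of stream-rate frames."""
    return graph_rate // gcd(graph_rate, stream_rate)


def stream_frames(quantum, graph_rate=GRAPH_RATE, stream_rate=STREAM_RATE):
    """Exact number of stream-rate frames covered by one quantum."""
    return Fraction(quantum * stream_rate, graph_rate)


print(min_quantum(GRAPH_RATE, STREAM_RATE))  # 160
print(stream_frames(160))                    # 147
print(stream_frames(1024))                   # 4704/5, i.e. 940.8 frames
```

A quantum that is not a multiple of 160 maps to a fractional number of 44.1k frames, which is why a fixed 44100 Hz format is awkward when the module has to drive a 48 kHz graph.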

@DemiMarie

Update: this will not be implemented in R4.2, but I want to make it default in R4.3.

@DemiMarie DemiMarie removed their assignment Mar 6, 2024
@LindaFerum

I will be waiting patiently for it :-)

@DemiMarie

Current plan:

  1. Support reading the audio format and rate from qubesdb.
  2. Have new pipewire-qubes packages advertise a supported-feature.agent-audio-format feature.
  3. Have new qubes-audio-daemon packages advertise a supported-feature.agent-audio-format feature.
  4. If a VM supports supported-feature.agent-audio-format and its AudioVM supports supported-feature.agent-audio-format, new audio parameters (which default to 48K and S16LE, but can be controlled via qvm-features) are written to qubesdb. These override the hard-coded defaults.
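
The decision in step 4 can be sketched as follows. This is not the Qubes implementation: the function, the constants, and the way features and overrides are passed in are all hypothetical, and the legacy default of S16LE at 44100 Hz is taken from the discussion in this thread.

```python
FEATURE = "supported-feature.agent-audio-format"

# Assumption: the existing hard-coded format is S16LE at 44100 Hz.
LEGACY_DEFAULT = {"format": "S16LE", "rate": 44100}
# New defaults named in the plan above.
NEW_DEFAULT = {"format": "S16LE", "rate": 48000}


def effective_audio_params(vm_features, audiovm_features, overrides=None):
    """Return the audio parameters both sides would end up using.

    vm_features / audiovm_features: collections of advertised feature names.
    overrides: optional dict standing in for values set via qvm-features.
    """
    if FEATURE in vm_features and FEATURE in audiovm_features:
        params = dict(NEW_DEFAULT)
        params.update(overrides or {})
        return params  # these would be written to qubesdb
    # Either side is too old: nothing is written, hard-coded defaults apply.
    return dict(LEGACY_DEFAULT)


print(effective_audio_params({FEATURE}, {FEATURE}))
print(effective_audio_params({FEATURE}, set()))
```

Requiring the feature on both sides means an updated VM paired with an older AudioVM (or the reverse) keeps the old format instead of producing mismatched streams.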

@DemiMarie

@LindaFerum: What do you mean by “24-bit (S16LE)”? I thought that S16LE was 16-bit signed integers, which is what Qubes OS already uses. Is there a specific format you were thinking of?
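
For reference, the difference between 16-bit and 24-bit signed little-endian samples can be shown in a few lines. The sketch assumes the packed 3-byte layout for 24-bit samples (some APIs also offer 24-bit samples padded to 4 bytes), and `encode_le` is just an illustrative helper.

```python
def encode_le(sample, bits):
    """Encode one signed integer PCM sample as little-endian bytes."""
    return sample.to_bytes(bits // 8, byteorder="little", signed=True)


print(encode_le(-2, 16).hex())  # feff   (2 bytes per sample)
print(encode_le(-2, 24).hex())  # feffff (3 bytes per sample, packed)

# Representable ranges for the two sample sizes
print(-(2 ** 15), 2 ** 15 - 1)  # -32768 32767
print(-(2 ** 23), 2 ** 23 - 1)  # -8388608 8388607
```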

@LindaFerum

@DemiMarie Well, I'm not OP but I've read this issue title as "add support for 24 bit (any) OR S16LE 48k".
Presumably the 16-bit 48k thing was a suggestion in case "adding more of them bits" was too gnarly an effort - or at least that is how I interpreted the suggestion as initially outlined 😄

I could live with 16 bits and 48k

but I would much rather live with 24 bits and 48k

BTW, not hugely experienced with pipewire but it appears a bit weird about selecting bit depth in audio, essentially always doing an "autopilot thing" based on driver as far as I understood this https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/457 ?

@DemiMarie

@LindaFerum Is there a specific 24-bit format you are thinking of?

@LindaFerum

@DemiMarie I'd say s24le with 48kHz sample rate is a very okay format, I would expect strong plurality of external hardware and almost all software to handle it, so I'd just work with s24le at 48k and leave it at that.

I guess a lot indeed depends on how much work implementing a given format + sample rate support amounts to - like, having a 96k + 24bit option would be nice and some people will appreciate it, but I kind of think it might be not worth it if it's like, hard, and / or liable to create problems.

s24le at 48k is mighty fine IMHO.

"A real audiophile" (or a very beefy audio engineer) would probably advocate for floating-point 32-bit formats but, well, IMO that's overkill and overhead and all other over-things ;-)

@DemiMarie

> @DemiMarie I'd say s24le with 48kHz sample rate is a very okay format, I would expect strong plurality of external hardware and almost all software to handle it, so I'd just work with s24le at 48k and leave it at that.

Thank you! That is what I will go with, then.

> I guess a lot indeed depends on how much work implementing a given format + sample rate support amounts to - like, having a 96k + 24bit option would be nice and some people will appreciate it, but I kind of think it might be not worth it if it's like, hard, and / or liable to create problems.

It isn’t hard per se, but it adds additional test requirements.

> s24le at 48k is mighty fine IMHO.

> "A real audiophile" (or a very beefy audio engineer) would probably advocate for floating-point 32-bit formats but, well, IMO that's overkill and overhead and all other over-things ;-)

Floating point numbers also have infinities, signed zeros, NaNs, denormals, and other junk that can trip up unsuspecting code. I definitely would prefer to avoid those.
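
A few lines are enough to show the kind of special values meant here. This is a generic illustration of IEEE 754 behaviour as seen from Python, not code from any Qubes component.

```python
import math
import struct

nan = float("nan")
print(nan == nan)                    # False: NaN never compares equal, even to itself
print(math.isinf(float("inf")))      # True
print(0.0 == -0.0)                   # True, yet the two zeros carry different signs
print(math.copysign(1.0, -0.0))      # -1.0

# Smallest positive float32 denormal: bit pattern 0x00000001
denormal = struct.unpack("<f", struct.pack("<I", 1))[0]
print(denormal > 0.0)                # True
```

With integer formats such as S16LE or S24LE, by contrast, every bit pattern is an ordinary sample value, so none of these cases need handling.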

Projects
Status: Backlog