This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

consider non-interleaved audio buffers #128

Open
lnihlen opened this issue May 27, 2020 · 0 comments
Labels
enhancement New feature or request

lnihlen commented May 27, 2020

The current realtime audio ingestion system imports interleaved data from PortAudio. Each sample frame is assumed to be 2 floats (stereo-only for now), and the frames are uploaded to the GPU as packed 2D vectors in a single image that is one frame of samples wide and 1 pixel tall.
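As a rough illustration of the layout described above (this is not Scintillator's actual code), the interleaved buffer can be handed over byte-for-byte, and each consecutive pair of floats reads as one two-component texel. The function name `as_stereo_texels` and the sample values are made up for the example.

```python
from array import array

def as_stereo_texels(interleaved, channels=2):
    """View an interleaved buffer as one row of per-frame texels (illustration only)."""
    if len(interleaved) % channels != 0:
        raise ValueError("buffer length must be a multiple of the channel count")
    width = len(interleaved) // channels  # image width in texels (sample frames)
    row = [tuple(interleaved[i * channels:(i + 1) * channels]) for i in range(width)]
    return width, 1, row  # width, height (always 1 pixel tall), texel row

# Hypothetical buffer of 3 stereo frames, laid out as L0 R0 L1 R1 L2 R2.
buffer = array("f", [0.0, 0.5, 0.25, -0.5, 1.0, -1.0])
width, height, texels = as_stereo_texels(buffer)

# The bytes that would be copied to the GPU are the buffer as-is, with no reordering.
payload = buffer.tobytes()
```

The tuple view exists only to show how the GPU would interpret the data; the upload itself is the unchanged `payload`.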

This is supposedly fast, in that the CPU only has to copy the data out of the buffer and into GPU memory without any per-sample manipulation. But it is inflexible, in that certain channel counts won't work well across all GPUs. For instance, the Vulkan Hardware Database shows that signed 32-bit floats are supported at 100% for one-, two-, and four-component formats, but support for sampling from three-component formats is missing on almost three quarters of the listed hardware. This means that at most we could build a system that can ingest 1, 2, or 4 channels of audio in a single image. Scaling beyond 4 channels would require uploading a separate texture image.
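To make the constraint concrete, here is a small sketch of how a channel count might be split across images using only the one-, two-, and four-component float formats. The `VK_FORMAT_*` names are the standard Vulkan format identifiers; the splitting policy and the function name `plan_interleaved_images` are assumptions for illustration, not anything Scintillator implements.

```python
# Vulkan 32-bit float formats, keyed by components per texel.
# VK_FORMAT_R32G32B32_SFLOAT (three components) is left out on purpose,
# following the observation above that sampling from it is poorly supported.
FORMATS = {
    1: "VK_FORMAT_R32_SFLOAT",
    2: "VK_FORMAT_R32G32_SFLOAT",
    4: "VK_FORMAT_R32G32B32A32_SFLOAT",
}

def plan_interleaved_images(channel_count):
    """Split a channel count into image formats, widest first (hypothetical policy)."""
    if channel_count < 1:
        raise ValueError("channel_count must be at least 1")
    plan = []
    remaining = channel_count
    for components in (4, 2, 1):
        # Take as many images of this width as still fit.
        while remaining >= components:
            plan.append(FORMATS[components])
            remaining -= components
    return plan

stereo_plan = plan_interleaved_images(2)
three_channel_plan = plan_interleaved_images(3)
six_channel_plan = plan_interleaved_images(6)
```

Under this policy stereo still fits in one image, but 3 or 6 channels need two images each. Splitting one interleaved buffer across several images would also bring back some per-sample copying on the CPU.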

Furthermore, it seems from this proposal on PortAudio that interleaved data may not always be how the underlying hardware provides the data to PortAudio, so the library may be interleaving the data manually.

An alternative would be to upload the samples de-interleaved, as a series of single floats in an image that is one frame of samples wide and as many pixels tall as there are channels.
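A minimal sketch of that alternative, assuming the input still arrives interleaved and has to be split on the CPU: each channel becomes one row of single floats, so the image height equals the channel count. The function name `deinterleave` and the sample values are hypothetical.

```python
from array import array

def deinterleave(interleaved, channels):
    """Split an interleaved buffer into one row of samples per channel."""
    if channels < 1 or len(interleaved) % channels != 0:
        raise ValueError("buffer length must be a multiple of the channel count")
    # A strided slice picks every `channels`-th sample, starting at channel c.
    return [interleaved[c::channels] for c in range(channels)]

# Hypothetical buffer of 3 stereo frames, laid out as L0 R0 L1 R1 L2 R2.
buffer = array("f", [0.0, 0.5, 0.25, -0.5, 1.0, -1.0])
rows = deinterleave(buffer, 2)

image_height = len(rows)    # one pixel row per channel
image_width = len(rows[0])  # one single-float texel per sample frame
```

If the audio API can deliver non-interleaved buffers directly, this copy step would not be needed and each channel buffer could be uploaded as its own row.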

As Scintillator is a video synth, it's arguable that audio import, for visualization, doesn't need to be as sophisticated or flexible as SuperCollider's. But it's also arguable that Scintillator should be able to consume and do something useful with any audio data that SuperCollider is capable of producing. And SuperCollider can certainly produce very high channel count audio output, so it follows that Scintillator should be able to handle these as inputs.

It might be best to log what the native API on the other side of PortAudio is providing, or is capable of providing, and offer the Scintillator user the option of requesting either interleaved audio with a fixed set of supported channel counts or de-interleaved audio with an arbitrary number of channels. Perhaps the system could default to whichever requires the least processing power. Or perhaps FFmpeg audio decode and audio output will obviate the choice.

This is probably also worth waiting for some user feedback on, so opinions welcome here!

@lnihlen lnihlen added this to To Do in Media Workstream via automation Jul 29, 2020
@lnihlen lnihlen added the enhancement New feature or request label Jul 29, 2020