Use SharedArrayBuffer for getChannelData #2446
Comments
I think we've talked about using SharedArrayBuffers, but that won't happen until the next version of the spec begins. Won't an AudioWorkletNode do the processing you need? If the processing doesn't have to be real-time, can't you use an offline audio context and an AudioWorklet to compute the things you need?
@rtoy As mentioned, if you only have to analyze the data it is definitely possible, but is this really how it should be done? It just seems wrong that you need to hack your way around such a fundamental limitation.
Why is the audio an
F2F discussion:
No. It does enable some offline analysis on audio files, but it's very limited compared to what SharedArrayBuffer would be capable of (or a wholesale replacement of AudioBuffer even).
TPAC 2020:
AudioWG virtual F2F:
Would that be a satisfactory solution, considering it's quite hard to suddenly expose shared memory that is being used within a Web Codecs implementation?
It's been some time since I started this thread, but for me personally, the upcoming Web Codecs API (and the fact it will provide random access to decoded audio data, IIRC) will improve things quite a lot, as it essentially enables streaming audio file processing into memory chunks (which can then, of course, be shared memory chunks).
And for the record, progressive rendering is #2445. With the spec going to proposed recommendation, we can start adding these to the spec soon. |
After a lot of experimentation, it seems to me there is a field which the Web Audio API doesn't cover as nicely as it could: crunching numbers on `AudioBuffer` objects -- i.e. when we have a `for` loop that grinds through a 200 MB `Float32Array` per channel.

There are several approaches to work around this, but each has quite painful downsides. Before we get to that, I'd like to show a few common use-cases for audio editing applications.
Common use-cases
Here are a few examples which are common in audio editing applications, and more or less require crunching numbers on audio data:
Now, to clarify, I understand it's possible to do a large part of some of these in real time. For example, for applying peak-normalization, it's enough to quickly run through the `AudioBuffer` for the peak level and apply its reciprocal using a `GainNode` in real time. Similarly, a `ConstantSourceNode` can be used to remove the computed DC offset.

This is great! However, "quickly running through" in this case still means noticeable main-thread blocks (around a few hundred milliseconds for a 5-minute audio clip, depending on the device), which is just enough to make it unfeasible.
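A minimal sketch of this real-time normalization approach, assuming a browser environment (`findPeak` and `playNormalized` are illustrative names, not spec APIs; `ctx` is an `AudioContext` you already have). The `findPeak` scan is exactly the part that blocks the main thread on large buffers:

```javascript
// Find the absolute peak across all channels of an AudioBuffer-like
// object (anything exposing numberOfChannels and getChannelData()).
function findPeak(buffer) {
  let peak = 0;
  for (let c = 0; c < buffer.numberOfChannels; c++) {
    const data = buffer.getChannelData(c);
    for (let i = 0; i < data.length; i++) {
      const v = Math.abs(data[i]);
      if (v > peak) peak = v;
    }
  }
  return peak;
}

// Apply the reciprocal of the peak with a GainNode in real time,
// instead of rewriting the samples (sketch only).
function playNormalized(ctx, buffer) {
  const source = new AudioBufferSourceNode(ctx, { buffer });
  const gain = new GainNode(ctx, { gain: 1 / findPeak(buffer) });
  source.connect(gain).connect(ctx.destination);
  source.start();
}
```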
And this is the problem which, to me, seems unnecessarily complicated to get around (get around, because there is no real solution). We can do number crunching, but only at the cost of UX quality (or a severe compromise in performance, as explained below).
Avoiding main-thread blocks
Avoiding the main-thread block is essentially the only problem we need to solve here, while still keeping solid performance on the computations (obviously, we could just get it done as fast as possible and show a CSS-animated "please wait" overlay in the meantime, but that is quite restrictive, especially with the dominantly async nature of JS). Here are a few approaches:
Doing the computations gracefully
All things considered, this is the simplest approach.
Using `requestIdleCallback`, it's possible to do everything on the main thread without blocking, but at the cost of a severe impact on performance (the computations complete much more slowly, because we have to yield to the main thread continuously).

This approach is near-perfect from a UX perspective, but very bad from a performance perspective: a simple audio-level reading pass can take 10 times as long.
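A sketch of such a cooperative scan (the function name and chunk size are illustrative; a `setTimeout` fallback is included for hosts without `requestIdleCallback`):

```javascript
// Fall back to setTimeout with a fake deadline outside the browser.
const rIC = typeof requestIdleCallback === 'function'
  ? requestIdleCallback
  : (cb) => setTimeout(() => cb({ timeRemaining: () => 16 }), 0);

// Scan one channel's samples for the peak in slices, yielding to the
// main thread between slices so the UI stays responsive.
function scanPeakGracefully(data, onDone, chunkSize = 65536) {
  let peak = 0;
  let offset = 0;
  function step(deadline) {
    // Keep processing chunks while the browser reports idle time left.
    while (offset < data.length && deadline.timeRemaining() > 1) {
      const end = Math.min(offset + chunkSize, data.length);
      for (let i = offset; i < end; i++) {
        const v = Math.abs(data[i]);
        if (v > peak) peak = v;
      }
      offset = end;
    }
    if (offset < data.length) rIC(step); // yield, continue later
    else onDone(peak);
  }
  rIC(step);
}
```

The repeated yielding is what makes this slow overall: the work is correct, but it only runs in the gaps the browser hands out.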
Using WebWorkers
This would be the obvious go-to solution, as WebWorkers are precisely for doing these tasks.
However, we can't transfer the `ArrayBuffer` from `getChannelData`, because that's the data used by audio rendering directly. Now we have several options here:

- Copy the `Float32Array` that `getChannelData` returns, gracefully (e.g. using `requestIdleCallback`), and transfer that to the worker thread (or we can just copy to a `SharedArrayBuffer` and use that, if enabled)
- Use `copyFromChannel` to transfer copies of small parts to the worker thread

Unfortunately, both of these require data-copying on the main thread, and because of this, we are almost back to where we were with just doing the computations on the main thread directly, gracefully (of course, copying even gigabytes of data is relatively fast on even mid-range modern devices, so if the computations are very expensive, we can prefer doing a graceful copy on the main thread, transferring the copy, then doing the computations on the worker thread(s) at full power).
Using OfflineAudioContext
Using an `OfflineAudioContext` can be a decent solution, especially if we only have to read data (because it can be done in parts easily, and the actual data-processing can be done on the `AudioWorklet` thread).

If we need to make changes to an `AudioBuffer`, we can either (1) do a full-pass render on it, resulting in double memory use, or (2) still do it in parts and use the deprecated `ScriptProcessorNode`, with which we are back to where we began, doing the computations on the main thread (only the scheduling is taken care of). However, with option 1, we can't efficiently do, say, trimming silence.

That said, for reading, I believe using an `OfflineAudioContext` is currently the best option.

Proposed solution: SharedArrayBuffer natively
I propose a relatively simple change to the spec: `getChannelData` returns a `Float32Array` with a `SharedArrayBuffer` as the backing buffer. This would make it extremely simple to do audio data processing from multiple WebWorker threads.

The rest of the issues associated with `SharedArrayBuffer` (e.g. Meltdown) are not really the concern of the Web Audio API: use it if it's enabled, don't if it's not; the rest is up to that team.

There isn't much more to add to this, really; it's a simple and elegant solution to a complex problem.
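For contrast, here is roughly what code has to do today to get channel data into shared memory — the manual copy that a `SharedArrayBuffer`-backed `getChannelData` would eliminate (illustrative sketch; note that browsers only expose `SharedArrayBuffer` when cross-origin isolation is enabled):

```javascript
// Copy one channel into a SharedArrayBuffer-backed Float32Array so any
// number of workers can read it via postMessage without transferring.
function shareChannel(buffer, channel) {
  const sab = new SharedArrayBuffer(
    buffer.length * Float32Array.BYTES_PER_ELEMENT);
  const shared = new Float32Array(sab);
  shared.set(buffer.getChannelData(channel)); // the copy the proposal removes
  return shared;
}
```

Under the proposal, the `set` copy (and the extra memory it costs) disappears, because the channel data is already shared.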
Closing Words
I'm also very interested to hear how the committee thinks the above-mentioned common use-cases should be handled using the Web Audio API, because at first glance it seems nobody previously took the time to think about them, or the API wasn't designed with them in mind.