
Use SharedArrayBuffer for getChannelData #2446

Open
JohnWeisz opened this issue Aug 3, 2018 · 14 comments

@JohnWeisz

After a lot of experimentation, it seems to me there is an area the Web Audio API doesn't cover as nicely as it could: crunching numbers on AudioBuffer objects -- i.e. when we have a for loop that grinds through a 200 MB Float32Array per channel.

There are several approaches to work around this, but each has quite painful downsides. Before we get to that, I'd like to show a few common use-cases for audio editing applications.

Common use-cases

Here are a few examples which are common in audio editing applications, and more or less require crunching numbers on audio data:

  • Computing the peak level, and applying peak normalization
  • Computing the DC offset, and removing it
  • Trimming silence
  • Encoding to a file format

Now, to clarify, I understand it's possible to do a large part of some of these in real time. For example, to apply peak normalization, it's enough to quickly run through the AudioBuffer for the peak level, and apply its reciprocal using a GainNode in real time. Similarly, a ConstantSourceNode can be used to remove the computed DC offset.

This is great! However, "quickly running through" in this case still means noticeable main-thread blocks (around a few hundred milliseconds for a 5-minute audio clip, depending on the device), which is just enough to make it unfeasible.

And this is the problem that, to me, seems unnecessarily complicated to get around (get around, because there is no real solution). We can do number crunching, but only at the cost of UX quality (or a severe compromise in performance, as explained below).

Avoiding main-thread blocks

Avoiding the main-thread block is essentially the only problem we need to solve here, while still keeping solid performance for the computations themselves. (Obviously, we could just get the work done as fast as possible and show a CSS-animated "please wait" overlay in the meantime, but that is quite restrictive, especially with the dominantly async nature of JS.) Here are a few approaches:

Doing the computations gracefully

All things considered, this is the simplest approach.

Using requestIdleCallback, it's possible to do everything on the main thread without blocking, but at the cost of a severe impact on performance (the computations are completed much slower, because we have to yield to the main thread continuously).

This approach is near-perfect from a UX perspective, but very bad from a performance perspective: a simple audio-level reading pass can take 10 times as long.

Using WebWorkers

This would be the obvious go-to solution, as WebWorkers are meant precisely for tasks like these.

However, we can't transfer the ArrayBuffer from getChannelData, because that's the data used by audio rendering directly. Now we have several options here:

  • We can create a copy of what getChannelData returns, gracefully (e.g. using requestIdleCallback), and transfer that to the worker thread (or we can just copy to a SharedArrayBuffer and use that, if enabled)
  • We can use copyFromChannel to transfer copies of small parts to the worker thread

Unfortunately, both of these require data copying on the main thread, which puts us almost back to where we were when doing the computations directly (and gracefully) on the main thread. Of course, copying even gigabytes of data is relatively fast on even mid-range modern devices, so if the computations are very expensive, we can prefer doing a graceful copy on the main thread, transferring the copy, and then doing the computations on the worker thread(s) at full power.

Using OfflineAudioContext

Using an OfflineAudioContext can be a decent solution, especially if we only have to read data (because it can be done in parts easily, and the actual data-processing can be done on the AudioWorklet thread).

If we need to make changes to an AudioBuffer, we can either (1) do a full-pass render on it, resulting in double memory use, or (2) still do it in parts and use the deprecated ScriptProcessorNode, with which we are back to where we began: doing the computations on the main thread (only the scheduling is taken care of). However, with option 1, we can't efficiently do, say, silence trimming.

As a side-note: allocating an OfflineAudioContext (at least in Chrome) also has a main-thread block associated with it currently, but this is not a spec-issue, just a bad implementation.

That said, for reading, I believe using an OfflineAudioContext is currently the best option.


Proposed solution: SharedArrayBuffer natively

I propose a relatively simple change to the spec: getChannelData returns a Float32Array with a SharedArrayBuffer as the backing buffer. This would make it extremely simple to do audio data processing from even multiple WebWorker threads.

Obviously, this is only worth it if the AudioBuffer is created with backing SharedArrayBuffer in the first place, as otherwise we would just have to do a main-thread copy into the SharedArrayBuffer when calling getChannelData, which is practically useless, as we would be getting the same main thread blocks.

The rest of the issues associated with SharedArrayBuffer (e.g. Meltdown) are not really the concern of the Web Audio API: use it if it's enabled, don't if it's not; the rest is up to that team.

There isn't much more to add to this, really; it's a simple and elegant solution to a complex problem.

Closing Words

I'm also very interested in how the committee thinks the above-mentioned common use-cases should be done using the Web Audio API. Because at first glance, it seems to me that nobody took the time to think about them previously, or the API simply wasn't designed with them in mind.

@JohnWeisz JohnWeisz changed the title Heavy data processing on AudioBuffer -- how exactly is this supposed to be done? Heavy data processing on AudioBuffer could be a lot simpler Aug 3, 2018
@JohnWeisz JohnWeisz changed the title Heavy data processing on AudioBuffer could be a lot simpler Use SharedArrayBuffer for getChannelData Aug 3, 2018
@rtoy (Member) commented Aug 3, 2018

I think we've talked about using SharedArrayBuffers but that won't happen until the next version of the spec begins.

Won't an AudioWorkletNode do the processing you need? If the processing doesn't have to be real-time, can't you use an offline audio context and an AudioWorklet to compute the things you need?

@JohnWeisz (Author) commented Aug 6, 2018

@rtoy As mentioned, OfflineAudioContext does work, but you are at double memory use. If your AudioBuffer is 2GB and you want to analyze that, you need another 2GB for the OfflineAudioContext buffer (assuming an identical sampleRate and channelCount).

If you only have to analyze the AudioBuffer, you can hack your way around this issue by only using a shorter OfflineAudioContext and doing multiple passes, sequentially or in parallel.

It is definitely possible, but is this really how it should be done? It just seems wrong that you need to hack your way around such a fundamental limitation.

@padenot (Member) commented Oct 25, 2018

Why is the audio an AudioBuffer in the first place? Is that because decodeAudioData decodes everything in one go, without being able to produce chunks?

@mdjp mdjp transferred this issue from WebAudio/web-audio-api Sep 17, 2019
@padenot (Member) commented Jun 15, 2020

F2F discussion:

  • This is related to other discussions about SharedArrayBuffer in the Web Audio API, but it's less hard to allow this than for decodeAudioData, because implementations are generally in control of the code that reads the float values (unlike, for example, a closed-source decoder on the OS or hardware used by a particular UA)
  • If the previous assumption holds, implementations would have to audit their code to make sure no critical decision is made based on the value of the audio samples
  • https://github.com/WebAudio/web-audio-api-v2/issues/39#issuecomment-433023879 makes this somewhat less important, because this shortcoming is being addressed and implemented in WebCodecs
  • Other OfflineAudioContext improvements might solve the other part of this

@guest271314

This comment was marked as off-topic.

@JohnWeisz (Author) commented Jul 21, 2020

@guest271314

Does AudioWorkletProcessor solve this?

No. It does enable some offline analysis of audio files, but it's very limited compared to what SharedArrayBuffer would be capable of (or even a wholesale replacement of AudioBuffer).

Four more comments from @guest271314 were marked as off-topic.

@padenot (Member) commented Oct 21, 2020

TPAC 2020:

  • This is useful, but hard. It would expose a third way to create shared memory on the web, and this needs to be gated on the availability of shared memory

@padenot (Member) commented May 19, 2021

AudioWG virtual F2F:

  • Web Codecs is around the corner and allows decoding chunks of audio data, and this can be pipelined with analysis (say, in a worker).
  • OfflineAudioContext progressive rendering is also a solution if native nodes/worklets are used, since one can simply discard the output buffers.

Would that be a satisfactory solution, considering it's quite hard to suddenly expose shared memory that is being used within a Web Codecs implementation?

@JohnWeisz (Author) commented May 19, 2021

It's been some time since I started this thread, but for me personally, the upcoming Web Codecs API (and the fact that it will provide random access to decoded audio data, IIRC) will improve things quite a lot, as it essentially enables streaming audio file processing into memory chunks (which can then, of course, be shared memory chunks).

@rtoy (Member) commented May 19, 2021

And for the record, progressive rendering is #2445. With the spec going to proposed recommendation, we can start adding these to the spec soon.

@mdjp mdjp transferred this issue from WebAudio/web-audio-api-v2 Sep 29, 2021
@mdjp mdjp added this to Untriaged in v.next via automation Sep 29, 2021
@mdjp mdjp moved this from Untriaged to Under Consideration in v.next Sep 29, 2021
@hoch hoch removed the priority-2 label Sep 14, 2022