Implement AudioWorklet #23807
Comments
Interest, yes, but I'm not sure how to make the processing model work safely with spidermonkey and the audio thread. Should be fine with an audio thread, may be tricky if it's a process, which we might eventually do.
Is that the "rendering thread" mentioned in the spec? https://webaudio.github.io/web-audio-api/#rendering-thread And the "control thread" is basically a script-thread then, right?
What kind of process isolation do you have in mind? One rendering thread/process per origin? In that case, a worklet would be created from a "control thread" (script, I guess), whereas the AudioWorkletGlobalScope/AudioWorkletProcessor would run in the same process as the rendering thread (but on a different thread, I guess). So, the interface used from script/control is the AudioWorkletNode, which comes with a port. The interface used "inside" the worklet scope, which could be running in a separate process, is then the AudioWorkletProcessor, which also comes with a port. Those two ports are then the endpoints of a communication channel, which in the process isolation case would have to cross processes. So the good thing about the MessagePort design is that it doesn't assume the two endpoints share an address space; having the ports backed by IPC should be workable.
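To sketch that mental model in Rust terms, with std::sync::mpsc standing in for a cross-process transport like ipc-channel (all type and field names here are hypothetical):

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

// Hypothetical message type flowing between the two ports.
#[derive(Debug)]
enum PortMessage {
    Data(String),
}

// The two endpoints of the node <-> processor channel: in a
// process-isolated design these would be IPC senders/receivers.
struct NodePort { to_processor: Sender<PortMessage>, from_processor: Receiver<PortMessage> }
struct ProcessorPort { to_node: Sender<PortMessage>, from_node: Receiver<PortMessage> }

fn make_port_pair() -> (NodePort, ProcessorPort) {
    let (to_processor, from_node) = channel();
    let (to_node, from_processor) = channel();
    (NodePort { to_processor, from_processor },
     ProcessorPort { to_node, from_node })
}

fn main() {
    let (node_port, processor_port) = make_port_pair();
    // The processor side would live on the rendering thread (or in the
    // rendering process, with mpsc swapped for ipc-channel).
    thread::spawn(move || {
        while let Ok(msg) = processor_port.from_node.recv() {
            println!("processor got: {:?}", msg);
            processor_port.to_node.send(PortMessage::Data("ack".into())).unwrap();
        }
    });
    node_port.to_processor.send(PortMessage::Data("hello".into())).unwrap();
    println!("node got: {:?}", node_port.from_processor.recv().unwrap());
}
```

The point is just that nothing in the port-pair shape assumes a shared address space; swapping the channels for IPC senders/receivers keeps the same topology.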
Ok so I read the spec a bit more: the audio worklet would run on the same "rendering thread" as the base audio context. It's really just a way for script to run a script in that context; it's not an additional background worker (although it runs in the "background" rendering thread). So you do need a separate worklet global scope (a new concept, separate from worker global scope) in which to run the worklet processor code. Having said that, if you were to run the "rendering thread" in a separate process from script (and share it on a per-origin basis with various script processes?), you could use the ports across that boundary. So the fact that the spec uses MessagePort for node/processor communication seems to leave room for that.
Ok thanks for the info.
Why would the worklet ruin introducing a process boundary? Reading the chat, it seems there is a mental model of running the audio worklet as some sort of worker in the script-process. I have a different mental model, where the worklet is just a piece of JS code that is run on the same thread as where the audio processing happens, and not as a worker. The spec has some examples:

```js
// The main global scope
const context = new AudioContext();
context.audioWorklet.addModule('bypass-processor.js').then(() => {
const bypassNode = new AudioWorkletNode(context, 'bypass-processor');
});
```

```js
// bypass-processor.js script file, runs on AudioWorkletGlobalScope
class BypassProcessor extends AudioWorkletProcessor {
process (inputs, outputs) {
// Single input, single channel.
const input = inputs[0];
const output = outputs[0];
output[0].set(input[0]);
// Process only while there are active inputs.
return false;
}
};
registerProcessor('bypass-processor', BypassProcessor);
```

So if you were to move WebAudio to run in a separate process, it would require a few changes.
Then, the spec says to synchronously call the process method of the AudioWorkletProcessor from the rendering loop, so I think that would require a blocking call across whatever boundary ends up between the rendering thread and the worklet's global scope. Also, I was looking at this example https://github.com/GoogleChromeLabs/web-audio-samples/blob/master/audio-worklet/design-pattern/shared-buffer/shared-buffer-worklet-processor.js and it does seem that you need to run an event-loop for the worklet global scope: see for example how that processor receives messages on its port. So I think you might have to run a separate thread for the worklet global scope.
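A rough sketch of that, with an mpsc channel standing in for whatever boundary separates the rendering thread from the worklet's global scope (names hypothetical; the real thing would call into SpiderMonkey where the comments indicate):

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

// Hypothetical task types handled by the worklet global scope's event
// loop: synchronous `process` calls and ordinary port messages.
enum WorkletTask {
    Process { input: Vec<f32>, reply: Sender<Vec<f32>> },
    PortMessage(String),
}

fn main() {
    let (to_scope, scope_inbox) = channel::<WorkletTask>();

    // One thread running the worklet global scope's event loop.
    thread::spawn(move || {
        while let Ok(task) = scope_inbox.recv() {
            match task {
                // Would call into SpiderMonkey: processor.process(...)
                WorkletTask::Process { input, reply } => { let _ = reply.send(input); }
                // Would dispatch a `message` event on processor.port
                WorkletTask::PortMessage(data) => println!("port message: {}", data),
            }
        }
    });

    // A port message from script and a blocking process() call from the
    // rendering thread end up interleaved on the same event loop.
    to_scope.send(WorkletTask::PortMessage("set-gain 0.5".into())).unwrap();
    let (reply_tx, reply_rx) = channel();
    to_scope.send(WorkletTask::Process { input: vec![0.0; 128], reply: reply_tx }).unwrap();
    assert_eq!(reply_rx.recv().unwrap().len(), 128); // blocks until `process` ran
}
```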
The process boundary is kinda useless if we're allowing arbitrary JS into the audio process. I'm wary of letting gstreamer and spidermonkey share a process. It doesn't have to be the same process as script, but we should be careful about mixing it with gstreamer. But yes, this is one option! All of the red arrows in the diagram I drew are things which would ideally be process boundaries. We need to get rid of some and consider the trade-offs, this is an acceptable choice.
Yes I agree. If I understand it correctly, we currently have one instance of the media backend per process (Line 927 in 8328763).
So one could imagine having one such instance, and process, per origin. What would be inside one such "servo-media process" for a given origin?
So you could run one thread per audio context inside such a process. In terms of actually plugging the running worklet into the overall processing of the node graph, I think you could try something like having the rendering thread make a blocking call into the worklet for each render quantum. Where would those worklet global scopes live? It could be in a dedicated thread of the same per-origin process. You could also re-consider the current implementation of the backend in that light. So the constellation would store a kind of "control sender" to each running media process, but a script for a given origin would store a direct ipc-sender to its own audio context. One could imagine the below workflow:

```js
// The main global scope
const context = new AudioContext();
context.audioWorklet.addModule('bypass-processor.js').then(() => {
const bypassNode = new AudioWorkletNode(context, 'bypass-processor');
});
```
Come to think of it, what's kinda interesting is that each BaseAudioContext comes with exactly one AudioWorklet.
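To make the workflow above concrete, a minimal sketch with mpsc channels standing in for the IPC plumbing (the message shapes and per-origin keying are hypothetical):

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::thread;
use std::time::Duration;

// Hypothetical messages; mpsc stands in for ipc-channel across processes.
enum ControlMsg { CreateContext { reply: Sender<Sender<RenderMsg>> } }
#[derive(Debug)]
enum RenderMsg { AddModule(String) }

fn main() {
    // Fake "servo-media process" for one origin.
    let (control_tx, control_rx) = channel::<ControlMsg>();
    thread::spawn(move || {
        while let Ok(ControlMsg::CreateContext { reply }) = control_rx.recv() {
            // One rendering thread per audio context.
            let (render_tx, render_rx) = channel::<RenderMsg>();
            thread::spawn(move || {
                while let Ok(msg) = render_rx.recv() { println!("rendering: {:?}", msg); }
            });
            let _ = reply.send(render_tx);
        }
    });

    // The constellation stores one "control sender" per origin...
    let mut media_processes: HashMap<String, Sender<ControlMsg>> = HashMap::new();
    media_processes.insert("https://example.org".into(), control_tx);

    // ...but script ends up with a direct sender to its rendering thread.
    let (reply_tx, reply_rx) = channel();
    media_processes["https://example.org"]
        .send(ControlMsg::CreateContext { reply: reply_tx }).unwrap();
    let direct_to_rendering = reply_rx.recv().unwrap();
    direct_to_rendering.send(RenderMsg::AddModule("bypass-processor.js".into())).unwrap();
    thread::sleep(Duration::from_millis(50));
}
```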
Right, currently we don't have a media-script process boundary, but it was designed with a plan to add such a process boundary eventually.
This can work. It's unclear if the performance impact would be high. OTOH, I'm not sure if we want to run JS code and gstreamer in the same process, ever, and those are really the only two options we have :)
It seems in Chromium the audio worklet is run on a separate thread, in the same process as the "main" script-thread. So that is actually the opposite of what I had in mind. And this approach seems to have created some security issues.
So, in the light of the security issues related to abusing the worklet, it might indeed not be a good idea to run the worklet's JS in the same process as the audio backend. And it might still be a good idea to run the worklet, on a per-origin basis, in a separate process from the main content-process, using its own JS runtime, so as to isolate the GC. So basically making audio worklets "out-of-process" like service- and shared-workers. So you'd end up with:

1. The native audio processing, per origin.
2. The audio worklets for that origin.
3. The main content-process(es).
Alternatively, 1 and 2 are combined into one process per origin, with 2 running on "lower priority" threads than the native audio code (but that means JS code is running in the same process as gstreamer). It might sound like a lot of processes, but it could still result in more re-use of audio instances, since you'd be sharing those per origin, as opposed to having one per content-process (which is sharing one audio instance per browsing context group). Or we just run the worklet inside the content-process. I personally like the idea of the constellation starting to track media processes per origin. So the difference with Chromium is that they put the worklet thread in the content-process, whereas this would isolate it. Three arguments I can find in favor of making the worklet out-of-process: GC isolation, security, and per-origin re-use.
Although 2 has less weight if we were to run the worklet "in-process" in script, while still isolating the audio backend in a process. In such a setup the constellation could be used to initialize the backend and set up the ipc with script, and each new worklet would just have to be hooked up when created, perhaps even bypassing the constellation if script has a direct line of communication with audio. But that would mean worklet-backend communication would go through the script-process, whereas if we isolate the worklet, it would involve backend to worklet direct communication (over ipc), bypassing script for the processing part.
Less convinced by this, but if you feel it would be useful!
The paint worklet does a lot of jumping through hoops to make sure that GC is never run by the worklet. (There are three worklet threads, each running its own SM instance, and when the active worklet detects GC pressure, it swaps itself out.) This depends on paint worklets being stateless though; if audio worklets are allowed to be stateful then this approach is out. They can still run on a separate thread from the main script thread though, so at least GC on the main script thread won't pause the worklet.
I'm not sure, starting to tend towards running the worklet in script. My initial understanding from reading the spec was that the intent of the worklet was really to run JS code on the audio thread (or at least right next to it in the same process, in the case of process isolation of audio), with the goal being making that JS code run "on par" with the native audio processing code. It's sort of what the spec says pretty explicitly.
But if in practice people run the worklet in a thread in the content-process, and then I assume make some kind of blocking call into that thread from the rendering thread, that goal seems compromised. Actually I think the spec is not very clear about how calling process is supposed to work in that setup. I've asked a question at WebAudio/web-audio-api#2008
Yes, not only are they stateful, they also come with a port for messaging with their node. See for example the shared-buffer example linked above, for the interplay between the processor and its port.
That's the idea I am slowly starting to adopt too. However, if we run audio in a separate process, then that means the audio "rendering loop" will have to IPC to script at each render quantum to make a blocking call into the worklet's process method.
That seems like a lot of overhead for something that is supposed to happen "continuously". And even if we run audio in the content-process, like is done now, you'd still have to wait for the worklet thread to handle your message and call into process. Unless you somehow "pause" the event-loop of the worklet global scope and run process directly from the rendering thread. The spec contains wording to that effect (the rendering algorithm invokes process synchronously).
Ok sorry for driving everyone crazy, but I have one more "idea":
Basically, you could break up the current WebAudio implementation in two: keep the "rendering thread" (the part driving the graph processing) in the script-process, and move the actual audio output into its own process. (While the two communicate per render quantum over IPC.) So at that point you could run a worklet processor just like any other native audio-node: it would happen on the script-process, in the rendering thread, and we'd find a way to run the event-loop of the worklet global scope interleaved with the rendering loop on that thread. So that way no JS is running alongside gstreamer, yet you avoid cross-thread/process stuff between the rendering thread and a worklet processor. So this could be done in two steps:

1. Split the backend into its own process, communicating with the rendering thread over IPC.
2. Implement the worklet on the rendering thread itself (sketched below).
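A toy sketch of the interleaving from step 2, purely illustrative; the native-node and process() steps are placeholders for calls into the DSP code and SpiderMonkey:

```rust
use std::collections::VecDeque;

// Hypothetical single-threaded interleaving of the rendering loop and the
// worklet global scope's event loop: one thread, no hops in between.
fn main() {
    // Tasks queued for the worklet global scope (port messages, etc.).
    let mut scope_tasks: VecDeque<String> = VecDeque::new();
    scope_tasks.push_back("dispatch port message".to_string());

    let mut buffer = vec![1.0f32; 128];
    for quantum in 0..3 {
        // 1. Process the native nodes of the graph for this quantum.
        for sample in buffer.iter_mut() { *sample *= 0.5; } // e.g. a GainNode
        // 2. Invoke the worklet processor's process() directly -- same
        //    thread, so no blocking cross-thread call is needed.
        for _sample in buffer.iter() {} // "bypass" processing
        // 3. Drain the global scope's pending tasks before the next
        //    quantum, interleaving its event loop with the rendering loop.
        while let Some(task) = scope_tasks.pop_front() {
            println!("quantum {}: {}", quantum, task);
        }
        // 4. Ship `buffer` to the audio-output process over IPC.
    }
}
```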
In general, you cannot have context switches in this whole situation (they are too long and bring non-determinism, and that causes glitches, and that's not acceptable). In the diagram that @Manishearth drew above, it's necessary to have the native Web Audio API nodes run synchronously alongside the worklet processors, on the same thread. It's customary to then ship the full rendered audio buffer to another process to hand it off to the system. You roughly have a couple milliseconds to do all of the above, if the system is not loaded too high. I'm always available to chat (Paris time); this is being implemented in Gecko at the minute, and spidermonkey has a couple peculiarities that we'd have loved to know about before starting this.
Thanks! Yes it would be good to chat. Perhaps we could organize something with a few people? cc @Manishearth @asajeffrey @ferjm
Ok, so that means the rendering thread and worklet global scope need to run in the same thread, and their event-loops (rendering loop + event-loop of the global scope) need to be interleaved on that same thread.
Ok so then we'd have to split our current WebAudio implementation, put the actual backend in a separate process, while keeping the rendering thread in the script/content process (and have it run any worklets as well). I've tried to express this potential change in the diagram:
I've looked at the current worklet code, mostly used for the paint worklet, and I think we can base the audio worklet on it. cc @asajeffrey It's a bit of a pity the worklet would not live on the same thread as the audio rendering thread, however I think using the threadpool and the gc/script-loading mechanism it brings is worth it, and I'm not sure we could run the worklet as a whole on the audio rendering thread anyway. One thing I was wondering about is how we could call the process method of a registered processor from outside the worklet. Then the rendering thread, when encountering a worklet node as part of the rendering graph, would simply need to make a blocking call to the worklet thread pool and instruct it to run a task which would call the process method of the corresponding processor. Since audio processors (the JS class registered to run on the worklet) have internal state, I do wonder if we would need additional plumbing to replicate that state correctly across the worklet pool, since the threads switch roles and we'd want any new "primary" to always be up-to-date...
Maybe for this we can just put a mutex around the node-name-to-processor-constructor map that contains all the processors (similar to paint classes in the paint worklet), and share that with all threads in the pool, with the understanding that the mutex will not be contended since it will only be used by the primary?
> Maybe for this we can just put a mutex around the node-name-to-processor-constructor map that contains all the processors (similar to paint classes in the paint worklet), and share that with all threads in the pool, with the understanding that the mutex will not be contended since it will only be used by the primary?

So whereas the constructors could be shared that way, the per-processor state would still have to be replicated somehow.
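A minimal sketch of that shared registry, with a plain function pointer standing in for the JS processor constructor:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for a JS processor constructor.
type ProcessorConstructor = fn() -> String;

fn bypass_ctor() -> String { "BypassProcessor".into() }

fn main() {
    // Shared node-name -> processor-constructor map: every thread in the
    // worklet pool holds a clone of the Arc, but in practice only the
    // current "primary" thread ever locks it.
    let registry: Arc<Mutex<HashMap<String, ProcessorConstructor>>> =
        Arc::new(Mutex::new(HashMap::new()));

    registry.lock().unwrap().insert("bypass-processor".into(), bypass_ctor);

    let handles: Vec<_> = (0..3).map(|i| {
        let registry = Arc::clone(&registry);
        thread::spawn(move || {
            // Whichever thread is primary looks up the constructor by name.
            let ctor = *registry.lock().unwrap().get("bypass-processor").unwrap();
            println!("thread {} would construct {}", i, ctor());
        })
    }).collect();
    for h in handles { h.join().unwrap(); }
}
```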
The problem is that Spidermonkey keeps internal JS state, so even if you could replicate the worklet state, the JS state would get out of sync. So I think we'd need a single-threaded implementation of audio worklets.
It's the opposite: you really want to run this on a single thread, and certainly not on a thread pool (also you can't if you're using SM, as Jeffrey indicates above, because SM uses TLS). In general, multi-threaded real-time audio is a bad idea. Having thread/process hops in between the audio rendering thread and the worklets means that it won't be possible to have it work reliably (it will work for small work loads on a fast machine, maybe).
@padenot Ok, thanks for the info. In Gecko, are you implementing the worklet global scope on the rendering thread itself? I'm wondering how you go from processing the "native" audio graph on the rendering thread, to calling into the JS process method of a worklet processor.
@padenot it sounds like there's a tension between security (which would indicate running the worklet code in the same process as the user content) and low-latency (which would indicate running the worklet code in the main process). Is the plan to resolve this by having a dedicated audio process?
For now, we're calling into SM from the render thread and that's it. The worklet code is user content, so I don't see the difference it makes compared to a normal script? The preferred architecture is to remote the audio system calls (and thus the audio callbacks) to a privileged process, as mentioned in #23807 (comment), and to forward the real-time callbacks using a synchronous IPC call. This way, there are only two context switches. This has been measured to be acceptable even under heavy load if the respective thread priorities are set to prevent priority inversions.
It sounds like we essentially need two changes:

1. Remote the audio system calls to a privileged process.
2. Implement AudioWorklet, calling into SM from the render thread.
I think this issue is mostly about 2, and is also mostly orthogonal to 1, which should happen anyway regardless.
@padenot One more question: on what basis do you make the privileged audio backend available to content? Is there one privileged process per origin, one for the entire UA, or something else?
The content process (where the JS runs, and also the DSP code in C++ that backs the native audio nodes) is sandboxed. The system calls required to open and run an audio stream don't work there. The audio stream is therefore opened in the parent process, which has the capability to open an audio stream. When a real-time callback is called in the parent process, we make a synchronous IPC call to the content process, and in particular to a specific thread in the content process that has been prioritized appropriately. The DSP code runs there. When the correct number of frames has been processed, this synchronous IPC call returns. Any content process can open an audio stream remotely in the parent.
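If I understand correctly, the round-trip could be sketched like this, with an mpsc channel standing in for the synchronous IPC between the parent's real-time callback and the prioritized content-process thread (names hypothetical):

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

// Hypothetical message; mpsc stands in for synchronous IPC between the
// parent (audio-stream owner) and the content process's DSP thread.
struct RenderRequest { frames: usize, reply: Sender<Vec<f32>> }

fn main() {
    let (to_dsp, dsp_inbox) = channel::<RenderRequest>();

    // Stand-in for the appropriately prioritized thread in the content
    // process where the DSP (and worklet JS) actually runs.
    thread::spawn(move || {
        while let Ok(req) = dsp_inbox.recv() {
            let rendered = vec![0.0f32; req.frames]; // render the graph here
            let _ = req.reply.send(rendered);
        }
    });

    // Stand-in for the real-time audio callback in the parent process:
    // one synchronous round-trip (two context switches) per callback.
    for _ in 0..3 {
        let (reply_tx, reply_rx) = channel();
        to_dsp.send(RenderRequest { frames: 128, reply: reply_tx }).unwrap();
        let buffer = reply_rx.recv().unwrap(); // blocks until DSP is done
        assert_eq!(buffer.len(), 128);
    }
}
```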
@padenot thank you. Ok so I think this translates to running gstreamer in the "main" process, alongside the constellation, or we could put it in a separate process. In both cases we could have the constellation do the initial setup when an audiocontext is created by content, followed by setting up a direct ipc channel between the rendering thread in the content process, and the audio backend.
We really do not want to run gstreamer in the main process, it should be in its own process.
It does not really matter where the audio stream resides, as long as the context switches are kept to an absolute minimum (2 per audio callback).
I think we can do that by setting up a direct ipc channel between the audio backend and the audio rendering thread, using the constellation only for initial orchestration when content creates an audio-context. We would also probably need for the ipc to happen without an ipc-router thread, since that adds a context-switch per message. We could do that if we made the control-thread (the script-thread that created the audio context) also communicate with the rendering thread over ipc, even though they are in the same process. (The only reason we need a router thread is when we need to integrate ipc-messages with other multi-threaded messages in a single select.)
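For example, with the actual ipc-channel crate (plus serde for the derives; the message shapes here are made up), the control thread and the remote backend can both feed the one receiver the rendering thread blocks on:

```rust
use ipc_channel::ipc;
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug)]
enum RenderingMsg {
    Control(String),      // from the control (script) thread, same process
    RenderQuantum(usize), // from the audio backend process
}

fn main() {
    let (tx, rx) = ipc::channel::<RenderingMsg>().unwrap();
    let control_tx = tx.clone(); // would stay in the script thread
    let backend_tx = tx;         // would be shipped to the backend process

    control_tx.send(RenderingMsg::Control("createNode".into())).unwrap();
    backend_tx.send(RenderingMsg::RenderQuantum(128)).unwrap();

    // The rendering thread blocks on rx.recv() directly: no router
    // thread, so no extra context switch per message.
    for _ in 0..2 {
        println!("rendering thread got: {:?}", rx.recv().unwrap());
    }
}
```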
@Manishearth I can understand how we might be able to run gstreamer audio in its own process, but doesn't gstreamer video need to be in the main process?
Does it? I thought we need to ship frames over to webrender anyway? But yeah if gstreamer is forced to be in the main process anyway we can ignore this. I'd really like for it to be elsewhere, though.
@Manishearth depends on whether we can share textures between processes.
Ok I'm interested in working on this, not right now but let's say fall/winter. I propose the following outline of major work items, for your consideration:

1. Split the current WebAudio implementation: keep the rendering thread in the content process, and move the audio backend (and gstreamer) into its own process, with a direct ipc channel between the two.
2. Implement AudioWorklet itself, running on (or next to) the rendering thread in the content process.
I think doing 1 is going to require some changes across interfaces; I can't provide an exhaustive overview upfront, but here are some examples/thoughts:
I think we can replace the current in-process message-passing between script and the audio backend with IPC equivalents. The "backend" part of servo-media would then live in its own process and own the actual audio sink. We wouldn't need a router thread in between, since the rendering thread could block on a single IPC receiver. Here is what a render quantum would look like: the sink requests a given number of frames, the rendering thread processes the graph (including any worklet processors), and ships the resulting samples back to the sink.
Note that the rendering thread would also have to handle control messages, and messageport messages, coming from the "control thread", the script-thread that started using audio. I think we probably want to make those messages go over IPC as well, even though control and rendering are in the same process, just so that we can cut out the IPC router thread from the equation for both control and data messages, while running them using the same IPC-driven event-loop. The audio sink on the backend would be a gstreamer-based one. Please let me know what you think.
We could also have some fun and use shared-memory for the buffer of samples shipped at each render quantum.
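For instance, with ipc-channel's IpcSharedMemory (a real type; note that segments are created from a byte slice and are immutable once created, so a proper reusable ring buffer would need more than this sketch shows):

```rust
use ipc_channel::ipc::{self, IpcSharedMemory};

fn main() {
    let (tx, rx) = ipc::channel::<IpcSharedMemory>().unwrap();

    // Rendering-thread side: pack one quantum of f32 samples into a
    // shared-memory segment (a fresh segment per quantum in this sketch).
    let samples = vec![0.25f32; 128];
    let bytes: Vec<u8> = samples.iter().flat_map(|s| s.to_ne_bytes()).collect();
    tx.send(IpcSharedMemory::from_bytes(&bytes)).unwrap();

    // Backend side: the receiver maps the segment rather than copying the
    // whole payload through the channel itself.
    let shm = rx.recv().unwrap();
    let first = f32::from_ne_bytes([shm[0], shm[1], shm[2], shm[3]]);
    assert_eq!(first, 0.25);
}
```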
Good read https://padenot.github.io/web-audio-perf/, contains some info on architecture and perf considerations, dated by some years and not covering worklets however...
https://webaudio.github.io/web-audio-api/#audioworklet
Any interest in this? @ferjm @Manishearth
It's using MessagePorts, coming soon to a neighborhood near you. #23637
Although this might be one of those "implicit ports" which can more easily be implemented with a channel (like in dedicated workers).