Audio Workers #113
Work in progress. See http://www.w3.org/2011/audio/track/actions/26
We haven't been clear enough on this. What we want is for JavaScript processing to happen only in workers. Doing anything in the same context where mouse and keyboard events are processed, and where scripts can easily be blocked for hundreds of milliseconds by layout reflows, is simply a no-go.
I completely agree that workers are a better approach for lower latency and smaller buffer sizes. But there is a cost to the developer in being required to use a web worker, because the JavaScript state is completely isolated from the main JS thread. It will thus require more complex code, and some applications might not even be practical. Some developers have expressed concerns about JavaScriptAudioNode processing happening only in workers.
While I agree that workers are slightly more cumbersome to work with than regular callbacks, I think that there are some risks with supporting both methods:
While I agree that it's not a good idea to do time-critical heavy lifting like audio on the main thread, sometimes there isn't much choice: emulators and other virtual machines, ports of existing code (porting is more feasible, but can be very difficult when the original code shares a lot of state with the audio code), and so on. These kinds of programs are quite challenging to write as it is. I don't think a few bad eggs should make those developers' lives even harder by forcing them into expensive tricks like sending the audio data to the worker with postMessage and maintaining the callback system themselves. I always find these discussions about preventing bad practices a bit frustrating: people will make bad choices no matter how well we design things. That doesn't mean we shouldn't try to design things so that developers aren't tempted to do bad things, but actively making real use cases harder just to keep some people from making stupid decisions is counter-productive, IMHO. Give them the rope: some will hang themselves, some will build a bridge.
(In reply to comment #5)
That's not how a worker-based AudioNode would work; it would be a callback in the worker that can read directly from the input and write directly to the output. There are things on the main thread that are not interruptible (layout and event handlers being the most obvious), so it's only luck if one is able to run the callback often enough. I can't speak for any other implementors, but I'm fairly certain it would fail horribly in Opera, as other pages running in the same process can't be expected to write code that avoids long-running scripts or expensive re-layouts.
Exactly. And if the audio processing takes place in the main thread, you have no way of knowing when the callbacks in the worker occur. Hence you have to devise your own callback system to sync with the one going on in the worker, and send data over to the worker using postMessage, which is a very inefficient solution for a case that's already very vulnerable. Not to mention that it's difficult to implement without ending up with weird edge-case race conditions.
Of course it's impossible to predict what's going on in other pages, but that applies to drawing and other things as well; to achieve the best results, users have to close other tabs unless the browser runs different tabs on different threads. But I beg to differ that Opera would fail horribly. In my sink.js [1] (a library to allow raw cross-browser access to audio), I have a fallback using the audio tag; you can take a look at an example of how it runs on Opera here [2] (a demo I made for last Christmas). The result is bearable, even though WAV conversion and data URI conversion suck the CPU dry. There are glitches every 0.5 s or so due to switching the audio tag, but that's only because the onended event triggering the next clip fires a significant time after the audio has actually finished playing. [1] https://github.com/jussi-kalliokoski/sink.js
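To make the cost of the manual hand-off concrete, here is a minimal sketch of the pattern being discussed: the main thread generates samples and feeds a FIFO, while a separately timed callback drains fixed-size blocks and must emit silence when starved. The AudioFifo class and its method names are invented purely for illustration of why this pattern is fragile.

```javascript
// Hypothetical sketch of the manual main-thread-to-worker hand-off:
// a producer fills a FIFO, a consumer drains fixed-size blocks.
// If the producer is blocked (e.g. by a long reflow), the consumer
// underruns and has to output silence, which is an audible glitch.
class AudioFifo {
  constructor() {
    this.samples = [];
  }
  push(block) {
    this.samples.push(...block); // producer side (main thread)
  }
  pull(blockSize) {
    if (this.samples.length < blockSize) {
      // Underrun: return silence without consuming the partial data.
      return new Array(blockSize).fill(0);
    }
    return this.samples.splice(0, blockSize); // consumer side (callback)
  }
}

const fifo = new AudioFifo();
fifo.push([0.1, 0.2, 0.3, 0.4]);
const block = fifo.pull(2);   // enough data: [0.1, 0.2]
const starved = fifo.pull(4); // only 2 samples left: silence
```

In a real page the producer and consumer would live on different threads and the blocks would cross via postMessage, which adds copying cost and exactly the timing races described above.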
(In reply to comment #7)
The solution is to not do audio processing in the main thread and to post the state needed to do it in the worker instead. This seems trivial to me, do you have a real-world example where it is not?
That's really cool, but not at all the same. If you generate 500 ms chunks of audio, blocking the main thread with layout for 100 ms is not a problem. With the current JavaScriptAudioNode the block size can go as low as 256 samples, which is only about 5 ms at 48 kHz. Never blocking the main thread for more than 5 ms is not a guarantee we can make.
To get an idea of how long a layout reflow can take, visit http://www.whatwg.org/specs/web-apps/current-work/, wait for it to load, and then run: javascript:alert(opera.reflowCount + ' reflows in ' + opera.reflowTime + ' ms')

On my very fast developer machine, the results are: 22 reflows in 19165.749624967575 ms. That means that in the best case (all reflows took the same amount of time) the longest reflow was 871 ms.

With https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html I got these results: 7 reflows in 79.2522840499878 ms. Unfortunately the longest reflow isn't exposed, but even if it's "only" 12 ms, it's not hard to go way beyond that on a slower machine, like a smartphone.
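The 871 ms figure follows directly from the numbers quoted: dividing total reflow time by reflow count gives the average, which is a lower bound on the longest single reflow.

```javascript
// Checking the arithmetic from the Opera reflow counters quoted above:
// 22 reflows totalling ~19166 ms average out to ~871 ms each, so the
// longest reflow must have taken at least that long.
const totalMs = 19165.749624967575;
const reflows = 22;
const avgMs = totalMs / reflows;
console.log(Math.round(avgMs)); // 871
```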
(In reply to comment #8)
No, I'm afraid I don't, and there probably aren't too many around (anymore; developers know better these days). But the emulator case still stands: there are already emulator environments written in JS. [1] [2] [3] [4]
Obviously developers need to adjust their buffer sizes to work in the main thread. Buffer sizes of 256 samples in the main thread with JS are (for now) a bit unrealistic, given the complexity and threading that comes with this API, and they currently fail in Chrome as well (although I'm not sure why; the CPU usage is only 2% or so, so I suppose it's a thread communication latency issue). In fact, any buffer size under 2048 makes the JSNode glitch horribly. If the developer expects her application to work on a mobile phone as well, she'll have to adjust that buffer size further. Indeed, I once proposed that the buffer size argument of the JSNode be made optional, so that the browser could make a best approximation of what kind of buffer size a given setup could handle. [5] It's not like we can prevent people from doing their audio processing in the main thread. What we can do, however, is give them proper tools to do it in a way that is minimally disruptive to the user experience. [1] http://fir.sh/projects/jsnes/
Not quite the same thing as audio processing, but we're trying to limit [...] Similarly, JS audio processing is guaranteed to fail on the main thread [...]
(In reply to comment #11)
I agree it should, but I don't think it will. What should an emulator/VM developer do? Render off the main thread as well? The MSP API would have been a perfect fit for that use case, given its ability to process video as well... Analyzing the byte code of those existing games and other programs and isolating the audio code into another thread doesn't sound very feasible.
(In reply to comment #12)
There has been a lot of discussion about a Worker-accessible canvas API and I expect one to be created fairly soon. Then we'll have a solution for VMs and emulators that really works. I agree that providing main-thread JS audio processing is just setting authors up to fail.
(In reply to comment #12)
I guess it depends a lot on what kind of system you want to emulate, and to what extent you need CPU-cycle-exact state coherence (e.g. do you want to emulate a few popular games, or do you want to implement a fully functional virtualization of a machine?).

For instance, for implementing a SID chip for a C=64 emulator, I'd imagine that you can simply post time-stamped messages from the main thread to an audio worker (possibly batched as a time-stamped command buffer per frame or whatever), where the worker implements all SID logic. A similar solution should work for the NES sound HW too, I guess. For emulating the Paula chip on the Amiga, you'd have more problems since it uses DMA for accessing CPU-shared memory. On the other hand, I think you should be able to come quite far by setting up a node graph that effectively emulates the Paula chip, using automation for timing etc, eliminating a lot of the problems that you would otherwise have with a 100% JS-based mixer.

In any event, this does strike me as a show-stopper for having audio processing in workers. Especially given that machine emulators are usually quite odd, both in terms of architecture and actual use cases.
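The time-stamped command buffer idea could look roughly like this. The ChipScheduler class and its register names are invented for illustration and stand in for a real SID/Paula model; in practice the command array would cross to the worker via postMessage.

```javascript
// Hypothetical sketch of a time-stamped command buffer: the emulated CPU
// (main thread) batches register writes with timestamps in sample frames,
// and the worker-side chip model applies each write once rendering
// reaches its time, keeping the audio sample-accurate without sharing
// state across threads.
class ChipScheduler {
  constructor() {
    this.commands = [];  // pending register writes, sorted by time
    this.registers = {}; // current chip register state
    this.frame = 0;      // rendering position, in sample frames
  }
  post(commands) {
    this.commands.push(...commands);
    this.commands.sort((a, b) => a.time - b.time);
  }
  render(frames) {
    const end = this.frame + frames;
    // Apply every write whose timestamp falls inside this render block.
    while (this.commands.length && this.commands[0].time < end) {
      const cmd = this.commands.shift();
      this.registers[cmd.register] = cmd.value;
    }
    this.frame = end; // (a real model would also synthesize audio here)
  }
}

const chip = new ChipScheduler();
chip.post([
  { time: 0,   register: 'freqLo', value: 0x25 },
  { time: 128, register: 'gate',   value: 1 },
]);
chip.render(64);  // applies only the write at t=0
chip.render(128); // now reaches t=128 and applies the gate write
```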
(In reply to comment #15)
As Philip pointed out: should read "this does NOT strike me as a show-stopper" ;)
(In reply to comment #16)
All right, I'm running out of points to defend my case, especially since I don't have a horse of my own in the race. :) And if it's one or the other, I prefer that audio processing be done only in workers rather than the main thread (obviously), but I still think it'd be wise to have both.
(In reply to comment #17)
I agree with Jussi. Quite honestly it's a lot simpler for developers to have access to the complete JS state while doing the processing. If people are willing to work with larger buffer sizes, then quite reasonable things can be done in the main thread.
If so, then "larger buffer sizes" should be a hard requirement. On my fairly powerful desktop computer a layout could block for at least 871 ms. The closest power-of-two buffer size at 48 kHz is 65536 samples, i.e. over a second. With that amount of latency it doesn't seem very useful.
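Spelling out the arithmetic behind that claim: surviving an 871 ms stall at 48 kHz requires a buffer of at least 871/1000 × 48000 samples, rounded up to the next power of two.

```javascript
// Buffer size needed to ride out an 871 ms main-thread stall at 48 kHz,
// rounded up to the next power of two as typical audio APIs require.
const stallMs = 871;
const sampleRate = 48000;
const minSamples = Math.ceil((stallMs / 1000) * sampleRate);  // 41808
const bufferSize = 2 ** Math.ceil(Math.log2(minSamples));     // 65536
const latencySeconds = bufferSize / sampleRate;               // ~1.37 s
```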
(In reply to comment #19)
What? Why would it be a hard limit? Hard limits aren't very future-friendly. Should setTimeout have a minimum timeout limit of 871ms as well? Or requestAnimationFrame? Developers have to be conscious about performance and avoiding layout reflows anyway, why should this API be any different? |
(In reply to comment #20)
One problem is that developers don't control all the pages that might possibly be sharing the same thread. No browser puts every page on its own thread. So even if you write your page perfectly, you're still vulnerable to latency caused by poorly-written pages sharing your main thread. |
(In reply to comment #20)
I'd also like to add to this discussion that you can't really compare glitches in graphics/animation to glitches in audio. In general we (humans) are much more sensitive to glitches in audio than to frame drops in animation. You can usually get away with a 100 ms loss in an animation every now and then, but you can't as easily get away with a 1 ms glitch in your audio. Most systems (DVD, DVB etc) prioritize audio over video. This can be seen when switching channels on some TV boxes, for instance, where video stutters into sync with the continuous audio; it's hardly noticeable, but it would be horrible if it were the other way around (stuttering audio). In other words, an audio API should provide continuous operation even under conditions where a graphics API fails to do so.
(In reply to comment #21)
Of course, but this argument is just as valid against having an audio API at all, after all the developer can't anticipate what else is running on the user's computer aside from the browser. For all the developer knows, the API might be running on a mobile browser with all cores (or maybe just one) busy. Throwing more threads at it doesn't necessarily solve the problem of not being able to anticipate all situations. |
(In reply to comment #22)
Yes, this is why it is preferable to run audio in a real time / priority thread where possible, but it's not always possible, maybe due to the system or the nature of the application. |
(In reply to comment #23)
True, but there is a significant difference between running several threads on a single core (preemptive scheduling should give any thread CPU quite often), and running several pages in a single thread (a callback may have to wait for seconds). |
(In reply to comment #25)
Still, no matter what we do, in some cases audio will not work as expected and will miss refills, and there's nothing we can do about it. What we can do, however, is provide the proper tools to handle these situations, including main-thread audio processing that doesn't have to resort to manually transferring audio to a worker (or graphics to the main thread), because that's even more expensive and likely to fail.
(In reply to comment #26)
I definitely favor helping developers "do the right thing", so I also prefer focusing exclusively on JavaScriptAudioNode processing in workers (instead of the main thread). Regardless, we all seem to agree that support for doing processing in workers needs to be added. If we do decide to support both, I suggest requiring some sort of explicit opt-in for running on the main thread; that way developers would at least be less likely to gravitate to it by default.
(In reply to comment #26)
With JS audio produced in Workers, the browser should be able to make audio work reliably in any situation short of complete overload of the device. With JS audio on the main thread, audio will start failing as soon as a badly-behaving page happens to share the same main thread as the audio page. That is a huge difference.
For use-cases such as "run an emulator producing sound and graphics", the best solution is to provide Worker access to canvas as well as audio. Then you can have reliable audio and a reliable frame rate as well. Are there any other use-cases that are problematic for audio production in Workers?
(In reply to comment #29)
If Workers get access to canvas, at least I can't think of a (valid) reason why anyone would want to process audio in the main thread. :) |
On 4/1/15 the group agreed on adopting the 2nd approach outlined in Chris Wilson's draft of the AudioWorker specification, in which AudioContext.createAudioWorker() returns a Promise, which resolves after the script is loaded and has run. When the Promise has succeeded, the AudioWorker is ready to use with no script-induced delays. All node configuration can take place within the worker script. The onnodecreate and onaudioprocess event handlers are shared among all node instances; the passed-in event in both cases has a "node" property that refers to an audio-worker-side node handle object.
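As a rough illustration of that flow, a hedged sketch follows. Only the Promise-returning createAudioWorker(), the shared onnodecreate/onaudioprocess handlers, and the event's "node" property come from the adopted draft as described above; the script name, the factory object the Promise resolves with, and its createNode() method are assumptions for illustration, and this would only run in a browser implementing the proposal.

```javascript
// Main thread: the Promise resolves once "noise-worker.js" (hypothetical
// name) has loaded and run, so node creation incurs no script delay.
// The resolved "factory" and its createNode() are assumed shapes.
context.createAudioWorker('noise-worker.js').then(function (factory) {
  const node = factory.createNode();
  node.connect(context.destination);
});

// Inside noise-worker.js: handlers are shared by all node instances;
// each event carries a "node" property identifying the instance.
onnodecreate = function (e) {
  e.node.channelCount = 1; // per-node configuration in the worker
};
onaudioprocess = function (e) {
  const out = e.outputs[0];
  for (let i = 0; i < out.length; i++) {
    out[i] = Math.random() * 2 - 1; // fill the block with white noise
  }
};
```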
Bravi, that is very beautiful.
It is hard for us not to know what is going on behind the curtain. I regularly check this group and chromestatus https://www.chromestatus.com/features#AudioWorker and it feels like some decisions have been made but development has not yet started. The first time I asked was in September 2014, when I got the response: "soon". AudioWorkers are crucial for our development on audiotool.com, especially when dealing with the Web MIDI API. Lower latency and increased playback stability are the key points we are looking forward to. Is there any place where I can get more recent information? Any insights are appreciated.
@andremichelle the group adopted the AudioWorker spec changes 4 weeks ago, so the path is not in doubt. That is actually a big change since Sept. 2014, at which point we did not have an agreed specification for AudioWorker. The whole WG understands that AudioWorker is crucial for adoption of Web Audio, and unfortunately it took a while to hammer it out. Actual implementation progress is a matter for the individual browser vendors to answer; I'd recommend asking on public-audio-dev@w3.org (because this particular group's all about the specification).
@joe, @andremichelle: in light of #532, the spec still needs some work to be implementable, but I'm optimistic. Those new issues are technical spec/Web Platform issues; we should be able to solve them quickly, since we all more or less agree on the API and on the behaviour.
@joeberkovitz I did not expect it to be an easy task. However, from the outside it feels like there are way too many people interfering with the outlines of the possible implementations. I am not sure whether having discussions scattered all over this GitHub place is a good way to arrive at a good solution. In any case it seems very slow. Forgive my naive attitude; I am sure you guys take it all very seriously.
Out of interest: The post before mine is from May. Is there anything going on in another thread I am missing? |
@andremichelle: yes, #532 and http://www.w3.org/2015/07/09-audio-minutes.html#item04 (a bit down the page) |
Hi André, we're actually waiting on @padenot right now to write up the proposal. ...Joe
Joe Berkovitz, Noteflight LLC
Thanks for the insights Joe, I really appreciate it.
Any updates on this?
Seems like there was a conversation about this in the mailing list in the last couple of days. https://lists.w3.org/Archives/Public/public-audio/2015JulSep/0047.html |
Where? I'd like to contribute. |
Ah! I guess you're subscribed by email. I updated my previous comment with the link. But here you go. https://lists.w3.org/Archives/Public/public-audio/2015JulSep/0047.html |
@andremichelle et al., there is some substantial and very recent progress here to report. @padenot is in the process of writing up a more detailed proposal, which can be viewed on an ongoing basis here: http://padenot.github.io/web-audio-api/

The first step of this work is refining the asynchronous processing model in Web Audio so that the descriptions of AudioWorker rest on a very solid foundation. Paul is transitioning to work nearly full time on this between now and W3C TPAC (October 26) and has committed to a complete spec draft being available by that date.

There's no expectation that any of this will be implemented by TPAC, but this feature is among the very highest-priority tasks of any work taking place in the Audio WG, and we're all hopeful of implementations to follow once the spec is agreed.
edit: wrong issue number I posted to. |
Hi guys. Regarding the AudioWorker spec, I'm curious to know where things stand at this point in time. Seeing as this issue is still open am I right to assume the spec has not been finalized yet? |
We are currently reviewing @padenot's refinement of the processing model mentioned in my previous comment (and it's on the WG's agenda for today's call). A lot of work has taken place on that, and it touches many parts of the specification. The AudioWorker interfaces themselves have not changed much since the original proposal. |
Hi, |
Any update on the work being done here? |
@Maushundb Take a look at the current proposal. |
I don't know what to do with this. @hoch has done a lot of work on worklets and a lot of the text is in the spec already, so I suppose I can close this. |
Audio-ISSUE-107 (JSWorkers): JavaScriptAudioNode processing in workers [Web Audio API]
http://www.w3.org/2011/audio/track/issues/107
Raised by: Marcus Geelnard
On product: Web Audio API
https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#JavaScriptAudioNode
It has been discussed before (see [1] and [2], for instance), but I could not find an issue for it, so here goes:
The JavaScriptAudioNode should do its processing in a separate context (e.g. a worker) rather than in the main thread/context. It could potentially mean very low overhead for JavaScript-based audio processing, and seems to be a fundamental requirement for making the JavaScriptAudioNode really useful.
[1] http://lists.w3.org/Archives/Public/public-audio/2012JanMar/0225.html
[2] http://lists.w3.org/Archives/Public/public-audio/2012JanMar/0245.html