-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Module.newWorker() to allow loading *hosted* libs via blob #8338
Comments
The worker can now be fetch()'d remotely via CORS from the S3 instance. emscripten-core/emscripten#8338
Interesting. I'm surprised such a workaround for CORS is necessary, in that I'm surprised it's so easy to work around it... As to adding a new |
The worker.js calling code is expecting a synchronous result to be returned, and that it is a functional worker with the supported API. There's a completely different contract between Without that, you'd need a way to handle every synchronous call to a worker (present and future API) via an asynchronous implementation. That's not something JavaScript can do, e.g. no top-level AWAITs. If it worked for this use case, it would be non-trivial...it would mean giving back a stub object from a new Worker() call that wasn't itself a worker, but had (for instance) a postMessage() handler. That handler would have to be able to accept messages before the worker was ready, and put them in its own queue. When the actual worker loaded, it would have to process that deferred list of messages. etc. |
I think you can still polyfill it? Yes, Can even use a JS Proxy object to support future APIs, but yeah, there might be special logic that is extra work, so it's not 100% future-proof. I realize this isn't easy to do, but adding a new hook in emscripten has a cost too. On the other hand, maybe there's a way to implement that without adding code size or complexity? |
It may be the case that workers are such that there are no purely synchronous demands of them (or at least emscripten vows to not use any such APIs), so a proxy atop a worker that may not be in existence at the time of the API calls is a technical possibility. But... though I can't speak for everyone... if my only two choices were to design/build/debug/maintain this artificial proxying object that simulates buffers for a Worker it doesn't have yet...vs. do a post-build patch of the worker.js output...I'd definitely pick patching the worker.js output to say
Code-size-wise, this strikes me as a rather minimal in the scheme of things. It need not affect non-pthread builds...and anyone using pthreads is probably prepared for some size. And in that build, all it's asking for is there be an overridable default:
I'd be happy enough supplying the fetch()-based override hook in client code, without EMSCRIPTEN_USE_FETCH_FOR_WORKERS doing it automatically or having that code sitting inert in Perhaps the long tail of |
I think those are good points. Meanwhile I realized, can't the locateFile API work for this? |
I don't see how--as it's a synchronous API that returns a URL. And so long as that URL is on a remote domain and you don't go through the fetch API, the browser won't be happy. |
Yeah, true - I wonder if we shouldn't consider making it an async API, as a general fix for this type of issue. Not a simple change though. |
You mean locateFile would be able to give either URLs or Blobs asynchronously? That might be a forward-looking choice. (...though making you rethink all the locateFile() callsites) If you wanted backwards compatibility, you could have the callback be the second parameter, and if |
Yeah, that might be a good long-term plan, but would indeed require changing all the locations. Another option here might be to do the async operation before - that is, before you load the main JS, load the worker JS and prepare the object URL etc., and stash that, and then you can hand it off synchronously in locateFile. A way to do that easily might be of general use perhaps (for any file we try to load)? |
@kripken This idea works. Something like the following:
|
That is an interesting feature to know about, and it seems https://caniuse.com/#feat=bloburls But in practice that would seem to run against the "you can't use async/await at the top level" issue: https://stackoverflow.com/a/51525286 |
You don't need to instantiate your module at the top level and the async/await is just for brevity... |
What I meant is--it's not something that can necessarily be done without larger scale restructuring of the calling script. (The fact that rearrangements from synchronous to asynchronous can be invasive is one of the sticking points of modifying emscripten itself to address the issue.) But yes, |
They renamed `unusedWorkerPool` to `unusedWorkers`, breaking a patch step in the Pthread build. emscripten-core/emscripten#8338
@hostilefork What to you think of my proposed idea here: #9796? What that also fix (or avoid) this issue for you? |
@VirtualTim I wound up giving up on my patch, and just bit the bullet and reorganized my loading process. I fetch the worker in a pre-load step as suggested by @prestomation. Things stall until it gets back, then I kick off the webassembly loading process...knowing that when the request for the worker comes in I have the data on hand to use URL.createObjectURL() on it. In terms of "what I think" ...well... 🙄 ...anything that makes it easier would be nice. But I think that when you're having to use these workarounds to ultimately achieve the same effects as if the browser had let you load the resource like you wanted in the first place--it's hard not to feel like blaming the browser. One of two things is true: the capability itself is a security bug and they have not securely blocked the capability -or- they are introducing more insecurity by forcing you into ever-more-hare-brained mechanics to accomplish a desire that should be deemed legitimate. The category of problem continues and I've joined in griping about it. (That issue may affect you if you are worried about this problem, so perhaps you want to look into it.) One wonders if there's any chance at pushing back against this trend. So maybe the browser makers could be lobbied to just let you use the URL like you wanted in the first place. |
@hostilefork I am new to WASM/JS (coming from C/C++). Banged my head against enabling CORS headers and why it still doesn't work until I found this thread. I under the solution @prestomation has proposed but I am not sure where I do put the worker preloading code so I can suspend the WASM initialization until the fetch() is complete. Edit: |
@X-Kent Sorry to miss your message. The worker preloading code simply has to happen before you load the emscripten .js file (which is what triggers the locateFile method in Module for the If it still helps... I have set up the loading process as a chain of "promiser" functions.
Hopefully that helps, or maybe the code and comments will give you some ideas... |
This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant. |
I now believe using multithreaded code as a library (e.g. where worker-based-Wasm is pulled casually in the way that one might use a hosted version of jQuery) is a lost cause. (No fault of emscripten, of course!!) I'm closing this issue, as COOP/COEP was the last straw...I'm not going to swim upstream against that current. |
Note for further readers; you could also leverage const module = await Module({
// https://stackoverflow.com/q/25458104
locateFile: (fileName, scriptDirectory) => {
const url = scriptDirectory + fileName;
if (url.endsWith('.worker.js')) {
return URL.createObjectURL(new Blob(
[`importScripts('${url}');`],
{'type': 'application/javascript'}));
}
return url;
}
}); (if you use multiple threads, you may want to cache the output of |
(I wouldn't mind tinkering around and providing this as a PR, if the idea was considered sound.)
When you build a library that uses pthreads in emscripten, it produces an additional
my-lib.worker.js
file.If you try to put your main
my-lib.js
in a location besides where your page is being served from, and tell locateFile() that themy-lib.worker.js
file is right there with it...you get a bit of "browser security theatre":The failing call originates from within
library-pthread.js
, which is built into themy-lib.js
file when you are using pthreads.Clients don't want to be editing either of these files. But if you could do so, and if your hosting location has CORS enabled, you might workaround by changing:
To fetch() the JS into a Blob and then use:
I tested this and it works, so we're having to do a post-build patch step to rewrite the build product.
But it would be better if emscripten had a
Module.newWorker = function (url, callback) {...}
option. That could default to being implemented as justcallback(new Worker(url))
. Then interested parties could provide their own replacement, such as:Does that sound reasonable? I've gathered Emscripten doesn't want to take Promises for granted, but if it did this could use them instead of a callback.
The text was updated successfully, but these errors were encountered: