-
Notifications
You must be signed in to change notification settings - Fork 399
Experimental feature: Dynamic worker loading #4383
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Definitely happy to see this progressing but I definitely think it's rather unfortunate for us to have introduce a bespoke syntax/API for this. It would be nice if there were alignment here with either/both Node.js worker threads and web workers API. |
The API design here is intended to directly expose how our platform works. We could consider keeping it private, while building compatibility wrappers for specific standard APIs. I suspect, though, that there will be a lot of friction trying to apply various standard APIs that were designed for different use cases. Both Node worker threads and Web Workers are primarily intended to allow parallelization, offloading work to a background thread. The dynamic worker loader API in this PR explicitly does not do that -- when you call into the worker, it runs in the _same_thread. Instead, our feature is more about sandboxing and dynamic code loading. Moreover, it's an explicit design goal that the dynamic worker loader should be able to load any real Worker code that you could otherwise deploy with wrangler, etc. That pretty much necessitates that our API include concepts like bindings and compat dates, which obviously don't exist in any standard worker-loading API. |
Wonder if there is a mode where it shouldn't? What if I'm building a platform? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, LGTM. Tests can be expanded. Otherwise rest is nits.
fbea3a0
to
c8c50c2
Compare
c9e6913
to
eaa987e
Compare
I think there's a lot of room for us to add features to support parallelism, but they are probably orthogonal to this feature... |
432e00d
to
f574ed6
Compare
b4a7f48
to
82fa635
Compare
Fixed cyclic reference bug introduced by anonymous loaders: b4a7f48 Then rebased on main. |
This must be vestigial, but at present it isn't used anywhere. If we needed to keep it around, we'd probably want to clear it in `unlink()`, like we do with `WorkerService::waitUntilTasks`.
Previously, the FacetManager interface took an actor class channel number as input when creating a facet. This was a little odd as it was the only place where channel numbers could be exchanged for channels outside of IoChannelFactory itself. Now, IoChannelFactory has a `getActorClass(uint channel)` method that returns an object representing the actor class. This object has no methods, it can only be passed into `FacetManager`. This makes the code in `server.c++` look a little better. More importantly, though, it opens the possibility of facets being implemented by actor classes that didn't strictly come from channel numbers. Dynamic isolate loading, in particular, will create ActorClassChannels dynamically! Also, Dynamic Dispatch may be extended to support actor classes.
Somewhat similar to the previous commit, we will soon need a uniform way to refer to outgoing "fetchers", whether they be service bindings (defined by a channel number), dynamic dispatch targets, dynamically-loaded worker entrypoints, or even DO stubs. This will especially be needed so that we can hook in a way to embed these in the `env` object passed to a dynamic isolate.
I realize it would be even better to go all the way to `kj::Rc` here, but that would create a lot more noise, also requiring a coordinated internal change, so I'm punting that for the moment.
This can create reference cycles, so we verify that cycles have been broken during the unlink() phase at shutdown.
Many objects in `server.c++` would hold a regular reference to a parent `Service`. We should make these strong references now. With this change, it should now be the case that if someone holds onto a `SubrequestChannel`, it prevents the underlying `Service` from being destroyed. This will be important when we implement dynamic isolate loading, which will presumably involve evicting isolates before server shutdown.
Turns out we don't actually need to know that these are `Service`s anymore. Well, except in a weird special case involving tails, but that case was _already_ doing dynamic downcasting to query the type, so we can just do more of that.
(Not hooked up yet, though.)
This implements a Worker loader I/O channel for workerd. But as of this commit there's still no binding that actually populates these channels.
The binding is declared like: `(workerLoader = (id = "foo"))` Similar to memory cache bindings, if the same ID is used in different bindings within the same workerd process, they will bind to the same loader, meaning they share the same LRU cache of workers. (Well, at present there's no LRU, the isolates just stay alive forever, but...)
Dynamic isolates will typically hook up to an already-extant `SubrequestChannel`.
This is an alternative to a simple channel number, needed with dynamically-loaded classes.
On review, this name is not actually "displayed" anywhere. In fact, it's not used for much of anything, except maybe some internal error messages.
In this case, it doesn't share a cache with any other loader.
Loading the same isolate name again should not call the callback.
Prompt: ``` deps/workerd/src/workerd/api/worker-loader-test.js contains tests for the new "dynamic worker loader" feature, but needs more test cases. Please add the following tests: - Test non-string modules types. The source code of a module, instead of being a string, can be an object, like `{text: "blah"}`, to represent a module type other than an ES module. The types are `js` (ES module, but more explicit), `cjs` (CommonJS module), `text` (just a string), `data` (just an ArrayBuffer), and `json` (arbitrary JS object as long as it is JSON-serializable). - Test setting compat date / flags works. - Test code loader callback that does something asynchronous (can just wait for a 1ms timeout) - Test what happens if the code getter callback throws an exception (it should propagate to the caller when it tries to call into the worker). Please add the tests to the end of the file in the order listed above. The worker loader API, if you need to refer to it, is defined in deps/workerd/src/workerd/api/worker-loader.h (in JSG format). ``` I significantly rewrote the compatDateFlags test (Claude's version didn't really test dates), and made a couple other minor tweaks, but otherwise this is mostly Claude's work at a cost of $0.29. This actually caught a bug, fixed up in a928d92.
82fa635
to
98ed71b
Compare
That's weird, it lets me load it in the browser (after a click). |
I switched to github's new preview UX that they are testing and apparently they reduced the threshold on size of the diff. Switching back to the original UX and it works. |
This is an experimental-only feature and will probably remain so until we play around with it a bit. Similar to facets (#4123), the interface is likely to change based on testing. This is also strictly an implementation in workerd -- the production runtime would need a very different implementation.
A Worker loader binding allows you to start an isolate from code provided at runtime, and then talk to that isolate.
The test has more examples.
TODO (this PR):
TODO (later PRs, probably):
env
so you can actually provide resource bindings. Currently it can only contain simple serializable values.