Custom JS resolver #196
drop(stylesheets); // ensure we aren't holding the lock anymore
Everything above this point in the function is effectively unchanged; the diff looks worse than it is. All I did was remove `drop(stylesheets)` because this wasn't good enough to prove to Rust that `stylesheets` isn't accessible past the `await` boundaries. Instead I made a scope that returned `source_index` and kept `stylesheets` there so they are fully out of scope before the `await`.
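To illustrate the scoping change described in this comment, here is a minimal, std-only sketch. The names `load_file`, `read_source`, `assert_send`, `block_on`, and `demo` are hypothetical stand-ins rather than the PR's actual code, and a tiny hand-rolled executor replaces Tokio so the example has no dependencies. The inner block confines the `MutexGuard`, so it is gone before the `.await` and the future can satisfy a `Send` bound.

```rust
use std::future::Future;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical async read, standing in for a `SourceProvider` call.
async fn read_source(source_index: usize) -> String {
    format!("source #{source_index}")
}

/// Registers a file, then awaits its contents without holding the lock.
async fn load_file(stylesheets: Arc<Mutex<Vec<String>>>, file: String) -> String {
    // The guard lives only inside this block, so it cannot be part of the
    // state that is saved across the `.await` below.
    let source_index = {
        let mut sheets = stylesheets.lock().unwrap();
        sheets.push(file);
        sheets.len() - 1
    };
    read_source(source_index).await
}

/// Compiles only if the future may be moved to another thread.
fn assert_send<F: Future + Send>(future: F) -> F {
    future
}

/// Waker that does nothing; enough for the busy-polling executor below.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor for this sketch: polls until the future is ready.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
        std::thread::yield_now();
    }
}

/// Example entry point: one stylesheet already exists, so the new one gets index 1.
fn demo() -> String {
    let stylesheets = Arc::new(Mutex::new(vec!["a.css".to_owned()]));
    block_on(assert_send(load_file(stylesheets, "b.css".to_owned())))
}
```

Calling `demo()` from a `main` function returns `"source #1"`. The `assert_send` wrapper is the part that matters here: it fails to compile if a non-`Send` value such as a `MutexGuard` is considered live across the `.await`.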
abcbd4f to 85db0c6
Thanks! There's a lot here. I'll take a look soon.
The two `SourceProvider` methods are now `async`, which gives JavaScript a hook to implement them and be called from the main thread. Everything is `async` now where possible, and `bundle()` executes the `Future` synchronously to maintain its contract. Since `bundle()` won't support custom JavaScript resolvers, there should never be a case where it can't execute synchronously. Rayon doesn't seem to support `async` iterators, and parallel iteration should be less necessary here, so it was removed in order to call `async` APIs.
This is roughly equivalent to `bundle()`, except that it executes the `Future` *asynchronously*, returning a JS `Promise` holding the result. Errors are formatted a bit differently since they just use the `Display` trait to convert to strings, rather than the `throw()` function which converts them to a JS `SyntaxError` in `bundle()`. `throw()` can't be used in the async version because the async block must return a `napi::Result`, which is immediately emitted to the user with no JS values accessible. While it is possible to return the `CompileError` directly from the async block in a `napi::Result<Result<TransformResult, CompileError>>` and then call `throw()` in the callback with access to JS data, doing so causes lifetime issues with `fs` and isn't easily viable.
This adds a new `JsSourceProvider` which acts as a `SourceProvider` that invokes the associated JS implementations of the resolver if they exist, or falls back to `FileProvider` when they are not given. This allows JS consumers to override one or both of these methods. If JS does *not* override either, they should not pay any significant performance penalty since all the Rust work will stay on the same thread.
The JS implementations are invoked as thread-safe functions, pushing the arguments onto a queue and adding it to the event queue. Some time later, the main thread pulls from this queue and invokes the function. `napi-rs` doesn't seem to provide any means of receiving a JS return value in Rust, so instead arguments for both `read()` and `resolve()` include a callback function using Node calling conventions (`callback(err, result)`). This looks like:
```typescript
await bundleAsync({
  // Options...
  resolver: {
    read(file: string, cb: (err: Error | null, result: string | null) => void): void {
      // Read file and invoke callback with result.
      fs.promises.readFile(file, 'utf-8').then((res) => cb(null, res), (err) => cb(err, null));
    },
    resolve(specifier: string, originatingFile: string, cb: (err: Error | null, result: string | null) => void): void {
      // Resolve and invoke callback with result.
      const resolved = path.resolve(originatingFile, '..', specifier);
      cb(null, resolved);
    },
  },
});
```
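The queue-and-callback flow in this commit message can be modeled with standard-library channels. This is only a sketch of the idea, not the `napi-rs` API: `ResolveCall`, `resolve_from_worker`, `user_resolver`, and `demo` are hypothetical names, an `mpsc` channel stands in for the threadsafe-function queue, and a plain Rust function stands in for the JS resolver.

```rust
use std::sync::mpsc;
use std::thread;

/// Node-style callback collapsed into a `Result`: `Err` plays the role of `err`.
type Callback = Box<dyn FnOnce(Result<String, String>) + Send>;

/// A queued request to call the user's `resolve(specifier, originatingFile, cb)`.
struct ResolveCall {
    specifier: String,
    originating_file: String,
    callback: Callback,
}

/// Runs on a background thread: queue the call, then block until the callback fires.
fn resolve_from_worker(
    queue: &mpsc::Sender<ResolveCall>,
    specifier: &str,
    originating_file: &str,
) -> Result<String, String> {
    let (reply_tx, reply_rx) = mpsc::channel();
    let call = ResolveCall {
        specifier: specifier.to_owned(),
        originating_file: originating_file.to_owned(),
        // The callback is the only way a result gets back to this thread.
        callback: Box::new(move |outcome| {
            let _ = reply_tx.send(outcome);
        }),
    };
    queue.send(call).map_err(|e| e.to_string())?;
    // If the callback is dropped without being invoked, `recv` returns an error.
    reply_rx.recv().map_err(|e| e.to_string())?
}

/// Stand-in for the user's JS resolver; only ever runs on the "main" thread.
fn user_resolver(specifier: &str, originating_file: &str) -> Result<String, String> {
    if specifier.is_empty() {
        return Err("empty specifier".to_owned());
    }
    let dir = originating_file.rsplit_once('/').map_or("", |(dir, _)| dir);
    Ok(format!("{dir}/{specifier}"))
}

/// Example entry point: one worker thread issues two resolve requests.
fn demo() -> Vec<Result<String, String>> {
    let (queue_tx, queue_rx) = mpsc::channel::<ResolveCall>();
    let worker = thread::spawn(move || {
        vec![
            resolve_from_worker(&queue_tx, "b.css", "/src/a.css"),
            resolve_from_worker(&queue_tx, "", "/src/a.css"),
        ]
        // `queue_tx` is dropped when this closure returns, ending the loop below.
    });
    // "Event loop": drain queued calls on this thread and invoke each callback.
    for call in queue_rx {
        let outcome = user_resolver(&call.specifier, &call.originating_file);
        (call.callback)(outcome);
    }
    worker.join().unwrap()
}
```

In this model the first request resolves to `/src/b.css` and the second surfaces the resolver's error, mirroring the `callback(err, result)` convention without relying on a return value from the invoked function.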
…(...args) => Promise<Result>` (parcel-bundler#174) This hides the fact that `napi-rs` doesn't actually do anything with the return value of a threadsafe JS function, so any returned data would be dropped. Instead, Parcel adds a callback function which gets invoked by JS with the resulting value, following Node conventions of `callback(err, result)`. This is unergonomic as an API, so the JS shim exposes a `Promise`-based interface which gets converted to the callback behavior required by `napi-rs`. This also includes tests for all the common use cases and error conditions of custom JS resolvers.
This rejects immediately if the user attempts to pass a `resolver` property to the synchronous `bundle()` function. Since communicating between background threads and the main thread is quite tricky without an asynchronous hook, `bundle()` does not support custom resolvers.
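A fail-fast check of this kind might look roughly like the following. This is a simplified sketch, not the PR's code: `BundleOptions`, `BundleError`, and the placeholder output are hypothetical, and the real check sits at the JS binding boundary.

```rust
use std::fmt;

/// Hypothetical options struct; `resolver` stands in for the JS `resolver` property.
struct BundleOptions {
    filename: String,
    resolver: Option<String>,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum BundleError {
    ResolverNotSupported,
}

impl fmt::Display for BundleError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BundleError::ResolverNotSupported => write!(
                f,
                "`bundle()` does not support custom resolvers; use `bundleAsync()` instead"
            ),
        }
    }
}

/// Synchronous entry point: reject a custom resolver before any work starts.
fn bundle(options: &BundleOptions) -> Result<String, BundleError> {
    if options.resolver.is_some() {
        return Err(BundleError::ResolverNotSupported);
    }
    // Placeholder for the real bundling work.
    Ok(format!("/* bundle of {} */", options.filename))
}
```

Rejecting up front keeps the synchronous path free of any cross-thread coordination, which is the property the commit message relies on.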
I took another pass at this PR and I think I was able to resolve those two open issues.
Hopefully that should resolve all the known blockers here. Let me know if there's anything I can do to help push this forward.
@devongovett, anything I can do to help push this forward? Would love to see custom JS resolvers in Parcel.
hey, sorry for my slow responses. This is a big change, and I haven't had time to fully test it out and think through it yet. Two things I am thinking about:
Hopefully in the meantime you're able to use your fork so I'm not blocking you.
Sorry, I don't mean to be too annoying about this, mainly just making sure it didn't get forgotten. 😅
Can you elaborate on your concern with the async approach? Are you worried that there is a performance regression since async isn't quite zero-cost? Are you hoping to avoid a new …
Looking at your prototype, IIUC, it seems like you're basically using JavaScript workers to manage the different threads with Rayon to parallelize scheduling and using mpsc channels to block on the work. Am I reading this correctly? I'm a bit confused by the value of this given that the parallel operation with Rayon just sends a message to a JS worker and waits for it. I'm not sure how much value that's really giving and I feel like the overhead of additional spawned threads is probably more costly than its benefit? Since Tokio uses a multi-threaded work stealing approach, that sounds a lot closer to the current Rayon model than this mpsc method and would probably have similar performance characteristics (I'm speculating here, not sure if you have any benchmarks where I could easily measure and validate this).
We should also keep in mind the expected use case of custom resolvers and readers. A resolver is likely a trivial amount of work (if it reads files it might take more time, but will be IO-bound), while a reader will always be heavily IO-bound. Given that neither is CPU-bound, if the goal is to avoid overworking the main thread, I'm not sure there's much value in it given that async JS will push those IO tasks off the main thread automatically and any CPU work is likely to be trivial. I guess it might result in an unnecessary context switch in and out of the main thread, but it looks like your mpsc approach doesn't use the main thread for much useful work anyways, so I don't think it's really much worse in this regard? Also, if we do this in the synchronous …
In the context of the custom resolver, the JS worker approach also means that the resolver needs to be written in a worker, which definitely makes the API much less ergonomic. Module blocks could help with this, but that proposal is still a ways out.
I'm not saying the async approach is superior, just trying to understand the trade-offs you're making here and where you're trying to go.
Agreed that …
Merged an alternative approach in #263, following the same API you proposed here, but without needing to make the bundler internals async. Will be contributing a few napi changes upstream later. Thanks for getting this started, and apologies for my slow responses. Hopefully you were able to use your fork in the meantime.
This adds a new `bundleAsync()` function with support for a custom JavaScript resolver. A resolver which re-implements the default Rust behavior looks like:
This required some changes to the way work is parallelized to be compatible with `async`/`await`. I had to remove Rayon as it doesn't seem compatible with this model. This resulted in two still-unsolved issues (edit: these should both be resolved now, see #196 (comment)):
1. … `loadFile()` where I had to clone each rule during processing to resolve ownership issues. I'll leave a comment at the relevant place in the PR.
2. … `async` and joined with `futures::future::join_all()`, but as soon as one operation gets blocked all of them are blocked. Since we can't use Rayon for this, I suspect `tokio::spawn()` could serve the purpose of abstracting out all the details of how many threads to create and how to distribute jobs for them, but it ran into similar ownership issues from 1. and I couldn't find a good way to get it working.
Any ideas or suggestions for how to resolve these issues would be greatly appreciated.
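As one possible angle on the second issue, the sketch below runs each blocking load on its own thread so that a stalled read does not hold up the others. It deliberately swaps in `std::thread::scope` in place of `tokio::spawn()` to stay dependency-free; `load_blocking` and `load_all` are hypothetical names, and the sleep merely simulates slow file I/O. Scoped threads are allowed to borrow from the caller, which may also sidestep some of the cloning mentioned in the first issue, though that is an assumption and has not been checked against the bundler's actual types.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical blocking load; the sleep stands in for slow file I/O.
fn load_blocking(file: &str) -> String {
    thread::sleep(Duration::from_millis(20));
    format!("/* contents of {file} */")
}

/// Loads every file on its own scoped thread and returns results in input order.
fn load_all(files: &[&str]) -> Vec<String> {
    thread::scope(|scope| {
        // Spawn all jobs first so they run concurrently...
        let handles: Vec<_> = files
            .iter()
            .map(|file| scope.spawn(move || load_blocking(file)))
            .collect();
        // ...then wait for each one; the borrowed `files` outlive the scope.
        handles
            .into_iter()
            .map(|handle| handle.join().unwrap())
            .collect()
    })
}
```

One thread per file is only reasonable for small inputs; a thread pool or a multi-threaded async runtime would bound the number of threads, which is the job the comment hopes `tokio::spawn()` could take over.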
This implementation takes advantage of `napi-rs` ThreadsafeFunction because JS data is restricted to the main thread, while the Rust processing mostly happens on background threads. This queues any requested invocations and waits for the main thread to become available (effectively adding the call to the JS event queue). Once ready, the JS function is invoked. The return value is dropped as `napi-rs` doesn't seem to do anything with it; however, Parcel passes in a callback function for the JS to invoke with the result, using Node callback conventions (`callback(err, result)`). This means the actual JS contract is:
This is unergonomic, so a small JS shim wraps the Rust implementation of `bundleAsync()` and converts this contract to the `Promise`-based one mentioned earlier. This makes custom resolvers much easier to use while still adhering to `napi-rs`' required calling conventions.
I included tests for the new behavior, though there doesn't seem to be much existing JS test infrastructure, so it's a bit primitive for now and mostly aligns with the existing JS test of `transform()`. They can be executed with `npm run build && node test-bundle.js`. Not sure if there's a better setup for this.
`bundle()` is unaffected because communicating between threads in a synchronous-compatible manner is quite tricky and not in scope here. However, it does throw if given a `resolver`, since that's indicative of user error.