RLS should compile in a separate process #1307
I think this depends on which model cargo/rustc will use in the future, multi-process or multi-threading, and whichever they choose, RLS will have to follow.
It seems like vfs has a small code-base; if needed, I think we could use something like a Linux shared memory map to make it work across processes.
Let's take llvm/clang as an example: although most llvm/clang tools are designed to run as separate processes, llvm's API can be thread-safe if each user has its own context (though I'm not quite sure whether clang's API can be called from separate threads). I also don't know what the status is (or will be) of rustc's support for multi-threading.
I agree that thread cancelling is tricky.
I think maybe we could implement something with tokio to listen on unix-domain sockets/TCP/UDP rather than communicating through stdin/stdout. I/O multiplexing with tokio is easy, but we may have to settle on RLS's threading model to make full use of tokio.
Perhaps with shared memory?
I think the real fix should be on the rustc side; two things need to be done:
The above APIs can not be stable because they depend on the details of rustc's internals, but a stable API could be provided for macro writers that registers callbacks before/after macro expansion for each crate (and is perhaps guaranteed to run on the same thread as the macro expander for that crate?). This way, procedural macros would have a chance to manage their mutable state.
Yeah, I think there are definitely improvements upstream that can be done for proc-macros. We could also solve all stdout leakage by wrapping RLS itself in an inner process and using explicit comms. But these can both happen in addition to wrapping the compiler in a process. Though I imagine the latter will become a non-issue if we do, as proc-macro printlns are the most problematic for RLS. I'm personally very keen to have cancellation for compiles and, to a lesser extent, completion. Cross-process vfs could be achieved by implementing
Hi @alexheretic, I have thought about compile cancellation for a while, and I think perhaps it could be implemented in the following manner:
@lijinpei It's a clever idea. I actually experimented with this myself way back when I first started moaning about cancellation in rls. From what I can remember it didn't seem to help much, probably because it didn't affect the cargo run. At the time I decided it was too much added complexity without enough benefit. If something like this can work with cargo it might be worth a try. I still think a separate process is the way to go, though.
@alexheretic Ok, I will try to explore making servo/ipc-channel work on Windows with named pipes.
Implement support for out-of-process compilation

This is quite a lengthy patch, but the gist of it is as follows:
- an `rls-ipc` crate is introduced which acts as the IPC interface, along with a server/client implementation
- `rls-rustc` is enhanced with optional support for the IPC
- RLS can optionally support it via setting the `RLS_OUT_OF_PROCESS` env var (like `rls-rustc`, it needs to be compiled with the `ipc` feature)

The IPC is async JSON-RPC running on Tokio using `parity-tokio-ipc` (UDS on unices and named pipes on Windows):
- Tokio because I wanted to more or less easily come up with a PoC
- RPC because I needed a request->response model for VFS IPC function calls
- UDS/pipes because it's somewhat cross-platform and we don't have to worry about `rustc` potentially polluting stdio (maybe just capturing the output in `run_compiler` would be enough?)

However, the implementation is far from efficient: it currently starts a thread per requested compilation, which in turn starts a single-threaded async runtime to drive the IPC server for that compilation. I imagine we could instead initialize the runtime globally, spawn the servers on it, and drive them to completion on each compilation to reduce the thread spawn/coordination overhead.

While this gets rid of the global environment lock on each (previously) in-process crate compilation, what still needs to be addressed is the [sequential compilation](https://github.com/rust-lang/rls/blob/35eba227650eee482bedac7d691a69a8487b2135/rls/src/build/plan.rs#L122-L124) of the cached build plan for this implementation to truly benefit from the unlocked parallelization potential.
I did some rough test runs (~5) and on a warm cache had the following results:
- integration test suite (release): 3.6 ± 0.2s (in-process) vs 3.8 ± 0.3s (out-of-process)
- rustfmt master whitespace change (release): 6.4 ± 0.2s (in-process) vs 6.6 ± 0.3s (out-of-process)

which at least initially confirms that the performance overhead is somewhat negligible if we can really parallelize the work and leverage process isolation for increased stability.

cc #1307

(I'll squash the commits in the final merge, 30+ commits is a tad too much 😅)

If possible I'd like to get more eyes on the patch to see if it's a good approach and what might be directly improved:
- @matklad for the potentially shared rustc-with-patched-filesystem
- @alexheretic for the RLS/implementation itself
- @alexcrichton @nrc do you have thoughts on whether we can share the parallel graph compilation logic with Cargo somehow? For now we just rolled our own linear queue here because we didn't need much more, but maybe it would be worthwhile to extract the pure execution bits somehow?
Support for compiling in a separate process has landed in the master branch, although you need to explicitly build with an
@Xanewok What is needed for this to be on by default?
@Xanewok ping. Hoping this will be set to the default soon, at least on linux; I'm getting out-of-memory with RLS even though I have 32G of memory.
How do I build a version of rls with the ipc feature enabled?
RLS currently calls into the compiler within its own process. As I understand it, this is done to allow compiling using Vfs, or more plainly, compiling a project with unsaved files.
Problems with in-process
There are, however, some drawbacks to in-process compilation. I can think of three good ones: static memory isolation, cancellation & stdout isolation.
Cargo/rustc have been designed as batch tasks run from the CLI, whereas RLS is a long-lived process. This means static memory usage in the compiler continues to consume memory while RLS is idle; that is fairly reasonable in the compiler but less so in RLS.
Procedural macro authors assume compilation is a batch job / short-lived. We can see this with "conflicting implementation of trait" with enum_dispatch crate #1212, where a proc-macro's use of mutable statics means RLS encounters bugs that are not easily hit with cargo. It also means proc-macros may consume memory while RLS is idle.
Compiles cannot be cancelled. This is an old annoyance of mine about RLS; in this way RLS is inferior to `cargo watch -x check`. Processes can be cancelled, and the compiler handles such cancellations pretty well. Threads cannot be cancelled in the same way. It would be a big performance improvement for RLS to be able to simply cancel the current compiler run after a source change, allowing the next compile run to start immediately.

Compiler & procedural macros can very easily print to stdout. This basically kills RLS because we use RLS stdout exclusively for LSP comms and the LSP client libs cannot handle garbage.
Vfs cross-process
The main barrier, then, seems to be getting Vfs to work cross-process. If that's possible we can start the compiler in a new process, and it would also unlock further improvements.
So can we make Vfs work with the compiler in another process?
Are there other benefits of working in-process?