Out of Process JIT Compiler Server #44935
Comments
CC @AndyAyersMS, @mangod9 and @davidwrighton because this effort needs support from multiple teams.
cc @jkotas Interesting idea. (This is more of a VM issue, so will relabel.) Out of process jitting seems challenging but doable. The JIT-EE interface could be a remoting interface. The interface is fairly chatty, so remoting it probably would not be very efficient, but this inefficiency might be tolerable if the results could be used to short-circuit future jitting. But caching JIT results and re-using them in other processes seems like it would be quite difficult. The JIT-EE interface is also stateful. JIT queries can cause side effects in the runtime, and those side effects, along with side effects incurred by running code, impact future queries made by the JIT. So it's not obvious how the results of jitting in one process could be shared by another process, even if the two are running the exact same bundle of managed code.
Yeah need to determine whether the ROI is justifiable given the challenges noted above + security implications.
I believe that once you iterate a few times on what the caching scheme would look like, you will end up with something that looks very much like an AOT image cache. The request that the application issues describes the assemblies/assembly/type/method that it wants the code for, the dependencies (e.g. exact versions of the dependencies) and the environment (e.g. processor model). The server sends back an AOT image that fits the bill, which the application loads and wires in. At assemblies/assembly granularity, it should be possible to build this as a service independent of the runtime itself using the callbacks that the runtime provides today. At finer granularity, it would require finer-grained callbacks and probably a more compact AOT format than PE files for efficiency. For hyperscale services running a lot of code on a lot of servers, I doubt that this would deliver better results than a more traditional workflow where the AOT code is deployed from the get-go.
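As a rough illustration of the request described in the comment above, here is a minimal Python sketch of how such a request could be turned into a cache key. Everything in it is an assumption made for illustration: the `CodeRequest` type, its field names, and the `cache_key` function are hypothetical and do not correspond to any existing runtime API.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeRequest:
    """Hypothetical description of the code an application asks for."""
    assembly: str          # assembly the code lives in
    method: str            # method name, or "" for the whole assembly
    dependencies: tuple    # (name, exact_version) pairs
    processor_model: str   # environment the code must run on


def cache_key(request: CodeRequest) -> str:
    """Derive a stable key from everything that affects the generated code."""
    payload = {
        "assembly": request.assembly,
        "method": request.method,
        # Sort so the order in which dependencies are listed does not matter.
        "dependencies": sorted(request.dependencies),
        "processor_model": request.processor_model,
    }
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()
```

Two requests map to the same key only if the assembly, method, exact dependency versions, and processor model all match, which is the condition under which the comment expects a cached image to be reusable.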
PowerShell could benefit from this too. It is a common scenario to run the same PowerShell scripts from a task scheduler or cron. PowerShell can compile and cache script blocks, but it loses the cache when its process stops, so the next start is always cold. An inter-process cache would be very useful.
Background and Motivation
Consider a large monolithic application with a lot of managed code that needs to be JIT compiled. Furthermore, this application is running on hundreds of thousands of servers and each of these servers is redeploying this application multiple times per day. This scenario is going to (in fact already has in many cases) become more common with applications running in ephemeral containers which are shipped in pods with various lifetimes.
It is then wasteful to recompile this application on thousands of nodes. One answer to this problem is to precompile the managed code once in a build environment and deploy the precompiled code to these thousands of servers. This does, in fact, solve most of the problems and is probably the ideal solution.
However, it is difficult to precompile all code, and in a large application that has been precompiled a significant chunk of code (depending on the app) can still be JIT compiled.
There are various reasons why JIT compiled code will exist for most applications.
Given this, one way to mitigate the JIT compilation occurring on thousands of nodes would be a cache of this JIT compiled code, available at a cost significantly lower than actually JIT compiling the code. For example, a file share or database holding precompiled code for a method that can be loaded almost trivially into the running program.
Such a code cache needs a lifetime different from that of the process where the code will be used, and therefore it has to be out of process.
Proposal
The proposal is to have a system that provides this functionality: the target process requests, through some mechanism, the code or method to be compiled, and the compiler server fulfills the request by returning code it has stored in some form, by delegating to a 3rd party, by actually generating the code, or by a combination thereof.
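The three ways of fulfilling a request can be sketched as a simple fallback chain. This is a minimal, in-process Python stand-in under stated assumptions, not a design for the real thing: `CompilerServer`, `get_code`, `compile_fn`, and `delegate_fn` are hypothetical names, the store is a plain dictionary rather than a file share or database, and the transport between the target process and the server is left out entirely.

```python
from typing import Callable, Dict, Optional


class CompilerServer:
    """Hypothetical server: stored code first, then a delegate, then compile."""

    def __init__(
        self,
        compile_fn: Callable[[str], bytes],
        delegate_fn: Optional[Callable[[str], Optional[bytes]]] = None,
    ) -> None:
        self._store: Dict[str, bytes] = {}   # stand-in for a persistent cache
        self._compile = compile_fn           # stand-in for actual code generation
        self._delegate = delegate_fn         # stand-in for a 3rd-party provider
        self.compile_count = 0               # how often we really compiled

    def get_code(self, key: str) -> bytes:
        # 1. Code already stored in some form.
        code = self._store.get(key)
        # 2. Otherwise ask the 3rd party, if one is configured.
        if code is None and self._delegate is not None:
            code = self._delegate(key)
        # 3. Otherwise generate the code ourselves.
        if code is None:
            code = self._compile(key)
            self.compile_count += 1
        # Remember the result so later requests for this key are cheap.
        self._store[key] = code
        return code
```

A second request for the same key is served from the store without compiling again. The sketch deliberately ignores the statefulness of the JIT-EE interface and the security questions raised in the comments.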
Prior Art
I'm only aware of LLVM's Orc subsystem, which has goals similar to what this issue is proposing: the ability to generate code in a different process than the one where the code is intended to be used.
Risks
Specialized niche use case & too complicated for most .NET customers.