Background
The .NET team has done an incredible job addressing cold start times, primarily through Native AOT. While Native AOT is fantastic, it comes with strict limitations: no dynamic code generation (Reflection.Emit), constrained reflection, and incompatibility with parts of the library ecosystem that rely on those features.
For many large, legacy, or highly dynamic enterprise applications, moving to Native AOT is simply not possible. These apps still suffer from long startup times and high CPU spikes during initial JIT compilation, which is a major pain point in Serverless (AWS Lambda, Azure Functions) and auto-scaling Kubernetes environments.
Proposal
I would like to propose adding support for Application Checkpoint/Restore, similar to the CRaC (Coordinated Restore at Checkpoint) API in the JVM ecosystem or Node.js snapshotting.
The workflow would look like this:
- Warmup Phase (Build/Deploy time): The application is started. A script drives mock traffic to it. The JIT compiles the hot paths, and caches/DI containers are populated.
- Checkpoint: The runtime takes a snapshot of the entire memory state (the heap, JITted code, etc.) and dumps it to disk.
- Restore Phase (Production): When a new instance is needed, the runtime boots directly from the snapshot.
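To make the warmup and checkpoint steps above concrete, here is a minimal sketch of a warmup driver. Everything in it is hypothetical: `WarmupDriver`, `WarmUpThenCheckpoint`, and the `requestCheckpoint` callback are illustrative names, and the callback stands in for whatever checkpoint trigger the runtime might eventually expose. No such API exists in .NET today.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for the warmup phase; not an existing .NET API.
public static class WarmupDriver
{
    // Invokes every hot path `iterations` times so each one is JIT-compiled
    // and tiered compilation has a chance to promote it, then calls the
    // (assumed) checkpoint hook. Returns the total number of invocations.
    public static int WarmUpThenCheckpoint(
        IReadOnlyList<Action> hotPaths,
        int iterations,
        Action requestCheckpoint)
    {
        int calls = 0;
        for (int i = 0; i < iterations; i++)
        {
            foreach (Action path in hotPaths)
            {
                path();
                calls++;
            }
        }

        // Placeholder for "take the snapshot now"; the real trigger is unknown.
        requestCheckpoint();
        return calls;
    }
}
```

In a real deployment the hot paths would more likely be exercised by mock HTTP traffic than by direct delegates. The delegate list just keeps the sketch self-contained.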
Value Proposition
- The Best of Both Worlds: We get near-instant, AOT-like startup times without sacrificing peak JIT performance (Tiered Compilation, PGO) or breaking Reflection/dynamic capabilities.
- Serverless Dominance: This would make standard JIT-compiled C# an absolute powerhouse in serverless environments, removing most of the cold start penalty.
Currently, using an external tool such as CRIU (Checkpoint/Restore In Userspace) on Linux with .NET is highly unstable and unsupported. Having native, coordinated runtime support (where the framework knows how to pause background threads, close file handles, and re-open network sockets on restore) would be a massive leap forward for .NET backend development.
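As a rough illustration of what "coordinated" could mean, here is a minimal sketch of a callback contract that resources might implement. All names (`ICheckpointResource`, `CheckpointCoordinator`, `FakeConnection`) are hypothetical and are not part of any existing .NET API. Tearing down in reverse registration order is a design assumption, chosen so that resources registered later (which may depend on earlier ones) are closed first.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical contract; no such interface exists in .NET today.
public interface ICheckpointResource
{
    void BeforeCheckpoint(); // release state that cannot survive a snapshot
    void AfterRestore();     // re-acquire it in the restored process
}

// Hypothetical coordinator that a checkpoint-aware runtime might own.
public sealed class CheckpointCoordinator
{
    private readonly List<ICheckpointResource> _resources =
        new List<ICheckpointResource>();

    public void Register(ICheckpointResource resource) => _resources.Add(resource);

    public void NotifyBeforeCheckpoint()
    {
        // Reverse registration order: later registrants are closed first.
        for (int i = _resources.Count - 1; i >= 0; i--)
        {
            _resources[i].BeforeCheckpoint();
        }
    }

    public void NotifyAfterRestore()
    {
        // Registration order: dependencies are reopened before dependents.
        foreach (ICheckpointResource resource in _resources)
        {
            resource.AfterRestore();
        }
    }
}

// Stand-in for a socket or file handle; it only tracks open/closed state.
public sealed class FakeConnection : ICheckpointResource
{
    private readonly string _name;
    private readonly List<string> _log;

    public bool IsOpen { get; private set; } = true;

    public FakeConnection(string name, List<string> log)
    {
        _name = name;
        _log = log;
    }

    public void BeforeCheckpoint()
    {
        IsOpen = false;
        _log.Add("close " + _name);
    }

    public void AfterRestore()
    {
        IsOpen = true;
        _log.Add("reopen " + _name);
    }
}
```

A real implementation would also need to handle background threads, timers, and failures in individual callbacks, none of which this sketch attempts.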