copy of its prp-azul/README.md (so that formatting is ON):
Azul JVM & PodResourceProfiles
This example shows how PodResourceProfiles (vertical scaling) can help with workloads that are resource-intensive during startup. During application warmup, the Azul JVM (Zing) runs JIT compilation that dynamically optimizes certain (hot) code paths into machine code. This compilation requires more CPU than the Java application needs in normal operation. At the same time, we want users to get the best experience, so incoming traffic should reach the application only after it has warmed up and performs well enough.
Architecture
The Azul JVM can expose information about its compilation queue over JMX. With a simple Python script, we can read the queue length and consider the workload ready only once it is below a configurable threshold. Startup probes in Kubernetes are a great fit for this use case. They let us check such criteria more frequently during startup, and only after startup has finished do the regular readiness and liveness probes kick in and begin their periodic checks.
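The readiness decision described above can be sketched as follows. This is a minimal illustration and not the script shipped with the example: the function names `is_warmed_up` and `probe` are hypothetical, and the way the queue length is read over JMX is deliberately left to a caller-supplied function, because the source does not specify it. The only Kubernetes behaviour assumed is that an exec probe passes on exit code 0 and fails on any non-zero exit code.

```python
from typing import Callable


def is_warmed_up(queue_length: int, threshold: int) -> bool:
    """The workload counts as ready once the queue is below the threshold."""
    return queue_length < threshold


def probe(read_queue_length: Callable[[], int], threshold: int) -> int:
    """Return an exit code for an exec probe: 0 = pass, 1 = fail.

    `read_queue_length` is a hypothetical callable that returns the current
    length of the JIT compilation queue (for example, obtained over JMX).
    """
    try:
        queue_length = read_queue_length()
    except Exception:
        # If the JVM cannot be queried yet, treat the workload as not ready.
        return 1
    return 0 if is_warmed_up(queue_length, threshold) else 1


# Stubbed readers standing in for the real JMX query.
print(probe(lambda: 120, threshold=50))  # 1: still compiling, probe fails
print(probe(lambda: 3, threshold=50))    # 0: queue drained, probe passes
```

A real probe script would pass the returned value to `sys.exit` so that the kubelet sees it as the command's exit status.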
To demonstrate a Java application that does some serious heavy lifting, we chose the Renaissance benchmarking suite from MIT, namely the finagle-http benchmark. This benchmark sends many small Finagle HTTP requests to a Finagle HTTP server and waits for the responses. Once the benchmark has run to completion, we run a sleep command.
Important
This feature requires Kubernetes In-Place Pod Resource Updates, which are enabled by default since Kubernetes 1.33 (on older versions, they must be enabled using a feature flag).
Demo
Tip
To try this on k3d, create the cluster using:
After some time, we should see it drop from 1 CPU to 0.2.
To check the length of the compilation queue, one can run:
Conclusion
By requesting the right amount of compute power at the right time, we allow Kubernetes to bin-pack pods more effectively and, when combined with tools like Karpenter, this translates into real cost savings.
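To make the bin-packing argument concrete, here is a small back-of-the-envelope calculation. It uses the 1 CPU and 0.2 CPU requests from the demo, expressed in millicores. The 4-CPU node size is purely an assumption for illustration, and the sketch considers CPU requests only, ignoring memory, system reservations, and other scheduling constraints.

```python
def pods_per_node(node_millicores: int, request_millicores: int) -> int:
    """How many pods fit on a node when only CPU requests are considered."""
    return node_millicores // request_millicores


NODE_MILLICORES = 4000  # assumed node with 4 allocatable CPUs (illustrative)
WARMUP_REQUEST = 1000   # 1 CPU requested during JIT warmup
STEADY_REQUEST = 200    # 0.2 CPU requested after warmup

print(pods_per_node(NODE_MILLICORES, WARMUP_REQUEST))  # 4
print(pods_per_node(NODE_MILLICORES, STEADY_REQUEST))  # 20
```

Under these assumptions, a node that fits four pods at the warmup request fits twenty once the requests have been lowered, which is where a node autoscaler can consolidate workloads onto fewer nodes.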