feat: Introduce Horizontal Pod Autoscaler, offload blocking evaluatio…#269
Merged
IsmailMehdi merged 18 commits into main on Mar 18, 2026
…n tasks to a thread pool, and enhance session manager robustness.
…ve the build time environment variable.
Collaborator
Author
/gcbrun
…include additional metadata for sdist and wheel URLs.
…for UI and metrics.
…n an `UncloseableStream`.
…llations and removing NVM.
…hin the Docker container.
…e CSV reporter outputs to a shared volume when running in server mode.
…ve an extra blank line in `csv.py`.
totoleon approved these changes on Mar 18, 2026
This pull request introduces infrastructure and code enhancements to support horizontal scaling, improve concurrency, and increase the robustness of session management within the evalbench service.
Key Changes
Scalability & Orchestration
Horizontal Pod Autoscaler (HPA): Added a new hpa.yaml configuration to automatically scale the evaluation server based on CPU utilization (target 50%).
Resource Management: Defined explicit CPU and memory requests and limits for the evalbench-eval container to ensure predictable scaling behavior.
Makefile Updates: Updated the deploy target to include the HPA configuration during deployment.
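To make the 50% CPU target concrete, here is a minimal sketch of the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: desired = ceil(current replicas × current metric / target metric), skipped when the ratio is within the default 10% tolerance. The function name and sample numbers are hypothetical and are not taken from hpa.yaml. CPU utilization is measured relative to the container's CPU request, which is likely why this PR sets explicit requests on the evalbench-eval container.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_cpu_pct: float,
    target_cpu_pct: float = 50.0,  # target from the PR description
    tolerance: float = 0.1,        # Kubernetes default tolerance
) -> int:
    """Approximate the HPA scaling rule (illustrative helper, not PR code)."""
    ratio = current_cpu_pct / target_cpu_pct
    # The HPA does not scale when the ratio is close enough to 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Two pods averaging 90% of their CPU request against a 50% target.
    print(desired_replicas(2, 90.0))
```

The real controller also clamps the result to the minReplicas and maxReplicas set in the HPA spec, which this sketch omits.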
Performance & Concurrency
Async Offloading: Modified eval_service.py to offload blocking operations, such as evaluator.evaluate and _process_results, to a thread pool executor. This prevents the gRPC event loop from blocking during intensive evaluation tasks.
Container Optimization: Updated the Dockerfile to streamline supervisord configuration and set a fixed BUILD_TIME environment variable.
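The async offloading described above can be sketched roughly as follows. This is not the code from eval_service.py: `blocking_evaluate`, `handle_request`, and the pool size are hypothetical stand-ins for evaluator.evaluate and the gRPC handler. The sketch only shows the general pattern of handing a blocking call to a thread pool with `loop.run_in_executor` so other coroutines keep running.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool size; the PR does not state the executor configuration.
_executor = ThreadPoolExecutor(max_workers=4)


def blocking_evaluate(job_id: str) -> str:
    """Stand-in for a blocking call such as evaluator.evaluate."""
    time.sleep(0.2)  # simulate CPU- or I/O-heavy work
    return f"{job_id}:done"


async def handle_request(job_id: str) -> str:
    """Async handler that offloads the blocking call to the thread pool."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_executor, blocking_evaluate, job_id)


async def heartbeat(ticks: list) -> None:
    """Keeps ticking while the evaluation runs, showing the loop stays free."""
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.monotonic())


async def main() -> tuple:
    ticks: list = []
    result, _ = await asyncio.gather(handle_request("job-1"), heartbeat(ticks))
    return result, len(ticks)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If `blocking_evaluate` were called directly inside the coroutine, the heartbeat could not run until the evaluation finished. With the executor, both make progress concurrently.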
Session Management Robustness
Safe Lookups: Updated sessionmgr.py to use .get() for session lookups and added existence checks before deletion to prevent KeyError exceptions.
Improved Reaper Logic: Refactored the session reaper to identify expired sessions with a list comprehension, and increased its sleep interval to 10 seconds to reduce overhead.
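The session-handling changes above might look something like the sketch below. It does not reproduce sessionmgr.py: the `SessionStore` class, its TTL parameter, and the lock are assumptions made for illustration. It shows `.get()` for lookups, an existence check before deletion, and a list comprehension that collects expired sessions before removing them.

```python
import threading
import time


class SessionStore:
    """Illustrative session store; names and structure are hypothetical."""

    def __init__(self, ttl_seconds: float):
        self._ttl = ttl_seconds
        self._sessions = {}  # session_id -> last-seen timestamp
        self._lock = threading.Lock()  # assumption: not described in the PR

    def create(self, session_id: str, now: float = None) -> None:
        with self._lock:
            self._sessions[session_id] = time.monotonic() if now is None else now

    def get(self, session_id: str):
        # .get() returns None for unknown sessions instead of raising KeyError.
        with self._lock:
            return self._sessions.get(session_id)

    def delete(self, session_id: str) -> bool:
        # Check existence first so a repeated delete is a harmless no-op.
        with self._lock:
            if session_id in self._sessions:
                del self._sessions[session_id]
                return True
            return False

    def reap_expired(self, now: float = None) -> list:
        now = time.monotonic() if now is None else now
        with self._lock:
            # Collect expired ids first, then remove them, so the dict is
            # never mutated while it is being iterated.
            expired = [
                sid for sid, seen in self._sessions.items()
                if now - seen > self._ttl
            ]
            for sid in expired:
                self._sessions.pop(sid, None)
        return expired


def reaper_loop(store: SessionStore, stop: threading.Event,
                interval: float = 10.0) -> None:
    """Reap every `interval` seconds (10 s per the PR) until `stop` is set."""
    while not stop.wait(interval):
        store.reap_expired()
```

Waiting on a `threading.Event` rather than calling `time.sleep` is a design choice of this sketch: it lets the reaper thread shut down promptly when `stop` is set.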