Improve browser-run reliability and resource controls #57
Merged
joyzoursky merged 14 commits into main on Mar 17, 2026
- make runner transport cadence and claim retry behavior configurable
- add per-pod local browser concurrency guard with dispatch locking
- reduce runner API rate-limit DB writes via in-memory mode option
- tune default polling/reaper intervals and document new env knobs
- add memory-aware HPA metric support
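A per-pod concurrency guard of the kind described above could look roughly like the following. This is a minimal sketch under assumptions, not the code merged in this PR: the class name `LocalConcurrencyGuard`, the helper `dispatchWithGuard`, and the limit value are hypothetical.

```typescript
// Hypothetical sketch of a per-pod concurrency guard; names are illustrative only.
class LocalConcurrencyGuard {
  private active = 0;
  private readonly maxConcurrent: number;

  constructor(maxConcurrent: number) {
    if (!Number.isInteger(maxConcurrent) || maxConcurrent < 1) {
      throw new RangeError("maxConcurrent must be a positive integer");
    }
    this.maxConcurrent = maxConcurrent;
  }

  // Check-and-increment happens synchronously, so on a single-threaded
  // event loop no other dispatch can interleave between the two steps.
  tryAcquire(): boolean {
    if (this.active >= this.maxConcurrent) return false;
    this.active += 1;
    return true;
  }

  release(): void {
    if (this.active > 0) this.active -= 1;
  }

  get inFlight(): number {
    return this.active;
  }
}

// Runs `run` only if a slot is free; returns undefined when the pod is saturated.
async function dispatchWithGuard<T>(
  guard: LocalConcurrencyGuard,
  run: () => Promise<T>,
): Promise<T | undefined> {
  if (!guard.tryAcquire()) return undefined;
  try {
    return await run();
  } finally {
    guard.release(); // always free the slot, even if the run throws
  }
}
```

A caller that gets `undefined` back would leave the run queued for a later poll or for another pod rather than starting an extra browser locally.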
- add bounded env validation for resource-related runtime knobs
- implement adaptive backoff for SSE polling and local run status polling
- lower default device-sync cadence risk by aligning defaults
- optimize runner device sync writes to avoid per-device unchanged upserts
- document new polling interval controls in Helm docs and env example
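To illustrate the two ideas in this commit, here is a minimal sketch of bounded env parsing and an adaptive poll delay. The function names, the bounds, and the doubling policy are assumptions made for the example; the PR text does not state the actual limits or backoff curve.

```typescript
// Hypothetical helper: parse a numeric env value and clamp it to [min, max].
// Missing or non-numeric input falls back to the default instead of failing.
function boundedIntEnv(
  raw: string | undefined,
  fallback: number,
  min: number,
  max: number,
): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isFinite(parsed)) return fallback;
  return Math.min(max, Math.max(min, Math.trunc(parsed)));
}

// Hypothetical adaptive backoff: reset to the fastest cadence when a poll
// observed a change, otherwise double the delay up to a ceiling.
function nextPollDelay(
  currentMs: number,
  sawChange: boolean,
  minMs: number,
  maxMs: number,
): number {
  if (sawChange) return minMs;
  return Math.min(maxMs, Math.max(minMs, currentMs * 2));
}
```

Clamping rather than rejecting out-of-range values keeps a mistyped knob from taking a pod down, at the cost of silently correcting it; logging the clamp would be a reasonable addition.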
- add k6-based runner claim load gate workflow with DB tx/s and RSS checks
- add seed and k6 scripts for repeatable claim-load benchmarking
- add maintainer dependency lifecycle policy and docs index link
- add profile override files for low, standard, and high deployment sizes
- document profile-specific install commands
- include runtime env tuning guidance for each profile
- switch control-plane defaults to a low-cost 1-replica baseline
- reduce default polling and concurrency to lower CPU/memory pressure
- make memory-backed rate limits the default to reduce DB write load
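A memory-backed rate limit trades cross-pod accuracy for zero database writes. The following fixed-window limiter is a sketch of that idea only; the class name, the fixed-window algorithm, and the numbers are assumptions, and the project may use a different scheme.

```typescript
// Hypothetical in-memory fixed-window rate limiter keyed by caller id.
class MemoryRateLimiter {
  private readonly windows = new Map<string, { windowStart: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    if (limit < 1 || windowMs <= 0) {
      throw new RangeError("limit must be >= 1 and windowMs must be > 0");
    }
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed. `nowMs` is injectable for testing.
  allow(key: string, nowMs: number = Date.now()): boolean {
    const entry = this.windows.get(key);
    if (entry === undefined || nowMs - entry.windowStart >= this.windowMs) {
      // First request, or the previous window has expired: start a new window.
      this.windows.set(key, { windowStart: nowMs, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}
```

Because the counters live in process memory, each replica enforces its own limit and the state is lost on restart. With the 1-replica baseline in this commit that is a small compromise; with several replicas the effective limit scales with the replica count.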
- add a browser-run gate script that dispatches real local browser executions
- add a second load-gate workflow job with p95 latency, RSS, and OOM thresholds
- wire a workspace npm script for browser gate execution
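The threshold check in such a gate can be sketched as below. The interfaces, field names, and threshold values are hypothetical, and the nearest-rank percentile is one common definition chosen for the example; the actual gate script may compute p95 differently.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// samples are less than or equal to it.
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1] as number;
}

// Hypothetical measurement and threshold shapes for a load gate.
interface GateMeasurement {
  latenciesMs: readonly number[];
  peakRssMb: number;
  oomKills: number;
}

interface GateThresholds {
  maxP95LatencyMs: number;
  maxRssMb: number;
}

// Returns a list of human-readable failures; an empty list means the gate passes.
function evaluateGate(m: GateMeasurement, t: GateThresholds): string[] {
  const failures: string[] = [];
  const p95 = percentile(m.latenciesMs, 95);
  if (p95 > t.maxP95LatencyMs) {
    failures.push(`p95 latency ${p95}ms exceeds ${t.maxP95LatencyMs}ms`);
  }
  if (m.peakRssMb > t.maxRssMb) {
    failures.push(`peak RSS ${m.peakRssMb}MiB exceeds ${t.maxRssMb}MiB`);
  }
  if (m.oomKills > 0) {
    failures.push(`${m.oomKills} OOM kill(s) observed`);
  }
  return failures;
}
```

A CI job would exit non-zero when the returned list is non-empty, printing each failure so the regression is visible in the workflow log.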
- remove direct browser dispatch from API, MCP, cancellation, and lease-reaper paths
- add browser-runner worker loop and worker-mode guard for dispatcher
- keep CI browser load gate aligned with worker mode and clean related tests
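The split between control plane and worker can be pictured with a guard plus a claim loop, as in this sketch. Only the `SKYTEST_BROWSER_WORKER=true` setting comes from this PR; the function names, the `claimNext`/`execute` callbacks, and the strict `"true"` comparison are assumptions for illustration.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: worker mode is on only when the variable is exactly "true".
function isBrowserWorkerEnabled(env: Env): boolean {
  return env["SKYTEST_BROWSER_WORKER"] === "true";
}

// Guard for code paths that must never launch browsers in control-plane pods.
function assertWorkerMode(env: Env): void {
  if (!isBrowserWorkerEnabled(env)) {
    throw new Error("browser dispatch is only allowed in browser-worker mode");
  }
}

// Hypothetical worker loop: claim queued runs one at a time and execute them.
// `maxIterations` bounds the loop so the sketch terminates; a real worker
// would run until shutdown and sleep between empty polls.
async function runWorkerLoop(
  env: Env,
  claimNext: () => Promise<string | null>,
  execute: (runId: string) => Promise<void>,
  maxIterations: number,
): Promise<number> {
  assertWorkerMode(env);
  let executed = 0;
  for (let i = 0; i < maxIterations; i++) {
    const runId = await claimNext();
    if (runId === null) continue; // nothing queued on this poll
    await execute(runId);
    executed += 1;
  }
  return executed;
}
```

With this shape, the API, MCP, cancellation, and lease-reaper paths only change run state in the database, and the worker loop is the single place where a browser is started.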
- add browserWorker values and deployment template
- disable browser worker mode in control-plane pods explicitly
- update sizing profiles and docs for split control-plane/browser-worker topology
- add integration-style local browser runner cancel SLA assertion
- align device sync transaction test mocks with batch sync implementation
- add active-run non-abort SLA coverage for local browser runner reconciliation
- add device sync no-op skip-path coverage for unchanged recent devices
- make max cancellation poll interval configurable and document env knob
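The no-op skip path for device sync can be sketched as a pure planning step that decides which devices need a write. The field names, the `refreshAfterMs` parameter, and the rule that stale rows are refreshed even when unchanged are assumptions; the PR only states that unchanged recent devices are skipped.

```typescript
// Hypothetical shapes for a reported device and its stored row.
interface DeviceSnapshot {
  id: string;
  name: string;
  state: string;
}

interface StoredDevice extends DeviceSnapshot {
  lastSeenAtMs: number;
}

// A write is needed when the device is new, its fields changed, or the stored
// row is old enough that its last-seen timestamp should be refreshed.
function needsWrite(
  stored: StoredDevice | undefined,
  incoming: DeviceSnapshot,
  nowMs: number,
  refreshAfterMs: number,
): boolean {
  if (stored === undefined) return true;
  const changed = stored.name !== incoming.name || stored.state !== incoming.state;
  const stale = nowMs - stored.lastSeenAtMs >= refreshAfterMs;
  return changed || stale;
}

// Returns only the devices that should be written in the batch.
function planDeviceSync(
  storedById: Map<string, StoredDevice>,
  incoming: readonly DeviceSnapshot[],
  nowMs: number,
  refreshAfterMs: number,
): DeviceSnapshot[] {
  return incoming.filter((d) => needsWrite(storedById.get(d.id), d, nowMs, refreshAfterMs));
}
```

Keeping the decision separate from the database call makes the skip path easy to cover in unit tests without transaction mocks.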
- set control-plane default and low profile to 50m/96Mi requests and 250m/256Mi limits
- keep browser worker at 500m/1Gi requests with 2 CPU / 2Gi limits
Summary
… low, standard, high) plus dedicated browser-worker deployment/HPA controls.
Changes
… Makefile local dev startup.
Validation
npm run verify (pass)
Breaking Changes
… SKYTEST_BROWSER_WORKER=true locally, browser-worker deployment in Helm).
Risks
Follow-ups