🚀 Announcing Agent Sandbox v0.5.0!
We're excited to announce the release of Agent Sandbox v0.5.0! This release marks a significant milestone with the official graduation of our APIs to v1beta1, bringing enhanced stability, critical security hardening, and a wealth of new features and improvements across the platform, client SDKs, and examples. Dive in to experience a more robust and developer-friendly Agent Sandbox.
⚠️ Breaking Changes / Action Required
- API Group Upgrade and Deprecation (`v1alpha1` to `v1beta1`):
  - The core and extension APIs (`agents.x-k8s.io` and `extensions.agents.x-k8s.io`) have been officially graduated from `v1alpha1` to `v1beta1`.
  - `v1alpha1` APIs are now deprecated. While multi-version CRD support is introduced with a conversion webhook, users are strongly encouraged to migrate their `v1alpha1` resources to `v1beta1`.
  - Action Required: Update your manifests and API interactions to use `apiVersion: agents.x-k8s.io/v1beta1` and `apiVersion: extensions.agents.x-k8s.io/v1beta1`. Refer to the API Migration Guide for detailed steps.
- Sandbox `spec.replicas` Removed, `spec.operatingMode` Introduced:
  - The `spec.replicas` field has been removed from the Sandbox API and replaced with `spec.operatingMode` (with values `Running` and `Suspended`).
  - This is a breaking change for any automation or tools that relied on `spec.replicas` for scaling (e.g., `kubectl scale`, HorizontalPodAutoscalers, PodDisruptionBudgets).
  - Action Required: Update your Sandbox manifests and any scaling logic to use `spec.operatingMode` for managing Sandbox lifecycle.
- SandboxClaim `spec.templateRef` Replaced by `spec.warmpoolRef`:
  - The `SandboxClaim` API no longer uses `spec.templateRef` or the `warmpool` policy field. Instead, claims must explicitly point to a `SandboxWarmPool` using `spec.warmpoolRef`.
  - To achieve a cold start without pre-warming, cluster administrators should create a `SandboxWarmPool` with `replicas: 0` for users to reference.
  - Action Required: Update `SandboxClaim` manifests to reference `spec.warmpoolRef` pointing to an existing `SandboxWarmPool` resource.
- NetworkPolicy Namespace Restriction for `sandbox-router`:
  - The default `NetworkPolicy` generated by the `SandboxTemplate` controller now strictly scopes ingress rules to the `agent-sandbox-system` namespace for the `sandbox-router`.
  - Action Required: If your deployments are running the `sandbox-router` in a namespace other than `agent-sandbox-system`, you must migrate and deploy the `sandbox-router` inside `agent-sandbox-system` prior to or in tandem with upgrading the controller to avoid service interruption.
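The manifest changes above can be sketched as a small Python helper that rewrites an already-parsed manifest (a plain dict). This is an illustration under stated assumptions, not the procedure from the API Migration Guide: the mapping of `replicas: 0` to `Suspended` (and any other value to `Running`), the `{"name": ...}` shape of `spec.warmpoolRef`, the location of the old policy field at `spec.warmpool`, and the function name `migrate_manifest` are all hypothetical.

```python
import copy

# API groups named in the release notes.
MIGRATED_GROUPS = ("agents.x-k8s.io", "extensions.agents.x-k8s.io")


def migrate_manifest(manifest, warmpool_name=None):
    """Return a copy of a v1alpha1 manifest rewritten for v1beta1.

    Assumptions (not taken from the API Migration Guide):
      * spec.replicas == 0 maps to "Suspended", anything else to "Running".
      * spec.warmpoolRef has the shape {"name": <SandboxWarmPool name>}.
      * the old warm pool policy field lives at spec.warmpool.
    """
    out = copy.deepcopy(manifest)

    # Bump the version only for the two graduated API groups.
    group, _, version = out.get("apiVersion", "").partition("/")
    if group in MIGRATED_GROUPS and version == "v1alpha1":
        out["apiVersion"] = f"{group}/v1beta1"

    spec = out.get("spec")
    if not isinstance(spec, dict):
        return out

    kind = out.get("kind")
    if kind == "Sandbox" and "replicas" in spec:
        replicas = spec.pop("replicas")
        spec["operatingMode"] = "Suspended" if replicas == 0 else "Running"

    if kind == "SandboxClaim" and "templateRef" in spec:
        if warmpool_name is None:
            raise ValueError(
                "a SandboxWarmPool name is required to replace spec.templateRef"
            )
        spec.pop("templateRef")
        spec.pop("warmpool", None)  # assumed location of the old policy field
        spec["warmpoolRef"] = {"name": warmpool_name}

    return out
```

The helper never guesses a warm pool: a claim that still carries `spec.templateRef` raises an error unless the caller names the `SandboxWarmPool` to reference, which mirrors the requirement that claims point to an existing pool.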
Key Highlights
Core API & Platform Stability
- API Graduation & Multi-Version Support: Official graduation of core and extension APIs to `v1beta1`, including multi-version CRD support with conversion webhooks for `v1alpha1` compatibility during migration (#817, #993).
- Sandbox Lifecycle Management: Replaced `spec.replicas` with `spec.operatingMode` for more explicit control over Sandbox suspension and resume behavior (#801).
- SandboxClaim Enhancements: `SandboxClaim` now uses `spec.warmpoolRef` for clearer warm pool association and gained printer columns for improved `kubectl get` visibility (#899, #984).
- Optimized Warm Pool Operations: Enabled parallel creation and deletion of sandboxes within the `SandboxWarmPool` controller, significantly speeding up scale operations (#798).
- Improved Warm Pool Selection Strategy: Implemented a smart warm pool selection strategy that prioritizes ready sandboxes, spreads workloads across nodes, and optimizes for in-memory processing, reducing API overhead (#878, #939).
- Resource Adoption & Persistence: Fixed orphan adoption for Sandbox child resources and introduced explicit authorization for unowned resources to prevent hijacking (#944, #784).
- Performance Improvements: Switched `SandboxClaim` status updates to patching (`.Patch()`) to reduce conflicts at scale, improving overall system performance (#508).
- Helm Chart Enhancements: Added support for `podSecurityContext`, `containerSecurityContext`, `podAnnotations`, and `podLabels` in the controller Helm chart for better Kubernetes policy compliance and custom metadata injection (#753, #750).
- Storage Configuration via SandboxClaim: Introduced support for volume claim templates within SandboxClaims, enabling customized persistent volumes with policy-driven merging (#960).
- Warmpool Label Propagation: Enhanced warmpool label propagation from sandbox to pod, ensuring consistent identification across resources (#927).
- Preserve Zero Replica Counts: Fixed an issue where zero replica counts in warmpool status were not preserved during server-side apply operations (#807).
- Assigned Sandbox Name Storage: Switched to storing assigned Sandbox names in annotations instead of labels to bypass Kubernetes length constraints (#771).
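The warm pool selection strategy listed above (prefer ready sandboxes, then spread across nodes, working from data already in memory) can be illustrated with a short Python sketch. It is not the controller's implementation: the `Candidate` type, the `pick_sandbox` function, and the tie-break on name are hypothetical, chosen only to show the ordering idea.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical in-memory view of one warm pool sandbox."""
    name: str
    node: str
    ready: bool


def pick_sandbox(candidates, assigned_nodes):
    """Pick one candidate: ready first, then least-loaded node, then name.

    assigned_nodes is an iterable of node names, one entry per sandbox that
    has already been handed out (a hypothetical input).
    """
    if not candidates:
        return None
    load = Counter(assigned_nodes)  # missing nodes count as 0
    # False sorts before True, so ready candidates win the first key.
    return min(candidates, key=lambda c: (not c.ready, load[c.node], c.name))
```

Because the ranking is a pure function of data already held in memory, choosing a sandbox needs no extra API round trip, which is the kind of overhead reduction the release note describes.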
Security & Hardening
- SSRF Protections: Disabled automatic HTTP redirects in both Go and Python SDKs to prevent Server-Side Request Forgery (SSRF) vulnerabilities from untrusted sandbox workloads (#874, #816).
- Router Security: Addressed an unauthenticated internal proxy vulnerability in the sandbox router with strict input validation and optional bearer token authentication (#755).
- Network Policy Enhancements: The default `NetworkPolicy` now blocks IPv6 link-local traffic and strictly scopes ingress to the `agent-sandbox-system` namespace for the `sandbox-router` for enhanced isolation (#827, #881).
- Build-time Injection Prevention: Sanitized git-derived version strings to prevent build-time command injection vulnerabilities (#946).
- Denial of Service (ReDoS) Fix: Replaced a vulnerable regex matching function with an iterative dynamic programming approach to resolve a ReDoS vulnerability (#935).
- Pod Metadata Protection: Protected system-reserved Pod labels and annotations from tenant override to prevent traffic hijacking or tracking label forging (#894).
- Warm Pool Poisoning Prevention: The `isAdoptable` function now explicitly rejects unowned sandboxes to prevent warm pool poisoning (#875).
- OpenTelemetry Trace Sanitization: Sanitized the `sandbox.command` attribute in OpenTelemetry traces to prevent sensitive data exposure (#895).
- CLI Tool Hardening: Fixed concurrency race conditions and stale PID cleanup issues in the `resourcectl` CLI utility, preventing data loss and arbitrary process termination (#934, #902).
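To illustrate the ReDoS fix listed above, here is a minimal Python sketch of matching with iterative dynamic programming instead of a backtracking regex. The release notes do not say what kind of patterns the replaced function handled, so the glob-style syntax (`*` for any run of characters, `?` for exactly one) and the name `wildcard_match` are assumptions made for the example.

```python
def wildcard_match(pattern, text):
    """Return True if text matches pattern ('*' = any run, '?' = one char).

    Runs in O(len(pattern) * len(text)) time with no backtracking, so a
    crafted input cannot trigger exponential work.
    """
    # prev[j] is True when the pattern consumed so far matches text[:j].
    prev = [True] + [False] * len(text)
    for p in pattern:
        cur = [False] * (len(text) + 1)
        # Only '*' can match the empty prefix of the text.
        cur[0] = prev[0] and p == "*"
        for j in range(1, len(text) + 1):
            if p == "*":
                # '*' matches nothing (prev[j]) or absorbs text[j-1] (cur[j-1]).
                cur[j] = prev[j] or cur[j - 1]
            elif p == "?" or p == text[j - 1]:
                cur[j] = prev[j - 1]
        prev = cur
    return prev[len(text)]
```

The work is bounded by the table size, so a pattern with many wildcards against a long non-matching input finishes in time proportional to the product of the two lengths.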
Client SDK & Developer Experience
- Dynamic Timeout Propagation: SDKs now support dynamic timeout propagation to the sandbox router, ensuring long-running operations are not prematurely terminated (#857).
- Python Async Client Cleanup: Added `cleanup=True` support to `AsyncSandboxClient` for automatic resource cleanup on program termination (#859).
- Python `additionalPodMetadata` Exposure: Exposed `additionalPodMetadata` in the Python client for direct control over Sandbox Pod labels and annotations (#979).
- Go Client PodIP Routing: Enabled PodIP routing in the Go client to resolve connection issues when Kubernetes DNS is unavailable for sandbox services (#910).
- Sandbox Client Improvements: Hardened filesystem path sanitization, improved label selectors, and enabled template-verified reattachment in the Python SDK (#695).
- PSS SDK Enhancements: Enabled restoration from dedicated snapshots and filtering by creation timestamp for the Python Snapshot SDK (#799, #732).
- CI/CD & Tooling: Optimized CI staging builds, increased promotion timeouts, and updated the `pyyaml` dependency for CRD sorting during release publish (#1021). Improved AI code review configuration and guidelines for Copilot and CodeRabbit (#938, #936, #947, #866).
Examples & Documentation
- RL & Evals Example: Introduced `agent-sandbox-rl`, a complete Python package for multi-cluster warm-pool orchestration of RL and Evals workloads (#1000).
- Anthropic Agents Example: Added an example for running Anthropic Managed Agents self-hosted sandboxes on GKE Agent Sandbox (#950).
- Sandboxed Tools Enhancements: Improved `sandboxed-tools` examples to persist sessions and filesystem state across multiple tool calls and refactored tools into their own package (#888, #887, #877, #886).
- MCP Server Example: Provided an example for running an MCP server inside a sandbox with persistent storage (#937).
- AKS Kata Container Example: Added an AKS example demonstrating Kata Containers with sandbox warm pools (#839).
- Ray Integration: Documented an example of how to run a RayJob with Agent Sandbox via direct PodIP (#868, #742).
- Comprehensive Troubleshooting Guide: Added a detailed troubleshooting guide for debugging SDK, custom image, and cluster-level issues (#660).
- API & NetworkPolicy Documentation: Updated documentation to reflect `v1beta1` API changes, clarified NodeLocal DNS walkthrough, and expanded NetworkPolicy guidance (#867, #823, #815).
- Issue Templates: Added structured GitHub issue templates for bug reports, feature requests, and epics, and improved their ordering (#880, #891).
Installation
Core & Extensions
# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.5.0/manifest.yaml
# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.5.0/extensions.yaml
To upgrade from v0.4.6 to v0.5.0, please follow the detailed steps in the API Migration Guide.
Python SDK
pip install k8s-agent-sandbox==0.5.0
Contributors
We extend our sincere thanks to all contributors to this release:
@AlexBulankou, @ArthurKamalov, @HasonoCell, @RidPra, @SHRUTI6991, @aditya-shantanu, @aleks-stefanovic, @alimx07, @app/dependabot, @armistcxy, @arpitjain099, @chw120, @hrsh1209, @ianchakeres, @inardini, @janetkuo, @justinsb, @kannon92, @lauragalbraith, @mesutoezdil, @moficodes, @mvanhorn, @patcrombie, @rainwoodman, @rmalani-nv, @ryanzhang-oss, @sairajp-rewind, @shaikenov, @shelwinnn, @shrutiyam-glitch, @sohanpatil, @tom1299, @tomergee, @vicentefb, @volatilemolotov, @zhzhuang-zju
New Contributors
- @lauragalbraith made their first contribution in #763
- @shelwinnn made their first contribution in #805
- @arpitjain099 made their first contribution in #796
- @patcrombie made their first contribution in #803
- @rainwoodman made their first contribution in #711
- @rmalani-nv made their first contribution in #750
- @shaikenov made their first contribution in #798
- @tom1299 made their first contribution in #845
- @armistcxy made their first contribution in #885
- @ianchakeres made their first contribution in #906
- @hrsh1209 made their first contribution in #753
- @AlexBulankou made their first contribution in #866
- @ryanzhang-oss made their first contribution in #839
- @mvanhorn made their first contribution in #864
- @kannon92 made their first contribution in #947
- @RidPra made their first contribution in #859
- @sairajp-rewind made their first contribution in #957
- @HasonoCell made their first contribution in #857
- @mesutoezdil made their first contribution in #978
- @inardini made their first contribution in #950
- @zhzhuang-zju made their first contribution in #974
Full Changelog: v0.4.6...v0.5.0