Support creating sandbox CRs in a configurable target namespace (multi-tenant isolation) #1795

@prakashmirji

Problem Statement

Problem

The OpenShell gateway manages sandbox CRs only in its own namespace. It tracks sandboxes internally by UUID and requires each sandbox CR to live in the same namespace as the gateway pod. There is no --namespace flag and no multi-namespace watch capability.

This means all agent sandbox pods from all tenants run in a single shared namespace, regardless of organizational boundaries.

Current Behavior

  • Gateway deployed in namespace openshell
  • All sandbox CRs must be created in openshell
  • All agent pods run in openshell
  • No way to isolate tenants at the Kubernetes namespace level

Proposed Design

Desired Behavior

Option A: Gateway watches multiple namespaces

# Gateway configuration
args:
  - --watch-namespaces=project-dev,project-prod,openshell

Option B: Sandbox CR accepts a target namespace

apiVersion: agents.x-k8s.io/v1alpha1
kind: Sandbox
metadata:
  name: my-agent
  namespace: project-dev   # Sandbox CR AND pod created HERE

Option C: Per-tenant gateway instances (documented pattern)

Deploy lightweight gateway instances, each scoped to specific tenant namespaces.

Option B is preferred.
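
Because Sandbox is a namespaced resource, both Option A and Option B would need the gateway's service account to hold permissions on Sandbox CRs in each tenant namespace. A minimal sketch of that grant for project-dev follows. The Role and ServiceAccount names and the `sandboxes` resource plural are assumptions for illustration and are not taken from the OpenShell charts.

```yaml
# Hypothetical sketch: anything marked "assumed" does not come from the issue.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: openshell-sandbox-manager        # assumed name
  namespace: project-dev                 # tenant namespace from the examples above
rules:
  - apiGroups: ["agents.x-k8s.io"]
    resources: ["sandboxes"]             # assumed plural of the Sandbox kind
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: openshell-sandbox-manager        # assumed name
  namespace: project-dev
subjects:
  - kind: ServiceAccount
    name: openshell-gateway              # assumed service account name
    namespace: openshell                 # gateway namespace from the issue
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: openshell-sandbox-manager
```

A per-namespace Role plus RoleBinding keeps the gateway's reach limited to namespaces a tenant has opted into, rather than granting it a cluster-wide ClusterRole. If the gateway also creates pods directly, it would need matching rules for those resources.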

Alternatives Considered

Why This Matters

| Requirement | Current (single NS) | With Multi-NS |
|---|---|---|
| Kubernetes RBAC per tenant | Not possible | Tenants manage their own NS |
| NetworkPolicy isolation | All agents share NS | Per-namespace NetworkPolicy |
| Resource quotas per team | Single quota for all | Per-namespace ResourceQuota |
| Istio PeerAuthentication | Shared identity | Per-namespace mTLS policies |
| Audit attribution | All in one NS | Clear namespace ownership |
| Compliance (data residency) | Cannot segregate | Namespace-level controls |
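
To make the quota and NetworkPolicy rows concrete, here is a sketch of the per-tenant controls that become possible once sandbox pods run in project-dev. The object names and quota values are illustrative assumptions. The rule admitting traffic from the openshell namespace assumes the gateway needs to reach sandbox pods, which the issue does not state.

```yaml
# Hypothetical sketch: names and limits are illustrative, not from the issue.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: agent-quota                      # assumed name
  namespace: project-dev
spec:
  hard:
    pods: "20"                           # assumed limits for one team
    requests.cpu: "8"
    requests.memory: 16Gi
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-ingress                 # assumed name
  namespace: project-dev
spec:
  podSelector: {}                        # applies to every pod in project-dev
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector: {}                # pods in the same namespace
        - namespaceSelector:             # assumed: gateway must reach sandboxes
            matchLabels:
              kubernetes.io/metadata.name: openshell
```

Neither object can be scoped to one tenant today: a ResourceQuota in openshell would cap all tenants together, and a NetworkPolicy there would have to separate tenants by pod labels instead of by namespace.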

Use Case

An enterprise platform with multiple teams/tenants deploying agents. Each team has its own Kubernetes namespace with RBAC, NetworkPolicy, ResourceQuota, and audit requirements. Currently all of their agents are forced into a single shared namespace, which breaks existing organizational security controls.

Workaround

  • All sandboxes run in the gateway namespace (openshell)
  • Logical tenant isolation via a custom AgentClient CRD (controls which client IDs can call which agents)
  • Auth enforcement via Istio AuthorizationPolicy with label selectors on agent pods
  • No true namespace-level isolation between tenants' workloads
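
For reference, the label-selector enforcement in this workaround might look roughly like the sketch below. The `tenant` label, the policy name, and the client principal are assumptions, and the AgentClient CRD is omitted because its schema is not shown in this issue. Clusters on older Istio releases may need `security.istio.io/v1beta1` instead.

```yaml
# Hypothetical sketch: label, policy name, and principal are assumptions.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: team-a-agents                    # assumed name
  namespace: openshell                   # all sandboxes share the gateway namespace
spec:
  selector:
    matchLabels:
      tenant: team-a                     # assumed label on team A's agent pods
  action: ALLOW
  rules:
    - from:
        - source:
            # assumed client identity; matching on principals requires mTLS
            principals: ["cluster.local/ns/team-a-clients/sa/team-a-client"]
```

This isolates tenants only as far as the pod labels can be trusted. Anyone able to label pods in openshell can move a workload between tenants, which is the gap that namespace-level isolation would close.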

Agent Investigation

No response

Checklist

  • I've reviewed existing issues and the architecture docs
  • This is a design proposal, not a "please build this" request

Metadata

    Labels

    state:triage-needed: Opened without agent diagnostics and needs triage
