
cloud-codex runtime + LiteLLM-shared auth surface (ADR-014) #370

Merged
samxu01 merged 2 commits into main from sprint/cody-via-litellm
May 15, 2026


Contributor

@samxu01 samxu01 commented May 15, 2026

Summary

  • New cloud-codex k8s runtime (per-agent Deployment + PVC) running codex CLI in-cluster; first agent: Cody.
  • Codex CLI routes through LiteLLM via ~/.codex/config.toml (model_provider = litellm) — same OAuth chain as openclaw moltbot.
  • LiteLLM pod gains codex-cli sidecar + chatgpt-auth PVC so operators device-auth from inside the cluster (resolves cluster-IP-bound OAuth).
  • ADR-014 captures the runtime ≠ auth-surface invariant; CLAUDE.md gets 2 quick-rule bullets.

Test plan

  • Cody pod running and replying to mentions via codex CLI → LiteLLM → ChatGPT (verified 2026-05-15)
  • Rotator reads pod-side auth-N.json (nested codex CLI shape) — accounts 1 + 2 in rotation
  • PVC survives helm upgrades (strategy.type: Recreate)
  • Skills docs updated (separate commit in commonly-skills)
  • Reviewer: confirm ADR-014 wording matches the shipped behavior

🤖 Generated with Claude Code

samxu01 and others added 2 commits May 14, 2026 23:02
…pt.com

Multi-runtime ≠ multi-auth-surface. Codex CLI's runtime distinction
(sandbox, tool use, sessions) is independent from where its HTTPS calls
go. Point codex CLI at LiteLLM instead of chatgpt.com so:

- single auth surface across openclaw and codex runtimes
- one rotator, one cluster-bound auth.json (already established by PR #365)
- per-agent codex login --device-auth no longer needed
- per-agent /state/.codex/auth.json no longer needed
- shared quota pool across all agents
- LiteLLM observability captures all model traffic regardless of runtime

What changes:
- Boot script seeds ~/.codex/config.toml with model_provider=litellm,
  base_url pointing at LiteLLM service, wire_api=responses (matches the
  chatgpt/ bridge's Responses-API shape), env_key=LITELLM_API_KEY.
- LITELLM_API_KEY exported from a k8s Secret (cloud-codex-<name>-litellm-key,
  optional so the pod can boot before the key exists; warning logged
  if missing).
- Drops the "wait for /state/.codex/auth.json" gate — no longer needed
  since codex CLI no longer holds its own auth.
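
The config-seeding step described above could look roughly like the following. This is a minimal sketch, not the shipped boot script: the function name `seed_codex_config`, the provider display name, and the wording of the warning are assumptions, and the base URL is passed in because the commit does not state the LiteLLM service address.

```shell
#!/bin/sh
# Sketch only: seed_codex_config is a hypothetical helper, not the real boot script.

seed_codex_config() {
  # $1: directory to write into (normally "$HOME/.codex")
  # $2: LiteLLM base URL (assumed; the real service address is not given here)
  config_dir=$1
  base_url=$2

  mkdir -p "$config_dir" || return 1

  # Point codex CLI at LiteLLM using the Responses wire API.
  cat > "$config_dir/config.toml" <<EOF
model_provider = "litellm"

[model_providers.litellm]
name = "LiteLLM"
base_url = "$base_url"
wire_api = "responses"
env_key = "LITELLM_API_KEY"
EOF

  # The key Secret is optional, so warn instead of aborting the boot.
  if [ -z "${LITELLM_API_KEY:-}" ]; then
    echo "warning: LITELLM_API_KEY is not set; model calls will fail until the Secret exists" >&2
  fi
}
```

A boot script would call it as `seed_codex_config "$HOME/.codex" "$LITELLM_BASE_URL"`, where `LITELLM_BASE_URL` is a hypothetical variable. Warning rather than exiting matches the "optional" Secret behavior: the pod can start before the key is minted.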

Operator setup (per agent):
  1. POST /api/registry/install (cloud-codex/<name>)
  2. Mint AgentInstallation runtime token → secret cloud-codex-<name>-token
  3. Mint LiteLLM virtual key → secret cloud-codex-<name>-litellm-key
  4. helm upgrade — pod boots, no device-auth needed
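
Steps 2 and 3 both end in a Secret whose name is derived from the agent name. The sketch below derives those names and prints, without running, the `kubectl` commands an operator might use. The helper names, the in-Secret key names (`token`, `LITELLM_API_KEY`), and the `$RUNTIME_TOKEN` / `$LITELLM_VIRTUAL_KEY` placeholders are assumptions for illustration; only the Secret name patterns come from the commit message.

```shell
#!/bin/sh
# Sketch only: helper names and in-Secret key names are hypothetical.

token_secret_name() {
  # $1: agent name -> Secret holding the AgentInstallation runtime token
  printf 'cloud-codex-%s-token\n' "$1"
}

litellm_key_secret_name() {
  # $1: agent name -> Secret holding the LiteLLM virtual key
  printf 'cloud-codex-%s-litellm-key\n' "$1"
}

print_secret_commands() {
  # Dry run: echo the commands so the operator can review them first.
  agent=$1
  echo "kubectl create secret generic $(token_secret_name "$agent") --from-literal=token=\$RUNTIME_TOKEN"
  echo "kubectl create secret generic $(litellm_key_secret_name "$agent") --from-literal=LITELLM_API_KEY=\$LITELLM_VIRTUAL_KEY"
}
```

Running `print_secret_commands cody` lists both commands for an agent named `cody` (the lowercase name is assumed). The token and key values still have to be minted out of band as described in steps 2 and 3.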

The cloud-codex pod's PVC still holds /state/.commonly/tokens/<name>.json
(commonly agent run loop's CAP token); only the codex auth.json went away.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… quick-rules

ADR-014 captures the runtime ≠ auth-surface separation: cloud-codex agents
run codex CLI in-cluster but proxy through LiteLLM, which is the single
ChatGPT-OAuth holder for the cluster. Pod-side device-auth via the new
codex-cli sidecar resolves the server-side IP-binding constraint that
killed laptop-uploaded tokens.

CLAUDE.md Agent Runtime quick-rules: 2 bullets (cloud-codex registration
+ "never device-auth elsewhere") so future sessions land in the right
mental model without needing the ADR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>