
Support Codex subscription auth in Daytona#296

Merged
xdotli merged 3 commits into main from codex/codex-subscription-auth-daytona
May 20, 2026

@xdotli xdotli commented May 20, 2026

Summary

  • auto-inherit Codex auth env vars and support CODEX_ACCESS_TOKEN as a native Codex subscription/access-token path
  • normalize CODEX_API_KEY to OPENAI_API_KEY for native codex-acp runs so the existing auth.json writer still works
  • update Codex docs, conformance/integration credential checks, and example scripts
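
The alias normalization in the second bullet can be pictured with a minimal sketch. The function name `normalize_codex_env` and its `native` flag are hypothetical and are not BenchFlow's actual helper; the rule that an explicitly set OPENAI_API_KEY wins over the alias is also an assumption made for illustration.

```python
def normalize_codex_env(env: dict[str, str], native: bool) -> dict[str, str]:
    """Return a copy of env with CODEX_API_KEY bridged to OPENAI_API_KEY.

    Hypothetical helper: the name, the `native` flag, and the precedence of an
    explicit OPENAI_API_KEY are assumptions, not the project's real code.
    """
    out = dict(env)
    # Only bridge the alias for native Codex runs, and never overwrite
    # an OPENAI_API_KEY the caller already provided.
    if native and out.get("CODEX_API_KEY") and not out.get("OPENAI_API_KEY"):
        out["OPENAI_API_KEY"] = out["CODEX_API_KEY"]
    return out


if __name__ == "__main__":
    print(normalize_codex_env({"CODEX_API_KEY": "sk-example"}, native=True))
    print(normalize_codex_env({"CODEX_API_KEY": "sk-example"}, native=False))
```

Returning a copy rather than mutating the input keeps the caller's environment mapping unchanged, which makes the behavior easier to test.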

Validation

  • uv run python -m pytest tests/test_subscription_auth.py tests/test_resolve_env_helpers.py tests/test_sdk_internals.py::TestResolveAgentEnv tests/test_agent_registry.py tests/test_provider_runtime.py -q
  • uv run ruff check src/benchflow/agents/env.py tests/test_subscription_auth.py tests/test_resolve_env_helpers.py tests/test_sdk_internals.py tests/conformance/run_conformance.py
  • uv run ty check src/benchflow/agents/env.py
  • bash -n tests/integration/run.sh tests/examples/test_codex.sh tests/examples/test_codex_custom_provider.sh

Live E2E

Ran with OPENAI_API_KEY, CODEX_API_KEY, and CODEX_ACCESS_TOKEN unset so auth had to come from local ~/.codex/auth.json uploaded into Daytona.

  • tests/examples/hello-world-task on Daytona + codex-acp + gpt-5.4-mini: reward 1.0, 1 tool call
  • SkillsBench jax-computing-basics on Daytona + codex-acp + gpt-5.4-mini: reward 1.0, 15 tool calls
  • local evidence: jobs/codex-daytona-auth-real/2026-05-19__23-20-31/jax-computing-basics__5248f3f2/result.json

Note

Medium Risk
Updates credential resolution for codex-acp, which can affect whether runs authenticate via API key, access token, or host login—especially when using custom OpenAI-compatible endpoints. Changes are well-covered by new tests and docs, but auth-path regressions could block Codex runs.

Overview
Adds first-class Codex subscription/access-token support by auto-inheriting CODEX_ACCESS_TOKEN/CODEX_API_KEY, treating CODEX_ACCESS_TOKEN as sufficient auth for native OpenAI Codex runs, and normalizing CODEX_API_KEY to OPENAI_API_KEY so existing ~/.codex/auth.json writing continues to work.

Tightens auth fallback rules so host login / subscription auth and access tokens are not accepted when Codex is routed to a custom OpenAI-compatible base URL, and updates conformance/integration scripts, example scripts, and docs to reflect the expanded Codex credential options.
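
The fallback rules described in this overview can be sketched as a small decision function. This is an illustrative sketch, not BenchFlow's code: `pick_auth_path`, its parameters, and the returned labels are hypothetical, and placing the API key ahead of the access token is an assumption (the PR only indicates that CODEX_ACCESS_TOKEN takes precedence over host login).

```python
def pick_auth_path(env: dict[str, str], custom_endpoint: bool, host_login: bool) -> str:
    """Pick an auth path label for a Codex run (hypothetical decision order)."""
    if env.get("OPENAI_API_KEY"):
        # A provider API key is usable for native and custom endpoints alike.
        return "api_key"
    if custom_endpoint:
        # Access tokens and host login are not accepted for custom base URLs.
        return "missing"
    if env.get("CODEX_ACCESS_TOKEN"):
        return "access_token"
    if host_login:
        # For example, a local ~/.codex/auth.json from a prior login.
        return "host_login"
    return "missing"


if __name__ == "__main__":
    print(pick_auth_path({"CODEX_ACCESS_TOKEN": "tok"}, custom_endpoint=False, host_login=True))
    print(pick_auth_path({"CODEX_ACCESS_TOKEN": "tok"}, custom_endpoint=True, host_login=True))
```

Checking `custom_endpoint` before the token and host-login branches is what keeps subscription-style credentials from being sent to a non-OpenAI server.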

Reviewed by Cursor Bugbot for commit 34f3044. Bugbot is set up for automated code reviews on this repo.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 90aedd1477

Comment on lines +130 to +132
from benchflow.agents.providers import find_provider

return find_provider(model) is None

P1: Restrict CODEX_ACCESS_TOKEN auth to true OpenAI endpoints

This native-context check treats every unregistered model prefix as OpenAI-native, so CODEX_ACCESS_TOKEN now bypasses the OPENAI_API_KEY requirement even when Codex is routed to a custom endpoint via OPENAI_BASE_URL/BENCHFLOW_PROVIDER_BASE_URL with an unprefixed model (for example local OpenAI-compatible stubs). In that setup, BenchFlow accepts auth that is not a provider API key and skips the intended validation path, which can break custom-provider runs or send the wrong bearer token to non-OpenAI endpoints. The new regression test only covers prefixed providers (vllm/...), so this unprefixed custom-endpoint path remains unguarded.
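
One way to picture the stricter check this comment asks for is the sketch below. It is an assumption-laden illustration rather than the fix that was merged: `is_native_openai` and `known_prefixes` are hypothetical stand-ins (the latter for whatever `find_provider` consults), and treating any set base-URL variable as "custom" is a simplification, since a base URL could still point at OpenAI itself.

```python
from collections.abc import Mapping

# Env var names are taken from the review comment above.
BASE_URL_VARS = ("OPENAI_BASE_URL", "BENCHFLOW_PROVIDER_BASE_URL")


def is_native_openai(
    model: str, env: Mapping[str, str], known_prefixes: frozenset[str]
) -> bool:
    """Hypothetical check: native only if no provider prefix AND no base URL override."""
    prefix = model.split("/", 1)[0] if "/" in model else ""
    if prefix in known_prefixes:
        # A registered provider prefix such as "vllm/..." is never native.
        return False
    # An unprefixed model is still non-native when routed via a base URL.
    return not any(env.get(var) for var in BASE_URL_VARS)


if __name__ == "__main__":
    prefixes = frozenset({"vllm"})
    print(is_native_openai("gpt-5.4-mini", {}, prefixes))
    print(is_native_openai("local-model", {"OPENAI_BASE_URL": "http://localhost:8000/v1"}, prefixes))
```

The second call covers the unprefixed custom-endpoint case that the comment says the prefixed (`vllm/...`) regression test does not exercise.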

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 6 potential issues.

Comment thread tests/test_subscription_auth.py Outdated
        assert result["_BENCHFLOW_SUBSCRIPTION_AUTH"] == "1"

    def test_codex_access_token_auth(self, monkeypatch, tmp_path):
        """Guards PR #295: Blocks-style Codex auth via CODEX_ACCESS_TOKEN."""

🟡 Regression test docstrings reference non-existent PR #295 instead of current PR #296

AGENTS.md mandates: "Regression tests must name the PR/commit they guard in the docstring." All five new regression tests reference "PR #295" but this is PR #296, and no PR #295 appears anywhere in the repository's git history (the most recent PR is #294). This breaks the traceability the convention is designed to provide. Affected locations: tests/test_subscription_auth.py:179, tests/test_subscription_auth.py:195, tests/test_subscription_auth.py:214, tests/test_resolve_env_helpers.py:383, tests/test_resolve_env_helpers.py:400.

Suggested change
-        """Guards PR #295: Blocks-style Codex auth via CODEX_ACCESS_TOKEN."""
+        """Guards PR #296: Blocks-style Codex auth via CODEX_ACCESS_TOKEN."""

Comment thread tests/test_subscription_auth.py Outdated
        assert "_BENCHFLOW_SUBSCRIPTION_AUTH" not in result

    def test_codex_api_key_auth_alias(self, monkeypatch, tmp_path):
        """Guards PR #295: CODEX_API_KEY works for native Codex auth."""

🟡 Regression test docstring references non-existent PR #295 instead of PR #296

Same AGENTS.md convention violation as the other four new regression tests — references "PR #295" which doesn't exist in repo history. Should reference current PR #296.

Suggested change
-        """Guards PR #295: CODEX_API_KEY works for native Codex auth."""
+        """Guards PR #296: CODEX_API_KEY works for native Codex auth."""

Comment thread tests/test_subscription_auth.py Outdated
    def test_codex_access_token_does_not_auth_custom_provider(
        self, monkeypatch, tmp_path
    ):
        """Guards PR #295: access tokens are not proxy API keys."""

🟡 Regression test docstring references non-existent PR #295 instead of PR #296

Same AGENTS.md convention violation — references "PR #295" which doesn't exist in repo history. Should reference current PR #296.

Suggested change
-        """Guards PR #295: access tokens are not proxy API keys."""
+        """Guards PR #296: access tokens are not proxy API keys."""

Comment thread tests/test_resolve_env_helpers.py Outdated
    def test_no_model_codex_access_token_wins_over_host_auth(
        self, monkeypatch, tmp_path
    ):
        """Guards PR #295: CODEX_ACCESS_TOKEN is already usable auth."""

🟡 Regression test docstring references non-existent PR #295 instead of PR #296

Same AGENTS.md convention violation — references "PR #295" which doesn't exist in repo history. Should reference current PR #296.

Suggested change
-        """Guards PR #295: CODEX_ACCESS_TOKEN is already usable auth."""
+        """Guards PR #296: CODEX_ACCESS_TOKEN is already usable auth."""

Comment thread tests/test_resolve_env_helpers.py Outdated
        assert "_BENCHFLOW_SUBSCRIPTION_AUTH" not in result

    def test_no_model_codex_api_key_alias_normalizes(self, monkeypatch, tmp_path):
        """Guards PR #295: CODEX_API_KEY is Codex-native API-key auth."""

🟡 Regression test docstring references non-existent PR #295 instead of PR #296

Same AGENTS.md convention violation — references "PR #295" which doesn't exist in repo history. Should reference current PR #296.

Suggested change
-        """Guards PR #295: CODEX_API_KEY is Codex-native API-key auth."""
+        """Guards PR #296: CODEX_API_KEY is Codex-native API-key auth."""

Comment thread tests/test_subscription_auth.py Outdated
Comment on lines +215 to +216
        for k in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY"):
            monkeypatch.delenv(k, raising=False)

🟡 Test does not clear CODEX_API_KEY from environment, unlike all neighboring tests

test_codex_access_token_does_not_auth_custom_provider clears only OPENAI_API_KEY and ANTHROPIC_API_KEY from the environment but does NOT clear CODEX_API_KEY. Every other Codex auth test in the same class (e.g. test_codex_subscription_auth at lines 159-165, test_codex_access_token_auth at line 180, test_codex_api_key_auth_alias at line 196) consistently clears all Codex-related env vars. If a CI runner has CODEX_API_KEY set in its environment, auto_inherit_env will copy it into agent_env. While the test still passes today (because _normalize_codex_auth_env correctly refuses to bridge CODEX_API_KEY to OPENAI_API_KEY for vllm-prefixed models), the incomplete cleanup makes the test fragile and inconsistent with the established pattern.

Suggested change
-        for k in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY"):
-            monkeypatch.delenv(k, raising=False)
+        for k in ("CODEX_ACCESS_TOKEN", "CODEX_API_KEY", "OPENAI_API_KEY", "ANTHROPIC_API_KEY"):
+            monkeypatch.delenv(k, raising=False)
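
To keep this cleanup from drifting between tests, the variable list could live in one place. The sketch below is only an illustration using the standard library instead of pytest's `monkeypatch`; the names `AUTH_ENV_VARS` and `clean_auth_env` are hypothetical and do not appear in the PR.

```python
import os
from contextlib import contextmanager
from unittest import mock

# The four names come from the suggested change above.
AUTH_ENV_VARS = (
    "CODEX_ACCESS_TOKEN",
    "CODEX_API_KEY",
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
)


@contextmanager
def clean_auth_env():
    """Temporarily remove every auth variable; os.environ is restored on exit."""
    # patch.dict snapshots os.environ and restores the snapshot afterwards.
    with mock.patch.dict(os.environ):
        for name in AUTH_ENV_VARS:
            os.environ.pop(name, None)
        yield


if __name__ == "__main__":
    os.environ["CODEX_API_KEY"] = "leaked-from-ci"
    with clean_auth_env():
        print("CODEX_API_KEY" in os.environ)
    print(os.environ["CODEX_API_KEY"])
```

In a pytest suite the same idea would more naturally be a shared fixture that loops over the tuple with `monkeypatch.delenv(name, raising=False)`.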

@xdotli xdotli merged commit 2108e12 into main May 20, 2026
3 checks passed
@xdotli xdotli deleted the codex/codex-subscription-auth-daytona branch May 20, 2026 07:15