Support Codex subscription auth in Daytona by xdotli · Pull Request #296 · benchflow-ai/benchflow

xdotli · 2026-05-20T03:25:46Z

Summary

auto-inherit Codex auth env vars and support CODEX_ACCESS_TOKEN as a native Codex subscription/access-token path
normalize CODEX_API_KEY to OPENAI_API_KEY for native codex-acp runs so the existing auth.json writer still works
update Codex docs, conformance/integration credential checks, and example scripts

Validation

uv run python -m pytest tests/test_subscription_auth.py tests/test_resolve_env_helpers.py tests/test_sdk_internals.py::TestResolveAgentEnv tests/test_agent_registry.py tests/test_provider_runtime.py -q
uv run ruff check src/benchflow/agents/env.py tests/test_subscription_auth.py tests/test_resolve_env_helpers.py tests/test_sdk_internals.py tests/conformance/run_conformance.py
uv run ty check src/benchflow/agents/env.py
bash -n tests/integration/run.sh tests/examples/test_codex.sh tests/examples/test_codex_custom_provider.sh

Live E2E

Ran with OPENAI_API_KEY, CODEX_API_KEY, and CODEX_ACCESS_TOKEN unset so auth had to come from local ~/.codex/auth.json uploaded into Daytona.

tests/examples/hello-world-task on Daytona + codex-acp + gpt-5.4-mini: reward 1.0, 1 tool call
SkillsBench jax-computing-basics on Daytona + codex-acp + gpt-5.4-mini: reward 1.0, 15 tool calls
local evidence: jobs/codex-daytona-auth-real/2026-05-19__23-20-31/jax-computing-basics__5248f3f2/result.json

Note

Medium Risk
Updates credential resolution for codex-acp, which can affect whether runs authenticate via API key, access token, or host login—especially when using custom OpenAI-compatible endpoints. Changes are well-covered by new tests and docs, but auth-path regressions could block Codex runs.

Overview
Adds first-class Codex subscription/access-token support by auto-inheriting CODEX_ACCESS_TOKEN/CODEX_API_KEY, treating CODEX_ACCESS_TOKEN as sufficient auth for native OpenAI Codex runs, and normalizing CODEX_API_KEY → OPENAI_API_KEY so existing ~/.codex/auth.json writing continues to work.

Tightens auth fallback rules so host login / subscription auth and access tokens are not accepted when Codex is routed to a custom OpenAI-compatible base URL, and updates conformance/integration scripts, example scripts, and docs to reflect the expanded Codex credential options.
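
The resolution order described in this overview can be sketched roughly as below. This is a minimal illustration, not BenchFlow's actual code: the function name `sketch_resolve_codex_auth`, the `native` and `has_host_login` parameters, and the precedence of an API key over an access token are all assumptions made for the example. Only the environment variable names and the `_BENCHFLOW_SUBSCRIPTION_AUTH` marker come from the PR.

```python
from __future__ import annotations

from typing import Mapping


def sketch_resolve_codex_auth(
    env: Mapping[str, str], native: bool, has_host_login: bool = False
) -> tuple[str, dict[str, str]]:
    """Return (auth_mode, resolved_env) for a codex-acp run. Illustrative only."""
    resolved = dict(env)

    # Native runs only: bridge the Codex alias so a writer that reads
    # OPENAI_API_KEY (for example an auth.json writer) keeps working.
    if native and resolved.get("CODEX_API_KEY") and not resolved.get("OPENAI_API_KEY"):
        resolved["OPENAI_API_KEY"] = resolved["CODEX_API_KEY"]

    # Assumption: an explicit API key takes precedence over other credentials.
    if resolved.get("OPENAI_API_KEY"):
        return "api_key", resolved

    # Access tokens and host login are not accepted for custom endpoints.
    if not native:
        raise ValueError("custom endpoints need a provider API key")

    if resolved.get("CODEX_ACCESS_TOKEN"):
        return "access_token", resolved

    if has_host_login:
        resolved["_BENCHFLOW_SUBSCRIPTION_AUTH"] = "1"
        return "host_login", resolved

    raise ValueError("no Codex credentials found")
```

In this sketch an access token wins over host login and does not set the subscription marker, which matches the assertions quoted in the review comments further down.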

Reviewed by Cursor Bugbot for commit 34f3044. Bugbot is set up for automated code reviews on this repo.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 90aedd1477

chatgpt-codex-connector · 2026-05-20T03:29:22Z

+    from benchflow.agents.providers import find_provider
+
+    return find_provider(model) is None


Restrict CODEX_ACCESS_TOKEN auth to true OpenAI endpoints

This native-context check treats every unregistered model prefix as OpenAI-native, so CODEX_ACCESS_TOKEN now bypasses the OPENAI_API_KEY requirement even when Codex is routed to a custom endpoint via OPENAI_BASE_URL/BENCHFLOW_PROVIDER_BASE_URL with an unprefixed model (for example local OpenAI-compatible stubs). In that setup, BenchFlow accepts auth that is not a provider API key and skips the intended validation path, which can break custom-provider runs or send the wrong bearer token to non-OpenAI endpoints. The new regression test only covers prefixed providers (vllm/...), so this unprefixed custom-endpoint path remains unguarded.
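
One way to close the gap described above is to let a base-URL override veto the "native" classification before the provider-prefix lookup runs. The sketch below assumes that approach. `find_provider_stub` and its one-entry prefix set are hypothetical stand-ins for `benchflow.agents.providers.find_provider`, whose real behavior is not shown in this PR page, and `is_native_openai_context` is an illustrative name.

```python
from __future__ import annotations

from typing import Mapping, Optional

# Hypothetical registry; the real provider table lives in BenchFlow.
_HYPOTHETICAL_PROVIDER_PREFIXES = {"vllm"}
_BASE_URL_OVERRIDES = ("OPENAI_BASE_URL", "BENCHFLOW_PROVIDER_BASE_URL")


def find_provider_stub(model: str) -> Optional[str]:
    """Return the provider prefix of 'prefix/model', or None if unregistered."""
    prefix, sep, _ = model.partition("/")
    return prefix if sep and prefix in _HYPOTHETICAL_PROVIDER_PREFIXES else None


def is_native_openai_context(model: Optional[str], env: Mapping[str, str]) -> bool:
    """True only when nothing routes Codex away from the default endpoint."""
    # A base-URL override means a custom endpoint, even for unprefixed models.
    if any(env.get(name) for name in _BASE_URL_OVERRIDES):
        return False
    if model is None:
        return True
    return find_provider_stub(model) is None
```

Under these assumptions, `is_native_openai_context("gpt-5.4-mini", {"OPENAI_BASE_URL": "http://localhost:8000/v1"})` is `False`, so an access token would no longer stand in for a provider API key on a local stub.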

devin-ai-integration

Devin Review found 6 potential issues.

devin-ai-integration · 2026-05-20T03:34:58Z

        assert result["_BENCHFLOW_SUBSCRIPTION_AUTH"] == "1"

+    def test_codex_access_token_auth(self, monkeypatch, tmp_path):
+        """Guards PR #295: Blocks-style Codex auth via CODEX_ACCESS_TOKEN."""


🟡 Regression test docstrings reference non-existent PR #295 instead of current PR #296

AGENTS.md mandates: "Regression tests must name the PR/commit they guard in the docstring." All five new regression tests reference "PR #295" but this is PR #296, and no PR #295 appears anywhere in the repository's git history (the most recent PR is #294). This breaks the traceability the convention is designed to provide. Affected locations: tests/test_subscription_auth.py:179, tests/test_subscription_auth.py:195, tests/test_subscription_auth.py:214, tests/test_resolve_env_helpers.py:383, tests/test_resolve_env_helpers.py:400.

Suggested change

-        """Guards PR #295: Blocks-style Codex auth via CODEX_ACCESS_TOKEN."""
+        """Guards PR #296: Blocks-style Codex auth via CODEX_ACCESS_TOKEN."""
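
A mismatch like this could in principle be caught mechanically. The following is a small sketch, assuming the docstrings follow the "Guards PR #N" wording quoted above; the function name `mismatched_guard_docstrings` and the sample source are hypothetical and are not part of the repository.

```python
import ast
import re

GUARD_RE = re.compile(r"Guards PR #(\d+)")


def mismatched_guard_docstrings(source: str, expected_pr: int) -> list:
    """Return names of test functions whose docstring guards a different PR."""
    bad = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if not node.name.startswith("test_"):
            continue
        match = GUARD_RE.search(ast.get_docstring(node) or "")
        # Tests without a guard reference are ignored in this sketch.
        if match and int(match.group(1)) != expected_pr:
            bad.append(node.name)
    return sorted(bad)


SAMPLE = '''
def test_codex_access_token_auth():
    """Guards PR #295: Blocks-style Codex auth via CODEX_ACCESS_TOKEN."""

def test_codex_api_key_auth_alias():
    """Guards PR #296: CODEX_API_KEY works for native Codex auth."""
'''

print(mismatched_guard_docstrings(SAMPLE, expected_pr=296))
```

Such a check only compares numbers. It cannot tell whether the referenced PR actually exists in the repository history.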

devin-ai-integration · 2026-05-20T03:34:59Z

+        assert "_BENCHFLOW_SUBSCRIPTION_AUTH" not in result
+
+    def test_codex_api_key_auth_alias(self, monkeypatch, tmp_path):
+        """Guards PR #295: CODEX_API_KEY works for native Codex auth."""


🟡 Regression test docstring references non-existent PR #295 instead of PR #296

Same AGENTS.md convention violation as the other four new regression tests — references "PR #295" which doesn't exist in repo history. Should reference current PR #296.

Suggested change

-        """Guards PR #295: CODEX_API_KEY works for native Codex auth."""
+        """Guards PR #296: CODEX_API_KEY works for native Codex auth."""

devin-ai-integration · 2026-05-20T03:35:01Z

+    def test_codex_access_token_does_not_auth_custom_provider(
+        self, monkeypatch, tmp_path
+    ):
+        """Guards PR #295: access tokens are not proxy API keys."""


🟡 Regression test docstring references non-existent PR #295 instead of PR #296

Same AGENTS.md convention violation — references "PR #295" which doesn't exist in repo history. Should reference current PR #296.

Suggested change

-        """Guards PR #295: access tokens are not proxy API keys."""
+        """Guards PR #296: access tokens are not proxy API keys."""

devin-ai-integration · 2026-05-20T03:35:02Z

+    def test_no_model_codex_access_token_wins_over_host_auth(
+        self, monkeypatch, tmp_path
+    ):
+        """Guards PR #295: CODEX_ACCESS_TOKEN is already usable auth."""


🟡 Regression test docstring references non-existent PR #295 instead of PR #296

Same AGENTS.md convention violation — references "PR #295" which doesn't exist in repo history. Should reference current PR #296.

Suggested change

-        """Guards PR #295: CODEX_ACCESS_TOKEN is already usable auth."""
+        """Guards PR #296: CODEX_ACCESS_TOKEN is already usable auth."""

devin-ai-integration · 2026-05-20T03:35:03Z

+        assert "_BENCHFLOW_SUBSCRIPTION_AUTH" not in result
+
+    def test_no_model_codex_api_key_alias_normalizes(self, monkeypatch, tmp_path):
+        """Guards PR #295: CODEX_API_KEY is Codex-native API-key auth."""


🟡 Regression test docstring references non-existent PR #295 instead of PR #296

Same AGENTS.md convention violation — references "PR #295" which doesn't exist in repo history. Should reference current PR #296.

Suggested change

-        """Guards PR #295: CODEX_API_KEY is Codex-native API-key auth."""
+        """Guards PR #296: CODEX_API_KEY is Codex-native API-key auth."""

devin-ai-integration · 2026-05-20T03:35:04Z

+        for k in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY"):
+            monkeypatch.delenv(k, raising=False)


🟡 Test does not clear CODEX_API_KEY from environment, unlike all neighboring tests

test_codex_access_token_does_not_auth_custom_provider clears only OPENAI_API_KEY and ANTHROPIC_API_KEY from the environment but does NOT clear CODEX_API_KEY. Every other Codex auth test in the same class (e.g. test_codex_subscription_auth at lines 159-165, test_codex_access_token_auth at line 180, test_codex_api_key_auth_alias at line 196) consistently clears all Codex-related env vars. If a CI runner has CODEX_API_KEY set in its environment, auto_inherit_env will copy it into agent_env. The test still passes today (because _normalize_codex_auth_env correctly refuses to bridge CODEX_API_KEY → OPENAI_API_KEY for vllm-prefixed models), but the incomplete cleanup makes the test fragile and inconsistent with the established pattern.

Suggested change

-        for k in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY"):
-            monkeypatch.delenv(k, raising=False)
+        for k in ("CODEX_ACCESS_TOKEN", "CODEX_API_KEY", "OPENAI_API_KEY", "ANTHROPIC_API_KEY"):
+            monkeypatch.delenv(k, raising=False)
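
To keep the cleanup consistent across tests, the variable list could live in one shared helper instead of being repeated per test. A minimal sketch follows, assuming pytest's `monkeypatch` fixture is passed in. The names `AUTH_ENV_VARS` and `clear_auth_env` are hypothetical and do not appear in the PR.

```python
# Hypothetical shared list: every credential variable the Codex auth tests
# should start without, taken from the suggested change above.
AUTH_ENV_VARS = (
    "CODEX_ACCESS_TOKEN",
    "CODEX_API_KEY",
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
)


def clear_auth_env(monkeypatch) -> None:
    """Remove all auth variables so a CI runner's environment cannot leak in."""
    for name in AUTH_ENV_VARS:
        # raising=False makes the call a no-op when the variable is unset.
        monkeypatch.delenv(name, raising=False)
```

A test would call `clear_auth_env(monkeypatch)` as its first statement, so adding a new credential variable later means editing one tuple rather than every test.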

Support Codex subscription auth in Daytona

90aedd1

chatgpt-codex-connector Bot reviewed May 20, 2026

devin-ai-integration Bot reviewed May 20, 2026

xdotli added 2 commits May 19, 2026 23:40

Make tasks generate CLI guard robust

a274349

Restrict Codex subscription auth to native endpoint

34f3044

xdotli merged commit 2108e12 into main May 20, 2026
3 checks passed

xdotli deleted the codex/codex-subscription-auth-daytona branch May 20, 2026 07:15

devin-ai-integration Bot mentioned this pull request May 20, 2026

Fix v0.5 follow-up stress regressions #306

Merged

xdotli merged 3 commits into main from codex/codex-subscription-auth-daytona
