fix(providers): reserve credential placeholder revisions #2049

Merged
elezar merged 3 commits into main from 1826-provider-refresh-placeholders/jm
Jun 30, 2026
Conversation

@johntmyers (Collaborator) commented Jun 29, 2026

Summary

Split the non-agent provider/runtime changes out of #1826. Reserve OpenShell's revision-scoped credential placeholder namespace so provider env vars cannot collide with generated placeholders, and allow stale revisioned placeholders to fall back to current credentials only when the key still exists.

Related Issue

Related to #1826.

Changes

  • Reject provider credential env vars that use the reserved v<digits>_ placeholder namespace.
  • Resolve stale revisioned credential placeholders through the current alias when the original revision aged out but the key still exists.
  • Preserve fail-closed behavior when a stale revisioned placeholder references a removed credential key.
  • Include OCSF network failure messages even when a destination endpoint is present.
  • Allow Codex access to files.openai.com in the provider profile.
  • Document the reserved placeholder namespace in architecture and provider docs.
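
As an illustration of the first bullet, the reserved `v<digits>_<key>` grammar could be checked roughly as below. This is a minimal sketch written from the description in this PR, not the code that landed in `secrets.rs`; the function bodies are assumptions, and only the names `uses_reserved_revision_namespace` and `is_env_key_char` come from the review thread.

```rust
/// Byte predicate for env-var key characters (ASCII alphanumeric or `_`).
fn is_env_key_char(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Returns true when `key` has the shape `v<digits>_<key>`.
/// Sketch only: the real check lives in openshell-core and may differ.
fn uses_reserved_revision_namespace(key: &str) -> bool {
    // The reserved namespace always starts with a literal `v`.
    let Some(rest) = key.strip_prefix('v') else {
        return false;
    };
    // Everything before the first underscore must be the revision digits.
    let Some((digits, tail)) = rest.split_once('_') else {
        return false;
    };
    !digits.is_empty()
        && digits.bytes().all(|b| b.is_ascii_digit())
        && !tail.is_empty()
        && tail.bytes().all(is_env_key_char)
}
```

Under this sketch, `v10_GITHUB_TOKEN` is rejected as reserved, while `GITHUB_TOKEN` and `very_unlikely` are accepted because the text between `v` and the first underscore is not all digits.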

Testing

  • git diff --check
  • cargo test -p openshell-core --lib
  • cargo test -p openshell-providers profiles --lib
  • cargo test -p openshell-ocsf format::shorthand --lib
  • mise run pre-commit passes
  • Unit tests added/updated
  • E2E tests added/updated (if applicable)

Note: first pre-commit attempt in the temp worktree failed because the fresh mise venv had not installed grpc_tools before the concurrent python:proto task ran. Rerunning after dependency installation passed.

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)

Signed-off-by: John Myers <johntmyers@users.noreply.github.com>
@johntmyers johntmyers added the test:e2e Requires end-to-end coverage label Jun 29, 2026
@github-actions

Label test:e2e applied for 2033c36. Open the existing run and click Re-run all jobs to execute with the label set. The run will execute the standard E2E suite after building the required gateway and supervisor images once. The matching required CI gate status on this PR will flip green automatically once the run finishes.

Comment thread architecture/sandbox.md
Comment thread crates/openshell-core/src/secrets.rs Outdated
Comment thread crates/openshell-providers/src/profiles.rs Outdated

@elezar elezar left a comment

Member

Review: reserve credential placeholder revisions

Solid change overall — the reservation closes a real ambiguity, and the graceful fallback (stale revisioned placeholder → current credential only when the key still exists) is well-tested. A few things to address before merge, plus notes.

1. The two reserved-namespace checks are duplicated across crates (gate on this)

secrets.rs::uses_reserved_revision_namespace (openshell-core) and profiles.rs::uses_reserved_placeholder_revision_namespace (openshell-providers) are semantically identical — same v<digits>_<key> grammar, same digit check, same emptiness checks. The only textual difference is that secrets.rs names the byte predicate (is_env_key_char) while profiles.rs inlines |b| b.is_ascii_alphanumeric() || b == b'_' — and is_env_key_char (secrets.rs:18) is exactly that. So they're byte-for-byte equivalent behavior.

This matters because they're the two enforcement halves of one invariant:

  • profiles.rs is the load-time gate — rejects a profile whose credentials.env_vars name lands in the reserved namespace (profiles.rs:1138).
  • secrets.rs is the runtime enforcement — skips such keys when building the resolver (secrets.rs:180).

If these ever drift, you get a real gap: a key the validator accepts but the runtime strips (credential silently missing at runtime), or — worse — a runtime check narrower than the validator, letting a crafted key slip past stripping and collide with a generated placeholder, which is exactly the ambiguity the reservation exists to prevent.

The reserved namespace is defined by the placeholder format in secrets.rs (placeholder_for_env_key_for_revision produces {PREFIX}v{revision}_{key}), and openshell-providers already depends on openshell-core. Recommendation: keep one canonical check in openshell-core::secrets, make it pub, have profiles.rs call it, and delete the providers-side copy. Drift becomes structurally impossible.

Bonus: there's a third copy of the same grammar — revisioned_placeholder_env_key (secrets.rs:489) parses {PREFIX}v<digits>_<KEY> with the identical tail. All three want one private parser, e.g. fn split_revisioned_key(s: &str) -> Option<(&str, &str)>, with uses_reserved_revision_namespace/revisioned_placeholder_env_key as thin wrappers and a pub export for providers. That also collapses the two hand-maintained edge-case test suites into one.
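
A rough sketch of what that single parser and its thin wrappers might look like. The `PREFIX` value here is a made-up stand-in (the real constant is in secrets.rs), and the bodies are assumptions written from the grammar described above rather than the actual implementation.

```rust
// Stand-in value for illustration only; the real PREFIX is defined in secrets.rs.
const PREFIX: &str = "example-placeholder:";

fn is_env_key_char(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Single private parser: splits `v<digits>_<KEY>` into (digits, KEY).
fn split_revisioned_key(s: &str) -> Option<(&str, &str)> {
    let rest = s.strip_prefix('v')?;
    let (digits, key) = rest.split_once('_')?;
    let digits_ok = !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_digit());
    let key_ok = !key.is_empty() && key.bytes().all(is_env_key_char);
    (digits_ok && key_ok).then_some((digits, key))
}

/// Thin wrapper exported for openshell-providers' load-time validation.
pub fn uses_reserved_revision_namespace(key: &str) -> bool {
    split_revisioned_key(key).is_some()
}

/// Thin wrapper for the runtime: extracts KEY from `{PREFIX}v<digits>_<KEY>`.
fn revisioned_placeholder_env_key(placeholder: &str) -> Option<&str> {
    let tail = placeholder.strip_prefix(PREFIX)?;
    split_revisioned_key(tail).map(|(_, key)| key)
}
```

With both wrappers delegating to one parser, a change to the grammar lands in the validator and the runtime at the same time, and the edge-case tests only need to target `split_revisioned_key`.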

2. Scope — three unrelated changes bundled

The PR is fix(providers): reserve credential placeholder revisions, but also carries (a) the OCSF NET:FAIL message-formatting change and (b) the codex files.openai.com allowlist entry. Neither relates to the reservation. I understand these were carved out of #1826, but they're independently reviewable/revertable — please split (b) at minimum (it arguably warrants its own issue), and justify (a) on its own.

3. Fallback ignores the revision value entirely

In resolve_placeholder, once the direct lookup misses, revisioned_placeholder_env_key extracts only the key and resolves via the current canonical alias; the numeric revision is discarded. So v999999_GITHUB_TOKEN (a revision that never existed) resolves to the current token identically to a genuinely-aged-out v10_, and any retained v<N>_ is honored indefinitely while the key exists. Placeholders are sandbox-internal, so this is likely acceptable, but the revision component is non-authoritative on the fallback path, and stale vs. bogus revisions are indistinguishable. Please confirm that's intended; if so, it is worth a comment.
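
To make the concern concrete, here is a simplified model of the lookup order as I read it. The `Resolver` struct, the `PREFIX` value, and the assumption that the current alias is stored under `{PREFIX}{key}` are all hypothetical; only `resolve_placeholder` and `by_placeholder` are names taken from the diff.

```rust
use std::collections::HashMap;

// Stand-in value for illustration only.
const PREFIX: &str = "example-placeholder:";

/// Simplified model of the resolver: placeholder string -> secret value.
struct Resolver {
    by_placeholder: HashMap<String, String>,
}

/// Extracts KEY from `{PREFIX}v<digits>_<KEY>`, ignoring the digits.
fn revisioned_key(placeholder: &str) -> Option<&str> {
    let tail = placeholder.strip_prefix(PREFIX)?.strip_prefix('v')?;
    let (digits, key) = tail.split_once('_')?;
    let ok = !digits.is_empty()
        && digits.bytes().all(|b| b.is_ascii_digit())
        && !key.is_empty();
    ok.then_some(key)
}

impl Resolver {
    fn resolve_placeholder(&self, placeholder: &str) -> Option<&str> {
        // 1. Direct hit: a retained revision resolves to its own value.
        if let Some(value) = self.by_placeholder.get(placeholder) {
            return Some(value.as_str());
        }
        // 2. Fallback: drop the revision and look up the current alias.
        //    A missing key yields None, so the lookup stays fail-closed.
        let key = revisioned_key(placeholder)?;
        let current_alias = format!("{}{}", PREFIX, key);
        self.by_placeholder.get(&current_alias).map(String::as_str)
    }
}
```

In this model a retained `v12_GITHUB_TOKEN` returns its own value, while an aged-out `v10_GITHUB_TOKEN` and a never-issued `v999999_GITHUB_TOKEN` both return the current token, and `v10_REMOVED` returns `None`.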

Notes (non-blocking)

  • OCSF activity == "FAIL" is stringly-typed (shorthand.rs). Behaviorally correct (Fail is a valid NetworkActivity activity_id), but special-casing the uppercased display name is brittle — if the label is ever reworded the message silently stops appearing. Prefer matching the typed ActivityId::Fail.
  • Broadened OCSF message emission — confirm no secrets. NET:FAIL now surfaces message even when dst/rule/reason context is present. AGENTS.md is explicit that OCSF messages must never contain secrets and the JSONL may ship externally. Please confirm failure-path messages are builder-controlled and sanitized.
  • codex files.openai.com granted read-write. Consistent with sibling endpoints, but a file CDN host is worth a deliberate read-write vs read decision; no test covers the profile change.
  • Double-parse / double-warn per install. from_provider_env_for_current_revision clones the env and parses it twice (false then true), so a reserved-namespace key emits the same warn! twice per install. The current-aliases pass could be derived from the first rather than re-parsed.
  • by_placeholder.is_empty() → None. A non-empty provider_env consisting entirely of reserved keys now yields None instead of Some(empty) — confirm no caller treats Some as "provider env was present." (Low risk; merge already collapses empty→None.)
  • Reserved keys are dropped from child_env, not just the resolver. Safe given the production caller (provider_credentials.rs) feeds profile-validated keys that the new load-time gate already rejects — but v<digits>_ is a broad reservation; a one-line comment noting that assumption would help.
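
On the first note, a small sketch of the difference between matching the display label and matching the typed value. The enum below is a hypothetical stand-in (only `ActivityId::Fail` is named in the code under review; the other variants and the `label` method are invented for the example).

```rust
/// Hypothetical stand-in for the OCSF network activity id.
#[derive(Clone, Copy, PartialEq, Eq)]
enum ActivityId {
    Open,
    Close,
    Fail,
}

impl ActivityId {
    /// Display label used in the shorthand output (assumed wording).
    fn label(self) -> &'static str {
        match self {
            ActivityId::Open => "OPEN",
            ActivityId::Close => "CLOSE",
            ActivityId::Fail => "FAIL",
        }
    }
}

/// Brittle: silently stops matching if the label text is ever reworded.
fn include_message_stringly(activity: ActivityId) -> bool {
    activity.label() == "FAIL"
}

/// Preferred: the compiler flags this if the variant is renamed or removed.
fn include_message_typed(activity: ActivityId) -> bool {
    matches!(activity, ActivityId::Fail)
}
```

Both functions agree today; the typed version keeps agreeing if someone later changes the label to, say, "FAILED".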

Checked and not issues (for the record)

  • Merge order does not corrupt the "current" fallback. Per-revision resolvers in generations are built with include_current_aliases=false (revisioned placeholders only, no collisions); the canonical alias is contributed solely by current_resolver, replaced on every install_environment and chained last in merge_resolvers. Fallback always points to the newest revision.
  • Retained in-window revisions resolve to their own values (direct by_placeholder hit), not collapsed to current — fallback only triggers after a revision ages out of MAX_RETAINED_CREDENTIAL_GENERATIONS.

Items 1–3 are what I'd like addressed/answered before merge.

Signed-off-by: John Myers <johntmyers@users.noreply.github.com>
@johntmyers

Collaborator Author

/ok to test 42fbc04

@johntmyers johntmyers requested a review from elezar June 29, 2026 18:19
Signed-off-by: John Myers <johntmyers@users.noreply.github.com>
@johntmyers

Collaborator Author

Addressed in 42fbc04d and the follow-up test commit b371f711:

  • Fixed the broken sandbox architecture sentence.
  • Removed the unrelated OCSF shorthand and Codex profile changes from this PR.
  • Made openshell_core::secrets::uses_reserved_revision_namespace the canonical reserved namespace check and updated provider profile validation to call it.
  • Collapsed the revisioned placeholder grammar into one parser in secrets.rs.
  • Added tests for reserved namespace grammar, including the very_unlikely edge case, and unknown stale revisioned placeholders.
  • Documented the fallback behavior: after a revision ages out, the numeric revision is non-authoritative and the fallback resolves by key to the current credential only if that key still exists.

The OCSF/Codex notes are no longer part of this PR and can be handled separately if needed.

@elezar elezar left a comment

Member

Thanks @johntmyers. Definitely a cleaner diff now.

@elezar elezar merged commit f27ff15 into main Jun 30, 2026
46 checks passed
@elezar elezar deleted the 1826-provider-refresh-placeholders/jm branch June 30, 2026 08:57

Labels

test:e2e Requires end-to-end coverage

2 participants