refactor(sandbox): move gh token precedence into the credential helper by ColeMurray · Pull Request #683 · ColeMurray/background-agents

ColeMurray · 2026-05-28T06:48:38Z

Summary

Moves the gh CLI wrapper's 4-branch /bin/sh token-precedence decision tree into the Python credential helper as a pure, unit-testable _gh_wrapper_should_mint() behind a new gh-token action; the shell wrapper becomes a thin delegator.
Drops the brittle GITHUB_TOKEN != GITHUB_APP_TOKEN value-equality heuristic in favor of the authoritative OI_GITHUB_TOKEN_IS_FALLBACK marker.
Exports GH_TOKEN before exec (instead of env GH_TOKEN=… exec), so the token never appears in process argv.
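To make the precedence rules concrete, here is a minimal sketch of what a pure decision function of this shape could look like. It is not the PR's actual `_gh_wrapper_should_mint()`: the function name, the default for an unset `VCS_HOST`, and the "mint when nothing is set" branch are assumptions inferred from the decision matrix in the test plan.

```python
from collections.abc import Mapping

FALLBACK_MARKER = "OI_GITHUB_TOKEN_IS_FALLBACK"


def gh_wrapper_should_mint(env: Mapping[str, str]) -> bool:
    """Illustrative stand-in: should the wrapper ask the helper for a token?"""
    # Assumption: an unset VCS_HOST is treated as github.com.
    if env.get("VCS_HOST", "github.com") != "github.com":
        return False
    # A user-provided GH_TOKEN wins, even when the fallback marker is set.
    if env.get("GH_TOKEN"):
        return False
    # A marked fallback token was injected by the platform: always refresh.
    if env.get(FALLBACK_MARKER) == "1":
        return True
    # Unmarked GITHUB_TOKEN / GITHUB_APP_TOKEN are treated as user-provided.
    if env.get("GITHUB_TOKEN") or env.get("GITHUB_APP_TOKEN"):
        return False
    # Assumption: with nothing usable in the environment, mint a token.
    return True
```

Because the function only reads a mapping, each row of the decision matrix becomes a one-line test case with a plain dict, with no subprocess or shell needed.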

Follow-up to #679 — items P1-1 and P1-2 from the post-merge maintainability review.

Behavior change (intentional)

With OI_GITHUB_TOKEN_IS_FALLBACK=1, gh now always refreshes. Previously, if GITHUB_TOKEN and GITHUB_APP_TOKEN differed while the marker was set, the wrapper passed through with the (stale) GITHUB_TOKEN, which gh reads, while it ignores GITHUB_APP_TOKEN. The old and new behavior differ only when someone manually re-exports an internal env var inside the running sandbox; the supported override path (repo secrets) never sets the marker, so user-provided tokens are still respected. Net: strictly better in every supported case.
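The difference can be shown side by side. The sketch below models only the marker-related branch described above, not the full old shell wrapper; both function names are hypothetical, and the "marked and equal means refresh" reading of the old heuristic is an assumption.

```python
from collections.abc import Mapping

MARKER = "OI_GITHUB_TOKEN_IS_FALLBACK"


def old_refreshes(env: Mapping[str, str]) -> bool:
    """Old heuristic (as described): refresh only if marked AND values match."""
    marked = env.get(MARKER) == "1"
    same_value = env.get("GITHUB_TOKEN") == env.get("GITHUB_APP_TOKEN")
    return marked and same_value


def new_refreshes(env: Mapping[str, str]) -> bool:
    """New rule: the marker alone is authoritative."""
    return env.get(MARKER) == "1"


# Platform-injected fallback, untouched: both rules refresh.
untouched = {MARKER: "1", "GITHUB_TOKEN": "t1", "GITHUB_APP_TOKEN": "t1"}
# Internal var re-exported by hand inside the sandbox: only the new rule refreshes.
drifted = {MARKER: "1", "GITHUB_TOKEN": "stale", "GITHUB_APP_TOKEN": "t2"}
# User token from repo secrets (no marker): neither rule refreshes.
user_token = {"GITHUB_TOKEN": "user"}

for name, env in [("untouched", untouched), ("drifted", drifted), ("user", user_token)]:
    print(name, old_refreshes(env), new_refreshes(env))
```

Only the `drifted` row disagrees, which matches the claim that the two behaviors diverge solely in the unsupported manual re-export case.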

Out of scope / unchanged

The credential minting/caching/locking core and the git get/store/erase protocol actions.
The credential-helper shim install (base.py / toolchain.py / entrypoint runtime) — action-agnostic, unaffected by the token → gh-token rename.
No image CACHE_BUSTER bump needed: the wrapper is rewritten on every boot, and the baked shim is unchanged.

Incidental

Two pre-existing isinstance(x, (int, float)) calls converted to union syntax (UP038) so the file is lint-clean under current ruff (valid on every ruff version).

Test plan

ruff check + ruff format clean on the changed files
pytest tests/test_gh_wrapper.py tests/test_git_credential_helper.py — 44 passed locally
Full sandbox-runtime suite green in CI
Decision matrix covers: nothing in env, non-github host, GH_TOKEN, user GITHUB_TOKEN/GITHUB_APP_TOKEN, marked fallback (incl. differing values), and GH_TOKEN winning over a marker
Shell wrapper: minted token exported as GH_TOKEN and never in argv; empty/failed helper falls through to the existing env
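The argv-cleanliness property in the last item can be illustrated with a Python analogue of "export, then exec". This is a sketch rather than the wrapper itself: the child process stands in for the real gh binary, and `run_with_token` is a hypothetical name.

```python
import json
import os
import subprocess
import sys

# Stand-in for the real gh binary: reports its arguments and its GH_TOKEN.
CHILD = (
    "import json, os, sys; "
    "print(json.dumps({'argv': sys.argv[1:], 'gh_token': os.environ.get('GH_TOKEN')}))"
)


def run_with_token(args: list[str], minted: str) -> dict:
    """Run the stand-in child, passing the token via the environment only."""
    env = dict(os.environ)
    if minted:
        # Equivalent of `export GH_TOKEN=...` before exec: the token lives in
        # the child's environment, never in its command line.
        env["GH_TOKEN"] = minted
    # An empty helper result falls through to whatever env already holds.
    result = subprocess.run(
        [sys.executable, "-c", CHILD, *args],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


report = run_with_token(["pr", "list"], "minted_token")
```

Here `report["argv"]` contains only the forwarded gh arguments, while the token is visible to the child through `GH_TOKEN`. With `env GH_TOKEN=… exec`, by contrast, the token is an argument to `env` and so shows up in that process's command line.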

Summary by CodeRabbit

Release Notes

New Features
- Enhanced GitHub CLI integration with improved token handling and automatic token provisioning support.
Bug Fixes
- Strengthened credential expiry validation for improved reliability.
Tests
- Expanded test coverage for credential helper and wrapper behavior.

The gh CLI wrapper carried a 4-branch /bin/sh token-precedence decision tree, including a brittle `GITHUB_TOKEN != GITHUB_APP_TOKEN` value-equality heuristic. Move that decision into the Python credential helper as a pure, unit-testable `_gh_wrapper_should_mint()` behind a new `gh-token` action, and reduce the wrapper to a thin delegator.

- Drop the value-equality heuristic; rely on the authoritative OI_GITHUB_TOKEN_IS_FALLBACK marker (a marked fallback always refreshes).
- Export GH_TOKEN before exec instead of `env GH_TOKEN=… exec`, so the token never lands in process argv.
- Port the shell decision-tree tests to Python (decision matrix + action tests); slim the shell test to delegator wiring + argv-cleanliness.

Also converts two pre-existing `isinstance(x, (int, float))` calls to union syntax (UP038) so the file is lint-clean under current ruff.

coderabbitai · 2026-05-28T06:48:50Z

📝 Walkthrough

This PR refactors GitHub token minting by moving precedence logic from a complex shell wrapper into Python credential helper functions. The new gh-token action conditionally mints fresh tokens based on environment state, while the wrapper simplifies to a delegation pattern that exports the minted token to the real gh process.

Changes

GitHub Token Minting Refactoring

Layer / File(s)	Summary
Type and import hardening `packages/sandbox-runtime/src/sandbox_runtime/credentials/git_credential_helper.py`	Updates imports to add `TYPE_CHECKING` and `Mapping`; hardens expiry validation in cached-credential and control-plane response paths to use `int | float` type checks.
GH-token minting action `packages/sandbox-runtime/src/sandbox_runtime/credentials/git_credential_helper.py`	Introduces `_gh_wrapper_should_mint()` to conditionally decide token minting based on `VCS_HOST`, user-provided `GH_TOKEN`, and fallback-token indicators; implements `_print_gh_token()` to mint and log results; routes `gh-token` action in `main()`.
Wrapper delegation to helper `packages/sandbox-runtime/src/sandbox_runtime/entrypoint.py`	Replaces complex shell precedence logic with simple delegation: wrapper calls credential helper's `gh-token` action and exports non-empty result as `GH_TOKEN` before running real `gh`.
Wrapper shell integration tests `packages/sandbox-runtime/tests/test_gh_wrapper.py`	Refactors test harness with faked gh and helper; adds four tests for token export to `GH_TOKEN`, token non-appearance in argv, suppression on empty helper output, and fallback on helper failure.
Credential helper minting tests `packages/sandbox-runtime/tests/test_git_credential_helper.py`	Adds `clean_gh_env` fixture; parametrizes `_gh_wrapper_should_mint()` tests across environment scenarios; extends `gh-token` action coverage for user-token suppression, non-`github.com` host suppression, and error-case handling.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🐰 Token logic hops from shell to Python's embrace,
Wrapper delegates with grace and pace,
Helper mints with care and tests keep score,
GitHub CLI greets fresh tokens at the door! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 71.43% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main refactoring: moving token precedence logic from the gh wrapper shell script into the Python credential helper, which is the primary architectural change across all modified files.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

github-actions · 2026-05-28T06:49:07Z

Terraform Validation Results

Step	Status
Format	✅
Init	✅
Validate	✅

Note: Terraform plan was skipped because secrets are not configured. This is expected for external contributors. See docs/GETTING_STARTED.md for setup instructions.

Pushed by: @ColeMurray, Action: pull_request

open-inspect

Summary

PR #683, refactor(sandbox): move gh token precedence into the credential helper, by @ColeMurray. Reviewed 4 changed files (+206/-176); the refactor moves the gh token precedence decision into the Python helper, keeps the shell wrapper thin, and adds targeted tests for the decision matrix and wrapper wiring.

Critical Issues

None found.

Suggestions

None blocking.

Nitpicks

None.

Positive Feedback

The token precedence rules are now centralized in a pure helper function with a clear decision matrix, which is much easier to test and maintain than shell branching.
Exporting GH_TOKEN before exec avoids putting the minted token in process argv.
The tests cover the important fallback-marker and user-token precedence cases, plus shell behavior for empty and failed helper output.

Questions

None.

Verdict

Approve.

Verification: attempted pytest tests/test_gh_wrapper.py tests/test_git_credential_helper.py, but pytest is not installed in this environment. Ran manual checks against _gh_wrapper_should_mint with PYTHONPATH=src successfully.

coderabbitai

🧹 Nitpick comments (1)

packages/sandbox-runtime/tests/test_gh_wrapper.py (1)

93-106: ⚡ Quick win

Add the existing-GH_TOKEN fallthrough case.

These tests only prove the wrapper preserves GITHUB_TOKEN when the helper prints nothing or exits nonzero. The delegated contract in packages/sandbox-runtime/tests/test_git_credential_helper.py:559-621 also treats a user-provided GH_TOKEN as authoritative, so it's worth locking that down at the shell layer too.

🧪 Suggested test

+def test_preserves_existing_gh_token_when_helper_prints_nothing(tmp_path: Path) -> None:
+    wrapper = _build_wrapper(tmp_path, token_cmd_body=PRINTS_NOTHING)
+    out = _run(wrapper, {"VCS_HOST": "github.com", "GH_TOKEN": "user_token"})
+    assert "GH_TOKEN=user_token" in out

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/sandbox-runtime/tests/test_gh_wrapper.py` around lines 93 - 106, Add
a new test to cover the fallthrough when an existing GH_TOKEN is present in the
environment: mirror the patterns in test_no_export_when_helper_prints_nothing
and test_falls_through_when_helper_fails but call _build_wrapper with
PRINTS_NOTHING and EXITS_NONZERO (or one of them) and invoke _run with an
environment that includes both GITHUB_TOKEN and GH_TOKEN (e.g.,
"GITHUB_TOKEN":"user_token", "GH_TOKEN":"preexisting_token"); assert the output
contains "GH_TOKEN=preexisting_token" and that "GITHUB_TOKEN=user_token" is
still present to ensure the wrapper preserves a user-provided GH_TOKEN.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f5d29383-90bb-447c-a2e0-fad3a57cf7e4

📥 Commits

Reviewing files that changed from the base of the PR and between ce3a14d and be6a52c.

📒 Files selected for processing (4)

packages/sandbox-runtime/src/sandbox_runtime/credentials/git_credential_helper.py
packages/sandbox-runtime/src/sandbox_runtime/entrypoint.py
packages/sandbox-runtime/tests/test_gh_wrapper.py
packages/sandbox-runtime/tests/test_git_credential_helper.py

open-inspect Bot approved these changes May 28, 2026

coderabbitai Bot reviewed May 28, 2026

ColeMurray wants to merge 1 commit into main from refactor/gh-token-helper
