
test: opt-in ability tools resolve for sandbox runs #2601

Merged
chubes4 merged 1 commit into main from test/sandbox-opt-in-tool-resolution
Jun 8, 2026

@chubes4 chubes4 commented Jun 8, 2026

Member

Summary

Adds a regression test for ToolPolicyResolver proving that an opt-in, ability-projected tool (the exact shape data-machine-code uses for workspace_write) resolves for a Codebox sandbox run when the runtime declares it via allow_only and an allow-mode tool_policy, and stays hidden otherwise.

Why

During debugging of why WP Codebox coding agents couldn't edit files, this test deterministically settled an architectural question: Data Machine must not (and does not) know about the sandbox. To the resolver, sandbox is just an unknown mode string that normalizes away; the paired chat mode carries the tools. The resolver, opt-in gating, and projection chain all work for the sandbox argument shape. This test locks that contract so the runtime tool-delivery path can't silently regress.

The test projects a real registered ability (datamachine/get-wordpress-post) under an opt-in tool name and asserts:

  • it resolves for modes: ['sandbox','chat'] + allow_only + allow-mode tool_policy
  • it is excluded without opt-in
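
The contract those two assertions lock can be sketched as a small toy model. This is not Data Machine's PHP code or the real ToolPolicyResolver API: the `Tool` class, `KNOWN_MODES`, `normalize_modes`, `resolve`, the `web_search` tool, and the exact gating rule (an opt-in tool must be named in both `allow_only` and an allow-mode `tool_policy`) are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumption: the set of modes the resolver recognizes. 'sandbox' is
# deliberately absent, mirroring "an unknown mode string that normalizes away".
KNOWN_MODES = {"chat", "pipeline", "system"}


@dataclass(frozen=True)
class Tool:
    """Hypothetical stand-in for a registered (ability-projected) tool."""
    name: str
    modes: frozenset
    opt_in: bool = False


def normalize_modes(modes):
    """Drop any mode string the resolver does not recognize."""
    return [m for m in modes if m in KNOWN_MODES]


def resolve(tools, modes, allow_only=None, tool_policy=None):
    """Return the names of tools visible for this run (toy gating rules)."""
    active = set(normalize_modes(modes))
    allow_only = set(allow_only or [])
    policy = tool_policy or {}
    # Only an allow-mode policy contributes an explicit allow list here.
    policy_allowed = (
        set(policy.get("tools", [])) if policy.get("mode") == "allow" else set()
    )

    resolved = []
    for tool in tools:
        if not (tool.modes & active):
            continue  # tool is not registered for any surviving mode
        if allow_only and tool.name not in allow_only:
            continue  # allow_only narrows the surface when present
        if tool.opt_in and not (
            tool.name in allow_only and tool.name in policy_allowed
        ):
            continue  # opt-in tools stay hidden unless explicitly declared
        resolved.append(tool.name)
    return resolved


TOOLS = [
    Tool("workspace_write", frozenset({"chat"}), opt_in=True),
    Tool("web_search", frozenset({"chat"})),
]

declared = resolve(
    TOOLS,
    ["sandbox", "chat"],
    allow_only=["workspace_write"],
    tool_policy={"mode": "allow", "tools": ["workspace_write"]},
)
undeclared = resolve(TOOLS, ["sandbox", "chat"])
```

In this model the resolver never branches on the string sandbox. The mode is simply filtered out, and the opt-in tool becomes visible only through the paired chat mode plus the explicit declaration.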

Testing

  • php tests/Unit/AI/Tools/SandboxOptInToolResolutionTest.php (via homeboy test data-machine): OK (2 tests, 4 assertions)

AI assistance

  • AI assistance: Yes
  • Tool(s): OpenCode (openai/gpt-5.5)
  • Used for: Built a deterministic PHPUnit reproduction of the sandbox tool-resolution path to isolate the real blocker (which turned out to be upstream of Data Machine), and added this regression test; Chris directs the work and remains responsible for review/testing.

Adds a regression test proving ToolPolicyResolver surfaces an opt-in
ability-projected tool (the shape data-machine-code uses for
workspace_write) for a Codebox sandbox run when the runtime declares it
via allow_only and an allow-mode tool policy, and hides it otherwise.

Data Machine has no sandbox-specific knowledge: 'sandbox' is an unknown
mode string that normalizes away while the paired 'chat' mode carries
tools. This locks that contract so the runtime tool-delivery path for
coding agents cannot silently regress.

homeboy-ci Bot commented Jun 8, 2026

Contributor

Homeboy Results — data-machine

Lint

lint — passed

ℹ️ Full options: homeboy docs commands/lint
Deep dive: homeboy lint data-machine --changed-since 66f56c0

Artifacts and drill-down
  • CI results artifact: homeboy-ci-results-data-machine-lint-quality-Linux-node24 contains immediate command JSON for this action invocation.
  • Observation artifact: homeboy-observations-data-machine-lint-quality-Linux-node24 contains exported Homeboy run history for deeper queries.
  • Drill-down: download the observation artifact, then run homeboy runs import <dir>, homeboy runs list, and homeboy runs findings <run-id>.
  • Artifacts are attached to the workflow run: https://github.com/Extra-Chill/data-machine/actions/runs/27133307544

Test

test — passed

  • 2 passed

ℹ️ Auto-fix lint issues: homeboy refactor data-machine --from lint --write
ℹ️ Collect coverage: homeboy test data-machine --coverage
ℹ️ Save test baseline: homeboy test data-machine --baseline
ℹ️ Pass args to test runner: homeboy test -- [args]
ℹ️ Full options: homeboy docs commands/test
Deep dive: homeboy test data-machine --changed-since 66f56c0

Artifacts and drill-down
  • CI results artifact: homeboy-ci-results-data-machine-test-quality-Linux-node24 contains immediate command JSON for this action invocation.
  • Observation artifact: homeboy-observations-data-machine-test-quality-Linux-node24 contains exported Homeboy run history for deeper queries.
  • Drill-down: download the observation artifact, then run homeboy runs import <dir>, homeboy runs list, and homeboy runs findings <run-id>.
  • Artifacts are attached to the workflow run: https://github.com/Extra-Chill/data-machine/actions/runs/27133307544

Audit

audit — passed

Deep dive: homeboy audit data-machine --changed-since 66f56c0

Artifacts and drill-down
  • CI results artifact: homeboy-ci-results-data-machine-audit-quality-Linux-node24 contains immediate command JSON for this action invocation.
  • Observation artifact: homeboy-observations-data-machine-audit-quality-Linux-node24 contains exported Homeboy run history for deeper queries.
  • Drill-down: download the observation artifact, then run homeboy runs import <dir>, homeboy runs list, and homeboy runs findings <run-id>.
  • Artifacts are attached to the workflow run: https://github.com/Extra-Chill/data-machine/actions/runs/27133307544
Tooling versions
  • Homeboy CLI: homeboy 0.222.17+69b428aa
  • Extension: wordpress from https://github.com/Extra-Chill/homeboy-extensions
  • Extension revision: 3e5ce80d
  • Action: unknown@unknown

@chubes4 chubes4 merged commit 8e3c6c4 into main Jun 8, 2026
5 checks passed
@chubes4 chubes4 deleted the test/sandbox-opt-in-tool-resolution branch June 8, 2026 11:46