test: opt-in ability tools resolve for sandbox runs by chubes4 · Pull Request #2601 · Extra-Chill/data-machine

chubes4 · 2026-06-08T11:03:39Z

Summary

Adds a regression test for ToolPolicyResolver proving that an opt-in, ability-projected tool (the exact shape data-machine-code uses for workspace_write) resolves for a Codebox sandbox run when the runtime declares it via allow_only and an allow-mode tool_policy, and stays hidden otherwise.

Why

While debugging why WP Codebox coding agents couldn't edit files, this test deterministically settled an architectural question: Data Machine must not (and does not) know about the sandbox. To Data Machine, `sandbox` is just an unknown mode string that is dropped during normalization; the paired `chat` mode is what carries the tools. The resolver, opt-in gating, and projection chain all work for the sandbox argument shape. This test locks that contract so the runtime tool-delivery path can't silently regress.
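The mode normalization described above can be illustrated with a minimal sketch. This is not Data Machine's code: the function name `sketch_normalize_modes` and the list of known modes are assumptions made only for illustration. The one behavior it models is that an unrecognized mode string such as `sandbox` is dropped while `chat` survives.

```php
<?php
/**
 * Hypothetical helper: keep only the mode strings the host recognizes.
 * The default $known list is an assumption, not Data Machine's real list.
 */
function sketch_normalize_modes(array $requested, array $known = ['chat', 'pipeline']): array
{
    // array_intersect keeps the values of $requested that also appear in
    // $known (preserving keys); array_values re-indexes the result from 0.
    return array_values(array_intersect($requested, $known));
}

// 'sandbox' is unknown to the host, so only 'chat' remains.
$modes = sketch_normalize_modes(['sandbox', 'chat']); // ['chat']
```

Under this model the host needs no sandbox-specific branch: whatever tools are available to `chat` are the tools a sandbox run sees.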

The test projects a real registered ability (datamachine/get-wordpress-post) under an opt-in tool name and asserts:

- it resolves for modes: ['sandbox','chat'] + allow_only + allow-mode tool_policy
- it is excluded without opt-in
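The two assertions above can be modeled with a small standalone sketch. It is a simplified stand-in, not the real ToolPolicyResolver: the function name, the array keys (`modes`, `opt_in`, `mode`, `tools`), the known-mode list, and the tool name `example_opt_in_tool` are all hypothetical. It only models opt-in gating and ignores any other filtering the real resolver may apply.

```php
<?php
/**
 * Hypothetical visibility check for a single tool.
 * An opt-in tool is visible only when the run names it in both
 * allow_only and an allow-mode tool_policy (assumed shapes).
 */
function sketch_is_tool_visible(string $name, array $tool, array $args): bool
{
    $known_modes = ['chat', 'pipeline']; // assumed list of recognized modes
    // Unknown mode strings such as 'sandbox' drop out here.
    $active = array_intersect($args['modes'] ?? [], $known_modes);

    if (array_intersect($tool['modes'], $active) === []) {
        return false; // tool is not offered in any active mode
    }
    if (empty($tool['opt_in'])) {
        return true; // ordinary tools need no explicit declaration
    }

    $policy   = $args['tool_policy'] ?? [];
    $declared = ($policy['mode'] ?? '') === 'allow'
        && in_array($name, $policy['tools'] ?? [], true);

    return $declared && in_array($name, $args['allow_only'] ?? [], true);
}

$tool = ['modes' => ['chat'], 'opt_in' => true];

// Sandbox-shaped arguments that declare the tool explicitly.
$opted_in = [
    'modes'       => ['sandbox', 'chat'],
    'allow_only'  => ['example_opt_in_tool'],
    'tool_policy' => ['mode' => 'allow', 'tools' => ['example_opt_in_tool']],
];

$visible = sketch_is_tool_visible('example_opt_in_tool', $tool, $opted_in);               // true
$hidden  = sketch_is_tool_visible('example_opt_in_tool', $tool, ['modes' => ['sandbox', 'chat']]); // false
```

The design point the sketch mirrors is that nothing in the check refers to the sandbox by name; visibility depends only on the surviving mode and the explicit declaration.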

Testing

php tests/Unit/AI/Tools/SandboxOptInToolResolutionTest.php (via homeboy test data-machine): OK (2 tests, 4 assertions)

AI assistance

AI assistance: Yes
Tool(s): OpenCode (openai/gpt-5.5)
Used for: Built a deterministic PHPUnit reproduction of the sandbox tool-resolution path to isolate the real blocker (which turned out to be upstream of Data Machine), and added this regression test; Chris directs the work and remains responsible for review/testing.

Adds a regression test proving ToolPolicyResolver surfaces an opt-in ability-projected tool (the shape data-machine-code uses for workspace_write) for a Codebox sandbox run when the runtime declares it via allow_only and an allow-mode tool policy, and hides it otherwise. Data Machine has no sandbox-specific knowledge: 'sandbox' is an unknown mode string that normalizes away while the paired 'chat' mode carries tools. This locks that contract so the runtime tool-delivery path for coding agents cannot silently regress.

homeboy-ci · 2026-06-08T11:05:43Z

Homeboy Results — `data-machine`

Lint

✅ lint — passed

ℹ️ Full options: homeboy docs commands/lint
Deep dive: homeboy lint data-machine --changed-since 66f56c0

Artifacts and drill-down

CI results artifact: homeboy-ci-results-data-machine-lint-quality-Linux-node24 contains immediate command JSON for this action invocation.
Observation artifact: homeboy-observations-data-machine-lint-quality-Linux-node24 contains exported Homeboy run history for deeper queries.
Drill-down: download the observation artifact, then run homeboy runs import <dir>, homeboy runs list, and homeboy runs findings <run-id>.
Artifacts are attached to the workflow run: https://github.com/Extra-Chill/data-machine/actions/runs/27133307544

Test

✅ test — passed

2 passed

ℹ️ Auto-fix lint issues: homeboy refactor data-machine --from lint --write
ℹ️ Collect coverage: homeboy test data-machine --coverage
ℹ️ Save test baseline: homeboy test data-machine --baseline
ℹ️ Pass args to test runner: homeboy test -- [args]
ℹ️ Full options: homeboy docs commands/test
Deep dive: homeboy test data-machine --changed-since 66f56c0

Artifacts and drill-down

CI results artifact: homeboy-ci-results-data-machine-test-quality-Linux-node24 contains immediate command JSON for this action invocation.
Observation artifact: homeboy-observations-data-machine-test-quality-Linux-node24 contains exported Homeboy run history for deeper queries.
Drill-down: download the observation artifact, then run homeboy runs import <dir>, homeboy runs list, and homeboy runs findings <run-id>.
Artifacts are attached to the workflow run: https://github.com/Extra-Chill/data-machine/actions/runs/27133307544

Audit

✅ audit — passed

Deep dive: homeboy audit data-machine --changed-since 66f56c0

Artifacts and drill-down

CI results artifact: homeboy-ci-results-data-machine-audit-quality-Linux-node24 contains immediate command JSON for this action invocation.
Observation artifact: homeboy-observations-data-machine-audit-quality-Linux-node24 contains exported Homeboy run history for deeper queries.
Drill-down: download the observation artifact, then run homeboy runs import <dir>, homeboy runs list, and homeboy runs findings <run-id>.
Artifacts are attached to the workflow run: https://github.com/Extra-Chill/data-machine/actions/runs/27133307544

Tooling versions

Homeboy CLI: homeboy 0.222.17+69b428aa
Extension: wordpress from https://github.com/Extra-Chill/homeboy-extensions
Extension revision: 3e5ce80d
Action: unknown@unknown

chubes4 mentioned this pull request Jun 8, 2026

Sandbox agent run: agents/chat ability unavailable when agents-api is double-loaded Automattic/wp-codebox#831

Closed

chubes4 merged commit 8e3c6c4 into main Jun 8, 2026
5 checks passed

chubes4 deleted the test/sandbox-opt-in-tool-resolution branch June 8, 2026 11:46

chubes4 merged 1 commit into main from test/sandbox-opt-in-tool-resolution