Agent sandbox should fail loudly when workspace tools are unavailable #175

@chubes4

Problem

The WP Codebox sandbox guidance tells the model to use the workspace tools (workspace_read, workspace_ls, workspace_grep, workspace_write, workspace_edit, workspace_apply_patch), but in the lab run the agent replied that the workspace tools were unavailable. The sandbox still reported success and returned an empty patch.

Evidence from lab run:

  • Fanout run: /tmp/homeboy-wp-codebox-audit-bwz45ims/fanout-run.json
  • Example record: missing_method
  • Example artifact: /tmp/homeboy-wp-codebox-artifacts-pr44ovsz/runtime-mpohjxp3-c85xcu
  • The nested agent result was successful/completed from the runtime's perspective, but the reply began: "I was unable to read the workspace files directly because the workspace tools are not available in this context."
  • The artifact's files/patch.diff had size 0, so no edit was produced.

Relevant WP Codebox code:

  • packages/cli/src/agent-code.ts builds sandbox guidance listing the workspace tools.
  • It also sets client_context.tool_contract and tool_policy with the sandbox tool names.
  • Despite that, the model did not receive/use executable workspace tools in the real run.

Why this matters

If workspace tools are unavailable, the sandbox should fail before or during the agent run with a structured diagnostic. A text-only "I cannot access tools" answer is not an acceptable successful autofix artifact.
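
To make the distinction above concrete, here is a minimal sketch of how a run could be classified after the fact. The names RunSummary, RunOutcome, and classifyRun are hypothetical and do not come from WP Codebox; the only facts taken from the lab run are that the runtime reported completion and that files/patch.diff had size 0.

```typescript
// Hypothetical types: not part of the WP Codebox API.
type RunOutcome = "actionable" | "non_actionable" | "failed";

interface RunSummary {
  runtimeStatus: "completed" | "error"; // what the runtime reported
  patchBytes: number;                   // size of files/patch.diff in bytes
  missingTools: string[];               // required tools that were not available
}

interface Classification {
  outcome: RunOutcome;
  reason: string;
}

// Classify a finished run so that a text-only reply with an empty patch
// is never reported as a successful autofix.
function classifyRun(run: RunSummary): Classification {
  if (run.runtimeStatus !== "completed") {
    return { outcome: "failed", reason: "runtime_error" };
  }
  if (run.missingTools.length > 0) {
    return { outcome: "failed", reason: "missing_workspace_tools" };
  }
  if (run.patchBytes === 0) {
    return { outcome: "non_actionable", reason: "empty_patch" };
  }
  return { outcome: "actionable", reason: "patch_produced" };
}
```

Under this sketch, the lab run described above (runtime completed, empty patch) would be classified as non_actionable even if the missing tools were not detected, and as failed if they were.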

Acceptance criteria

  • WP Codebox verifies that requested sandbox workspace tools are registered and visible before starting an agent run.
  • The agent runtime result includes a structured diagnostic listing requested, registered, allowed, and missing tools.
  • Runs fail or return a non-actionable status when required workspace tools are missing/unavailable.
  • The sandbox prompt, tool policy, and tool contract all use the same canonical tool names that Data Machine/Data Machine Code exposes.
  • Smoke coverage reproduces a missing-tool case and proves it is not reported as successful.
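
The first three criteria could be covered by a preflight check along these lines. This is only a sketch under assumptions: ToolDiagnostic, diagnoseTools, and assertToolsAvailable are hypothetical names, and how WP Codebox actually obtains the registered and allowed tool lists is not shown here. Only the six tool names are taken from the sandbox guidance.

```typescript
// Tool names as listed in the sandbox guidance.
const WORKSPACE_TOOLS: readonly string[] = [
  "workspace_read",
  "workspace_ls",
  "workspace_grep",
  "workspace_write",
  "workspace_edit",
  "workspace_apply_patch",
];

// Hypothetical diagnostic shape: requested, registered, allowed, missing.
interface ToolDiagnostic {
  requested: string[];
  registered: string[];
  allowed: string[];
  missing: string[];
  ok: boolean;
}

// A requested tool counts as missing if it is not registered
// or not allowed by the tool policy.
function diagnoseTools(
  requested: readonly string[],
  registered: readonly string[],
  allowed: readonly string[],
): ToolDiagnostic {
  const registeredSet = new Set(registered);
  const allowedSet = new Set(allowed);
  const missing = requested.filter(
    (name) => !registeredSet.has(name) || !allowedSet.has(name),
  );
  return {
    requested: [...requested],
    registered: [...registered],
    allowed: [...allowed],
    missing,
    ok: missing.length === 0,
  };
}

// Fail loudly before the agent run starts, attaching the diagnostic to the error.
function assertToolsAvailable(
  requested: readonly string[],
  registered: readonly string[],
  allowed: readonly string[],
): ToolDiagnostic {
  const diagnostic = diagnoseTools(requested, registered, allowed);
  if (!diagnostic.ok) {
    const message = `Workspace tools unavailable: ${diagnostic.missing.join(", ")}`;
    throw Object.assign(new Error(message), { diagnostic });
  }
  return diagnostic;
}
```

A smoke test for the missing-tool case could call assertToolsAvailable with a deliberately incomplete registered list and check that it throws with a populated missing array, rather than letting the run proceed to a "successful" empty patch.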

Related

AI assistance

  • AI assistance: Yes
  • Tool(s): OpenCode (GPT-5.5)
  • Used for: Inspected WP Codebox sandbox configuration and lab agent output, then drafted this issue.
