Agent sandbox run should fail or timeout when nested agent remains processing with pending tools

## Problem

WP Codebox `wp-codebox.agent-sandbox-run` can complete the recipe step with exit code 0 even when the nested Data Machine `agents/chat` result is still processing, has pending tools, and has reached max turns without a final answer or file changes.

This lets parent Homeboy logic see runtime artifacts and attempt late outcome classification, even though the agent did not complete the task.

## Evidence

Parent tracker: https://github.com/Extra-Chill/homeboy/issues/3378

Overlay run:

- Run id: `homeboy-3378-release-channel-fanout-overlay-20260603015451`
- Transcript A: `/var/folders/lr/c_cmmt7s0592m4njz99v5yb40000gn/T/opencode/homeboy-3378-release-channel-fanout-overlay-20260603015451-a-artifacts/runtime-mpxez1y8-ubvf6z/files/transcript.json`
- Agent result A: `/var/folders/lr/c_cmmt7s0592m4njz99v5yb40000gn/T/opencode/homeboy-3378-release-channel-fanout-overlay-20260603015451-a-artifacts/runtime-mpxez1y8-ubvf6z/files/agent-result.json`

Nested `agents/chat` metadata:

```text
status: processing
current_turn: 20
has_pending_tools: true
completed: null/false
```

WP Codebox agent result:

```json
{
  "schema": "wp-codebox/agent-result/v1",
  "status": "completed",
  "actionable": false,
  "summary": "Agent sandbox completed without actionable file changes.",
  "changedFiles": { "count": 0 },
  "patch": { "bytes": 0 },
  "noOpReason": "no_file_changes"
}
```

The recipe command itself had exit code 0, and the parent Homeboy run later preserved empty patch artifacts.

## Expected behavior

A sandbox agent run should not be classified as completed when the nested conversation is still processing or has pending tools at max turns.

## Acceptance criteria

- Inspect nested `agent_runtime.result.completed`, `metadata.status`, `metadata.has_pending_tools`, `metadata.current_turn`, and max-turn diagnostics.
- If nested state is `processing` with pending tools or max turns reached, classify the WP Codebox agent result as `timeout`, `incomplete`, or `failed`, not `completed`.
- Emit a structured diagnostic such as `agent_runtime.incomplete_pending_tools`.
- Produce a non-zero recipe outcome when the nested agent cannot complete and no explicit no-op final answer exists.
- Add smoke coverage using a fake nested `agents/chat` result with `status: processing`, `has_pending_tools: true`, and no file changes.
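The criteria above can be sketched as a small classifier. This is a minimal illustration in Python, not the WP Codebox implementation. The function name `classify_nested_run`, the `Outcome` type, the flat `completed` / `metadata` input shape, the `has_final_answer` flag, the exit code value `1`, the assumed max-turn limit of 20 (taken from the `current_turn: 20` evidence), and the `agent_runtime.incomplete_max_turns` diagnostic are all assumptions made for the example. Only `agent_runtime.incomplete_pending_tools` and the metadata field names come from this issue. The sketch also reads the second criterion as "processing with pending tools, or max turns reached without completion".

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

# Assumption: the evidence shows current_turn 20 at max turns; the real limit may differ.
ASSUMED_MAX_TURNS = 20

# Diagnostic named in this issue.
DIAG_PENDING_TOOLS = "agent_runtime.incomplete_pending_tools"
# Hypothetical diagnostic, invented for this sketch.
DIAG_MAX_TURNS = "agent_runtime.incomplete_max_turns"


@dataclass(frozen=True)
class Outcome:
    status: str                # "completed", "incomplete", or "failed"
    exit_code: int             # 0 only when the run genuinely completed
    diagnostic: Optional[str]  # structured diagnostic code, if any


def classify_nested_run(
    nested: Mapping[str, Any],
    has_final_answer: bool,
    max_turns: int = ASSUMED_MAX_TURNS,
) -> Outcome:
    """Classify a sandbox run from a nested agents/chat result (hypothetical shape)."""
    metadata = nested.get("metadata") or {}
    # Both null and false count as "not completed".
    completed = nested.get("completed") is True
    processing = metadata.get("status") == "processing"
    pending_tools = bool(metadata.get("has_pending_tools"))
    at_max_turns = int(metadata.get("current_turn") or 0) >= max_turns

    if (processing and pending_tools) or (at_max_turns and not completed):
        diagnostic = DIAG_PENDING_TOOLS if pending_tools else DIAG_MAX_TURNS
        return Outcome("incomplete", 1, diagnostic)

    # No completion and no explicit no-op final answer: still a non-zero outcome.
    if not completed and not has_final_answer:
        return Outcome("failed", 1, None)

    return Outcome("completed", 0, None)


# Smoke fixture mirroring the nested metadata in the Evidence section.
fixture = {
    "completed": None,
    "metadata": {
        "status": "processing",
        "current_turn": 20,
        "has_pending_tools": True,
    },
}
outcome = classify_nested_run(fixture, has_final_answer=False)
```

With this fixture, the sketch yields an `incomplete` status, a non-zero exit code, and the pending-tools diagnostic, rather than the `completed` / `no_file_changes` result shown in the evidence.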

## Related

- #533
- #518
- Extra-Chill/homeboy#3378
- Extra-Chill/homeboy#3383
- Extra-Chill/data-machine#2470

## AI assistance

- **AI assistance:** Yes
- **Tool(s):** OpenCode (openai/gpt-5.5)
- **Used for:** Reproducing the nested agent pending-tools outcome and drafting this tracker. Chris remains responsible for review and prioritization.
