
Stop reporting broken worktree metadata as probe_timeout #540

@chubes4

Problem

DMC cleanup reports some rows as probe_timeout, but the sampled failures are not timeouts. They are broken or stale git worktree metadata, often with an owning repo/path mismatch.

Example cleanup command:

studio wp datamachine-code workspace worktree cleanup --dry-run --format=json

Current summary included:

{
  "probe_timeout": 32
}

Sample row:

{
  "handle": "sandbox-runtime@agent-runtime-command",
  "repo": "wp-codebox",
  "branch": "feature/recipe-validate",
  "path": "/Users/chubes/Developer/sandbox-runtime@agent-runtime-command",
  "reason_code": "probe_timeout",
  "reason": "cleanup safety probe timed out - leaving in place: Git command failed (exit 128): fatal: not a git repository: /Users/chubes/Developer/sandbox-runtime/.git/worktrees/sandbox-runtime@agent-runtime-command"
}

Raw check:

git -C /Users/chubes/Developer/sandbox-runtime@agent-runtime-command status --short --branch
fatal: not a git repository: /Users/chubes/Developer/sandbox-runtime/.git/worktrees/sandbox-runtime@agent-runtime-command

That is not an execution timeout. It is stale/broken git metadata or an incorrect ownership mapping (handle starts with sandbox-runtime, but repo is wp-codebox).
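The ownership mismatch in the sample row can be detected mechanically. A minimal sketch, assuming handles follow a `<repo>@<worktree-slug>` shape as the sample suggests; `repo_from_handle` and `owner_repo_mismatch` are hypothetical names, not existing DMC functions:

```python
def repo_from_handle(handle: str) -> str:
    """Return the repo portion of a handle.

    Assumption: handles look like "<repo>@<worktree-slug>", as in the
    sample row. A handle without "@" is returned unchanged.
    """
    return handle.split("@", 1)[0]


def owner_repo_mismatch(row: dict) -> bool:
    """True when the handle-derived repo disagrees with the row's repo field."""
    return repo_from_handle(row["handle"]) != row["repo"]


sample_row = {
    "handle": "sandbox-runtime@agent-runtime-command",
    "repo": "wp-codebox",
}
print(owner_repo_mismatch(sample_row))
```

For the sample row the handle-derived repo is sandbox-runtime while the recorded repo is wp-codebox, so the check flags it.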

Why this matters

A probe_timeout reason code tells the operator to retry or tune the probe budget. These rows instead need metadata repair, registry pruning, or stale worktree marker handling. Mislabeling them makes cleanup guidance point at the wrong fix.

Suggested fix

Split git probe failures into more precise reason codes:

  • probe_timeout for actual timeout/budget exhaustion.
  • broken_worktree_metadata or stale_worktree_marker when Git exits with repository/worktree metadata errors.
  • Potentially owner_repo_mismatch when handle-derived repo and normalized repo disagree.
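The split above could look roughly like this. It is a Python sketch of the classification logic only, not the PHP in Workspace.php; `classify_probe_failure` and the `probe_failed` catch-all are assumptions for illustration, and the stderr match is based solely on the message in the sample row:

```python
import re
from typing import Optional

PROBE_TIMEOUT = "probe_timeout"
BROKEN_WORKTREE_METADATA = "broken_worktree_metadata"
PROBE_FAILED = "probe_failed"  # assumed catch-all, not proposed in this issue

# Matches the Git error seen in the sample row.
_NOT_A_REPO = re.compile(r"fatal: not a git repository", re.IGNORECASE)


def classify_probe_failure(
    timed_out: bool, exit_code: Optional[int], stderr: str
) -> str:
    """Map a failed git probe to a reason code.

    Only a real timeout yields probe_timeout. Exit 128 with a
    "not a git repository" message is treated as broken metadata.
    """
    if timed_out:
        return PROBE_TIMEOUT
    if exit_code == 128 and _NOT_A_REPO.search(stderr):
        return BROKEN_WORKTREE_METADATA
    return PROBE_FAILED
```

The timeout flag is checked first so that a killed process with partial stderr output is never reclassified as a metadata error.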

Likely code paths:

  • inc/Workspace/Workspace.php build_worktree_probe_timeout_skip() and callers around dirty/unpushed/fetch probes.
  • inc/Workspace/WorkspaceMetadataReconciliation.php probe failure handling around reconciliation apply/dry-run.

Acceptance criteria

  • Git exit 128 with missing .git/worktrees/... is not reported as probe_timeout.
  • Cleanup output gives a specific next action: prune registry, reconcile metadata, or inspect broken worktree marker.
  • Actual process timeouts still report probe_timeout.
  • Summary buckets count broken metadata separately from runtime probe timeouts.
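To illustrate the last two criteria, here is a sketch of per-reason summary buckets plus a next-action lookup. The action strings paraphrase the options listed above; `summarize`, `next_action`, and the fallback text are hypothetical:

```python
from collections import Counter

# Next actions paraphrased from this issue; wording is illustrative.
NEXT_ACTION = {
    "probe_timeout": "retry the probe or tune the probe budget",
    "broken_worktree_metadata": (
        "prune registry, reconcile metadata, or inspect broken worktree marker"
    ),
}


def summarize(rows: list) -> dict:
    """Count rows per reason_code so broken metadata gets its own bucket."""
    return dict(Counter(row["reason_code"] for row in rows))


def next_action(reason_code: str) -> str:
    """Return the suggested operator action, with an assumed fallback."""
    return NEXT_ACTION.get(reason_code, "inspect manually")
```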

Suggested tests

  • Fixture with a worktree marker pointing at a missing common-dir classifies as broken metadata.
  • Fixture with a simulated command timeout still classifies as probe_timeout.
  • Fixture with handle/repo mismatch reports a distinct diagnostic or is routed to reconciliation.
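The first two fixtures can be built without a real repository. A sketch, assuming the broken case is a worktree whose .git file points at a gitdir that no longer exists (which matches the sample error); all helper names are hypothetical:

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def make_broken_worktree(root: Path) -> Path:
    """Create a directory whose .git file points at a missing gitdir."""
    worktree = root / "sandbox-runtime@agent-runtime-command"
    worktree.mkdir(parents=True)
    missing = root / "sandbox-runtime" / ".git" / "worktrees" / worktree.name
    (worktree / ".git").write_text(f"gitdir: {missing}\n")
    return worktree


def marker_is_broken(worktree: Path) -> bool:
    """True when the .git marker file names a gitdir that does not exist."""
    marker = worktree / ".git"
    if not marker.is_file():
        return False  # a .git directory or no marker at all: out of scope
    text = marker.read_text().strip()
    if not text.startswith("gitdir:"):
        return True
    target = Path(text[len("gitdir:"):].strip())
    if not target.is_absolute():
        target = worktree / target
    return not target.exists()


def probe_times_out(cmd: list, timeout_s: float) -> bool:
    """Run cmd and report whether it exceeded the time budget."""
    try:
        subprocess.run(cmd, capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    print(marker_is_broken(make_broken_worktree(Path(tmp))))

# A sleeping child process stands in for a slow git probe.
print(probe_times_out([sys.executable, "-c", "import time; time.sleep(5)"], 0.2))
```

Simulating the timeout with a sleeping child process keeps the test independent of Git, so only a genuine budget overrun produces the timeout result.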
