Judge routing: needs_review_on_completion + judge field relationship undocumented; silent footgun

## Summary

Tasks with `needs_review_on_completion=True` and `judge=False` transition correctly to `pending_completion_review` on executor completion, but are then **invisible to fleet-wide judge agents** because `/v2/reviews/pending/` (`PendingReviewsView`) filters on `judge=True` (`src/reviews/views.py:99`). If no project-scoped reviewer polls, the task is stuck.

This is not strictly a bug — `judge=True` is the explicit opt-in for fleet-wide visibility — but the two fields' relationship is undocumented and the failure mode is silent.

## Reproducer (observed 2026-05-23 on vafi-dev)

A Pass-2 evaluation-loop task (`PGieeQr_s9XkcdLsbIwYn`) was created with the canonical Pass-2 recipe (carried in the vafi workspace cheat-sheet for weeks):

```python
Task.objects.create(
    ...,
    required_tags=['claude'],
    needs_review_on_completion=True,
    isolation='sequential',
    status='draft',
)
# judge defaults to False
```

The executor delivered cleanly and the task entered `pending_completion_review` at ~06:03 UTC. The judge agent (`tags=['judge']`) polled `/v2/reviews/pending/` every ~30s and consistently got `200 OK` with `items=[]`. The judge agent logs were healthy and the task was in the right state, yet no review was ever picked up.

After ~22 min stuck, manual diagnosis led to a one-line fix: `t.judge = True; t.save()`. The judge picked up the task within one poll cycle and the verdict landed.

## Why this matters now

The substrate fix in vafi#36 (kb gotcha `XsPemtnm` — controller note-400 resilience) unblocked the Pass-2 Phase-1 first-real-run, which is what surfaced this. Without `judge=True`, the substrate fix works but the loop is still blocked at the next stage.

## Field semantics today (from reading the model + endpoint)

- `needs_review_on_completion: BooleanField(default=True)` — controls whether `doing → pending_completion_review` or `doing → done` on executor completion (`src/tasks/state_machine.py:248`).
- `judge: BooleanField(default=False)` — controls whether the task is visible to fleet-wide judge agents via `/v2/reviews/pending/` (`src/reviews/views.py:99`).

These are orthogonal in the schema but coupled in practice: a task that needs a review **and** is intended to be reviewed by a fleet-wide judge needs BOTH set. A task that needs a review by a project member can keep `judge=False` and rely on membership-scoped pickup via `/v1/tasks/?status=pending_completion_review`. That membership scoping is exactly what vtaskforge#6 fixed for judge-role agents, which leaves membership-scoped review as the fallback for non-judge reviewers only.
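To make the coupling concrete, here is a minimal, framework-free sketch of the two behaviours described above. The `Task` dataclass and both helper functions are hypothetical stand-ins, not the real model or view; in particular, the assumption that the endpoint also filters on `status` is inferred from its name rather than read from `src/reviews/views.py`.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical stand-in for the real model; only the fields discussed here."""
    id: str
    status: str
    needs_review_on_completion: bool = True  # default per src/tasks/models.py:57
    judge: bool = False                      # default per src/tasks/models.py:91


def complete(task: Task) -> Task:
    """Executor-completion transition as described for the state machine."""
    if task.needs_review_on_completion:
        task.status = "pending_completion_review"
    else:
        task.status = "done"
    return task


def pending_reviews_for_fleet_judges(tasks: list[Task]) -> list[Task]:
    """Approximation of the /v2/reviews/pending/ filter (judge=True only).

    The status condition is an assumption made for this sketch.
    """
    return [
        t for t in tasks
        if t.status == "pending_completion_review" and t.judge
    ]


tasks = [
    complete(Task(id="a", status="doing")),              # needs review, judge=False
    complete(Task(id="b", status="doing", judge=True)),  # needs review, judge=True
    complete(Task(id="c", status="doing", needs_review_on_completion=False)),
]
visible = [t.id for t in pending_reviews_for_fleet_judges(tasks)]
print(visible)
```

Task `a` mirrors the reproducer: it reaches `pending_completion_review` but is absent from the fleet-wide list, and nothing in either step reports a problem.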

## Proposed direction (three options, prefer 1)

1. **Document the relationship + add a soft guard.** Add a clear docstring on both fields (each mentioning the other); optionally add a `clean()` method on `Task` that emits a warning (or raises `GuardViolation` on `todo` entry) when `needs_review_on_completion=True`, `judge=False`, and no project-scoped reviewer member exists. Cheapest, most conservative; keeps the schema orthogonal.

2. **Make the endpoint UNION-aware.** `PendingReviewsView.get_queryset()` would return `Q(judge=True) | Q(needs_review_on_completion=True, ...)`. Risk: judges then see tasks not intended for them (e.g. an executor-team task pending project-member review). This could be filtered by another label, but that adds policy where there was none.

3. **Promote `needs_review_on_completion` semantics.** When `needs_review_on_completion=True` AND the project has registered fleet-judge agents, auto-set `judge=True`. Magic, fragile, breaks the principle that flags don't mutate themselves.
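As a rough illustration of the soft guard in option 1, the check could look something like the sketch below. It is plain Python rather than a real `Task.clean()`; the function name, the `ReviewRoutingWarning` category, and the `has_project_reviewer` argument are all hypothetical, and how "a project-scoped reviewer member exists" would actually be determined is left open.

```python
import warnings


class ReviewRoutingWarning(UserWarning):
    """Hypothetical warning category for unroutable completion reviews."""


def check_review_routing(
    needs_review_on_completion: bool,
    judge: bool,
    has_project_reviewer: bool,
) -> bool:
    """Return True if some reviewer can pick the task up; warn otherwise.

    Mirrors the condition in option 1: a review is required, the task is
    not opted in to fleet-wide judges, and no project-scoped reviewer exists.
    """
    if not needs_review_on_completion:
        return True  # task goes doing -> done, no review to route
    if judge or has_project_reviewer:
        return True  # at least one pickup path exists
    warnings.warn(
        "needs_review_on_completion=True but judge=False and no "
        "project-scoped reviewer exists; the review may never be picked up",
        ReviewRoutingWarning,
        stacklevel=2,
    )
    return False
```

In a real implementation this condition would presumably live in `Task.clean()` or in the `todo`-entry guard; raising `GuardViolation` instead of warning is the stricter variant mentioned in option 1.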

## Workaround in use today

The vafi workspace cheat-sheet (handoff 2026-05-23) was corrected to set `judge=True` explicitly in the task-creation recipe. Three Pass-2 evaluation-loop tasks have since been created with this recipe and all transitioned cleanly. kb gotcha `uhUSfjkp` recorded.
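For reference, the corrected recipe amounts to the reproducer's keyword arguments plus an explicit `judge=True`. The helper below is a hypothetical convenience wrapper, not the cheat-sheet's actual wording; the fields elided as `...` in the reproducer are still elided here.

```python
def pass2_task_kwargs(**overrides):
    """Keyword arguments for a Pass-2 task, with judge set explicitly.

    Hypothetical helper: only the fields shown in the reproducer are
    included; the remaining (elided) fields must be supplied by the caller.
    """
    kwargs = {
        "required_tags": ["claude"],
        "needs_review_on_completion": True,
        "judge": True,  # explicit opt-in for fleet-wide judge visibility
        "isolation": "sequential",
        "status": "draft",
    }
    kwargs.update(overrides)
    return kwargs


# Intended use (requires the real Django model, so left as a comment):
# Task.objects.create(..., **pass2_task_kwargs())
```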

## References

- `src/reviews/views.py:74-99` (PendingReviewsView)
- `src/tasks/models.py:91` (`judge = BooleanField(default=False)`)
- `src/tasks/models.py:57` (`needs_review_on_completion = BooleanField(default=True)`)
- vafi#36 (substrate fix that unblocked this scenario)
- vtaskforge#6 (the prior silent-fail incident that motivated `/v2/reviews/pending/`)
- vafi kb gotcha `uhUSfjkp` (2026-05-23)
