Problem
Eval fix tasks (#97-0102) exist but never get picked, because lower-numbered tasks always win in the queue. The E2E eval score is 66/100, which means nightshift doesn't reliably work against real repos. This is the same deprioritization pattern as the release problem.
Fix
Add an eval score gate to docs/prompt/evolve-auto.md:
EVAL SCORE GATE: After running Step 0 evaluation, check the score. If the latest evaluation in docs/evaluations/ scored below 80/100, you MUST pick an eval-related task (any task created by an evaluation report, or any task that would improve eval dimensions) before any other normal-priority task. The product doesn't work in production until the eval score proves it does. This gate overrides the lowest-number-first rule for normal tasks.
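The selection rule described by the gate can be sketched roughly as follows. This is only an illustration of the intended ordering, not how nightshift actually picks tasks: the `Task` type, the `eval_related` flag, and `pick_normal_task` are hypothetical names, and the sketch assumes the input contains normal-priority tasks only. The fallback to lowest-number-first when no eval-related task exists is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Threshold taken from the gate text: scores below 80/100 trigger the gate.
EVAL_SCORE_THRESHOLD = 80


@dataclass(frozen=True)
class Task:
    number: int         # task number used by the lowest-number-first rule
    eval_related: bool  # hypothetical flag: created by, or improves, an evaluation


def pick_normal_task(tasks: Sequence[Task], latest_eval_score: int) -> Optional[Task]:
    """Pick the next normal-priority task, applying the eval score gate."""
    if not tasks:
        return None
    if latest_eval_score < EVAL_SCORE_THRESHOLD:
        # Gate is active: eval-related tasks jump ahead of everything else.
        eval_tasks = [t for t in tasks if t.eval_related]
        if eval_tasks:
            return min(eval_tasks, key=lambda t: t.number)
    # Gate inactive, or no eval-related task exists (assumed fallback):
    # use the usual lowest-number-first rule.
    return min(tasks, key=lambda t: t.number)
```

With a score of 66, a queue holding a lower-numbered unrelated task and two eval-related tasks would yield the lowest-numbered eval-related task; at 80 or above the lowest-numbered task wins as before.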
Acceptance Criteria