feat(studio): real-time eval progress via SSE push

## Summary

AgentV Studio currently polls `/api/runs` every 5 seconds to pick up new results. This means:
- Runs started from the CLI appear after up to a 5-second delay
- There is no per-test-case live progress — you only see a run after it fully completes
- No visual indication that an eval is running

Convex Evals shows pending status while evals are in-flight and updates stats in real time as each test case finishes. We should match this.

## Motivation

- When a user starts `agentv eval run` in the terminal, Studio should immediately show the run as in-progress and update pass/fail counts as each test resolves
- Eliminates the awkward wait after pressing **▶ Run Eval** in Studio where nothing appears to happen for several seconds
- Makes Studio feel live and trustworthy rather than stale

## Design

### Server-Sent Events endpoint

Add a `GET /api/events` SSE endpoint in `serve.ts` that pushes events to connected Studio clients:

```
event: run_started
data: {"run_id": "...", "eval_file": "...", "target": "...", "total": 12}

event: test_result
data: {"run_id": "...", "test_id": "...", "score": 0.95, "status": "PASS", "passed": 5, "total": 12}

event: run_completed
data: {"run_id": "...", "passed": 10, "failed": 2, "pass_rate": 0.833}
```

The orchestrator already emits per-test results — hook into those to broadcast SSE events.
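One way the broadcast side could look is sketched below. It only covers serializing the three events above into SSE frames and fanning them out to subscribers. The names `StudioEvent`, `formatSseFrame`, and `EventHub` are hypothetical and not taken from `serve.ts`, and how the orchestrator exposes its per-test results is assumed rather than known.

```typescript
// Hypothetical event shapes, mirroring the payloads listed above.
type StudioEvent =
  | { type: "run_started"; data: { run_id: string; eval_file: string; target: string; total: number } }
  | { type: "test_result"; data: { run_id: string; test_id: string; score: number; status: string; passed: number; total: number } }
  | { type: "run_completed"; data: { run_id: string; passed: number; failed: number; pass_rate: number } };

// Serialize one event into the SSE wire format:
// an "event:" line, a "data:" line, then a blank line that ends the frame.
// JSON.stringify without indentation never emits newlines, so one data line is enough.
function formatSseFrame(event: StudioEvent): string {
  return `event: ${event.type}\ndata: ${JSON.stringify(event.data)}\n\n`;
}

// A subscriber is anything that accepts a text chunk,
// for example a thin wrapper around an HTTP response's write method.
type Subscriber = (chunk: string) => void;

// Hypothetical fan-out hub: the orchestrator hook calls broadcast(),
// and each open /api/events connection registers one subscriber.
class EventHub {
  private subscribers = new Set<Subscriber>();

  // Register a subscriber; the returned function removes it again
  // (to be called when the client disconnects).
  subscribe(send: Subscriber): () => void {
    this.subscribers.add(send);
    return () => {
      this.subscribers.delete(send);
    };
  }

  // Send one frame to every currently connected subscriber.
  broadcast(event: StudioEvent): void {
    const frame = formatSseFrame(event);
    for (const send of this.subscribers) {
      send(frame);
    }
  }
}
```

In this sketch the `GET /api/events` handler would respond with `Content-Type: text/event-stream`, subscribe the response's write function to the hub, and call the returned unsubscribe function when the connection closes.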

### Studio client

Replace the 5-second polling interval on `runListOptions` with an SSE listener. On `run_started`, immediately add an in-progress row to the run list. On `test_result`, update the row's pass/fail counts live. On `run_completed`, mark the row as done and trigger a full refresh.

Fall back to 5-second polling if the SSE connection drops or is unavailable.
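The row updates described above can be sketched as a pure reducer over the run list. This is a minimal illustration, not Studio's actual state model: `RunRow`, `RunEvent`, and `applyRunEvent` are hypothetical names, only the payload fields the reducer needs are typed, and the failed count is derived locally on the assumption that each test emits exactly one `test_result` and that any status other than `"PASS"` counts as a failure.

```typescript
// Hypothetical row shape for the run list.
type RunRow = {
  runId: string;
  status: "in_progress" | "done";
  passed: number;
  failed: number;
  total: number;
  passRate?: number;
};

// Subset of the SSE payload fields that the reducer needs.
type RunEvent =
  | { type: "run_started"; run_id: string; total: number }
  | { type: "test_result"; run_id: string; status: string; passed: number; total: number }
  | { type: "run_completed"; run_id: string; passed: number; failed: number; pass_rate: number };

// Return a new row list with the event applied; the input list is not mutated.
function applyRunEvent(rows: RunRow[], ev: RunEvent): RunRow[] {
  switch (ev.type) {
    case "run_started":
      // Ignore a duplicate start for a run that already has a row.
      if (rows.some((r) => r.runId === ev.run_id)) return rows;
      // New runs go to the top of the list as in-progress rows.
      return [
        { runId: ev.run_id, status: "in_progress", passed: 0, failed: 0, total: ev.total },
        ...rows,
      ];
    case "test_result":
      return rows.map((r): RunRow =>
        r.runId !== ev.run_id
          ? r
          : {
              ...r,
              passed: ev.passed,
              // Assumption: anything other than "PASS" is one more failure.
              failed: r.failed + (ev.status === "PASS" ? 0 : 1),
              total: ev.total,
            },
      );
    case "run_completed":
      // The final counts from the server replace the locally derived ones.
      return rows.map((r): RunRow =>
        r.runId !== ev.run_id
          ? r
          : { ...r, status: "done", passed: ev.passed, failed: ev.failed, passRate: ev.pass_rate },
      );
  }
}
```

In the browser, an `EventSource` on `/api/events` could feed this reducer through `addEventListener` for each named event, and its `onerror` handler is a natural place to re-enable the 5-second polling until the connection recovers.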

### Pending status indicator

While a run is in-progress, show a spinner or pulsing indicator in the run list row instead of the ✓/✗ status dot.

## Acceptance Signals

All signals must be verified **manually using `agent-browser`** (no mocking).

- [ ] Start `agentv studio` and open it in agent-browser. Run `agentv eval run` from a separate terminal. Within 1 second of the CLI command starting, a new in-progress row appears in Studio's Recent Runs tab without any manual refresh.
- [ ] As each test case completes, the Passed/Failed/Total counts on the in-progress row update live (verified by agent-browser snapshotting the row mid-run and confirming counts change between snapshots).
- [ ] When the eval run finishes, the row transitions from in-progress to a final ✓/✗ status dot and the Pass Rate pill shows the final score.
- [ ] If the SSE connection is lost (kill and restart the studio server), the client falls back to polling and the run list still updates within 10 seconds.
- [ ] Pressing **▶ Run Eval** inside Studio shows an in-progress row immediately on submit — no visible delay.
- [ ] Opening Studio while a run is already mid-flight shows the in-progress row with current partial counts (server broadcasts current state on SSE connect).
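The last signal implies the server keeps per-run state so it can replay it to a client that connects mid-run. A minimal sketch of that bookkeeping follows. `InFlightRuns` and the `run_snapshot` event name are hypothetical; the design above does not say which event carries the current state on connect.

```typescript
// State retained for each run that has started but not yet completed.
type InFlightRun = { run_id: string; passed: number; failed: number; total: number };

// Hypothetical tracker, updated from the same hook that broadcasts SSE events.
class InFlightRuns {
  private runs = new Map<string, InFlightRun>();

  // Called when a run starts.
  start(run_id: string, total: number): void {
    this.runs.set(run_id, { run_id, passed: 0, failed: 0, total });
  }

  // Called once per finished test case; unknown run ids are ignored.
  record(run_id: string, testPassed: boolean): void {
    const run = this.runs.get(run_id);
    if (!run) return;
    if (testPassed) run.passed += 1;
    else run.failed += 1;
  }

  // Called when a run completes; it no longer needs to be replayed.
  complete(run_id: string): void {
    this.runs.delete(run_id);
  }

  // One SSE frame per in-flight run, to be written to a newly connected
  // client before it starts receiving live events.
  // "run_snapshot" is an assumed event name.
  snapshotFrames(): string[] {
    return [...this.runs.values()].map(
      (run) => `event: run_snapshot\ndata: ${JSON.stringify(run)}\n\n`,
    );
  }
}
```

Sending the snapshot frames first and live events afterwards keeps the client logic simple: a snapshot can be treated as creating or overwriting the row, and later `test_result` events update it as usual.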
