CI: adopt GitHub Merge Queue + tiered CI

## Summary

Adopt GitHub's native [Merge Queue](https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/configuring-pull-request-merges/managing-a-merge-queue) and split CI into two tiers so the heavy integration suite runs **once at merge time** instead of on every PR push, and so PR branches stay automatically up to date with `main`.

## Problem

Two bottlenecks slow down contributors and reviewers today:

1. **Manual branch updates.** Whenever `main` moves, every open PR must be rebased by hand or refreshed with the "Update branch" button. With multiple PRs in flight this serializes contributors and adds round trips that have nothing to do with the change being reviewed.
2. **Per-push integration approval.** The integration + release-validation suite runs on every PR push and is gated by an environment with required reviewers. Approval must be granted again on every new push, including WIP commits that will be rewritten. The approver pool is small relative to the contributor base, so this does not scale.

The result: long, repetitive feedback loops, wasted CI minutes on commits that will be force-pushed away, and a single point of contention for getting anything merged.

## Proposed change

Split CI into two tiers, gated by GitHub's native Merge Queue:

- **Tier 1 (every PR push)** — unit tests + binary build. Fast, no CI secrets, fork-safe. Stays in `ci.yml`. Provides quick correctness feedback to contributors.
- **Tier 2 (merge queue only)** — integration tests + release validation. Runs against the tentative merge commit on the `gh-readonly-queue/main/*` ref that the queue creates. Triggered by the `merge_group` event.

When a reviewer adds a PR to the queue, GitHub:

1. Builds a tentative merge of the PR onto the latest `main` (no manual "Update branch" needed).
2. Runs Tier 2 against that ref.
3. Auto-merges if checks pass; ejects the PR if they fail.

Multiple PRs can be batched into one Tier 2 run when traffic is high.
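
As a rough illustration of Tier 2, a queue-only workflow might look like the sketch below. The job names, `make` targets, and the `INTEGRATION_TOKEN` secret are placeholders invented for this example, not the repository's actual configuration. Only the `merge_group` trigger and its `checks_requested` activity type come from GitHub's documented event model.

```yaml
# ci-integration.yml: hypothetical Tier 2 sketch, not the repository's actual workflow.
name: ci-integration

on:
  merge_group:
    types: [checks_requested]  # the only activity type this event defines

jobs:
  integration:
    name: integration          # a required check has to match this name exactly
    runs-on: ubuntu-latest
    steps:
      # On merge_group, GITHUB_SHA is the tentative merge commit,
      # so the default checkout already tests "PR + latest main".
      - uses: actions/checkout@v4
      - run: make integration-test   # placeholder command
        env:
          INTEGRATION_TOKEN: ${{ secrets.INTEGRATION_TOKEN }}  # placeholder secret name

  release-validation:
    name: release-validation
    needs: integration
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make release-validate   # placeholder command
```

Because the workflow has no `pull_request` trigger, it never runs for an ordinary PR push.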

### Before

```mermaid
flowchart LR
 A[PR push] --> B[ci.yml unit + build]
 B -->|workflow_run| C{environment approval}
 C -->|approved| D[smoke]
 D --> E[integration]
 E --> F[release-validation]
 F --> G[commit status back to PR]
 H[main moves] -.-> I[manual Update branch] -.-> A
 C -.->|every push needs re-approval| C
```

### After

```mermaid
flowchart LR
 A[PR push] --> B["ci.yml Tier 1: unit + build, no secrets"]
 B --> R[review]
 R --> Q[reviewer adds to merge queue]
 Q --> M["merge_group event: tentative merge ref"]
 M --> T["Tier 2: integration + release-validation"]
 T -->|pass| X[auto-merge to main]
 T -->|fail| Y[eject from queue]
```

## Trust model

The trust boundary moves to **write access**. PRs from contributors without write access run only Tier 1 and never touch CI secrets, so fork PRs remain safe. Adding a PR to the queue requires write access; when the queue runs Tier 2, the merged code executes with secrets. This is the standard GitHub trust model: write access implies trust to run code with secrets, and access grants are managed in repo settings.

Because Tier 2 only runs in `merge_group` context (not `pull_request`), there is no longer a need for the `workflow_run` indirection or a per-push environment approval.
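
To make the boundary concrete, a Tier 1 workflow under this model could be sketched as follows. The job names and `make` targets are assumptions for illustration. The relevant properties are that nothing references `secrets.*` and that the token is read-only, which matches how GitHub already treats `pull_request` runs from forks.

```yaml
# ci.yml: hypothetical Tier 1 sketch; job names and make targets are placeholders.
name: ci

on:
  pull_request:                # every PR push, including pushes from forks
  merge_group:
    types: [checks_requested]  # also report when the PR sits in the queue

permissions:
  contents: read               # read-only GITHUB_TOKEN, nothing else granted

jobs:
  unit:
    name: unit
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make test         # placeholder; references no secrets
  build:
    name: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make build        # placeholder; references no secrets
```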

## Benefits

- Removes the manual "Update branch" loop entirely.
- The integration suite runs once per queue entry (or once per batched group) instead of once per WIP push, which should reduce CI minutes substantially.
- No single approver in the critical path for every PR.
- Native GitHub feature: no third-party bots, no extra services.
- Batching support scales throughput when activity is high.

## Rollout plan

1. **Additive merge_group trigger** — add `merge_group` to `ci.yml` so Tier 1 runs in both contexts. Land alongside the existing `workflow_run` path so nothing changes for PR authors yet.
2. **Rewrite `ci-integration.yml`** to trigger on `merge_group` instead of `workflow_run`. Drop the environment approval, the separate smoke job, and the commit-status shim. Keep the legacy `workflow_run` path running in parallel during a shadow week to compare results.
3. **Verify check names** in a real `merge_group` run and capture them verbatim.
4. **Enable merge queue on `main`** in branch protection settings, with the verified check names as required. Disable "require branches to be up to date" (the queue handles it). Enable repo-wide auto-merge.
5. **Retire the legacy approval environment** and remove the old `workflow_run` path from `ci-integration.yml`.
6. **Update docs** (`CONTRIBUTING.md`, CI internals docs) and add a `CHANGELOG.md` entry.

Each step ships as its own PR so it can be reverted independently.
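
One possible shape for the shadow week in step 2 is a single workflow that accepts both triggers and routes each to its own job. This is only a sketch: the upstream workflow name `ci`, the environment name `integration-approval`, and the `make` target are hypothetical stand-ins for whatever the repository uses today.

```yaml
# ci-integration.yml during the shadow week: hypothetical sketch.
name: ci-integration

on:
  workflow_run:
    workflows: ["ci"]          # assumed name of the Tier 1 workflow
    types: [completed]
  merge_group:
    types: [checks_requested]

jobs:
  legacy-integration:
    # Old path: still gated by the approval environment.
    if: github.event_name == 'workflow_run'
    environment: integration-approval   # hypothetical environment name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.workflow_run.head_sha }}
      - run: make integration-test      # placeholder command

  queue-integration:
    # New path: runs against the tentative merge commit, no approval step.
    if: github.event_name == 'merge_group'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make integration-test      # placeholder command
```

Step 5 would then amount to deleting the `workflow_run` trigger and the `legacy-integration` job.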

## Out of scope

- Third-party merge bots (Mergify, Kodiak, bors) — the native GitHub feature covers this.
- Required-review policy changes — orthogonal to CI plumbing.
- Test parallelization, sharding, or runner type changes — separate optimization.
- Live-inference test scheduling (`ci-runtime.yml`) — already isolated and stays as-is.

## Risks and mitigations

- **Required check names must match exactly** in `merge_group` context. Mitigated by verifying in a real run before flipping branch protection.
- **Cutover bug at merge time** would block merges. Mitigated by the additive shadow week before removing the legacy path.
- **A flaky test stalls the queue.** Mitigated by disabling the queue from settings, which takes one click and does not touch workflow YAML.
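
One optional way to reduce the check-name risk, offered here as a suggestion rather than part of the plan above, is to require a single aggregate job whose name stays fixed even if the underlying jobs are renamed or split. The job ids and the `tier2-gate` name below are hypothetical.

```yaml
# Hypothetical aggregate-gate sketch; job ids and commands are placeholders.
name: ci-integration

on:
  merge_group:
    types: [checks_requested]

jobs:
  integration:
    runs-on: ubuntu-latest
    steps:
      - run: echo "integration tests would run here"

  release-validation:
    runs-on: ubuntu-latest
    steps:
      - run: echo "release validation would run here"

  tier2-gate:
    name: tier2-gate           # the only name branch protection would require
    if: always()               # run even when an upstream job failed
    needs: [integration, release-validation]
    runs-on: ubuntu-latest
    steps:
      - name: Fail unless every upstream job succeeded
        run: |
          test "${{ needs.integration.result }}" = "success"
          test "${{ needs.release-validation.result }}" = "success"
```

The name would still need to be confirmed in a real `merge_group` run before it is marked as required.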

## Tracking

This issue is the design doc. Implementation PRs will reference it. Discussion welcome — especially from contributors who have hit either bottleneck.
