release: v4.7.0 — auto-rebake PRs + drift issues lead with structured verdict by askalf · Pull Request #320 · askalf/dario

askalf · 2026-05-18T00:46:48Z

What does this PR do?

Closes an ergonomic gap that PR #317 (tonight's first real-world auto-rebake) exposed: the PR body opened with raw `[bake]` log output, then a unified-line diff. A reviewer had to read ~60 lines of detail to decide ship-or-investigate. The common case (text-only drift, ship it) looked identical at a glance to the rare case (tools removed, body_field_order changed, investigate).

v4.7.0 leads with a one-line verdict + per-axis bullets.

Two new exports in `scripts/drift-report.mjs`

`interpretDrift(diff)` — classifies the slot-level diff into a structured summary:

```typescript
{
  toolsAdded: string[]
  toolsRemoved: string[]
  betasAdded: string[]
  betasRemoved: string[]
  systemPromptDelta: number // chars added/removed, signed
  agentIdentityChanged: boolean
  bodyFieldOrderChanged: boolean
  headerOrderChanged: boolean
  verdict: 'benign' | 'moderate' | 'substantive'
}
```

Verdict ladder (conservative — substantive dominates moderate dominates benign):

| Verdict | Triggers | Action |
| --- | --- | --- |
| `benign` | only text-content drift (system_prompt / agent_identity / tool descriptions) | ship (the 90%+ case) |
| `moderate` | tools added, betas added/removed, agent_identity changed | probably ship, closer read |
| `substantive` | tools removed, body_field_order or header_order changed | investigate; can break canonical-rebuild |
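
The ladder can be sketched as a small classifier. This is a minimal illustration, not the code in `scripts/drift-report.mjs`: the name `pickVerdict` is hypothetical, and it assumes the structured fields listed earlier map one-to-one onto the triggers in the table (in particular, that `agentIdentityChanged` alone promotes to moderate).

```javascript
// Hypothetical sketch of the verdict ladder; not the actual implementation.
// The input uses the field names from the interpretDrift summary shape.
function pickVerdict(s) {
  // Substantive triggers are checked first so they dominate everything else.
  if (s.toolsRemoved.length > 0 || s.bodyFieldOrderChanged || s.headerOrderChanged) {
    return 'substantive';
  }
  // Moderate triggers: tools added, betas added or removed, agent identity changed.
  if (
    s.toolsAdded.length > 0 ||
    s.betasAdded.length > 0 ||
    s.betasRemoved.length > 0 ||
    s.agentIdentityChanged
  ) {
    return 'moderate';
  }
  // Anything left is text-only drift, or no drift at all.
  return 'benign';
}

// Baseline summary with no drift on any axis.
const empty = {
  toolsAdded: [], toolsRemoved: [], betasAdded: [], betasRemoved: [],
  systemPromptDelta: 0, agentIdentityChanged: false,
  bodyFieldOrderChanged: false, headerOrderChanged: false,
};

console.log(pickVerdict(empty)); // benign
console.log(pickVerdict({ ...empty, toolsAdded: ['NewTool'] })); // moderate
console.log(pickVerdict({ ...empty, toolsAdded: ['NewTool'], toolsRemoved: ['OldTool'] })); // substantive
```

Checking the most severe tier first is what makes the ladder conservative: a diff that both adds and removes tools is reported as substantive, never as moderate.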

`formatDriftSummary(interpretation)` — renders the structured summary as markdown for embedding in PR + issue bodies. Lead line: `Verdict: ✅ Benign` / `🟡 Moderate` / `🔴 Substantive`, then per-axis bullets with brief context.
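
A rough sketch of what such a renderer might look like follows. The function name `renderSummary` and the exact bullet wording are assumptions made for illustration; only the verdict labels, the emoji, and the array-of-lines return shape (implied by the `.join('\n')` call in the test instructions) come from this PR. Only three axes are rendered to keep the sketch short.

```javascript
// Hypothetical sketch; bullet wording is illustrative, not the script's real output.
const BADGES = {
  benign: '✅ Benign',
  moderate: '🟡 Moderate',
  substantive: '🔴 Substantive',
};

// Returns an array of markdown lines: the verdict first, then one bullet per changed axis.
function renderSummary(interp) {
  const lines = [`**Verdict:** ${BADGES[interp.verdict]}`, ''];
  if (interp.toolsRemoved.length > 0) {
    lines.push(`- tools removed: ${interp.toolsRemoved.join(', ')}`);
  }
  if (interp.toolsAdded.length > 0) {
    lines.push(`- tools added: ${interp.toolsAdded.join(', ')}`);
  }
  if (interp.systemPromptDelta !== 0) {
    // Negative numbers already carry their sign; add an explicit "+" for growth.
    const sign = interp.systemPromptDelta > 0 ? '+' : '';
    lines.push(`- system_prompt: ${sign}${interp.systemPromptDelta} chars net`);
  }
  return lines;
}

const example = {
  toolsAdded: [], toolsRemoved: [],
  systemPromptDelta: -2107, verdict: 'benign',
};
console.log(renderSummary(example).join('\n'));
// **Verdict:** ✅ Benign
//
// - system_prompt: -2107 chars net
```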

Wiring

`scripts/capture-and-bake.mjs --check`: prints the verdict-led summary before the unified-line detail. Also writes `drift-summary.md` to disk so the workflow can drop it verbatim into PR/issue bodies.
`.github/workflows/cc-drift-template-watch.yml`: both the auto-rebake PR body and the drift tracking issue body lead with a "### Summary" section before the existing "### Drift report" code block.
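
The body assembly can be pictured as follows. The real wiring lives in workflow YAML and shell, so this JavaScript version is only a model of the logic: `composeBody` is a hypothetical name, and passing `null` for the summary stands in for the case where `drift-summary.md` does not exist (the commit message mentions a `[ -f drift-summary.md ]` guard for backward compatibility).

```javascript
// Hypothetical model of the PR/issue body assembly; the real workflow is YAML + shell.
// `summary` is the contents of drift-summary.md, or null when the file is absent.
const FENCE = '`'.repeat(3); // built at runtime to avoid a literal fence in this example

function composeBody(summary, driftReport) {
  const parts = [];
  if (summary !== null) {
    // Verdict-led section goes first so reviewers see it before any detail.
    parts.push('### Summary', '', summary.trim(), '');
  }
  // The existing drift report stays as a code block underneath.
  parts.push('### Drift report', '', FENCE, driftReport.trim(), FENCE);
  return parts.join('\n');
}

console.log(composeBody('**Verdict:** ✅ Benign', '[bake] drift detected'));
console.log(composeBody(null, '[bake] drift detected')); // falls back to the old layout
```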

What a reviewer sees on the next drift PR (before reading any detail)

Verdict: ✅ Benign

system_prompt: -2107 chars net (text-content drift — see unified diff below)

Click merge. Done. The unified diff stays inline for the rare case where slot-level signal isn't enough.

Tests

`test/bake-drift-report.mjs` gains 12 headers (20-31) / 27 assertions covering:

Empty diff returns benign verdict, zero counts
Per-slot verdict promotions (benign → moderate → substantive)
Substantive-dominates-moderate ordering
Multi-tool comma-split parsing
`formatDriftSummary` emoji + label + bullet rendering across the three verdicts

69/69 file tests pass; 75/75 full suite green.
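
The comma-split parsing these tests exercise might look roughly like the sketch below. Both helper names are hypothetical, and the summary-line formats are inferred from the single example given in this PR (`tools added: NewTool` and the `delta +150` suffix); the assumption that multiple tools are comma-separated comes from the test name alone.

```javascript
// Hypothetical helpers; line formats are inferred from the example in this PR.

// Extracts a tool list from a line such as "tools added: A, B".
function parseToolList(summary, prefix) {
  if (!summary.startsWith(prefix)) return [];
  return summary
    .slice(prefix.length)
    .split(',')
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
}

// Extracts the signed character delta from a system_prompt summary line.
function parseSystemPromptDelta(summary) {
  const match = /delta ([+-]\d+)/.exec(summary);
  return match ? Number(match[1]) : 0;
}

console.log(parseToolList('tools added: NewTool, OtherTool', 'tools added:'));
// [ 'NewTool', 'OtherTool' ]
console.log(parseSystemPromptDelta('system_prompt content changed (12000 → 12150 chars, delta +150)'));
// 150
```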

How to test

```bash
git fetch origin feat/v4.7.0-drift-summary-header
git checkout feat/v4.7.0-drift-summary-header
npm run build && npm test # 75/75 (no src/ changes; new tests in test/bake-drift-report.mjs)
```

Optional: simulate the summary output:

```bash
node -e "
import('./scripts/drift-report.mjs').then(({ interpretDrift, formatDriftSummary }) => {
  const interp = interpretDrift([
    { summary: 'tools added: NewTool' },
    { summary: 'system_prompt content changed (12000 → 12150 chars, delta +150)' },
  ]);
  console.log(formatDriftSummary(interp).join('\n'));
});
"
```

Checklist

`npm run build` passes
`npm test` passes (offline regression test, no credentials required) — 75/75
For changes that touch `proxy.ts`, `cc-template.ts`, or streaming behavior: tested with `dario proxy --verbose` + `node test/compat.mjs` (requires credentials) — N/A: scripts + workflow + tests only
No new runtime dependencies added
No tokens/secrets in code or logs

PR #317 (tonight's first real-world auto-rebake) showed the chain works end-to-end but surfaced an ergonomic gap: the PR body opened with raw [bake] log output, then a unified-line diff. A reviewer had to read ~60 lines to decide ship-or-investigate. The common case (text-only drift, ship it) looked identical at a glance to the rare case (tools removed, investigate).

v4.7.0 leads with a one-line verdict + per-axis bullets.

scripts/drift-report.mjs gains two exports:

- interpretDrift(diff): classifies the slot-level diff into a structured summary { toolsAdded, toolsRemoved, betasAdded, betasRemoved, systemPromptDelta, agentIdentityChanged, bodyFieldOrderChanged, headerOrderChanged } + a single verdict: 'benign' | 'moderate' | 'substantive'. Verdict ladder is conservative; substantive dominates moderate dominates benign.
- formatDriftSummary(interpretation): renders the structured summary as markdown for direct embedding in PR + issue bodies. Leads with **Verdict:** ✅/🟡/🔴 + label, then per-axis bullets with brief context.

Verdict tiers:

- benign: only text content changed (the 90%+ case)
- moderate: tools added, betas changed, agent_identity changed
- substantive: tools REMOVED, body_field_order or header_order changed (can break canonical-rebuild paths)

Wiring:

- capture-and-bake.mjs --check: prints verdict-led summary before the unified-line detail; writes drift-summary.md to disk so the workflow can drop it verbatim into PR/issue bodies without grep-parsing the [bake] log.
- cc-drift-template-watch.yml: both the auto-rebake PR body and the drift tracking issue body lead with a "### Summary" section before the existing "### Drift report" code block. Guarded by `[ -f drift-summary.md ]` for backward compat.

Tests: test/bake-drift-report.mjs gains 12 headers / 27 assertions covering empty-diff verdict, per-slot verdict promotions, multi-axis aggregation, comma-split parsing, formatDriftSummary emoji + label + bullet rendering.

69/69 file tests pass; 75/75 full suite green. No src/ changes.

github-actions · 2026-05-18T00:47:23Z

Compat test: ❌ FAILED

Ran `node test/compat.mjs` against `dario proxy --passthrough` on the self-hosted runner for commit f6c5c167bdf332dcde360ea14880e5ae326a7fba.

Output

============================================================
  dario Compatibility Validation (--passthrough)
  2026-05-18T00:47:01.991Z
============================================================

--- Anthropic Messages API (Hermes) ---
❌ #1 Anthropic non-stream: HTTP 429: {"type":"error","error":{"type":"rate_limit_error","message":"Rate limited (rejected). Limiting wind
❌ #2 Anthropic stream: HTTP 429: {"type":"error","error":{"type":"rate_limit_error","message":"Rate limited (rejected). Limiting wind
❌ #3 SSE framing: HTTP 429

--- Passthrough Verification ---
❌ #4 No thinking injection: HTTP 429
❌ #5 Client betas preserved: HTTP 429: {"type":"error","error":{"type":"rate_limit_error","message":"Rate limited (rejected). Limiting wind

--- Tool Use (OpenClaw) ---
❌ #6 Tool use: stop_reason=undefined tool=false
❌ #7 Tool use stream: HTTP 429

--- OpenAI Compat ---
❌ #8 OpenAI non-stream: HTTP 429: {"type":"error","error":{"type":"rate_limit_error","message":"Rate limited (rejected). Limiting wind
❌ #9 OpenAI stream: HTTP 429

--- Header Visibility ---
✅ #10 Header visibility: request-id=true | ratelimit=false (0 headers)

============================================================
  RESULTS: 1 passed, 9 failed, 0 warnings
============================================================

Failed:
  #1 Anthropic non-stream: HTTP 429: {"type":"error","error":{"type":"rate_limit_error","message":"Rate limited (rejected). Limiting wind
  #2 Anthropic stream: HTTP 429: {"type":"error","error":{"type":"rate_limit_error","message":"Rate limited (rejected). Limiting wind
  #3 SSE framing: HTTP 429
  #4 No thinking injection: HTTP 429
  #5 Client betas preserved: HTTP 429: {"type":"error","error":{"type":"rate_limit_error","message":"Rate limited (rejected). Limiting wind
  #6 Tool use: stop_reason=undefined tool=false
  #7 Tool use stream: HTTP 429
  #8 OpenAI non-stream: HTTP 429: {"type":"error","error":{"type":"rate_limit_error","message":"Rate limited (rejected). Limiting wind
  #9 OpenAI stream: HTTP 429

Full workflow run

askalf enabled auto-merge (squash) May 18, 2026 00:46

askalf merged commit b72789d into master May 18, 2026
9 of 10 checks passed

askalf deleted the feat/v4.7.0-drift-summary-header branch May 18, 2026 00:48

askalf merged 1 commit into master from feat/v4.7.0-drift-summary-header
