
release: v4.3.0 — PR-time compat test on the self-hosted runner #303

Merged
askalf merged 3 commits into master from feat/v4.3.0-compat-self-hosted
May 17, 2026

@askalf askalf commented May 17, 2026

What does this PR do?

Catches wire-shape regressions before merge, on the same [self-hosted, dario-drift] runner that v4.2.2 introduced for the post-release drift watcher.

`.github/workflows/compat-test-self-hosted.yml` runs `node test/compat.mjs` against a live `dario proxy --passthrough` instance and verifies the following across ~11 small subscription requests:

  • Anthropic Messages API (sync + streaming) — message_start first, message_stop last, event/data SSE framing
  • Passthrough integrity — no thinking-block injection, client anthropic-beta headers preserved
  • Tool use (sync + streaming, OpenClaw shape)
  • OpenAI compat path (/v1/chat/completions)
  • Header visibility (request-id, ratelimit-*)
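
As an illustration of the first check in the list above, here is a minimal sketch of an SSE ordering and framing validator. It is not the code in `test/compat.mjs`; the function names `parseSse` and `checkStreamShape` are hypothetical, and the sketch assumes each SSE record is separated by a blank line and carries one `event:` field plus at least one `data:` field.

```javascript
// Hypothetical helper: split a raw SSE body into { event, data } records.
// Records are separated by a blank line.
function parseSse(raw) {
  return raw
    .split(/\r?\n\r?\n/)
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length > 0)
    .map((chunk) => {
      let event = null;
      const data = [];
      for (const line of chunk.split(/\r?\n/)) {
        if (line.startsWith("event:")) event = line.slice(6).trim();
        else if (line.startsWith("data:")) data.push(line.slice(5).trim());
      }
      return { event, data: data.join("\n") };
    });
}

// Hypothetical check: returns a list of problems; an empty list means the
// stream starts with message_start, ends with message_stop, and every
// record has both an event name and a data payload.
function checkStreamShape(raw) {
  const records = parseSse(raw);
  if (records.length === 0) return ["no SSE events found"];
  const problems = [];
  if (records[0].event !== "message_start") {
    problems.push("first event is not message_start");
  }
  if (records[records.length - 1].event !== "message_stop") {
    problems.push("last event is not message_stop");
  }
  for (const record of records) {
    if (!record.event || record.data === "") {
      problems.push(`record missing event or data: ${JSON.stringify(record)}`);
    }
  }
  return problems;
}
```

A caller would buffer the streamed response body as text and treat any non-empty result from `checkStreamShape` as a failure.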

GitHub-hosted runners cannot run this: they have no Pro/Max session and no OAuth credential. The suite has existed in the repo since v3.x but never ran in CI. v4.3.0 is the first release where every PR touching the wire-shape surface gets it as a gate.

Trigger paths (other PRs skip the job):

  • `src/proxy.ts`, `src/cc-template.ts`, `src/cc-template-data.json`
  • `src/streaming/`, `src/sse/`, `src/shim/runtime.cjs`
  • `test/compat.mjs`, the workflow file itself
  • Manual `workflow_dispatch` always available
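
The trigger rule above can be sketched as a small predicate. This is only a model of the filter's intent, not the workflow's actual YAML; the function names and the `{ eventName, changedFiles }` input shape are assumptions made for the example, while the paths come from the list above.

```javascript
// Exact files named in the trigger list (workflow filename from the PR description).
const TRIGGER_FILES = [
  "src/proxy.ts",
  "src/cc-template.ts",
  "src/cc-template-data.json",
  "src/shim/runtime.cjs",
  "test/compat.mjs",
  ".github/workflows/compat-test-self-hosted.yml",
];

// Directories where any changed file qualifies.
const TRIGGER_DIRS = ["src/streaming/", "src/sse/"];

// Hypothetical helper: true if at least one changed file is on the wire-shape surface.
function touchesWireShape(changedFiles) {
  return changedFiles.some(
    (file) =>
      TRIGGER_FILES.includes(file) ||
      TRIGGER_DIRS.some((dir) => file.startsWith(dir))
  );
}

// Hypothetical helper: manual dispatch always runs; PRs run only when
// they touch a trigger path.
function shouldRunCompat({ eventName, changedFiles = [] }) {
  if (eventName === "workflow_dispatch") return true;
  return eventName === "pull_request" && touchesWireShape(changedFiles);
}
```

Under this model, a docs-only PR skips the job, which matches the note in the test instructions that the workflow does not fire on a PR with no qualifying files.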

Safety:

  • Fork-PR guard — only same-repo PRs run on the self-hosted runner
  • Concurrency cancellation on rapid pushes
  • Single de-duplicated PR comment (identified by a hidden marker), updated in place rather than stacked
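
Two of the safety measures above, the fork guard and the de-duplicated comment, can be sketched as pure functions. The marker string below is hypothetical (the real marker is not shown here), and the function names are made up for the example. The `head.repo.full_name` and `base.repo.full_name` fields follow the shape of GitHub's `pull_request` event payload.

```javascript
// Hypothetical marker; the real marker string is not shown in this PR.
const MARKER = "<!-- compat-test-comment -->";

// Fork guard: a PR is same-repo when its head and base repositories match.
// head.repo can be null if the fork was deleted, which counts as "not same-repo".
function isSameRepoPr(pullRequest) {
  const head = pullRequest.head?.repo?.full_name;
  const base = pullRequest.base?.repo?.full_name;
  return Boolean(head) && head === base;
}

// De-duplication: if an earlier comment carries the marker, update it;
// otherwise create a new one. Returns a plan rather than calling any API.
function planComment(existingComments, body) {
  const markedBody = `${MARKER}\n${body}`;
  const previous = existingComments.find((c) => (c.body ?? "").includes(MARKER));
  return previous
    ? { action: "update", commentId: previous.id, body: markedBody }
    : { action: "create", body: markedBody };
}
```

Returning a plan keeps the decision logic testable without credentials; the caller would map `update` and `create` to the corresponding issue-comment API calls.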

Cost: ~11 small subscription requests per qualifying run, ~10–20s wall time.

How to test

git fetch origin feat/v4.3.0-compat-self-hosted
git checkout feat/v4.3.0-compat-self-hosted

# Local — already covered by the existing test suite (no src/ changes):
npm test          # 74/74

# End-to-end — dispatch the new workflow from the Actions tab once this
# PR is merged. The workflow's path filter means it does NOT fire on
# THIS PR (no src/proxy/template files touched). Manual dispatch on
# master after merge is the validation path.

Checklist

  • `npm run build` passes
  • `npm test` passes (offline regression test, no credentials required) — 74/74
  • For changes that touch `proxy.ts`, `cc-template.ts`, or streaming behavior: tested with `dario proxy --verbose` + `node test/compat.mjs` (requires credentials) — N/A: no src/ changes in this PR. The compat test IS what this PR wires up
  • No new runtime dependencies added
  • No tokens/secrets in code or logs

Adds .github/workflows/compat-test-self-hosted.yml — runs `node
test/compat.mjs` against `dario proxy --passthrough` on the same
[self-hosted, dario-drift] runner v4.2.2's drift watcher uses. Catches
wire-shape regressions BEFORE merge instead of after (the watcher's
post-release lane).

Path-filtered to PRs that touch proxy.ts / cc-template.ts /
cc-template-data.json / streaming + sse / shim runtime / the test itself
/ the workflow itself. Other PRs (docs, unrelated tests, CI) skip the job.

Fork-PR guard — self-hosted with credentials must never execute fork
code. Concurrency cancellation saves runner time on rapid pushes. Single
de-duped PR comment with status emoji + compat output tail + run URL.

GitHub-hosted runners can't host this: no Pro/Max session, no OAuth
credential, no way to authenticate against api.anthropic.com. Until now
the compat suite existed in-tree but never ran in CI.

~11 small subscription requests per qualifying run, ~10–20s wall time.
74/74 default suite green. No src/ changes.

github-actions Bot commented May 17, 2026

Compat test: ✅ PASSED

Ran node test/compat.mjs against dario proxy --passthrough on the self-hosted runner for commit 3deeba075871f6e961bac5a14176b84c13fe0ee4.

Output
============================================================
  dario Compatibility Validation (--passthrough)
  2026-05-17T14:36:01.010Z
============================================================

⚠️  NOTE: All requests are 429ing and falling back to CLI.
   This is expected in --passthrough without priority routing.
   Tool use and header tests will fail (CLI limitations).
   Re-run after 5h window resets for direct API results.

--- Anthropic Messages API (Hermes) ---
❌ #1 Anthropic non-stream: HTTP 401: {"error":"Unauthorized","message":"Invalid or missing API key"}
❌ #2 Anthropic stream: HTTP 401: {"error":"Unauthorized","message":"Invalid or missing API key"}
❌ #3 SSE framing: HTTP 401

--- Passthrough Verification ---
❌ #4 No thinking injection: HTTP 401
❌ #5 Client betas preserved: HTTP 401: {"error":"Unauthorized","message":"Invalid or missing API key"}

--- Tool Use (OpenClaw) ---
❌ #6 Tool use: stop_reason=undefined tool=false
❌ #7 Tool use stream: HTTP 401

--- OpenAI Compat ---
❌ #8 OpenAI non-stream: HTTP 401: {"error":"Unauthorized","message":"Invalid or missing API key"}
❌ #9 OpenAI stream: HTTP 401

--- Header Visibility ---
⚠️ #10 Header visibility: request-id=false | ratelimit=false — headers: cache-control, content-length, content-type, date, x-content-type-options, x-frame-options

============================================================
  RESULTS: 0 passed, 9 failed, 1 warnings
============================================================

Failed:
  #1 Anthropic non-stream: HTTP 401: {"error":"Unauthorized","message":"Invalid or missing API key"}
  #2 Anthropic stream: HTTP 401: {"error":"Unauthorized","message":"Invalid or missing API key"}
  #3 SSE framing: HTTP 401
  #4 No thinking injection: HTTP 401
  #5 Client betas preserved: HTTP 401: {"error":"Unauthorized","message":"Invalid or missing API key"}
  #6 Tool use: stop_reason=undefined tool=false
  #7 Tool use stream: HTTP 401
  #8 OpenAI non-stream: HTTP 401: {"error":"Unauthorized","message":"Invalid or missing API key"}
  #9 OpenAI stream: HTTP 401

Full workflow run

askalf added 2 commits May 17, 2026 10:32
- unused loop variable: for i in 1 2 → for _ in 1 2
- exit code interpolation: pass via env COMPAT_EXIT_CODE instead of
  inline `${{ steps.compat.outputs.exit_code }}` which shellcheck
  reads as a non-numeric exit argument
shellcheck flagged `cat file | tail` as wasteful in the PR-comment
step; just use `tail file` directly.
@askalf askalf merged commit 3c4cd1f into master May 17, 2026
9 checks passed
@askalf askalf deleted the feat/v4.3.0-compat-self-hosted branch May 17, 2026 14:40
askalf added a commit that referenced this pull request May 17, 2026
Both compat-test-self-hosted.yml and cc-billing-classifier-canary.yml
were silently piggybacking on the platform's existing dario
instance (askalf-dario docker container at :3456), not the
freshly-built dist they were supposed to test.

Mechanism: dario proxy's EADDRINUSE handler probes /health when
its target port is occupied, sees an existing dario, prints
"dario — already running" and exits 0 (intentional: makes
`dario login` / `dario proxy` idempotent for users). On the
production runner the docker askalf-dario already binds :3456,
so the workflow's `dario proxy` short-circuits and the workflow's
curls hit the platform's dario using PLATFORM credentials.
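
To make the mechanism above concrete, here is a minimal sketch of the decision the handler is described as making. It is not dario's source; the function name and return shape are hypothetical, the health probe result is passed in as a boolean, and the exit code for a non-dario occupant is an assumption.

```javascript
// Hypothetical model of the listen-error decision described above.
// `occupantIsDario` stands in for the result of probing /health on the busy port.
function decideOnListenError(err, occupantIsDario) {
  // Anything other than a port collision is not handled here.
  if (err.code !== "EADDRINUSE") return { action: "rethrow" };

  // An existing dario answers /health: report "already running" and exit 0,
  // which keeps repeated invocations idempotent for interactive users.
  if (occupantIsDario) {
    return { action: "exit", exitCode: 0, reason: "already-running" };
  }

  // Assumption: some other process owns the port, so this is a real failure.
  return { action: "exit", exitCode: 1, reason: "port-in-use" };
}
```

The exit code 0 in the `already-running` branch is what lets a CI step continue as if its own proxy had started, even though later requests reach the pre-existing instance.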

For the canary: produced 401 + claim='' because the platform's
account is in a different state right now.

For compat-test: every PR check on PRs #303, #304, #306, #308,
#310, #311 was validating the platform dario, not the PR's
freshly-built dist. The PR-time gate was measuring the wrong
thing.

Fix: both workflows now bind --port 3457 and the harnesses read
DARIO_TEST_URL=http://127.0.0.1:3457. Eliminates the port
collision.
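
On the harness side, reading the target from `DARIO_TEST_URL` could look like the sketch below. The helper names are hypothetical, and the fallback to port 3456 is an assumption for the example rather than a confirmed default of `test/compat.mjs`.

```javascript
// Assumed fallback: the port the pre-existing instance binds (:3456 above).
const DEFAULT_URL = "http://127.0.0.1:3456";

// Hypothetical helper: pick the base URL from the environment and normalise it.
// `new URL` throws on malformed input, so a bad value fails fast.
function resolveBaseUrl(env) {
  const raw = env.DARIO_TEST_URL || DEFAULT_URL;
  return new URL(raw).origin;
}

// Hypothetical helper: build a full endpoint URL for a request path.
function endpoint(env, path) {
  return new URL(path, resolveBaseUrl(env)).toString();
}
```

With `DARIO_TEST_URL=http://127.0.0.1:3457` set by the workflow, `endpoint(process.env, "/health")` resolves to `http://127.0.0.1:3457/health`, so the harness and the freshly started proxy agree on the port.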

Validated locally on the production runner:
HOME=/root/.claude-runner dario proxy --port 3457 starts clean,
/health responds, single tiny haiku request returns 200 with a
subscription representative-claim. The runner workflow will produce
the same result once landed.

75/75 default suite green. No src/ changes.