Merged
1 change: 1 addition & 0 deletions .gitattributes
@@ -6,3 +6,4 @@
*.mp4 filter=lfs diff=lfs merge=lfs -text
*.webm filter=lfs diff=lfs merge=lfs -text
*.svg filter=lfs diff=lfs merge=lfs -text
docs/favicon.svg !filter !diff !merge
128 changes: 128 additions & 0 deletions .github/workflows/fix-drift.yml
@@ -0,0 +1,128 @@
name: Fix Drift
on:
  workflow_dispatch:
  workflow_run:
    workflows: ["Drift Tests"]
    types: [completed]
    branches: [main]

concurrency:
  group: drift-fix
  cancel-in-progress: false

jobs:
  fix:
    if: >-
      github.event_name == 'workflow_dispatch' ||
      github.event.workflow_run.conclusion == 'failure'
    runs-on: ubuntu-latest
    timeout-minutes: 30
    permissions:
      contents: write
      pull-requests: write
      issues: write
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: pnpm
      - run: pnpm install --frozen-lockfile

      # Step 0: Configure git identity and create fix branch
      - name: Configure git
        run: |
          git config user.name "llmock-drift-bot"
          git config user.email "drift-bot@copilotkit.ai"
          git checkout -B fix/drift-$(date +%Y-%m-%d)-${{ github.run_id }}

      # Step 1: Detect drift and produce report
      - name: Collect drift report
        id: detect
        run: |
          set +e
          npx tsx scripts/drift-report-collector.ts
          EXIT_CODE=$?
          set -e
          echo "exit_code=$EXIT_CODE" >> $GITHUB_OUTPUT
          if [ "$EXIT_CODE" -eq 2 ]; then
            : # critical drift found, continue
          elif [ "$EXIT_CODE" -ne 0 ]; then
            echo "::error::Collector script crashed with exit code $EXIT_CODE"
            exit $EXIT_CODE
          fi
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}

      # Always upload the report as an artifact
      - name: Upload drift report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: drift-report
          path: drift-report.json
          if-no-files-found: warn
          retention-days: 30

      # Step 2: Exit if no critical drift
      - name: Check for critical diffs
        id: check
        env:
          DETECT_EXIT_CODE: ${{ steps.detect.outputs.exit_code }}
        run: |
          if [ "$DETECT_EXIT_CODE" = "2" ]; then
            echo "skip=false" >> $GITHUB_OUTPUT
            echo "Critical drift detected"
          else
            echo "skip=true" >> $GITHUB_OUTPUT
            echo "No critical drift detected (exit code: $DETECT_EXIT_CODE) — skipping fix"
          fi

      # Step 3: Invoke Claude Code to fix
      - name: Auto-fix drift
        if: steps.check.outputs.skip != 'true'
        run: npx tsx scripts/fix-drift.ts
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}

      # Upload Claude Code output for debugging
      - name: Upload Claude Code logs
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: claude-code-output
          path: claude-code-output.log
          if-no-files-found: warn
          retention-days: 30

      # Step 4: Verify fix independently
      - name: Verify conformance
        if: steps.check.outputs.skip != 'true'
        run: pnpm test

      - name: Verify drift resolved
        if: steps.check.outputs.skip != 'true'
        run: pnpm test:drift
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}

      # Step 5: Create PR on success
      - name: Create PR
        if: success() && steps.check.outputs.skip != 'true'
        run: npx tsx scripts/fix-drift.ts --create-pr
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

      # Step 6: Open issue on failure
      - name: Create issue on failure
        if: failure() && steps.check.outputs.skip != 'true'
        run: npx tsx scripts/fix-drift.ts --create-issue
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
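
The workflow above treats the collector's exit code as a three-way signal: 0 means no critical drift, 2 means critical drift, and any other value is handled as a crash. A minimal TypeScript sketch of that contract follows. The names `DriftOutcome`, `classifyExit`, and `nextAction` are hypothetical and do not come from the repository's scripts; only the exit-code values are taken from the shell steps.

```typescript
// Hypothetical helper names; only the exit-code values come from the workflow.
type DriftOutcome = "clean" | "critical" | "crashed";

// Exit-code contract as the shell steps read it:
//   0 -> no critical drift, 2 -> critical drift, anything else -> collector crashed.
function classifyExit(exitCode: number): DriftOutcome {
  if (exitCode === 2) return "critical";
  if (exitCode === 0) return "clean";
  return "crashed";
}

// What the fix workflow does next for each outcome.
function nextAction(outcome: DriftOutcome): "run-fix" | "skip-fix" | "fail-step" {
  switch (outcome) {
    case "critical":
      return "run-fix"; // skip=false, the fix and verify steps run
    case "clean":
      return "skip-fix"; // skip=true, the remaining steps are skipped
    case "crashed":
      return "fail-step"; // the detect step exits with the same non-zero code
  }
}
```

Reserving a dedicated code (2) for "drift found" is what lets the shell step tell an expected test failure apart from a broken collector.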
30 changes: 29 additions & 1 deletion .github/workflows/test-drift.yml
@@ -6,6 +6,7 @@ on:
jobs:
  drift:
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v4
@@ -14,8 +15,35 @@ jobs:
          node-version: 22
          cache: pnpm
      - run: pnpm install --frozen-lockfile
      - run: pnpm test:drift

      - name: Run drift tests
        id: drift
        run: |
          set +e
          npx tsx scripts/drift-report-collector.ts
          EXIT_CODE=$?
          set -e
          echo "exit_code=$EXIT_CODE" >> $GITHUB_OUTPUT
          if [ "$EXIT_CODE" -eq 2 ]; then
            : # critical drift found, continue
          elif [ "$EXIT_CODE" -ne 0 ]; then
            echo "::error::Collector script crashed with exit code $EXIT_CODE"
            exit $EXIT_CODE
          fi
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}

      - name: Upload drift report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: drift-report
          path: drift-report.json
          if-no-files-found: warn
          retention-days: 30

      - name: Fail if critical drift detected
        if: steps.drift.outputs.exit_code == '2'
        run: exit 1
9 changes: 9 additions & 0 deletions CLAUDE.md
@@ -34,6 +34,15 @@ entire repo, not just staged files.
- When adding features or fixing bugs, add or update tests
- Run `pnpm test` before pushing

## Drift Remediation

Automated drift remediation lives in `scripts/`:

- `scripts/drift-report-collector.ts` — runs drift tests, produces `drift-report.json`
- `scripts/fix-drift.ts` — reads drift report, invokes Claude Code to fix builders, creates PR or issue

See `DRIFT.md` for full documentation and `.github/workflows/fix-drift.yml` for the CI workflow.

## Commit Messages

- This repo enforces conventional commit prefixes via commitlint: `fix:`, `feat:`, `docs:`, `test:`, `chore:`, `refactor:`, etc.
27 changes: 25 additions & 2 deletions DRIFT.md
@@ -106,7 +106,7 @@ When a model is deprecated:

## WebSocket Drift Coverage

In addition to the 19 existing drift tests (16 HTTP response-shape + 3 model deprecation), WebSocket drift tests cover llmock's WS protocols:
In addition to the 19 existing drift tests (16 HTTP response-shape + 3 model deprecation), WebSocket drift tests cover llmock's WS protocols (4 verified + 2 canary = 6 WS tests):

| Protocol | Text | Tool Call | Real Endpoint | Status |
| ------------------- | ---- | --------- | ------------------------------------------------------------------- | ---------- |
@@ -138,6 +138,29 @@ Drift tests run on a schedule:

See `.github/workflows/test-drift.yml`.

## Automated Drift Remediation

When the daily drift test detects critical diffs on the `main` branch, the `fix-drift.yml` workflow runs automatically:

1. **Collect** — `scripts/drift-report-collector.ts` runs drift tests and produces a structured `drift-report.json`
2. **Fix** — `scripts/fix-drift.ts` (default mode) constructs a prompt from the report and invokes Claude Code to fix the builders
3. **Verify** — Independent `pnpm test` and `pnpm test:drift` steps confirm the fix works
4. **PR** — `scripts/fix-drift.ts --create-pr` stages and commits the changes, bumps the version, and opens a pull request
5. **Issue** (on failure) — `scripts/fix-drift.ts --create-issue` opens a GitHub issue with the drift report and Claude Code output

Steps 2 and 4/5 are separate invocations of `fix-drift.ts` with different modes.
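
The mode split described above could be sketched as a small argument parser. This is an assumption about how a script like `fix-drift.ts` might choose its mode, not its actual implementation; the `Mode` type and `parseMode` function are hypothetical, and only the `--create-pr` and `--create-issue` flags come from the workflow.

```typescript
// Hypothetical sketch: only the two flag names are taken from the workflow.
type Mode = "fix" | "create-pr" | "create-issue";

// Pick a mode from command-line arguments; no flag means the default fix mode.
function parseMode(argv: readonly string[]): Mode {
  const wantsPr = argv.includes("--create-pr");
  const wantsIssue = argv.includes("--create-issue");
  if (wantsPr && wantsIssue) {
    // The workflow never passes both, so treat the combination as a usage error.
    throw new Error("--create-pr and --create-issue are mutually exclusive");
  }
  if (wantsPr) return "create-pr";
  if (wantsIssue) return "create-issue";
  return "fix";
}
```

In a real script the argument list would typically be `process.argv.slice(2)`. Keeping the three modes as separate invocations lets the workflow run its own verification steps between the fix and the PR or issue.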

### Artifacts

Both workflows upload artifacts:

- `drift-report.json` — structured drift data (retained 30 days)
- `claude-code-output.log` — Claude Code's reasoning and tool calls (fix workflow only)

### Manual trigger

The fix workflow also supports `workflow_dispatch` for manual runs.

## Cost

~25 API calls per run (16 HTTP response-shape + 3 model listing + 4 WS + 2 canaries) using the cheapest available models (`gpt-4o-mini`, `gpt-4o-mini-realtime-preview`, `claude-haiku-4-5-20251001`, `gemini-2.5-flash`) with 10-100 max tokens each. Under $0.15/week at daily cadence. When Gemini Live text-capable models become available, this will increase to 6 WS calls.
~25 API calls per run (16 HTTP response-shape + 3 model listing + 6 WS including canaries) using the cheapest available models (`gpt-4o-mini`, `gpt-4o-mini-realtime-preview`, `claude-haiku-4-5-20251001`, `gemini-2.5-flash`) with 10-100 max tokens each. Under $0.15/week at daily cadence. When Gemini Live text-capable models become available, the 2 canary tests will become full drift tests, increasing real WS connections from 4 to 6.
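
The call count and budget above can be checked with a few lines of arithmetic. The sketch below assumes exactly one scheduled run per day (7 per week) and ignores manual or fix-workflow runs; the variable names are illustrative.

```typescript
// Per-run call counts as listed in the Cost section.
const callsPerRun = {
  httpResponseShape: 16,
  modelListing: 3,
  webSocket: 6, // 4 verified + 2 canaries
};

const totalPerRun =
  callsPerRun.httpResponseShape + callsPerRun.modelListing + callsPerRun.webSocket; // 25

// Assumption: one scheduled run per day.
const runsPerWeek = 7;
const weeklyCalls = totalPerRun * runsPerWeek; // 175

// The stated ceiling of $0.15/week implies this upper bound on average cost per call.
const weeklyBudgetUsd = 0.15;
const maxAvgCostPerCallUsd = weeklyBudgetUsd / weeklyCalls; // about $0.00086
```

Under these assumptions, staying within the stated budget requires an average cost below roughly a tenth of a cent per call.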
33 changes: 3 additions & 30 deletions docs/favicon.svg
2 changes: 2 additions & 0 deletions package.json
@@ -63,7 +63,9 @@
    "typescript-eslint": "^8.35.1",
    "@anthropic-ai/sdk": "^0.78.0",
    "@google/generative-ai": "^0.24.0",
    "@types/node": "^22.0.0",
    "openai": "^4.0.0",
    "tsx": "^4.19.0",
    "vitest": "^3.2.1"
  }
}