ci: add Claude Code Action workflow #599
Listens for @claude mentions in issue comments, PR review comments, PR reviews, and issues, and runs anthropics/claude-code-action@v1.

Requires:
- Claude GitHub App installed on the repo
- ANTHROPIC_API_KEY secret configured in repo settings

Template: https://github.com/anthropics/claude-code-action
Offline VBx Pipeline Results

Speaker Diarization Performance (VBx Batch Mode)
Optimal clustering with Hungarian algorithm for maximum accuracy

Offline VBx Pipeline Timing Breakdown
Time spent in each stage of batch diarization

Speaker Diarization Research Comparison
Offline VBx achieves competitive accuracy with batch processing

Pipeline Details:

🎯 Offline VBx Test • AMI Corpus ES2004a • 1049.0s meeting audio • 127.5s processing • Test runtime: 2m 10s • 05/11/2026, 11:14 AM EST
## Summary

Switches the Claude Code Action auth from `ANTHROPIC_API_KEY` to `CLAUDE_CODE_OAUTH_TOKEN`, which uses a Claude Max/Pro subscription instead of pay-per-token API billing.

The PR #599 workflow run failed with:

```
Environment variable validation failed:
- Either ANTHROPIC_API_KEY or CLAUDE_CODE_OAUTH_TOKEN is required
```

## Required setup (one-time, maintainer)

```bash
# Generate an OAuth token tied to your Claude account
claude setup-token

# Store it in repo secrets
gh secret set CLAUDE_CODE_OAUTH_TOKEN --repo FluidInference/FluidAudio
# (paste the token when prompted)
```

Verify:

```bash
gh secret list --repo FluidInference/FluidAudio
```

## Test plan

- [ ] Maintainer runs the two commands above to populate the secret
- [ ] After merge, post `@claude help` on a throwaway issue and confirm the workflow runs without env-var errors
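For reference, the auth switch described above would amount to changing one input in the action's `with:` block. This is a minimal sketch, not the actual diff (which is not shown in this thread): the step name is illustrative, and it assumes the step otherwise follows the upstream template.

```yaml
# Sketch only: the step name is an assumption, not copied from the repo.
- name: Run Claude Code
  uses: anthropics/claude-code-action@v1
  with:
    # Replaces: anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
    claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
```

Only one of the two credentials needs to be set, which matches the validation error quoted above.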
ASR Benchmark Results ✅

Status: All benchmarks passed

Parakeet v3 (multilingual)
Parakeet v2 (English-optimized)
Streaming (v3)
Streaming (v2)
Streaming tests use 5 files with 0.5s chunks to simulate real-time audio streaming

25 files per dataset • Test runtime: 8m31s • 05/11/2026, 11:20 AM EST

RTFx = Real-Time Factor (higher is better) • Calculated as: Total audio duration ÷ Total processing time

Expected RTFx Performance on Physical M1 Hardware:
• M1 Mac: ~28x (clean), ~25x (other)

Testing methodology follows HuggingFace Open ASR Leaderboard
PocketTTS Smoke Test ✅
Runtime: 0m28s

Note: PocketTTS uses CoreML MLState (macOS 15) KV cache + Mimi streaming state. CI VM lacks physical GPU, so audio quality and performance may differ from Apple Silicon.
VAD Benchmark Results

Performance Comparison
Dataset Details
✅: Average F1-Score above 70%

Sortformer High-Latency Benchmark Results

ES2004a Performance (30.4s latency config)

Sortformer High-Latency • ES2004a • Runtime: 3m 41s • 2026-05-11T15:34:18.369Z
Kokoro TTS Smoke Test ✅
Runtime: 0m51s

Note: Kokoro TTS uses CoreML flow matching + Vocos vocoder. CI VM lacks physical ANE, so performance may differ from Apple Silicon.
Qwen3-ASR int8 Smoke Test ✅
Performance Metrics
Runtime: 5m10s

Note: CI VM lacks physical GPU; CoreML MLState (macOS 15) KV cache produces degraded results on virtualized runners. On Apple Silicon: ~1.3% WER / 2.5x RTFx.

Speaker Diarization Benchmark Results

Speaker Diarization Performance
Evaluating "who spoke when" detection accuracy

Diarization Pipeline Timing Breakdown
Time spent in each stage of speaker diarization

Speaker Diarization Research Comparison
Research baselines typically achieve 18-30% DER on standard datasets
Note: RTFx shown above is from GitHub Actions runner. On Apple Silicon with ANE:
🎯 Speaker Diarization Test • AMI Corpus ES2004a • 1049.0s meeting audio • 44.3s diarization time • Test runtime: 3m 17s • 05/11/2026, 11:49 AM EST

Parakeet EOU Benchmark Results ✅

Status: Benchmark passed

Performance Metrics
Streaming Metrics
Test runtime: 1m25s • 05/11/2026, 11:51 AM EST

RTFx = Real-Time Factor (higher is better) • Processing includes: Model inference, audio preprocessing, state management, and file I/O
## Summary

Adds `.github/workflows/claude.yml` so the repo can respond to `@claude` mentions in issues, issue comments, PR reviews, and PR review comments via anthropics/claude-code-action@v1.

Motivation: PR #596 had a reviewer post `@claude review` and nothing happened because no workflow was wired up. This PR fixes that for future reviews.

## What it does

- Triggers on `issue_comment`, `pull_request_review_comment`, `pull_request_review`, `issues` (opened/assigned)
- Runs only when the body contains `@claude` (cheap filter, prevents wasted runs)
- Uses the `ANTHROPIC_API_KEY` repo secret for auth
- `read` permissions on contents/PRs/issues; `id-token: write` for OIDC

## Required configuration (repo settings)

Before this workflow can run, a maintainer needs to:

1. Install the Claude GitHub App on `FluidInference/FluidAudio`
2. Add an `ANTHROPIC_API_KEY` secret in repo Settings -> Secrets and variables -> Actions

Without those, the workflow file is inert (no failed runs, just no-op).

## Test plan

- [ ] Maintainer sets the `ANTHROPIC_API_KEY` secret
- [ ] Post `@claude help` on a throwaway issue and confirm the workflow fires
- [ ] Confirm non-`@claude` comments do not trigger the job

🤖 Generated with Claude Code
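For readers without the diff open, the behaviour described above could be expressed roughly as follows. This is a hedged sketch, not the committed `.github/workflows/claude.yml`. The triggers, the `@claude` filter, the secret name, and the permissions come from the description. The workflow name, job name, runner, checkout step, and the exact `if` expression are assumptions based on the upstream template.

```yaml
# Sketch only: names, runner, and the exact `if` expression are assumptions.
name: Claude Code

on:
  issue_comment:
    types: [created]
  pull_request_review_comment:
    types: [created]
  pull_request_review:
    types: [submitted]
  issues:
    types: [opened, assigned]

jobs:
  claude:
    # Cheap filter: skip the job unless the triggering text mentions @claude.
    if: |
      (github.event_name == 'issue_comment' && contains(github.event.comment.body, '@claude')) ||
      (github.event_name == 'pull_request_review_comment' && contains(github.event.comment.body, '@claude')) ||
      (github.event_name == 'pull_request_review' && contains(github.event.review.body, '@claude')) ||
      (github.event_name == 'issues' && (contains(github.event.issue.body, '@claude') || contains(github.event.issue.title, '@claude')))
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: read
      issues: read
      id-token: write # needed for OIDC
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 1

      - name: Run Claude Code
        uses: anthropics/claude-code-action@v1
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
```

Putting the filter in the job-level `if` means unrelated comments produce a skipped job rather than a runner spin-up.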