audio long-file fixes (whisper-1, quota refund, monthly cap, diagnosability)
Why
Live testing of a real 45-minute meeting recording exposed three issues the 8-second clip hadn't: long files failed to transcribe, failed attempts silently consumed the daily quota, and the failure reason wasn't recoverable from logs.
Fixed
- Default model →
whisper-1.gpt-4o-mini-transcribeis GPT-4o-based with a ~25-min token cap →400 input_too_largeon long recordings (root-caused live; whisper-1 transcribed the same 45-min / ~5MB file in ~2 min). whisper-1 is bounded only by the 25MB request size, which the 16 kHz-mono re-encode satisfies — consistent with the existing size-based chunking. - Quota refund leak. A single-chunk total failure emitted a gap-marker-only
partialTranscript, which made the coordinator treat it as "cost incurred → don't refund" → reserved minutes leaked (two failed 45-min attempts ate 90 phantom minutes and tripped the daily cap).transcribenow returnspartialTranscriptonly when a segment actually succeeded; total failures refund. Refunds target the reserve-time day/month bucket, so a job crossing a period boundary refunds the file it charged. - Failure diagnosability. The underlying transcribe failure (HTTP status + code, e.g.
400 input_too_large) is surfaced into theaudio.failedevent reason. - Cost estimate.
USD_PER_MINUTE0.003 → 0.006 (whisper-1 list price) soaudio.transcribed.estimatedUsdis accurate.
Added
- Optional whole-workspace monthly budget cap —
audio.quota.globalMonthlyMinuteswith a monthly accumulator file; always tracked (so releases refund it), enforced only when configured.
Verified
- Live: a 45-minute file transcribes end-to-end on whisper-1 (
audio.transcribed durationSec=2736 chunks=1 ms=132359→audio.summarized mode=long). - Code-reviewed (subagent): one MEDIUM (month-boundary refund) found, fixed, and covered by a cross-month test.
Tests
@pmk/cli: 847 tests, 100% pass (+ monthly cap incl. cross-month refund, total-failure-no-partial, failure-detail capture, whisper-1 default).
Notes
- Cost (OpenAI bills in USD): whisper-1 $0.006/min. This deployment runs per-user 600/day, global 1800/day, global 7500/month (≈ NT$1,485).
PR: #67.