Skip to content

v0.28.1 — audio long-file fixes (whisper-1, quota refund, monthly cap)

Latest

Choose a tag to compare

@hanfour hanfour released this 29 Jun 05:11

audio long-file fixes (whisper-1, quota refund, monthly cap, diagnosability)

Why

Live testing of a real 45-minute meeting recording exposed three issues the 8-second clip hadn't: long files failed to transcribe, failed attempts silently consumed the daily quota, and the failure reason wasn't recoverable from logs.

Fixed

  • Default model → whisper-1. gpt-4o-mini-transcribe is GPT-4o-based with a ~25-min token cap → 400 input_too_large on long recordings (root-caused live; whisper-1 transcribed the same 45-min / ~5MB file in ~2 min). whisper-1 is bounded only by the 25MB request size, which the 16 kHz-mono re-encode satisfies — consistent with the existing size-based chunking.
  • Quota refund leak. A single-chunk total failure emitted a gap-marker-only partialTranscript, which made the coordinator treat it as "cost incurred → don't refund" → reserved minutes leaked (two failed 45-min attempts ate 90 phantom minutes and tripped the daily cap). transcribe now returns partialTranscript only when a segment actually succeeded; total failures refund. Refunds target the reserve-time day/month bucket, so a job crossing a period boundary refunds the file it charged.
  • Failure diagnosability. The underlying transcribe failure (HTTP status + code, e.g. 400 input_too_large) is surfaced into the audio.failed event reason.
  • Cost estimate. USD_PER_MINUTE 0.003 → 0.006 (whisper-1 list price) so audio.transcribed.estimatedUsd is accurate.

Added

  • Optional whole-workspace monthly budget capaudio.quota.globalMonthlyMinutes with a monthly accumulator file; always tracked (so releases refund it), enforced only when configured.

Verified

  • Live: a 45-minute file transcribes end-to-end on whisper-1 (audio.transcribed durationSec=2736 chunks=1 ms=132359audio.summarized mode=long).
  • Code-reviewed (subagent): one MEDIUM (month-boundary refund) found, fixed, and covered by a cross-month test.

Tests

  • @pmk/cli: 847 tests, 100% pass (+ monthly cap incl. cross-month refund, total-failure-no-partial, failure-detail capture, whisper-1 default).

Notes

  • Cost (OpenAI bills in USD): whisper-1 $0.006/min. This deployment runs per-user 600/day, global 1800/day, global 7500/month (≈ NT$1,485).

PR: #67.