fix(examples/voice): swap deleted gpt-4o-audio-preview → gpt-audio-mini#607
fix(examples/voice): swap deleted gpt-4o-audio-preview → gpt-audio-mini#607drewdrewthis wants to merge 2 commits into
Conversation
|
[grinder] READY for human review CI: green — zero failing, zero pending (all 17 checks SUCCESS or SKIPPED-by-design) Verified by:
ACs (from verification plan):
Do NOT merge — that's your call. |
…p are in #607, not this branch Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
✅ Review + prove-it: READYReview: diff is exactly the model swap ( Prove-it (live, not CI — CI skips these): The Caveat (non-blocking): 4 stale |
… refs after model swap Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…t-gen (#610) * docs(voice/#606): expand STT/TTS doc comments and relax audio-to-text judge criteria Adds deliberate-choice rationale comments to OPENAI_STT_MODEL and OPENAI_TTS_MODEL in both JS (voice-models.ts) and Python (voice_models.py), noting no gpt-5-family transcription/TTS models exist on the public API as of 2026-06. Also documents the Python-only OPENAI_BOT_STT_MODEL gap in the TS file. Relaxes the multimodal-audio-to-text judge criteria from overly-specific assertions (exact voice gender, exact repeat phrasing) to behavioural checks (processed audio, coherent response, non-text format acknowledgement). Updates the stale skip comment to reflect the model swap in PR #607. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(voice/#606): update feature-file contract counts to match post-#561/#604 reality Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(voice/#606): add AC4/AC5 doc comments — STT lock rationale + TTS callable-swap pattern - openai-realtime.ts: explain why `input.transcription.model` is locked to OPENAI_STT_MODEL and not exposed as a constructor option (Realtime API only accepts transcription-class models; callers who need a different model subclass the adapter) - openai-tts.ts: document that the TTS model is not a parameter by design — the pattern is to swap the whole TTSCallable rather than parameterise this one; link to OPENAI_TTS_MODEL for the current-gen rationale Closes #606 (AC4 + AC5) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(examples/voice/#606): correct stale comment — model swap + unskip are in #607, not this branch Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
The voice-to-voice example helper and the audio-to-text example pinned
`gpt-4o-audio-preview`, which OpenAI has removed (404 model_not_found
since 2026-05-19). Any user running the canonical voice example hit an
immediate 404.
Switch to `gpt-audio-mini` — OpenAI's current cost-efficient GA
audio-chat model — matching the Python twin, which already migrated
(python/scenario/config/voice_models.py:44 OPENAI_AUDIO_CHAT_MODEL,
python/examples/test_audio_to_text.py:157). Verified live: gpt-audio-mini
accepts the identical chat.completions shape (modalities:["text","audio"],
audio:{voice,format}) and returns audio. Re-ran the voice-to-voice e2e
against prod LangWatch — success: true, real 2-turn conversation, traces
landed (project_bZspxwkhCD4POvqmIgOr2).
SDK core was unaffected (OpenAIRealtimeAgentAdapter uses gpt-realtime-mini).
This closes a py↔ts example-parity gap left by #561.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… refs after model swap Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
3593191 to
37c0343
Compare
|
Automated low-risk assessment This PR was evaluated against the repository's Low-Risk Pull Requests procedure and does not qualify as low risk.
This PR requires a manual review before merging. |
… refs after model swap Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…e, keep+migrate supported audio examples Cohesive retirement of the legacy gpt-4o-audio-preview voice/audio example surface, folding in the model swap from #607 (cherry-picked) and superseding the unskip plan in #486. Genuinely-dead (retired): - Tombstone docs/docs/pages/examples/multimodal/voice-to-voice.mdx and testing-voice-agents.mdx -> pointer to /voice/getting-started (URLs still 200; the langwatch vocs fork has no redirect layer, so a tombstone is how we avoid 404s on previously-public URLs). - Delete the now-unused LegacyVoiceDeprecation.mdx snippet (no importers left). - (test_voice_to_voice_conversation.py already deleted in an earlier #486 commit.) Supported (kept + migrated to gpt-audio-mini): - audio-to-text.mdx / audio-to-audio.mdx kept and updated: prereq prose now names gpt-audio-mini; LegacyVoiceDeprecation banner removed (these document the CURRENT supported single-call pattern, not a legacy one). - Python test_audio_to_text.py / test_audio_to_audio.py: skip COMMENTS rewritten to the real reason (live E2E -- real OpenAI gpt-audio-mini + LangWatch backend, cost, non-deterministic audio); skipif(CI) markers retained by design. No model literal change (they route through the helper's gpt-audio-mini default). - _generated example partials regenerated to match the migrated test sources. overview.mdx voice-agents link repointed to /voice/getting-started. Why the audio tests stay CI-skipped: they are live end-to-end tests; #486's "unskip to restore CI coverage" premise was never achievable (cost + non-determinism). The right end-state is migrated-and-intentionally-skipped. Docs build: pnpm build exits 0, no broken-link/missing-import errors; tombstone routes render the /voice/getting-started pointer. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Superseded by #612. The two model-swap commits from this PR ( |
Problem
The voice/audio examples pinned
gpt-4o-audio-preview, which OpenAI has deleted —404 model_not_foundsince 2026-05-19. Any user running the canonical voice-to-voice example (or audio-to-text) hits an immediate 404. The SDK core is unaffected (OpenAIRealtimeAgentAdapterusesgpt-realtime-mini).Fix
Swap to
gpt-audio-mini— OpenAI's current cost-efficient GA audio-chat model — matching the Python twin, which already migrated (python/scenario/config/voice_models.py:44OPENAI_AUDIO_CHAT_MODEL,python/examples/test_audio_to_text.py:157). Closes a py↔ts example-parity gap left by #561.Files changed (4 lines):
examples/vitest/tests/helpers/openai-voice-agent.ts— model literal + 2 doc-commentsexamples/vitest/tests/multimodal-audio-to-text.test.ts— model literalVerification (live, against prod LangWatch)
gpt-audio-miniaccepts the identicalchat.completionsshape (modalities:["text","audio"],audio:{voice,format}) and returns audio — confirmed via direct/v1/chat/completionscall.multimodal-voice-to-voice-conversation.test.ts: ✅success: true, real 2-turn conversation, judge passed, traces landed in prod (project_bZspxwkhCD4POvqmIgOr2).multimodal-audio-to-audio.test.ts: ✅ passes.multimodal-audio-to-text.test.ts: no longer 404s, butresult.success=falseon brittle judge criteria ("guesses it's a male voice", "says what format the input was") —gpt-audio-minidoesn't reliably volunteer all three. This is a pre-existing brittle-criteria sensitivity, not a regression from this fix; tracked in Voice STT/TTS defaults still use gpt-4o-* models — decide modernization path #606. Reverting would restore the 404 (strictly worse), so this PR keeps the swap.Draft
Left as draft — model-default modernization (the remaining
gpt-4o-*STT/TTS) is tracked separately in #606. Ready for your review/merge call.🤖 Generated with Claude Code