fix(examples/voice): swap deleted gpt-4o-audio-preview → gpt-audio-mini #607

Closed
drewdrewthis wants to merge 2 commits into main from fix/voice-example-dead-audio-model

Conversation

@drewdrewthis
Collaborator

Problem

The voice/audio examples pinned gpt-4o-audio-preview, which OpenAI has deleted (404 model_not_found since 2026-05-19). Any user running the canonical voice-to-voice example (or audio-to-text) hits an immediate 404. The SDK core is unaffected (OpenAIRealtimeAgentAdapter uses gpt-realtime-mini).

Fix

Swap to gpt-audio-mini — OpenAI's current cost-efficient GA audio-chat model — matching the Python twin, which already migrated (python/scenario/config/voice_models.py:44 OPENAI_AUDIO_CHAT_MODEL, python/examples/test_audio_to_text.py:157). Closes a py↔ts example-parity gap left by #561.

Files changed (4 lines):

  • examples/vitest/tests/helpers/openai-voice-agent.ts — model literal + 2 doc-comments
  • examples/vitest/tests/multimodal-audio-to-text.test.ts — model literal

Verification (live, against prod LangWatch)

  • gpt-audio-mini accepts the identical chat.completions shape (modalities:["text","audio"], audio:{voice,format}) and returns audio — confirmed via direct /v1/chat/completions call.
  • multimodal-voice-to-voice-conversation.test.ts: ✅ success: true, real 2-turn conversation, judge passed, traces landed in prod (project_bZspxwkhCD4POvqmIgOr2).
  • multimodal-audio-to-audio.test.ts: ✅ passes.
  • multimodal-audio-to-text.test.ts: no longer 404s, but result.success=false on brittle judge criteria ("guesses it's a male voice", "says what format the input was"), because gpt-audio-mini doesn't reliably volunteer all three. This is a pre-existing brittle-criteria sensitivity, not a regression from this fix; it is tracked in #606 (Voice STT/TTS defaults still use gpt-4o-* models - decide modernization path). Reverting would restore the 404 (strictly worse), so this PR keeps the swap.
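
The request shape referred to in the verification notes can be sketched roughly as follows. This is an illustrative sketch only, not the helper from the PR: the function name, the "alloy" voice, and the message layout are assumptions, and only the `modalities` and `audio: { voice, format }` fields are taken from the description above. It shows that the swap changes the model literal and nothing else in the payload.

```typescript
// Hypothetical sketch: only `modalities` and `audio: { voice, format }`
// come from the PR description; every other name here is an assumption.
interface AudioChatRequest {
  model: string;
  modalities: ["text", "audio"];
  audio: { voice: string; format: string };
  messages: { role: "system" | "user" | "assistant"; content: string }[];
}

// The model the examples now pin, per this PR.
const AUDIO_CHAT_MODEL = "gpt-audio-mini";

function buildAudioChatRequest(
  prompt: string,
  model: string = AUDIO_CHAT_MODEL,
  voice: string = "alloy", // assumed voice name, for illustration only
  format: string = "wav",
): AudioChatRequest {
  return {
    model,
    modalities: ["text", "audio"],
    audio: { voice, format },
    messages: [{ role: "user", content: prompt }],
  };
}

// Same payload before and after the swap, apart from the model literal.
const oldRequest = buildAudioChatRequest("Say hello", "gpt-4o-audio-preview");
const newRequest = buildAudioChatRequest("Say hello");
```

Keeping the model name in one constant is a design choice of this sketch; it would make a future model retirement a one-line change.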

Draft

Left as draft — model-default modernization (the remaining gpt-4o-* STT/TTS) is tracked separately in #606. Ready for your review/merge call.

🤖 Generated with Claude Code

@drewdrewthis drewdrewthis marked this pull request as ready for review June 4, 2026 16:29
@drewdrewthis drewdrewthis added the grinding Grinder is actively managing this PR label Jun 4, 2026
@drewdrewthis
Collaborator Author

[grinder] READY for human review

CI: green — zero failing, zero pending (all 17 checks SUCCESS or SKIPPED-by-design)
Review threads: zero (confirmed via GraphQL reviewThreads — 0 nodes)
Draft: lifted (gh pr ready ✓)
Links: closes #486 (unskip voice tests — same dead-model root cause); related #606 (STT/TTS defaults modernization, tracked separately)

Verified by:

  • statusCheckRollup (17 checks, 2026-06-04T15:54–15:55Z):
    • preflight → SUCCESS
    • javascript-complete → SUCCESS
    • python-complete → SUCCESS
    • docs-complete → SUCCESS
    • CodeQL (javascript-typescript) → SUCCESS
    • CodeQL (python) → SUCCESS
    • Validate PR Title → SUCCESS
    • evaluate (auto-approve workflow) → SUCCESS
    • action-semantic-pull-request → SUCCESS
    • All changes checks → SUCCESS; ci-checks/test/build/firefighting/dismiss-firefighting-approval → SKIPPED (by-design path)
  • reviewThreads(first:50) → {"nodes":[]} (zero unresolved, zero outdated)
  • Live voice verification (this session, pre-push): multimodal-voice-to-voice-conversation.test.ts → success:true; multimodal-audio-to-audio.test.ts ✅; gpt-audio-mini accepts identical chat.completions shape, audio returned, traces in prod (project_bZspxwkhCD4POvqmIgOr2)
  • Published @langwatch/scenario@0.4.12 verified byte-identical to the verified build (419873 bytes; audio fix present)

ACs (from verification plan):

Do NOT merge — that's your call.

@drewdrewthis drewdrewthis added pr-ready and removed grinding Grinder is actively managing this PR labels Jun 4, 2026
@github-actions github-actions Bot added the low-risk-change PR qualifies as low-risk per policy and can be merged without manual review label Jun 4, 2026
github-actions[bot]
github-actions Bot previously approved these changes Jun 4, 2026
Contributor

@github-actions github-actions Bot left a comment

Approved by automation: PR qualifies as low-risk-change under the documented policy.

@drewdrewthis
Collaborator Author

✅ Review + prove-it: READY

Review: diff is exactly the model swap (gpt-4o-audio-preview → gpt-audio-mini) in 2 example files + 3 doc-comment lines. No scope creep. Verified live that gpt-4o-audio-preview is deleted (GET /v1/models/gpt-4o-audio-preview → model_not_found) and gpt-audio-mini supports the exact modalities:["text","audio"] + audio:{voice,format:"wav"} shape the code uses (direct API call returned a WAV transcript).
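
For readers who want to repeat the deleted-model check, the classification step might look like the sketch below. It assumes the error body follows the `{ error: { code: "model_not_found" } }` envelope and that the HTTP status is 404; the interface and function names are hypothetical and are not part of the PR. The network call itself is left out so the logic can be tested in isolation.

```typescript
// Assumed error envelope: { error: { code: "model_not_found", ... } }.
// The names below are illustrative, not taken from the repository.
interface ApiErrorBody {
  error?: { code?: string | null; message?: string };
}

// Returns true only for a 404 whose body carries code "model_not_found".
function isModelNotFound(status: number, body: unknown): boolean {
  if (status !== 404 || typeof body !== "object" || body === null) {
    return false;
  }
  const err = (body as ApiErrorBody).error;
  return err?.code === "model_not_found";
}

// Sample responses shaped like the two cases described in the review.
const deletedModel = isModelNotFound(404, {
  error: { code: "model_not_found", message: "The model does not exist" },
});
const liveModel = isModelNotFound(200, { id: "gpt-audio-mini", object: "model" });
```

Checking both the status and the error code avoids treating an unrelated 404 (for example a mistyped path) as proof that a model was retired.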

Prove-it (live, not CI — CI skips these):

cd javascript/examples/vitest && env -u CI npx vitest run tests/multimodal-audio-to-text.test.ts
→ Test Files 1 passed (1) | judge verdict: SUCCESS, all 3 criteria met

The gpt-audio-mini agent produced audio, transcript extracted, gpt-5 judge passed clean (the #606 brittle-judge issue did not bite this run).

Caveat (non-blocking): 4 stale // Skipped in CI: depends on ... gpt-4o-audio-preview comments remain in the example files (one in this file) — comments only, no dead model in any executable line. Worth a one-line follow-up.
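
Since the live tests are skipped in CI and the run above unsets `CI` with `env -u CI`, the gating logic can be pictured with a small sketch. How the example tests actually detect CI is an assumption here, as are the function name and the check for `OPENAI_API_KEY`; treat it as one plausible shape rather than the repository's code.

```typescript
// Hypothetical gate: the real examples may detect CI differently.
type Env = Record<string, string | undefined>;

function shouldRunLiveVoiceTests(env: Env): boolean {
  const ci = env["CI"];
  // An unset or empty CI variable (as after `env -u CI`) counts as a local run.
  const inCi = ci !== undefined && ci !== "" && ci !== "false" && ci !== "0";
  // Live tests call the real API, so a key must be present (assumed variable name).
  const hasKey = Boolean(env["OPENAI_API_KEY"]);
  return !inCi && hasKey;
}

const inPipeline = shouldRunLiveVoiceTests({ CI: "true", OPENAI_API_KEY: "sk-test" });
const onLaptop = shouldRunLiveVoiceTests({ OPENAI_API_KEY: "sk-test" });
```

In a Vitest file this could feed something like `test.skipIf(!shouldRunLiveVoiceTests(process.env))`, which keeps the skip reason in one place instead of in scattered comments.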

drewdrewthis added a commit that referenced this pull request Jun 4, 2026
… refs after model swap

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@github-actions github-actions Bot added low-risk-change PR qualifies as low-risk per policy and can be merged without manual review and removed low-risk-change PR qualifies as low-risk per policy and can be merged without manual review labels Jun 4, 2026
github-actions[bot]
github-actions Bot previously approved these changes Jun 4, 2026
Contributor

@github-actions github-actions Bot left a comment

Approved by automation: PR qualifies as low-risk-change under the documented policy.

drewdrewthis added a commit that referenced this pull request Jun 5, 2026
…t-gen (#610)

* docs(voice/#606): expand STT/TTS doc comments and relax audio-to-text judge criteria

Adds deliberate-choice rationale comments to OPENAI_STT_MODEL and
OPENAI_TTS_MODEL in both JS (voice-models.ts) and Python (voice_models.py),
noting no gpt-5-family transcription/TTS models exist on the public API as
of 2026-06. Also documents the Python-only OPENAI_BOT_STT_MODEL gap in the
TS file. Relaxes the multimodal-audio-to-text judge criteria from
overly-specific assertions (exact voice gender, exact repeat phrasing) to
behavioural checks (processed audio, coherent response, non-text format
acknowledgement). Updates the stale skip comment to reflect the model swap
in PR #607.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(voice/#606): update feature-file contract counts to match post-#561/#604 reality

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(voice/#606): add AC4/AC5 doc comments — STT lock rationale + TTS callable-swap pattern

- openai-realtime.ts: explain why `input.transcription.model` is locked to
  OPENAI_STT_MODEL and not exposed as a constructor option (Realtime API
  only accepts transcription-class models; callers who need a different model
  subclass the adapter)
- openai-tts.ts: document that the TTS model is not a parameter by design —
  the pattern is to swap the whole TTSCallable rather than parameterise this
  one; link to OPENAI_TTS_MODEL for the current-gen rationale

Closes #606 (AC4 + AC5)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(examples/voice/#606): correct stale comment — model swap + unskip are in #607, not this branch

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
drewdrewthis and others added 2 commits June 5, 2026 11:53
The voice-to-voice example helper and the audio-to-text example pinned
`gpt-4o-audio-preview`, which OpenAI has removed (404 model_not_found
since 2026-05-19). Any user running the canonical voice example hit an
immediate 404.

Switch to `gpt-audio-mini` — OpenAI's current cost-efficient GA
audio-chat model — matching the Python twin, which already migrated
(python/scenario/config/voice_models.py:44 OPENAI_AUDIO_CHAT_MODEL,
python/examples/test_audio_to_text.py:157). Verified live: gpt-audio-mini
accepts the identical chat.completions shape (modalities:["text","audio"],
audio:{voice,format}) and returns audio. Re-ran the voice-to-voice e2e
against prod LangWatch — success: true, real 2-turn conversation, traces
landed (project_bZspxwkhCD4POvqmIgOr2).

SDK core was unaffected (OpenAIRealtimeAgentAdapter uses gpt-realtime-mini).
This closes a py↔ts example-parity gap left by #561.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… refs after model swap

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@drewdrewthis drewdrewthis force-pushed the fix/voice-example-dead-audio-model branch from 3593191 to 37c0343 June 5, 2026 09:56
@github-actions github-actions Bot removed the low-risk-change PR qualifies as low-risk per policy and can be merged without manual review label Jun 5, 2026
@github-actions
Contributor

github-actions Bot commented Jun 5, 2026

Automated low-risk assessment

This PR was evaluated against the repository's Low-Risk Pull Requests procedure and does not qualify as low risk.

The PR updates example/test code to swap the OpenAI model name from gpt-4o-audio-preview to gpt-audio-mini and adjusts related comments. Although the changes are small and confined to examples/tests, they alter which external model the code calls (a change to an integration with a third‑party system), which is explicitly excluded from low‑risk automatic merge under the policy. Therefore this PR does not qualify for the low-risk-change label.

This PR requires a manual review before merging.

@drewdrewthis drewdrewthis self-assigned this Jun 5, 2026
drewdrewthis added a commit that referenced this pull request Jun 5, 2026
… refs after model swap

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
drewdrewthis added a commit that referenced this pull request Jun 5, 2026
…e, keep+migrate supported audio examples

Cohesive retirement of the legacy gpt-4o-audio-preview voice/audio example
surface, folding in the model swap from #607 (cherry-picked) and superseding
the unskip plan in #486.

Genuinely-dead (retired):
- Tombstone docs/docs/pages/examples/multimodal/voice-to-voice.mdx and
  testing-voice-agents.mdx -> pointer to /voice/getting-started (URLs still
  200; the langwatch vocs fork has no redirect layer, so a tombstone is how
  we avoid 404s on previously-public URLs).
- Delete the now-unused LegacyVoiceDeprecation.mdx snippet (no importers left).
- (test_voice_to_voice_conversation.py already deleted in an earlier #486 commit.)

Supported (kept + migrated to gpt-audio-mini):
- audio-to-text.mdx / audio-to-audio.mdx kept and updated: prereq prose now
  names gpt-audio-mini; LegacyVoiceDeprecation banner removed (these document
  the CURRENT supported single-call pattern, not a legacy one).
- Python test_audio_to_text.py / test_audio_to_audio.py: skip COMMENTS rewritten
  to the real reason (live E2E -- real OpenAI gpt-audio-mini + LangWatch backend,
  cost, non-deterministic audio); skipif(CI) markers retained by design. No model
  literal change (they route through the helper's gpt-audio-mini default).
- _generated example partials regenerated to match the migrated test sources.

overview.mdx voice-agents link repointed to /voice/getting-started.

Why the audio tests stay CI-skipped: they are live end-to-end tests; #486's
"unskip to restore CI coverage" premise was never achievable (cost +
non-determinism). The right end-state is migrated-and-intentionally-skipped.

Docs build: pnpm build exits 0, no broken-link/missing-import errors; tombstone
routes render the /voice/getting-started pointer.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@drewdrewthis
Collaborator Author

Superseded by #612. The two model-swap commits from this PR (gpt-4o-audio-preview → gpt-audio-mini) were cherry-picked into #612, which retires the legacy voice/audio example surface cohesively (tombstones the dead example docs, keeps+migrates the supported audio examples, and reconciles #486). Closing as folded-in.
