Pronunciation Metric #74

Open
oluwanifemibamgbose wants to merge 7 commits into main from feat/pronunciation-metric-public
@oluwanifemibamgbose
Collaborator

Summary

  • Adds a new diagnostic audio judge metric pronunciation that evaluates agent speech for phonetic quality (lexical stress, phoneme production) — scoped narrowly so it does not overlap with agent_speech_fidelity (entity/content accuracy).
  • Uses Gemini as the audio judge, reusing the existing SpeechFidelityBaseMetric infrastructure (file upload fallback, retries, per-turn response parsing).
  • Binary 0/1 rating per turn with structured per-dimension evidence (bad_stress, bad_sound).
  • The judge receives audio + turn IDs only — no intended text is passed, so the judge must transcribe from audio alone.
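
For reviewers, here is a rough sketch of what a per-turn result with per-dimension evidence could look like once parsed. It is not the code in this PR: the JSON layout (`turns`, `turn_id`, `rating`, `evidence`), the `TurnPronunciation` class, and both helper functions are hypothetical names chosen for illustration, and averaging the binary ratings is an assumed aggregation rather than something the PR states. Only the 0/1 rating and the `bad_stress` / `bad_sound` dimensions come from the summary above.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TurnPronunciation:
    """Hypothetical container for one turn's judge output."""
    turn_id: int
    rating: int  # binary: 1 = acceptable, 0 = unacceptable
    bad_stress: List[str] = field(default_factory=list)  # lexical stress issues
    bad_sound: List[str] = field(default_factory=list)   # phoneme production issues


def parse_judge_response(raw: str) -> List[TurnPronunciation]:
    """Parse an assumed JSON judge response into per-turn results."""
    payload = json.loads(raw)
    results = []
    for item in payload["turns"]:
        rating = int(item["rating"])
        if rating not in (0, 1):
            raise ValueError(f"rating must be 0 or 1, got {rating}")
        # Evidence is optional; a clean turn may carry none at all.
        evidence = item.get("evidence", {})
        results.append(
            TurnPronunciation(
                turn_id=int(item["turn_id"]),
                rating=rating,
                bad_stress=list(evidence.get("bad_stress", [])),
                bad_sound=list(evidence.get("bad_sound", [])),
            )
        )
    return results


def acceptable_fraction(turns: List[TurnPronunciation]) -> Optional[float]:
    """Assumed aggregation: share of turns rated 1, or None if there are no turns."""
    if not turns:
        return None
    return sum(t.rating for t in turns) / len(turns)
```

Keeping the evidence lists separate from the rating means a turn rated 0 can still be inspected by dimension, which fits the diagnostic intent described above.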

meets a typical customer's expectations (e.g. a non-standard accent that doesn't
impair comprehension).

- **0** (Unacceptable): Pronunciation errors that have a high negative impact on
Collaborator

Should "bad sound" also be mentioned here?

acronyms, codes) accurately — those are fidelity concerns, not pronunciation.
- Rendering choices for numbers, dates, or amounts — e.g. "fifteen dollars" vs "one
five dollars" is a rendering-style decision, not a pronunciation error.
- Accent or regional variation — do not flag non-American vowels or rhoticity.
Collaborator

@fanny-riols fanny-riols Apr 23, 2026

Why not flag non-American accents for the agent? In the rating below, under "Great", we have "in line with a native speaker of General American English", so I find this confusing.

3 participants