Add new speech fidelity metric for s2s #60

Merged
gabegma merged 10 commits into main from ggm/modify-speech-fidelity-for-s2s
Apr 21, 2026

@gabegma gabegma commented Apr 15, 2026

No description provided.

@gabegma gabegma self-assigned this Apr 15, 2026
Base automatically changed from ggm/fix-processing-for-s2s-contd to main April 16, 2026 21:39
gabegma and others added 6 commits April 17, 2026 18:00
Makes score optional and introduces an explicit `skipped: bool` flag so
metrics can signal "no applicable data" distinctly from an error. Allows
downstream consumers to treat skipped metrics as a first-class state
instead of inferring it from None scores.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
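
A minimal sketch of the result shape this commit describes, assuming a dataclass-style container. Only `score` and `skipped` come from the commit message; the class name `MetricResult`, the `name` and `error` fields, and the `state()` helper are hypothetical and used here for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetricResult:
    """Hypothetical container; only `score` and `skipped` are named in the commit."""

    name: str
    score: Optional[float] = None  # optional: None when there is nothing to report
    skipped: bool = False          # explicit "no applicable data" flag
    error: Optional[str] = None    # assumed field for failure details

    def state(self) -> str:
        """Classify the result without inferring anything from a None score alone."""
        if self.skipped:
            return "skipped"
        if self.error is not None or self.score is None:
            return "error"
        return "scored"


print(MetricResult("fidelity", score=0.0).state())      # a real zero score
print(MetricResult("fidelity", skipped=True).state())   # nothing to score
print(MetricResult("fidelity", error="boom").state())   # a genuine failure
```

With an explicit flag, a real score of 0.0, a skipped metric, and an errored metric stay distinguishable for downstream consumers.
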
Agent speech fidelity (S2S) and transcription accuracy key entities both
have legitimate cases where no entities exist to score. Previously these
returned score=0.0 with error="Aggregation failed" (for S2S) or a
zero-valued score that was indistinguishable from a real zero score. Now
they set skipped=True with score=None so consumers can handle the case
correctly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
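
As a rough illustration of the behavior change, the sketch below returns `skipped=True` with `score=None` when there are no entities to score. The function name `score_key_entities`, the result dictionary layout, and the case-insensitive substring matching are assumptions made for this example, not the metric's actual implementation.

```python
from typing import Optional, Sequence, TypedDict


class Result(TypedDict):
    score: Optional[float]
    skipped: bool
    error: Optional[str]


def score_key_entities(expected: Sequence[str], transcript: str) -> Result:
    """Hypothetical entity metric: fraction of expected entities found in the transcript."""
    if not expected:
        # No entities exist to score: signal "skipped" rather than a zero or an error.
        return {"score": None, "skipped": True, "error": None}
    text = transcript.lower()
    hits = sum(1 for entity in expected if entity.lower() in text)
    return {"score": hits / len(expected), "skipped": False, "error": None}


print(score_key_entities([], "hello there"))
print(score_key_entities(["ABC123"], "your code is abc123"))
print(score_key_entities(["ABC123"], "no code was mentioned"))
```

The last call shows a genuine zero score, which now stays distinct from the skipped case in the first call.
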
- validation_runner: skipped metrics no longer fail validation; they are
  excluded from threshold checks.
- pass_at_k: skipped trials are excluded from n/c so pass@k is computed
  over the remaining valid trials.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
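
The two bullets above could look roughly like the sketch below. It assumes the standard unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k); the function names, the trial encoding (`None` for skipped), the result dictionary layout, and the choice to return `None` when fewer than k valid trials remain are all assumptions, not the repository's actual code.

```python
from math import comb
from typing import Dict, List, Optional, Sequence


def failing_metrics(results: Dict[str, dict], thresholds: Dict[str, float]) -> List[str]:
    """Hypothetical threshold check: skipped metrics are excluded, not failed."""
    failures = []
    for name, minimum in thresholds.items():
        result = results[name]
        if result["skipped"]:
            continue  # excluded from threshold checks
        if result["score"] is None or result["score"] < minimum:
            failures.append(name)
    return failures


def pass_at_k(trials: Sequence[Optional[bool]], k: int) -> Optional[float]:
    """Trials are True (pass), False (fail), or None (skipped)."""
    valid = [t for t in trials if t is not None]  # drop skipped trials from n and c
    n, c = len(valid), sum(valid)
    if k <= 0 or n < k:
        return None  # assumption: undefined when too few valid trials remain
    return 1.0 - comb(n - c, k) / comb(n, k)


results = {
    "fidelity": {"score": None, "skipped": True},
    "accuracy": {"score": 0.4, "skipped": False},
}
print(failing_metrics(results, {"fidelity": 0.8, "accuracy": 0.8}))
print(pass_at_k([True, None, False], k=1))
```

In the usage lines, only `accuracy` is reported as failing, and pass@1 is computed over the two valid trials rather than all three.
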
…osite

Previously any None component (missing, errored, or legitimately skipped)
would collapse EVA-A_pass to None, excluding the record from composite
pass statistics. Now a skipped component is excluded from the pass check
while remaining applicable components still determine pass/fail.
Missing or errored components still collapse the composite to None,
since that represents genuine data absence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
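
A sketch of the composite rule described in this commit, under stated assumptions: `composite_pass`, the per-component thresholds, and the result dictionary layout are hypothetical, and returning `None` when every component is skipped is a guess, since the commit message does not cover that case.

```python
from typing import Dict, Mapping, Optional


def composite_pass(
    components: Mapping[str, Optional[dict]],
    thresholds: Dict[str, float],
) -> Optional[bool]:
    """Hypothetical composite: skipped components are ignored, missing/errored collapse to None."""
    verdicts = []
    for name, threshold in thresholds.items():
        result = components.get(name)
        if result is None:
            return None  # missing component: genuine data absence
        if result.get("skipped"):
            continue  # legitimately skipped: excluded from the pass check
        if result.get("error") is not None or result.get("score") is None:
            return None  # errored component: genuine data absence
        verdicts.append(result["score"] >= threshold)
    if not verdicts:
        return None  # assumption: nothing applicable means no verdict
    return all(verdicts)


ok = {"score": 0.9, "skipped": False, "error": None}
skipped = {"score": None, "skipped": True, "error": None}
limits = {"task": 0.8, "fidelity": 0.8}

print(composite_pass({"task": ok, "fidelity": skipped}, limits))
print(composite_pass({"task": ok}, limits))
```

The first call passes on the remaining applicable component; the second collapses to `None` because `fidelity` is missing rather than skipped.
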
@gabegma gabegma force-pushed the ggm/modify-speech-fidelity-for-s2s branch from 7d34aa6 to 7093a0a Compare April 19, 2026 23:42
@gabegma gabegma marked this pull request as ready for review April 19, 2026 23:55
@gabegma gabegma added this pull request to the merge queue Apr 21, 2026
Merged via the queue into main with commit c2a201f Apr 21, 2026
1 check passed
@gabegma gabegma deleted the ggm/modify-speech-fidelity-for-s2s branch April 21, 2026 21:40
2 participants