fix(config): halve pitch escalation at I4–I5 to eliminate helium effect #68
Merged
…ct (#64)

Listening test (2026-05-03) revealed cartoonish pitch at high intensity: VIC female reached +6 st (≈36%) at I5, far exceeding the M15 consensus range of +4–10%. The discrepancy was a unit conversion error: research ranges are in percent, but style_map values are in semitones (1 st ≈ 6%).

Changes:
- Speaker YAMLs: cap female VIC/SW pitch_delta_st to +3 st at I5 (was +5/+6)
- Speaker YAMLs: cap male AGG/BEN pitch_delta_st to +2 st at I5 (was +3)
- SpeakerState: reduce AGG pitch drift targets (I4: 1.5→1.0, I5: 2.0→1.5)
- SpeakerState: reduce VIC pitch drift targets (I4: 0.8→0.5, I5: 1.0→0.7)
- MAX_F0_DRIFT_ST: tighten from 2.0 to 1.5 semitones
- Update unit tests for new drift bound

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
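The percent/semitone mix-up described above can be checked with a small conversion sketch. The function names here are hypothetical and not taken from the repository. The only thing assumed is the standard equal-tempered relation (a shift of `st` semitones scales F0 by 2^(st/12)). Under that relation the "1 st ≈ 6%" rule is a small-shift approximation: it holds well for 1–2 st but understates larger shifts.

```python
import math


def semitones_to_percent(st: float) -> float:
    """Relative F0 change, in percent, for a shift of `st` semitones."""
    return (2.0 ** (st / 12.0) - 1.0) * 100.0


def percent_to_semitones(pct: float) -> float:
    """Semitone shift corresponding to a relative F0 change of `pct` percent."""
    return 12.0 * math.log2(1.0 + pct / 100.0)


if __name__ == "__main__":
    # Semitone values that appear in the commit message.
    for st in (1, 2, 3, 6):
        print(f"{st:+d} st = {semitones_to_percent(st):+.1f}%")

    # The M15 consensus range (+4% to +10%) expressed in semitones.
    low, high = percent_to_semitones(4.0), percent_to_semitones(10.0)
    print(f"+4% to +10% = {low:+.2f} st to {high:+.2f} st")
```

Read this way, the +4–10% research range corresponds to roughly +0.7 to +1.7 st, which is why a value entered as "+6" in semitones lands far outside it.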
… stale comments

Self-review follow-up for #64:
- Female VIC/SW I5 pitch_delta_st reduced from +3 st (~18%) to +2 st (~12%), bringing it closer to the M15 consensus range of 4–10%
- SW_001/SW_002 I1 pitch corrected from -1 st (-6%) to 0 (within -3% to +2%)
- Added pitch clamp (±12 st) in renderer.py to prevent unbounded drift when speaker_state + randomization stack up
- Fixed stale docstring in f0_drift_exceeded (2.0 → 1.5)
- Fixed stale comment in VIC_003 about old pitch range

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
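The ±12 st clamp mentioned above might look roughly like the sketch below. This is not the code in renderer.py: the function name, its parameters, and the assumption that the three contributions simply add are all hypothetical. Only the ±12 st limit comes from the commit message.

```python
PITCH_CLAMP_ST = 12.0  # ±12 st limit, as stated in the commit message


def clamp_pitch_st(
    style_delta_st: float,
    state_drift_st: float,
    jitter_st: float,
    limit_st: float = PITCH_CLAMP_ST,
) -> float:
    """Sum the pitch contributions and clamp the total to ±limit_st.

    Assumption: the per-intensity style delta, the SpeakerState drift, and
    the random jitter are additive in semitones (hypothetical model).
    """
    total = style_delta_st + state_drift_st + jitter_st
    return max(-limit_st, min(limit_st, total))


if __name__ == "__main__":
    # Typical post-fix values stay well inside the clamp.
    print(clamp_pitch_st(2.0, 1.5, 0.3))
    # A pathological stack-up is cut off at the limit.
    print(clamp_pitch_st(10.0, 5.0, 1.0))
```

A clamp this wide is a safety net rather than a tuning knob: with the reduced deltas and drift targets, normal renders should never reach it.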
pr-agent-context report: No unresolved review comments, failing checks, or actionable patch coverage gaps were found on PR #68 in repository https://github.com/DataHackIL/SynthBanshee. Treat this PR as all clear unless new signals appear.
This was referenced May 5, 2026
Closes #64.
Summary
- Reduce `pitch_delta_st` across all 10 speaker YAMLs (6 female VIC/SW, 4 male AGG/BEN)
- Reduce `SpeakerState` pitch drift targets so accumulated cross-turn pitch doesn't compound the problem
- Tighten `MAX_F0_DRIFT_ST` from 2.0 → 1.5 semitones

Before / After (VIC female, 210 Hz baseline, I5 after 3 escalating turns)
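For a rough sense of scale at the 210 Hz baseline, the sketch below converts semitone offsets to Hz. It assumes that the style delta and the accumulated drift add in semitones and that drift is capped at `MAX_F0_DRIFT_ST`. The helper names are hypothetical, and the printed figures are illustrative worst cases derived from the caps quoted in this PR, not measured values.

```python
def shifted_f0_hz(baseline_hz: float, offset_st: float) -> float:
    """F0 in Hz after shifting `baseline_hz` by `offset_st` semitones."""
    return baseline_hz * 2.0 ** (offset_st / 12.0)


def worst_case_f0_hz(baseline_hz: float, delta_st: float, max_drift_st: float) -> float:
    """Upper bound on F0, assuming style delta and capped drift simply add."""
    return shifted_f0_hz(baseline_hz, delta_st + max_drift_st)


if __name__ == "__main__":
    baseline = 210.0  # Hz, VIC female baseline quoted in the PR

    # Old caps: +6 st style delta at I5, drift bound 2.0 st.
    before = worst_case_f0_hz(baseline, 6.0, 2.0)
    # New caps: +2 st style delta at I5, drift bound 1.5 st.
    after = worst_case_f0_hz(baseline, 2.0, 1.5)

    print(f"before: {before:.0f} Hz, after: {after:.0f} Hz")
```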
Changes per file
- configs/examples/speaker_VIC_F_25-40_002.yaml
- configs/examples/speaker_VIC_F_25-40_003.yaml
- configs/speakers/speaker_VIC_F_25-40_004.yaml
- configs/examples/speaker_SW_F_30-45_001.yaml
- configs/speakers/speaker_SW_F_30-45_002.yaml
- configs/speakers/speaker_SW_F_30-45_003.yaml
- configs/examples/speaker_AGG_M_30-45_001.yaml
- configs/examples/speaker_BEN_M_40-55_003.yaml
- configs/speakers/speaker_BEN_M_40-55_004.yaml
- configs/speakers/speaker_BEN_M_40-55_005.yaml
- synthbanshee/tts/speaker_state.py
- tests/unit/test_speaker_state.py

Test plan
- `ruff check`: passed
- `ruff format`: passed
- `mypy`: passed

🤖 Generated with Claude Code