investigate(tts): same scene config now yields 63% longer audio (121s → 198s) — confirm new duration regime is intentional

Discovered while re-rendering `sp_neu_a_0001` from its scene config on current `main` for #78.

## Evidence

Same scene config (`configs/scenes/she_proves/sp_neu_a_0001.yaml`, 12 turns, intensity arc `[1,1,1,2,1]`, AGG_M_30-45_001 + VIC_F_25-40_002), rendered before and after the prosody changes that landed in late April / early May:

| Date | Code path | Duration | Vs scene `target_duration_minutes: 3.0` (180 s) | Vs She-Proves window (3–6 min) |
|---|---|---:|---|---|
| 2026-04-15 | pre-M15 / pre-#70 / pre-#68 | 121.0 s | 33% under target | **below** lower bound |
| 2026-05-05 | current main | 197.8 s | 10% over target | within range |

The new duration is closer to the spec target than the old one (10% over versus 33% under). However, the +63% delta is large enough that any downstream code that estimated wall-clock cost or filename budget for Tier A clips against the old duration regime is now off.
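
The percentages in the table follow directly from the two measured durations. A minimal Python sketch to reproduce them is below; the helper names are illustrative and not part of the repo.

```python
# Durations and bounds are taken from the table above; helper names are illustrative.
OLD_S = 121.0             # 2026-04-15 render
NEW_S = 197.8             # 2026-05-05 render on current main
TARGET_S = 3.0 * 60       # target_duration_minutes: 3.0
WINDOW_S = (3 * 60, 6 * 60)  # She-Proves window, 3 to 6 min


def pct_delta(value, reference):
    """Signed percentage difference of value relative to reference."""
    return (value - reference) / reference * 100.0


def in_window(duration_s, window=WINDOW_S):
    """True if the duration falls inside the inclusive (low, high) window."""
    low, high = window
    return low <= duration_s <= high


for label, dur in (("2026-04-15", OLD_S), ("2026-05-05", NEW_S)):
    print(f"{label}: {dur:.1f} s, {pct_delta(dur, TARGET_S):+.0f}% vs target, "
          f"in window: {in_window(dur)}")
print(f"new vs old: {pct_delta(NEW_S, OLD_S):+.0f}%")
```

The same `pct_delta` ratio can be used to check how far any budget derived from the old regime has drifted.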

## Likely cause (not yet confirmed)

Cumulative effect of recent prosody work, in rough order of suspected magnitude:

- **#51 (M15) — SSML prosody tuning with research-validated Hebrew parameters.**  Changes rate multipliers in speaker `style_map`.
- **#70 — inter-word `<break>` tags to prevent Hebrew word merging.**  Each break adds a small pause; for ~100 Hebrew words per turn over 12 turns, even a 50 ms median break adds ~60 s to the clip.
- **#48 (M14) — fix muffled audio / removed 7500 Hz LPF.**  Should not affect duration directly, but the M14 PR also changed the edge-fade behaviour; check whether anything there affects perceived/measured speech end.
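
The break-overhead estimate for #70 is simple arithmetic: roughly 100 words × 12 turns gives about 1,200 breaks, and at 50 ms each that is about 60 s. The sketch below reproduces that figure and also sums the explicit break time in an SSML string. It assumes breaks carry a standard SSML `time` attribute in `ms` or `s`; how #70 actually emits them has not been checked here, and both function names are hypothetical.

```python
import re

# Assumption: breaks look like <break time="50ms"/> or <break time="0.2s"/>.
_BREAK_RE = re.compile(
    r'<break\b[^>]*\btime\s*=\s*"(\d+(?:\.\d+)?)(ms|s)"', re.IGNORECASE
)


def estimate_break_overhead_s(words_per_turn, turns, break_ms):
    """Back-of-envelope overhead: one break per word, every turn."""
    return words_per_turn * turns * break_ms / 1000.0


def total_break_seconds(ssml):
    """Sum the durations of all <break time="..."> tags found in an SSML string."""
    total = 0.0
    for value, unit in _BREAK_RE.findall(ssml):
        seconds = float(value) / 1000.0 if unit.lower() == "ms" else float(value)
        total += seconds
    return total


# The figures quoted in the bullet above: ~100 words/turn, 12 turns, 50 ms breaks.
print(estimate_break_overhead_s(100, 12, 50))
```

Running `total_break_seconds` over the cached SSML for each turn would give the explicit pause budget, which can then be compared with the measured 76.8 s increase.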

A controlled bisect on the same scene config (cached LLM script + cached SSML where possible, only the rendering path changing) will identify the dominant contributor cheaply.
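
Once the bisect has produced a duration at each checkpoint, attributing the delta is mechanical. The sketch below is illustrative only: `attribute_deltas` is a hypothetical helper, and the only durations filled in are the two endpoints already measured. Intermediate checkpoints (for example, one per suspect PR) would be added by whoever runs the bisect.

```python
def attribute_deltas(checkpoints):
    """Given (label, duration_s) pairs ordered oldest to newest, return
    (label, delta_s, share_of_total) for each step after the first."""
    total = checkpoints[-1][1] - checkpoints[0][1]
    rows = []
    for (_, prev_s), (label, cur_s) in zip(checkpoints, checkpoints[1:]):
        delta = cur_s - prev_s
        share = delta / total if total else 0.0
        rows.append((label, delta, share))
    return rows


# Only the two measured endpoints are known so far; insert bisect results between them.
measured = [
    ("2026-04-15 baseline", 121.0),
    ("current main", 197.8),
]
for label, delta, share in attribute_deltas(measured):
    print(f"{label}: {delta:+.1f} s ({share:.0%} of total change)")
```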

## Why this is not a #78 dependency

#78 is purely about loudness (peak / RMS). Duration is independent. But both regressions came in the same window of PRs and surface together when re-rendering Tier A clips, so it's worth tracking the duration delta separately so the loudness fix doesn't accidentally take ownership of "why are clips longer now."

## Decision needed

Two questions:

1. **Is the new duration regime intentional?** Per CLAUDE.md, She-Proves clips should be 3–6 min. 198 s = 3.3 min satisfies the lower bound; 121 s did not. If the team agrees the new duration is correct, this issue closes as "investigate, document, no code change."
2. **Does anything downstream budget on duration?** Candidates include label-generator phase boundaries, augmentation event placement, and M17 evaluation runtime estimates. If any of them do, those budgets need to be re-derived against current TTS output.

## Reproduction

```bash
.venv/bin/synthbanshee generate -c configs/scenes/she_proves/sp_neu_a_0001.yaml \
    -o /tmp/duration_repro -p she_proves
.venv/bin/python -c "import soundfile as sf; print(sf.info('/tmp/duration_repro/agg_m_30-45_001/sp_neu_a_0001_00.wav'))"
```

The old reference render at `data/m2a_wettest/agg_m_30-45_001/sp_neu_a_0001_00.wav` is 121.0 s.
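
To compare the fresh render against the old reference in one step, a small script along these lines may help. It is a sketch that uses only the standard-library `wave` module and assumes both files are plain PCM WAV; `wav_duration_s` and `duration_report` are hypothetical helpers, not repo functions.

```python
import wave


def wav_duration_s(path):
    """Duration in seconds of a PCM WAV file (frame count / sample rate)."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def duration_report(new_path, ref_path):
    """Return both durations and the new/reference ratio."""
    new_s = wav_duration_s(new_path)
    ref_s = wav_duration_s(ref_path)
    return {"new_s": new_s, "ref_s": ref_s, "ratio": new_s / ref_s}


# Example call, using the paths from the reproduction steps above:
# report = duration_report(
#     "/tmp/duration_repro/agg_m_30-45_001/sp_neu_a_0001_00.wav",
#     "data/m2a_wettest/agg_m_30-45_001/sp_neu_a_0001_00.wav",
# )
# print(report)
```

If the clips are stored in a format `wave` cannot read, the `soundfile` call from the reproduction command is the safer way to get the duration.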

## References

- Discovered alongside loudness regression in #78.
- Suspect commits: #51 (M15), #70, #48 (M14).
