What
Replace the current page-equals-minute runtime heuristic with a syllable-calibrated estimator that uses per-language speech rates.
Why this matters
The page-equals-minute rule is an Anglo-Hollywood convention (one Courier-formatted page of English screenplay ≈ 60 seconds of screen time). It systematically under- or over-estimates Malayalam runtimes because Malayalam syllable density per visual line is different from English.
A syllable-calibrated estimator gives a far more accurate prediction:
- Count syllables per dialogue line via mlphon.
- Apply per-language base rate (Malayalam ≈ 5.5–6.5 syl/sec for normal delivery; English ≈ 4–5 syl/sec).
- Sum syllables across dialogue, divide by rate → dialogue runtime.
- Add action-line runtime via existing page-eighths math (action is paced page-equals-minute reasonably).
- Total = dialogue runtime + action runtime.
Output: "117 minutes — 49 min dialogue at 5.8 syl/sec, 68 min action at standard pacing."
Why not Tier 2 #5 (naturality scoring)
Skipped per the planning discussion — the Malayalam Speech Corpus isn't dense enough yet to derive per-genre rates with confidence. This issue uses a literature-derived constant (5.8 syl/sec for spoken Malayalam, citation needed in the doc) as the v1 default. Per-character calibration via the Statistics tab can come later if the corpus grows.
Dependency
mlphon's syllable counter. Or, simpler v1: a self-contained Rust syllabifier following Malayalam syllable rules (consonant + dependent vowel = one syllable; consonant cluster + vowel = one syllable; etc. — a few hundred lines).
Technical sketch
- Add a
syllableCount(text: string) Rust helper.
- Walk dialogue blocks in
computeStats, sum syllables.
- Configurable rate constant via Settings → Writing.
- Surface in the Overview tab as the new "Estimated screen time" replacing pageCount-as-minute.
Out of scope
Per-character speech rate tuning, dialect adjustments, accent modeling — all are Tier 2 follow-ups.
What
Replace the current page-equals-minute runtime heuristic with a syllable-calibrated estimator that uses per-language speech rates.
Why this matters
The page-equals-minute rule is an Anglo-Hollywood convention (one Courier-formatted page of English screenplay ≈ 60 seconds of screen time). It systematically under- or over-estimates Malayalam runtimes because Malayalam syllable density per visual line is different from English.
A syllable-calibrated estimator gives a far more accurate prediction:
Output: "117 minutes — 49 min dialogue at 5.8 syl/sec, 68 min action at standard pacing."
Why not Tier 2 #5 (naturality scoring)
Skipped per the planning discussion — the Malayalam Speech Corpus isn't dense enough yet to derive per-genre rates with confidence. This issue uses a literature-derived constant (5.8 syl/sec for spoken Malayalam, citation needed in the doc) as the v1 default. Per-character calibration via the Statistics tab can come later if the corpus grows.
Dependency
mlphon's syllable counter. Or, simpler v1: a self-contained Rust syllabifier following Malayalam syllable rules (consonant + dependent vowel = one syllable; consonant cluster + vowel = one syllable; etc. — a few hundred lines).
Technical sketch
syllableCount(text: string)Rust helper.computeStats, sum syllables.Out of scope
Per-character speech rate tuning, dialect adjustments, accent modeling — all are Tier 2 follow-ups.