
fix(azure): correct prosody volume format and spurious validation warnings#43

Merged: OwenMcGirr merged 1 commit into main from fix/42-azure-ssml-audit on Apr 12, 2026

@OwenMcGirr
Collaborator

Fixes #42

Changes

1. Volume attribute: drop the % suffix (3 locations)

Files: src/engines/azure.ts, src/core/ssml-utils.ts, src/core/abstract-tts.ts

The % suffix in Azure's <prosody> volume attribute has a specific meaning: it marks a relative change from the current volume level, not an absolute value.

| Format | Azure interpretation |
| --- | --- |
| volume="75" | Absolute level 75 on a 0–100 scale (default 100) |
| volume="75%" | Relative change: increase by 75% from current |
| volume="-25%" | Relative change: decrease by 25% from current |

The code was emitting volume="75%" for a SpeakOptions.volume of 75. Azure interprets this as "louder by 75%", so the output got louder when the caller had asked for it to be quieter. Since SpeakOptions.volume is already defined as a 0–100 absolute scale, the fix is simply to drop the %, with no value conversion needed.
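As a minimal sketch of the corrected emission (not the actual code in src/engines/azure.ts): the helper names `prosodyVolumeAttr` and `wrapWithProsody` are hypothetical, the clamping to 0–100 is an added assumption, and the text is assumed to be XML-escaped already.

```typescript
// Hypothetical helper, for illustration only; the real emission sites may differ.
// Formats an absolute 0-100 volume for Azure's <prosody> element WITHOUT a "%" suffix.
function prosodyVolumeAttr(volume: number): string {
  // Assumption: out-of-range values are clamped to the 0-100 absolute scale.
  const clamped = Math.min(100, Math.max(0, volume));
  return `volume="${clamped}"`;
}

// Hypothetical wrapper: assumes `text` is already XML-escaped.
function wrapWithProsody(text: string, volume: number): string {
  return `<prosody ${prosodyVolumeAttr(volume)}>${text}</prosody>`;
}

console.log(wrapWithProsody("Hello", 75));
// prints: <prosody volume="75">Hello</prosody>
```

The old behaviour corresponds to appending `%` inside `prosodyVolumeAttr`, which would turn the same call into a relative change.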

2. Validation ordering: move after processing

File: src/engines/azure.ts (prepareSSML)

validateSSMLForEngine was running on the bare <speak>text</speak> produced by wrapWithSpeakTags, before processSSMLForEngine and ensureAzureSSMLStructure had added the required xmlns and version attributes. This caused two warnings to fire on every plain-text synthesis call:

Engine 'azure' requires xmlns attribute in <speak> tag.
Engine 'azure' requires version attribute in <speak> tag.

Validation now runs after all processing, so warnings fire only if something is genuinely wrong in what Azure actually receives.
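The reordering can be sketched as follows. The function names are taken from this PR, but every body below is a simplified stand-in written for illustration (the real implementations, signatures, and return types in the repository are assumed to differ); only the order of the calls in `prepareSSML` is the point.

```typescript
// Simplified stand-ins; the real functions live in the repository and may differ.
const SSML_XMLNS = "http://www.w3.org/2001/10/synthesis";

function wrapWithSpeakTags(text: string): string {
  return `<speak>${text}</speak>`;
}

// Stand-in: adds version and xmlns to the opening <speak> tag when missing.
function ensureAzureSSMLStructure(ssml: string): string {
  return ssml.replace(/<speak([^>]*)>/, (_match: string, attrs: string) => {
    let out = attrs;
    if (!/\bversion=/.test(out)) out += ' version="1.0"';
    if (!/\bxmlns=/.test(out)) out += ` xmlns="${SSML_XMLNS}"`;
    return `<speak${out}>`;
  });
}

// Stand-in: returns warnings instead of logging them, to keep the sketch testable.
function validateSSMLForEngine(ssml: string): string[] {
  const openTag = ssml.match(/<speak[^>]*>/)?.[0] ?? "";
  const warnings: string[] = [];
  if (!/\bxmlns=/.test(openTag)) {
    warnings.push("Engine 'azure' requires xmlns attribute in <speak> tag.");
  }
  if (!/\bversion=/.test(openTag)) {
    warnings.push("Engine 'azure' requires version attribute in <speak> tag.");
  }
  return warnings;
}

function prepareSSML(text: string): { ssml: string; warnings: string[] } {
  const wrapped = wrapWithSpeakTags(text);
  const processed = ensureAzureSSMLStructure(wrapped);
  // Validate last, on the exact string that would be sent to Azure.
  const warnings = validateSSMLForEngine(processed);
  return { ssml: processed, warnings };
}

console.log(validateSSMLForEngine(wrapWithSpeakTags("Hello")).length); // prints: 2 (old ordering)
console.log(prepareSSML("Hello").warnings.length); // prints: 0 (new ordering)
```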

Tests

  • New test suite __tests__/azure-ssml.test.ts covers both fixes
  • Updated src/__tests__/ssml-utils.test.ts and src/__tests__/azure-mstts-namespace.test.ts to assert the correct (no-%) volume format

…lns/version warnings (#42)

Drop the % suffix from all three prosody volume emission sites so Azure
receives the correct absolute format (volume="75") rather than a
relative change (volume="75%").

Also reorder prepareSSML so validation runs after processing, removing
two spurious warnings that fired on every plain-text call.
@OwenMcGirr
Collaborator Author

Live verification

Tested against the real Azure REST API (eastus, riff-24khz-16bit-mono-pcm) with the same sentence synthesised five ways, comparing RMS amplitude of the PCM samples:

| SSML | RMS | vs default |
| --- | --- | --- |
| default (no prosody) | 2414 | baseline |
| volume="75" | 1811 | −25% quieter ✓ |
| volume="75%" | 3622 | +50% louder ✗ (old behaviour) |
| volume="10" | 241 | −90% quieter ✓ |
| volume="-75%" | 604 | −75% quieter ✓ |

volume="75" correctly reduces the output to about 75% of the default level (RMS 1811 vs 2414). volume="75%" boosts it to about 150% (RMS 3622), confirming that the old % format was silently wrong.
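For reference, an RMS comparison of this kind could be computed roughly as below. This is a sketch, not the script used for the measurements above: the function names are hypothetical, and it assumes 16-bit little-endian mono PCM with the RIFF header already stripped.

```typescript
// Hypothetical helper: RMS amplitude of 16-bit little-endian mono PCM samples.
// Assumes the RIFF/WAV header has already been removed from `pcm`.
function rmsOfPcm16(pcm: Uint8Array): number {
  const view = new DataView(pcm.buffer, pcm.byteOffset, pcm.byteLength);
  const sampleCount = Math.floor(pcm.byteLength / 2);
  if (sampleCount === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < sampleCount; i++) {
    const sample = view.getInt16(i * 2, true); // true = little-endian
    sumSquares += sample * sample;
  }
  return Math.sqrt(sumSquares / sampleCount);
}

// Hypothetical helper: percentage change of an RMS value relative to a baseline.
function relativeChangePercent(rms: number, baseline: number): number {
  return ((rms - baseline) / baseline) * 100;
}

// Using the RMS figures reported in the table above:
console.log(Math.round(relativeChangePercent(1811, 2414))); // prints: -25
console.log(Math.round(relativeChangePercent(3622, 2414))); // prints: 50
```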

Rate and pitch were also verified live — both pass through correctly with no format issues.

OwenMcGirr merged commit 6c131e8 into main on Apr 12, 2026
3 of 6 checks passed
OwenMcGirr deleted the fix/42-azure-ssml-audit branch on April 12, 2026 08:27
Development

Successfully merging this pull request may close these issues.

fix(azure): SSML audit — volume format bug and spurious validation warnings
