Release v5.3.0 - OmniVoice + native SRT duration targeting, Visual Tag Builder, Granite ASR improvements · diodiogod/TTS-Audio-Suite

[Video: exmpvid_small.mp4]

Highlights

This release is tagged v5.3.0, but its main feature is still the OmniVoice integration that landed in v5.2.0, now paired with the Granite ASR additions and fixes from v5.3.0.

The biggest practical change is this:

OmniVoice is the first TTS engine in the suite where subtitle segment duration can be meaningfully guided at generation time.

That matters because the suite now has a model path that can aim for target SRT timing before fallback stretch/correction has to do the heavy lifting.
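To make "heavy lifting" concrete, here is a minimal sketch of the arithmetic behind a stretch fallback. It is not the suite's code: the function names `stretch_ratio` and `correction_needed` are hypothetical, and the durations in the usage lines are made-up illustrative values, not measurements.

```python
def stretch_ratio(generated_seconds: float, target_seconds: float) -> float:
    """Tempo factor needed to fit generated audio into the subtitle slot.

    A value above 1.0 means the audio must be sped up, below 1.0 slowed down.
    (Hypothetical helper for illustration only.)
    """
    if generated_seconds <= 0 or target_seconds <= 0:
        raise ValueError("durations must be positive")
    return generated_seconds / target_seconds


def correction_needed(generated_seconds: float, target_seconds: float) -> float:
    """Distance of the stretch ratio from 1.0 (0.0 means no stretch at all)."""
    return abs(stretch_ratio(generated_seconds, target_seconds) - 1.0)


# Illustrative values only: a 2.5 s subtitle slot.
unguided = correction_needed(3.4, 2.5)  # generation ignored the slot length
guided = correction_needed(2.6, 2.5)    # generation aimed at the slot length
print(f"unguided: {unguided:.0%} stretch, guided: {guided:.0%} stretch")
```

The closer the generated length already is to the slot, the smaller the correction a time-stretch step has to apply, which is why guiding duration at generation time tends to sound more natural.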

OmniVoice

OmniVoice is now integrated into the unified suite with:

official OmniVoice model support
text TTS and SRT workflows
multilingual generation with broad upstream language coverage
instruction-based voice design
narrator cloning support with explicit reference text
interruption support in unified generation flows

Native duration-aware SRT generation

This is the part worth paying attention to.

For TTS SRT, the suite can now send target segment duration directly into OmniVoice. In practice that means:

generated segments can land much closer to subtitle timing targets
stretch_to_fit has less corrective work to do
timing adjustments can stay more natural
precise subtitle dubbing / timing workflows become much more practical

This is not just speeding the audio up after the fact. The model is actually being guided with its native duration control during generation.
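As a rough illustration of where a per-segment target comes from, the sketch below derives target durations from SRT timing lines. It assumes standard `HH:MM:SS,mmm --> HH:MM:SS,mmm` timestamps; `Segment` and `parse_srt` are hypothetical names, and how the suite actually passes the value to OmniVoice is not shown here.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    text: str

    @property
    def target_duration(self) -> float:
        """Length of the subtitle slot in seconds."""
        return self.end - self.start


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0


def parse_srt(srt_text: str) -> List[Segment]:
    """Parse SRT text into segments; blocks without a timing line are skipped."""
    segments = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the subtitle text.
                text = " ".join(lines[i + 1:])
                segments.append(Segment(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break
    return segments


sample = """1
00:00:01,000 --> 00:00:03,500
Hello there.

2
00:00:04,000 --> 00:00:06,250
Second line
wrapped.
"""

for seg in parse_srt(sample):
    print(f"{seg.target_duration:.2f}s  {seg.text}")
```

Each `target_duration` is the value a duration-aware engine would aim for; anything left over after generation is what a stretch step would still need to correct.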

Visual Tag Builder

This release also introduces the new 📐 Visual Tag Builder.

It started as an OmniVoice helper, but it has become a more general visual tag / attribute assembly node.

Current strengths:

playful visual reordering of attributes
built-in OmniVoice preset
reusable custom presets
saved column order
workflow persistence for chosen preset / selections

I’ll add a short demo video showing the interaction separately.

Granite ASR updates in v5.3.0

Granite ASR 4.1 diarization and timestamp improvements
plus-model speaker diarization with suite-native [Speaker] output
fixes for longer transcript cutoff in native timestamp mode
clearer Granite model / diarization documentation
