v4.4.0 - Silent Speech Analyzer

@diodiogod released this 16 Aug 19:53
· 1158 commits to main since this release

🗣️ Silent Speech Analyzer Release

Major New Feature

NEW: 🗣️ Silent Speech Analyzer node for video mouth movement analysis!

Key Features

  • Experimental Viseme Detection: Approximate detection of vowels (A, E, I, O, U) and consonants (B, F, M, etc.)
  • 3-Level Analysis System: Frame detection → syllable grouping → word prediction
  • Base SRT Generation: Creates timing-focused SRT files for manual editing and TTS integration
  • MediaPipe Integration: Production-ready mouth movement tracking
  • Visual Feedback: Preview videos with detection overlays
  • Dictionary Integration: Uses CMU Pronouncing Dictionary (135K+ words) for phonetic placeholders
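
To make the frame detection → syllable grouping step more concrete, here is a minimal sketch of how per-frame viseme labels could be collapsed into timed segments. It is not the node's actual implementation: the function name `group_visemes`, the use of `None` for neutral or closed-mouth frames, and the sample data are all assumptions made for illustration.

```python
from itertools import groupby


def group_visemes(frame_labels, fps):
    """Collapse runs of identical per-frame labels into timed segments.

    frame_labels: one label per video frame (hypothetical format);
                  None marks a frame with no detected mouth shape.
    fps:          frames per second of the source video.
    Returns a list of (label, start_seconds, end_seconds) tuples.
    """
    segments = []
    index = 0  # frame index where the current run starts
    for label, run in groupby(frame_labels):
        length = len(list(run))
        if label is not None:  # skip neutral/closed-mouth runs
            segments.append((label, index / fps, (index + length) / fps))
        index += length
    return segments


# Ten frames at 10 fps: a short "M", a longer "A", a pause, then "O".
frames = [None, "M", "M", "A", "A", "A", None, None, "O", "O"]
segments = group_visemes(frames, fps=10)
print(segments)
```

Each resulting segment carries only a label and a time span, which is the kind of timing-focused data a later step could turn into syllable groups or SRT entries.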

Technical Details

  • MediaPipe provider recommended for production use
  • OpenSeeFace provider available but experimental
  • Optimized default values for better detection sensitivity
  • Results require manual editing: the node provides a timing foundation and phonetic suggestions, not finished subtitles
  • Well suited to creating subtitle timing templates from silent video
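
As an illustration of what a timing-focused base SRT looks like, the sketch below formats a list of timed segments using the standard SubRip layout (sequence number, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line). The helper names `srt_timestamp` and `build_srt`, the bracketed placeholder text, and the sample timings are hypothetical and are not taken from the node's code.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for number, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive subtitle entries.
    return "\n".join(blocks)


# Placeholder text stands in for phonetic suggestions to be edited by hand.
example = build_srt([
    (0.5, 1.25, "[MA-O]"),
    (3723.004, 3724.0, "[placeholder]"),
])
print(example)
```

The text field is deliberately left as a placeholder: in this workflow the timings are the useful part, and the wording is expected to be replaced during manual editing.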

Important Notes

⚠️ Experimental Feature: Viseme detection provides approximations, not precise transcription
📝 Manual Editing Required: SRT output designed as timing base for user editing
🎯 Use Case: Foundation for manual subtitle creation and TTS workflows

Installation

This release maintains backward compatibility with all existing workflows while adding the new video analysis capabilities.

See README.md for complete documentation and usage guides.