🚀 SDialog 0.4.7
These release notes summarize the major updates in v0.4.7 since the previous release, v0.4.6.
📞 Telecommunication & Acoustics Architecture (Major Update)
This release significantly expands and restructures the audio simulation stack:
- Added support for a telecommunication simulation engine in the audio pipeline
- Added support for telecommunication codecs
- New modular room-acoustics backend architecture
- The backend split gives a clearer separation between the pyroomacoustics-based and telecommunication-oriented simulation paths
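These notes do not say which codecs are supported or how SDialog applies them. As a generic illustration of what a telephony codec does to a signal, the sketch below implements continuous mu-law companding with 8-bit quantization, the idea behind G.711 mu-law. It is not SDialog code and not a bit-exact G.711 encoder; all function names are hypothetical.

```python
import math

MU = 255  # companding parameter used by 8-bit mu-law telephony


def mulaw_compress(x: float, mu: int = MU) -> float:
    """Logarithmically compress a sample in [-1, 1] to [-1, 1]."""
    x = max(-1.0, min(1.0, x))  # clip out-of-range input
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mulaw_expand(y: float, mu: int = MU) -> float:
    """Invert mulaw_compress."""
    y = max(-1.0, min(1.0, y))
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


def quantize_8bit(y: float) -> int:
    """Map a companded value in [-1, 1] to one of 256 levels (0..255)."""
    return int(round((y + 1.0) / 2.0 * 255))


def dequantize_8bit(q: int) -> float:
    """Map a level 0..255 back to [-1, 1]."""
    return q / 255 * 2.0 - 1.0


def roundtrip(x: float) -> float:
    """Simulate passing one sample through the companded 8-bit channel."""
    return mulaw_expand(dequantize_8bit(quantize_8bit(mulaw_compress(x))))
```

Because the compression is logarithmic, quiet samples keep more relative precision than they would under uniform 8-bit quantization, at the cost of coarser steps for loud samples.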
🎙️ Audio Pipeline Improvements
The audio generation flow was extended and hardened in several areas:
- New text/audio normalization utilities added to the audio module
- Better overlap/pause control and sound-event handling
- Improved post-processing behavior in end-to-end audio generation
- Improved normalization behavior for Qwen3-TTS generation and voice cloning
🛠️ Reliability & Fixes
This cycle includes targeted fixes for robustness, compatibility, and security/privacy:
- Fixed sentence-transformers modality mismatch issues in audio workflows
- Multiple audio robustness fixes across:
  - NumPy conversion edge cases
  - dScaper integration corner cases
  - dataset handling and normalization consistency
- Audio test/tooling compatibility fixes (including torchcodec and qwen_tts stubs/type-hint updates)
- Improved validation error handling in Paraphraser
- Prevented API keys from being persisted in LLM metadata
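The API-key fix above follows a common pattern: drop credential fields from a parameter dictionary before it is stored as metadata. The sketch below shows that pattern in general terms. It is an assumption about the approach, not SDialog's actual implementation; the key names and the `scrub_secrets` helper are hypothetical.

```python
from typing import Any

# Hypothetical key names treated as secrets; not taken from SDialog.
SENSITIVE_KEYS = {"api_key", "openai_api_key", "token", "authorization"}


def scrub_secrets(obj: Any) -> Any:
    """Return a copy of obj with secret-looking dict entries removed."""
    if isinstance(obj, dict):
        return {
            k: scrub_secrets(v)
            for k, v in obj.items()
            if str(k).lower() not in SENSITIVE_KEYS
        }
    if isinstance(obj, list):
        return [scrub_secrets(v) for v in obj]
    return obj  # scalars and other objects are returned unchanged


# Example LLM parameters (illustrative values only).
llm_params = {
    "model": "example-model",
    "temperature": 0.7,
    "api_key": "sk-secret",
    "client": {"Authorization": "Bearer abc", "timeout": 30},
}

# Only the scrubbed copy would be written to disk; the original is untouched.
metadata = scrub_secrets(llm_params)
```

Returning a filtered copy rather than mutating the input keeps the live client configuration usable while ensuring the persisted record never contains the credential.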
📚 Documentation & Tutorials
- Added/updated tutorials for:
  - telecommunication simulation
  - overlaps and pauses
  - sound events
- Updated EACL 2026 citation and poster references
- Added/updated examples for agent tools and final-response tool usage
✅ Stability & Quality
v0.4.7 focuses on making the expanded audio stack more robust through:
- architecture modularization
- stronger normalization and determinism
- broader compatibility fixes
- safer metadata handling
📚 Full Details
Full Changelog: View detailed changes