Release v0.4.7 📞 · idiap/sdialog

🚀 SDialog 0.4.7

These release notes summarize the major updates introduced in v0.4.7 since the previous version, v0.4.6.

This release significantly expands and restructures the audio simulation stack:

- New telecommunication simulation engine support in the audio pipeline
- Added support for telecommunication codecs
- New modular room-acoustics backend architecture
  - The backend split gives a clearer separation between the pyroomacoustics-based and telecommunication-oriented simulation paths
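
The notes do not say which codecs were added, so the following is only an illustrative sketch of what a telephone-style codec does to a signal. It uses the continuous μ-law companding formula (the idea behind G.711 μ-law, not its bit-exact segmented encoding) and is not SDialog's implementation. The function names `mulaw_encode`, `mulaw_decode`, and `simulate_codec` are hypothetical.

```python
import math

MU = 255  # companding constant used by mu-law telephony codecs


def mulaw_encode(x: float, mu: int = MU) -> int:
    """Compress one sample in [-1, 1] to an 8-bit code in 0..255."""
    x = max(-1.0, min(1.0, x))  # clip out-of-range input
    y = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    return int(round((y + 1.0) / 2.0 * mu))


def mulaw_decode(code: int, mu: int = MU) -> float:
    """Expand an 8-bit code back to a sample in [-1, 1]."""
    y = 2.0 * code / mu - 1.0
    return math.copysign(((1.0 + mu) ** abs(y) - 1.0) / mu, y)


def simulate_codec(samples):
    """Round-trip samples through the codec (hypothetical helper)."""
    return [mulaw_decode(mulaw_encode(s)) for s in samples]
```

Round-tripping a signal through `simulate_codec` introduces the quantization error a listener would hear on such a channel: the error is small for quiet samples and larger for loud ones, because the code space is spread logarithmically over the amplitude range.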

The audio generation flow was extended and hardened in several areas:

This cycle includes targeted fixes for robustness, compatibility, and security/privacy:

- Fixed sentence-transformers modality mismatch issues in audio workflows
- Multiple audio robustness fixes across:
  - NumPy conversion edge cases
  - dScaper integration corner cases
  - dataset handling and normalization consistency
- Audio test/tooling compatibility fixes (including torchcodec and qwen_tts stubs/type-hint updates)
- Improved validation error handling in Paraphraser
- Prevented API keys from being persisted in LLM metadata
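
To illustrate the kind of safeguard behind the API-key fix, here is a minimal sketch that drops credential-like fields from a metadata dictionary before it is saved. It is not SDialog's actual code; `scrub_metadata`, `SENSITIVE_KEYS`, and the key names they match are assumptions made for this example.

```python
# Hypothetical set of field names treated as credentials.
SENSITIVE_KEYS = {"api_key", "token", "secret"}


def scrub_metadata(metadata: dict) -> dict:
    """Return a copy of metadata without credential-like entries."""
    clean = {}
    for key, value in metadata.items():
        name = str(key).lower()
        if name in SENSITIVE_KEYS or name.endswith("_api_key"):
            continue  # never persist credentials
        # Recurse so nested parameter dicts are scrubbed as well.
        clean[key] = scrub_metadata(value) if isinstance(value, dict) else value
    return clean
```

For example, `scrub_metadata({"model": "m", "api_key": "sk-123"})` keeps only the `model` entry. Filtering at serialization time, rather than at construction time, leaves the key usable in memory while keeping it out of exported dialog files.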

- Added/updated tutorials for:
  - telecommunication simulation
  - overlaps and pauses
  - sound events
- Updated EACL 2026 citation and poster references
- Added/updated examples for agent tools and final-response tool usage

v0.4.7 focuses on making the expanded audio stack more robust through:

Full Changelog: View detailed changes