v0.4.0 🔊
🔊 Audio Generation Module & Enhanced Interpretability 🎉
This release introduces the new sdialog.audio module, enabling audio conversation generation from text dialogues, along with expanded interpretability capabilities and new backend support.
🎙️ Transform Dialogues into Audio
- 🔊 New sdialog.audio Module: Convert any text dialog to audio with a single line: dialog.to_audio(). The module includes:
  - Multiple TTS engines (Kokoro, Hugging Face models)
  - Voice databases with automatic voice assignment based on persona attributes (age, gender, language)
  - Acoustic simulation with ray tracing for realistic room acoustics and spatial audio
  - Microphone simulation with professional impulse responses (Shure, Sennheiser, Sony)
  - Room generation with customizable dimensions, materials, and furniture placement
  - Multiple export formats (WAV, MP3, FLAC) with custom sampling rates
  - Background/foreground effects for environmental realism
→ 7 comprehensive tutorials covering all audio generation capabilities
→ Install with: pip install sdialog[audio]
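To make the idea of attribute-based voice assignment concrete, here is a toy sketch in plain Python. It is not sdialog's implementation: the `Voice` class, the `pick_voice` helper, and the voice names are all hypothetical, and the matching rule (exact gender and language, nearest age) is only an assumption about how such a lookup could work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    """Hypothetical voice database entry (not an sdialog class)."""
    name: str
    gender: str
    age: int
    language: str


def pick_voice(voices, gender, age, language):
    """Return the voice with matching gender and language whose age is closest.

    Returns None when no voice matches gender and language.
    """
    candidates = [v for v in voices if v.gender == gender and v.language == language]
    if not candidates:
        return None
    return min(candidates, key=lambda v: abs(v.age - age))


# Made-up entries standing in for a real voice database.
VOICES = [
    Voice("voice_a", "female", 30, "english"),
    Voice("voice_b", "female", 65, "english"),
    Voice("voice_c", "male", 40, "english"),
]

chosen = pick_voice(VOICES, gender="female", age=60, language="english")
print(chosen.name if chosen else "no matching voice")
```

In the library itself this selection happens automatically when a dialog is converted to audio, so a helper like this is only useful for reasoning about which persona attributes influence the result.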
🧠 Enhanced Interpretability & Steering
- 🔍 Input Inspection: Inspect layer and component inputs (not just outputs) with Inspector(target="model.layers.15", inspect_input=True)
- 📊 Input Token Analysis: Access activations for input tokens (not only generated ones) via inspector.input[i][j].act, where i is the turn index and j is the input token index
→ Enables deeper mechanistic analysis of prompt processing
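The `[i][j].act` indexing described above can be walked with a small helper. The sketch below is a hedged illustration: `collect_input_acts` is a hypothetical function, it assumes `inspector.input` can be iterated over turns and tokens (the notes only confirm indexing), and it runs on stand-in data built with `SimpleNamespace` rather than on a real Inspector.

```python
from types import SimpleNamespace


def collect_input_acts(inspector_input):
    """Flatten a [turn][token].act structure into (turn, token, act) triples.

    Assumes the object is iterable at both levels; this is an assumption,
    not something the release notes state.
    """
    triples = []
    for i, turn in enumerate(inspector_input):
        for j, token in enumerate(turn):
            triples.append((i, j, token.act))
    return triples


# Stand-in for inspector.input: two turns with 2 and 1 input tokens.
# The activation values are made up for illustration.
fake_input = [
    [SimpleNamespace(act=[0.1, 0.2]), SimpleNamespace(act=[0.3, 0.4])],
    [SimpleNamespace(act=[0.5, 0.6])],
]

triples = collect_input_acts(fake_input)
for turn_idx, token_idx, act in triples:
    print(turn_idx, token_idx, act)
```

With a real setup, the same helper would be called on `inspector.input` from an Inspector created with `inspect_input=True`, provided the iteration assumption holds.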
🌐 New Backend Support
- 🤖 Anthropic Integration: Added support for Anthropic's Claude models (#100)
- ☁️ Azure OpenAI: Native support for Azure OpenAI deployments (#100)
🐛 Bug Fixes
- 💾 Agent Memory Reset: Fixed a memory reset issue that occurred when neither a system prompt nor a persona was provided
🎯 What's Next
This major update enables:
- End-to-end audio dialogue generation for training data, evaluation, and production systems
- Deeper model analysis through enhanced interpretability tools
- Broader deployment options with Anthropic and Azure support
Full Changelog: View detailed changes