v0.4.0 🔊

@sergioburdisso sergioburdisso released this 19 Nov 12:12

🔊 Audio Generation Module & Enhanced Interpretability 🎉

This release introduces the new sdialog.audio module, enabling audio conversation generation from text dialogues, along with expanded interpretability capabilities and new backend support.

🎙️ Transform Dialogues into Audio

  • 🔊 New sdialog.audio Module:
    Convert any text dialog to audio with a single line: dialog.to_audio(). The module includes:

    • Multiple TTS engines (Kokoro, Hugging Face models)
    • Voice databases with automatic voice assignment based on persona attributes (age, gender, language)
    • Acoustic simulation with ray tracing for realistic room acoustics and spatial audio
    • Microphone simulation with professional impulse responses (Shure, Sennheiser, Sony)
    • Room generation with customizable dimensions, materials, and furniture placement
    • Multiple export formats (WAV, MP3, FLAC) with custom sampling rates
    • Background/foreground effects for environmental realism

    → 7 comprehensive tutorials covering all audio generation capabilities
    → Install with: pip install sdialog[audio]
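To make "automatic voice assignment based on persona attributes" concrete, here is a minimal, library-independent sketch of one way such matching could work: filter a voice table by gender and language, then pick the closest age. This is not sdialog's implementation; the `Voice` class, the `pick_voice` function, and the voice names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    """One entry of a hypothetical voice table (not an sdialog class)."""
    name: str
    gender: str
    age: int
    language: str


def pick_voice(voices, gender, age, language):
    """Return the voice with matching gender and language whose age is closest."""
    candidates = [
        v for v in voices if v.gender == gender and v.language == language
    ]
    if not candidates:
        raise LookupError(
            f"no voice for gender={gender!r}, language={language!r}"
        )
    # Among the matching voices, prefer the smallest age difference.
    return min(candidates, key=lambda v: abs(v.age - age))


# Made-up voice table for the example.
VOICES = [
    Voice("voice_a", "female", 25, "english"),
    Voice("voice_b", "female", 60, "english"),
    Voice("voice_c", "male", 40, "english"),
]

chosen = pick_voice(VOICES, gender="female", age=55, language="english")
print(chosen.name)  # voice_b
```

A real voice database may well use different attributes, fallbacks, or random sampling among equally good matches; the sketch only shows the general shape of attribute-based selection.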

🧠 Enhanced Interpretability & Steering

  • 🔍 Input Inspection:
    Inspect layer and component inputs (not just outputs) with Inspector(target="model.layers.15", inspect_input=True)

  • 📊 Input Token Analysis:
    Access activations for input tokens (not only generated ones) via inspector.input[i][j].act, where i is the turn index and j is the input token index
    → Enables deeper mechanistic analysis of prompt processing
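The difference between inspecting a layer's input and its output, and the turn/token indexing described above, can be illustrated with a small pure-Python stand-in. This is only a conceptual sketch under stated assumptions: `Recorder`, `TokenRecord`, and the toy `double` layer are hypothetical and are not part of sdialog. Only the `inspect_input` flag name and the `[i][j].act` access pattern mirror the release notes.

```python
from dataclasses import dataclass, field


@dataclass
class TokenRecord:
    """Holds the recorded activation for one token (plain floats here)."""
    act: list


@dataclass
class Recorder:
    """Hypothetical recorder: stores what enters or leaves a layer, per token."""
    inspect_input: bool = False
    turns: list = field(default_factory=list)  # turns[i][j] -> TokenRecord

    def run(self, layer, token_vectors):
        """Process one turn and record one TokenRecord per token."""
        records = []
        for vec in token_vectors:
            out = layer(vec)
            # Keep the vector that entered the layer, or the one that left it.
            recorded = list(vec) if self.inspect_input else out
            records.append(TokenRecord(act=recorded))
        self.turns.append(records)


def double(vec):
    """Toy 'layer' that doubles every component."""
    return [2 * x for x in vec]


tokens = [[1.0, 2.0], [3.0, 4.0]]  # one turn with two tokens

rec_in = Recorder(inspect_input=True)
rec_out = Recorder()
rec_in.run(double, tokens)
rec_out.run(double, tokens)

print(rec_in.turns[0][1].act)   # [3.0, 4.0]  (layer input for turn 0, token 1)
print(rec_out.turns[0][1].act)  # [6.0, 8.0]  (layer output for the same token)
```

In a real model the recorded values would be tensors captured from a transformer layer rather than Python lists, but the indexing idea is the same: first the turn, then the token within that turn.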

🌐 New Backend Support

  • 🤖 Anthropic Integration:
    Added support for Anthropic's Claude models (#100)

  • ☁️ Azure OpenAI:
    Native support for Azure OpenAI deployments (#100)

🐛 Bug Fixes

  • 💾 Agent Memory Reset:
    Fixed a memory reset issue that occurred when neither a system prompt nor a persona was provided

🎯 What's Next

This major update enables:

  • End-to-end audio dialogue generation for training data, evaluation, and production systems
  • Deeper model analysis through enhanced interpretability tools
  • Broader deployment options with Anthropic and Azure support

Full Changelog: View detailed changes