# 🎙️ Voice Soundboard Demo

**AI-powered voice synthesis with natural language control**

This notebook demonstrates the key features of Voice Soundboard:
- 54+ voices with emotions and presets
- Natural language style hints
- Paralinguistic tags ([laugh], [sigh], etc.)
- Multi-speaker dialogue
- Real-time streaming

[![GitHub](https://img.shields.io/badge/GitHub-voice--soundboard-blue?logo=github)](https://github.com/yourusername/voice-soundboard)
[![PyPI](https://img.shields.io/pypi/v/voice-soundboard)](https://pypi.org/project/voice-soundboard/)

## 1. Installation

First, let's install Voice Soundboard and download the required models.

In [None]:
# Install voice-soundboard
# Quote the requirement so the shell does not treat [all] as a glob pattern
!pip install -q "voice-soundboard[all]"

# Download Kokoro models
!mkdir -p models
!wget -q -O models/kokoro-v1.0.onnx https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/kokoro-v1.0.onnx
!wget -q -O models/voices-v1.0.bin https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/voices-v1.0.bin

print("✅ Installation complete!")
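
Because `wget -q` suppresses its output, a failed download can go unnoticed and may leave an empty file behind. The following is a minimal sketch of a sanity check, assuming the two paths used in the cell above; `find_missing` is a hypothetical helper for this notebook, not part of Voice Soundboard.

```python
from pathlib import Path

# Paths taken from the download cell above.
MODEL_FILES = ["models/kokoro-v1.0.onnx", "models/voices-v1.0.bin"]

def find_missing(paths):
    """Return the paths that do not exist or are empty files."""
    return [p for p in paths if not Path(p).is_file() or Path(p).stat().st_size == 0]

missing = find_missing(MODEL_FILES)
if missing:
    print("Missing or empty:", ", ".join(missing))
else:
    print("All model files present.")
```

If anything is reported missing, re-running the download cell without `-q` should show the underlying error.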

In [None]:
# Setup for audio playback in Colab
from IPython.display import Audio, display
import os

# Set model path
os.environ['KOKORO_MODEL_PATH'] = 'models/kokoro-v1.0.onnx'
os.environ['KOKORO_VOICES_PATH'] = 'models/voices-v1.0.bin'

def play(audio_path):
    """Play audio in Colab notebook."""
    display(Audio(audio_path, autoplay=True))

## 2. Basic Speech Generation

Let's start with simple text-to-speech.

In [None]:
from voice_soundboard import VoiceEngine

# Initialize the engine
engine = VoiceEngine()
print("🎤 VoiceEngine initialized!")

# Simple speech
result = engine.speak("Hello! Welcome to Voice Soundboard.")
print(f"Generated: {result.audio_path}")
print(f"Duration: {result.duration_seconds:.2f}s")
play(result.audio_path)

## 3. Different Voices

Voice Soundboard includes 54+ voices across different accents.

In [None]:
from voice_soundboard import KOKORO_VOICES

# Show available voices
print("Available voices:")
for voice_id, info in list(KOKORO_VOICES.items())[:10]:
    print(f"  {voice_id}: {info['name']} ({info['gender']}, {info['language']})")
print(f"  ... and {len(KOKORO_VOICES) - 10} more!")

In [None]:
# Try different voices
voices = [
    ("af_bella", "American Female - Bella"),
    ("bm_george", "British Male - George"),
    ("am_michael", "American Male - Michael"),
]

for voice_id, description in voices:
    print(f"\n🎙️ {description}:")
    result = engine.speak("This is my voice. How do I sound?", voice=voice_id)
    play(result.audio_path)
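
The voice IDs used above appear to follow a two-letter prefix pattern: the first letter for the accent (`a` American, `b` British) and the second for the gender (`f`, `m`), as the descriptions in the list suggest. Here is a minimal sketch that decodes an ID under that assumption. `describe_voice_id` is a hypothetical helper, not part of Voice Soundboard, and it only maps the two accents seen in this notebook.

```python
# Hypothetical helper: decodes the two-letter prefix of a voice ID.
# Assumes the pattern seen in this notebook (accent letter, then gender letter).
ACCENTS = {"a": "American", "b": "British"}
GENDERS = {"f": "Female", "m": "Male"}

def describe_voice_id(voice_id):
    prefix, _, name = voice_id.partition("_")
    if len(prefix) != 2 or not name:
        raise ValueError(f"Unexpected voice ID format: {voice_id!r}")
    accent = ACCENTS.get(prefix[0], "Unknown accent")
    gender = GENDERS.get(prefix[1], "Unknown gender")
    return f"{accent} {gender} - {name.capitalize()}"

for vid in ("af_bella", "bm_george", "am_michael"):
    print(vid, "->", describe_voice_id(vid))
```

For authoritative metadata, prefer the `KOKORO_VOICES` dictionary shown earlier over prefix decoding.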

## 4. Voice Presets

Presets combine voice, speed, and style for common use cases.

In [None]:
from voice_soundboard import VOICE_PRESETS

# Show presets
print("Available presets:")
for name, config in VOICE_PRESETS.items():
    print(f"  {name}: {config.get('description', 'No description')}")

In [None]:
# Try each preset
texts = {
    "assistant": "Hi there! How can I help you today?",
    "narrator": "In a world where AI can speak... anything is possible.",
    "announcer": "Breaking news! Voice Soundboard reaches version one point zero!",
    "storyteller": "Once upon a time, in a land far away...",
    "whisper": "Let me tell you a secret...",
}

for preset, text in texts.items():
    print(f"\n🎭 Preset: {preset}")
    result = engine.speak(text, preset=preset)
    play(result.audio_path)

## 5. Emotions

Apply 19 different emotions to make speech more expressive.

In [None]:
from voice_soundboard import list_emotions

# Show available emotions
emotions = list_emotions()
print(f"Available emotions ({len(emotions)}):")
print(", ".join(emotions))

In [None]:
# Try different emotions
emotion_demos = [
    ("happy", "I'm so happy to see you!"),
    ("sad", "I'm going to miss you..."),
    ("excited", "This is amazing! I can't believe it!"),
    ("calm", "Take a deep breath. Everything will be fine."),
    ("angry", "This is unacceptable! We need to fix this now!"),
]

for emotion, text in emotion_demos:
    print(f"\n😊 Emotion: {emotion}")
    result = engine.speak(text, emotion=emotion)
    play(result.audio_path)

## 6. Natural Language Styles

Describe how you want the speech to sound using natural language.

In [None]:
# Natural language style hints
styles = [
    "warmly and cheerfully",
    "slowly and mysteriously",
    "excitedly, like an announcer",
    "calmly, in a British accent",
    "quickly and nervously",
]

text = "Let me tell you something interesting."

for style in styles:
    print(f"\n🎨 Style: '{style}'")
    result = engine.speak(text, style=style)
    play(result.audio_path)
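
This notebook does not show how the library interprets a style hint. To build intuition, here is a minimal sketch of one possible approach: a keyword lookup that turns a hint into synthesis parameters. The table, the parameter names, and the speed values are all illustrative assumptions, and `interpret_style` is a hypothetical function, not the Voice Soundboard implementation.

```python
import re

# Hypothetical keyword table; the values are illustrative assumptions.
STYLE_KEYWORDS = {
    "slowly": {"speed": 0.8},
    "quickly": {"speed": 1.2},
    "british": {"voice": "bm_george"},
    "cheerfully": {"emotion": "happy"},
    "calmly": {"emotion": "calm"},
    "excitedly": {"emotion": "excited"},
}

def interpret_style(style):
    """Merge the parameters of every known keyword found in the hint."""
    params = {}
    for word in re.findall(r"[a-z]+", style.lower()):
        params.update(STYLE_KEYWORDS.get(word, {}))
    return params

print(interpret_style("calmly, in a British accent"))
print(interpret_style("quickly and nervously"))
```

Words the table does not know, such as "nervously", are ignored in this sketch. A real interpreter would likely need a richer vocabulary or a learned model.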

## 7. Emotion Blending

Mix multiple emotions for nuanced expression.

In [None]:
from voice_soundboard import blend_emotions

# Blend emotions
blends = [
    [("happy", 0.5), ("sad", 0.5)],        # Bittersweet
    [("happy", 0.7), ("surprised", 0.3)],  # Pleasant surprise
    [("calm", 0.6), ("happy", 0.4)],       # Content
]

for blend in blends:
    result = blend_emotions(blend)
    weights = " + ".join(f"{int(w*100)}% {e}" for e, w in blend)
    print(f"\n🎭 Blend: {weights}")
    print(f"   Result: {result.closest_emotion}")
    
    audio = engine.speak(
        f"This is how {result.closest_emotion} sounds.",
        emotion=result.closest_emotion
    )
    play(audio.audio_path)
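
One way to think about a blend that resolves to a "closest emotion" is as a weighted average of points in a valence-arousal plane, followed by a nearest-neighbor lookup. The sketch below illustrates that idea only. The coordinates are made-up values, the `neutral` entry is an added assumption, and `blend_point` and `closest_named_emotion` are hypothetical names, so the results may differ from what `blend_emotions` returns.

```python
import math

# Illustrative (valence, arousal) coordinates; assumed values, not the library's.
EMOTION_COORDS = {
    "neutral": (0.0, 0.0),
    "happy": (0.8, 0.5),
    "sad": (-0.7, -0.4),
    "excited": (0.7, 0.9),
    "calm": (0.3, -0.6),
    "angry": (-0.6, 0.8),
    "surprised": (0.2, 0.8),
}

def blend_point(blend):
    """Weighted average of the emotions' coordinates (weights are normalized)."""
    total = sum(weight for _, weight in blend)
    if total <= 0:
        raise ValueError("Blend weights must sum to a positive number")
    valence = sum(EMOTION_COORDS[name][0] * weight for name, weight in blend) / total
    arousal = sum(EMOTION_COORDS[name][1] * weight for name, weight in blend) / total
    return valence, arousal

def closest_named_emotion(blend):
    """Name of the emotion whose coordinates are nearest to the blended point."""
    point = blend_point(blend)
    return min(EMOTION_COORDS, key=lambda name: math.dist(point, EMOTION_COORDS[name]))

print(closest_named_emotion([("happy", 0.7), ("surprised", 0.3)]))
print(closest_named_emotion([("calm", 0.6), ("happy", 0.4)]))
```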

## 8. Sound Effects

Built-in sound effects for notifications, UI feedback, and more.

In [None]:
from voice_soundboard import list_effects, get_effect

# List available effects
effects = list_effects()
print(f"Available effects ({len(effects)}):")
print(", ".join(effects))

In [None]:
# Play some effects
demo_effects = ["chime", "success", "error", "attention", "whoosh"]

for effect_name in demo_effects:
    print(f"\n🔔 Effect: {effect_name}")
    effect = get_effect(effect_name)
    effect.save(f"/tmp/{effect_name}.wav")
    play(f"/tmp/{effect_name}.wav")

## 9. SSML Support

Fine-grained control with Speech Synthesis Markup Language.

In [None]:
from voice_soundboard import parse_ssml

# SSML with various elements
ssml = '''
<speak>
    Hello! <break time="500ms"/>
    <prosody rate="slow">This is spoken slowly.</prosody>
    <break time="300ms"/>
    <emphasis level="strong">This is important!</emphasis>
    <break time="500ms"/>
    The date is <say-as interpret-as="date">2026-01-22</say-as>.
</speak>
'''

text, params = parse_ssml(ssml)
print(f"Parsed text: {text.strip()}")

result = engine.speak(text, speed=params.speed)
play(result.audio_path)
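
To see what an SSML document contains independently of `parse_ssml`, the standard library's XML parser is enough. The following sketch extracts the plain text and the `<break>` durations. `ssml_to_text_and_breaks` is a hypothetical helper, and it makes no claim about how Voice Soundboard handles these elements internally.

```python
import xml.etree.ElementTree as ET

def ssml_to_text_and_breaks(ssml):
    """Return (plain_text, break_durations_ms) for a well-formed SSML string."""
    root = ET.fromstring(ssml.strip())
    # itertext() walks all text nodes in document order; collapse the whitespace.
    text = " ".join("".join(root.itertext()).split())
    breaks_ms = []
    for node in root.iter("break"):
        value = node.get("time", "0ms")
        if value.endswith("ms"):
            breaks_ms.append(int(value[:-2]))
        elif value.endswith("s"):
            breaks_ms.append(int(round(float(value[:-1]) * 1000)))
    return text, breaks_ms

sample = """
<speak>
    Hello! <break time="500ms"/>
    <prosody rate="slow">This is spoken slowly.</prosody>
    <break time="1s"/>
    Goodbye.
</speak>
"""
print(ssml_to_text_and_breaks(sample))
```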

## 10. Multi-Speaker Dialogue

Generate conversations with multiple distinct voices.

In [None]:
import asyncio
from voice_soundboard import DialogueEngine

dialogue_engine = DialogueEngine()

script = """
[S1:narrator] The detective entered the dimly lit room.
[S2:detective] (firmly) Where were you on the night of the crime?
[S3:suspect] (nervously) I... I was at home. Alone.
[S2:detective] (skeptically) Is that so? We have witnesses.
[S3:suspect] (panicking) I swear! I didn't do anything!
"""

async def generate_dialogue():
    result = await dialogue_engine.speak_dialogue(
        script,
        voices={
            "narrator": "bm_george",
            "detective": "am_michael",
            "suspect": "af_nicole"
        },
        turn_pause_ms=500
    )
    return result

print("🎭 Generating multi-speaker dialogue...")
# Notebooks already run an event loop, so await the coroutine directly
# instead of calling run_until_complete() on the running loop.
result = await generate_dialogue()
print(f"Generated: {result.audio_path}")
play(result.audio_path)
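
The script format above puts a speaker tag, an optional stage direction in parentheses, and the spoken text on each line. Here is a minimal sketch of how such lines could be split into structured turns with a regular expression. `Turn` and `parse_script` are hypothetical names, and the grammar is inferred from the example script only, so `DialogueEngine` may accept more than this.

```python
import re
from typing import NamedTuple, Optional

class Turn(NamedTuple):
    speaker_id: str           # e.g. "S2"
    speaker_name: str         # e.g. "detective"
    direction: Optional[str]  # e.g. "firmly", or None when absent
    text: str

# [ID:name], then an optional (direction), then the spoken text.
LINE_RE = re.compile(
    r"^\[(?P<id>\w+):(?P<name>\w+)\]\s*"
    r"(?:\((?P<direction>[^)]*)\)\s*)?"
    r"(?P<text>.+)$"
)

def parse_script(script):
    """Parse each matching line into a Turn; non-matching lines are skipped."""
    turns = []
    for line in script.strip().splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            turns.append(Turn(match["id"], match["name"], match["direction"], match["text"]))
    return turns

demo = """
[S1:narrator] The detective entered the dimly lit room.
[S2:detective] (firmly) Where were you on the night of the crime?
"""
for turn in parse_script(demo):
    print(turn)
```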

## 11. Word-Level Emotion Tags

Change emotions mid-sentence for expressive speech.

In [None]:
from voice_soundboard import EmotionParser

# Parse emotion tags
text = "I was so {happy}excited{/happy} to see you, but then {sad}you had to leave{/sad}."

parser = EmotionParser()
parsed = parser.parse(text)

print(f"Original: {text}")
print(f"\nEmotion spans:")
for span in parsed.spans:
    print(f"  '{span.text}' -> {span.emotion}")

# Speak the text with the tags stripped; the spans above are shown for inspection
result = engine.speak(parsed.plain_text)
play(result.audio_path)
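
The `{emotion}...{/emotion}` syntax is simple enough to illustrate with a regular expression. The sketch below extracts the tagged spans and the tag-free text. `extract_emotion_spans` is a hypothetical function that does not handle nested tags, and it is not a description of how `EmotionParser` works.

```python
import re

# Matches {name}text{/name}; the backreference requires a matching closing tag.
TAG_RE = re.compile(r"\{(?P<emotion>\w+)\}(?P<text>.*?)\{/(?P=emotion)\}")

def extract_emotion_spans(tagged):
    """Return (plain_text, [(span_text, emotion), ...]) for non-nested tags."""
    spans = [(m["text"], m["emotion"]) for m in TAG_RE.finditer(tagged)]
    plain = TAG_RE.sub(lambda m: m["text"], tagged)
    return plain, spans

tagged = "I was so {happy}excited{/happy} to see you, but then {sad}you had to leave{/sad}."
plain, spans = extract_emotion_spans(tagged)
print(plain)
for span_text, emotion in spans:
    print(f"  '{span_text}' -> {emotion}")
```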

## 12. Summary

Voice Soundboard provides a comprehensive voice synthesis platform with:

- ✅ 54+ voices across multiple accents
- ✅ 19 emotions with blending support
- ✅ Natural language style hints
- ✅ Multi-speaker dialogue synthesis
- ✅ SSML support for fine control
- ✅ Built-in sound effects
- ✅ Word-level emotion tags
- ✅ Real-time streaming (low latency)
- ✅ Voice cloning (not shown in Colab due to audio input limitations)
- ✅ MCP integration for AI agents

In [None]:
# Final demo - putting it all together
print("🎉 Final Demo: Combining features")

result = engine.speak(
    "Thank you for trying Voice Soundboard! I hope you enjoyed this demo.",
    style="warmly and cheerfully",
    preset="assistant"
)
play(result.audio_path)

print("\n" + "="*50)
print("📦 Install: pip install voice-soundboard[all]")
print("📚 Docs: https://github.com/yourusername/voice-soundboard")
print("⭐ Star us on GitHub if you found this useful!")
print("="*50)