# 🔊 Converting Text into Audio with OpenAI TTS

## Learn to Create Entertaining Audio Stories!

## 📚 Introduction

### What You'll Learn
In this notebook, you'll learn how to use OpenAI's Text-to-Speech (TTS) API to convert written text into engaging audio stories. You'll create fun fairytales and entertaining stories that come to life with different voices!

### 🎯 What We'll Create

In this tutorial, we'll explore how to:

- **Convert text to speech**: Transform written stories into audio that you can listen to
- **Experiment with different voices**: Try out various voice options to find the perfect narrator
- **Generate creative stories**: Use AI to write fun, scary, or silly tales
- **Create interactive experiences**: Make stories that people can listen to instead of read

### 🔑 Key Concepts

- **Text-to-Speech (TTS)**: Technology that converts written text into spoken audio
- **Voice Synthesis**: The process of generating human-like speech from text
- **Voice Selection**: Choosing different voices for different story styles
- **Audio Playback**: Playing generated audio directly in your notebook

### 🏢 Real-World Applications

- ✅ **Audiobook creation** - Turn written stories into audio format
- ✅ **Accessibility features** - Make content available to people with visual impairments or reading difficulties, and to anyone who prefers listening
- ✅ **Educational content** - Create engaging audio lessons
- ✅ **Entertainment** - Generate fun stories and jokes
- ✅ **Multi-language storytelling** - Create stories in different languages

## 🔧 Setup

### Install Dependencies

First, let's install the OpenAI Python library:

In [None]:
!pip install -q openai

### Import Required Libraries

In [None]:
import os
from pathlib import Path
from openai import OpenAI
from IPython.display import Audio, display

### Configure OpenAI API Key

💡 **Recommended**: Store your API key in Colab secrets for security
- Go to 🔑 (left sidebar)
- Click "Add new secret"
- Name: `OPENAI_API_KEY`
- Value: Your OpenAI API key

In [None]:
# Configure OpenAI API key
# Method 1: Try to get API key from Colab secrets (recommended)
try:
    from google.colab import userdata
    OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')
    print("✅ API key loaded from Colab secrets")
except Exception:
    # Method 2: Manual input (fallback when not running in Colab or the secret is unavailable)
    from getpass import getpass
    print("💡 To use Colab secrets: Go to 🔑 (left sidebar) → Add new secret → Name: OPENAI_API_KEY")
    OPENAI_API_KEY = getpass("Enter your OpenAI API Key: ")

# Validate the API key before using it (assigning None to os.environ would raise a TypeError)
if not OPENAI_API_KEY or not OPENAI_API_KEY.strip():
    raise ValueError("❌ ERROR: No API key provided!")

# Set the API key as an environment variable
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

print("✅ Authentication configured!")

# Configure which OpenAI model to use
OPENAI_MODEL = "gpt-5-nano"  # Using gpt-5-nano for cost efficiency
print(f"🤖 Selected Model: {OPENAI_MODEL}")

### Initialize OpenAI Client

In [None]:
# Initialize the OpenAI client
client = OpenAI(api_key=OPENAI_API_KEY)

print("✅ OpenAI client initialized!")

## 🎓 Understanding Text-to-Speech

### How TTS Works

Text-to-Speech technology analyzes written text and generates synthetic speech that sounds natural and human-like. Conceptually, the process involves:

1. **Text Analysis**: Breaking down the text into words, sentences, and punctuation
2. **Pronunciation**: Determining how each word should be pronounced
3. **Prosody Generation**: Adding natural rhythm, intonation, and pacing
4. **Audio Synthesis**: Creating the actual audio waveform

### 📝 Tips for Great Audio Stories

✅ **Good for TTS**:
- Clear, grammatically correct sentences
- Proper punctuation (helps with pacing and dramatic pauses)
- Standard words and common expressions
- Well-formatted text with natural flow
- Descriptive language that paints a picture

⚠️ **Problematic for TTS**:
- Excessive special characters (@, #, $, etc.)
- All caps text (may sound like shouting)
- Unclear formatting or run-on sentences
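
The tips above can be applied automatically before text is sent to the API. The following is a minimal sketch, not part of the OpenAI SDK: `clean_for_tts` is a hypothetical helper, and the set of symbols it strips is an assumption you may want to adjust. Note that lowering all-caps words also affects acronyms such as "NASA", so treat it as a starting point rather than a complete normalizer.

```python
import re

def clean_for_tts(text):
    """Tidy text before text-to-speech (illustrative helper, not an SDK function)."""
    # Replace symbols that tend to be read aloud awkwardly with a space
    text = re.sub(r"[@#$%^*_~|<>\\]", " ", text)
    # Turn ALL-CAPS words of three or more letters into capitalized words
    text = re.sub(r"\b[A-Z]{3,}\b", lambda m: m.group(0).capitalize(), text)
    # Collapse runs of whitespace left behind by the replacements
    return re.sub(r"\s+", " ", text).strip()

sample = "WOW!!  The #dragon said: @hello $$$"
print(clean_for_tts(sample))  # Wow!! The dragon said: hello
```

Punctuation is left untouched on purpose, since it helps with pacing and pauses.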

### 🔧 Key TTS API Parameters

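- **`model`**: `tts-1` (standard quality, optimized for speed and cost); `tts-1-hd` is the higher-quality alternative
- **`voice`**: This notebook uses six voices (alloy, echo, fable, onyx, nova, shimmer)
- **`input`**: The text to convert to speech (up to 4,096 characters per request)
- **`response_format`**: Audio format; `mp3` is the default, and `opus`, `aac`, `flac`, `wav`, and `pcm` are also accepted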
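
To see how these parameters fit together, here is a small sketch that validates them and assembles the keyword arguments for `client.audio.speech.create`. The helper name `build_speech_request` is hypothetical, and the voice set is limited to the six voices used in this notebook (an assumption; the API may accept others).

```python
# Voices used in this notebook and the audio formats accepted by the speech endpoint
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}

def build_speech_request(text, voice="alloy", model="tts-1", response_format="mp3"):
    """Validate parameters and return kwargs for a speech request (hypothetical helper)."""
    if not text or not text.strip():
        raise ValueError("input text must not be empty")
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if response_format not in FORMATS:
        raise ValueError(f"unknown response_format: {response_format!r}")
    return {
        "model": model,
        "voice": voice,
        "input": text,
        "response_format": response_format,
    }

request = build_speech_request("Once upon a time...", voice="fable")
print(request)
```

With the client from the setup section, the request could then be sent as `client.audio.speech.create(**request)`. Validating locally catches typos in the voice name before any API call is made.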

## 🎯 Example 1: Simple "Hello World"

Let's start with a simple test to verify everything works!

In [None]:
# Simple test text
test_text = "Hello, this is a test of the text-to-speech system."

# Generate audio directly to memory
response = client.audio.speech.create(
    model="tts-1",
    voice="nova",  # Friendly voice for testing
    input=test_text
)

print("✅ Test audio generated successfully!")

# Play the audio directly in the notebook (no file needed!)
print("\n🎧 Listen to the audio:")
display(Audio(response.content))
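
If you would rather keep the audio, the same bytes can be written to disk. This is a minimal sketch: `save_audio` is a hypothetical helper, and the demo uses placeholder bytes in a temporary folder so that it runs without an API call.

```python
import tempfile
from pathlib import Path

def save_audio(audio_bytes, filename="story.mp3"):
    """Write raw audio bytes to a file and return its path (hypothetical helper)."""
    path = Path(filename)
    path.write_bytes(audio_bytes)
    return path

# Demo with placeholder bytes; these are not real audio data
with tempfile.TemporaryDirectory() as tmp:
    saved = save_audio(b"placeholder audio bytes", Path(tmp) / "hello.mp3")
    print(f"Saved {saved.name} ({saved.stat().st_size} bytes)")
```

In the notebook, you would pass the real audio instead, for example `save_audio(response.content, "hello.mp3")`.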

## 📖 Example 2: AI-Generated Scary Story

Let's create an entertaining story using AI! We'll use a **two-step process**:
1. **Generate a fun, scary story** using `gpt-5-nano` with a creative prompt
2. **Convert that story to audio** using `tts-1` with a dramatic voice

This example shows how you can combine text generation with text-to-speech to create engaging audio content.

### Step 1: Generate the Story with AI

In [None]:
# Creative story prompt
story_prompt = "Write a humorous scary tale about a brave programmer who did not fear AI. Make it entertaining and suitable for children. Keep it short and engaging - around 150 words maximum."

print("🔄 Generating scary story...\n")

# Generate the story using gpt-5-nano
story_response = client.responses.create(
    model=OPENAI_MODEL,
    input=[
        {
            "role": "developer",
            "content": "You are a creative storyteller who writes entertaining, slightly spooky but humorous stories for children. Your stories are engaging, fun, and have a positive message at the end. Keep stories concise."
        },
        {
            "role": "user",
            "content": story_prompt
        }
    ]
)

# Extract generated story text
scary_story = story_response.output_text

# Display generated story
print("="*60)
print("📚 Generated Story: The Brave Programmer")
print("="*60)
print(scary_story)
print("\n" + "="*60)
print(f"📊 Character count: {len(scary_story)} characters")
print(f"📊 Words: ~{len(scary_story.split())} words")
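
The character count matters because the speech endpoint accepts at most 4,096 characters of input per request. A 150-word story fits easily, but longer texts need to be split. Below is a sketch of one way to do that, breaking at sentence boundaries where possible; `split_for_tts` is a hypothetical helper, and the sentence-splitting regex is deliberately simple.

```python
import re

MAX_TTS_CHARS = 4096  # per-request input limit of the speech endpoint

def split_for_tts(text, limit=MAX_TTS_CHARS):
    """Split text into chunks of at most `limit` characters (hypothetical helper)."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut into fixed-size pieces
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate  # the sentence still fits in the current chunk
        else:
            chunks.append(current)  # close the current chunk and start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks

# A tiny limit makes the behavior easy to see
print(split_for_tts("One. Two! Three? Four.", limit=10))
# ['One. Two!', 'Three?', 'Four.']
```

Each chunk would then be sent in its own `client.audio.speech.create` call and played in order.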

### Step 2: Convert Text to Audio with TTS-1

In [None]:
print("\n🔊 Converting story to audio...\n")

# Generate audio from the story (no file needed!)
audio_response = client.audio.speech.create(
    model="tts-1",
    voice="onyx",  # Deep, dramatic voice perfect for scary stories!
    input=scary_story
)

# Display results
print("="*60)
print("✅ Scary Story Audio Created!")
print("="*60)
print("🎤 Voice used: onyx (deep, dramatic)")
print("🔊 Format: MP3")
print("\n💡 Try changing the voice to 'fable' or 'nova' for a different feel!")

# Play the audio directly in the notebook
print("\n🎧 Listen to the scary story:")
display(Audio(audio_response.content))

## 🎨 Exercise: Create Your Own Story!

Now it's your turn! In this exercise, you'll:
1. **Write your own creative prompt** - Think of a fun story idea (fairytale, adventure, funny tale, etc.)
2. **Choose a voice** - Pick the best voice for your story's mood
3. **Generate and listen** - Create your own audio story!

### 💡 Story Ideas to Get You Started:
- A funny tale about a dragon who's afraid of mice
- A bedtime story about a sleepy cloud
- An adventure of a curious robot exploring Earth
- A silly story about vegetables having a party
- A tale about a wizard who keeps forgetting spells

### 🎤 Voice Recommendations:
- **`fable`** - Perfect for warm, classic fairytales
- **`onyx`** - Great for dramatic or mysterious stories
- **`nova`** - Ideal for upbeat, energetic tales
- **`shimmer`** - Best for calm, soothing bedtime stories
- **`alloy`** - Good all-around choice for any story
- **`echo`** - Clear voice for straightforward narration

### 📝 Pro Tip:
To control story length, **specify the desired word count in your prompt** (e.g., "Keep it around 150 words" or "Write a short 100-word story").

### Step 1: Write Your Story Prompt and Choose a Voice

In [None]:
# 🎨 YOUR TURN! Customize these values:

# Write your creative story prompt here (specify length in the prompt)
your_story_prompt = "Write a funny bedtime story about a dragon who's afraid of mice. Make it entertaining and suitable for children. Keep it around 150 words."

# Choose your preferred voice: "alloy", "echo", "fable", "onyx", "nova", or "shimmer"
your_voice = "fable"

print(f"🎯 Your prompt: {your_story_prompt}")
print(f"🎤 Selected voice: {your_voice}")
print("\n" + "="*60)

### Step 2: Generate Your Story with AI

In [None]:
print("🔄 Generating your story...\n")

# Generate the story using gpt-5-nano
your_story_response = client.responses.create(
    model=OPENAI_MODEL,
    input=[
        {
            "role": "developer",
            "content": "You are a creative storyteller who writes entertaining stories for children. Your stories are engaging, fun, and have a positive message. Keep stories concise."
        },
        {
            "role": "user",
            "content": your_story_prompt
        }
    ]
)

# Extract the generated story
your_story = your_story_response.output_text

# Display the story
print("="*60)
print("📚 Your Generated Story")
print("="*60)
print(your_story)
print("\n" + "="*60)
print(f"📊 Character count: {len(your_story)} characters")
print(f"📊 Words: ~{len(your_story.split())} words")

### Step 3: Convert Your Story to Audio

In [None]:
print("\n🔊 Converting your story to audio...\n")

# Generate audio from your story
your_audio_response = client.audio.speech.create(
    model="tts-1",
    voice=your_voice,
    input=your_story
)

# Display results
print("="*60)
print("✅ Your Audio Story is Ready!")
print("="*60)
print(f"🎤 Voice used: {your_voice}")
print("🔊 Format: MP3")
print("\n💡 Experiment! Try changing the voice or prompt and run again!")

# Play the audio directly in the notebook
print("\n🎧 Listen to your story:")
display(Audio(your_audio_response.content))

## 🎯 Key Takeaways & Tips

### ✅ What You've Learned

1. **Text-to-Speech Basics** - How to convert text into natural-sounding audio
2. **Voice Selection** - Different voices create different moods and atmospheres
3. **AI Story Generation** - Combining GPT models with TTS for creative content
4. **Experimentation** - How to adjust prompts, voices, and parameters

### 💡 Tips for Better Audio Stories

✅ **Writing Great Prompts:**
- Be specific about the story type (funny, scary, educational, etc.)
- Mention the target audience (children, adults, etc.)
- Specify the desired length, ideally as a word count (e.g., "around 150 words")
- Include the mood you want (cheerful, mysterious, calm, etc.)
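
These four ingredients can be combined into a reusable prompt template. The sketch below is one possible wording, not a required format; `build_story_prompt` and its parameter names are hypothetical.

```python
def build_story_prompt(topic, story_type="funny", audience="children",
                       mood="cheerful", word_count=150):
    """Combine story type, audience, mood, and length into one prompt (hypothetical helper)."""
    return (
        f"Write a {story_type} story about {topic}. "
        f"Make it suitable for {audience} and keep the mood {mood}. "
        f"Keep it around {word_count} words."
    )

prompt = build_story_prompt("a dragon who's afraid of mice", mood="playful")
print(prompt)
```

The resulting string can be used as the user message when generating a story, in place of a hand-written prompt.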

✅ **Choosing the Right Voice:**
- **Fairytales & warm stories** → `fable` or `shimmer`
- **Dramatic & scary stories** → `onyx`
- **Fun & energetic tales** → `nova`
- **Calm bedtime stories** → `shimmer`
- **General narration** → `alloy` or `echo`
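
The recommendations above can be captured in a small lookup table so that a voice is picked from the story's mood. The mood labels and the `pick_voice` helper are hypothetical; the pairings simply mirror the list above.

```python
# Mood-to-voice pairings taken from the recommendations in this notebook
VOICE_BY_MOOD = {
    "fairytale": "fable",
    "scary": "onyx",
    "energetic": "nova",
    "bedtime": "shimmer",
    "general": "alloy",
}

def pick_voice(mood, default="alloy"):
    """Return the suggested voice for a mood, or the default if the mood is unknown."""
    return VOICE_BY_MOOD.get(mood.strip().lower(), default)

print(pick_voice("Scary"))    # onyx
print(pick_voice("western"))  # alloy
```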

✅ **Text Formatting for Better Audio:**
- Use proper punctuation for natural pauses
- Write in complete sentences
- Avoid excessive special characters
- Keep sentences clear and well-structured

### 🚀 Ideas for Further Exploration

- **Create a series** - Generate multiple connected stories
- **Try different genres** - Adventure, mystery, comedy, educational tales
- **Multi-voice stories** - Use different voices for different characters
- **Multi-language stories** - Generate stories in different languages
- **Combine with images** - Create illustrated audio stories
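
As a starting point for the multi-voice idea, the sketch below splits a script written as `Speaker: line` into (voice, text) pairs. The casting in `CHARACTER_VOICES`, the script format, and the `assign_voices` helper are all assumptions made for illustration.

```python
# Hypothetical casting: which voice reads which character
CHARACTER_VOICES = {"narrator": "fable", "dragon": "onyx", "mouse": "nova"}

def assign_voices(script, voices=CHARACTER_VOICES, default_voice="alloy"):
    """Turn 'Speaker: line' text into (voice, text) pairs (hypothetical helper)."""
    segments = []
    for raw_line in script.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        speaker, sep, spoken = line.partition(":")
        key = speaker.strip().lower()
        if sep and key in voices:
            segments.append((voices[key], spoken.strip()))
        else:
            # Lines without a known speaker label are read by the default voice
            segments.append((default_voice, line))
    return segments

script = """Narrator: Once upon a time, a dragon met a mouse.
Dragon: Eek! Please don't squeak at me!
Mouse: I only wanted to say hello."""

for voice, text in assign_voices(script):
    print(f"{voice}: {text}")
```

Each pair would then become its own `client.audio.speech.create` call with the matching `voice`. Joining the resulting clips into a single file is left out of this sketch.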

### 🎉 Congratulations!

You've successfully learned how to:
- ✅ Set up and use OpenAI's TTS API
- ✅ Generate creative stories with AI
- ✅ Convert text to natural-sounding audio
- ✅ Experiment with different voices and prompts
- ✅ Create engaging audio content for entertainment and education

**Keep experimenting and have fun creating amazing audio stories!** 🎊