# Audio Generation from Paper Summaries

This notebook demonstrates how to use the `OpenAITTSProvider` to generate audio podcasts from paper summaries.

## Setup

In [None]:
import sys
from pathlib import Path
import json
import os

# Add parent directory to path
sys.path.append(str(Path.cwd().parent))

from src.services import OpenAITTSProvider
from src.models import Paper

## Load a Paper

First, let's load a paper from the downloads folder.

In [None]:
# Find metadata files in downloads folder
download_dir = Path("downloads")
metadata_files = sorted(download_dir.glob("*.json"))  # sorted so "first" is deterministic

if not metadata_files:
    print("No metadata files found in downloads folder!")
    print("Please download a paper first using the arxiv_example.ipynb notebook")
else:
    # Load the first paper's metadata
    metadata_path = metadata_files[0]
    
    with open(metadata_path, 'r', encoding='utf-8') as f:
        metadata = json.load(f)
    
    paper = Paper.from_dict(metadata)
    
    print(f"Found {len(metadata_files)} paper(s) in downloads folder")
    print(f"\nPaper: {paper.title}")
    print(f"Authors: {', '.join([a.name for a in paper.authors])}")
    print(f"Published: {paper.published.strftime('%B %Y')}")
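
Taking `metadata_files[0]` loads whichever file comes first in the listing, which is not necessarily the paper you downloaded last. If you would rather pick the newest file, a minimal sketch could look like this. The helper name `newest_metadata_file` is hypothetical and not part of `src`.

```python
from pathlib import Path
from typing import Optional


def newest_metadata_file(download_dir: Path) -> Optional[Path]:
    """Return the most recently modified *.json file, or None if there is none."""
    candidates = list(download_dir.glob("*.json"))
    if not candidates:
        return None
    # st_mtime is the last-modification time in seconds since the epoch
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

You could then use `metadata_path = newest_metadata_file(Path("downloads"))` and handle the `None` case the same way the cell above handles an empty list.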

## Find Summary File

Check whether a summary exists for this paper. If not, we'll fall back to a short sample text.

In [None]:
summary_text = None

if metadata_files:
    # Find corresponding summary file
    summaries_dir = Path("summaries")
    base_filename = paper.pdf_filename.replace(".pdf", "")
    summary_path = summaries_dir / f"summary_{base_filename}.txt"
    
    if not summary_path.exists():
        print(f"Summary not found at: {summary_path}")
        print("Please generate a summary first using the llm_summarization_example.ipynb notebook")
        print("\nUsing sample text for demonstration...")
        summary_text = "This is a demonstration of text-to-speech generation for research paper summaries."
    else:
        print(f"Found summary at: {summary_path}")
        
        # Load summary (assumes the file was saved as UTF-8)
        with open(summary_path, 'r', encoding='utf-8') as f:
            summary_text = f.read()
        print(f"Summary length: {len(summary_text):,} characters")
        print("\nFirst 200 characters:")
        print(summary_text[:200] + "...")
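
The lookup above relies on a naming convention: `summary_<pdf name without .pdf>.txt` inside `summaries/`. The same convention can be sketched as a small helper, which also makes one edge case visible: `str.replace(".pdf", "")` removes every occurrence of `.pdf`, not just the trailing extension. The function name `summary_path_for` and the example filenames are hypothetical.

```python
from pathlib import Path


def summary_path_for(pdf_filename: str, summaries_dir: Path = Path("summaries")) -> Path:
    """Map e.g. '2301.00001.pdf' to summaries/summary_2301.00001.txt."""
    # Strip only a trailing ".pdf"; str.replace would also remove ".pdf" mid-name
    if pdf_filename.endswith(".pdf"):
        base = pdf_filename[: -len(".pdf")]
    else:
        base = pdf_filename
    return summaries_dir / f"summary_{base}.txt"
```

For typical filenames both approaches give the same result; they only differ when `.pdf` appears somewhere other than the end of the name.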

## Initialize TTS Provider

The OpenAI TTS service requires an `OPENAI_API_KEY` environment variable to be set.

Available voices: `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`

Available models:
- `tts-1`: Standard quality, faster
- `tts-1-hd`: Higher quality, slower
- `gpt-4o-mini-tts`: Newer GPT-4o mini based model (used in the cell below)

In [None]:
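# Initialize TTS provider with OpenAI
openai_api_key = os.getenv("OPENAI_API_KEY")
if not openai_api_key:
    # Fail early instead of reporting success with a missing key
    raise RuntimeError("OPENAI_API_KEY is not set; export it before running this cell")

tts_provider = OpenAITTSProvider(
    api_key=openai_api_key,
    model="gpt-4o-mini-tts",  # GPT-4o mini TTS model
    voice="nova"              # Default voice (warm, engaging)
)

print("TTS Provider initialized successfully")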
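
A typo in the voice or model name would otherwise only surface once a request is sent. As a rough sketch, the settings can be checked locally first. The sets below contain only the voices and models named in this notebook (the API may accept others), and `check_tts_config` is a hypothetical helper, not part of `OpenAITTSProvider`.

```python
from typing import Optional

# Voices and models named in this notebook; the API may accept others.
KNOWN_VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
KNOWN_MODELS = {"tts-1", "tts-1-hd", "gpt-4o-mini-tts"}


def check_tts_config(api_key: Optional[str], model: str, voice: str) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not api_key:
        problems.append("OPENAI_API_KEY is not set")
    if model not in KNOWN_MODELS:
        problems.append(f"unrecognized model: {model!r}")
    if voice not in KNOWN_VOICES:
        problems.append(f"unrecognized voice: {voice!r}")
    return problems
```

Calling `check_tts_config(os.getenv("OPENAI_API_KEY"), "gpt-4o-mini-tts", "nova")` before constructing the provider gives you a readable list of issues instead of a failed request.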

## Generate Audio

Now let's generate audio from the paper summary.

In [None]:
if summary_text and metadata_files:
    print("Generating audio...\n")
    
    # Create output directory
    audio_dir = Path("audio")
    audio_dir.mkdir(exist_ok=True)
    
    # Generate filename
    base_filename = paper.pdf_filename.replace(".pdf", "")
    output_path = audio_dir / f"{base_filename}_podcast.mp3"
    
    # Generate audio
    audio_file = tts_provider.generate_audio(
        text=summary_text,
        output_path=str(output_path)
    )
    
    print("\n✓ Audio generated successfully!")
    print(f"Saved to: {audio_file}")
    print(f"\nFile size: {Path(audio_file).stat().st_size / 1024:.1f} KB")
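
Full paper summaries can be long, and the OpenAI speech endpoint caps the input length of a single request. This notebook does not show whether `generate_audio` splits long text internally, so here is a minimal sketch of sentence-aware chunking in case you need to do it yourself. The 4096-character default is an assumption; check the current API reference for the model you use. `split_for_tts` is a hypothetical helper.

```python
import re


def split_for_tts(text: str, max_chars: int = 4096) -> list:
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    # Split after ., ! or ? when followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is itself too long
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to `tts_provider.generate_audio` with its own output path. Joining the resulting audio files into one episode is a separate step that is not covered here.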

## Play Audio (if in Jupyter with IPython)

If you're running in a Jupyter environment with IPython display support, you can play the audio directly.

In [None]:
try:
    from IPython.display import Audio, display
    
    if metadata_files and Path(output_path).exists():
        print(f"Playing: {output_path}\n")
        display(Audio(str(output_path)))
except ImportError:
    print("IPython not available. Open the audio file manually to listen.")