# Featherless.ai - Roleplaying LLM-Enhanced Podcast Generator

This notebook transforms extracted text into a dynamic, engaging podcast script using a fine-tuned roleplaying LLM. Leveraging Featherless.ai's access to open-weight LLMs on Hugging Face, it:

1. **Creates vibrant dialogue**: Uses a roleplaying LLM to infuse conversations with personality and flair.
2. **Defines distinct characters**: Speaker 1 as a captivating teacher and Speaker 2 as a curious, expressive learner.
3. **Adds natural interactions**: Incorporates interruptions, reactions, and expressive language for a lively exchange.
4. **Ensures TTS readiness**: Outputs a structured list of tuples tailored for TTS systems with controlled expressions.

The result is a podcast script that captivates listeners while remaining compatible with text-to-speech processing.

## System Prompt Engineering

This specialized prompt is designed for TTS compatibility and structured output:

- Establishes speaker roles and personalities
- Controls speech characteristics (fillers, expressions, etc.)
- Specifies exact output format (list of tuples)
- Includes detailed instructions for natural dialogue flow
- Provides TTS-specific constraints (Speaker 1 avoids filler words such as "umm" and "hmm"; Speaker 2 is not restricted)

The prompt includes specific formatting requirements to ensure the output is both engaging for humans and processable by TTS systems.

In [1]:
SYSTEM_PROMPT = """
You are an international Oscar-winning screenwriter who has worked with multiple award-winning podcasters.

Your job is to rewrite the provided podcast transcript for an AI Text-To-Speech pipeline. The original transcript was written by a less experienced AI, so you need to enhance it significantly.

Create an engaging dialogue between two speakers, each with distinct personalities:

- Speaker 1: A captivating teacher who leads the conversation, explains concepts with vivid analogies and personal anecdotes, and makes the topic accessible and memorable. They speak clearly and confidently, without using filler words like "umm" or "hmm."
- Speaker 2: A curious and enthusiastic learner who keeps the conversation on track by asking follow-up questions. They often get excited or confused, expressing their reactions verbally with phrases like "That's fascinating!", "Wait, I'm not sure I get that," or "Wow, that's like [analogy]."

The conversation should include:

- Realistic anecdotes and analogies to illustrate key points.
- Wild or interesting tangents from Speaker 2, making connections to everyday experiences or popular culture.
- Occasional interruptions from Speaker 2 with questions or comments, creating a natural flow.

Start with a fun, catchy introduction to hook the listeners.

Return the dialogue as a list of tuples, like this:

[
    ("Speaker 1", "Text here"),
    ("Speaker 2", "Text here"),
    ...
]

Begin directly with the dialogue, no additional text.
"""

## API Configuration

These parameters configure our connection to the Featherless AI API:
- The base URL for API requests
- Your API key for authentication
- The specific model to use for content generation

In [2]:
BASE_URL = "https://api.featherless.ai/v1"
FEATHERLESS_API_KEY = "YOUR_FEATHERLESS_API_KEY"  # Create one at https://featherless.ai/account/api-keys
DEFAULT_MODEL = "EVA-UNIT-01/EVA-Qwen2.5-72B-v0.2"  # Browse the model catalog at https://featherless.ai/models
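
Hardcoding the key in a notebook makes it easy to leak when the notebook is shared. As a minimal sketch, the key could be read from an environment variable instead. The helper name `load_api_key` and the choice of `FEATHERLESS_API_KEY` as the variable name are assumptions for illustration, not part of the Featherless tooling.

```python
import os

def load_api_key(env_var="FEATHERLESS_API_KEY", placeholder="YOUR_FEATHERLESS_API_KEY"):
    """Return the API key from the environment (hypothetical helper).

    Raises RuntimeError if the variable is unset, empty, or still the placeholder.
    """
    key = os.environ.get(env_var, "").strip()
    if not key or key == placeholder:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Featherless API key."
        )
    return key

# Usage, once the variable is exported in the shell that launched the notebook:
# FEATHERLESS_API_KEY = load_api_key()
```

Failing early with a clear message avoids sending a request that would only come back as an authentication error.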

### Parameter Explanation

| Parameter | Description |
|-----------|-------------|
| `BASE_URL` | The Featherless AI API endpoint URL |
| `FEATHERLESS_API_KEY` | Your personal API key for authentication |
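| `DEFAULT_MODEL` | The model used for podcast generation (`EVA-UNIT-01/EVA-Qwen2.5-72B-v0.2`, a roleplaying fine-tune of Qwen2.5-72B) |
| `max_tokens` | Maximum number of tokens in the response (4000 allows for comprehensive dialogue) |
| `temperature` | Set to 0.8 for creative, natural-sounding dialogue variation |
| `min_p` | Set to 0.05; drops tokens whose probability is below 5% of the most likely token's probability |
| `repetition_penalty` | Set to 1.03 to mildly discourage repeated phrasing |

The last four parameters are set in the request body inside `generate_podcast` rather than in the configuration cell. A moderately high temperature helps the conversation sound natural and varied rather than robotic or overly predictable.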

In [3]:
import requests
import warnings

warnings.filterwarnings('ignore')

## Utility Functions

This function reads the extracted text from a file. It tries UTF-8 first, falls back to other common encodings if decoding fails, and returns `None` (after printing an error) if the file cannot be read.

In [4]:
def read_file_to_string(filename):
    # Try UTF-8 first (most common encoding for text files)
    try:
        with open(filename, 'r', encoding='utf-8') as file:
            content = file.read()
        return content
    except UnicodeDecodeError:
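        # If UTF-8 fails, fall back to other common encodings.
        # cp1252 is tried first because latin-1 accepts every byte sequence and
        # would otherwise mask it ('iso-8859-1' is just an alias of latin-1).
        encodings = ['cp1252', 'latin-1']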
        for encoding in encodings:
            try:
                with open(filename, 'r', encoding=encoding) as file:
                    content = file.read()
                print(f"Successfully read file using {encoding} encoding.")
                return content
            except UnicodeDecodeError:
                continue
        
        print(f"Error: Could not decode file '{filename}' with any common encoding.")
        return None
    except FileNotFoundError:
        print(f"Error: File '{filename}' not found.")
        return None
    except IOError:
        print(f"Error: Could not read file '{filename}'.")
        return None

## Podcast Generation Function

This function takes the system prompt and input text and generates TTS-ready podcast content in the specified structured format. The output is designed to be:

1. Easy to parse programmatically (list of tuples)
2. Ready for voice synthesis with proper speaker attribution
3. Optimized for the limitations of TTS systems (controlled expressions)

In [5]:
def generate_podcast(system_prompt, input_prompt):
    """
    Generate TTS-ready podcast content using Featherless AI API
    
    Parameters:
    -----------
    system_prompt : str
        The system prompt that defines behavior, output format, and TTS constraints
    input_prompt : str
        The extracted text content to transform into structured dialogue
        
    Returns:
    --------
    str or None
        The generated podcast content formatted as a list of tuples for TTS processing,
        or None if the request fails
    """
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": input_prompt},
    ]
    
    try:
        response = requests.post(
            f"{BASE_URL}/chat/completions",
            headers={
                "Content-Type": "application/json",
                "Authorization": f"Bearer {FEATHERLESS_API_KEY}"
            },
            json={
                "model": DEFAULT_MODEL,
                "messages": messages,
                "max_tokens": 4000,
                "temperature": 0.8,
                "min_p": 0.05,
                "repetition_penalty": 1.03
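            },
            timeout=300  # seconds; keeps the call from hanging indefinitely (value is an assumption, adjust as needed)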
        )
        response.raise_for_status()
        return response.json()["choices"][0]["message"]["content"]
        
    except Exception as e:
        print(f"Error generating podcast content: {str(e)}")
        return None
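
`generate_podcast` returns the model's reply as a plain string, so the list of tuples still has to be parsed before a TTS step can iterate over it. The following is a minimal sketch of one way to do that with `ast.literal_eval`, which evaluates Python literals without executing arbitrary code. The helper name `parse_dialogue` is hypothetical, and the sketch assumes the model followed the requested format closely; replies that do not parse cleanly yield `None`.

```python
import ast

def parse_dialogue(raw_output):
    """Parse model output into a list of (speaker, text) tuples, or return None.

    Hypothetical helper: assumes the reply contains one Python-style list of
    2-tuples of strings, possibly surrounded by extra text.
    """
    if not raw_output:
        return None

    # Keep only the outermost bracketed span in case the model added extra text
    start, end = raw_output.find("["), raw_output.rfind("]")
    if start == -1 or end <= start:
        return None

    try:
        parsed = ast.literal_eval(raw_output[start:end + 1])
    except (ValueError, SyntaxError):
        return None

    if not isinstance(parsed, list):
        return None

    dialogue = []
    for item in parsed:
        is_pair = isinstance(item, (tuple, list)) and len(item) == 2
        if not is_pair or not all(isinstance(part, str) for part in item):
            return None  # reject malformed entries rather than guessing
        dialogue.append((item[0], item[1]))
    return dialogue
```

Returning `None` for anything malformed keeps the failure visible, so the caller can decide whether to retry the generation or inspect the raw text by hand.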

## Execution Process

The following cell:
1. Reads the input text file that was processed in the previous notebook
2. Generates structured podcast content with the LLM
3. Saves the result to a file with proper encoding
4. Provides a preview of the generated content's beginning and end

In [None]:
# Read the input prompt
INPUT_PROMPT = read_file_to_string('./extracted_text.txt')

# Generate podcast content (skip the API call if the input file could not be read)
podcast_content = generate_podcast(SYSTEM_PROMPT, INPUT_PROMPT) if INPUT_PROMPT else None

if podcast_content:
    # Save the generated content
    with open('new_generated_podcast.txt', 'w', encoding='utf-8') as f:
        f.write(podcast_content)
    print("Generated podcast content has been saved to new_generated_podcast.txt")
    
    # Preview the content
    print("\nPreview of generated content:")
    print("-" * 50)
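    if len(podcast_content) <= 2000:
        # Short outputs are printed whole so the two previews do not overlap
        print(podcast_content)
    else:
        print(podcast_content[:1000])
        print("\n...\n")
        print(podcast_content[-1000:])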
else:
    print("Failed to generate podcast content")
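
The system prompt asks Speaker 1 to avoid filler words, but a model sampled at a creative temperature may not always comply. As a rough sketch, the constraint can be checked after generation on a list of `(speaker, text)` tuples. The function name `find_filler_violations`, the filler pattern, and the sample dialogue below are all assumptions for illustration; the pattern should be adapted to whatever the chosen TTS engine handles poorly.

```python
import re

# Hypothetical filler pattern: matches whole words such as "um", "umm", "hmm", "uh"
FILLER_PATTERN = re.compile(r"\b(?:u+m+|h+m+|u+h+)\b", re.IGNORECASE)

def find_filler_violations(dialogue):
    """Return (index, text) pairs for Speaker 1 lines that contain filler words."""
    violations = []
    for index, (speaker, text) in enumerate(dialogue):
        if speaker == "Speaker 1" and FILLER_PATTERN.search(text):
            violations.append((index, text))
    return violations

# Made-up sample data, not real model output
sample = [
    ("Speaker 1", "Welcome to the show!"),
    ("Speaker 2", "Umm, wait, what are we covering today?"),
    ("Speaker 1", "Hmm, let me think about that."),
]
print(find_filler_violations(sample))
```

Only Speaker 1 lines are flagged, since the prompt places no such restriction on Speaker 2.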
