# 🤖 Agentic Podcast Generator Interactive Tutorial

Welcome to the **Agentic Podcast Generator** interactive tutorial! This notebook walks you through a multi-agent system that turns any topic into a LinkedIn post and a voice dialog script.

## What is the Agentic Podcast Generator?

The Agentic Podcast Generator is an AI-powered system that uses multiple specialized agents working together to:
- Research topics comprehensively
- Generate SEO-optimized keywords and hashtags
- Create engaging LinkedIn posts
- Convert posts into conversational voice scripts

The system follows a hierarchical agent architecture where a **Master Agent** orchestrates several **Sub-agents**, each specializing in different tasks.



## 📁 Step 1: Explore the Project Structure

Let's start by examining the project structure to understand how the code is organized.

In [1]:
# Let's explore the project structure
import os
from pathlib import Path

# Get the current working directory
current_dir = Path.cwd()
print(f"Current directory: {current_dir}")

# List all files and directories
print("\nProject structure:")
for item in sorted(current_dir.iterdir()):
    if item.is_file():
        print(f"📄 {item.name}")
    else:
        print(f"📁 {item.name}/")
        # Show contents of key directories
        if item.name in ['agents', 'config', 'database', 'services', 'utils']:
            for subitem in sorted(item.iterdir()):
                if subitem.is_file():
                    print(f"   📄 {item.name}/{subitem.name}")
                else:
                    print(f"   📁 {item.name}/{subitem.name}/")

Current directory: /Users/rj/Programs/agentic-podcast-generator

Project structure:
📄 .env
📄 .env.example
📁 .git/
📄 .gitignore
📁 .opencode/
📁 .pytest_cache/
📁 .venv/
📄 README.md
📁 __pycache__/
📄 agentic_podcast_generator_tutorial.ipynb
📄 agentic_system.db
📁 agents/
   📄 agents/__init__.py
   📁 agents/__pycache__/
   📄 agents/base_agent.py
   📄 agents/master_agent.py
   📁 agents/sub_agents/
📁 config/
   📄 config/__init__.py
   📁 config/__pycache__/
   📄 config/settings.py
📁 database/
   📄 database/__init__.py
   📁 database/__pycache__/
   📄 database/connection.py
   📄 database/models.py
📄 main.py
📄 project_structure.md
📄 pyproject.toml
📄 requirements.txt
📄 sequence.md
📁 services/
   📄 services/__init__.py
   📁 services/__pycache__/
   📄 services/logger.py
   📄 services/openrouter_client.py
   📄 services/search_api.py
   📄 services/web_scraper.py
📄 system_architecture.md
📄 technical_specifications.md
📄 test_system.py
📁 tests/
📁 utils/
   📄 utils/__init__.py
   📁 utils/__pycache__/
   📄 u

## 🏗️ Step 2: Understand the System Architecture

The system consists of the following components:

### 1. Master Agent (Coordinator)
- **Role**: Orchestrates the entire workflow
- **Responsibilities**:
  - Manages parallel execution of sub-agents
  - Handles sequential processing
  - Logs all interactions
- **Important Note**: Does NOT use GPT-5 for analysis (despite some outdated documentation)

### 2. Perplexity AI Research (sonar model)
- **Role**: Performs comprehensive research
- **Capabilities**:
  - Uses Perplexity AI for deep research
  - Provides current facts, trends, and insights
  - Validates information credibility
- **Note**: This replaces the originally planned GPT-5 analysis

### 3. Keyword Generator (Gemini 2.0 Flash)
- **Role**: Generates SEO-optimized keywords and hashtags
- **Model**: google/gemini-2.0-flash-001
- **Output**: Keywords, hashtags, and relevance scores

### 4. Post Generator (Grok-3 Mini)
- **Role**: Creates engaging LinkedIn posts
- **Model**: xai/grok-3-mini
- **Style**: Casual, professional, and engaging

### 5. Voice Dialog Generator (Grok-3 Mini)
- **Role**: Converts posts to conversational voice scripts
- **Model**: xai/grok-3-mini
- **Output**: Natural, podcast-ready dialog

### Database Layer
- **SQLite Database**: Persistent storage for sessions, logs, and results
- **Tables**: sessions, agent_logs, research_results, keywords, generated_content

In [None]:
# Let's examine the main entry point
print("=== MAIN.PY - Entry Point ===")
with open('main.py', 'r') as f:
    content = f.read()
    # Show the first 1000 characters to understand the structure
    print(content[:1000] + "...")
    
print("\n=== MASTER AGENT - Core Orchestrator ===")
with open('agents/master_agent.py', 'r') as f:
    content = f.read()
    # Show the class definition and key methods
    lines = content.split('\n')
    for i, line in enumerate(lines[:40]):
        print(f"{i+1:2d}: {line}")
    print("...")

## 📋 Step 3: Understand the Workflow

Here's how the system processes a topic:

1. **User Input**: Topic provided via command line
2. **Research Phase**: Perplexity AI (sonar model) researches the topic
3. **Parallel Processing**:
   - Keyword Generator creates keywords/hashtags
   - Post Generator creates LinkedIn content
   - Voice Dialog Generator creates podcast script
4. **Output**: Formatted results with all generated content
5. **Logging**: All interactions stored in database

### Data Flow Diagram
```
Topic Input
    ↓
Perplexity AI Research (sonar)
    ↓
Parallel Execution:
├── Keyword Generation (Gemini 2.0 Flash)
├── Post Generation (Grok-3 Mini)
└── Voice Script Generation (Grok-3 Mini)
    ↓
Final Output + Database Logging
```
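
The flow in the diagram can be sketched with `asyncio`. This is a minimal illustration, not the project's code: the four coroutines below are hypothetical stand-ins that return canned strings instead of calling Perplexity AI, Gemini, or Grok, and the result keys simply mirror the ones used in the programmatic usage example later in this tutorial.

```python
import asyncio


# Hypothetical stand-ins for the real agents; each just returns canned data.
async def research(topic: str) -> str:
    await asyncio.sleep(0.01)  # simulates a network call
    return f"research notes on {topic}"


async def generate_keywords(notes: str) -> list[str]:
    await asyncio.sleep(0.01)
    return ["#ai", "#healthcare"]


async def generate_post(notes: str) -> str:
    await asyncio.sleep(0.01)
    return f"LinkedIn post based on: {notes}"


async def generate_voice_script(notes: str) -> str:
    await asyncio.sleep(0.01)
    return f"Voice script based on: {notes}"


async def run_pipeline(topic: str) -> dict:
    # Step 1: research runs first because every later step needs its output.
    notes = await research(topic)

    # Step 2: the three generators run concurrently on the same notes.
    keywords, post, voice = await asyncio.gather(
        generate_keywords(notes),
        generate_post(notes),
        generate_voice_script(notes),
    )
    return {"keywords": keywords, "linkedin_post": post, "voice_dialog": voice}


result = asyncio.run(run_pipeline("AI in Healthcare"))
print(result["linkedin_post"])
```

Inside a notebook cell an event loop is already running, so use `await run_pipeline("AI in Healthcare")` there instead of `asyncio.run(...)`.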

### Important Clarification: GPT-5 References

**GPT-5 is NOT used anywhere in the actual running application.** Despite some outdated references in documentation and tests, the system uses:
- **Perplexity AI (sonar)** for research
- **Gemini 2.0 Flash** for keywords
- **Grok-3 Mini** for posts and voice scripts

The original design planned to use GPT-5 for topic analysis, but the implementation uses Perplexity AI instead.
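
One way to guard against stale model references is a small lookup table plus a check. The dictionary below is an illustrative assumption, not a copy of `config/settings.py`: the keys and the helper `find_unexpected_models` are made up for this sketch, and only the model identifiers come from this tutorial.

```python
# Hypothetical mapping of agent role -> model ID. The keys are invented for
# this sketch; the model identifiers are the ones listed in this tutorial.
AGENT_MODELS = {
    "research": "sonar",
    "keyword_generator": "google/gemini-2.0-flash-001",
    "post_generator": "xai/grok-3-mini",
    "voice_dialog": "xai/grok-3-mini",
}


def find_unexpected_models(models: dict[str, str], banned: str = "gpt-5") -> list[str]:
    """Return the agent names whose model ID mentions the banned substring."""
    return [agent for agent, model in models.items() if banned in model.lower()]


print(find_unexpected_models(AGENT_MODELS))  # prints []
```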

In [None]:
# Let's examine the workflow in the master agent
import re

with open('agents/master_agent.py', 'r') as f:
    content = f.read()
    
# Find the main processing method
pattern = r'async def process_topic_with_research.*?return \{'
match = re.search(pattern, content, re.DOTALL)
if match:
    print("=== MAIN PROCESSING WORKFLOW ===")
    workflow_code = match.group(0)
    lines = workflow_code.split('\n')
    for i, line in enumerate(lines[:30]):
        print(f"{i+1:2d}: {line}")
    print("...")
    
# Show parallel execution part
parallel_pattern = r'# Execute all sub-agents concurrently.*?return_exceptions=True'
match = re.search(parallel_pattern, content, re.DOTALL)
if match:
    print("\n=== PARALLEL EXECUTION ===")
    print(match.group(0))

## 🛠️ Step 4: Check Environment Setup

Let's check if the environment is properly set up for running the system.

In [None]:
# Check Python version
import sys
print(f"Python version: {sys.version}")

# Check if required packages are available
required_packages = ['asyncio', 'json', 'pathlib', 'dotenv']
missing_packages = []

for package in required_packages:
    try:
        __import__(package)
        print(f"✓ {package} is available")
    except ImportError:
        missing_packages.append(package)
        print(f"✗ {package} is missing")

if missing_packages:
    print(f"\nMissing packages: {missing_packages}")
    print("Run: pip install -r requirements.txt")
else:
    print("\n✓ All basic packages are available!")

# Check if .env file exists
from pathlib import Path
env_file = Path('.env')
if env_file.exists():
    print("✓ .env file exists")
else:
    print("✗ .env file missing - copy from .env.example")

# Check if database exists
db_file = Path('agentic_system.db')
if db_file.exists():
    print("✓ Database file exists")
else:
    print("✗ Database file missing - will be created on first run")

## ⚙️ Step 5: Examine Configuration

Let's look at the configuration system to understand how the application is configured.

In [None]:
# Examine the configuration
print("=== CONFIGURATION SETTINGS ===")
try:
    from config.settings import config
    print(f"Log level: {config.log_level}")
    print(f"Database URL: {config.database_url}")
    print(f"OpenRouter API Key: {'Set' if config.openrouter_api_key else 'Not set'}")
    
    # Show model configurations
    print("\nModel configurations:")
    for agent, model_config in config.models.items():
        print(f"  {agent}: {model_config.name}")
        
except Exception as e:
    print(f"Configuration error: {e}")
    print("Make sure .env file is properly configured")

# Show the .env.example file
print("\n=== .ENV.EXAMPLE ===")
try:
    with open('.env.example', 'r') as f:
        print(f.read())
except FileNotFoundError:
    print("No .env.example file found")

## 🔍 Step 6: Examine the Agent Classes

Let's look at the base agent class and one of the sub-agents to understand the architecture.

In [None]:
# Examine the base agent class
print("=== BASE AGENT CLASS ===")
with open('agents/base_agent.py', 'r') as f:
    content = f.read()
    lines = content.split('\n')
    for i, line in enumerate(lines[:40]):
        print(f"{i+1:2d}: {line}")
    print("...")

# Examine a sub-agent
print("\n=== KEYWORD GENERATOR AGENT ===")
with open('agents/sub_agents/keyword_generator.py', 'r') as f:
    content = f.read()
    # Show the execute method
    lines = content.split('\n')
    in_execute = False
    for i, line in enumerate(lines):
        if 'async def execute' in line:
            in_execute = True
            print(f"\nExecute method (line {i+1}):")
        if in_execute:
            print(f"{i+1:3d}: {line}")
            if line.strip() == '':
                break

## 💾 Step 7: Understand the Database Schema

Let's examine the database models to understand how data is stored.

In [None]:
# Examine the database models
print("=== DATABASE MODELS ===")
with open('database/models.py', 'r') as f:
    content = f.read()
    lines = content.split('\n')
    
# Show the main model classes
in_class = False
current_class = ""
for i, line in enumerate(lines):
    if line.startswith('class '):
        current_class = line.split('(')[0].replace('class ', '')
        in_class = True
        print(f"\n{current_class} (line {i+1}):")
    elif in_class and line.strip().startswith('__tablename__'):
        # Strip either quote style outside the f-string, so the nested
        # quotes also parse on Python versions before 3.12
        table_name = line.split('=')[1].strip().strip('"\'')
        print(f"  Table: {table_name}")
    elif in_class and line.strip().startswith('id ='):
        print("  Primary key: id")
        in_class = False  # Done with this class; keep scanning for the next one

# Show database initialization
print("\n=== DATABASE INITIALIZATION ===")
with open('database/connection.py', 'r') as f:
    content = f.read()
    lines = content.split('\n')
    for i, line in enumerate(lines[:30]):
        print(f"{i+1:2d}: {line}")

## 🎯 Step 8: Try Basic Usage

Let's try running the system with a simple example. **Note**: This requires a valid OpenRouter API key in your .env file.

In [None]:
# Check if we can import the main components
print("Testing imports...")
try:
    from agents.master_agent import MasterAgent
    from config.settings import config
    print("✓ Core imports successful")
    
    # Check if API key is configured
    if config.openrouter_api_key:
        print("✓ API key is configured")
        
        # Print the command instead of running it, to avoid API calls
        print("\nTo run the system, use:")
        print("python main.py \"Your Topic Here\"")
        
    else:
        print("✗ API key not configured")
        print("Please set OPENROUTER_API_KEY in your .env file")
        
except Exception as e:
    print(f"✗ Import error: {e}")
    print("Make sure dependencies are installed: pip install -r requirements.txt")

## 🧪 Step 9: Run Tests

Let's run the test suite to see if everything is working correctly.

In [None]:
# Run the tests
import subprocess
import sys

print("Running tests...")
try:
    result = subprocess.run([sys.executable, '-m', 'pytest', 'tests/', '-v'], 
                          capture_output=True, text=True, timeout=60)
    
    print("STDOUT:")
    print(result.stdout)
    
    if result.stderr:
        print("STDERR:")
        print(result.stderr)
        
    print(f"\nReturn code: {result.returncode}")
    
except subprocess.TimeoutExpired:
    print("Tests timed out")
except FileNotFoundError:
    print("pytest not found. Install with: pip install pytest")
except Exception as e:
    print(f"Error running tests: {e}")

## 📊 Step 10: Examine Database Contents

If the database exists, let's examine its contents to see previous runs.

In [None]:
# Examine database contents
import sqlite3
from pathlib import Path

db_path = Path('agentic_system.db')
if db_path.exists():
    print("=== DATABASE CONTENTS ===")
    
    conn = sqlite3.connect(str(db_path))
    cursor = conn.cursor()
    
    # Show all tables
    cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")
    tables = cursor.fetchall()
    print(f"Tables: {[table[0] for table in tables]}")
    
    # Show sessions
    cursor.execute("SELECT COUNT(*) FROM sessions")
    session_count = cursor.fetchone()[0]
    print(f"Total sessions: {session_count}")
    
    if session_count > 0:
        cursor.execute("SELECT id, topic, status, created_at FROM sessions ORDER BY created_at DESC LIMIT 5")
        sessions = cursor.fetchall()
        print("\nRecent sessions:")
        for session in sessions:
            print(f"  ID: {session[0]}, Topic: {session[1]}, Status: {session[2]}, Created: {session[3]}")
    
    # Show agent logs count
    cursor.execute("SELECT COUNT(*) FROM agent_logs")
    log_count = cursor.fetchone()[0]
    print(f"\nTotal agent logs: {log_count}")
    
    conn.close()
    
else:
    print("Database does not exist yet. Run the system to create it.")

## 🔍 Step 11: Check for Outdated GPT-5 References

Let's search for any outdated references to GPT-5 in the codebase to understand what needs to be updated.

In [None]:
# Search for GPT-5 references in the codebase
import os
import re
from pathlib import Path

def search_gpt5_references():
    """Search for GPT-5 references in the codebase"""
    gpt5_files = []
    
    # Walk through all files
    for root, dirs, files in os.walk('.'):
        # Skip .git, __pycache__, etc.
        dirs[:] = [d for d in dirs if not d.startswith('.') and d not in ['__pycache__', 'node_modules']]
        
        for file in files:
            if file.endswith(('.py', '.md', '.ipynb', '.txt')):
                filepath = Path(root) / file
                try:
                    with open(filepath, 'r', encoding='utf-8') as f:
                        content = f.read()
                    # Count occurrences in a single pass
                    matches = re.findall(r'GPT-5|gpt-5', content)
                    if matches:
                        gpt5_files.append((str(filepath), len(matches)))
                except Exception:
                    pass  # Skip files that can't be read
    
    return gpt5_files

# Search for GPT-5 references
gpt5_references = search_gpt5_references()

print("=== GPT-5 REFERENCES FOUND ===")
if gpt5_references:
    print(f"Found {len(gpt5_references)} files with GPT-5 references:")
    for filepath, count in gpt5_references:
        print(f"  {filepath}: {count} reference(s)")
        
    print("\n=== IMPORTANT NOTE ===")
    print("GPT-5 is NOT used in the actual running application!")
    print("These references are in:")
    print("- Documentation files (outdated)")
    print("- Test files (expecting old defaults)")
    print("- Example code (speculative)")
    
    print("\n=== ACTUAL MODELS USED ===")
    print("Research: sonar (Perplexity AI)")
    print("Keywords: google/gemini-2.0-flash-001")
    print("Posts: xai/grok-3-mini")
    print("Voice: xai/grok-3-mini")
    
else:
    print("No GPT-5 references found!")

## 🔧 Step 12: Examine Services and Utilities

Let's look at the supporting services that make the system work.

In [None]:
# Examine the services
print("=== OPENROUTER CLIENT ===")
with open('services/openrouter_client.py', 'r') as f:
    content = f.read()
    lines = content.split('\n')
    for i, line in enumerate(lines[:30]):
        print(f"{i+1:2d}: {line}")
    print("...")

print("\n=== LOGGER SERVICE ===")
with open('services/logger.py', 'r') as f:
    content = f.read()
    lines = content.split('\n')
    for i, line in enumerate(lines[:20]):
        print(f"{i+1:2d}: {line}")
    print("...")

print("\n=== UTILITIES ===")
for util_file in ['utils/retry.py', 'utils/communication.py', 'utils/validators.py']:
    try:
        with open(util_file, 'r') as f:
            content = f.read()
            lines = content.split('\n')
            print(f"\n{util_file}:")
            for i, line in enumerate(lines[:10]):
                if line.strip():
                    print(f"  {i+1:2d}: {line}")
    except FileNotFoundError:
        print(f"{util_file} not found")

## 🚀 Step 13: Advanced Features

### Session Management

- **Resume Sessions**: Use `--session-id` to continue interrupted workflows
- **Session Tracking**: All runs are tracked with unique IDs
- **Status Monitoring**: Check session status in database
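
Checking a session's status comes down to a single query. The sketch below uses an in-memory SQLite database so it runs without the real `agentic_system.db`. The column names match the `sessions` query used in Step 10; the column types, the sample rows, the status values, and the helper `get_session_status` are assumptions made for illustration.

```python
import sqlite3

# In-memory database so the sketch runs without the real agentic_system.db.
# Column types and status values below are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions ("
    "id INTEGER PRIMARY KEY, topic TEXT, status TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO sessions (topic, status, created_at) VALUES (?, ?, ?)",
    [
        ("AI Ethics", "completed", "2024-01-01 10:00:00"),
        ("Machine Learning Trends", "in_progress", "2024-01-02 09:30:00"),
    ],
)


def get_session_status(connection: sqlite3.Connection, session_id: int):
    """Return the status string for a session, or None if the ID is unknown."""
    row = connection.execute(
        "SELECT status FROM sessions WHERE id = ?", (session_id,)
    ).fetchone()
    return row[0] if row else None


print(get_session_status(conn, 2))   # in_progress
print(get_session_status(conn, 99))  # None
```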

### Parallel Processing

- Keyword, Post, and Voice generation run simultaneously
- Wall-clock time is bounded by the slowest of the three calls rather than by their sum
- Error handling ensures partial failures don't break the system
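
The master agent's parallel step passes `return_exceptions=True` (the Step 3 cell searches for it). The following sketch shows what that flag makes possible: separating successful results from failures. The agent names and the two stand-in coroutines are hypothetical, and how the project itself processes the outcomes is not shown here.

```python
import asyncio


async def ok_agent(name: str) -> str:
    return f"{name} finished"


async def failing_agent(name: str) -> str:
    raise RuntimeError(f"{name} failed")


async def run_agents() -> tuple[dict, dict]:
    # Hypothetical agent names; one of them is made to fail on purpose.
    tasks = {
        "keywords": ok_agent("keywords"),
        "post": failing_agent("post"),
        "voice": ok_agent("voice"),
    }
    # return_exceptions=True makes gather return exceptions as values
    # instead of propagating the first one to the caller.
    outcomes = await asyncio.gather(*tasks.values(), return_exceptions=True)

    results, errors = {}, {}
    for name, outcome in zip(tasks, outcomes):
        if isinstance(outcome, Exception):
            errors[name] = str(outcome)
        else:
            results[name] = outcome
    return results, errors


results, errors = asyncio.run(run_agents())
print(results)  # {'keywords': 'keywords finished', 'voice': 'voice finished'}
print(errors)   # {'post': 'post failed'}
```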

### Error Handling and Recovery

- **Retry Mechanisms**: Automatic retries for API failures
- **Fallback Models**: Alternative OpenRouter models if primary fails
- **Graceful Degradation**: Continues with partial results
- **Comprehensive Logging**: All errors logged with context
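
Retries and fallback models can be combined in one loop. This is a sketch under stated assumptions rather than the project's retry utility: `call_with_retry`, `fake_api`, the model names, and the default attempt count and delays are all invented for the example.

```python
import asyncio


async def call_with_retry(call, models, max_attempts=3, base_delay=0.01):
    """Try each model in order; retry each one with exponential backoff.

    `call` is any coroutine function that takes a model name. All names and
    defaults here are illustrative, not the project's actual settings.
    """
    last_error = None
    for model in models:
        for attempt in range(max_attempts):
            try:
                return await call(model)
            except ConnectionError as exc:
                last_error = exc
                # Delay doubles after each failed attempt: 0.01, 0.02, 0.04
                await asyncio.sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"all models failed: {last_error}")


attempts = []


async def fake_api(model: str) -> str:
    # Stand-in for a real API call: the primary model always fails.
    attempts.append(model)
    if model == "primary-model":
        raise ConnectionError("simulated outage")
    return f"response from {model}"


print(asyncio.run(call_with_retry(fake_api, ["primary-model", "fallback-model"])))
# response from fallback-model
```

The primary model is tried three times before the loop moves on, so a transient failure does not immediately trigger the fallback.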

### API Integration

- **OpenRouter API**: Unified access to multiple AI models
- **Perplexity AI**: Advanced research capabilities
- **Async Operations**: Non-blocking API calls
- **Rate Limiting**: Built-in request throttling
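
One simple form of request throttling is to cap how many calls are in flight at once with `asyncio.Semaphore`. The sketch below illustrates that technique only; whether the project throttles this way is an assumption, and `throttled_calls` and `fake_request` are hypothetical names.

```python
import asyncio


async def throttled_calls(n_requests: int, max_concurrent: int) -> int:
    """Run n_requests fake calls, never more than max_concurrent at once.

    Returns the highest number of calls that were in flight simultaneously.
    """
    semaphore = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def fake_request() -> None:
        nonlocal in_flight, peak
        async with semaphore:          # waits here when the limit is reached
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # simulates a non-blocking API call
            in_flight -= 1

    await asyncio.gather(*(fake_request() for _ in range(n_requests)))
    return peak


print(asyncio.run(throttled_calls(n_requests=10, max_concurrent=3)))  # 3
```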

In [None]:
# Show how to use the system programmatically
print("=== PROGRAMMATIC USAGE EXAMPLE ===")
print('''
from agents.master_agent import MasterAgent
import asyncio

async def generate_content():
    async with MasterAgent() as master:
        result = await master.process_topic("AI in Healthcare")
        print(f"Post: {result['linkedin_post']}")
        print(f"Keywords: {result['keywords']}")
        print(f"Voice script: {result['voice_dialog']}")

# Run the async function
asyncio.run(generate_content())
''')

# Show CLI usage
print("\n=== CLI USAGE EXAMPLES ===")
print("# Basic usage")
print("python main.py \"Artificial Intelligence in Healthcare\"")
print()
print("# With verbose logging")
print("python main.py \"Machine Learning Trends\" --verbose")
print()
print("# Resume a session")
print("python main.py \"AI Ethics\" --session-id 123")
print()
print("# JSON output")
print("python main.py \"Blockchain Technology\" --output-format json")

## 🆘 Step 14: Troubleshooting

### Common Issues

1. **OpenRouter API Key Error**
   - Ensure API key is correctly set in `.env`
   - Check account has sufficient credits
   - Verify key permissions

2. **Database Connection Error**
   - Ensure write permissions in project directory
   - Check SQLite installation
   - Verify database file integrity

3. **Research Failures**
   - Check internet connection
   - Verify OpenRouter API access
   - Some topics may have limited research data

4. **Agent Timeouts**
   - Increase timeout settings in config
   - Check API rate limits
   - Reduce concurrent operations
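
For the agent timeout case, it helps to see how a timeout behaves in `asyncio`. The sketch below wraps a slow coroutine in `asyncio.wait_for`; `slow_agent` and `run_with_timeout` are hypothetical stand-ins, and the project's own timeout settings may be applied differently.

```python
import asyncio


async def slow_agent() -> str:
    await asyncio.sleep(0.2)  # pretends to be a slow API call
    return "done"


async def run_with_timeout(timeout_seconds: float) -> str:
    try:
        return await asyncio.wait_for(slow_agent(), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        # wait_for cancels the inner task before raising.
        return "timed out"


print(asyncio.run(run_with_timeout(0.05)))  # timed out
print(asyncio.run(run_with_timeout(1.0)))   # done
```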

### Debug Mode

Run with maximum verbosity:
```bash
python main.py "Topic" --verbose
```

Check logs:
- Console output for real-time debugging
- `agentic_system.log` for persistent logs
- Database tables for detailed execution history

### Getting Help

- Check the logs in `agentic_system.log`
- Review database agent_logs table
- Open issues on GitHub
- Check OpenRouter API status

In [None]:
# Check for common issues
print("=== TROUBLESHOOTING CHECKS ===")

# Check Python version
import sys
from pathlib import Path  # needed by the file checks below

version = sys.version_info
if version >= (3, 9):
    print("✓ Python version is compatible")
else:
    print(f"✗ Python version {version.major}.{version.minor} may be too old. Requires 3.9+")

# Check for required files
required_files = ['main.py', 'agents/master_agent.py', 'config/settings.py']
for file in required_files:
    if Path(file).exists():
        print(f"✓ {file} exists")
    else:
        print(f"✗ {file} missing")

# Check imports
try:
    import asyncio
    import dotenv
    import click
    print("✓ Core dependencies available")
except ImportError as e:
    print(f"✗ Missing dependency: {e}")
    print("Run: pip install -r requirements.txt")

# Check environment
from pathlib import Path
if Path('.env').exists():
    print("✓ .env file exists")
    # Check if it has content
    with open('.env', 'r') as f:
        content = f.read().strip()
        if content:
            print("✓ .env file has content")
        else:
            print("✗ .env file is empty")
else:
    print("✗ .env file missing (copy from .env.example)")

print("\n=== SYSTEM INFO ===")
print(f"Platform: {sys.platform}")
print(f"Python executable: {sys.executable}")
print(f"Current directory: {Path.cwd()}")

## 📚 Step 15: Code Examples and Customization

### Creating a Custom Agent

Here's how to create a new agent that extends the system:

In [None]:
# Show how to create a custom agent
print("=== CUSTOM AGENT TEMPLATE ===")
custom_agent_code = '''
from agents.base_agent import BaseAgent
from typing import Dict, Any
import logging

logger = logging.getLogger(__name__)

class CustomAgent(BaseAgent):
    """Custom agent for specialized tasks."""
    
    def __init__(self):
        super().__init__("custom_agent", "custom")
    
    async def execute(self, input_data: Dict[str, Any]) -> Dict[str, Any]:
        """Execute the custom agent logic."""
        
        self.validate_input(input_data, ["topic"])
        
        topic = input_data["topic"]
        logger.info(f"Custom agent processing topic: {topic}")
        
        # Your custom logic here
        # For example, generate a summary, analysis, etc.
        
        result = {
            "topic": topic,
            "custom_output": f"Custom analysis of {topic}",
            "timestamp": "2024-01-01T00:00:00Z"
        }
        
        # Log the result
        await self.log_action("custom_processing", 
                            input_data=input_data, 
                            output_data=result)
        
        return result
'''

print(custom_agent_code)

print("\n=== HOW TO INTEGRATE CUSTOM AGENT ===")
integration_steps = '''
1. Save the custom agent in agents/sub_agents/custom_agent.py
2. Import it in agents/master_agent.py
3. Add it to the sub_agents dictionary in initialize_all_agents()
4. Update the parallel execution in process_topic_with_research()
5. Add configuration in config/settings.py if needed
'''
print(integration_steps)

## 🎉 Step 16: Conclusion and Next Steps

Congratulations! You've explored the entire Agentic Podcast Generator system. Here's what you've learned:

### Key Takeaways

- **Multi-Agent Architecture**: How specialized agents work together
- **Research-First Approach**: Perplexity AI provides comprehensive research
- **Parallel Processing**: Content generation happens simultaneously
- **Database Persistence**: Logging and session management
- **Error Resilience**: Graceful handling of failures
- **Model Accuracy**: GPT-5 is NOT used despite some outdated references
- **Modular Design**: Easy to extend and customize

### What You Can Do Next

1. **Run the System**: Try `python main.py "Your Topic"`
2. **Add Custom Agents**: Extend functionality for your needs
3. **Modify Configurations**: Adjust models and parameters
4. **Monitor Performance**: Check logs and database
5. **Contribute**: Improve the system and share your changes

### Resources

- **README.md**: Complete documentation
- **system_architecture.md**: Detailed technical specs
- **tests/**: Examples of how components work
- **GitHub Issues**: Report bugs and request features

### Important Reminder

**GPT-5 is not used in this system.** The actual models are:
- Research: Perplexity AI (sonar)
- Keywords: Gemini 2.0 Flash
- Posts & Voice: Grok-3 Mini

Happy coding! 🚀

In [None]:
# Final summary
print("=== TUTORIAL SUMMARY ===")
print("You have successfully explored:")
print("✓ Project structure and organization")
print("✓ System architecture and workflow")
print("✓ Configuration and setup")
print("✓ Agent classes and execution")
print("✓ Database schema and models")
print("✓ Testing and debugging")
print("✓ Customization and extension")
print("✓ GPT-5 reference identification")
print()
print("The Agentic Podcast Generator is ready for your topics!")
print("Next: python main.py \"Your Favorite Topic\"")
print()
print("Remember: GPT-5 is NOT used - the system uses Perplexity AI, Gemini, and Grok models!")