# 🎙️ AI Podcast Generator

Transform AI articles into engaging podcast-style audio content!

This notebook provides a complete pipeline that:
1. **Extracts text** from URLs or accepts raw text input
2. **Summarizes** the content using AI (BART model)
3. **Rewrites** it in a conversational podcast tone
4. **Converts** it to high-quality speech using gTTS
5. **Adds** intro/outro and optional background music
6. **Provides** an interactive web interface using Streamlit

## Features
- ✅ Free APIs and open-source tools only
- ✅ URL extraction and raw text input
- ✅ AI-powered summarization
- ✅ Podcast-style text rewriting
- ✅ High-quality text-to-speech
- ✅ Automatic intro/outro
- ✅ Background music support
- ✅ Interactive web interface
- ✅ Audio download functionality

## Requirements
- Python 3.7+
- Internet connection for model downloads
- Google Colab (recommended) or local environment
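
Before installing anything, it can help to confirm that the environment meets these requirements. The following is a minimal sketch, not part of the original notebook: `check_environment` and `REQUIRED_MODULES` are hypothetical names, the module list mirrors the packages installed in the next section, and the ffmpeg check assumes that pydub relies on an external `ffmpeg` (or `avconv`) binary to read and write MP3 files.

```python
import importlib.util
import shutil
import sys

# Import names (not pip names) of the packages this notebook installs.
REQUIRED_MODULES = ["transformers", "torch", "bs4", "requests", "gtts", "pydub", "streamlit"]


def check_environment(modules=REQUIRED_MODULES, min_python=(3, 7)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < min_python:
        found = sys.version.split()[0]
        problems.append(f"Python {min_python[0]}.{min_python[1]}+ required, found {found}")
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"Missing package: {name}")
    # pydub shells out to ffmpeg (or avconv) for MP3 decoding and encoding.
    if shutil.which("ffmpeg") is None and shutil.which("avconv") is None:
        problems.append("ffmpeg not found on PATH (pydub needs it for MP3 files)")
    return problems


for problem in check_environment():
    print("Warning:", problem)
```

If the loop prints nothing, the checks passed; a "Missing package" line simply means the installation cell below has not been run yet.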

## 📦 Installation

First, let's install all the required packages:

In [None]:
# Install required packages
!pip install transformers torch beautifulsoup4 requests gTTS pydub streamlit

# Download a sample background music file
!wget -O music.mp3 "https://cdn.pixabay.com/download/audio/2024/12/09/audio_5c5be993bd.mp3?filename=vlog-music-beat-trailer-showreel-promo-background-intro-theme-274290.mp3"

print("✅ All packages installed successfully!")

## 📝 Text Processing Module

This module handles article extraction, summarization, and podcast-style rewriting:

In [None]:
import requests
from bs4 import BeautifulSoup
from transformers import pipeline
import re

def extract_article_text(url: str) -> str:
    """
    Extracts the main text content from a given URL.
    """
    try:
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
        }
        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()
        
        soup = BeautifulSoup(response.content, 'html.parser')
        
        # Remove script and style elements
        for script in soup(["script", "style"]):
            script.decompose()
        
        # Try to find main content areas
        content_selectors = [
            'article', '.article-content', '.post-content', '.entry-content',
            '.content', 'main', '.main-content', '[role="main"]'
        ]
        
        text = ""
        for selector in content_selectors:
            elements = soup.select(selector)
            if elements:
                text = ' '.join([elem.get_text() for elem in elements])
                break
        
        # Fallback to body if no specific content area found
        if not text:
            text = soup.get_text()
        
        # Clean up the text
        text = re.sub(r'\s+', ' ', text).strip()
        
        return text if text else "No content found"
        
    except Exception as e:
        return f"Error fetching URL: {str(e)}"

def summarize_text(text: str, max_length: int = 150) -> str:
    """
    Summarizes the given text using BART model.
    """
    try:
        # Initialize the summarization pipeline
        summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
        
        # Split text into chunks if it's too long
        max_chunk_length = 1024  # Characters per chunk; BART's limit is 1024 tokens, so English text usually stays under it
        
        if len(text) <= max_chunk_length:
            summary = summarizer(text, max_length=max_length, min_length=30, do_sample=False)
            return summary[0]['summary_text']
        else:
            # Split into chunks and summarize each
            chunks = [text[i:i+max_chunk_length] for i in range(0, len(text), max_chunk_length)]
            summaries = []
            
            # Keep the per-chunk budget above min_length, even with many chunks
            chunk_max_length = max(30, max_length // len(chunks))

            for chunk in chunks:
                if len(chunk.strip()) > 50:  # Only process meaningful chunks
                    chunk_summary = summarizer(chunk, max_length=chunk_max_length, min_length=20, do_sample=False)
                    summaries.append(chunk_summary[0]['summary_text'])
            
            # Combine and re-summarize if needed
            combined_summary = ' '.join(summaries)
            if len(combined_summary) > max_chunk_length:
                final_summary = summarizer(combined_summary, max_length=max_length, min_length=30, do_sample=False)
                return final_summary[0]['summary_text']
            else:
                return combined_summary
                
    except Exception as e:
        return f"Error during summarization: {str(e)}"

def rewrite_to_podcast_style(text: str) -> str:
    """
    Rewrites the text in a conversational podcast style.
    """
    try:
        # Rule-based rewrite: no model is loaded here, connectors are
        # simply inserted between sentences
        
        # Add conversational elements
        podcast_intro = "Alright, so let's dive into this. What we're essentially looking at is... "
        podcast_outro = ". And that's pretty fascinating, isn't it?"
        
        # Split text into sentences; the split drops the ". " separators
        sentences = [s.strip() for s in text.split('. ') if s.strip()]

        # Rejoin with a conversational connector between sentences,
        # restoring the period that the split removed
        body = '. So, '.join(sentences).rstrip('.')

        # Combine everything (podcast_outro supplies the final period)
        podcast_text = podcast_intro + body + podcast_outro
        
        return podcast_text
        
    except Exception as e:
        return f"Error during podcast rewriting: {str(e)}"

print("✅ Text processing module loaded successfully!")

## 🔊 Audio Processing Module

This module handles text-to-speech conversion and audio enhancement:

In [None]:
from gtts import gTTS
from pydub import AudioSegment
import os

def text_to_speech(text: str, output_filename: str, lang="en"):
    """
    Converts text to speech using gTTS and saves it as an MP3 file.
    """
    try:
        tts = gTTS(text=text, lang=lang, slow=False)
        tts.save(output_filename)
        return True
    except Exception as e:
        print(f"Error during TTS conversion: {e}")
        return False

def add_audio_intro_outro(main_audio_path: str, intro_text: str, outro_text: str, output_path: str):
    """
    Adds an intro and outro to the main audio and saves the combined audio.
    """
    try:
        # Generate intro and outro audio
        intro_audio_path = "intro.mp3"
        outro_audio_path = "outro.mp3"

        if not text_to_speech(intro_text, intro_audio_path):
            print("Could not generate intro audio. Skipping intro.")
            intro_audio = AudioSegment.empty()
        else:
            intro_audio = AudioSegment.from_mp3(intro_audio_path)

        if not text_to_speech(outro_text, outro_audio_path):
            print("Could not generate outro audio. Skipping outro.")
            outro_audio = AudioSegment.empty()
        else:
            outro_audio = AudioSegment.from_mp3(outro_audio_path)

        main_audio = AudioSegment.from_mp3(main_audio_path)

        # Concatenate audio segments
        final_audio = intro_audio + main_audio + outro_audio

        final_audio.export(output_path, format="mp3")

        # Clean up temporary files
        if os.path.exists(intro_audio_path):
            os.remove(intro_audio_path)
        if os.path.exists(outro_audio_path):
            os.remove(outro_audio_path)

        return True
    except Exception as e:
        print(f"Error adding intro/outro: {e}")
        return False

def add_background_music(main_audio_path: str, music_path: str, output_path: str, volume_reduction=6):
    """
    Adds background music to the main audio.
    """
    try:
        main_audio = AudioSegment.from_mp3(main_audio_path)
        background_music = AudioSegment.from_mp3(music_path)

        # Ensure background music is at least as long as the main audio
        if len(background_music) < len(main_audio):
            background_music = background_music * (len(main_audio) // len(background_music) + 1)
        background_music = background_music[:len(main_audio)]

        # Reduce volume of background music
        background_music = background_music - volume_reduction

        # Overlay background music
        combined_audio = main_audio.overlay(background_music)

        combined_audio.export(output_path, format="mp3")
        return True
    except Exception as e:
        print(f"Error adding background music: {e}")
        return False

print("✅ Audio processing module loaded successfully!")

## 🌐 Streamlit Web Interface

This creates an interactive web interface for the podcast generator:

In [None]:
%%writefile streamlit_app.py
import streamlit as st
import os
import tempfile
from text_processing import extract_article_text, summarize_text, rewrite_to_podcast_style
from audio_processing import text_to_speech, add_audio_intro_outro, add_background_music

# Set page configuration
st.set_page_config(
    page_title="AI Podcast Generator",
    page_icon="🎙️",
    layout="wide",
    initial_sidebar_state="expanded"
)

# Title and description
st.title("🎙️ AI Podcast Generator")
st.markdown("""
Transform AI articles into engaging podcast-style audio content!

This tool:
1. Extracts text from URLs or accepts raw text input
2. Summarizes the content using AI
3. Rewrites it in a conversational podcast tone
4. Converts it to high-quality speech
5. Adds intro/outro and optional background music
""")

# Sidebar for configuration
st.sidebar.header("Configuration")
add_intro_outro = st.sidebar.checkbox("Add intro/outro", value=True)
add_bg_music = st.sidebar.checkbox("Add background music", value=True)

# Main interface
col1, col2 = st.columns([1, 1])

with col1:
    st.header("Input")
    
    # Input method selection
    input_method = st.radio("Choose input method:", ["URL", "Raw Text"])
    
    if input_method == "URL":
        url_input = st.text_input("Enter article URL:", placeholder="https://example.com/article")
        if st.button("Extract Text from URL") and url_input:
            with st.spinner("Extracting text from URL..."):
                extracted_text = extract_article_text(url_input)
                if extracted_text.startswith("Error"):
                    st.error(extracted_text)
                else:
                    st.session_state.raw_text = extracted_text
                    st.success("Text extracted successfully!")
    else:
        raw_text = st.text_area("Enter article text:", height=300, placeholder="Paste your article text here...")
        if raw_text:
            st.session_state.raw_text = raw_text

    # Process button
    if st.button("Generate Podcast", type="primary") and hasattr(st.session_state, 'raw_text'):
        with st.spinner("Processing..."):
            try:
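                # Step 1: Summarize
                st.info("Step 1: Summarizing text...")
                summary = summarize_text(st.session_state.raw_text)
                if summary.startswith("Error"):
                    # summarize_text reports failures as an "Error ..." string
                    raise RuntimeError(summary)
                st.session_state.summary = summary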
                
                # Step 2: Rewrite to podcast style
                st.info("Step 2: Converting to podcast style...")
                podcast_text = rewrite_to_podcast_style(summary)
                st.session_state.podcast_text = podcast_text
                
                # Step 3: Generate audio
                st.info("Step 3: Generating audio...")
                
                # Output file names (written to the working directory)
                main_audio_path = "main_podcast.mp3"
                final_audio_path = "final_podcast.mp3"
                
                # Generate main audio
                if text_to_speech(podcast_text, main_audio_path):
                    st.session_state.main_audio_path = main_audio_path
                    
                    if add_intro_outro:
                        intro_text = "Welcome to the AI Podcast! Today we're diving into some fascinating insights from the world of artificial intelligence."
                        outro_text = "Thanks for listening to the AI Podcast! Stay curious and keep exploring the amazing world of AI."
                        
                        if add_audio_intro_outro(main_audio_path, intro_text, outro_text, final_audio_path):
                            st.session_state.final_audio_path = final_audio_path
                        else:
                            st.session_state.final_audio_path = main_audio_path
                    else:
                        st.session_state.final_audio_path = main_audio_path
                    
                    # Add background music if requested and available
                    if add_bg_music and os.path.exists("music.mp3"):
                        music_audio_path = "podcast_with_music.mp3"
                        if add_background_music(st.session_state.final_audio_path, "music.mp3", music_audio_path):
                            st.session_state.final_audio_path = music_audio_path
                    
                    st.success("Podcast generated successfully!")
                else:
                    st.error("Failed to generate audio")
                    
            except Exception as e:
                st.error(f"An error occurred: {str(e)}")

with col2:
    st.header("Output")
    
    # Display summary
    if hasattr(st.session_state, 'summary'):
        st.subheader("📝 Summary")
        st.write(st.session_state.summary)
    
    # Display podcast text
    if hasattr(st.session_state, 'podcast_text'):
        st.subheader("🎙️ Podcast Script")
        st.write(st.session_state.podcast_text)
    
    # Display audio player and download
    if hasattr(st.session_state, 'final_audio_path') and os.path.exists(st.session_state.final_audio_path):
        st.subheader("🔊 Generated Podcast")
        
        # Audio player
        with open(st.session_state.final_audio_path, "rb") as audio_file:
            audio_bytes = audio_file.read()
            st.audio(audio_bytes, format="audio/mp3")
        
        # Download button
        st.download_button(
            label="📥 Download Podcast",
            data=audio_bytes,
            file_name="ai_podcast.mp3",
            mime="audio/mp3"
        )

# Footer
st.markdown("---")
st.markdown("Built with ❤️ using Streamlit, Transformers, and gTTS")

# Remove generated audio files and reset session state on request
if st.button("🗑️ Clear Cache"):
    for file in ["main_podcast.mp3", "final_podcast.mp3", "podcast_with_music.mp3"]:
        if os.path.exists(file):
            os.remove(file)
    
    # Clear session state
    for key in list(st.session_state.keys()):
        del st.session_state[key]
    
    st.success("Cache cleared!")
    st.rerun()

## 🚀 Run the Application

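Execute the following cells in order: the first two write the text and audio processing modules to disk so that `streamlit_app.py` can import them, and the third starts the Streamlit web interface: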

In [None]:
%%writefile text_processing.py
# Text processing module (the %%writefile magic must be the first line of the cell)
import requests
from bs4 import BeautifulSoup
from transformers import pipeline
import re

def extract_article_text(url: str) -> str:
    """
    Extracts the main text content from a given URL.
    """
    try:
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
        }
        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()
        
        soup = BeautifulSoup(response.content, 'html.parser')
        
        # Remove script and style elements
        for script in soup(["script", "style"]):
            script.decompose()
        
        # Try to find main content areas
        content_selectors = [
            'article', '.article-content', '.post-content', '.entry-content',
            '.content', 'main', '.main-content', '[role="main"]'
        ]
        
        text = ""
        for selector in content_selectors:
            elements = soup.select(selector)
            if elements:
                text = ' '.join([elem.get_text() for elem in elements])
                break
        
        # Fallback to body if no specific content area found
        if not text:
            text = soup.get_text()
        
        # Clean up the text
        text = re.sub(r'\s+', ' ', text).strip()
        
        return text if text else "No content found"
        
    except Exception as e:
        return f"Error fetching URL: {str(e)}"

def summarize_text(text: str, max_length: int = 150) -> str:
    """
    Summarizes the given text using BART model.
    """
    try:
        # Initialize the summarization pipeline
        summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
        
        # Split text into chunks if it's too long
        max_chunk_length = 1024  # Characters per chunk; BART's limit is 1024 tokens, so English text usually stays under it
        
        if len(text) <= max_chunk_length:
            summary = summarizer(text, max_length=max_length, min_length=30, do_sample=False)
            return summary[0]['summary_text']
        else:
            # Split into chunks and summarize each
            chunks = [text[i:i+max_chunk_length] for i in range(0, len(text), max_chunk_length)]
            summaries = []
            
            # Keep the per-chunk budget above min_length, even with many chunks
            chunk_max_length = max(30, max_length // len(chunks))

            for chunk in chunks:
                if len(chunk.strip()) > 50:  # Only process meaningful chunks
                    chunk_summary = summarizer(chunk, max_length=chunk_max_length, min_length=20, do_sample=False)
                    summaries.append(chunk_summary[0]['summary_text'])
            
            # Combine and re-summarize if needed
            combined_summary = ' '.join(summaries)
            if len(combined_summary) > max_chunk_length:
                final_summary = summarizer(combined_summary, max_length=max_length, min_length=30, do_sample=False)
                return final_summary[0]['summary_text']
            else:
                return combined_summary
                
    except Exception as e:
        return f"Error during summarization: {str(e)}"

def rewrite_to_podcast_style(text: str) -> str:
    """
    Rewrites the text in a conversational podcast style.
    """
    try:
        # Add conversational elements
        podcast_intro = "Alright, so let's dive into this. What we're essentially looking at is... "
        podcast_outro = ". And that's pretty fascinating, isn't it?"
        
        # Split text into sentences; the split drops the ". " separators
        sentences = [s.strip() for s in text.split('. ') if s.strip()]

        # Rejoin with a conversational connector between sentences,
        # restoring the period that the split removed
        body = '. So, '.join(sentences).rstrip('.')

        # Combine everything (podcast_outro supplies the final period)
        podcast_text = podcast_intro + body + podcast_outro
        
        return podcast_text
        
    except Exception as e:
        return f"Error during podcast rewriting: {str(e)}"

In [None]:
%%writefile audio_processing.py
# Audio processing module (the %%writefile magic must be the first line of the cell)
from gtts import gTTS
from pydub import AudioSegment
import os

def text_to_speech(text: str, output_filename: str, lang="en"):
    """
    Converts text to speech using gTTS and saves it as an MP3 file.
    """
    try:
        tts = gTTS(text=text, lang=lang, slow=False)
        tts.save(output_filename)
        return True
    except Exception as e:
        print(f"Error during TTS conversion: {e}")
        return False

def add_audio_intro_outro(main_audio_path: str, intro_text: str, outro_text: str, output_path: str):
    """
    Adds an intro and outro to the main audio and saves the combined audio.
    """
    try:
        # Generate intro and outro audio
        intro_audio_path = "intro.mp3"
        outro_audio_path = "outro.mp3"

        if not text_to_speech(intro_text, intro_audio_path):
            print("Could not generate intro audio. Skipping intro.")
            intro_audio = AudioSegment.empty()
        else:
            intro_audio = AudioSegment.from_mp3(intro_audio_path)

        if not text_to_speech(outro_text, outro_audio_path):
            print("Could not generate outro audio. Skipping outro.")
            outro_audio = AudioSegment.empty()
        else:
            outro_audio = AudioSegment.from_mp3(outro_audio_path)

        main_audio = AudioSegment.from_mp3(main_audio_path)

        # Concatenate audio segments
        final_audio = intro_audio + main_audio + outro_audio

        final_audio.export(output_path, format="mp3")

        # Clean up temporary files
        if os.path.exists(intro_audio_path):
            os.remove(intro_audio_path)
        if os.path.exists(outro_audio_path):
            os.remove(outro_audio_path)

        return True
    except Exception as e:
        print(f"Error adding intro/outro: {e}")
        return False

def add_background_music(main_audio_path: str, music_path: str, output_path: str, volume_reduction=6):
    """
    Adds background music to the main audio.
    """
    try:
        main_audio = AudioSegment.from_mp3(main_audio_path)
        background_music = AudioSegment.from_mp3(music_path)

        # Ensure background music is at least as long as the main audio
        if len(background_music) < len(main_audio):
            background_music = background_music * (len(main_audio) // len(background_music) + 1)
        background_music = background_music[:len(main_audio)]

        # Reduce volume of background music
        background_music = background_music - volume_reduction

        # Overlay background music
        combined_audio = main_audio.overlay(background_music)

        combined_audio.export(output_path, format="mp3")
        return True
    except Exception as e:
        print(f"Error adding background music: {e}")
        return False

In [None]:
# Start the Streamlit application
!streamlit run streamlit_app.py --server.port 8501 --server.address 0.0.0.0 &

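# Install pyngrok for public access (if needed)
!pip install pyngrok

from pyngrok import ngrok
import time

# Wait a moment for Streamlit to start
time.sleep(5)

# ngrok generally requires an account authtoken; if connect() fails, set it first
# (the value below is a placeholder, not a real token):
# ngrok.set_auth_token("YOUR_NGROK_AUTHTOKEN")

# Create a public tunnel
tunnel = ngrok.connect(8501)
print(f"🌐 Your AI Podcast Generator is now live at: {tunnel.public_url}")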
print("\n📝 Instructions:")
print("1. Click the link above to open the web interface")
print("2. Choose between URL input or raw text input")
print("3. Configure options in the sidebar (intro/outro, background music)")
print("4. Click 'Generate Podcast' to create your audio")
print("5. Listen to the result and download the MP3 file")
print("\n⚠️ Note: Keep this cell running to maintain the web interface")

## 🧪 Testing the Pipeline

You can also test the individual components directly:

In [None]:
# Test the pipeline with sample text
sample_text = """
Artificial intelligence (AI) is rapidly transforming various industries, from healthcare to finance. 
Its ability to process vast amounts of data and identify patterns has led to breakthroughs in drug discovery, 
personalized medicine, and fraud detection. However, the ethical implications of AI, such as bias in algorithms 
and job displacement, are critical concerns that need to be addressed as the technology continues to evolve.
"""

print("🔄 Testing the AI Podcast Pipeline...")
print("\n📝 Original Text:")
print(sample_text)

# Step 1: Summarize
print("\n📋 Step 1: Summarizing...")
summary = summarize_text(sample_text)
print(f"Summary: {summary}")

# Step 2: Convert to podcast style
print("\n🎙️ Step 2: Converting to podcast style...")
podcast_text = rewrite_to_podcast_style(summary)
print(f"Podcast Script: {podcast_text}")

# Step 3: Generate audio
print("\n🔊 Step 3: Generating audio...")
if text_to_speech(podcast_text, "test_podcast.mp3"):
    print("✅ Audio generated successfully!")
    
    # Add intro/outro
    intro_text = "Welcome to the AI Podcast! Today we're exploring the latest in artificial intelligence."
    outro_text = "Thanks for listening! Stay tuned for more AI insights."
    
    if add_audio_intro_outro("test_podcast.mp3", intro_text, outro_text, "final_test_podcast.mp3"):
        print("✅ Intro and outro added successfully!")
        
        # Add background music if available
        if os.path.exists("music.mp3"):
            if add_background_music("final_test_podcast.mp3", "music.mp3", "final_with_music.mp3"):
                print("✅ Background music added successfully!")
                print("🎵 Final podcast with music: final_with_music.mp3")
            else:
                print("⚠️ Could not add background music")
        else:
            print("⚠️ No background music file found")
            
        print("🎵 Final podcast: final_test_podcast.mp3")
    else:
        print("⚠️ Could not add intro/outro")
else:
    print("❌ Failed to generate audio")

print("\n🎉 Pipeline test completed!")

## 📚 Usage Examples

Here are some kinds of articles whose URLs you can try with the podcast generator:

### AI News Articles
- OpenAI announcements and research papers
- Google AI blog posts
- MIT Technology Review AI articles
- Towards Data Science articles
- AI research papers from arXiv

### Sample Text for Testing
You can also paste any AI-related text directly into the interface:

```
Machine learning has revolutionized the way we approach complex problems across various domains. 
From computer vision to natural language processing, deep learning models have achieved remarkable 
performance improvements. However, these advances come with challenges including computational 
requirements, data privacy concerns, and the need for interpretable AI systems.
```

## 🎨 Customization Options

You can customize various aspects of the podcast generator:

### Text Processing
- Modify the summarization model (try different BART variants)
- Adjust summary length parameters
- Customize the podcast style rewriting
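
As one way to customize the podcast-style rewriting, the sketch below rotates through several connector phrases instead of repeating a single one. It is an illustration only: `conversational_rewrite` and the `CONNECTORS` list are hypothetical names that do not exist in the notebook, and the sentence splitting is a simple regular expression that will mis-split abbreviations such as "e.g.".

```python
import re

# Hypothetical connector phrases; edit them to match the voice you want.
CONNECTORS = ["So,", "Now,", "On top of that,", "Here's the thing:"]


def conversational_rewrite(text, connectors=CONNECTORS):
    """Prefix every sentence after the first with a rotating connector phrase."""
    # Split after '.', '!' or '?' followed by whitespace, keeping the punctuation.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    sentences = [s.strip() for s in parts if s.strip()]
    rewritten = []
    for i, sentence in enumerate(sentences):
        if i == 0:
            rewritten.append(sentence)
        else:
            # Cycle through the connectors so neighbouring sentences differ.
            connector = connectors[(i - 1) % len(connectors)]
            rewritten.append(f"{connector} {sentence}")
    return " ".join(rewritten)


print(conversational_rewrite("AI is changing medicine. It finds patterns in data. Ethics still matter."))
```

The result could be passed to `text_to_speech` in place of the output of `rewrite_to_podcast_style`; the intro and outro phrases would still need to be added separately.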

### Audio Generation
- Change TTS language and voice settings
- Modify intro/outro text
- Adjust background music volume
- Add audio effects or filters
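
When adjusting the background music volume, keep in mind that the `volume_reduction` argument of `add_background_music` is expressed in decibels, because pydub treats `segment - n` as a gain change of `n` dB. The small sketch below converts a dB change into a linear amplitude ratio using the standard relation ratio = 10^(dB/20); `db_to_amplitude_ratio` is a hypothetical helper name used only for illustration.

```python
def db_to_amplitude_ratio(db_change: float) -> float:
    """Convert a gain change in decibels to a linear amplitude ratio."""
    return 10 ** (db_change / 20)


# Show how much quieter the music gets for a few candidate reductions.
for reduction in (6, 12, 20):
    ratio = db_to_amplitude_ratio(-reduction)
    print(f"-{reduction} dB -> music at {ratio:.0%} of its original amplitude")
```

The default reduction of 6 dB roughly halves the music's amplitude; if the narration is still hard to hear, a value between 12 and 20 is a reasonable starting point to try.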

### Web Interface
- Customize the Streamlit theme
- Add more configuration options
- Implement user authentication
- Add podcast history and management

## 🔧 Troubleshooting

### Common Issues and Solutions

1. **Model Download Issues**
   - Ensure stable internet connection
   - Models are downloaded automatically on first use
   - Large models may take time to download

2. **Audio Generation Fails**
   - Check internet connection for gTTS
   - Verify text is not empty
   - Try shorter text if processing fails

3. **Web Interface Issues**
   - Restart the Streamlit application
   - Clear browser cache
   - Check if all dependencies are installed

4. **Memory Issues**
   - Use smaller models for limited memory
   - Process shorter texts
   - Restart the notebook if needed
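
For the "shorter text" suggestions above, one option is to split the input at sentence boundaries before summarizing or synthesizing it, rather than cutting at a fixed character offset as `summarize_text` does. The following is a sketch under stated assumptions: `chunk_by_sentence` is a hypothetical helper, the 1000-character default is an arbitrary choice, and the regular expression is a rough sentence splitter rather than a full tokenizer.

```python
import re


def chunk_by_sentence(text, max_chars=1000):
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept intact as its own chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            # Still fits, or this is the first sentence of a new chunk.
            current = candidate
        else:
            # Adding this sentence would overflow: close the chunk and start another.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


for chunk in chunk_by_sentence("One two. Three four. Five six.", max_chars=20):
    print(repr(chunk))
```

Each chunk can then be processed on its own, which keeps every request small and avoids cutting a sentence in half.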

## 🎯 Conclusion

This AI Podcast Generator provides a complete pipeline for transforming written AI content into engaging audio podcasts. The system is designed to be:

- **Free**: Uses only open-source tools and free APIs
- **Modular**: Each component can be used independently
- **Extensible**: Easy to customize and enhance
- **User-friendly**: Simple web interface for non-technical users

### Next Steps
- Deploy to a permanent hosting solution
- Add more advanced TTS options
- Implement batch processing
- Add podcast RSS feed generation
- Integrate with podcast platforms
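
To give a rough idea of what RSS feed generation could involve, the sketch below builds a minimal RSS 2.0 document with one `<item>` and audio `<enclosure>` per episode, using only the standard library. Everything in it is an assumption for illustration: `build_rss`, the episode dictionary keys, and the example URLs are hypothetical, and real podcast directories typically expect additional tags that are not shown here.

```python
import xml.etree.ElementTree as ET
from email.utils import formatdate


def build_rss(title, link, description, episodes):
    """Return a minimal RSS 2.0 feed as a string.

    Each episode is a dict with 'title', 'url', 'size_bytes' and 'timestamp'
    (seconds since the Unix epoch).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for episode in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = episode["title"]
        # The enclosure element points podcast clients at the MP3 file.
        ET.SubElement(
            item,
            "enclosure",
            url=episode["url"],
            length=str(episode["size_bytes"]),
            type="audio/mpeg",
        )
        # RSS dates use the RFC 822 style that formatdate produces.
        ET.SubElement(item, "pubDate").text = formatdate(episode["timestamp"], usegmt=True)
    return ET.tostring(rss, encoding="unicode")


feed = build_rss(
    "AI Podcast",
    "https://example.com/podcast",
    "Generated episodes",
    [{"title": "Episode 1", "url": "https://example.com/ai_podcast.mp3",
      "size_bytes": 123456, "timestamp": 0}],
)
print(feed)
```

The generated string could be written to a file and served next to the MP3 files once the project has permanent hosting.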

### Contributing
Feel free to fork this project and contribute improvements! Some ideas:
- Better text extraction algorithms
- More sophisticated podcast style rewriting
- Advanced audio processing features
- Multi-language support
- Voice cloning capabilities

---

**Built with ❤️ using Python, Transformers, gTTS, and Streamlit**

*Happy podcasting! 🎙️*