# 🎙️ Unified Podcast Generator - Google Colab

**Full-featured AI Podcast Generator**
- 📄 Document upload (PDF, DOCX, TXT)
- 🤖 LLM script generation (Mistral/Llama)
- 🎤 Voice cloning (XTTS v2 via coqui-tts)
- 🎛️ Advanced audio settings

## ⚡ Setup: Enable GPU → `Runtime` → `Change runtime type` → `T4 GPU`
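
To confirm that the runtime change took effect before installing anything, a quick check along these lines may help. This is a minimal sketch, not part of the repository: the helper name `gpu_available` is hypothetical, and it only assumes that the `nvidia-smi` tool is on the `PATH` of GPU runtimes.

```python
import shutil
import subprocess


def gpu_available() -> bool:
    """Return True if `nvidia-smi` is installed and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        # `nvidia-smi -L` prints one line per detected GPU.
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if gpu_available():
    print("GPU detected")
else:
    print("No GPU detected: check Runtime > Change runtime type")
```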

## Step 1: Clone & Install

In [None]:
# Clone repository
!git clone https://github.com/beginner4a3/pod2.git
%cd pod2

# Set your Hugging Face token (replace the placeholder with your own token)
import os
os.environ['HF_TOKEN'] = "your_token_here"  # Create one at: https://huggingface.co/settings/tokens

# Install ALL dependencies
!pip install -r requirements.txt -q

print("\n✅ All dependencies installed!")
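
Because the model download in the next step reads `HF_TOKEN` from the environment, it can be worth checking that the placeholder was actually replaced. The following is a small sketch under that assumption; `hf_token_configured` is a hypothetical helper, not something the repository provides.

```python
import os

PLACEHOLDER = "your_token_here"  # the placeholder value used in the cell above


def hf_token_configured(env=os.environ) -> bool:
    """Return True if HF_TOKEN is set, non-empty, and not the placeholder."""
    token = env.get("HF_TOKEN", "").strip()
    return bool(token) and token != PLACEHOLDER


if not hf_token_configured():
    print("HF_TOKEN is missing or still set to the placeholder.")
```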

## Step 2: Download Models

In [None]:
# Download TTS model
from huggingface_hub import snapshot_download
import os

print("📥 Downloading Indic-ParlerTTS model (~3GB)...")
snapshot_download(repo_id="ai4bharat/indic-parler-tts", token=os.environ.get('HF_TOKEN'))
print("✅ TTS Model ready!")

# Download LLM model
print("\n📥 Downloading Mistral-7B LLM (~4GB)...")
!wget -q --show-progress https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/resolve/main/mistral-7b-instruct-v0.2.Q4_K_M.gguf
print("✅ LLM Model ready!")
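
The cell above prints "LLM Model ready!" whether or not `wget` succeeded, so a sanity check on the downloaded file may be useful. This sketch relies on the fact that GGUF files begin with the four bytes `GGUF`. The helper name `looks_like_gguf` and the 1 GB minimum size are assumptions chosen for illustration.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # GGUF files start with these four bytes


def looks_like_gguf(path, min_bytes=1_000_000_000) -> bool:
    """Return True if the file exists, is at least min_bytes, and has the GGUF magic."""
    p = Path(path)
    if not p.is_file() or p.stat().st_size < min_bytes:
        return False
    with p.open("rb") as fh:
        return fh.read(4) == GGUF_MAGIC


model_path = "mistral-7b-instruct-v0.2.Q4_K_M.gguf"
status = "looks valid" if looks_like_gguf(model_path) else "missing or incomplete"
print(f"{model_path}: {status}")
```

A truncated download or an HTML error page saved under the model's filename would fail this check.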

## Step 3: Verify Installation

In [None]:
import sys
sys.path.insert(0, '/content/pod2')

from src.tts.indic_parler import IndicParlerTTS
from src.tts.xtts_cloner import is_voice_cloning_available

print("✅ Indic-ParlerTTS: Ready")
print(f"✅ Voice Cloning (XTTS v2): {'Available' if is_voice_cloning_available() else 'Not installed'}")
print(f"✅ Languages: {len(IndicParlerTTS.get_languages())} supported")

## Step 4: Launch UI 🚀

In [None]:
from src.ui.gradio_app import create_interface

print("🚀 Starting Podcast Generator...")
print("💡 LLM Model Path: mistral-7b-instruct-v0.2.Q4_K_M.gguf\n")

demo = create_interface()
demo.queue().launch(share=True)

---

## Alternative: Manual Usage

In [None]:
# Quick TTS test
from src.tts.indic_parler import IndicParlerTTS
from IPython.display import Audio

tts = IndicParlerTTS()
# Hindi sample text: "Hello! This is a test."
audio = tts.generate("नमस्ते! यह एक परीक्षण है।", speaker="Rohit", emotion="happy")
tts.save(audio, "test.wav")
Audio("test.wav")

In [None]:
# Voice cloning: upload a reference audio sample
import soundfile as sf
from google.colab import files
from IPython.display import Audio
from src.tts.xtts_cloner import XTTSCloner

print("Upload a ~6 second voice sample:")
uploaded = files.upload()
ref_audio = next(iter(uploaded))  # filename of the first uploaded file

cloner = XTTSCloner()
cloned = cloner.generate("Hello, this is my cloned voice!", ref_audio, "en")

# 24000 Hz is the XTTS v2 output sample rate
sf.write("cloned.wav", cloned, 24000)
Audio("cloned.wav")
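
Since the cell asks for a sample of roughly 6 seconds, it may help to measure the upload before cloning. The sketch below uses only the standard-library `wave` module, so it assumes the reference file is an uncompressed PCM WAV. The helper names and the 3 to 30 second window are illustrative assumptions, not limits taken from XTTS or from this repository.

```python
import wave


def wav_duration_seconds(path) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def is_usable_reference(duration_s, min_s=3.0, max_s=30.0) -> bool:
    """Hypothetical acceptance window for a reference clip's length."""
    return min_s <= duration_s <= max_s


# Usage in the notebook, after the upload step (ref_audio comes from the cell above):
#   duration = wav_duration_seconds(ref_audio)
#   print(f"{duration:.1f}s, usable: {is_usable_reference(duration)}")
```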

In [None]:
# Generate podcast
from src.tts.indic_parler import IndicParlerTTS
from src.audio.mixer import PodcastMixer, AudioClip
from IPython.display import Audio

# Hindi/English sample script.
# Speaker1: "Hello friends! Today we will talk about AI."
# Speaker2: "Yes, this is a very interesting topic."
script = """Speaker1: नमस्ते दोस्तों! आज हम AI के बारे में बात करेंगे।
Speaker2: हाँ, यह बहुत interesting topic है।"""

tts = IndicParlerTTS()
clips = []

for line in script.strip().split('\n'):
    if ':' in line:
        # Split on the first colon only, so colons inside the dialogue are kept
        speaker, text = line.split(':', 1)
        # Speaker labels containing "1" use the "Rohit" voice; all others use "Divya"
        speaker_name = "Rohit" if "1" in speaker else "Divya"
        audio = tts.generate(text.strip(), speaker=speaker_name, emotion="conversation")
        clips.append(AudioClip(audio=audio, sample_rate=tts.sample_rate, speaker=speaker_name))

mixer = PodcastMixer(sample_rate=tts.sample_rate)
final = mixer.mix_turns(clips, gap_ms=100, add_noise=True)

import soundfile as sf
sf.write("podcast.wav", final, tts.sample_rate)
Audio("podcast.wav")
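
The parsing step in the cell above can be pulled out into a small function, which makes it easier to test and to extend to more speakers. This is a sketch under assumptions: `parse_script`, `voice_for`, and the `VOICES` table are hypothetical names, and the only voices used are the two that appear in this notebook ("Rohit" and "Divya").

```python
def parse_script(script: str) -> list[tuple[str, str]]:
    """Split a 'Speaker: text' script into (speaker, text) turns, skipping other lines."""
    turns = []
    for line in script.splitlines():
        # partition splits on the first colon only, so later colons stay in the text
        speaker, sep, text = line.partition(":")
        speaker, text = speaker.strip(), text.strip()
        if sep and speaker and text:
            turns.append((speaker, text))
    return turns


# Explicit label-to-voice table instead of checking for "1" in the label
VOICES = {"Speaker1": "Rohit", "Speaker2": "Divya"}


def voice_for(speaker: str, default: str = "Divya") -> str:
    """Look up the voice for a speaker label, falling back to a default."""
    return VOICES.get(speaker, default)


sample = "Speaker1: Hello friends!\n\nSpeaker2: Yes: this is interesting."
for speaker, text in parse_script(sample):
    print(voice_for(speaker), "->", text)
```

An explicit table avoids a surprise with labels such as `Speaker10` or `Speaker21`, which the substring check would also map to the first voice.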

In [None]:
# Download audio
from google.colab import files
files.download("podcast.wav")