
---

### **Introduction to the Generative Podcast Creator Capstone**

Welcome to the Generative Podcast Creator Capstone Project! In this project, you'll build a system that combines text generation, text-to-speech, and audio mixing into a single pipeline: the user enters a topic and receives a generated podcast episode, complete with narration and background music.

#### **What You'll Learn**
This project is designed to provide hands-on experience with:
- **Text Generation**: Leveraging pre-trained language models to create structured and contextually relevant podcast scripts.
- **Text-to-Speech Conversion**: Using text-to-speech technologies to bring the script to life with realistic narration.
- **Audio Processing**: Mixing the narration with background music to create a polished final product.
- **Multimodal Integration**: Combining AI and software components that process text, audio, and music into a single end-to-end workflow.

#### **Who Is This For?**
This project is ideal for:
- Learners with a basic understanding of Python and the Swarmauri framework.
- Those looking to gain practical experience in developing generative AI systems.
- Aspiring developers interested in audio and multimedia content creation.

#### **Outcome**
Upon completion, you'll have a fully functional Generative Podcast Creator, demonstrating your ability to integrate AI models into real-world applications. You'll also gain valuable experience in designing systems that bridge AI and multimedia.

---

#### **Step 0: Setup & Dependencies**

In [9]:
# Generative Podcast Creator

from pydub import AudioSegment
import os
from swarmauri.conversations.concrete.Conversation import Conversation
from swarmauri.messages.concrete.HumanMessage import HumanMessage
from swarmauri.messages.concrete.SystemMessage import SystemMessage
from swarmauri.llms.concrete.OpenAIAudioTTS import OpenAIAudioTTS
from swarmauri.llms.concrete.OpenAIModel import OpenAIModel

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
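
`os.getenv` returns `None` when the variable is missing, and the failure then only surfaces later, inside the first model call. A minimal sketch of an early check follows; `require_env` is a hypothetical helper, not part of Swarmauri or the original notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. Set it in your environment before running the notebook."
        )
    return value


# Possible use in the setup cell (commented out so this sketch runs without a key):
# OPENAI_API_KEY = require_env("OPENAI_API_KEY")
```

Failing at setup time keeps the error message close to its cause instead of burying it in a later API response.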

#### **Step 1: Generate Podcast Script**
- This function uses a language model to generate a podcast script based on the user's input topic.

In [10]:
def generate_script(topic, host_name):
    text_gen_model = OpenAIModel(api_key=OPENAI_API_KEY)
    system_context = "You are a podcast host creating a script for an episode on a specific topic."
    human_context = f"Create a podcast script on the topic: {topic}. The name of the host is {host_name}."

    conversation = Conversation()

    conversation.add_messages([SystemMessage(content=system_context), HumanMessage(content=human_context)])
    text_gen_model.predict(conversation=conversation)
    return conversation.get_last().content
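
Language models often return scripts with speaker labels (`Host:`) and bracketed cues (`[Intro music]`), which a text-to-speech model would read aloud. The sketch below shows one heuristic way to strip them before narration. `clean_script_for_tts` is a hypothetical helper, and its patterns are assumptions about what the generated script may contain, not a documented output format.

```python
import re


def clean_script_for_tts(script: str) -> str:
    """Remove bracketed cues, leading speaker labels, and markdown emphasis."""
    cleaned_lines = []
    for line in script.splitlines():
        # Drop bracketed cues such as [Intro music].
        line = re.sub(r"\[[^\]]*\]", "", line)
        # Drop a leading one- or two-word speaker label such as "Host:" or "**Alex:**".
        line = re.sub(r"^\s*\**[A-Z][A-Za-z]*(?: [A-Z][A-Za-z]*)?\s*:\**\s*", "", line)
        # Drop leftover markdown emphasis markers and surrounding whitespace.
        line = line.replace("*", "").strip()
        if line:
            cleaned_lines.append(line)
    return "\n".join(cleaned_lines)


raw = "[Intro music]\n**Alex:** Welcome to the show!\n\nHost: Today we talk about AI."
print(clean_script_for_tts(raw))
```

The label pattern is deliberately narrow (at most two capitalized words before the colon), so ordinary sentences that contain a colon later in the line are left alone. It can still misfire on short lead-ins such as `Today: ...`, so review the cleaned text before narrating it.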


#### **Step 2: Convert Script to Audio**
- This function converts the generated script into speech using OpenAI's text-to-speech model.

In [11]:
def script_to_audio(script, output_path="podcast_voice.mp3"):
    text_to_speech_model = OpenAIAudioTTS(api_key=OPENAI_API_KEY)

    output_path = text_to_speech_model.predict(text=script, audio_path=output_path)

    return output_path
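
The OpenAI speech endpoint limits each request to 4096 input characters, and a full podcast script can exceed that. Whether Swarmauri's `OpenAIAudioTTS` splits long input internally is not confirmed here, so the sketch below assumes it does not. It shows one way to break a script into chunks at sentence boundaries; `split_into_chunks` is a hypothetical helper.

```python
import re


def split_into_chunks(text: str, max_chars: int = 4096) -> list[str]:
    """Split text into chunks of at most max_chars, breaking at sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The sentence does not fit: close the current chunk, start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_into_chunks("One. Two! Three? Four.", max_chars=10))
# ['One. Two!', 'Three?', 'Four.']
```

Each chunk could then be passed to `script_to_audio` with its own output path, and the resulting files joined with pydub by adding the `AudioSegment` objects together.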

#### **Step 3: Add Background Music**
- This function adds background music to the podcast voice.

In [12]:
def add_background_music(voice_path, music_path, output_path="final_podcast.mp3"):
    voice = AudioSegment.from_file(voice_path)
    music = AudioSegment.from_file(music_path).low_pass_filter(3000).apply_gain(-15)

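    # Loop the background music if it's shorter than the voice.
    # Guard against an empty clip, which would make the loop below run forever.
    if len(music) == 0:
        raise ValueError(f"Background music file is empty: {music_path}")
    while len(music) < len(voice):
        music += music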

    # Trim the music to match the length of the voice
    music = music[:len(voice)]

    # Combine voice and music
    podcast = voice.overlay(music)
    podcast.export(output_path, format="mp3")
    return output_path
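
Two numbers drive the mix above: the `-15` passed to `apply_gain` is in decibels, and pydub reports `len()` of a segment in milliseconds. The sketch below works through the arithmetic in plain Python, with no pydub required. The helper names and the example durations are illustrative assumptions.

```python
import math


def db_to_amplitude_ratio(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude ratio."""
    return 10 ** (gain_db / 20)


def loops_needed(voice_ms: int, music_ms: int) -> int:
    """Number of times a music clip must play to cover the voice track."""
    if music_ms <= 0:
        raise ValueError("music clip is empty")
    return math.ceil(voice_ms / music_ms)


# A -15 dB gain scales the music to roughly 18% of its original amplitude.
print(round(db_to_amplitude_ratio(-15), 3))  # 0.178

# A 30-second clip must play 7 times to cover 185 seconds of narration.
print(loops_needed(voice_ms=185_000, music_ms=30_000))  # 7
```

With the repeat count known, pydub can build the looped track in one step by multiplying a segment by an integer (`music * 7`), as an alternative to doubling the clip in a `while` loop.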

#### **Step 4: Generate Podcast Audio**
- This function generates the text script and the podcast audio, both with and without background music.

In [13]:
import gradio as gr

# Function to generate podcast audio with and without background music, and the text script
def generate_podcast(topic, host_name, background_music_path):
    # Run the full pipeline: script -> narration -> narration mixed with music

    text_script = generate_script(topic, host_name)

    audio_without_music = script_to_audio(text_script)

    audio_with_music = add_background_music(audio_without_music, background_music_path)

    return text_script, audio_without_music, audio_with_music
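
`generate_podcast` assumes all three inputs are present. When the audio input in the Gradio form is left empty, the handler receives `None`, and the error only appears when pydub tries to open the file. A minimal sketch of an up-front check follows; `validate_podcast_inputs` is a hypothetical helper, not part of the original notebook.

```python
import os


def validate_podcast_inputs(topic, host_name, background_music_path):
    """Return a list of problems; an empty list means the inputs look usable."""
    problems = []
    if not topic or not topic.strip():
        problems.append("Topic is required.")
    if not host_name or not host_name.strip():
        problems.append("Host name is required.")
    if not background_music_path:
        problems.append("Background music file is required.")
    elif not os.path.isfile(background_music_path):
        problems.append(f"Background music file not found: {background_music_path}")
    return problems


print(validate_podcast_inputs("", "Alex", None))
# ['Topic is required.', 'Background music file is required.']
```

Inside the handler, a non-empty result could be raised as `gr.Error("; ".join(problems))`, so that the message appears in the interface instead of a stack trace.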

#### **Step 5: Create the Gradio Interface**

In [14]:
# Create the Gradio interface
with gr.Blocks() as demo:
    with gr.Row():
        with gr.Column():
            # Input components
            topic_input = gr.Textbox(label="Topic")
            host_name_input = gr.Textbox(label="Host Name")
            background_music_input = gr.Audio(label="Background Music", type="filepath", sources=["upload"])
            # Generate button
            generate_button = gr.Button("Generate Podcast")
        with gr.Column():
            # Output components
            text_script_output = gr.Textbox(label="Podcast Text Script")
            audio_without_music_output = gr.Audio(label="Podcast without Background Music", streaming=True)
            audio_with_music_output = gr.Audio(label="Podcast with Background Music", streaming=True)

    # Define the event listener for the button click
    generate_button.click(
        fn=generate_podcast,
        inputs=[topic_input, host_name_input, background_music_input],
        outputs=[text_script_output, audio_without_music_output, audio_with_music_output]
    )

# Launch the Gradio app
if __name__ == "__main__":
    demo.launch(show_error=True)

* Running on local URL:  http://127.0.0.1:7862

To create a public link, set `share=True` in `launch()`.


**Usage Instructions:**


1. Enter the topic for your podcast
2. Provide the host's name
3. Upload a background music file (MP3 format)
4. Click "Generate Podcast"
5. View/download:
    - Generated script
    - Audio without music
    - Final podcast with background music

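The Gradio interface is available at the local URL printed when the app launches. Gradio uses port 7860 by default and moves to the next free port if that one is already in use.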
