# YouTube Knowledge Extraction Tutorial

This tutorial demonstrates how to use the Atomic Agents library to create an agent that extracts knowledge and insights from YouTube video transcripts.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/KennyVaneetvelde/atomic_agents/blob/main/examples/notebooks/youtube_knowledge_extraction.ipynb)

## Prerequisites

This tutorial assumes basic familiarity with the following libraries:

- **Pydantic**: A data validation and settings management library using Python type annotations. ([Pydantic GitHub](https://github.com/pydantic/pydantic))
- **Instructor**: A Python library that simplifies working with structured outputs from large language models (LLMs). ([Instructor GitHub](https://github.com/jxnl/instructor))

You'll also need an OpenAI API key to use the GPT models. If you don't have one, you can sign up at [OpenAI's website](https://openai.com/).


## Step 1: Install Required Packages

First, let's install the necessary packages for our YouTube knowledge extraction agent.

In [1]:
# Install required packages
%pip install atomic-agents openai instructor python-dotenv

Note: you may need to restart the kernel to use updated packages.


## Step 2: Set Up OpenAI API Key

To use the OpenAI API, you need to set up your API key. You have three options:

1. Enter it directly in the code (not recommended for shared notebooks)
2. Use a .env file
3. Input the key manually when prompted

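The cell below loads a `.env` file if one exists and prompts for the key only when `OPENAI_API_KEY` is still unset. Uncomment the Option 1 line only if you want to hardcode the key.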

In [2]:
import os
from dotenv import load_dotenv

# Option 1: Set the API key directly (replace with your actual API key)
# os.environ["OPENAI_API_KEY"] = "your-api-key-here"

# Option 2: Load from .env file
load_dotenv()

# Option 3: Input the key manually
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = input("Enter your OpenAI API key: ")

# Verify that the API key is set
if not os.getenv("OPENAI_API_KEY"):
    raise ValueError("OpenAI API key is not set. Please set it using one of the provided methods.")
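
The fallback logic above can also be wrapped in a small helper that hides the key while it is typed. This is a minimal sketch, not part of Atomic Agents: `resolve_api_key` and its parameters are hypothetical names, and it assumes the standard-library `getpass` prompt is acceptable in your notebook environment.

```python
import getpass
from typing import Callable, Mapping, Optional


def resolve_api_key(
    env: Mapping[str, str],
    prompt: Optional[Callable[[str], str]] = None,
    var_name: str = "OPENAI_API_KEY",
) -> str:
    """Return the key from `env`, falling back to a hidden prompt."""
    key = env.get(var_name, "").strip()
    if not key:
        # getpass.getpass does not echo typed characters, unlike input().
        ask = prompt if prompt is not None else getpass.getpass
        key = ask(f"Enter your {var_name}: ").strip()
    if not key:
        raise ValueError(f"{var_name} is not set.")
    return key
```

A possible use after `load_dotenv()` would be `os.environ["OPENAI_API_KEY"] = resolve_api_key(os.environ)`. The `prompt` parameter exists only so the helper can be exercised without a terminal.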

## Step 3: Import Libraries

Now, let's import the necessary libraries for creating the YouTube knowledge extraction agent.

In [3]:
import instructor
import openai
from pydantic import BaseModel, Field
from rich.console import Console
from rich.markdown import Markdown
from typing import List, Optional

from atomic_agents.agents.base_agent import BaseAgent, BaseAgentConfig
from atomic_agents.lib.components.system_prompt_generator import SystemPromptContextProviderBase, SystemPromptGenerator, SystemPromptInfo
from atomic_agents.lib.tools.yt_transcript_scraper import YouTubeTranscriptTool, YouTubeTranscriptToolConfig, YouTubeTranscriptToolSchema

## Step 4: Initialize Components

Let's initialize the necessary components including the console, client, and YouTube transcript scraper tool.

In [4]:
console = Console()
client = instructor.from_openai(openai.OpenAI())
yt_scraper_tool = YouTubeTranscriptTool(config=YouTubeTranscriptToolConfig())

## Step 5: Define System Prompt Information

Now, we'll define the system prompt information including background, steps, output instructions, and context providers. This information guides the agent's behavior and output format.

In [5]:
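class YtTranscriptProvider(SystemPromptContextProviderBase):
    """Context provider that exposes the scraped video data to the system prompt."""

    def __init__(self, title):
        super().__init__(title)
        # These fields stay None until the transcript is scraped in Step 8.
        self.transcript = None
        self.duration = None
        self.metadata = None

    def get_info(self) -> str:
        # Returns the text that is placed into the system prompt for this provider.
        return f'VIDEO TRANSCRIPT: "{self.transcript}"\n\nDURATION: {self.duration}\n\nMETADATA: {self.metadata}'
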
transcript_provider = YtTranscriptProvider(title='YouTube Transcript')

system_prompt_info = SystemPromptInfo(
    background=[
        'This Assistant is an expert at extracting knowledge and other insightful and interesting information from YouTube transcripts.'
    ],
    steps=[
        'Analyse the YouTube transcript thoroughly to extract the most valuable insights, facts, and recommendations.',
        'Adhere strictly to the provided schema when extracting information from the input content.',
        'Ensure that the output matches the field descriptions, types and constraints exactly.',
    ],
    output_instructions=[
        'Only output Markdown-compatible strings.',
        'Ensure you follow ALL these instructions when creating your output.'
    ],
    context_providers={'yt_transcript': transcript_provider}
)
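
To make the role of each field more concrete, here is a rough sketch of how such pieces could be combined into one prompt string. It is not the `SystemPromptGenerator` implementation: `render_system_prompt`, the section titles, and the layout are all assumptions made for illustration. The sketch calls each provider only when the prompt is built, which is presumably why the transcript can be filled in after the agent has been created.

```python
from typing import Callable, Dict, List


def render_system_prompt(
    background: List[str],
    steps: List[str],
    output_instructions: List[str],
    context_providers: Dict[str, Callable[[], str]],
) -> str:
    """Assemble prompt sections into one string (illustrative layout only)."""
    sections = [
        ("BACKGROUND", background),
        ("STEPS", steps),
        ("OUTPUT INSTRUCTIONS", output_instructions),
    ]
    lines: List[str] = []
    for title, items in sections:
        if not items:
            continue  # Skip empty sections entirely.
        lines.append(f"# {title}")
        lines.extend(f"- {item}" for item in items)
        lines.append("")
    for title, get_info in context_providers.items():
        # The provider is called here, so it sees the latest state.
        lines.append(f"# {title}")
        lines.append(get_info())
        lines.append("")
    return "\n".join(lines).strip()
```

With the objects defined above, a comparable call would pass `transcript_provider.get_info` as the callable for the `'YouTube Transcript'` entry.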

## Step 6: Define Response Model

Let's define the response model with detailed descriptions, constraints, and field types. This model will structure the output of our knowledge extraction agent.

In [6]:
class ResponseModel(BaseModel):
    summary: str = Field(..., description="A short summary of the content, including who is presenting and the content being discussed.")
    insights: List[str] = Field(..., min_length=5, max_length=5, description="exactly 5 of the best insights and ideas from the input.")
    quotes: Optional[List[str]] = Field(None, min_length=5, max_length=5, description="exactly 5 of the most surprising, insightful, and/or interesting quotes from the input.")
    habits: Optional[List[str]] = Field(None, min_length=5, max_length=5, description="exactly 5 of the most practical and useful personal habits mentioned by the speakers.")
    facts: List[str] = Field(..., min_length=5, max_length=5, description="exactly 5 of the most surprising, insightful, and/or interesting valid facts about the greater world mentioned in the content.")
    recommendations: List[str] = Field(..., min_length=5, max_length=5, description="exactly 5 of the most surprising, insightful, and/or interesting recommendations from the content.")
    references: List[str] = Field(..., description="All mentions of writing, art, tools, projects, and other sources of inspiration mentioned by the speakers.")
    one_sentence_takeaway: str = Field(..., description="The most potent takeaways and recommendations condensed into a single 20-word sentence.")

## Step 7: Create the Chat Agent

Now, let's create a chat agent with the specified model, system prompt generator, and response model.

In [7]:
agent = BaseAgent(
    config=BaseAgentConfig(
        client=client,
        system_prompt_generator=SystemPromptGenerator(system_prompt_info),
        model='gpt-3.5-turbo',
        output_schema=ResponseModel
    )
)

## Step 8: Get YouTube Video URL

Let's prompt the user to enter a YouTube video URL and use the YouTube transcript scraper tool to extract the transcript, duration, and metadata.

In [8]:
video_url = input('Enter the YouTube video URL: ')
scraped_transcript = yt_scraper_tool.run(YouTubeTranscriptToolSchema(video_url=video_url))
transcript_provider.transcript = scraped_transcript.transcript
transcript_provider.duration = scraped_transcript.duration
transcript_provider.metadata = scraped_transcript.metadata

print(f"Successfully scraped transcript for video: {scraped_transcript.metadata.get('title', 'Unknown Title')}")
print(f"Video duration: {scraped_transcript.duration} seconds")
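
Because the URL comes straight from user input, it can help to check it before calling the scraper. The sketch below is a hypothetical pre-check using only the standard library. Whether `YouTubeTranscriptTool` performs its own validation is not shown in this tutorial, and `extract_video_id` is not part of Atomic Agents.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from common YouTube URL shapes, or None."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    video_id = ""
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
    elif host in {"youtube.com", "www.youtube.com", "m.youtube.com"}:
        if parsed.path == "/watch":
            # Standard links carry the ID in the "v" query parameter.
            video_id = parse_qs(parsed.query).get("v", [""])[0]
        elif parsed.path.startswith(("/embed/", "/shorts/")):
            video_id = parsed.path.split("/")[2]
    return video_id or None
```

A notebook cell could call this right after `input(...)` and ask again when the result is `None`. Note that the sketch expects a full URL including the scheme (`https://`).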


## Step 9: Run the Agent

Now, let's run the agent to extract knowledge and insights from the YouTube video transcript.

In [None]:
print("Analyzing the transcript. This may take a few moments...")
response = agent.run(agent.input_schema(chat_message='Perform your assignment on the YouTube video transcript present in your context. Do not reply with anything other than the output of the assignment.'))
print("Analysis complete!")
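
The whole transcript is placed in the system prompt, so a very long video may exceed the model's context window. One simple mitigation is to trim the transcript before assigning it to `transcript_provider.transcript`. The sketch below assumes the transcript is a plain string. `trim_transcript` and the `max_chars` default are hypothetical, and a character budget is only a crude stand-in for a real token count.

```python
def trim_transcript(transcript: str, max_chars: int = 40_000) -> str:
    """Keep at most `max_chars` characters of the transcript, then add a marker."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if len(transcript) <= max_chars:
        return transcript
    cut = transcript[:max_chars]
    # Back up to the last whitespace so a word is not split in half.
    boundary = max(cut.rfind(" "), cut.rfind("\n"))
    if boundary > 0:
        cut = cut[:boundary]
    return cut.rstrip() + " [transcript truncated]"
```

In Step 8 this would be applied as `transcript_provider.transcript = trim_transcript(scraped_transcript.transcript)`. The marker tells the model that the text is incomplete.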

## Step 10: Format and Print the Response

Finally, let's convert the response to a dictionary, format it as a Markdown string, and print the response in a pretty Markdown format.

In [None]:
response_dict = response.model_dump()

def format_markdown_section(title, items):
    if isinstance(items, list):
        return f"## {title}\n" + "\n".join([f"- {item}" for item in items]) + "\n"
    return f"## {title}\n{items}\n"

markdown_string = ""
for key, value in response_dict.items():
    title = key.replace('_', ' ').title()
    markdown_string += format_markdown_section(title, value)

markdown_response = Markdown(markdown_string)
console.print(markdown_response)
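
The response model allows some fields to be `None`, and the loop above would print such a field as a heading followed by the literal text `None`. A variant that skips empty values is sketched below. `to_markdown` is a hypothetical helper name, and the blank line between sections is a stylistic choice rather than something the original formatter does.

```python
from typing import Any, Dict, List


def to_markdown(response_dict: Dict[str, Any]) -> str:
    """Render a dict as Markdown sections, skipping None and empty values."""
    sections: List[str] = []
    for key, value in response_dict.items():
        if value is None or value == [] or value == "":
            continue  # Nothing to show for this field.
        title = key.replace("_", " ").title()
        if isinstance(value, list):
            body = "\n".join(f"- {item}" for item in value)
        else:
            body = str(value)
        sections.append(f"## {title}\n{body}")
    return "\n\n".join(sections) + "\n"
```

The result can be passed to `Markdown(...)` and `console.print(...)` in the same way as `markdown_string`.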

## Conclusion

Congratulations! You've successfully created a YouTube knowledge extraction agent using the Atomic Agents library. Here's a summary of what we accomplished:

1. Set up the environment and installed necessary packages.
2. Configured the OpenAI API key securely.
3. Initialized components including the OpenAI client and YouTube transcript scraper tool.
4. Defined a system prompt to guide the agent's behavior.
5. Created a structured response model to format the extracted information.
6. Built a chat agent using the Atomic Agents library.
7. Extracted and processed a YouTube video transcript.
8. Ran the agent to analyze the transcript and generate insights.
9. Formatted and displayed the results in a readable Markdown format.

This agent can be further customized by modifying the system prompt information, response model, and other components to suit specific requirements. For example, you could adapt it to focus on different aspects of the video content or integrate it with other tools and APIs.

Remember to handle API keys securely in your projects, especially when sharing or deploying your code.