# YouTube Video Agent

Code authored by: Shaw Talebi

[Video link](https://youtu.be/-BUs1CPHKfU) <br>
[Blog link](https://shawhin.medium.com/how-to-improve-llms-with-tools-69cc68c804ed)

### imports

In [1]:
from youtube_transcript_api import YouTubeTranscriptApi
import re
from agents import Agent, function_tool, Runner, ItemHelpers, RunContextWrapper
from openai.types.responses import ResponseTextDeltaEvent
from dotenv import load_dotenv
import asyncio

In [2]:
# import environment variables from .env file
load_dotenv()

True

### define instructions

In [3]:
instructions = "You provide help with tasks related to YouTube videos."

### define tool

In [4]:
@function_tool
def fetch_youtube_transcript(url: str) -> str:
    """
    Extract transcript with timestamps from a YouTube video URL and format it for LLM consumption
    
    Args:
        url (str): YouTube video URL
        
    Returns:
        str: Formatted transcript with timestamps, where each entry is on a new line
             in the format: "[MM:SS] Text"
    """
    # Extract video ID from URL
    video_id_pattern = r'(?:v=|\/)([0-9A-Za-z_-]{11}).*'
    video_id_match = re.search(video_id_pattern, url)
    
    if not video_id_match:
        raise ValueError("Invalid YouTube URL")
    
    video_id = video_id_match.group(1)
    
    try:
        transcript = YouTubeTranscriptApi.get_transcript(video_id)
        
        # Format each entry with timestamp and text
        formatted_entries = []
        for entry in transcript:
            # Convert seconds to MM:SS format
            minutes = int(entry['start'] // 60)
            seconds = int(entry['start'] % 60)
            timestamp = f"[{minutes:02d}:{seconds:02d}]"
            
            formatted_entry = f"{timestamp} {entry['text']}"
            formatted_entries.append(formatted_entry)
        
        # Join all entries with newlines
        return "\n".join(formatted_entries)
    
    except Exception as e:
        # Chain the original exception so the underlying cause stays in the traceback
        raise Exception(f"Error fetching transcript: {e}") from e
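
The URL parsing and timestamp formatting inside the tool can be tried on their own, without fetching a transcript. The sketch below is not part of the original notebook: it reuses the tool's regex and arithmetic, but the helper names `extract_video_id` and `format_timestamp` are hypothetical and exist only for illustration.

```python
import re

# Same pattern the tool uses: an 11-character ID that follows "v=" or "/".
VIDEO_ID_PATTERN = r'(?:v=|\/)([0-9A-Za-z_-]{11}).*'


def extract_video_id(url: str) -> str:
    """Return the 11-character video ID, or raise ValueError if none is found."""
    match = re.search(VIDEO_ID_PATTERN, url)
    if not match:
        raise ValueError("Invalid YouTube URL")
    return match.group(1)


def format_timestamp(start_seconds: float) -> str:
    """Convert a start time in seconds to the tool's [MM:SS] format."""
    minutes = int(start_seconds // 60)
    seconds = int(start_seconds % 60)
    return f"[{minutes:02d}:{seconds:02d}]"


# Short and long URL forms yield the same ID.
print(extract_video_id("https://youtu.be/ZaY5_ScmiFE"))
print(extract_video_id("https://www.youtube.com/watch?v=ZaY5_ScmiFE"))

# Fractional seconds are truncated; minutes are not wrapped into hours.
print(format_timestamp(429.5))
print(format_timestamp(3725))
```

Two behaviors are worth knowing when reading the tool's output. Entries past the one-hour mark keep counting minutes (3725 seconds becomes `[62:05]`). The pattern also does not check the domain, so any URL whose path starts with 11 ID-like characters matches; a stricter check would be needed if arbitrary links may be passed in.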

### define agent

In [5]:
agent = Agent(
    name="YouTube Transcript Agent",
    instructions=instructions,
    tools=[fetch_youtube_transcript],
)

### function to run agent

In [6]:
async def main():
    input_items = []

    print("=== YouTube Transcript Agent ===")
    print("Type 'exit' to end the conversation")
    print("Ask me anything about YouTube videos!")

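    while True:
        # Get user input
        user_input = input("\nYou: ").strip()

        # Check for exit command
        if user_input.lower() in ['exit', 'quit', 'bye']:
            print("\nGoodbye!")
            break

        # Skip empty input
        if not user_input:
            continue

        # Add the message to the history only after the checks above,
        # so empty inputs and exit commands never reach the agent
        input_items.append({"content": user_input, "role": "user"})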

        print("\nAgent: ", end="", flush=True)
        result = Runner.run_streamed(
            agent,
            input=input_items,
        )

        async for event in result.stream_events(): # not all events are available at outset, hence the async
            # Print text deltas from raw response events as they stream in
            if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
                print(event.data.delta, end="", flush=True)
            elif event.type == "agent_updated_stream_event":
                continue
            elif event.type == "run_item_stream_event":
                if event.item.type == "tool_call_item":
                    print("\n-- Fetching transcript...")
                elif event.item.type == "tool_call_output_item":
                    input_items.append({"content": f"Transcript:\n{event.item.output}", "role": "system"})
                    print("-- Transcript fetched.")
                elif event.item.type == "message_output_item":
                    # Store the message text rather than the string repr of the raw item
                    input_items.append({"content": ItemHelpers.text_message_output(event.item), "role": "assistant"})
                else:
                    pass  # Ignore other event types

        print("\n")  # Add a newline after each response

In [None]:
await main()

=== YouTube Transcript Agent ===
Type 'exit' to end the conversation
Ask me anything about YouTube videos!



You:  Can you summarize this video? https://youtu.be/ZaY5_ScmiFE



Agent: 
-- Fetching transcript...
-- Transcript fetched.
The video by Shaw introduces AI agents, explaining what they are and why they're significant. Three main types of AI agents with varying levels of agency are discussed, focusing on definitions from leading organizations like OpenAI, Hugging Face, and Anthropic.

1. **Definition Challenges**: The term "AI agent" lacks a universally agreed definition. Different organizations highlight different facets like autonomy, tool usage, and planning.

2. **Core Features**: AI agents typically involve large language models (LLMs), tool use, and a degree of autonomy. Tools help LLMs interact with the real world, overcoming limitations like hallucinations, and enhancing capabilities beyond text generation.

3. **Agent Levels**:
   - **Level 1**: Simple combination of LLMs with tools, allowing for basic tasks like web searches and code execution.
   - **Level 2**: LLM workflows involve predefined sequences of tasks, often using multiple LLMs. 


You:  Can you write chapter timestamps for this?



Agent: Certainly! Here are chapter timestamps for the video:

1. **Introduction to AI Agents** - [00:00](https://youtu.be/ZaY5_ScmiFE?t=0)
2. **Definitions and Challenges** - [00:21](https://youtu.be/ZaY5_ScmiFE?t=21)
3. **Core Features of AI Agents** - [01:59](https://youtu.be/ZaY5_ScmiFE?t=119)
4. **Levels of AI Agents** - [07:09](https://youtu.be/ZaY5_ScmiFE?t=429)
   - Level 1: LLM + Tools - [07:09](https://youtu.be/ZaY5_ScmiFE?t=429)
   - Level 2: LLM Workflows - [11:30](https://youtu.be/ZaY5_ScmiFE?t=690)
   - Level 3: LLM in a Loop - [19:05](https://youtu.be/ZaY5_ScmiFE?t=1145)
5. **Importance and Future of AI Agents** - [03:36](https://youtu.be/ZaY5_ScmiFE?t=216)
6. **Frameworks and Strategies** - [13:22](https://youtu.be/ZaY5_ScmiFE?t=802)
7. **Conclusion and Series Preview** - [22:52](https://youtu.be/ZaY5_ScmiFE?t=1372)

Feel free to click on the timestamps to jump to that section in the video!




You:  It seems like the chapter timestamps are not in order can you fix that?



Agent: Certainly! Here are the revised chapter timestamps:

1. **Introduction to AI Agents** - [00:00](https://youtu.be/ZaY5_ScmiFE?t=0)
2. **Definitions and Challenges** - [00:21](https://youtu.be/ZaY5_ScmiFE?t=21)
3. **Core Features of AI Agents** - [01:59](https://youtu.be/ZaY5_ScmiFE?t=119)
4. **Importance and Future of AI Agents** - [03:36](https://youtu.be/ZaY5_ScmiFE?t=216)
5. **Levels of AI Agents** - [07:09](https://youtu.be/ZaY5_ScmiFE?t=429)
   - **Level 1: LLM + Tools** - [07:09](https://youtu.be/ZaY5_ScmiFE?t=429)
   - **Level 2: LLM Workflows** - [11:30](https://youtu.be/ZaY5_ScmiFE?t=690)
   - **Level 3: LLM in a Loop** - [19:05](https://youtu.be/ZaY5_ScmiFE?t=1145)
6. **Frameworks and Strategies** - [13:22](https://youtu.be/ZaY5_ScmiFE?t=802)
7. **Conclusion and Series Preview** - [22:52](https://youtu.be/ZaY5_ScmiFE?t=1372)

Feel free to click on the timestamps to jump to specific sections in the video!
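
The `?t=` value in each link above is the timestamp expressed in seconds (for example, 07:09 is 7 × 60 + 9 = 429). Read as a flat list, the revised chapters are still not strictly chronological, since 13:22 is listed after 19:05. A minimal sketch, not part of the original notebook, of ordering the entries in code rather than asking the model to do it; the helpers `mmss_to_seconds` and `chapter_link` are hypothetical names, and the chapter data is copied from the reply above.

```python
def mmss_to_seconds(stamp: str) -> int:
    """Convert an "MM:SS" string to a whole number of seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def chapter_link(video_url: str, stamp: str) -> str:
    """Build a markdown link that jumps to the given timestamp."""
    return f"[{stamp}]({video_url}?t={mmss_to_seconds(stamp)})"


# Entries from the agent's revised reply, flattened and in the order listed.
chapters = [
    ("Introduction to AI Agents", "00:00"),
    ("Definitions and Challenges", "00:21"),
    ("Core Features of AI Agents", "01:59"),
    ("Importance and Future of AI Agents", "03:36"),
    ("Levels of AI Agents", "07:09"),
    ("Level 1: LLM + Tools", "07:09"),
    ("Level 2: LLM Workflows", "11:30"),
    ("Level 3: LLM in a Loop", "19:05"),
    ("Frameworks and Strategies", "13:22"),
    ("Conclusion and Series Preview", "22:52"),
]

# sorted() is stable, so entries sharing a timestamp keep their listed order.
ordered = sorted(chapters, key=lambda chapter: mmss_to_seconds(chapter[1]))

video_url = "https://youtu.be/ZaY5_ScmiFE"
for title, stamp in ordered:
    print(f"{title} - {chapter_link(video_url, stamp)}")
```

Sorting only reorders the entries; it does not check that the model assigned each chapter the right timestamp, so the result still needs to be compared against the transcript.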

