# YouTube Agent with OpenAI Agents SDK
## ABB #5 - Session 4

Code authored by: Shaw Talebi

**Resources**
- [YouTube video](https://youtu.be/-BUs1CPHKfU)
- [Blog post](https://medium.com/@shawhin/how-to-improve-llms-with-tools-69cc68c804ed?sk=3ffd8308ce4905617b136a02cfa8dd83)

### imports

In [1]:
from youtube_transcript_api import YouTubeTranscriptApi
import re
from agents import Agent, function_tool, Runner
from openai.types.responses import ResponseTextDeltaEvent
from dotenv import load_dotenv
import asyncio

In [2]:
import logging
# Suppress httpx INFO logs to reduce console output
logging.getLogger("httpx").setLevel(logging.WARNING)

In [3]:
# import environment variables from .env file
load_dotenv()

True

### define instructions

In [4]:
instructions = "You provide help with tasks related to YouTube videos."

### define tool

In [5]:
@function_tool
def fetch_youtube_transcript(url: str) -> str:
    """
    Extract the transcript, with timestamps, from a YouTube video URL and format it for LLM consumption.
    
    Args:
        url (str): YouTube video URL
        
    Returns:
        str: Formatted transcript with timestamps, where each entry is on a new line
             in the format: "[MM:SS] Text"
    """
    # Extract video ID from URL
    video_id_pattern = r'(?:v=|\/)([0-9A-Za-z_-]{11}).*'
    video_id_match = re.search(video_id_pattern, url)
    
    if not video_id_match:
        raise ValueError("Invalid YouTube URL")
    
    video_id = video_id_match.group(1)
    
    try:
        transcript = YouTubeTranscriptApi.get_transcript(video_id)
        
        # Format each entry with timestamp and text
        formatted_entries = []
        for entry in transcript:
            # Convert seconds to MM:SS format
            minutes = int(entry['start'] // 60)
            seconds = int(entry['start'] % 60)
            timestamp = f"[{minutes:02d}:{seconds:02d}]"
            
            formatted_entry = f"{timestamp} {entry['text']}"
            formatted_entries.append(formatted_entry)
        
        # Join all entries with newlines
        return "\n".join(formatted_entries)
    
    except Exception as e:
        # Chain the original exception so the underlying traceback is preserved
        raise Exception(f"Error fetching transcript: {str(e)}") from e
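
The tool above combines two pure steps (pulling the video ID out of the URL and formatting entries as `[MM:SS] text`) with a network call. As a minimal sketch, those two steps can be split out and checked offline. The helper names `extract_video_id` and `format_entries` and the sample entries are hypothetical and not part of the original notebook; the regex and the formatting logic are copied from the tool.

```python
import re

# Same pattern the tool uses to pull an 11-character video ID out of a URL.
VIDEO_ID_PATTERN = r'(?:v=|\/)([0-9A-Za-z_-]{11}).*'


def extract_video_id(url: str) -> str:
    """Return the 11-character video ID, or raise ValueError if none is found."""
    match = re.search(VIDEO_ID_PATTERN, url)
    if not match:
        raise ValueError("Invalid YouTube URL")
    return match.group(1)


def format_entries(entries: list[dict]) -> str:
    """Format transcript entries as "[MM:SS] text" lines, one per entry."""
    lines = []
    for entry in entries:
        # Split the start time (in seconds) into whole minutes and seconds
        minutes, seconds = divmod(int(entry["start"]), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {entry['text']}")
    return "\n".join(lines)


# Hypothetical sample entries, shaped like the dicts the tool iterates over.
sample = [
    {"start": 0.0, "text": "Hi everyone"},
    {"start": 118.4, "text": "Key features"},
]

print(extract_video_id("https://youtu.be/ZaY5_ScmiFE"))
print(format_entries(sample))
```

Keeping these steps separate from the `YouTubeTranscriptApi` call makes it possible to test URL handling and timestamp formatting without network access.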

### create agent

In [6]:
agent = Agent(
    name="YouTube Transcript Agent",
    instructions=instructions,
    tools=[fetch_youtube_transcript],
)

### main() function

In [7]:
async def main():
    input_items = []

    print("=== YouTube Transcript Agent ===")
    print("Type 'exit' to end the conversation")
    print("Ask me anything about YouTube videos!")

    while True:
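        # Get user input
        user_input = input("\nYou: ").strip()

        # Check for exit command
        if user_input.lower() in ['exit', 'quit', 'bye']:
            print("\nGoodbye!")
            break

        # Skip empty input
        if not user_input:
            continue

        # Record the message only after it passes the checks above,
        # so empty strings never end up in the conversation history
        input_items.append({"content": user_input, "role": "user"})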

        print("\nAgent: ", end="", flush=True)
        result = Runner.run_streamed(
            agent,
            input=input_items,
        )

        # Events arrive incrementally as the run progresses, hence the async iteration
        async for event in result.stream_events():
            # Print text deltas from raw response events as they stream in
            if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
                print(event.data.delta, end="", flush=True)
            elif event.type == "run_item_stream_event":
                if event.item.type == "tool_call_item":
                    print("\n-- Fetching transcript...")
                elif event.item.type == "tool_call_output_item":
                    input_items.append({"content": f"Transcript:\n{event.item.output}", "role": "system"})
                    print("-- Transcript fetched.")
                elif event.item.type == "message_output_item":
                    input_items.append({"content": f"{event.item.raw_item.content[0].text}", "role": "assistant"})
                else:
                    pass  # Ignore other event types

        print("\n")  # Add a newline after each response

In [8]:
await main()
# Example prompt: what is this video about? https://youtu.be/ZaY5_ScmiFE

=== YouTube Transcript Agent ===
Type 'exit' to end the conversation
Ask me anything about YouTube videos!



You:  what is this video about? https://youtu.be/ZaY5_ScmiFE



Agent: 
-- Fetching transcript...
-- Transcript fetched.
The video is an introductory piece in a series on AI agents by Shaw. It explores the concept of AI agents, discussing their definitions and importance. Shaw examines varying definitions by organizations like OpenAI and Hugging Face and emphasizes key components of AI agents: the involvement of large language models (LLMs), tool usage, and autonomy.

Key Points:
1. **Definitions**: The video highlights differing definitions of AI agents, focusing on their configurations with instructions, tools, and planning capabilities.
2. **Components of AI Agents**:
   - **LLMs**: Central to AI agents, often augmented with additional software or frameworks.
   - **Tool Use**: Extends the capabilities of LLMs beyond text generation, enabling interactions with reality.
   - **Autonomy**: Involves planning, reasoning, and feedback for decision-making processes.

3. **Levels of Agency**:
   - **Level 1**: LLMs plus tools, useful for simple tasks.


You:  Can you create chapter timestamps for this?



Agent: Sure, here are the chapter timestamps based on the video content:

1. **Introduction to AI Agents** - 00:00
2. **Defining AI Agents** - 00:36
3. **Key Features of AI Agents** - 01:58
4. **Tools and Usage** - 02:35
5. **Autonomy and Agency** - 05:16
6. **Levels of Agency** - 07:05
7. **Level 1: LLMs Plus Tools** - 07:14
8. **Level 2: LLM Workflows** - 11:31
9. **Common Design Patterns** - 13:10
10. **Level 3: LLM in a Loop** - 19:06
11. **Future Video Content** - 22:52

These timestamps should help navigate the key sections of the video.




You:  Can you give me links to each chapter



Agent: Certainly! Here are the links to each chapter in the video:

1. [**Introduction to AI Agents**](https://youtu.be/ZaY5_ScmiFE?t=0)
2. [**Defining AI Agents**](https://youtu.be/ZaY5_ScmiFE?t=36)
3. [**Key Features of AI Agents**](https://youtu.be/ZaY5_ScmiFE?t=118)
4. [**Tools and Usage**](https://youtu.be/ZaY5_ScmiFE?t=155)
5. [**Autonomy and Agency**](https://youtu.be/ZaY5_ScmiFE?t=316)
6. [**Levels of Agency**](https://youtu.be/ZaY5_ScmiFE?t=425)
7. [**Level 1: LLMs Plus Tools**](https://youtu.be/ZaY5_ScmiFE?t=434)
8. [**Level 2: LLM Workflows**](https://youtu.be/ZaY5_ScmiFE?t=691)
9. [**Common Design Patterns**](https://youtu.be/ZaY5_ScmiFE?t=790)
10. [**Level 3: LLM in a Loop**](https://youtu.be/ZaY5_ScmiFE?t=1146)
11. [**Future Video Content**](https://youtu.be/ZaY5_ScmiFE?t=1372)

You can click these links to jump directly to each chapter in the video.




You:  exit



Goodbye!


INFO:httpx:HTTP Request: POST https://api.openai.com/v1/traces/ingest "HTTP/1.1 204 No Content"
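
In the last reply the agent converted each `MM:SS` chapter timestamp into a `?t=` offset in seconds (for example, 01:58 became `t=118`). A minimal sketch of that arithmetic, useful for spot-checking the generated links; the helper names `timestamp_to_seconds` and `chapter_link` are hypothetical and not part of the original notebook.

```python
def timestamp_to_seconds(timestamp: str) -> int:
    """Convert "MM:SS" or "HH:MM:SS" into a total number of seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        # Each step to the right is a unit 60 times smaller
        seconds = seconds * 60 + int(part)
    return seconds


def chapter_link(video_id: str, timestamp: str) -> str:
    """Build a youtu.be link that starts playback at the given timestamp."""
    return f"https://youtu.be/{video_id}?t={timestamp_to_seconds(timestamp)}"


# A few chapters taken from the agent's reply above
chapters = {
    "Defining AI Agents": "00:36",
    "Key Features of AI Agents": "01:58",
    "Future Video Content": "22:52",
}

for title, ts in chapters.items():
    print(f"{title}: {chapter_link('ZaY5_ScmiFE', ts)}")
```

Doing this conversion in code rather than leaving it to the model removes one source of arithmetic mistakes when the links are generated.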


In [None]:
# Chapter list kept for comparison with the agent's timestamps above
# Intro - 0:00
# What are AI agents? - 0:18
# Why Agents? - 3:37
# Level 1: LLM + Tool Use - 7:20
# Level 2: LLM Workflows - 11:29
# Level 3: LLM in a Loop - 19:05
# What's Next? - 22:52

In [9]:
# To run in a .py script, use:
# if __name__ == "__main__":
#     asyncio.run(main())