# YouTube Agent with OpenAI Agents SDK
## ABB #5 - Session 4

Code authored by: Shaw Talebi

**Resources**
- [YouTube video](https://youtu.be/-BUs1CPHKfU)
- [Blog post](https://medium.com/@shawhin/how-to-improve-llms-with-tools-69cc68c804ed?sk=3ffd8308ce4905617b136a02cfa8dd83)

### imports

In [1]:
from youtube_transcript_api import YouTubeTranscriptApi
import re
from agents import Agent, function_tool, Runner
from openai.types.responses import ResponseTextDeltaEvent
from dotenv import load_dotenv
import asyncio

In [2]:
import logging
# Suppress httpx INFO logs to reduce console output
logging.getLogger("httpx").setLevel(logging.WARNING)

In [3]:
# import environment variables from .env file
load_dotenv()

True

### define instructions

In [4]:
instructions = "You provide help with tasks related to YouTube videos."

### define tool

In [5]:
@function_tool
def fetch_youtube_transcript(url: str) -> str:
    """
    Extract transcript with timestamps from a YouTube video URL and format it for LLM consumption
    
    Args:
        url (str): YouTube video URL
        
    Returns:
        str: Formatted transcript with timestamps, where each entry is on a new line
             in the format: "[MM:SS] Text"
    """
    # Extract video ID from URL
    video_id_pattern = r'(?:v=|\/)([0-9A-Za-z_-]{11}).*'
    video_id_match = re.search(video_id_pattern, url)
    
    if not video_id_match:
        raise ValueError("Invalid YouTube URL")
    
    video_id = video_id_match.group(1)
    
    try:
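        # Static helper from older youtube_transcript_api releases; newer releases
        # may expose a different call, so pin the package version if this fails.
        transcript = YouTubeTranscriptApi.get_transcript(video_id)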
        
        # Format each entry with timestamp and text
        formatted_entries = []
        for entry in transcript:
            # Convert seconds to MM:SS format
            minutes = int(entry['start'] // 60)
            seconds = int(entry['start'] % 60)
            timestamp = f"[{minutes:02d}:{seconds:02d}]"
            
            formatted_entry = f"{timestamp} {entry['text']}"
            formatted_entries.append(formatted_entry)
        
        # Join all entries with newlines
        return "\n".join(formatted_entries)
    
    except Exception as e:
        # Chain the original exception so the underlying cause stays in the traceback
        raise Exception(f"Error fetching transcript: {str(e)}") from e
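
The tool above does two things before returning: it pulls the video ID out of the URL and it renders each entry as `[MM:SS] Text`. The sketch below isolates those two steps so they can be checked without a network call. The helper names `extract_video_id` and `format_entry` and the sample entries are hypothetical and are not part of the notebook; the regex and the arithmetic are copied from the tool.

```python
import re

# Same pattern as in fetch_youtube_transcript above
VIDEO_ID_PATTERN = r'(?:v=|\/)([0-9A-Za-z_-]{11}).*'


def extract_video_id(url: str) -> str:
    """Hypothetical helper: return the 11-character video ID found in a URL."""
    match = re.search(VIDEO_ID_PATTERN, url)
    if not match:
        raise ValueError("Invalid YouTube URL")
    return match.group(1)


def format_entry(start: float, text: str) -> str:
    """Hypothetical helper: render one transcript entry as "[MM:SS] Text"."""
    minutes = int(start // 60)
    seconds = int(start % 60)
    return f"[{minutes:02d}:{seconds:02d}] {text}"


# Made-up entries in the same shape the tool reads (dicts with 'start' and 'text')
sample_transcript = [
    {"start": 0.0, "text": "hey everyone"},
    {"start": 114.6, "text": "three key features"},
    {"start": 3725.0, "text": "past the one hour mark"},
]

print(extract_video_id("https://youtu.be/ZaY5_ScmiFE"))
print(extract_video_id("https://www.youtube.com/watch?v=ZaY5_ScmiFE"))
print("\n".join(format_entry(e["start"], e["text"]) for e in sample_transcript))
```

Both URL shapes yield the same ID. Because the format has no hours field, minutes keep counting past 59 for videos longer than an hour (the last sample entry renders as `[62:05]`).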

### create agent

In [6]:
agent = Agent(
    name="YouTube Transcript Agent",
    instructions=instructions,
    tools=[fetch_youtube_transcript],
)

### main() function

In [7]:
async def main():
    input_items = []

    print("=== YouTube Transcript Agent ===")
    print("Type 'exit' to end the conversation")
    print("Ask me anything about YouTube videos!")

    while True:
        # Get user input
        user_input = input("\nYou: ").strip()

        # Check for exit command before recording anything in the history
        if user_input.lower() in ['exit', 'quit', 'bye']:
            print("\nGoodbye!")
            break

        # Skip empty input without adding it to the history
        if not user_input:
            continue

        input_items.append({"content": user_input, "role": "user"})

        print("\nAgent: ", end="", flush=True)
        result = Runner.run_streamed(
            agent,
            input=input_items,
        )

        async for event in result.stream_events():  # events arrive incrementally as the run progresses, hence the async for
            # Print raw response text deltas as they stream in
            if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
                print(event.data.delta, end="", flush=True)
            elif event.type == "run_item_stream_event":
                if event.item.type == "tool_call_item":
                    print("\n-- Fetching transcript...")
                elif event.item.type == "tool_call_output_item":
                    input_items.append({"content": f"Transcript:\n{event.item.output}", "role": "system"})
                    print("-- Transcript fetched.")
                elif event.item.type == "message_output_item":
                    input_items.append({"content": f"{event.item.raw_item.content[0].text}", "role": "assistant"})
                else:
                    pass  # Ignore other event types

        print("\n")  # Add a newline after each response

In [8]:
await main()
# what is this video about? https://youtu.be/ZaY5_ScmiFE

=== YouTube Transcript Agent ===
Type 'exit' to end the conversation
Ask me anything about YouTube videos!



You:  what is this video about? https://youtu.be/ZaY5_ScmiFE



Agent: 
-- Fetching transcript...
-- Transcript fetched.
The video is an introduction to a larger series about AI agents by Shaw. It starts by discussing what AI agents are and their significance. It reviews definitions from major organizations like OpenAI, Hugging Face, and Anthropic, comparing their emphasis on tools, planning, and autonomy.

The video identifies three key features common to AI agents:

1. **Large Language Models (LLMs)**: Central to AI agents, playing a major role in how they're defined.
   
2. **Tool Use**: Enhances the capabilities of LLMs, allowing them to interact with the real world, perform web searches, run Python scripts, etc.

3. **Autonomy**: Different definitions focus on autonomy, the ability for agents to plan and dynamically control how they solve tasks.

The video also discusses different levels of AI agentic systems:

- **Level 1**: LLM plus tools.
- **Level 2**: LLM workflows, like email responders.
- **Level 3**: LLM in a loop, improving outputs t


You:  Can you write chapter timestamps for it?



Agent: Sure, here are the chapter timestamps based on the video content:

1. **Introduction to AI Agents** - [00:00](#00:00)
2. **Definitions of AI Agents** - [00:28](#00:28)
3. **Key Features of AI Agents** - [01:54](#01:54)
   - Large Language Models (LLMs)
   - Tool Use
   - Autonomy
4. **Comparison with Chatbots** - [03:36](#03:36)
5. **Levels of Agentic Systems** - [06:36](#06:36)
   - Level 1: LLM + Tools
   - Level 2: LLM Workflows
   - Level 3: LLM in a Loop
6. **Workflow Design Patterns** - [14:01](#14:01)
   - Chaining
   - Routing
   - Parallelization
   - Orchestrator-Workers
   - Evaluator-Optimizer
7. **Example of LLM in a Loop** - [18:50](#18:50)
8. **Conclusion and Series Preview** - [22:52](#22:52) 

These timestamps provide a structured overview of the topics covered in the video.




You:  Can you provide youtube links for each chapter?



Agent: Certainly! Here are the chapter timestamps with YouTube links:

1. **[Introduction to AI Agents](https://youtu.be/ZaY5_ScmiFE?t=0)**
2. **[Definitions of AI Agents](https://youtu.be/ZaY5_ScmiFE?t=28)**
3. **[Key Features of AI Agents](https://youtu.be/ZaY5_ScmiFE?t=114)**
   - Large Language Models (LLMs)
   - Tool Use
   - Autonomy
4. **[Comparison with Chatbots](https://youtu.be/ZaY5_ScmiFE?t=216)**
5. **[Levels of Agentic Systems](https://youtu.be/ZaY5_ScmiFE?t=396)**
   - Level 1: LLM + Tools
   - Level 2: LLM Workflows
   - Level 3: LLM in a Loop
6. **[Workflow Design Patterns](https://youtu.be/ZaY5_ScmiFE?t=841)**
   - Chaining
   - Routing
   - Parallelization
   - Orchestrator-Workers
   - Evaluator-Optimizer
7. **[Example of LLM in a Loop](https://youtu.be/ZaY5_ScmiFE?t=1130)**
8. **[Conclusion and Series Preview](https://youtu.be/ZaY5_ScmiFE?t=1372)**

You can click on these links to jump directly to each section in the video.




You:  exit



Goodbye!
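
In the conversation above, the agent turned each `MM:SS` chapter timestamp into a `?t=<seconds>` link on its own. If deterministic links are preferred over relying on the model's arithmetic, the conversion could be done in code. This is a minimal sketch; `timestamp_to_seconds` and `chapter_link` are hypothetical helpers, not part of the notebook or the Agents SDK.

```python
def timestamp_to_seconds(timestamp: str) -> int:
    """Hypothetical helper: convert "MM:SS" or "[MM:SS]" to whole seconds."""
    minutes, seconds = timestamp.strip("[]").split(":")
    return int(minutes) * 60 + int(seconds)


def chapter_link(video_id: str, timestamp: str) -> str:
    """Hypothetical helper: build a youtu.be link that starts at the timestamp."""
    return f"https://youtu.be/{video_id}?t={timestamp_to_seconds(timestamp)}"


# Chapter timestamps copied from the agent's reply above
chapters = ["00:00", "00:28", "01:54", "03:36", "06:36", "14:01", "18:50", "22:52"]
for ts in chapters:
    print(ts, chapter_link("ZaY5_ScmiFE", ts))
```

The `t=` values this produces (0, 28, 114, 216, 396, 841, 1130, 1372) match the links the agent generated in this run.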




In [9]:
# # to run in a .py script use
# if __name__ == "__main__":
#     asyncio.run(main())