# YouTube Agent with OpenAI Agents SDK
## ABB #6 - Session 4

Code authored by: Shaw Talebi

**Resources**
- [YouTube video](https://youtu.be/-BUs1CPHKfU)
- [Blog post](https://medium.com/@shawhin/how-to-improve-llms-with-tools-69cc68c804ed?sk=3ffd8308ce4905617b136a02cfa8dd83)

### imports

In [1]:
from youtube_transcript_api import YouTubeTranscriptApi
import re
from agents import Agent, function_tool, Runner
from openai.types.responses import ResponseTextDeltaEvent
from dotenv import load_dotenv
import asyncio

In [2]:
import logging
# Suppress httpx INFO logs to reduce console output
logging.getLogger("httpx").setLevel(logging.WARNING)

In [3]:
# import environment variables from .env file
load_dotenv()

True

### define instructions

In [4]:
instructions = "You provide help with tasks related to YouTube videos."

### define tool

In [5]:
@function_tool
def fetch_youtube_transcript(url: str) -> str:
    """
    Extract transcript with timestamps from a YouTube video URL and format it for LLM consumption
    
    Args:
        url (str): YouTube video URL
        
    Returns:
        str: Formatted transcript with timestamps, where each entry is on a new line
             in the format: "[MM:SS] Text"
    """
    # Extract video ID from URL
    video_id_pattern = r'(?:v=|\/)([0-9A-Za-z_-]{11}).*'
    video_id_match = re.search(video_id_pattern, url)
    
    if not video_id_match:
        raise ValueError("Invalid YouTube URL")
    
    video_id = video_id_match.group(1)
    
    try:
        ytt_api = YouTubeTranscriptApi()
        transcript = ytt_api.fetch(video_id)
        
        # Format each entry with timestamp and text
        formatted_entries = []
        for entry in transcript:
            # Convert seconds to MM:SS format
            minutes = int(entry.start // 60)
            seconds = int(entry.start % 60)
            timestamp = f"[{minutes:02d}:{seconds:02d}]"
            
            formatted_entry = f"{timestamp} {entry.text}"
            formatted_entries.append(formatted_entry)
        
        # Join all entries with newlines
        return "\n".join(formatted_entries)
    
    except Exception as e:
        # Chain the original exception so the root cause stays in the traceback
        raise RuntimeError(f"Error fetching transcript: {e}") from e
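
The two pure-Python steps inside the tool (pulling the video ID out of the URL and turning a start time in seconds into a `[MM:SS]` label) can be checked without any network call. The sketch below reuses the same regex and arithmetic as the tool above; the helper names `extract_video_id` and `format_timestamp` are hypothetical and only exist for illustration.

```python
import re

# Same pattern as in fetch_youtube_transcript above
VIDEO_ID_PATTERN = r'(?:v=|\/)([0-9A-Za-z_-]{11}).*'


def extract_video_id(url: str) -> str:
    """Hypothetical helper: return the 11-character video ID or raise ValueError."""
    match = re.search(VIDEO_ID_PATTERN, url)
    if not match:
        raise ValueError("Invalid YouTube URL")
    return match.group(1)


def format_timestamp(start_seconds: float) -> str:
    """Hypothetical helper: convert seconds to the [MM:SS] label used by the tool."""
    minutes = int(start_seconds // 60)
    seconds = int(start_seconds % 60)
    return f"[{minutes:02d}:{seconds:02d}]"


# Short and long URL forms both resolve to the same ID
print(extract_video_id("https://youtu.be/ZaY5_ScmiFE"))
print(extract_video_id("https://www.youtube.com/watch?v=ZaY5_ScmiFE&t=28s"))

# Minutes are not wrapped into hours, so long videos go past 59
print(format_timestamp(688.4))
print(format_timestamp(3725))
```

Note that because the format is minutes and seconds only, an entry at 1 h 2 min 5 s is labelled `[62:05]` rather than `[01:02:05]`.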

### create agent

In [6]:
agent = Agent(
    name="YouTube Transcript Agent",
    instructions=instructions,
    tools=[fetch_youtube_transcript],
)

### main() function

In [7]:
async def main():
    input_items = []

    print("=== YouTube Transcript Agent ===")
    print("Type 'exit' to end the conversation")
    print("Ask me anything about YouTube videos!")

    while True:
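        # Get user input
        user_input = input("\nYou: ").strip()

        # Check for exit command
        if user_input.lower() in ['exit', 'quit', 'bye']:
            print("\nGoodbye!")
            break

        # Skip empty input
        if not user_input:
            continue

        # Add the message to the history only after the checks above,
        # so empty strings and exit commands are never sent to the agent
        input_items.append({"content": user_input, "role": "user"})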

        print("\nAgent: ", end="", flush=True)
        result = Runner.run_streamed(
            agent,
            input=input_items,
        )

        # Events arrive incrementally while the run is in progress, hence the async iteration
        async for event in result.stream_events():
            # Print text deltas from raw response events as they arrive
            if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
                print(event.data.delta, end="", flush=True)
            elif event.type == "run_item_stream_event":
                if event.item.type == "tool_call_item":
                    print("\n-- Fetching transcript...")
                elif event.item.type == "tool_call_output_item":
                    input_items.append({"content": f"Transcript:\n{event.item.output}", "role": "system"})
                    print("-- Transcript fetched.")
                elif event.item.type == "message_output_item":
                    input_items.append({"content": f"{event.item.raw_item.content[0].text}", "role": "assistant"})
                else:
                    pass  # Ignore other event types

        print("\n")  # Add a newline after each response
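
To see how `main()` builds up the conversation history, the sketch below replays the `run_item_stream_event` branch against stand-in objects. These are plain `SimpleNamespace` instances, not real SDK events; they only carry the attributes that `main()` reads (`type`, `output`, `raw_item.content[0].text`). The helper name `update_history` is hypothetical.

```python
from types import SimpleNamespace


def update_history(input_items: list, item) -> None:
    """Hypothetical helper mirroring how main() records stream items."""
    if item.type == "tool_call_output_item":
        # The transcript is stored so follow-up questions can reuse it
        input_items.append({"content": f"Transcript:\n{item.output}", "role": "system"})
    elif item.type == "message_output_item":
        # The agent's final text is stored as an assistant turn
        input_items.append({"content": item.raw_item.content[0].text, "role": "assistant"})
    # Other item types (e.g. "tool_call_item") leave the history unchanged


# Stand-in items for one turn: tool call, tool output, final message
history = [{"content": "what is this video about?", "role": "user"}]
fake_items = [
    SimpleNamespace(type="tool_call_item"),
    SimpleNamespace(type="tool_call_output_item", output="[00:00] Hi"),
    SimpleNamespace(
        type="message_output_item",
        raw_item=SimpleNamespace(content=[SimpleNamespace(text="It is about AI agents.")]),
    ),
]

for item in fake_items:
    update_history(history, item)

for message in history:
    print(message["role"], "|", message["content"][:30])
```

Because the transcript stays in the history, later turns can be answered without calling the tool again, which matches the session output where only the first question triggers a fetch.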

In [8]:
await main()
# what is this video about? https://youtu.be/ZaY5_ScmiFE

=== YouTube Transcript Agent ===
Type 'exit' to end the conversation
Ask me anything about YouTube videos!



You:  what is this video about? https://youtu.be/ZaY5_ScmiFE



Agent: 
-- Fetching transcript...
-- Transcript fetched.
**Summary of the Video: "What is an AI Agent? (2025: The Year of AI Agents)"**  
[https://youtu.be/ZaY5_ScmiFE](https://youtu.be/ZaY5_ScmiFE)

**Host:** Shaw

**Overview:**
This is the first video in a series about AI agents. Shaw introduces what AI agents are, why they matter, and the different kinds of AI agents you'll encounter.

---

### Key Points:

#### 1. **What is an AI Agent?**
- **2025 is being called “the year of the AI agent.”**
- There is no single, agreed-upon definition of an AI agent. Different organizations provide slightly different perspectives:
    - **OpenAI:** An agent is a large language model (LLM) with instructions and tools.
    - **Hugging Face:** An agent is a system where an LLM executes complex tasks using planning and tools.
    - **Anthropic:** An agent is an LLM-based system that autonomously directs its own processes and tool use.

#### 2. **Three Key Features of AI Agents:**
- **1. LLM Involvem


You:  Can you write me a LinkedIn post about the design patterns?



Agent: Absolutely! Here’s a LinkedIn post summarizing the design patterns for building AI agent workflows, inspired by the video:

---

🚀 **Unlocking the Power of AI Agents: Essential Design Patterns for Workflows**

As AI agents move from hype to reality, understanding how to structure them is key for building robust, reliable, and scalable solutions. Recently, I reviewed several common **design patterns** used in agentic workflows—here’s a quick breakdown that can help teams design smarter, more autonomous LLM-powered systems:

🔗 **Chaining:**  
Think assembly lines. Each module does a specific task and hands its output to the next step. Perfect for processes like content creation: one module generates ideas, another selects the best, another drafts the post.

🛣️ **Routing:**  
Branch your workflows. Use agent “decision nodes” to direct different types of inputs down custom paths—for example, sorting incoming emails into “spam,” “easy reply,” and “requires human attention.”

⚡ **Par


You:  That was terrible. Just give me the chapter timestamps for the video.



Agent: Certainly! Here are the main chapters and their corresponding timestamps for the video "[What is an AI Agent? (2025: The Year of AI Agents)](https://youtu.be/ZaY5_ScmiFE)":

---

### Chapter Timestamps

- **00:00 – Introduction:**  
  Shaw introduces himself and the topic: understanding AI agents.

- **00:28 – Why AI Agents Matter & Confusion in Definitions:**  
  Discusses the 2025 "year of AI agents" and lack of a universal definition.

- **00:36 – Definitions from OpenAI, Hugging Face, Anthropic:**  
  Reviews how different organizations define AI agents.

- **02:00 – Core Features of AI Agents:**  
  LLMs, tool use, and autonomy.

- **03:36 – Why Agents Are Exciting:**  
  The shift from chatbots to agents and the promise of ROI.

- **05:16 – Tools Take LLMs into Reality:**  
  Importance of tool use and "test-time compute scaling".

- **06:36 – Agency as a Spectrum:**  
  Introduction to agentic systems and levels of agency.

- **07:10 – Level 1: LLM + Tools:**  
  Simple 


You:  Please provide links



Agent: Here are chapter timestamps from the video "[What is an AI Agent? (2025: The Year of AI Agents)](https://youtu.be/ZaY5_ScmiFE)" with clickable links to jump to each section:

---

### Chapter Timestamps with Links

- **[00:00 – Introduction](https://youtu.be/ZaY5_ScmiFE?t=0)**
- **[00:28 – Why AI Agents Matter & Confusion in Definitions](https://youtu.be/ZaY5_ScmiFE?t=28)**
- **[00:36 – Definitions from OpenAI, Hugging Face, Anthropic](https://youtu.be/ZaY5_ScmiFE?t=36)**
- **[02:00 – Core Features of AI Agents](https://youtu.be/ZaY5_ScmiFE?t=120)**
- **[03:36 – Why Agents Are Exciting](https://youtu.be/ZaY5_ScmiFE?t=216)**
- **[05:16 – Tools Take LLMs into Reality](https://youtu.be/ZaY5_ScmiFE?t=316)**
- **[06:36 – Agency as a Spectrum](https://youtu.be/ZaY5_ScmiFE?t=396)**
- **[07:10 – Level 1: LLM + Tools](https://youtu.be/ZaY5_ScmiFE?t=430)**
- **[11:28 – Limitations of Level 1 and Introduction to Level 2](https://youtu.be/ZaY5_ScmiFE?t=688)**
- **[11:31 – Level 2: LLM Work


You:  exit



Goodbye!
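
The `?t=` links in the output above were computed by the model itself. As a quick sanity check, the conversion from an `MM:SS` label to a seconds offset can be sketched deterministically. The helper name `timestamp_to_link` is hypothetical, and the sketch assumes the video URL has no existing query string.

```python
def timestamp_to_link(video_url: str, timestamp: str) -> str:
    """Hypothetical helper: turn an "MM:SS" label into a ?t=<seconds> link.

    Assumes video_url carries no query string of its own.
    """
    minutes, seconds = (int(part) for part in timestamp.split(":"))
    return f"{video_url}?t={minutes * 60 + seconds}"


# Two of the chapter timestamps from the session above
print(timestamp_to_link("https://youtu.be/ZaY5_ScmiFE", "07:10"))
print(timestamp_to_link("https://youtu.be/ZaY5_ScmiFE", "11:28"))
```

Both results agree with the links the agent produced (`t=430` and `t=688`).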


In [9]:
# To run this in a .py script (outside a notebook), use:
# if __name__ == "__main__":
#     asyncio.run(main())