# L2: Building a Simple Web Agent

<p style="background-color:#fff6e4; padding:15px; border-width:3px; border-color:#f5ecda; border-style:solid; border-radius:6px"> ⏳ <b>Note <code>(Kernel Starting)</code>:</b> This notebook takes about 30 seconds to be ready to use. You may start and watch the video while you wait.</p>

In [1]:
import asyncio
import json
import os
import nest_asyncio
import pprint
import base64
from io import BytesIO
import pandas as pd
from playwright.async_api import async_playwright
from openai import OpenAI
from PIL import Image
from tabulate import tabulate
from IPython.display import display, HTML, Markdown
from pydantic import BaseModel
from helper import get_openai_api_key, visualizeCourses

<div style="background-color:#fff6ff; padding:13px; border-width:3px; border-color:#efe6ef; border-style:solid; border-radius:6px">
<p> 💻 &nbsp; <b>Access <code>requirements.txt</code> and <code>helper.py</code> files:</b> 1) click on the <em>"File"</em> option on the top menu of the notebook and then 2) click on <em>"Open"</em>.</p>

<p> ⬇ &nbsp; <b>Download Notebooks:</b> 1) click on the <em>"File"</em> option on the top menu of the notebook and then 2) click on <em>"Download as"</em> and select <em>"Notebook (.ipynb)"</em>.</p>

<p> 📒 &nbsp; For more help, please see the <em>"Appendix – Tips, Help, and Download"</em> Lesson.</p>
</div>

In [2]:
client = OpenAI(api_key=get_openai_api_key())
nest_asyncio.apply()

## WebScraper Agent

In [3]:
class WebScraperAgent:
    def __init__(self):
        self.playwright = None
        self.browser = None
        self.page = None

    async def init_browser(self):
        # Start Playwright, launch headless Chromium, and open a single page.
        self.playwright = await async_playwright().start()
        self.browser = await self.playwright.chromium.launch(
            headless=True,
            args=[
                "--disable-dev-shm-usage",
                "--no-sandbox",
                "--disable-setuid-sandbox",
                "--disable-accelerated-2d-canvas",
                "--disable-gpu",
                "--no-zygote",
                "--disable-audio-output",
                "--disable-software-rasterizer",
                "--disable-webgl",
                "--disable-web-security",
                # Repeated --disable-features switches can override one
                # another, so both features are listed in a single flag.
                "--disable-features=LazyFrameLoading,IsolateOrigins",
                "--disable-background-networking",
            ],
        )
        self.page = await self.browser.new_page()

    async def scrape_content(self, url):
        if not self.page or self.page.is_closed():
            await self.init_browser()
        await self.page.goto(url, wait_until="load")
        await self.page.wait_for_timeout(2000)  # Wait for dynamic content
        return await self.page.content()

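    async def take_screenshot(self, path="screenshot.png"):
        # Save a full-page screenshot to disk and return its path.
        await self.page.screenshot(path=path, full_page=True)
        return path

    async def screenshot_buffer(self):
        # Return PNG bytes of the visible viewport only (not the full page).
        screenshot_bytes = await self.page.screenshot(type="png", full_page=False)
        return screenshot_bytes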

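    async def close(self):
        # Guard against a browser that never started (e.g. the launch failed),
        # since close() is also called from a `finally` block.
        if self.browser:
            await self.browser.close()
        if self.playwright:
            await self.playwright.stop()
        self.playwright = None
        self.browser = None
        self.page = None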

In [4]:
scraper = WebScraperAgent()

## Structured Data Format

In [5]:
class DeeplearningCourse(BaseModel):
    title: str
    description: str
    presenter: list[str]
    imageUrl: str
    courseURL: str

class DeeplearningCourseList(BaseModel):
    courses: list[DeeplearningCourse]
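
The two models above act as the contract for the LLM's output. As a minimal sketch (assuming Pydantic v2, and using a made-up payload whose `courseURL` path is purely illustrative), the snippet below shows how a JSON string is validated into `DeeplearningCourseList` and how a payload with missing fields is rejected:

```python
from pydantic import BaseModel, ValidationError


class DeeplearningCourse(BaseModel):
    title: str
    description: str
    presenter: list[str]
    imageUrl: str
    courseURL: str


class DeeplearningCourseList(BaseModel):
    courses: list[DeeplearningCourse]


# Hypothetical payload shaped like the structured response (not real scraped data).
raw = (
    '{"courses": [{"title": "Pydantic for LLM Workflows", '
    '"description": "Build reliable LLM applications.", '
    '"presenter": ["DeepLearning.AI"], '
    '"imageUrl": "", '
    '"courseURL": "/short-courses/example"}]}'
)

parsed = DeeplearningCourseList.model_validate_json(raw)
first = parsed.courses[0]
print(first.title, first.presenter)

# A course missing required fields fails validation instead of passing silently.
try:
    DeeplearningCourseList.model_validate_json('{"courses": [{"title": "Only a title"}]}')
    rejected = False
except ValidationError:
    rejected = True
print("incomplete payload rejected:", rejected)
```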

## LLM Client for OpenAI

In [6]:
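async def process_with_llm(html, instructions, truncate=False):
    # NOTE: `truncate` is currently unused; the HTML is always capped below.
    # `client` is the synchronous OpenAI client, so this call blocks the
    # event loop until the response arrives.
    completion = client.beta.chat.completions.parse(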
        model="gpt-4o-mini-2024-07-18",
        messages=[{
            "role": "system",
            "content": f"""
            You are an expert web scraping agent. Your task is to:
            Extract relevant information from this HTML to JSON 
            following these instructions:
            {instructions}
            
            Extract the title, description, presenter,
            image URL, and course URL for every course
            on the deeplearning.ai website.

            Return ONLY valid JSON, no markdown or extra text."""
        }, {
            "role": "user",
            "content": html[:150000]  # Cap at 150,000 characters to stay under token limits
        }],
        temperature=0.1,
        response_format=DeeplearningCourseList,
        )
    return completion.choices[0].message.parsed
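
Because the HTML is cut off at a fixed character count, anything past that point never reaches the model. One possible way to fit more useful markup into the budget is to strip obvious noise first. The sketch below is a rough heuristic, not part of the notebook: `strip_noise` is a hypothetical helper, and regex-based cleanup can also discard data that lives inside `<script>` tags (embedded JSON, for example).

```python
import re


def strip_noise(html: str) -> str:
    """Drop <script>/<style> blocks and HTML comments, then collapse whitespace."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", "", html)
    html = re.sub(r"(?s)<!--.*?-->", "", html)
    return re.sub(r"\s+", " ", html).strip()


# Small illustrative page (not real scraped content).
sample = """
<html><head><style>.a { color: red; }</style>
<script>console.log("tracking");</script></head>
<body><!-- nav --><h2>Course A</h2>   <p>Intro</p></body></html>
"""

cleaned = strip_noise(sample)
print(len(sample), "->", len(cleaned))
print(cleaned)
```

A cleaned string like this could then be sliced to the same character cap before being sent as the user message.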

In [7]:
async def webscraper(target_url, instructions):
    # Initialize both values so the final return is safe even if a step fails.
    result = None
    screenshot = None
    try:
        # Scrape content and capture screenshot
        print("Extracting HTML Content \n")
        html_content = await scraper.scrape_content(target_url)

        print("Taking Screenshot \n")
        screenshot = await scraper.screenshot_buffer()

        # Process content
        print("Processing..")
        result: DeeplearningCourseList = await process_with_llm(html_content, instructions, False)
        print("\nGenerated Structured Response")
    except Exception as e:
        print(f"❌ Error: {e}")
    finally:
        await scraper.close()
    return result, screenshot

## Example 1

In [8]:
target_url = "https://www.deeplearning.ai/courses"  # DeepLearning.AI courses
base_url = "https://deeplearning.ai"
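
A separate `base_url` is useful when scraped links are relative paths. How `visualizeCourses` actually uses it is not shown in this notebook, so the following is only a sketch of one plausible approach: `absolutize` is a hypothetical helper, and the example paths are made up.

```python
from urllib.parse import urljoin

base_url = "https://deeplearning.ai"


def absolutize(link: str, base: str = base_url) -> str:
    """Resolve a possibly relative link against the site's base URL."""
    return urljoin(base, link)


# Hypothetical links of the kind a model might return.
relative = absolutize("/short-courses/example/")
absolute = absolutize("https://cdn.example.com/img.png")

print(relative)
print(absolute)
```

Root-relative paths are joined onto the base, while links that are already absolute pass through unchanged.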

In [9]:
instructions = """
    Get all the courses
"""
result, screenshot = await webscraper(target_url, instructions)

Extracting HTML Content 

Taking Screenshot 

Processing..

Generated Structured Response


In [10]:
await visualizeCourses(result=result, 
                       screenshot=screenshot, 
                       target_url=target_url, 
                       instructions=instructions, 
                       base_url=base_url)

### Scraped Course Data:

title,description,presenter,imageUrl,courseURL
Building and Evaluating Data Agents,"Build, evaluate, and improve a multi-agent system that plans its steps, connects to data sources, and provides insights.",Snowflake,,Building and Evaluating Data Agents
Build AI Apps with MCP Servers: Working with Box Files,Build an LLM app that uses tools from the Box MCP server to discover Box files and extract text from them. Transform it into a multi-agent system that communicates using A2A.,Box,,Build AI Apps with MCP Servers: Working with Box Files
Knowledge Graphs for AI Agent API Discovery,Construct a knowledge graph and use it to enable your AI agent to find and call the right APIs in the right order.,SAP,,Knowledge Graphs for AI Agent API Discovery
Agentic Knowledge Graph Construction,"Build a multi-agent system that plans, designs, and constructs a knowledge graph.",Neo4j,,Agentic Knowledge Graph Construction
Fast Prototyping of GenAI Apps with Streamlit,"Prototype and deploy GenAI apps using an MVP workflow, prompt engineering, and RAG.",Snowflake,,Fast Prototyping of GenAI Apps with Streamlit
Claude Code: A Highly Agentic Coding Assistant,"Explore, build, and refine codebases with Claude Code.",Anthropic,,Claude Code: A Highly Agentic Coding Assistant
Pydantic for LLM Workflows,Build reliable LLM applications with structured outputs and validated data using Pydantic.,DeepLearning.AI,,Pydantic for LLM Workflows
Retrieval Augmented Generation (RAG),"Gain fundamental understanding and the practical knowledge to develop production-ready RAG applications, from architecture to deployment and evaluation.",DeepLearning.AI,,Retrieval Augmented Generation (RAG)
Post-training of LLMs,"Adapt LLMs for specific tasks and behaviors using post-training techniques like SFT, DPO, and online RL.","University of Washington, NexusFlow",,Post-training of LLMs
ACP: Agent Communication Protocol,Build agents that communicate and collaborate across different frameworks using ACP.,IBM Research's BeeAI,,ACP: Agent Communication Protocol


### Website Screenshot:

## Example with RAG courses

In [11]:
subject = "Retrieval Augmented Generation (RAG)"

instructions = f"""
Read the description of the courses and only 
provide the three courses that are about {subject}. 
Make sure that we don't have any other
courses in the output
"""
result, screenshot = await webscraper(target_url, instructions)

Extracting HTML Content 

Taking Screenshot 

Processing..

Generated Structured Response


In [12]:
await visualizeCourses(result=result, 
                       screenshot=screenshot, 
                       target_url=target_url, 
                       instructions=instructions, 
                       base_url=base_url)

### Scraped Course Data:

title,description,presenter,imageUrl,courseURL
Retrieval Augmented Generation (RAG),"Gain fundamental understanding and the practical knowledge to develop production-ready RAG applications, from architecture to deployment and evaluation.",DeepLearning.AI,,Retrieval Augmented Generation (RAG)
Fast Prototyping of GenAI Apps with Streamlit,"Prototype and deploy GenAI apps using an MVP workflow, prompt engineering, and RAG.",Snowflake,,Fast Prototyping of GenAI Apps with Streamlit
Building AI Apps with MCP Servers: Working with Box Files,Build an LLM app that uses tools from the Box MCP server to discover Box files and extract text from them. Transform it into a multi-agent system that communicates using A2A.,Box,,Building AI Apps with MCP Servers: Working with Box Files
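
Note that the third row above does not mention RAG in its title or description, so the model's filtering is only approximate. A deterministic check can flag such rows. The sketch below copies the three rows from the table and applies a simple keyword test; `mentions_rag` is a hypothetical helper, and keyword matching is a crude stand-in for real relevance judgment.

```python
import re

# Titles and descriptions copied from the scraped table above.
courses = [
    {
        "title": "Retrieval Augmented Generation (RAG)",
        "description": "Gain fundamental understanding and the practical knowledge to develop "
                       "production-ready RAG applications, from architecture to deployment and evaluation.",
    },
    {
        "title": "Fast Prototyping of GenAI Apps with Streamlit",
        "description": "Prototype and deploy GenAI apps using an MVP workflow, prompt engineering, and RAG.",
    },
    {
        "title": "Building AI Apps with MCP Servers: Working with Box Files",
        "description": "Build an LLM app that uses tools from the Box MCP server to discover Box files "
                       "and extract text from them. Transform it into a multi-agent system that "
                       "communicates using A2A.",
    },
]

RAG_PATTERN = re.compile(r"\bRAG\b|retrieval augmented generation", re.IGNORECASE)


def mentions_rag(course: dict) -> bool:
    """Return True if the title or description explicitly mentions RAG."""
    return bool(RAG_PATTERN.search(course["title"] + " " + course["description"]))


matches = [c["title"] for c in courses if mentions_rag(c)]
flagged = [c["title"] for c in courses if not mentions_rag(c)]
print("explicit RAG mention:", matches)
print("needs review:", flagged)
```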


### Website Screenshot:

## Challenges in Web Agents

In [13]:
subject = "Retrieval Augmented Generation (RAG)"
instructions = f"""
Can you get the summary of the top course on
{subject} and provide the learnings from it?
"""
result, screenshot = await webscraper(target_url, instructions)

Extracting HTML Content 

Taking Screenshot 

Processing..

Generated Structured Response


In [14]:
await visualizeCourses(result=result,
                       screenshot=screenshot,
                       target_url=target_url,
                       instructions=instructions,
                       base_url=base_url)

### Scraped Course Data:

title,description,presenter,imageUrl,courseURL
Retrieval Augmented Generation (RAG),"Gain fundamental understanding and the practical knowledge to develop production-ready RAG applications, from architecture to deployment and evaluation.",DeepLearning.AI,,Retrieval Augmented Generation (RAG)
AI Python for Beginners,"Learn Python programming with AI assistance. Gain skills writing, testing, and debugging code efficiently, and create real-world AI applications.",DeepLearning.AI,,AI Python for Beginners
ChatGPT Prompt Engineering for Developers,"Learn the fundamentals of prompt engineering for ChatGPT. Learn effective prompting, and how to use LLMs for summarizing, inferring, transforming, and expanding.",OpenAI,,ChatGPT Prompt Engineering for Developers
Generative AI for Everyone,"Learn how to use generative AI's capabilities & limitations. Get an overview of real-world examples, and impact on business & society for effective strategies.",DeepLearning.AI,,Generative AI for Everyone
Machine Learning Specialization,"Learn foundational AI concepts through an intuitive visual approach, then learn the code needed to implement the algorithms and math for ML.",Stanford Online,,Machine Learning Specialization
Multi AI Agent Systems with crewAI,Automate business workflows with multi-AI agent systems. Exceed the performance of prompting a single LLM by designing and prompting a team of AI agents through natural language.,crewAI,,Multi AI Agent Systems with crewAI
Building and Evaluating Data Agents,"Build, evaluate, and improve a multi-agent system that plans its steps, connects to data sources, and provides insights.",Snowflake,,Building and Evaluating Data Agents
Build AI Apps with MCP Servers: Working with Box Files,Build an LLM app that uses tools from the Box MCP server to discover Box files and extract text from them. Transform it into a multi-agent system that communicates using A2A.,Box,,Build AI Apps with MCP Servers: Working with Box Files
Knowledge Graphs for AI Agent API Discovery,Construct a knowledge graph and use it to enable your AI agent to find and call the right APIs in the right order.,SAP,,Knowledge Graphs for AI Agent API Discovery
Agentic Knowledge Graph Construction,"Build a multi-agent system that plans, designs, and constructs a knowledge graph.",Neo4j,,Agentic Knowledge Graph Construction
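
The output above lists ten courses and no summary, even though the instructions asked for one course and its learnings. One contributing factor is that the response is forced into `DeeplearningCourseList`, which has no field for a summary. The sketch below (assuming Pydantic v2) makes that gap explicit; `CourseSummary` is a hypothetical schema, not something defined in this notebook, and the listing page may not contain the course learnings at all.

```python
from pydantic import BaseModel


class DeeplearningCourse(BaseModel):
    title: str
    description: str
    presenter: list[str]
    imageUrl: str
    courseURL: str


# Hypothetical schema for a "summarize one course" task.
class CourseSummary(BaseModel):
    title: str
    courseURL: str
    summary: str
    learnings: list[str]


# The original schema cannot carry learnings, whatever the prompt says.
original_has_learnings = "learnings" in DeeplearningCourse.model_fields
summary_has_learnings = "learnings" in CourseSummary.model_fields
print(original_has_learnings, summary_has_learnings)
```

Even with a schema like this, the agent would likely need to open the individual course page to find the learnings, which a single-page scrape does not do.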


### Website Screenshot: