# Browser-Use Agent Sample

Enable AI to control your browser 🤖. Browser-Use is a library for connecting your AI agents to the browser.

This notebook demonstrates the Browser-Use agent, a tool that lets large language models (LLMs) control a browser and interact with web content for AI-driven web browsing and data extraction.

For comprehensive documentation and more information, visit the official GitHub repository:
[https://github.com/browser-use/browser-use](https://github.com/browser-use/browser-use)

In the following examples, we'll explore setting up and utilizing the Browser-Use agent to programmatically interact with websites and extract valuable information.


In [2]:
# Install required packages
%pip install browser-use langchain_openai

Collecting argparse>=1.0 (from lmnr>=0.4.53->lmnr[langchain]>=0.4.53->browser-use)
  Using cached argparse-1.4.0-py2.py3-none-any.whl.metadata (2.8 kB)
Using cached argparse-1.4.0-py2.py3-none-any.whl (23 kB)
Installing collected packages: argparse
Successfully installed argparse-1.4.0
Note: you may need to restart the kernel to use updated packages.


In [1]:
# Setup playwright
!playwright install

Downloading Chromium 131.0.6778.33 (playwright build v1148) from https://playwright.azureedge.net/builds/chromium/1148/chromium-mac.zip
Chromium 131.0.6778.33 (playwright build v1148) downloaded to /Users/kan/Library/Caches/ms-playwright/chromium-1148
Downloading Chromium Headless Shell 131.0.6778.33 (playwright build v1148) from https://playwright.azureedge.net/builds/chromium/1148/chromium-headless-shell-mac.zip
Chromium Headless Shell 131.0.6778.33 (playwright build v1148) downloaded to /Users/kan/Library/Caches/ms-playwright/chromium_headless_shell-1148
Downloading Firefox 132.0 (playwright build v1466) from https://playwright.azureedge.net/builds/firefox/1466/firefox-mac.zip
Firefox 132.0 (playwright build v1466) downloaded to /Users/kan/Library/Caches/ms-playwright/firefox-1466
Downloading Webkit 18.2 (playwright build v2104) from https://playwright.azureedge.net/builds/webkit/2104/webkit-mac-15.zip
Webkit 18.2 (playwright build v2104) download

## Choose LLM

For this example, we'll use Azure OpenAI as our Language Model provider. However, Browser-Use is flexible and can work with various LLM clients. Here are a few options:

1. OpenAI
2. Azure OpenAI (used in this example)
3. DeepSeek-R1 (recently supported)
...

Choose the LLM client that best suits your needs and API access. Make sure to adjust the import statement and initialization accordingly.

Note: Remember to set up your environment variables or API keys for the chosen LLM provider before running the code.
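
Before creating the client, it can help to confirm that the credentials were actually loaded. The following is a minimal sketch, assuming the Azure client reads `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_ENDPOINT` from the environment (the names `langchain_openai` conventionally uses; adjust them for your provider). `missing_env_vars` is a hypothetical helper, not part of Browser-Use.

```python
import os

# Assumed variable names for Azure OpenAI; for plain OpenAI you would
# typically check OPENAI_API_KEY instead.
REQUIRED_VARS = ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT"]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Run this after `load_dotenv()` so that values from a local `.env` file are taken into account.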


In [2]:
# Load environment variables
from dotenv import load_dotenv
load_dotenv()

# OpenAI 
# from langchain_openai import ChatOpenAI
# openai = ChatOpenAI(model="gpt-4o")

# Azure OpenAI
from langchain_openai import AzureChatOpenAI
openai = AzureChatOpenAI(model='gpt-4o', api_version='2024-02-15-preview')

In [5]:
# Define the task (it can be written in plain natural language)
task = "Go to TechCrunch and find the first news about AI"

# Initialize Agent
from browser_use import Agent

agent = Agent(
    task=task, # Task description
    llm=openai # Language model
)

# Run Agent
result = await agent.run()
print(result)

INFO     [agent] 🚀 Starting task: Go to TechCrunch and find the first news about AI
INFO     [agent] 
📍 Step 1
INFO     [agent] 👍 Eval: Success - The browser is open and ready for interaction.
INFO     [agent] 🧠 Memory: Opened a blank tab, need to navigate to TechCrunch.
INFO     [agent] 🎯 Next goal: Navigate to TechCrunch's website to look for the news.
INFO     [agent] 🛠️  Action 1/1: {"go_to_url":{"url":"https://techcrunch.com"}}
INFO     [controller] 🔗  Navigated to https://techcrunch.com
INFO     [agent] 
📍 Step 2
INFO     [agent] 👍 Eval: Success - Navigated to the TechCrunch website.
INFO     [agent] 🧠 Memory: Browsing on TechCrunch, looking for AI news.
INFO     [agent] 🎯 Next goal: Identify the first news related to AI and extract it.
INFO     [agent] 🛠️  Action 1/1: {"click_element":{"index":25}}
INFO     [controller] 🖱️  Clicked button with index 25: Why Reid Hoffman feels optimistic about our AI future
INFO     [agent] 
📍 Step 3
INFO     [agent] 👍 Eval: Success - The article

## Running the Browser-Use Agent

When you execute the code above, you'll see the following process:

1. A Chromium browser window will open automatically.
2. The browser will navigate to the TechCrunch website (https://techcrunch.com).
3. The AI agent will search for and identify the first news article related to AI.
4. Once found, the agent will extract the relevant information.
5. The browser window will close.
6. The extracted information will be returned and printed in the notebook.

![Screenshot](../../../data/screenshots/browser-use/screenshot-1.png)

This automated process demonstrates the power of the Browser-Use agent in controlling a real browser to interact with web content and extract specific information based on the given task.

Note: The actual browser window may flash briefly on your screen as the agent performs its task.
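
Printing the object returned by `agent.run()` dumps the whole run history, which can be long. Below is a minimal sketch for pulling out only the final answer, assuming the returned object exposes a `final_result()` method; this is an assumption about the installed Browser-Use version, so check it before relying on it. `extract_final_text` and `FakeHistory` are hypothetical names used for illustration.

```python
def extract_final_text(result):
    """Return the agent's final answer if available, else a string dump."""
    final_result = getattr(result, "final_result", None)
    if callable(final_result):
        text = final_result()
        if text is not None:
            return text
    # Fall back to the plain string form when no final answer is exposed.
    return str(result)


# Stand-in object so the sketch runs without a browser session.
class FakeHistory:
    def final_result(self):
        return "example answer"


print(extract_final_text(FakeHistory()))
```

In the notebook, the call would be `print(extract_final_text(result))` in place of `print(result)`.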


In [None]:
# Make the task more difficult
task = "Go to booking.com and find the cheapest hotel in Okinawa for 2 nights from 2025-08-01 to 2025-08-03"

agent = Agent(
    task=task, # Task description
    llm=openai # Language model
)

result = await agent.run()
print(result)


INFO     [agent] 🚀 Starting task: Go to booking.com and find the cheapest hotel in Okinawa for 2 nights from 2025-08-01 to 2025-08-03
INFO     [agent] 
📍 Step 1
INFO     [agent] 🤷 Eval: Unknown - The page is blank, no previous actions to evaluate.
INFO     [agent] 🧠 Memory: Task requires searching for a hotel in Okinawa from 2025-08-01 to 2025-08-03 on Booking.com.
INFO     [agent] 🎯 Next goal: Navigate to Booking.com.
INFO     [agent] 🛠️  Action 1/1: {"go_to_url":{"url":"https://www.booking.com"}}
INFO     [controller] 🔗  Navigated to https://www.booking.com
INFO     [agent] 
📍 Step 2
INFO     [agent] 👍 Eval: Success - Successfully navigated to Booking.com.
INFO     [agent] 🧠 Memory: On Booking.com homepage, need to fill in location and date details to search for hotel in Okinawa.
INFO     [agent] 🎯 Next goal: Initiate a search for hotels in Okinawa including setting check-in and check-out dates.
INFO     [agent] 🛠️  Action 1/3: {"click_element":{"index":32}}
INFO     [agent] 🛠️  Acti

> Note: The current implementation of the agent cannot reliably set the date range on booking.com. Instead of selecting the specified dates (2025-08-01 to 2025-08-03), the agent keeps scrolling and clicking into individual hotels. Because no dates are selected, the website shows no prices, so the agent cannot find or compare hotel prices.
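
One possible workaround, sketched here, is to put the dates in the URL so that the agent never has to operate the date picker. The query parameter names (`ss`, `checkin`, `checkout`, `group_adults`) are assumptions based on how Booking.com search URLs commonly look, not a documented API, and they may change. `build_search_url` is a hypothetical helper, and the default of two adults is also an assumption, since the original task does not specify the number of guests.

```python
from datetime import date
from urllib.parse import urlencode


def build_search_url(destination, checkin, checkout, adults=2):
    """Build a search-results URL with the stay dates pre-filled."""
    if checkout <= checkin:
        raise ValueError("checkout must be after checkin")
    params = {
        "ss": destination,                 # assumed: free-text destination
        "checkin": checkin.isoformat(),    # assumed: YYYY-MM-DD
        "checkout": checkout.isoformat(),  # assumed: YYYY-MM-DD
        "group_adults": adults,            # assumed: number of adults
    }
    return "https://www.booking.com/searchresults.html?" + urlencode(params)


checkin, checkout = date(2025, 8, 1), date(2025, 8, 3)
url = build_search_url("Okinawa", checkin, checkout)
nights = (checkout - checkin).days

task = (
    f"Open {url} and find the cheapest hotel listed "
    f"for the {nights}-night stay"
)
print(task)
```

The resulting `task` string can be passed to `Agent(task=task, llm=openai)` in the same way as the other tasks in this notebook. Whether the agent then succeeds was not tested here.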

In [6]:
# Let's make it easier
task = "Go to price.com.hk to find the Nvidia RTX 4090 latest price"

agent = Agent(
    task=task, # Task description
    llm=openai # Language model
)

result = await agent.run()
print(result)


INFO     [agent] 🚀 Starting task: Go to price.com.hk to find the Nvidia RTX 4090 latest price
INFO     [agent] 
📍 Step 1
INFO     [agent] ⚠ Eval: Failed - I am currently on a blank page which does not provide any information about the Nvidia RTX 4090 price.
INFO     [agent] 🧠 Memory: Start browsing price.com.hk to find Nvidia RTX 4090 latest price.
INFO     [agent] 🎯 Next goal: Navigate to price.com.hk to search for the Nvidia RTX 4090 price.
INFO     [agent] 🛠️  Action 1/1: {"go_to_url":{"url":"https://www.price.com.hk"}}
ERROR    [agent] ❌ Result failed 1/3 times:
 Error executing action go_to_url: Page.goto: Timeout 30000ms exceeded.
Call log:
  - navigating to "https://www.price.com.hk/", waiting until "load"

INFO     [agent] 
📍 Step 2
INFO     [agent] 🤷 Eval: Unknown - Page loaded successfully with interactive options.
INFO     [agent] 🧠 Memory: Plan to search for Nvidia RTX 4090 on the price.com.hk page using the search box.
INFO     [agent] 🎯 Next goal: Dismiss any cookies mess

## Results and Observations

When attempting to scrape price.com.hk, the agent encountered several challenges:
1. The website appeared to be stuck loading: the first navigation attempt timed out after 30 seconds, as the `Page.goto` error in the log shows.
2. An advertisement popup appeared and blocked the screen, preventing the agent from accessing the main content.
3. After failing to interact with price.com.hk, the agent fell back to a Google search to find the information instead.

![Screenshot 2](../../../data/screenshots/browser-use/screenshot-2.png)

Screenshot 2 shows the advertisement popup blocking the screen.

![Screenshot 3](../../../data/screenshots/browser-use/screenshot-3.png)

Screenshot 3 shows the agent going to Google to search for the result.

This demonstrates some common challenges in web scraping, such as dealing with slow-loading pages, handling popup advertisements, and the agent's attempt to find alternative data sources when the primary source is inaccessible.
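
Slow pages like the one in this run are often handled with a bounded retry. The following is a generic `asyncio` sketch of that idea, not Browser-Use's internal retry logic. `retry_async` and `flaky_navigation` are hypothetical names, and the stub stands in for a real navigation call.

```python
import asyncio


async def retry_async(make_coro, attempts=3, timeout=30.0, delay=0.0):
    """Await make_coro() up to `attempts` times, each bounded by `timeout` seconds."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return await asyncio.wait_for(make_coro(), timeout=timeout)
        except (asyncio.TimeoutError, ConnectionError) as exc:
            last_error = exc
            print(f"Attempt {attempt}/{attempts} failed: {type(exc).__name__}")
            if attempt < attempts:
                await asyncio.sleep(delay)
    raise RuntimeError(f"all {attempts} attempts failed") from last_error


# Stub that fails once, then succeeds, standing in for a slow page load.
calls = {"count": 0}


async def flaky_navigation():
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("simulated slow page")
    return "page loaded"


outcome = asyncio.run(retry_async(flaky_navigation, attempts=3, timeout=1.0))
print(outcome)
```

Inside a notebook, where an event loop is already running, replace the `asyncio.run(...)` call with `await retry_async(...)`.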