# Google Gemini 2.0 Browser Demo 

This notebook demonstrates how to use Google Gemini 2.0 with `browser-use` to automate real browser interactions. Browser Use provides a powerful yet simple interface that enables AI to control your browser, making websites accessible for AI agents. Key features include direct browser control (navigate, click, interact with elements), complex workflow automation (multi-step tasks), and real-world applications like writing in Google Docs, automated job applications, flight searches, and data collection.

**Example Use Cases**

- Document Management: Write documents in Google Docs, save files in different formats (PDF, etc.), organize and manage content
- Job Search Automation: Parse CVs and match with job listings, bulk apply to relevant positions, open applications in multiple tabs
- Travel Planning: Search flights across platforms, compare prices and options, track travel deals
- Data Collection: Scrape and analyze web content, save structured data to files, sort and filter information


Learn more about Browser Use capabilities in the [Browser Use Documentation](https://docs.browser-use.com/introduction). Browser Use uses the Playwright library and LangChain under the hood.

In [None]:
# Note: Browser Use requires Python 3.11+
%pip install langchain-google-genai "browser-use==0.1.40" pillow
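
Because the install note asks for Python 3.11+, it can help to check the interpreter version before going further. The following is a minimal sketch using only the standard library; the helper name `python_is_supported` and the constant `MIN_PYTHON` are made up for illustration and are not part of Browser Use.

```python
import sys

# Minimum version taken from the install note; adjust if the requirement changes.
MIN_PYTHON = (3, 11)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if not python_is_supported():
    # Only warn; whether to stop here is left to the reader.
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} detected; "
        f"{MIN_PYTHON[0]}.{MIN_PYTHON[1]} or newer is expected."
    )
```

Running this cell first makes a version mismatch visible right away, rather than as an import or syntax error in a later cell.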

Browser Use uses `playwright` to control the browser. Playwright works best with its own Chromium browser. We can run `playwright install` to install the latest version of the Chromium browser.

_Note: This downloads a supported version of Chromium to your machine. This is the same version that is used by the Browser Use library._

In [None]:
!playwright install


In [6]:
from langchain_google_genai import ChatGoogleGenerativeAI
from browser_use import Agent, Browser, BrowserContextConfig, BrowserConfig
from browser_use.browser.browser import BrowserContext
from pydantic import SecretStr
import os
os.environ["ANONYMIZED_TELEMETRY"] = "false" # Disable telemetry https://docs.browser-use.com/development/telemetry
from dotenv import load_dotenv
load_dotenv()  # Loads the GEMINI_API_KEY from the .env file

# Initialize the model
llm = ChatGoogleGenerativeAI(model='gemini-2.0-flash', api_key=SecretStr(os.getenv('GEMINI_API_KEY')))

# Configure the browser to connect to your Chrome instance
browser = Browser(
    config=BrowserConfig(
        # If you are not using the Playwright-managed browser, specify the path to your
        # Chrome executable. The path differs between operating systems.
        # chrome_instance_path='/Applications/Google Chrome.app',  # default macOS path
        # For Windows, typically: 'C:\Program Files\Google\Chrome\Application\chrome.exe'
        # For Linux, typically: '/usr/bin/google-chrome' or '/usr/bin/chrome'
    )
)
context_config = BrowserContextConfig(
    wait_for_network_idle_page_load_time=5.0, 
    # browser_window_size={'width': 768, 'height': 768},
    highlight_elements=True,
    save_recording_path='./recordings'
)
context = BrowserContext(browser=browser, config=context_config)
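
The setup cell passes `os.getenv('GEMINI_API_KEY')` directly to `SecretStr`. `os.getenv` returns `None` when the variable is missing, so a forgotten `.env` entry may only surface later as a less obvious error. One possible guard is a small helper that fails early with a clear message. This is a sketch only: `require_env` is a hypothetical name, not a function of `browser-use`, `python-dotenv`, or LangChain.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it "
            "before running the notebook."
        )
    return value


# Possible use in the setup cell (assumes load_dotenv() has already run):
# api_key = SecretStr(require_env("GEMINI_API_KEY"))
```

The helper treats an empty or whitespace-only value the same as a missing one, which is a design choice of this sketch rather than a requirement of the libraries involved.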


In [7]:
# Create agent with the model
agent = Agent(
    task="What is the cheapest flight from Berlin to San Francisco?",
    llm=llm,
    browser_context=context,
    use_vision=True,  # Enable vision capabilities, default is True
)
result = await agent.run()
# Manually close the browser
await browser.close()

print(result.final_result())

INFO     [agent] 🚀 Starting task: What is the cheapest flight from Berlin to San Francisco?


INFO     [agent] 📍 Step 1


I0000 00:00:1740409852.589767 18282214 fork_posix.cc:75] Other threads are currently calling into gRPC, skipping fork() handlers






INFO     [agent] 🤷 Eval: The previous goal was to start the browser. It was successfull.
INFO     [agent] 🧠 Memory: I have to find the cheapest flight from Berlin to San Francisco. I am on step 1/100.
INFO     [agent] 🎯 Next goal: Search on google for the cheapest flight from Berlin to San Francisco.
INFO     [agent] 🛠️  Action 1/1: {"search_google":{"query":"cheapest flight from Berlin to San Francisco"}}
INFO     [controller] 🔍  Searched for "cheapest flight from Berlin to San Francisco" in Google
INFO     [agent] 📍 Step 2
INFO     [agent] 🤷 Eval: I searched for cheapest flights from Berlin to San Francisco. It was successfull.
INFO     [agent] 🧠 Memory: I have to find the cheapest flight from Berlin to San Francisco. I am on step 2/100.
INFO     [agent] 🎯 Next goal: Accept cookies.
INFO     [agent] 🛠️  Action 1/1: {"click_element":{"index":5}}
INFO     [controller] 🖱️  Clicked button with index 5: Alle akzeptieren
INFO     [agent] 📍 Step 3
INFO     [agent] 🤷 Eval: I accepted the coo