# Browser Toolkit with LangGraph ReAct Agent

This notebook demonstrates how to use the AWS Browser Toolkit with a LangGraph ReAct agent to automate web browsing tasks. The Browser Toolkit provides tools for navigating websites, interacting with web elements, and extracting information.

## Setup and Installation

First, make sure you have the required packages installed:

In [None]:
%pip install -q langgraph langchain 'langchain-aws[tools]'

## Import Required Libraries

In [1]:
import asyncio
import logging

from langchain.chat_models import init_chat_model
from langchain_aws.tools import create_browser_toolkit
from langgraph.prebuilt import create_react_agent

# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

## Create Browser Toolkit

The Browser Toolkit provides tools for interacting with websites through a remote browser managed by AWS. It includes tools for navigating to URLs, clicking elements, extracting text, and more.

In [2]:
# Create the browser toolkit
# The toolkit keys browser sessions by thread ID, which is supplied
# later through config["configurable"]["thread_id"] when the agent is invoked
toolkit, browser_tools = create_browser_toolkit(region="us-west-2")

# Display available tools
print(f"Available browser tools: {[tool.name for tool in browser_tools]}")

Available browser tools: ['navigate_browser', 'click_element', 'navigate_back', 'extract_text', 'extract_hyperlinks', 'get_elements', 'current_webpage']
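Since `browser_tools` is a list of tool objects that each expose a `name`, it can be narrowed before it is handed to an agent. The following is a minimal sketch, assuming a read-only subset is wanted. `select_tools` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that only mimic the `name` attribute so the snippet runs without AWS access.

```python
from types import SimpleNamespace

# Tool names as printed by the cell above
TOOL_NAMES = [
    "navigate_browser", "click_element", "navigate_back", "extract_text",
    "extract_hyperlinks", "get_elements", "current_webpage",
]

# Stand-ins for the real tool objects (assumption: only `.name` matters here)
fake_tools = [SimpleNamespace(name=n) for n in TOOL_NAMES]


def select_tools(tools, allowed):
    """Return the tools whose name is in `allowed`, preserving the original order."""
    allowed = set(allowed)
    missing = allowed - {t.name for t in tools}
    if missing:
        # Fail early on typos instead of silently dropping a tool
        raise ValueError(f"Unknown tool names: {sorted(missing)}")
    return [t for t in tools if t.name in allowed]


read_only = select_tools(
    fake_tools,
    ["navigate_browser", "extract_text", "extract_hyperlinks", "current_webpage"],
)
print([t.name for t in read_only])
```

With the real toolkit, the same helper could be called as `select_tools(browser_tools, [...])` and the result passed as the agent's `tools`.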


## Initialize LLM

Set up the language model that powers the agent. This notebook uses Claude 3.5 Haiku through Amazon Bedrock, via the `bedrock_converse` provider.

In [3]:
# Initialize the language model using bedrock_converse provider
provider = "bedrock_converse"
model_id = "us.anthropic.claude-3-5-haiku-20241022-v1:0"
model_with_provider = f"{provider}:{model_id}"

# Create the model instance
model = init_chat_model(model_with_provider)
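The cell above joins the provider and model ID with a colon, and the Bedrock model ID itself ends in `v1:0`, so any code that takes such a string apart has to split on the first colon only. A small sketch of that round trip follows; `build_model_spec` and `split_model_spec` are hypothetical helpers for illustration, not part of LangChain.

```python
def build_model_spec(provider: str, model_id: str) -> str:
    """Join provider and model ID in the "provider:model_id" form used above."""
    return f"{provider}:{model_id}"


def split_model_spec(spec: str) -> tuple[str, str]:
    """Split on the FIRST colon only, because the model ID may contain colons."""
    provider, sep, model_id = spec.partition(":")
    if not sep or not provider or not model_id:
        raise ValueError(f"Expected 'provider:model_id', got {spec!r}")
    return provider, model_id


spec = build_model_spec(
    "bedrock_converse", "us.anthropic.claude-3-5-haiku-20241022-v1:0"
)
print(spec)
print(split_model_spec(spec))
```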

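## Create ReAct Agent

Now we'll create a ReAct agent using LangGraph's prebuilt `create_react_agent`. A ReAct (reasoning and acting) agent alternates between deciding on the next step and calling a tool, feeding each tool result back to the model until it can give a final answer.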

In [4]:
# Create the ReAct agent with browser tools
agent = create_react_agent(
    model=model,
    tools=browser_tools,
    # Customize the agent prompt for web browsing tasks
    prompt="""You are a web browsing assistant that can navigate websites, interact with web elements, and extract information. 
    Use the available browser tools to complete web browsing tasks. 
    When navigating websites, first go to the URL, then explore the page content to understand its structure before performing actions.
    For any search tasks, find the search box, enter the query, and submit the form.
    When extracting information, be precise about what elements you want to extract.
    
    Follow these steps for each task:
    1. Navigate to the required website
    2. Analyze the page structure
    3. Interact with elements as needed
    4. Extract and report requested information
    5. Provide a clear, concise summary of findings"""
)
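To make the reason, act, observe cycle concrete, here is a toy version of the loop in plain Python. It is an illustration only and not how `create_react_agent` is implemented: `scripted_model` is a hypothetical stand-in for the LLM, the two tools are fake functions that borrow names from the toolkit, and `PAGES` is made-up page content.

```python
# Made-up page content so the sketch runs offline
PAGES = {"https://example.com": "Example Domain. This domain is for use in examples."}
state = {"url": None}


def navigate_browser(url: str) -> str:
    """Fake tool: remember the current URL."""
    state["url"] = url
    return f"Navigated to {url}"


def extract_text() -> str:
    """Fake tool: return the text of the current page."""
    return PAGES.get(state["url"], "")


TOOLS = {"navigate_browser": navigate_browser, "extract_text": extract_text}


def scripted_model(messages):
    """Stand-in for the LLM: choose the next step from the tool results so far."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if len(tool_results) == 0:
        return {"tool": "navigate_browser", "args": {"url": "https://example.com"}}
    if len(tool_results) == 1:
        return {"tool": "extract_text", "args": {}}
    return {"answer": f"Page text: {tool_results[-1]['content']}"}


def react_loop(model, tools, query, max_steps=5):
    """Alternate between model decisions and tool calls until an answer appears."""
    messages = [{"role": "user", "content": query}]
    for _ in range(max_steps):
        decision = model(messages)  # reason: pick a tool or answer
        if "answer" in decision:
            return decision["answer"], messages
        observation = tools[decision["tool"]](**decision["args"])  # act
        messages.append(  # observe: feed the result back to the model
            {"role": "tool", "name": decision["tool"], "content": observation}
        )
    raise RuntimeError("Step limit reached without a final answer")


answer, trace = react_loop(scripted_model, TOOLS, "What does example.com say?")
print(answer)
print([m["name"] for m in trace if m["role"] == "tool"])
```

The step limit in `react_loop` is a design choice of this sketch: it keeps a model that never produces an answer from looping forever.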

## Execute the Agent

Now let's run the agent on a web browsing task. We'll set up a function to execute tasks and print the results.

In [11]:
async def run_browser_agent(query: str, session_id: str = "browser_session1"):
    """
    Run the browser agent on a specific query with session tracking
    
    Args:
        query: The task to perform
        session_id: Unique identifier for the browser session
        
    Returns:
        Agent response
    """
    try:
        # Configure the session ID for thread-aware tools
        config = {
            "configurable": {
                "thread_id": session_id
            }
        }
        
        # Invoke the agent with the query
        result = await agent.ainvoke(
            {
                "messages": [{"role": "user", "content": query}]
            },
            config=config
        )
        
        return result
    except Exception:
        # Log the full traceback, then re-raise so the caller still sees the failure
        logger.exception("Error running browser agent")
        raise
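Because the session is selected through `thread_id`, tasks that should not share a browser session can be given distinct IDs instead of the default `"browser_session1"`. A minimal sketch, assuming random suffixes are acceptable; `new_session_id` and `make_config` are hypothetical helpers, not part of the toolkit.

```python
import uuid


def new_session_id(prefix: str = "browser_session") -> str:
    """Build a session ID with a short random suffix."""
    return f"{prefix}-{uuid.uuid4().hex[:8]}"


def make_config(session_id: str) -> dict:
    """Build the same config shape that run_browser_agent passes to the agent."""
    return {"configurable": {"thread_id": session_id}}


ids = [new_session_id() for _ in range(3)]
configs = [make_config(s) for s in ids]
print(configs[0])
```

A generated ID could then be passed as `run_browser_agent(task, session_id=new_session_id())`.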

## Example Tasks

Let's run some example web browsing tasks to demonstrate the agent's capabilities.

In [14]:
# Example 1: Visit a website and extract information
task1 = "Go to https://news.ycombinator.com/ and tell me the titles of the top 5 posts"

result1 = await run_browser_agent(task1)
print("Task 1 Result:")
print(result1["messages"][-1].content)

INFO:langchain_aws.chat_models.bedrock_converse:Using Bedrock Converse API to generate response
INFO:langchain_aws.chat_models.bedrock_converse:Using Bedrock Converse API to generate response
INFO:langchain_aws.chat_models.bedrock_converse:Using Bedrock Converse API to generate response
INFO:langchain_aws.chat_models.bedrock_converse:Using Bedrock Converse API to generate response
INFO:langchain_aws.chat_models.bedrock_converse:Using Bedrock Converse API to generate response


Task 1 Result:
Based on the extracted text, here are the titles of the top 5 posts on Hacker News:

1. "LLM Inevitabilism" (1246 points)
2. "Kiro: A new agentic IDE" (955 points)
3. "Cognition (Devin AI) to Acquire Windsurf" (470 points)
4. "Apple's MLX adding CUDA support" (486 points)
5. "NIST Ion Clock Sets New Record for Most Accurate Clock in the World" (70 points)

These titles represent the most upvoted stories on Hacker News at the time of browsing, covering topics like AI, technology, and scientific advances.
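The five log lines above correspond to five model calls, and the intermediate steps are kept in `result1["messages"]`. The sketch below shows one way to list the tool calls and to read the final text when `content` is either a string or a list of content blocks. `message_text` and `tool_call_names` are hypothetical helpers, and the `SimpleNamespace` messages are stand-ins that mimic only the `content` and `tool_calls` attributes of LangChain messages.

```python
from types import SimpleNamespace


def message_text(message) -> str:
    """Return the text of a message whose content is a str or a list of blocks."""
    content = message.content
    if isinstance(content, str):
        return content
    parts = []
    for block in content:
        if isinstance(block, str):
            parts.append(block)
        elif isinstance(block, dict) and block.get("type") == "text":
            parts.append(block.get("text", ""))
    return "".join(parts)


def tool_call_names(messages) -> list[str]:
    """Collect the names of all tool calls requested by the model, in order."""
    names = []
    for m in messages:
        for call in getattr(m, "tool_calls", None) or []:
            names.append(call["name"])
    return names


# Stand-in conversation (assumption: shapes simplified for illustration)
fake_messages = [
    SimpleNamespace(type="human", content="Go to https://news.ycombinator.com/"),
    SimpleNamespace(
        type="ai",
        content=[{"type": "text", "text": "Navigating."}],
        tool_calls=[{"name": "navigate_browser",
                     "args": {"url": "https://news.ycombinator.com/"},
                     "id": "call_1"}],
    ),
    SimpleNamespace(type="tool", content="Navigated", name="navigate_browser"),
    SimpleNamespace(type="ai", content="Here are the titles.", tool_calls=[]),
]

print(tool_call_names(fake_messages))
print(message_text(fake_messages[-1]))
```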


In [15]:
# Example 2: Search for information
task2 = "Go to https://www.wikipedia.org, search for 'artificial intelligence', and summarize the first paragraph"

result2 = await run_browser_agent(task2)
print("Task 2 Result:")
print(result2["messages"][-1].content)

INFO:langchain_aws.chat_models.bedrock_converse:Using Bedrock Converse API to generate response
INFO:langchain_aws.chat_models.bedrock_converse:Using Bedrock Converse API to generate response
INFO:langchain_aws.chat_models.bedrock_converse:Using Bedrock Converse API to generate response
INFO:langchain_aws.chat_models.bedrock_converse:Using Bedrock Converse API to generate response
INFO:langchain_aws.chat_models.bedrock_converse:Using Bedrock Converse API to generate response
INFO:langchain_aws.chat_models.bedrock_converse:Using Bedrock Converse API to generate response


Task 2 Result:
Now, I'll find and extract the first paragraph about artificial intelligence:

The first paragraph reads:

"Artificial intelligence (AI) is the capability of computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals."

Would you like me to elaborate on any part of this summary?


## Clean Up Resources

When you are done, clean up the toolkit so that remote browser sessions are not left running.

In [None]:
# Clean up browser resources
await toolkit.cleanup()
print("Browser resources cleaned up successfully")
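If a task raises before the cleanup cell is reached, the browser session stays open. Wrapping the work in `try`/`finally` guarantees that cleanup runs either way. A minimal sketch: `FakeToolkit` is a hypothetical stand-in that only mimics the async `cleanup()` method used above, and `run_with_cleanup` is an illustrative helper.

```python
import asyncio


class FakeToolkit:
    """Stand-in for the real toolkit; only records that cleanup() was awaited."""

    def __init__(self):
        self.cleaned_up = False

    async def cleanup(self):
        self.cleaned_up = True


async def run_with_cleanup(toolkit, task):
    """Await `task()` and always clean up, even if the task raises."""
    try:
        return await task()
    finally:
        await toolkit.cleanup()


async def failing_task():
    raise RuntimeError("navigation failed")


async def main():
    toolkit = FakeToolkit()
    try:
        await run_with_cleanup(toolkit, failing_task)
    except RuntimeError as exc:
        print(f"Task failed: {exc}")
    print(f"Cleaned up: {toolkit.cleaned_up}")
    return toolkit.cleaned_up


# In a script; inside a notebook cell, use `cleaned = await main()` instead,
# because asyncio.run() cannot be called from an already running event loop.
cleaned = asyncio.run(main())
```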