# Web Automation Agent with Steel and LangChain

This notebook demonstrates how to create an agent that can perform web automation tasks using Steel's browser infrastructure and LangChain's agent framework.

We'll create an agent that can:
1. Navigate web pages
2. Extract information
3. Handle CAPTCHAs and complex interactions
4. Use proxy networks for reliable access

## Setup

First, install the required packages:

In [None]:
!pip install langchain langchain-community langchain-openai playwright

Import the necessary modules:

In [None]:
import os
from langchain_community.document_loaders import SteelWebLoader
from langchain.agents import Tool, AgentExecutor, create_react_agent
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate

Set up your API keys, replacing the placeholders with your own values:

In [None]:
os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["STEEL_API_KEY"] = "your-steel-key"
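
Hard-coding keys in a notebook makes them easy to leak when the notebook is shared. As an alternative, here is a minimal sketch that reads each key from the environment and fails early when one is missing. `require_env` is a hypothetical helper, not part of Steel or LangChain.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage (assumes both variables were exported before the notebook was started):
# openai_key = require_env("OPENAI_API_KEY")
# steel_key = require_env("STEEL_API_KEY")
```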

## Create Web Automation Tools

Let's create some tools that our agent can use for web automation:

In [None]:
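def scrape_webpage(url: str, extract_strategy: str = "text") -> str:
    """Scrape content from a webpage using Steel."""
    # The agent may pass the URL with stray whitespace or quotes; strip them
    url = url.strip().strip("'\"")
    loader = SteelWebLoader(
        url,
        extract_strategy=extract_strategy,
        use_proxy=True,
        solve_captcha=True
    )
    documents = loader.load()
    # Return the first document's content, or an empty string if nothing loaded
    return documents[0].page_content if documents else ""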

def extract_structured_data(url: str) -> str:
    """Extract structured data (HTML) from a webpage."""
    return scrape_webpage(url, extract_strategy="html")

# Create tools for the agent
tools = [
    Tool(
        name="ScrapeWebpage",
        func=scrape_webpage,
        description="Useful for scraping text content from webpages. Input should be a URL."
    ),
    Tool(
        name="ExtractStructuredData",
        func=extract_structured_data,
        description="Useful for extracting structured HTML data from webpages. Input should be a URL."
    )
]
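
Each `Tool` receives a single string from the agent, and that string may arrive wrapped in quotes or without a scheme. Below is a small sketch of how the input could be normalized before it reaches the loader. `normalize_url` is a hypothetical helper, and defaulting to `https://` is an assumption.

```python
from urllib.parse import urlparse


def normalize_url(raw: str) -> str:
    """Clean up a URL string produced by an LLM and check that it looks fetchable."""
    # Drop whitespace and any quotes the model wrapped around the value
    url = raw.strip().strip("'\"").strip()
    # Assumption: a bare host such as "example.com" should be fetched over HTTPS
    if "://" not in url:
        url = "https://" + url
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Not a valid HTTP(S) URL: {raw!r}")
    return url
```

A tool function could call `normalize_url(url)` on its first line and let the `ValueError` surface to the agent as an observation.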

## Create the Agent

Now let's create an agent that can use these tools:

In [None]:
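# Create the agent prompt. create_react_agent requires the {tools}, {tool_names},
# and {agent_scratchpad} variables and raises a ValueError if any is missing.
prompt = PromptTemplate.from_template(
    """
You are a web automation expert that helps users extract information from websites.
You have access to Steel's browser automation capabilities through the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}
"""
)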

# Initialize the language model
llm = ChatOpenAI(temperature=0)

# Create the agent
agent = create_react_agent(llm, tools, prompt)

# Create the agent executor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
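
`create_react_agent` validates the prompt when the agent is built and raises a `ValueError` if `tools`, `tool_names`, or `agent_scratchpad` is missing. The same check can be sketched with the standard library to catch a malformed template before any model call. `missing_react_variables` is a hypothetical helper written for illustration.

```python
from string import Formatter

REQUIRED_REACT_VARIABLES = {"tools", "tool_names", "agent_scratchpad"}


def missing_react_variables(template: str) -> set[str]:
    """Return the required ReAct placeholders that a template string lacks."""
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion)
    found = {field for _, field, _, _ in Formatter().parse(template) if field}
    return REQUIRED_REACT_VARIABLES - found
```

An empty set means the template exposes every variable the ReAct agent expects.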

## Example Usage

Let's try some example tasks:

In [None]:
# Example 1: Basic text extraction
result = agent_executor.invoke(
    {
        "input": "What is the main heading on example.com?"
    }
)
print("Result:", result["output"])

In [None]:
# Example 2: Structured data extraction
result = agent_executor.invoke(
    {
        "input": "Find all the navigation links on example.com"
    }
)
print("Result:", result["output"])
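
For a question like the navigation-links one, the HTML returned by `ExtractStructuredData` can also be post-processed deterministically instead of being left entirely to the model. Here is a minimal sketch that uses only the standard library. `extract_links` is a hypothetical helper, and anchors without an `href` are skipped by design.

```python
from html.parser import HTMLParser
from typing import Optional


class _LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a> element that has an href."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[tuple[str, str]] = []
        self._href: Optional[str] = None
        self._text: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # None when the anchor has no href, which makes the element be ignored
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html: str) -> list[tuple[str, str]]:
    """Return (href, link text) pairs in document order."""
    parser = _LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```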

## Advanced Example: Multi-step Web Automation

Let's create a more complex example that involves multiple steps:

In [None]:
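from urllib.parse import quote_plus

def search_and_extract(query: str) -> str:
    """Search a website and extract relevant information."""
    # Build the search URL, URL-encoding the query so spaces and symbols are safe.
    # example.com is a placeholder; substitute a site with a real search endpoint.
    search_url = f"https://example.com/search?q={quote_plus(query.strip())}"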
    
    # Get search results
    loader = SteelWebLoader(
        search_url,
        use_proxy=True,
        solve_captcha=True
    )
    documents = loader.load()
    
    if not documents:
        return "No results found"
    
    return documents[0].page_content

# Add the search tool
tools.append(
    Tool(
        name="SearchAndExtract",
        func=search_and_extract,
        description="Search a website and extract relevant information. Input should be a search query."
    )
)

# Update the agent with the new tool
agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Try a complex query
result = agent_executor.invoke(
    {
        "input": "Search example.com for articles about AI, then extract the main points from the first result"
    }
)
print("Result:", result["output"])
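
`search_and_extract` returns the full page content, which for a real search page may be longer than the model's context window comfortably allows. One option is to cap the observation before it is handed back to the agent. In this sketch, `truncate_observation` and the 4000-character default are assumptions chosen for illustration.

```python
def truncate_observation(text: str, max_chars: int = 4000) -> str:
    """Keep at most max_chars of a tool result and mark the cut so the model knows text is missing."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if len(text) <= max_chars:
        return text
    # Prefer cutting at the last space before the limit so a word is not split
    cut = text.rfind(" ", 0, max_chars)
    if cut <= 0:
        cut = max_chars
    omitted = len(text) - cut
    return f"{text[:cut]}\n[truncated {omitted} characters]"
```

A tool function could return `truncate_observation(documents[0].page_content)` instead of the raw content.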

## Best Practices

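1. **Error Handling**: With the loader options used above, Steel takes care of several common web automation challenges:
   - CAPTCHAs are solved automatically (`solve_captcha=True`)
   - The proxy network provides more reliable access (`use_proxy=True`)
   - Session management is automated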

2. **Performance**:
   - Choose the extraction strategy that fits the task (`text` for readable content, `html` when you need the markup)
   - Reuse sessions when possible
   - Set reasonable timeouts

3. **Debugging**:
   - Use Steel's session viewer for visual debugging
   - Enable verbose mode in the agent executor
   - Check metadata for session information
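
Even with CAPTCHA solving and proxies enabled, a page load can still fail or time out, so it may be worth wrapping tool functions in a retry. Below is a minimal sketch with exponential backoff. `with_retries` is a hypothetical helper, and the attempt count and delays are illustrative defaults rather than Steel recommendations.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func until it succeeds, waiting base_delay * 2**n after the n-th failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            # Out of attempts: re-raise the last error to the caller
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

For example, `with_retries(lambda: scrape_webpage("https://example.com"))` would retry the scrape up to three times before giving up.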