# LangChain Anchor Browser Integration Tutorial

This tutorial demonstrates how to integrate Anchor Browser tools with LangChain to create powerful web automation workflows.

## What You'll Learn

By the end of this tutorial, you'll be able to:
- Extract content from web pages
- Take screenshots programmatically
- Perform AI-driven web tasks
- Create intelligent agents that can interact with the web
- Chain multiple tools together for complex workflows

## Prerequisites

- Python 3.8+
- OpenAI API key
- Anchor Browser API key
- Basic knowledge of LangChain

## 1. Basic Setup and Imports

Let's import the necessary modules and set up our environment.

### Setup

First, install the required packages:

In [None]:
!pip install langchain anchorbrowser pydantic langchain-openai

In [None]:
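import sys
import os

# Add the parent directory to the import path (useful when running this
# notebook from a subfolder of a local checkout of the package)
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), "..")))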

In [None]:
# Import Anchor Browser tools
from langchain_anchorbrowser.AnchorWebTaskTool import AnchorWebTaskToolKit
from langchain_anchorbrowser.AnchorScreenshotTool import AnchorScreenshotTool
from langchain_anchorbrowser.AnchorContentTool import AnchorContentTool

# Import LangChain components
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.chat_models import init_chat_model

import getpass
import os

print("✅ All imports successful!")

## 2. API Key Configuration

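You'll need both an OpenAI API key and an Anchor Browser API key. You can set them as environment variables (`OPENAI_API_KEY` and `ANCHORBROWSER_API_KEY`) or enter them interactively. The cell below prompts only for the OpenAI key; as its status message notes, the Anchor Browser key is requested later, when it is first needed.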

In [None]:
# Configure API keys
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("🔑 Enter your OpenAI API key: ")

print("✅ API keys configured!")
print(f"OpenAI API key: {'✅ Set' if os.environ.get('OPENAI_API_KEY') else '❌ Not set'}")
print(f"Anchor Browser API key: {'✅ Set' if os.environ.get('ANCHORBROWSER_API_KEY') else '🔄 Will prompt when needed'}")
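
The same check-then-prompt pattern can be wrapped in a small helper so that both keys are handled the same way. The sketch below is only an illustration: `ensure_api_key` is a hypothetical helper, not part of LangChain or the Anchor Browser package, and it assumes the tools read their key from the `ANCHORBROWSER_API_KEY` environment variable, as the status message above suggests.

```python
import getpass
import os


def ensure_api_key(env_var, label, prompt_fn=getpass.getpass):
    """Return the key stored in env_var, prompting for it only if it is missing.

    Hypothetical helper for illustration. prompt_fn is injectable so the
    function can be exercised without an interactive terminal.
    """
    value = os.environ.get(env_var)
    if not value:
        value = prompt_fn(f"Enter your {label} API key: ")
        os.environ[env_var] = value
    return value
```

Calling `ensure_api_key("OPENAI_API_KEY", "OpenAI")` and `ensure_api_key("ANCHORBROWSER_API_KEY", "Anchor Browser")` near the top of the notebook would then set both keys up front.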

## 3. Initialize Language Model and Tools

Now let's set up our language model and create all the available Anchor Browser tools.

In [None]:
# Initialize the language model
print("🤖 Initializing language model...")
llm = init_chat_model("gpt-4o-mini", model_provider="openai")
print("✅ Language model ready!")

# Create all available tools
print("\n🔧 Creating Anchor Browser tools...")
anchor_browser_tools = [
    AnchorScreenshotTool(), 
    AnchorContentTool()
] + AnchorWebTaskToolKit().get_tools()

print(f"✅ Created {len(anchor_browser_tools)} tools:")
for i, tool in enumerate(anchor_browser_tools, 1):
    print(f"  {i}. {tool.name} - {tool.description[:50]}...")

## 4. Understanding the Tools

Let's explore what each tool does and when to use them:

### Tool Overview:
1. **Screenshot Tool** - Captures visual snapshots of web pages
2. **Content Tool** - Extracts text content from web pages
3. **Web Task Tool** - AI-powered web interactions.
    The toolkit provides three variants, each with its own Pydantic schema for a different configuration:
    - **Simple Web Task**
    - **Standard Web Task**
    - **Advanced Web Task**

## 5. Direct Tool Usage Examples

Let's start by using each tool directly. This is useful when you know exactly which tool you need for a specific task.

### 5.1 Taking Screenshots

The screenshot tool allows you to capture visual snapshots of web pages with custom dimensions.

In [None]:
# Example: Take a screenshot of a webpage
print("📸 Taking a screenshot...")
screenshot_tool = AnchorScreenshotTool()

screenshot_result = screenshot_tool.invoke({
    "url": "https://www.example.com",
    "width": 1280,
    "height": 720
})

print("✅ Screenshot captured!")
print(f"📊 Result preview: {screenshot_result[:100]}...")
print("💡 The result is base64-encoded image data that can be saved or displayed")
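
Since the result is described as base64-encoded image data, one way to keep it is to decode it and write the bytes to disk. This is a minimal sketch, assuming `screenshot_result` is a plain base64 string; `save_base64_image` is a hypothetical helper, and the image format should be checked against the actual tool output before choosing a file extension.

```python
import base64
from pathlib import Path


def save_base64_image(b64_data, path):
    """Decode a base64 string and write the raw bytes to path."""
    image_bytes = base64.b64decode(b64_data)
    out_path = Path(path)
    out_path.write_bytes(image_bytes)
    return out_path
```

With the cell above already run, `save_base64_image(screenshot_result, "example_screenshot.png")` would write the image next to the notebook (the file name is only an example).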

### 5.2 Extracting Content

The content tool extracts text content from web pages in various formats (HTML, markdown, text).

In [None]:
# Example: Extract content from a webpage
print("📄 Extracting webpage content...")
content_tool = AnchorContentTool()

content_result = content_tool.invoke({
    "url": "https://www.example.com",
    "format": "markdown"
})

print("✅ Content extracted!")
print(f"📊 Content length: {len(content_result)} characters")
print(f"📝 Content preview:\n{content_result[:300]}...")

### 5.3 Performing AI Web Tasks

The web task tools are the most flexible: they take a natural-language prompt and use AI to carry out the task on the web.

In [None]:
# Example: Perform an AI web task
print("🤖 Performing AI web task...")
web_task_tool = anchor_browser_tools[2]  # Simple web task tool

web_task_result = web_task_tool.invoke({
    "prompt": "What is the weather in Tokyo?"
})

print("✅ Web task completed!")
print(f"🤖 AI Response: {web_task_result}")
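
Selecting the tool with `anchor_browser_tools[2]` depends on the order in which the list was built. A lookup by name is less fragile. The sketch below uses a stand-in class so that it runs on its own; `get_tool_by_name` is a hypothetical helper, and the tool names shown are placeholders, not the real Anchor Browser tool names. Print `tool.name` for your actual tools (as in section 3) to find the values to use.

```python
from dataclasses import dataclass


@dataclass
class FakeTool:
    """Stand-in for a LangChain tool; only the name attribute matters here."""
    name: str


def get_tool_by_name(tools, name):
    """Return the first tool whose .name equals name, or raise KeyError."""
    for tool in tools:
        if tool.name == name:
            return tool
    available = ", ".join(t.name for t in tools)
    raise KeyError(f"No tool named {name!r}; available: {available}")


# Placeholder names, not the real Anchor Browser tool names.
demo_tools = [FakeTool("screenshot"), FakeTool("content"), FakeTool("simple_web_task")]
selected = get_tool_by_name(demo_tools, "simple_web_task")
print(selected.name)
```

The same helper works unchanged on `anchor_browser_tools`, since LangChain tools expose a `name` attribute.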

## 6. Creating Intelligent Agents

Now let's create a LangChain agent that chooses which tool to use based on your request. Instead of picking a tool yourself, you describe the goal in plain language and the agent decides which tool to call.

### 6.1 Setting Up the Agent

We'll create an agent with a system prompt that explains all available tools.

In [None]:
# Define the system prompt for our agent
system_prompt = (
    "You are a helpful assistant that can interact with web pages using Anchor Browser.\n\n"
    "You have access to these powerful tools:\n\n"
    "1. **Screenshot Tool** - Take visual snapshots of webpages\n"
    "2. **Content Tool** - Extract text content from webpages\n"
    "3. **Simple Web Task Tool** - Perform basic AI-powered web tasks\n"
    "4. **Standard Web Task Tool** - Perform enhanced web tasks with more context\n"
    "5. **Advanced Web Task Tool** - Perform comprehensive web analysis and automation\n\n"
    "Choose the most appropriate tool based on what the user wants to accomplish. "
    "For simple requests, use the basic tools. For complex analysis, use the advanced tools."
)

# Create the prompt template
prompt = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

# Create the agent
print("🤖 Creating intelligent agent...")
agent = create_openai_functions_agent(llm, anchor_browser_tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=anchor_browser_tools, verbose=True)

print("✅ Intelligent agent created successfully!")

### 6.2 Testing the Agent

Let's test our agent with different types of requests to see how it intelligently chooses tools.

In [None]:
# Test different scenarios with our intelligent agent
test_scenarios = [
    "Take a screenshot of example.com",
    "Get the content of example.com in markdown format",
    "Go to a random website and tell me what the page is about",
    "Analyze the structure of example.com and extract key information"
]

print("🧪 Testing intelligent agent with different scenarios...")
print("=" * 60)

for i, scenario in enumerate(test_scenarios, 1):
    print(f"\n🔍 Test {i}: {scenario}")
    print("-" * 40)
    
    try:
        result = agent_executor.invoke({"input": scenario})
        print(f"✅ Result: {result['output']}")
    except Exception as e:
        print(f"❌ Error: {e}")
    
    print("-" * 40)
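
To check which tool the agent actually picked for each scenario, the result dictionary can be inspected. The sketch below assumes the executor was created with `return_intermediate_steps=True`, in which case the result contains an `intermediate_steps` list of `(action, observation)` pairs whose actions expose a `.tool` attribute. `tools_used` is a hypothetical helper, and the sample data is made up so the snippet runs without calling a model.

```python
from collections import namedtuple

# Stand-in for LangChain's AgentAction, which exposes .tool and .tool_input.
FakeAction = namedtuple("FakeAction", ["tool", "tool_input"])


def tools_used(result):
    """List the tool names recorded in an agent result, in call order."""
    steps = result.get("intermediate_steps", [])
    return [action.tool for action, _observation in steps]


# Made-up result shaped like an AgentExecutor output; the tool name is a placeholder.
fake_result = {
    "output": "done",
    "intermediate_steps": [
        (FakeAction("screenshot", {"url": "https://www.example.com"}), "<base64 data>"),
    ],
}
print(tools_used(fake_result))
```

In the loop above, this would amount to passing `return_intermediate_steps=True` to `AgentExecutor` and printing `tools_used(result)` after each call.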

## 7. Advanced: Tool Chaining

For complex workflows, you can chain multiple tools together. This allows you to build sophisticated web automation pipelines.

In [None]:
# Example: Complex workflow - Extract content → Take screenshot → Analyze with AI
print("🔗 Creating a complex workflow with tool chaining...")
print("=" * 60)

# Step 1: Extract content from a webpage
print("\n📄 Step 1: Extracting webpage content...")
content = content_tool.invoke({
    "url": "https://www.example.com",
    "format": "markdown"
})
print(f"✅ Content extracted ({len(content)} characters)")

# Step 2: Take a screenshot for visual reference
print("\n📸 Step 2: Taking a screenshot...")
screenshot = screenshot_tool.invoke({
    "url": "https://www.example.com",
    "width": 1024,
    "height": 768
})
print("✅ Screenshot captured")

# Step 3: Use AI to analyze the content
print("\n🤖 Step 3: AI analysis of the content...")
analysis_result = web_task_tool.invoke({
    "prompt": f"Based on this webpage content: {content[:500]}..., provide a comprehensive analysis including:\n1. Main topic and purpose\n2. Key information presented\n3. Overall structure and layout\n4. Target audience"
})

print("✅ Analysis completed!")
print(f"\n📊 Analysis Result:\n{analysis_result}")

print("\n🎉 Complex workflow completed successfully!")
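
Once a chain like this works, it can be convenient to wrap it in a function that receives the tools as arguments, so the same pipeline can be reused for other URLs and tried out with stand-in tools. The following is a sketch under that assumption: `analyze_page` and `EchoTool` are hypothetical names, the screenshot step is left out for brevity, and the only interface relied on is the `.invoke(dict)` call used throughout this tutorial.

```python
def analyze_page(url, content_tool, web_task_tool, max_chars=500):
    """Extract a page as markdown, then ask the web task tool to analyze it."""
    content = content_tool.invoke({"url": url, "format": "markdown"})
    prompt = (
        f"Based on this webpage content: {content[:max_chars]}..., "
        "summarize the main topic, key information, and target audience."
    )
    analysis = web_task_tool.invoke({"prompt": prompt})
    return {"url": url, "content_length": len(content), "analysis": analysis}


class EchoTool:
    """Stand-in tool that records its input so the sketch runs offline."""

    def __init__(self, reply):
        self.reply = reply
        self.calls = []

    def invoke(self, payload):
        self.calls.append(payload)
        return self.reply


# Made-up replies; real tools would fetch and analyze the page.
fake_content = EchoTool("# Example Domain\nThis domain is for use in examples.")
fake_task = EchoTool("A placeholder page for documentation examples.")
report = analyze_page("https://www.example.com", fake_content, fake_task)
print(report["analysis"])
```

With the real tools from this tutorial, the call would be `analyze_page("https://www.example.com", content_tool, web_task_tool)`.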

## 8. Summary and Conclusion

Congratulations! You've successfully learned how to use the LangChain Anchor Browser integration. Here's what we covered:

### ✅ What You've Accomplished:
1. **Set up the environment** with proper API keys and dependencies
2. **Used individual tools** for specific tasks (screenshots, content extraction, AI tasks)
3. **Created intelligent agents** that can choose the right tool for any request
4. **Built complex workflows** by chaining multiple tools together

### 🚀 What You Can Do Now:
- Automate web scraping and data extraction
- Create intelligent web assistants
- Build sophisticated web automation pipelines
- Integrate web capabilities into your LangChain applications

### 📚 Further Learning:
- Experiment with different system prompts and agent configurations
- Build custom tools and integrate them with Anchor Browser

### 💡 Pro Tip:
You can now run all these examples automatically using the `tutorial-demo.py` script, which demonstrates the complete workflow in a single execution!

Happy building! 🎉