# Stagehand Quickstart Jupyter Notebook

This notebook demonstrates how to use Stagehand with local browser automation and custom LLM endpoints.

## Prerequisites

Make sure you have:
- Python 3.8+ installed
- An API key for your LLM provider (the configuration in this notebook reads `ALIBABA_API_KEY` for Alibaba Bailian/DashScope)
- Chrome/Chromium browser installed

In [None]:
# Install Stagehand if not already installed
%pip install stagehand

In [None]:
import os
from pprint import pprint
from typing import List

In [None]:
# Load environment variables from .env file
import dotenv
dotenv.load_dotenv()
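
Before configuring Stagehand, it can help to confirm that the expected environment variables were actually loaded. The following is a minimal sketch using only the standard library; `missing_env_vars` is a hypothetical helper name, not part of Stagehand or python-dotenv, and `ALIBABA_API_KEY` is the variable read by the configuration cell in this notebook.

```python
import os
from typing import Iterable, List, Mapping


def missing_env_vars(names: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names that are unset or empty in `env`."""
    return [name for name in names if not env.get(name)]


# ALIBABA_API_KEY is the variable the configuration cell reads.
missing = missing_env_vars(["ALIBABA_API_KEY"])
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set")
```

Passing `env` explicitly keeps the helper easy to check against a plain dictionary instead of the real process environment.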

## Configure Stagehand

Configure Stagehand to use local browser automation with a custom LLM endpoint. This example uses Alibaba Bailian (DashScope) through its OpenAI-compatible endpoint, but you can configure it for other providers such as Anthropic, Together AI, or Groq.

In [None]:
from stagehand import Stagehand, StagehandConfig

# Configure Stagehand for local browser automation with Alibaba Bailian (DashScope)
config = StagehandConfig(
    model_name="qwen-turbo",  # Use Alibaba Bailian model
    model_client_options={
        "api_base": os.getenv("ALIBABA_ENDPOINT", "https://dashscope.aliyuncs.com/compatible-mode/v1"),
        "api_key": os.getenv("ALIBABA_API_KEY"),
        "timeout": 30
    },
    local_browser_launch_options={
        "headless": False,  # Set to True for headless mode
        "viewport": {"width": 1280, "height": 720}
    },
    verbose=1,  # Set to 0 for minimal logs, 2 for detailed logs
    dom_settle_timeout_ms=3000,
)

stagehand = Stagehand(config)
print("✓ Stagehand configured with local browser and Alibaba Bailian API")

## Initialize Stagehand

Initialize the Stagehand instance, which will launch a local browser.

In [None]:
await stagehand.init()

# Validate LLM configuration
validation = stagehand.llm.validate_configuration()
if validation['valid']:
    print(f"✓ LLM configured: {validation['configuration']['provider']} - {config.model_name}")
else:
    print("⚠ LLM configuration issues:", validation['errors'])

print("🚀 Local browser session initialized successfully!")

## Navigate to a Website

Let's navigate to Hacker News and extract some data.

In [None]:
page = stagehand.page
await page.goto("https://news.ycombinator.com")
print("✓ Navigated to Hacker News")

## Define Data Models

Define Pydantic models for structured data extraction.

In [None]:
from typing import Optional

from pydantic import BaseModel, Field

class Post(BaseModel):
    title: str = Field(..., description="Post title")
    points: int = Field(..., description="Number of points/upvotes")
    comments: int = Field(..., description="Number of comments")
    # Optional so that posts without a URL still validate
    url: Optional[str] = Field(None, description="Post URL if available")

class Posts(BaseModel):
    posts: List[Post] = Field(..., description="List of posts")

print("✓ Data models defined")

## Extract Structured Data

Use Stagehand's extract method to get structured data from the page.

In [None]:
# Extract the top 5 posts from the front page
res = await page.extract(
    "find the top 5 posts on the front page with their titles, points, comments, and URLs", 
    schema=Posts
)

print(f"✓ Extracted {len(res.posts)} posts")
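
Extraction depends on a remote LLM call, which can fail transiently (timeouts, rate limits). One way to make the cell more robust is a small retry wrapper with exponential backoff. This is a sketch under assumptions: `retry_async` is a hypothetical helper, and it catches the broad `Exception` because the specific exception types Stagehand raises are not stated in this notebook.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_async(
    make_call: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Await make_call() up to `attempts` times, doubling the delay after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Delays of base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

In the notebook this could be used as `res = await retry_async(lambda: page.extract("...", schema=Posts))`. The lambda matters: each attempt needs a fresh coroutine, since a coroutine object cannot be awaited twice.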

## Display Results

Pretty print the extracted data.

In [None]:
print("\n📊 Extracted Posts:")
print("=" * 50)

for idx, post in enumerate(res.posts, 1):
    print(f"\n{idx}. {post.title}")
    print(f"   Points: {post.points} | Comments: {post.comments}")
    if post.url:
        print(f"   URL: {post.url}")
    print("-" * 40)
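
Beyond printing, the extracted posts can be ranked or saved for later use. The sketch below sorts posts by points and serializes them to JSON. `top_by_points` and `posts_to_json` are hypothetical helpers, and the sample uses made-up stand-in objects so the block runs without a browser; in the notebook you would pass `res.posts` instead.

```python
import json
from types import SimpleNamespace
from typing import Any, Iterable, List


def top_by_points(posts: Iterable[Any], n: int = 3) -> List[Any]:
    """Return the n posts with the most points, highest first."""
    return sorted(posts, key=lambda p: p.points, reverse=True)[:n]


def posts_to_json(posts: Iterable[Any]) -> str:
    """Serialize title, points, comments, and url of each post to a JSON string."""
    rows = [
        {"title": p.title, "points": p.points, "comments": p.comments, "url": p.url}
        for p in posts
    ]
    return json.dumps(rows, indent=2, ensure_ascii=False)


# Stand-in data with made-up values; in the notebook, pass res.posts instead.
sample = [
    SimpleNamespace(title="Example A", points=10, comments=2, url="https://example.com/a"),
    SimpleNamespace(title="Example B", points=42, comments=7, url="https://example.com/b"),
]
print(posts_to_json(top_by_points(sample, n=1)))
```

The helpers only rely on attribute access, so they should work with the `Post` objects defined earlier as well as with the stand-ins.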

## Demonstrate Browser Actions

Show how to perform actions on the page.

In [None]:
import asyncio

# Observe elements on the page
observed = await page.observe("find the 'new' link in the navigation")
print(f"✓ Observed: {observed}")

# Perform an action only if the link was found
if observed:
    action_result = await page.act("click on the 'new' link")
    print(f"✓ Action performed: {action_result}")

    # Wait a moment for the page to load
    await asyncio.sleep(2)

    print("✓ Navigated to 'new' posts page")
else:
    print("⚠ Could not find the 'new' link")
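
The fixed two-second sleep above either wastes time or is too short on a slow connection. A polling wait is one alternative: it returns as soon as a condition holds and gives up after a timeout. `wait_until` is a hypothetical helper written with the standard library only, not a Stagehand API.

```python
import asyncio
import time
from typing import Callable


async def wait_until(
    condition: Callable[[], bool],
    timeout: float = 10.0,
    interval: float = 0.25,
) -> bool:
    """Poll condition() until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False  # timed out without the condition becoming true
        await asyncio.sleep(interval)
```

Assuming the Stagehand page exposes the current address as `page.url` the way a Playwright page does, the sleep could become `await wait_until(lambda: "newest" in page.url)`, since the Hacker News "new" link leads to `/newest`.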

## Alternative LLM Provider Configuration

Here's how you could configure Stagehand with different LLM providers:

In [None]:
# Example configurations for different providers (only use a config if you have the matching API key)

# Anthropic Claude configuration
anthropic_config = StagehandConfig(
    model_name="claude-3-haiku-20240307",
    model_client_options={
        "api_base": "https://api.anthropic.com",
        "api_key": os.getenv("ANTHROPIC_API_KEY"),
        "timeout": 60
    },
    local_browser_launch_options={"headless": True}
)

# Together AI configuration
together_config = StagehandConfig(
    model_name="together/llama-2-7b-chat",
    model_client_options={
        "api_base": "https://api.together.xyz/v1",
        "api_key": os.getenv("TOGETHER_API_KEY"),
        "timeout": 45
    },
    local_browser_launch_options={"headless": True}
)

# Groq configuration
groq_config = StagehandConfig(
    model_name="groq/llama2-70b-4096",
    model_client_options={
        "api_base": "https://api.groq.com/openai/v1",
        "api_key": os.getenv("GROQ_API_KEY"),
        "timeout": 30
    },
    local_browser_launch_options={"headless": True}
)

print("✓ Alternative provider configurations shown above")
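
The configurations above differ only in endpoint, API key variable, timeout, and model name, so the shared part can be kept in one lookup table. This is a sketch, not a Stagehand feature: `Provider`, `PROVIDERS`, and `client_options` are hypothetical names, and the endpoints, variable names, and timeouts are copied from the cells in this notebook.

```python
from typing import Dict, Mapping, NamedTuple


class Provider(NamedTuple):
    api_base: str
    key_env: str  # name of the environment variable holding the API key
    timeout: int  # seconds


# Values copied from the configuration cells in this notebook.
PROVIDERS: Dict[str, Provider] = {
    "alibaba": Provider("https://dashscope.aliyuncs.com/compatible-mode/v1", "ALIBABA_API_KEY", 30),
    "anthropic": Provider("https://api.anthropic.com", "ANTHROPIC_API_KEY", 60),
    "together": Provider("https://api.together.xyz/v1", "TOGETHER_API_KEY", 45),
    "groq": Provider("https://api.groq.com/openai/v1", "GROQ_API_KEY", 30),
}


def client_options(name: str, env: Mapping[str, str]) -> dict:
    """Build a model_client_options dict for the named provider."""
    provider = PROVIDERS[name]
    api_key = env.get(provider.key_env)
    if not api_key:
        # Fail early instead of passing api_key=None to the client.
        raise KeyError(f"{provider.key_env} is not set")
    return {"api_base": provider.api_base, "api_key": api_key, "timeout": provider.timeout}
```

With this in place, a config could be built as `StagehandConfig(model_name="qwen-turbo", model_client_options=client_options("alibaba", os.environ), ...)`. Raising on a missing key surfaces the problem at configuration time instead of at the first LLM call.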

## Clean Up

Always close the Stagehand session when done to free up resources.

In [None]:
await stagehand.close()
print("✓ Stagehand session closed successfully!")
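
If a cell raises before the notebook reaches this point, the browser stays open. Outside a notebook, a `try`/`finally` wrapper is one way to guarantee cleanup. The sketch below assumes only that the session object has the async `init()` and `close()` methods used in this notebook; `run_with_cleanup` is a hypothetical helper.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def run_with_cleanup(session: Any, task: Callable[[Any], Awaitable[Any]]) -> Any:
    """Initialize the session, run task(session), and always close the session."""
    await session.init()
    try:
        return await task(session)
    finally:
        # Runs whether the task succeeded or raised.
        await session.close()
```

In a script this might be called as `asyncio.run(run_with_cleanup(Stagehand(config), my_task))`, where `my_task` is your own coroutine function taking the session. If `init()` itself fails, `close()` is not called in this sketch.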

## Summary

This notebook demonstrated:

1. **Local Browser Configuration**: Using Stagehand with a local browser instead of remote services
2. **Custom LLM Endpoints**: Configuring different LLM providers (Alibaba Bailian, Anthropic, Together AI, Groq)
3. **Structured Data Extraction**: Using Pydantic models to extract structured data
4. **Browser Actions**: Observing and acting on page elements
5. **Configuration Validation**: Checking LLM configuration validity

### Key Benefits of Local Mode:
- No remote browser service required
- Browser session data stays on your machine (page content is still sent to the LLM provider)
- Lower latency for browser operations
- Full control over browser configuration
- Only the LLM API calls and the sites you visit need network access

### Next Steps:
- Try different LLM providers
- Experiment with headless vs. headed browser modes
- Build more complex automation workflows
- Integrate with your own applications