# 🔓 Web Unlocker - Scrape Any Website

Test the Web Unlocker API, which scrapes websites while bypassing anti-bot protection:
- Basic HTML scraping
- Country-specific proxies
- Batch URL scraping
- Timing metadata

---

## Setup

In [None]:
import os
from dotenv import load_dotenv
load_dotenv()

API_TOKEN = os.getenv("BRIGHTDATA_API_TOKEN")
if not API_TOKEN:
    raise ValueError("Set BRIGHTDATA_API_TOKEN in .env file")

print(f"API Token: {API_TOKEN[:10]}...{API_TOKEN[-4:]}")
print("Setup complete!")
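The setup cell prints the first 10 and last 4 characters of the token, which would expose a short token in full. A minimal sketch of a more defensive mask follows; `mask_token` and its `visible` parameter are hypothetical names for illustration, not part of the SDK.

```python
def mask_token(token: str, visible: int = 4) -> str:
    """Return a masked token showing at most `visible` characters at each end."""
    # If the token is too short to mask meaningfully, hide it entirely.
    if len(token) <= visible * 2:
        return "*" * len(token)
    return f"{token[:visible]}...{token[-visible:]}"


# Dummy value for illustration only; not a real token.
print(mask_token("0123456789abcdef"))
```

In the setup cell this could replace the slice expression, for example `print(f"API Token: {mask_token(API_TOKEN)}")`.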

## Initialize Client

In [None]:
from brightdata import BrightDataClient

# Initialize client
client = BrightDataClient(token=API_TOKEN)

print("Client initialized")
print(f"Default Web Unlocker zone: {client.web_unlocker_zone}")

---
## Test 1: Basic HTML Scraping

Scrape a simple webpage and get raw HTML.

In [None]:
URL = "https://example.com"

print(f"Scraping: {URL}\n")

async with client:
    result = await client.scrape_url(
        url=URL,
        response_format="raw"
    )

print(f"Success: {result.success}")
print(f"Status: {result.status}")
print(f"Method: {result.method}")

if result.success and result.data:
    html = result.data
    print(f"\nHTML length: {len(html)} characters")
    print("\nFirst 500 characters:")
    print("-" * 50)
    print(html[:500])
    print("-" * 50)
else:
    print(f"\nError: {result.error}")
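Once the raw HTML is in hand, a quick sanity check is to pull out the page title. The sketch below uses only the standard library's `html.parser`; `TitleParser` and `extract_title` are hypothetical helper names, and the sample string stands in for `result.data`.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html: str) -> str:
    """Return the stripped <title> text, or an empty string if none is found."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


# Stand-in for result.data from the cell above.
sample = '<!doctype html><html lang="en"><head><title>Example Domain</title></head><body></body></html>'
print(extract_title(sample))
```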

---
## Test 2: Country-Specific Proxy

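Route the request through a proxy in a specific country to get localized content. `example.com` is not localized, so the three responses below come back with the same length.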

In [7]:
URL = "https://example.com"

print("Testing country-specific proxies...\n")

countries = ["US", "GB", "DE"]

async with client:
    for country in countries:
        print(f"Country: {country}")
        result = await client.scrape_url(
            url=URL,
            country=country,
            response_format="raw"
        )
        
        if result.success and result.data:
            print(f"  Success: {result.success}")
            print(f"  HTML length: {len(result.data)} chars")
        else:
            print(f"  Error: {result.error}")
        print()

Testing country-specific proxies...

Country: US
  Success: True
  HTML length: 513 chars

Country: GB
  Success: True
  HTML length: 513 chars

Country: DE
  Success: True
  HTML length: 513 chars
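Equal lengths do not prove that the pages are identical. One way to check whether a site really serves different content per country is to hash each response and group the countries by digest. This is a minimal sketch with stand-in HTML; `group_by_content` is a hypothetical helper, and in the notebook the dictionary would be built from `result.data` for each country.

```python
import hashlib
from collections import defaultdict


def group_by_content(pages: dict[str, str]) -> dict[str, list[str]]:
    """Group country codes by the SHA-256 digest of the HTML each one received."""
    groups: dict[str, list[str]] = defaultdict(list)
    for country, html in pages.items():
        digest = hashlib.sha256(html.encode("utf-8")).hexdigest()
        groups[digest].append(country)
    return dict(groups)


# Stand-in data for illustration; not real scrape output.
pages = {
    "US": "<html><body>Hello</body></html>",
    "GB": "<html><body>Hello</body></html>",
    "DE": "<html><body>Hallo</body></html>",
}

groups = group_by_content(pages)
for countries in groups.values():
    print(countries)
```

A single group means every country received byte-identical HTML; more than one group suggests the site localizes its response.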



---
## Test 3: Batch URL Scraping

Scrape multiple URLs concurrently.

In [8]:
URLS = [
    "https://example.com",
    "https://www.iana.org/domains/reserved",
]

print(f"Batch scraping {len(URLS)} URLs...\n")

async with client:
    results = await client.scrape_url(
        url=URLS,
        response_format="raw"
    )

print(f"Results: {len(results)} responses\n")

for i, result in enumerate(results):
    print(f"=== URL {i+1}: {URLS[i]} ===")
    print(f"  Success: {result.success}")
    print(f"  Status: {result.status}")
    if result.success and result.data:
        content_len = len(result.data) if isinstance(result.data, str) else len(str(result.data))
        print(f"  Content length: {content_len} chars")
        print(f"  Preview: {str(result.data)[:100]}...")
    else:
        print(f"  Error: {result.error}")
    print()

Batch scraping 2 URLs...

Results: 2 responses

=== URL 1: https://example.com ===
  Success: True
  Status: ready
  Content length: 513 chars
  Preview: <!doctype html><html lang="en"><head><title>Example Domain</title><meta name="viewport" content="wid...

=== URL 2: https://www.iana.org/domains/reserved ===
  Success: True
  Status: ready
  Content length: 10815 chars
  Preview: 
<!doctype html>
<html>
<head>
	<title>IANA-managed Reserved Domains</title>

	<meta charset="utf-8"...
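For larger batches it can help to separate successes from failures before processing. The sketch below keys both groups by the `url` field of each result rather than by list position. `FakeResult` is a hypothetical stand-in that mirrors only the fields this notebook reads (`url`, `success`, `data`, `error`); it is not the SDK's result class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeResult:
    """Hypothetical stand-in with the fields the notebook reads from a result."""
    url: str
    success: bool
    data: Optional[str] = None
    error: Optional[str] = None


def split_batch(results):
    """Return (ok, failed): url -> data for successes, url -> error otherwise."""
    ok = {}
    failed = {}
    for r in results:
        if r.success and r.data:
            ok[r.url] = r.data
        else:
            failed[r.url] = r.error
    return ok, failed


# Stand-in results for illustration; not real scrape output.
demo = [
    FakeResult(url="https://example.com", success=True, data="<html></html>"),
    FakeResult(url="https://example.org", success=False, error="timeout"),
]
ok, failed = split_batch(demo)
print(f"{len(ok)} succeeded, {len(failed)} failed")
```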



---
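## Test 4: Timing Metadata

Inspect the metadata and timestamps on the last result. The `result` variable is left over from the Test 3 loop, so run that cell first.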

In [9]:
# Check timing metadata from last result
print("=== Result Metadata ===")
print(f"url: {result.url}")
print(f"success: {result.success}")
print(f"status: {result.status}")
print(f"method: {result.method}")
print(f"root_domain: {result.root_domain}")
print(f"html_char_size: {result.html_char_size}")
print("\n=== Timing ===")
print(f"trigger_sent_at: {result.trigger_sent_at}")
print(f"data_fetched_at: {result.data_fetched_at}")

if result.trigger_sent_at and result.data_fetched_at:
    duration = (result.data_fetched_at - result.trigger_sent_at).total_seconds()
    print(f"\nTotal time: {duration:.2f} seconds")

=== Result Metadata ===
url: https://www.iana.org/domains/reserved
success: True
status: ready
method: web_unlocker
root_domain: iana.org
html_char_size: 10815

=== Timing ===
trigger_sent_at: 2026-01-29 21:14:52.185321+00:00
data_fetched_at: 2026-01-29 21:14:54.939432+00:00

Total time: 2.75 seconds
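The duration calculation can be wrapped in a small helper that tolerates missing timestamps. This sketch reuses the two timestamps printed above; `elapsed_seconds` is a hypothetical name, and it assumes both fields are timezone-aware `datetime` objects, as the output suggests.

```python
from datetime import datetime
from typing import Optional


def elapsed_seconds(start: Optional[datetime], end: Optional[datetime]) -> Optional[float]:
    """Return the seconds between start and end, or None if either is missing."""
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


# Timestamps copied from the output above.
sent = datetime.fromisoformat("2026-01-29 21:14:52.185321+00:00")
fetched = datetime.fromisoformat("2026-01-29 21:14:54.939432+00:00")

print(f"{elapsed_seconds(sent, fetched):.2f}")  # prints 2.75
```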


---
## Summary

### Web Unlocker Methods

| Method | Description |
|--------|-------------|
| `client.scrape_url(url, ...)` | Scrape URL(s) with Web Unlocker |

### Parameters

| Parameter | Description | Default |
|-----------|-------------|--------|
| `url` | Single URL or list of URLs | Required |
| `zone` | Bright Data zone | `sdk_unlocker` |
| `country` | Proxy country code (US, GB, etc.) | `""` |
| `response_format` | Response format; `"raw"` returns the page HTML | `"raw"` |
| `method` | HTTP method (GET, POST, etc.) | `"GET"` |
| `timeout` | Request timeout in seconds | 30 |
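To keep call sites consistent, the defaults in the table can be gathered in one place. The sketch below is a hypothetical helper, not part of the SDK: `build_scrape_kwargs` and `DEFAULTS` are illustrative names, the default values are copied from the table above, and the two-letter country check is an assumption based on the codes used in this notebook (US, GB, DE).

```python
from typing import Any

# Defaults copied from the parameter table above.
DEFAULTS = {
    "zone": "sdk_unlocker",
    "country": "",
    "response_format": "raw",
    "method": "GET",
    "timeout": 30,
}


def build_scrape_kwargs(url, **overrides) -> dict[str, Any]:
    """Merge overrides into the documented defaults and validate them lightly."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"Unknown parameter(s): {sorted(unknown)}")

    kwargs = {"url": url, **DEFAULTS, **overrides}

    # Assumption: country codes are two letters, as in the examples above.
    country = kwargs["country"]
    if country and not (len(country) == 2 and country.isalpha()):
        raise ValueError(f"Expected a two-letter country code, got {country!r}")
    kwargs["country"] = country.upper()
    return kwargs


print(build_scrape_kwargs("https://example.com", country="gb"))
```

If the SDK accepts exactly the parameters listed in the table, the result could be passed along as `client.scrape_url(**kwargs)`.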

### Response Fields

| Field | Description |
|-------|-------------|
| `success` | Boolean indicating success |
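| `data` | Scraped content (an HTML string when `response_format="raw"`) |
| `error` | Error message when the request fails |
| `status` | Status ("ready", "error", etc.) |
| `url` | The scraped URL |
| `method` | Scraping method used (always "web_unlocker" here, not the HTTP method) |
| `root_domain` | Root domain extracted from the URL |
| `html_char_size` | Size of the HTML response in characters |
| `trigger_sent_at` | Timestamp when the request was sent |
| `data_fetched_at` | Timestamp when the data was received |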
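For logging, the response fields can be flattened into a JSON-friendly dictionary. This is a minimal sketch that reads attributes by name with `getattr`, so it makes no assumption about the SDK's result class beyond the field names used in this notebook; `result_to_record` is a hypothetical helper, and `SimpleNamespace` is only a stand-in for a real result.

```python
import json
from types import SimpleNamespace

# Field names taken from the cells in this notebook.
FIELDS = ("url", "success", "status", "method", "root_domain", "html_char_size", "error")


def result_to_record(result) -> dict:
    """Copy the known fields off a result object; missing fields become None."""
    return {name: getattr(result, name, None) for name in FIELDS}


# Stand-in object using values from the Test 4 output above.
demo_result = SimpleNamespace(
    url="https://www.iana.org/domains/reserved",
    success=True,
    status="ready",
    method="web_unlocker",
    root_domain="iana.org",
    html_char_size=10815,
)

record = result_to_record(demo_result)
print(json.dumps(record, indent=2))
```

The `data` field is left out on purpose so that log lines stay small.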