Build the theatre once. Change the play every night.
Playwright + Chromium (or Lightpanda) on AWS Lambda as a container image.
Inject Python browser automation scripts at runtime via event payload or S3.
No rebuild needed — one image, unlimited scripts.
Headless browser | Web scraping | Browser automation | Serverless | AWS Lambda container
Table of contents: How it works | Quick start | Usage | Writing scripts | Browser backends | Examples | Benchmarks | Cold start optimization | Security | Project structure
The container image ships Chromium and Playwright pre-installed on Ubuntu 25.04. At Lambda cold start, Chromium launches during the free init phase (not billed). Your Playwright script runs against the already-warm browser, then the page and context are cleaned up. On warm starts, the browser is reused — only a new page is created.
flowchart LR
subgraph "Your Code"
A["Inline script<br>(event payload)"]
B["S3 script<br>(s3://bucket/key)"]
end
subgraph "Lambda Container"
C["handler.py"]
D["Playwright"]
E["Chromium<br>(pre-launched at init)"]
end
F["Target website"]
A --> C
B --> C
C --> D --> E --> F
make build # builds the Chromium image
make test # smoke-tests it locally

make build-lightpanda # builds the Lightpanda image
make test-lightpanda # smoke-tests it locally

Both images use the same handler and accept the same scripts. Pick whichever fits your workload — see Browser backends for trade-offs.
# Start either image (replace lambda-theatre with lambda-theatre-lightpanda for Lightpanda)
docker run -d --name test -p 9000:8080 lambda-theatre
sleep 5
# Extract a page title
curl -s -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" \
-d '{"url": "https://example.com", "script": "result[\"title\"] = page.title()"}' \
| python3 -m json.tool
# Clean up
docker rm -f test

sam build --template infra/template.yaml && sam deploy --guided --stack-name lambda-theatre

aws lambda invoke \
--function-name TheatreFunction \
--cli-binary-format raw-in-base64-out \
--payload '{"url": "https://example.com", "script": "result[\"title\"] = page.title()"}' \
/dev/stdout | python3 -m json.tool

Or with the included helper:
python3 examples/invoke.py --url https://example.com --script "result['title'] = page.title()"

{
"browser": "chromium",
"url": "https://example.com",
"script": "result['title'] = page.title()",
"s3_uri": "s3://my-bucket/scripts/scrape.py",
"timeout": 30,
"wait_until": "load",
"viewport": {"width": 1280, "height": 720},
"user_agent": "custom-agent/1.0",
"params": {"any": "data your script needs"}
}

| Field | Required | Description |
|---|---|---|
| script | One of script or s3_uri | Inline Python code (takes precedence over s3_uri) |
| s3_uri | One of script or s3_uri | S3 path to a .py script file (ignored if script is set) |
| browser | No | "chromium" \| "lightpanda" (default: auto-detect from image) |
| url | No | Navigate to this URL before running the script |
| timeout | No | Timeout in seconds (default: 30) |
| wait_until | No | load \| domcontentloaded \| networkidle \| commit (default: load, Chromium only) |
| viewport | No | {width, height} (default: 1280x720, Chromium only) |
| user_agent | No | Custom User-Agent string |
| params | No | Arbitrary data accessible as event["params"] in your script |
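The precedence between script and s3_uri can be sketched as a small dispatch function. This is an illustrative mirror of the event contract above, not code from handler.py; the function name is hypothetical:

```python
def resolve_script_source(event):
    """Pick the script source per the event contract: inline wins over S3."""
    if event.get("script"):
        return ("inline", event["script"])
    if event.get("s3_uri"):
        return ("s3", event["s3_uri"])
    return (None, None)  # neither set: treated as a warmup ping
```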
Your script receives these variables pre-bound. Standard import statements also work (e.g., import boto3, import time).
| Variable | Type | Description |
|---|---|---|
| page | playwright.sync_api.Page | Already navigated to event["url"] if provided |
| browser | playwright.sync_api.Browser | Persistent across warm starts |
| context | playwright.sync_api.BrowserContext | Fresh per invocation |
| event | dict | Full Lambda event (access event["params"], etc.) |
| result | dict | Put your return data here |
| json | module | The json module, pre-imported |
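Conceptually, injection amounts to exec-ing your source with these names pre-bound. A minimal sketch of the idea (illustrative only; the real logic lives in handler.py, and run_injected is a made-up name):

```python
def run_injected(source, page=None, browser=None, context=None, event=None):
    """Execute a bare script with the documented variables pre-bound."""
    import json as json_module
    result = {}
    namespace = {
        "page": page, "browser": browser, "context": context,
        "event": event or {}, "result": result, "json": json_module,
    }
    # compile with a filename so tracebacks point at the injected script
    exec(compile(source, "<injected-script>", "exec"), namespace)
    return result  # whatever the script stored in result
```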
Scripts are bare Playwright code — not Python modules. Don't write def main(), if __name__, or class definitions. Just write the steps directly, as if you're in the middle of a function that already has page, event, and result in scope.
# my_script.py — correct
page.wait_for_selector("h1")
result["title"] = page.title()
result["heading"] = page.inner_text("h1")

# my_script.py — WRONG (will not work)
def main():
    page.wait_for_selector("h1")
    return page.title()

if __name__ == "__main__":
    main()

Standard imports work at the top of the script (boto3, json, time, and other packages already installed in the container):
import time
import boto3
page.click("#load-more")
time.sleep(2)
result["items"] = page.evaluate("document.querySelectorAll('.item').length")

Need additional packages? Scripts can import any package installed in the container image (playwright, boto3, and the Python stdlib are included by default). To add more, add them to src/requirements.txt and rebuild: make build. The image is the theatre — rebuild it once when your dependencies change, then swap scripts freely. If you think a package should be included by default, open an issue.
See the examples/ directory — every file there is a working script you can upload directly.
Upload a script file and invoke by S3 URI:
aws s3 cp my_script.py s3://my-bucket/scripts/my_script.py

{"url": "https://example.com", "s3_uri": "s3://my-bucket/scripts/my_script.py"}

The Lambda function needs s3:GetObject permission on the bucket. The SAM template handles this automatically — pass the bucket name at deploy time:
sam deploy --template infra/template.yaml --parameter-overrides ScriptBucket=my-bucket

Or add the permission manually if deploying outside SAM:
{
"Effect": "Allow",
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::my-bucket/scripts/*"
}

{"url": "https://example.com", "script": "result['text'] = page.inner_text('body')"}

{
"url": "https://todomvc.com/examples/react/dist/",
"script": "page.wait_for_selector('input.new-todo')\nfor item in event['params']['todos']:\n page.fill('input.new-todo', item)\n page.press('input.new-todo', 'Enter')\nresult['count'] = page.locator('ul.todo-list li').count()",
"params": {"todos": ["Buy milk", "Write tests", "Ship it"]}
}

{
"url": "https://the-internet.herokuapp.com/login",
"script": "page.fill('#username', 'tomsmith')\npage.fill('#password', 'SuperSecretPassword!')\npage.click('button[type=\"submit\"]')\npage.wait_for_load_state('load')\nresult['url'] = page.url\nresult['message'] = page.text_content('#flash')"
}

Upload examples/hacker_news_scraper.py to S3 and invoke:
aws s3 cp examples/hacker_news_scraper.py s3://my-bucket/scripts/
python3 examples/invoke.py --s3 s3://my-bucket/scripts/hacker_news_scraper.py --param limit=5

See the examples/ directory for all example scripts.
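The s3_uri values used throughout follow the usual s3://bucket/key shape. A small parser like the following splits them into the bucket and key that a boto3 get_object call needs (an illustrative sketch; parse_s3_uri is not a function in this project):

```python
def parse_s3_uri(uri):
    """Split s3://bucket/key into (bucket, key); reject anything else."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {uri}")
    bucket, _, key = uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"expected s3://bucket/key, got: {uri}")
    return bucket, key
```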
Lambda Theatre supports two browser backends. Each ships as its own container image. Your scripts work on both without changes.
| | Chromium | Lightpanda |
|---|---|---|
| Image | lambda-theatre | lambda-theatre-lightpanda |
| Size | ~1.2 GB | ~450 MB |
| Build | make build | make build-lightpanda |
| Test | make test | make test-lightpanda |
| Best for | Full compatibility — SPAs, complex JS, screenshots, PDF | Speed and size — 2-4x faster on most pages, 63% smaller |
| Dockerfile | src/Dockerfile | src/Dockerfile.lightpanda |
Start with Chromium if you need full browser compatibility or aren't sure. It works with everything.
Switch to Lightpanda when:
- You want faster cold starts (smaller image = faster Lambda image pull)
- Your target pages are mostly server-rendered HTML or light JavaScript
- You want to minimize Lambda costs (faster execution = less billed duration)
- You've tested your scripts against Lightpanda and they work
Each container image includes exactly one browser. The handler detects which one is installed and uses it automatically. You don't need to change your scripts or event payloads — the same page, browser, and context objects work on both backends.
# Chromium image — scripts run against Chromium
docker build -t lambda-theatre src/
docker run -d --name test -p 9000:8080 lambda-theatre
# Lightpanda image — same scripts, same API, different browser
docker build -t lambda-theatre-lightpanda -f src/Dockerfile.lightpanda src/
docker run -d --name test -p 9000:8080 lambda-theatre-lightpanda

If you need to explicitly request a backend (e.g., testing locally with both images built), use the browser event field:
{"browser": "lightpanda", "url": "https://example.com", "script": "result['title'] = page.title()"}

If the requested backend isn't available in the image, the handler returns a 400 error with a clear message.
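The selection behavior described above can be sketched as a small function. This is a hypothetical model of what the handler does, not its actual code; select_backend is an invented name:

```python
def select_backend(installed, requested=None):
    """Pick a browser backend: auto-detect, or honor an explicit request.

    installed: the set of backends present in this image (each image ships
    exactly one). Raises ValueError, which the handler would surface as a 400.
    """
    if requested is None:
        return next(iter(installed))  # auto-detect: use what the image has
    if requested not in installed:
        raise ValueError(f"browser '{requested}' not available in this image")
    return requested
```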
Lightpanda is a Zig-based headless browser that speaks the Chrome DevTools Protocol (CDP). It is faster and smaller than Chromium, but has some limitations:
- No viewport support — the viewport event field is ignored (Lightpanda uses a default viewport)
- No wait_until support — the wait_until event field is ignored (pages load fully before returning)
- Navigation-heavy pages may fail — pages that redirect or destroy execution contexts mid-load can cause "Execution context was destroyed" errors
- Slower on JS-heavy pages — Lightpanda has no JIT compiler, so pages with heavy JavaScript execution (large SPAs, complex frameworks) may be slower than Chromium
- Nightly builds only — Lightpanda distributes x86_64 Linux nightly builds; no stable releases yet
Measured on AWS Lambda (us-east-1) using Lambda Power Tuning with the Hacker News scraper (navigates to HN, visits 5 story URLs, extracts metadata from each). 10 invocations per memory size.
| Memory | Avg Duration | Cost / invocation | Relative speed |
|---|---|---|---|
| 512 MB | 20,422ms | $0.000172 | 4.5x slower |
| 768 MB | 14,313ms | $0.000180 | 3.2x slower |
| 1024 MB | 9,650ms | $0.000162 | 2.1x slower |
| 1536 MB | 7,201ms | $0.000181 | 1.6x slower |
| 2048 MB | 4,531ms | $0.000152 | baseline |
| 3072 MB | 3,181ms | $0.000160 | 1.4x faster |
2048 MB is the sweet spot — cheapest per invocation AND fast. Above 2048 MB, the workload becomes network-bound (waiting for page loads), so extra CPU doesn't help. Below 1024 MB, reduced CPU makes everything slower without meaningful cost savings.
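The cost column follows directly from Lambda's pricing model: billed GB-seconds times the per-GB-second rate, plus a flat per-request fee. A back-of-the-envelope check (rates shown are us-east-1 x86_64 at the time of writing; verify current pricing before relying on them):

```python
GB_SECOND_USD = 0.0000166667  # assumed us-east-1 x86_64 duration rate
REQUEST_USD = 0.0000002       # flat per-request charge

def invocation_cost(duration_ms, memory_mb):
    """Estimated Lambda cost for one invocation, ignoring free tier."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * GB_SECOND_USD + REQUEST_USD

# 2048 MB at 4,531 ms: prints 0.000151, close to the table's $0.000152
# (billing granularity rounds duration up, hence the small gap)
print(f"{invocation_cost(4531, 2048):.6f}")
```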
| Scenario | Init Duration | Handler Duration | Total |
|---|---|---|---|
| Cold start (image cached) | 1,925ms | 280ms | 2,205ms |
| Cold start (typical) | 3,659ms | 511ms | 4,171ms |
| Cold start (image evicted) | 6,616ms | 838ms | 7,454ms |
| Warm start (simple page) | — | 115-131ms | ~120ms |
| Warm start (React SPA) | — | 2,243ms | ~2.2s |
| Warm start (multi-page scraper) | — | 4,531ms | ~4.5s |
Cold start variance is dominated by whether the container image is cached on the Lambda worker. After the first pull, subsequent cold starts on the same host are ~2s.
| Workload | Peak Memory |
|---|---|
| Simple page (title extraction) | ~470 MB |
| React SPA (fill, click, toggle) | ~525 MB |
| Multi-page scraper (5 URLs) | ~660 MB |
| Invocations/month | Estimated cost |
|---|---|
| 1,000 | $0.15 |
| 10,000 | $1.52 |
| 100,000 | $15.23 |
This image applies several techniques to minimize cold start latency:
- Module-level browser launch — Chromium starts during Lambda's free init phase (not billed)
- Optimized Chromium flags — 15+ flags disable unnecessary features (extensions, sync, translate, background networking, component updates)
- Disk cache in /tmp — V8 compiled code and resources persist across warm invocations
- Layer ordering — Dockerfile layers ordered by change frequency (OS first, handler code last)
- Stripped locales and docs — unnecessary Chromium files removed from image
To avoid cold starts during normal traffic, add a scheduled warmup ping:
aws events put-rule --name playwright-warmup --schedule-expression "rate(5 minutes)"

The handler detects empty/warmup events (no script or s3_uri) and returns {"statusCode": 200, "body": "warm"} immediately, keeping the execution environment alive.
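The warmup check itself is simple. A sketch of the detection just described (illustrative only; handler.py is the source of truth, and handle_warmup is an invented name):

```python
def handle_warmup(event):
    """Return the warmup response for empty events, else None."""
    if not event.get("script") and not event.get("s3_uri"):
        return {"statusCode": 200, "body": "warm"}
    return None  # a real script payload: fall through to normal handling
```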
Cost comparison (2048 MB, keeping 1 instance warm):
| Strategy | Monthly cost | Guarantee |
|---|---|---|
| EventBridge warmup (5 min) | ~$0.07 | Best-effort (Lambda may occasionally recycle the environment) |
| Provisioned concurrency = 1 | ~$21.60 | Guaranteed always-warm |
The warmup approach is 300x cheaper and works well in practice — Lambda rarely recycles environments that are pinged every 5 minutes. Use provisioned concurrency only when you need a hard SLA on response latency.
The script and s3_uri fields execute arbitrary Python code with the full permissions of the Lambda execution role. Never expose this function to untrusted input without an authorization layer.
Production hardening:
- Use s3_uri with pre-approved scripts only — consider removing script field support in your fork
- Restrict the Lambda execution role to minimum required permissions
- Add API Gateway with IAM auth or a custom authorizer if exposing via HTTP
- Use VPC and security groups to limit Chromium's network access
- Set PLAYWRIGHT_DEBUG=false (default) to suppress stack traces in error responses. Set to true only for development.
Other notes:
- No public endpoints. The SAM template creates a Lambda function with no Function URL, no API Gateway, and no public access. Invoke via SDK or CLI only.
- Chromium runs with --no-sandbox because Lambda does not support the Chrome sandbox. Navigating to untrusted URLs exposes the function to browser-level exploits without sandbox protection.
- Chromium binds to localhost. No network ports are exposed.
src/
Dockerfile Chromium image (Ubuntu 25.04 + Chromium + Playwright + Lambda RIE)
Dockerfile.lightpanda Lightpanda image (no Chromium, ~450 MB)
handler.py Lambda handler (script injection runtime)
entry.sh Bootstrap (Lambda RIE for local, awslambdaric for deployed)
requirements.txt Python dependencies
infra/
template.yaml SAM template (one function, no public access)
examples/
invoke.py Python helper for invoking the function (local + deployed)
hacker_news_scraper.py Multi-step scraper (navigates 5+ pages)
extract_links.py Extract all links from a page
form_fill_submit.py Fill and submit a login form
screenshot_to_s3.py Full-page screenshot uploaded to S3
todomvc_add_items.py React SPA interaction
wait_and_extract.py Wait for dynamic content, extract structured data
Makefile build / test / deploy shortcuts
ARCHITECTURE.md Integration patterns (API Gateway, Step Functions, SQS, EventBridge)
Lambda zip deployments have a 250 MB unzipped limit. Chromium alone is ~300 MB. Container images support up to 10 GB, and Lambda caches them across invocations.
Ubuntu 25.04 is used because Playwright's Chromium requires GLIBC 2.39+, which Amazon Linux 2023 (GLIBC 2.34) does not ship. Ubuntu 25.04 provides GLIBC 2.41 and Python 3.13 out of the box.
- Docker
- AWS SAM CLI (for deployment)
- Python 3.12+ (for local invocation helper; container uses 3.13)
- AWS credentials configured
MIT

