georgemandis/tezcatl

tezcatl

A lightweight CLI for rendering web pages and scraping content using native macOS WebKit.

tezcatl loads URLs through the system WKWebView, waits for JavaScript to render, and returns the fully rendered DOM or the result of custom JS evaluation. No headless Chrome, no Puppeteer, no heavy dependencies — just the WebKit engine already on your Mac.

This isn't meant for production scraping pipelines or large-scale crawling. It's a good fit for small tasks, personal projects, experiments, and testing or evaluation workflows where you're already working in the macOS ecosystem and want something simple that just works.

Written in Zig. Uses Apple's WKWebView via Objective-C runtime bindings.

Install

zig build -Doptimize=ReleaseFast
cp zig-out/bin/tezcatl /usr/local/bin/

Usage

Render a page

$ tezcatl https://example.com
<html lang="en"><head><title>Example Domain</title>...

$ tezcatl https://spa-site.com --wait=2000
# waits 2s after load for JS frameworks to render

Evaluate JavaScript

$ tezcatl https://example.com --eval="document.title"
Example Domain

$ tezcatl https://example.com --eval="document.querySelectorAll('a').length"
1

$ tezcatl https://example.com --eval="document.title" --json
{"result":"Example Domain"}

JSON output

$ tezcatl https://example.com --json | jq '.html' | head -c 100
"<html lang=\"en\"><head><title>Example Domain</title>...

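For scripts that consume the --json output, a minimal Python sketch of decoding it might look like the following. It assumes only the two shapes shown above (a "result" key when --eval is used, an "html" key otherwise); the function name parse_tezcatl_json is hypothetical and not part of tezcatl.

```python
import json
from typing import Any


def parse_tezcatl_json(stdout: str) -> Any:
    """Decode the text printed by `tezcatl ... --json`.

    Assumption: the payload is a JSON object holding either "result"
    (with --eval) or "html" (without), as in the examples above.
    """
    payload = json.loads(stdout)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object from tezcatl --json")
    if "result" in payload:
        return payload["result"]
    if "html" in payload:
        return payload["html"]
    raise KeyError("neither 'result' nor 'html' found in tezcatl output")


# Sample strings copied from the README examples, not from a live run.
title = parse_tezcatl_json('{"result":"Example Domain"}')
markup = parse_tezcatl_json('{"html":"<html lang=\\"en\\"><head></head></html>"}')
```

Here title is the string Example Domain and markup is the decoded HTML with its quotes unescaped.
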
Composability

tezcatl takes a URL as an argument and writes to stdout, so it pipes naturally into other tools:

# Get the rendered DOM and detect its language
tezcatl https://example.com | lingua detect

# Extract all links from a JS-rendered page
tezcatl https://spa-site.com --wait=2000 --eval="JSON.stringify([...document.querySelectorAll('a')].map(a => a.href))"

# Scrape a page title for use in a script
TITLE=$(tezcatl https://example.com --eval="document.title")

# Get rendered HTML and extract phone numbers
tezcatl https://business-site.com --wait=1000 | lingua entities --type=phone
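
The same composition works from a script in another language. The sketch below wraps the link-extraction command from Python. It assumes tezcatl is on PATH and that a string result from --eval is printed to stdout as-is (as the document.title example suggests), so the JSON.stringify output can be decoded directly. The names run_tezcatl and extract_links are hypothetical.

```python
import json
import subprocess
from typing import Callable, List

# The same expression used in the link-extraction example above.
LINKS_JS = "JSON.stringify([...document.querySelectorAll('a')].map(a => a.href))"


def run_tezcatl(args: List[str]) -> str:
    """Run the tezcatl binary (assumed to be on PATH) and return its stdout."""
    completed = subprocess.run(
        ["tezcatl", *args], capture_output=True, text=True, check=True
    )
    return completed.stdout


def extract_links(
    url: str,
    wait_ms: int = 2000,
    run: Callable[[List[str]], str] = run_tezcatl,
) -> List[str]:
    """Return every link href on the rendered page as a list of strings.

    `run` is injectable so the parsing can be exercised without macOS.
    """
    stdout = run([url, f"--wait={wait_ms}", f"--eval={LINKS_JS}"])
    return json.loads(stdout)
```

On a Mac with tezcatl installed, extract_links("https://spa-site.com") would run the command shown above and return a Python list. Passing arguments as a list avoids shell quoting problems with the JavaScript expression.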

Options

tezcatl <url> [options]

  --eval=JS            Evaluate custom JavaScript instead of returning DOM
  --wait=MS            Wait N ms after page load for JS to settle (default: 0)
  --timeout=MS         Navigation timeout in ms (default: 30000)
  --json               Wrap output in JSON
  --help, -h           Show this help message
  --version, -v        Show version
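
When calling tezcatl from a script, the options above map naturally onto an argument list. The following is a small Python sketch of that mapping. The function build_command and its parameter names are hypothetical; only the flag spellings and defaults come from the help text above.

```python
from typing import List, Optional

DEFAULT_TIMEOUT_MS = 30000  # default stated in the help text


def build_command(
    url: str,
    eval_js: Optional[str] = None,
    wait_ms: int = 0,
    timeout_ms: int = DEFAULT_TIMEOUT_MS,
    json_output: bool = False,
) -> List[str]:
    """Build an argv list for tezcatl, omitting flags left at their defaults."""
    if wait_ms < 0 or timeout_ms < 0:
        raise ValueError("wait_ms and timeout_ms must be non-negative")
    cmd = ["tezcatl", url]
    if eval_js is not None:
        cmd.append(f"--eval={eval_js}")
    if wait_ms:
        cmd.append(f"--wait={wait_ms}")
    if timeout_ms != DEFAULT_TIMEOUT_MS:
        cmd.append(f"--timeout={timeout_ms}")
    if json_output:
        cmd.append("--json")
    return cmd


cmd = build_command("https://example.com", eval_js="document.title", json_output=True)
# cmd == ["tezcatl", "https://example.com", "--eval=document.title", "--json"]
```

The resulting list can be handed to subprocess.run without going through a shell, so the JavaScript needs no extra quoting.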

Requirements

  • macOS 10.15+ (Catalina or later)
  • Zig 0.16+

How It Works

tezcatl creates an offscreen WKWebView, loads the URL, and waits for the navigation delegate's webView:didFinishNavigation: callback. It then optionally waits for the --wait settling period and evaluates document.documentElement.outerHTML (or the custom JS passed via --eval) through evaluateJavaScript:completionHandler:.

The Dock icon is suppressed via NSApplicationActivationPolicyAccessory. All WebKit rendering happens in-process using the system engine — the same one Safari uses.

Key bridging patterns:

  • Navigation delegate: Runtime class creation (objc_allocateClassPair) with WKNavigationDelegate callbacks
  • JS completion handler: ObjC block ABI (_NSConcreteStackBlock) for async evaluation callbacks
  • Run loop: CFRunLoopRunInMode to pump the event loop while waiting for async operations
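
The run-loop pattern in the last bullet can be illustrated outside of Zig. The sketch below is a Python analogy, not tezcatl's code: pump stands in for a call such as CFRunLoopRunInMode with a short interval, and done stands in for a flag that the navigation delegate or the JS completion handler would set. All names here are hypothetical.

```python
import time
from typing import Callable


def pump_until(
    done: Callable[[], bool],
    pump: Callable[[float], None],
    timeout_s: float,
    slice_s: float = 0.01,
) -> bool:
    """Pump an event loop in short slices until `done()` is true.

    Returns True if the condition was met, False if `timeout_s` elapsed
    first. `pump(seconds)` should process pending events for at most
    `seconds` and then return.
    """
    deadline = time.monotonic() + timeout_s
    while not done():
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False  # e.g. a navigation timeout
        # Never pump past the deadline.
        pump(min(slice_s, remaining))
    return True
```

Pumping in short slices lets a single-threaded program keep delivering callbacks while still checking the completion flag and enforcing a timeout such as --timeout.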

Related Projects

  • lingua — NLP CLI (NaturalLanguage framework)
  • loupe — Computer vision CLI (Vision framework)
  • whereami — Location CLI (CoreLocation)
  • nearme — Local search CLI (MapKit)

Credits

Created by George Mandis while at the Recurse Center.

About

curl for rendered DOMs on macOS. Headless web rendering CLI powered by native macOS WebKit. Render JS-heavy pages, extract DOM, evaluate JavaScript, all from the command line.
