v2.5.1

github-actions released this 27 Jun 05:52 · 2 commits to main since this release

Browser rendering and the AI assistant are now built into the default pre-built binaries — no special build needed. This is the practical follow-up to 2.5.0: just download the binary for your platform and use --browser / --ai-* directly.

The browser engine adds only the ~6 MB chromiumoxide CDP client (binary ~24 MB vs ~17 MB without it). The actual Chromium is never bundled — it is auto-detected on your system or downloaded to a cache on first use. For a lean binary without browser support, build with cargo build --release --no-default-features.


🌐 Browser rendering (optional, --browser)

Crawl JavaScript-heavy / SPA sites by rendering each page in a real Chromium via the Chrome DevTools Protocol — the crawler sees the post-JS DOM, so link discovery, offline export and markdown export all work on dynamic sites.

  • Headless by default, or --browser-headful to watch the browser open each page.
  • Screenshots of every page — viewport (custom resolution) or full-page (entire scroll height), as PNG / JPG / WebP.
  • Console / error diagnostics per page — JavaScript console errors, uncaught exceptions, failed sub-requests (404/5xx), CSP/CORS/mixed-content violations, plus render/screenshot failures — in a "Browser issues" report table.
  • No Node.js required: the crawler auto-detects an installed Chrome/Chromium/Edge/Brave, otherwise offers a one-time chrome-headless-shell download; alternatively, pass --browser-path.
  • Reliable waiting: --browser-wait=load|domcontentloaded|networkidle (default networkidle) for proper SPA settling, with a hard --browser-timeout.
  • Forwards --proxy / --resolve / User-Agent to the browser, respects TLS errors, uses an isolated per-run profile.
./siteone-crawler --url=https://my.spa.tld --browser --screenshots --screenshot-mode=full-page
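
As a hedged variation on the command above, the sketch below combines two flags named in the list: a lighter wait condition and an explicit browser binary. The --flag=value form for --browser-path and the path /usr/bin/chromium are assumptions for illustration, not confirmed by these notes. The command is only printed so it can be reviewed before running.

```shell
# Sketch only: /usr/bin/chromium is an assumed location, adjust it for your system.
CRAWLER=./siteone-crawler
BROWSER_FLAGS="--browser --browser-wait=domcontentloaded --browser-path=/usr/bin/chromium"

# Build the full command line from the binary, the target URL and the browser flags.
CMD="$CRAWLER --url=https://my.spa.tld $BROWSER_FLAGS"

# Print the command for review; paste the printed line into your shell to run it.
echo "$CMD"
```

Switching from the default networkidle to domcontentloaded trades SPA settling for speed, so it is likely only suitable for pages that render most of their content early.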

🤖 AI assistant (optional, --ai-*)

Optional, opt-in LLM analysis — works with OpenAI, Anthropic, Gemini, and any OpenAI-compatible endpoint (vLLM, LiteLLM, MiniMax, Ollama).

  • AI SEO rewrites, llms.txt / llms-full.txt generation, spelling/grammar checks, custom policy prompts, and an executive summary across five quality areas.
  • Smart page selection + hard caps + --ai-dry-run cost preview; safe API-key handling (env var, redacted from logs).
  • Browser diagnostics → AI: in --browser mode, captured console/network errors are exposed to the custom AI prompt via the {{browser_diagnostics}} placeholder.
./siteone-crawler --url=https://my.domain.tld \
  --ai-provider=openai-compatible --ai-endpoint=https://api.example.com/v1 --ai-model=<model> \
  --ai-actions=seo,typos,summary
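
To preview cost before any paid API calls, the --ai-dry-run option from the list above can be appended to the same command. This is a minimal sketch, assuming --ai-dry-run is a plain switch that takes no value. AI_MODEL is a hypothetical shell variable standing in for the <model> placeholder, and the API key is expected to come from an environment variable whose name these notes do not give.

```shell
# Sketch only: AI_MODEL is a hypothetical variable; it falls back to the literal <model> placeholder.
CRAWLER=./siteone-crawler
AI_MODEL="${AI_MODEL:-<model>}"

# Same flags as the example above, plus --ai-dry-run (assumed to be a value-less switch).
CMD="$CRAWLER --url=https://my.domain.tld \
--ai-provider=openai-compatible --ai-endpoint=https://api.example.com/v1 \
--ai-model=$AI_MODEL --ai-actions=seo,typos,summary --ai-dry-run"

# Print the command for review; paste the printed line into your shell to run it.
echo "$CMD"
```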

🔧 What changed since 2.5.0

  • Browser rendering is now a default Cargo feature → the pre-built binaries (and cargo build) include it. Build the lean variant with --no-default-features.

(2.5.0's binaries shipped without the browser engine; use 2.5.1 if you want --browser from the pre-built binaries.)

📦 Install

Download the archive/package for your platform from the Assets below (Linux/macOS/Windows, x64 & arm64; glibc, musl/static, .deb/.rpm/.apk). Self-contained — the only optional runtime dependency is a Chromium-family browser, used only when you pass --browser.

Full changelog: v2.5.0...v2.5.1