v0.0.7

@us released this 14 Mar 12:31
· 469 commits to main since this release
  • success: false on 4xx targets — scraping a 403/404/429 target with minimal body now correctly returns success: false with error details, instead of success: true with a warning. Targets with real content (custom error pages) still return success: true with a warning
  • JS renderer fallback warning — when renderJs: true is requested but no CDP renderer is available, the response now includes rendered_with: "http_only_fallback" and a warning instead of silently falling back
  • CDP health check - is_available() now runs a real Browser.getVersion command instead of just testing the WebSocket connection
  • Specific error messages — unknown formats now return descriptive errors (e.g., "Unknown format 'extract'. Valid formats: ...") instead of generic 422
  • "extract" format alias - formats: ["extract"] and formats: ["llm-extract"] are now accepted as aliases for "json" (Firecrawl compatibility)
  • Chunk dedup by default — deduplication is now enabled by default for all chunking strategies; separator-only chunks (---, ***) are filtered out
  • Chunk relevance scores — chunks now return { content, score, index } objects instead of plain strings when a query is provided
  • Map timeout - /v1/map accepts a timeout parameter (default 120s, max 300s) to prevent 502s on large sites
  • Stealth + JS rendering fix - stealth: true with renderJs: true no longer bypasses CDP; the shared renderer is used with stealth headers injected
  • BM25 NaN guard — prevents NaN scores when all chunks are empty
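
Several of the changes above alter the response shape, so client code written against earlier releases may need adjusting. The following is a minimal client-side sketch in Python, not part of the release itself. It assumes the response body has already been parsed into a dict. The names success, rendered_with, "http_only_fallback", and the chunk fields content, score, and index come from the notes above; the keys "error", "warning", and "chunks", and the helper names, are hypothetical and may differ in the real API.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Chunk:
    content: str
    score: Optional[float]  # None when the server returned a plain string
    index: int


def normalize_chunks(raw_chunks: list[Any]) -> list[Chunk]:
    """Accept both chunk shapes: plain strings, or {content, score, index} objects."""
    chunks = []
    for position, item in enumerate(raw_chunks):
        if isinstance(item, str):
            # No query was supplied, so there is no relevance score.
            chunks.append(Chunk(content=item, score=None, index=position))
        else:
            chunks.append(
                Chunk(
                    content=item["content"],
                    score=item.get("score"),
                    index=item.get("index", position),
                )
            )
    return chunks


def summarize_response(body: dict[str, Any]) -> dict[str, Any]:
    """Collapse a parsed response into the fields a caller usually checks."""
    # "error", "warning", and "chunks" are assumed key names (hypothetical).
    ok = bool(body.get("success"))
    return {
        "ok": ok,
        "error": None if ok else body.get("error"),
        "warning": body.get("warning"),
        # True when renderJs was requested but no CDP renderer was available.
        "js_fallback": body.get("rendered_with") == "http_only_fallback",
        "chunks": normalize_chunks(body.get("chunks", [])),
    }
```

A caller would typically check "ok" first, then surface "warning" and "js_fallback" to the user rather than dropping them, since a success: true response can still carry a warning (for example, a 4xx target that served a real custom error page).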