Multi-page extraction
Dembrandt now supports multi-page crawling and merges the results from all crawled pages. Use the following new flags to gather data from across a domain:
--pages N: Crawls up to N pages. It automatically prioritizes high-value links (such as /pricing or /features) and filters out noise such as terms and privacy pages.
--sitemap: Discovers URLs via sitemap.xml instead of DOM scraping. It supports robots.txt directives, nested sitemap indexes, and domain variants.
Example:
dembrandt stripe.com --sitemap --pages 10
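
To make the combined behavior of the two flags more concrete, here is a rough Python sketch of the general idea: read URLs from a sitemap, drop noise pages, rank high-value paths first, and cap the list at N. This is not Dembrandt's actual implementation. The function names (urls_from_sitemap, select_pages), the keyword lists, and the example.com URLs are all assumptions made for illustration, and the sketch ignores robots.txt, nested sitemap indexes, and domain variants.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical keyword lists; the heuristics Dembrandt really uses may differ.
HIGH_VALUE = ("pricing", "features")
NOISE = ("terms", "privacy")

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap <urlset> document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


def select_pages(urls: list[str], limit: int) -> list[str]:
    """Drop noise pages, rank high-value paths first, keep at most `limit`."""
    def path(url: str) -> str:
        return urlparse(url).path.lower()

    kept = [u for u in urls if not any(word in path(u) for word in NOISE)]
    # sorted() is stable, so URLs with equal rank keep their sitemap order.
    ranked = sorted(kept, key=lambda u: not any(w in path(u) for w in HIGH_VALUE))
    return ranked[:limit]


# Made-up sitemap used only for this illustration.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/privacy</loc></url>
  <url><loc>https://example.com/blog</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
</urlset>"""

print(select_pages(urls_from_sitemap(SAMPLE), limit=2))
# prints ['https://example.com/pricing', 'https://example.com/']
```

In this sketch the privacy page is discarded, the pricing page moves to the front, and the remaining pages keep their sitemap order until the limit is reached.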