SiteSavvy v0.6.0

@Bloody-Crow released this 24 Jun 00:31

Capture the web, your way.

v0.6.0 completes the feature set with seven new modules covering pagination, authentication, proxy/Tor, stealth, recipes, docs-site mode, and offline full-text search. These build on the AI/RAG/MCP capabilities introduced in v0.5.0.

Installation

pip install sitesavvy

Or download the stand-alone binary for your OS (no Python required) from the assets below.

What's new in v0.6.0

| Feature | Flag | Module |
| --- | --- | --- |
| 📄 Pagination awareness | --follow-pagination (default on) | pagination.py |
| 🔐 Authenticated crawling | --login-url / --login-user / --login-pass | auth.py |
| 🌐 Proxy / Tor / SOCKS5 | --proxy http://... or socks5://... | proxies.py |
| 🥸 Stealth mode | --stealth | stealth.py |
| 🍳 Recipe mode → cookbook EPUB | --recipe-mode | recipe.py |
| 📚 Docs-site mode | --docs-mode | docs_mode.py |
| 🔍 Offline full-text search | --offline-search | offline_search.py |

Quick examples

# Offline-searchable mirror
sitesavvy crawl https://example.com --offline-search --format html --out-dir ./out
# → open ./out/search.html in any browser

# Recipe site → cookbook EPUB
sitesavvy crawl https://recipes.example.com --recipe-mode --out-dir ./out
# → ./out/sitesavvy-cookbook.epub

# Authenticated crawl
sitesavvy crawl https://private.example.com \
    --login-url https://private.example.com/login \
    --login-user alice --login-pass secret --out-dir ./out

# Tor + stealth
sitesavvy crawl https://example.onion \
    --proxy socks5://127.0.0.1:9050 --stealth --out-dir ./out
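
The authenticated-crawl example above puts the password directly on the command line, where it ends up in shell history. One way around that is to have a small wrapper read it from the environment. The sketch below is not part of SiteSavvy: the function name `build_login_crawl` and the environment variable `SITESAVVY_LOGIN_PASS` are hypothetical, and only the flags shown in the examples above are used.

```python
import os


def build_login_crawl(
    url: str,
    login_url: str,
    user: str,
    out_dir: str,
    password_env: str = "SITESAVVY_LOGIN_PASS",  # hypothetical variable name
) -> list[str]:
    """Build the argument list for an authenticated crawl.

    The password is read from the environment rather than being typed
    into the shell, so it does not land in shell history or scripts.
    """
    password = os.environ.get(password_env)
    if not password:
        raise RuntimeError(f"set {password_env} before running the crawl")
    # Only flags documented in this release are used here.
    return [
        "sitesavvy", "crawl", url,
        "--login-url", login_url,
        "--login-user", user,
        "--login-pass", password,
        "--out-dir", out_dir,
    ]
```

Pass the returned list to `subprocess.run(argv, check=True)` to start the crawl. The password is still visible in the process's argument list while the crawl runs, so this only keeps it out of history and version-controlled scripts.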

Stats

  • 38 source modules (7 new in v0.6.0)
  • 534 tests passing (252 new), 90% coverage
  • Lint and type checks clean (ruff check . and mypy sitesavvy)
  • Tested on Python 3.12

Release assets

| Asset | OS | Notes |
| --- | --- | --- |
| sitesavvy-0.6.0-linux-x86_64.tar.gz | Linux x86_64 | Single-file PyInstaller binary |
| sitesavvy-0.6.0-macos-x86_64.tar.gz | macOS x86_64 | Single-file PyInstaller binary |
| sitesavvy-0.6.0-windows-x86_64.exe | Windows x86_64 | Single-file PyInstaller binary |
| sitesavvy-0.6.0-py3-none-any.whl | Universal | pip install wheel |
| sitesavvy-0.6.0.tar.gz | Universal | Source distribution |

Legal

SiteSavvy is provided for personal, non-commercial use only. Respect the copyright, terms of service, and robots.txt of every site you crawl. Licensed under the MIT License.