linktrace 0.2.0

@JayBaywatch JayBaywatch released this 08 Jun 22:16
· 43 commits to main since this release

This release tightens up the first public PyPI package release with rebrand cleanup, crawler correctness fixes, and improved robots.txt behavior.

Highlights

  • Completed the remaining WebCrawler-to-linktrace rename cleanup in developer commands.
  • Fixed coverage and demo commands in the justfile to target the new linktrace package.
  • Improved robots.txt handling by checking whether a URL is allowed before fetching.
  • Fixed per-document link collection so links discovered on one page no longer leak into later Document objects.
  • Preserved the simple public API: from linktrace import Spider.
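
For readers curious what "checking whether a URL is allowed before fetching" can look like, here is a minimal sketch built on the standard library's urllib.robotparser. It is not linktrace's actual code: the Document dataclass, fetch_if_allowed, and fake_fetch are hypothetical names used only for illustration, and the rules are parsed from in-memory lines so the example runs without network access.

```python
from dataclasses import dataclass
from urllib.robotparser import RobotFileParser


@dataclass
class Document:
    # Hypothetical stand-in; linktrace's real Document may look different.
    url: str
    status: int
    body: str = ""


def fetch_if_allowed(url, robots, fetch, user_agent="*"):
    """Consult the robots.txt rules first; only call fetch() when allowed."""
    if not robots.can_fetch(user_agent, url):
        # Disallowed: no request is made, a 403 placeholder is returned instead.
        return Document(url=url, status=403)
    return fetch(url)


# Parse rules from in-memory lines instead of downloading a robots.txt file.
robots = RobotFileParser()
robots.parse(["User-agent: *", "Disallow: /private/"])

calls = []  # records which URLs were actually fetched


def fake_fetch(url):
    # Hypothetical fetcher that stands in for a real HTTP request.
    calls.append(url)
    return Document(url=url, status=200, body="<html></html>")


allowed = fetch_if_allowed("https://example.com/index.html", robots, fake_fetch)
blocked = fetch_if_allowed("https://example.com/private/page", robots, fake_fetch)
```

In this sketch the disallowed URL never reaches fake_fetch, which mirrors the "skipped before fetch" behavior described in these notes.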

Fixes

  • just test-cov now reports coverage against linktrace.
  • just run now runs python -m linktrace.Spider.
  • parse_document() now uses a document-local found_links list instead of crawler-wide link state.
  • Disallowed URLs are skipped before fetch and represented with a 403 status document.
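
The link-leak fix comes down to giving each parsed page its own list. The sketch below illustrates that idea with the standard library's HTMLParser; it is an assumption-laden illustration, not the project's implementation. Only the names parse_document and found_links come from these notes; the signature, the Document shape, and the _LinkCollector helper are hypothetical.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from urllib.parse import urljoin


@dataclass
class Document:
    # Hypothetical shape; default_factory gives every Document its own list.
    url: str
    found_links: list = field(default_factory=list)


class _LinkCollector(HTMLParser):
    """Hypothetical helper that gathers href values from <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []  # created per instance, so never shared across pages

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))


def parse_document(url, html):
    # A new collector per call keeps link state local to this document.
    collector = _LinkCollector(url)
    collector.feed(html)
    return Document(url=url, found_links=collector.links)


first = parse_document("https://example.com/a", '<a href="/b">b</a>')
second = parse_document("https://example.com/c", '<a href="d">d</a>')
```

Because the list is created inside each call rather than stored on the crawler, links found on the first page cannot appear in the second Document.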

Upgrade

pip install --upgrade linktrace