A CLI tool to detect broken links on any website. Given a starting URL, it crawls all internal pages recursively, collects external links, checks their HTTP status codes, and outputs a CSV report.
- Recursive crawling of all internal links
- External links checked but not followed
- Visited URL tracking to prevent infinite loops
- Parallel HTTP requests for performance
- CSV output:
link, referrer, http_status_code - Markdown summary of non-200 results per scan
- Email notification via the Resend API when non-200 results are found
Requires Python 3.10+.
pip install -r requirements.txtpython src/checker.py <start_url> [options]| Option | Default | Description |
|---|---|---|
start_url |
(required) | The URL to begin crawling from (must use http or https) |
--output, -o |
scans/[WEBSITE]/[TIMESTAMP]/results.csv |
Path to the output CSV file |
--workers, -w |
10 |
Number of parallel threads |
--timeout, -t |
10 |
Per-request timeout in seconds |
--user-agent |
deadlinkprobe/1.0 |
User-Agent header sent with every request |
--notify-email |
(omitted) | Recipient address for a post-scan email notification |
When --output is omitted, results are written to scans/[WEBSITE]/[TIMESTAMP]/ and a README.md summary is produced alongside results.csv.
Example:
python src/checker.py https://example.com --output report.csv --workers 20Pass --notify-email with a recipient address to receive a summary email after each scan. Three environment variables must be set:
| Variable | Description |
|---|---|
RESEND_API_KEY |
API key from your Resend account |
RESEND_FROM_ADDRESS |
Verified sender address in Resend |
export RESEND_API_KEY=re_xxx
export RESEND_FROM_ADDRESS=scanner@yourdomain.com
python src/checker.py https://example.com --notify-email you@example.comSee Resend website for setting up your account.
When using npm run scan-with-email, make sure to add the environment variables:
export RESEND_API_KEY=re_xxxxx && export RESEND_FROM_ADDRESS=noreply@yourdomain.fr && npm run scan-with-email random@yopmail.comResults are written to a CSV file with the following columns:
| Column | Description |
|---|---|
link |
The URL that was checked |
referrer |
The page that contained the link |
http_status_code |
HTTP response code (e.g. 200, 404, 500) |
python -m pytest tests/ -vA sample website is available to test the tool against:
https://deadlinkprobe-sample-website.netlify.app
The expected output is (sorted by referrer, then link):
link,referrer,http_status_code
https://deadlinkprobe-sample-website.netlify.app/about/,https://deadlinkprobe-sample-website.netlify.app/,404
https://deadlinkprobe-sample-website.netlify.app/blog,https://deadlinkprobe-sample-website.netlify.app/,200
https://deadlinkprobe-sample-website.netlify.app/contact,https://deadlinkprobe-sample-website.netlify.app/,200
https://iamjeremie.me/,https://deadlinkprobe-sample-website.netlify.app/contact,200
https://iamjeremie.me/doesnt-exist,https://deadlinkprobe-sample-website.netlify.app/contact,404Notes:
/about/is an intentional dead link (404) on the sample site.https://iamjeremie.me/andhttps://iamjeremie.me/doesnt-existare external links, checked but not crawled.- Links to
/discovered from/contactand/blogare skipped as already visited.
This project uses a multi-agent pipeline powered by Claude Code. Describe a feature request or bug fix and the orchestrator coordinates specialist agents automatically.
| Agent | Role |
|---|---|
| Specification | Writes detailed specs from the user request |
| Coder | Implements the feature based on specs |
| Tester | Validates the implementation with tests |
| Versioning | Commits to a Git Flow branch and pushes |
Human approval gates pause the pipeline after specs and after coding.
.agents-output/
├── 0-user-requests/ # your input
├── 1-business-specifications/ # specification output
├── 2-technical-specifications/ # coder output
└── 3-test-results/ # test output
| Change type | Branch prefix |
|---|---|
| New feature | feature/ |
| Bug fix | fix/ |
| Docs only | docs/ |
See LICENSE.