A fast, self-contained broken link scanner for developers and SEO teams.
What it does
LinkScan crawls a website from a seed URL, discovers every link on each page, and checks whether those links are reachable. It classifies failures by type (client error, server error, timeout, invalid URL) and tracks every page where each broken link was found. Use the CLI for quick scans and CI integration, or launch the built-in local web UI for an interactive experience with downloadable reports.
Install
Download binary
Download the latest release for your platform from GitHub Releases.
Build from source
git clone https://github.com/prabhatdotdev/linkscan
cd linkscan
go build -o linkscan .
Quick start
CLI
# Scan a site and print broken links
./linkscan scan https://example.com
# Scan with rate limiting to avoid 429s
./linkscan scan https://example.com --rate-limit 2
# Scan only internal links
./linkscan scan https://example.com --ignore-external
# Ignore bot-blocking and rate limit codes
./linkscan scan https://example.com --ignore-codes 403,429,999
# Output as JSON
./linkscan scan https://example.com --format json
# Save the scan result as JSON, then generate HTML, CSV, and JSON reports from it
./linkscan scan https://example.com --out result.json
./linkscan report --input result.json --out ./reports
Web UI
./linkscan ui --port 8080
# Open http://localhost:8080
All flags
Global flags (available to all subcommands)
| Flag | Default | Description |
|------|---------|-------------|
| --depth | 3 | Max crawl depth |
| --crawl-workers | 5 | Concurrent page fetching goroutines |
| --check-workers | 10 | Concurrent link checking goroutines |
| --timeout | 10 | Per-request timeout in seconds |
| --rate-limit | 0 | Max requests per second per host (0 = unlimited) |
| --user-agent | linkscan/1.0 | Custom User-Agent string |
| --ignore-external | false | Skip checking external links |
| --ignore-codes | (none) | Comma-separated HTTP status codes to treat as OK (e.g. 403,429,999) |
| --quiet | false | Suppress progress logs, errors only |
| --verbose | false | Log every request for debugging |
scan flags
| Flag | Default | Description |
|------|---------|-------------|
| --format | table | Output format: table, json |
| --out | (none) | Write JSON result to this file path |
| --report-style | flat | Report style for table output: flat, bypage |
report flags
| Flag | Default | Description |
|------|---------|-------------|
| --input | (required) | Path to JSON scan result file |
| --out | ./reports | Output directory for report files |
| --format | html,csv,json | Comma-separated list of formats: html, csv, json, pagehtml |
ui flags
| Flag | Default | Description |
|------|---------|-------------|
| --port | 8080 | Port to listen on |
Report formats
| Format | File | Use case |
|--------|------|----------|
| html | report.html | Human-readable flat report with summary and error table |
| csv | report.csv | Import into spreadsheets for filtering and sorting |
| json | report.json | Machine-readable, full scan result for CI pipelines |
| pagehtml | report-bypage.html | Page-centric HTML report grouped by source page |
Link classes
| Class | Meaning |
|-------|---------|
| OK | Link returned 2xx status |
| Redirect | Link returned 3xx status |
| BrokenClient | Link returned 4xx status (not found, forbidden, etc.) |
| BrokenServer | Link returned 5xx status (server error) |
| Timeout | Request timed out before a response was received |
| InvalidURL | URL could not be parsed or has an unsupported scheme |
| Skipped | Link was not checked (e.g. external link with --ignore-external) |
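As a rough illustration of the table above, the status-based classes could be derived along these lines. This is a minimal sketch, not LinkScan's actual code: the names `LinkClass`, `classifyStatus`, and `isInvalidURL` are hypothetical, treating only `http` and `https` as supported schemes is an assumption, and so is reporting codes outside 200-599 as "not covered".

```go
package main

import (
	"fmt"
	"net/url"
)

// LinkClass mirrors the class names in the table above (hypothetical type).
type LinkClass string

const (
	ClassOK           LinkClass = "OK"
	ClassRedirect     LinkClass = "Redirect"
	ClassBrokenClient LinkClass = "BrokenClient"
	ClassBrokenServer LinkClass = "BrokenServer"
	ClassTimeout      LinkClass = "Timeout"
	ClassInvalidURL   LinkClass = "InvalidURL"
	ClassSkipped      LinkClass = "Skipped"
)

// classifyStatus maps an HTTP status code to a link class.
// Codes present in ignored are reported as OK, mirroring --ignore-codes.
// The boolean is false for codes outside 200-599, which the table does
// not cover (an assumption of this sketch).
func classifyStatus(status int, ignored map[int]bool) (LinkClass, bool) {
	switch {
	case ignored[status]:
		return ClassOK, true
	case status >= 200 && status < 300:
		return ClassOK, true
	case status >= 300 && status < 400:
		return ClassRedirect, true
	case status >= 400 && status < 500:
		return ClassBrokenClient, true
	case status >= 500 && status < 600:
		return ClassBrokenServer, true
	}
	return "", false
}

// isInvalidURL reports whether raw fails to parse or is not an absolute
// http/https URL. It expects links that were already resolved against
// their source page.
func isInvalidURL(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return true
	}
	return (u.Scheme != "http" && u.Scheme != "https") || u.Host == ""
}

func main() {
	ignored := map[int]bool{403: true, 429: true, 999: true}
	for _, code := range []int{200, 301, 403, 404, 503} {
		class, _ := classifyStatus(code, ignored)
		fmt.Printf("%d -> %s\n", code, class)
	}
	fmt.Println("mailto link invalid:", isInvalidURL("mailto:team@example.com"))
}
```

Timeout and Skipped do not depend on a status code, so a real checker would presumably assign them from the request error or the scan options rather than from a function like `classifyStatus`.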
How it works
Crawler: A bounded worker pool fetches pages concurrently, respecting domain boundaries and depth limits. Each discovered link is deduplicated and all source pages are tracked.
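The crawler description can be sketched as a level-by-level walk in which each level is processed by a fixed number of goroutines. This is only an illustration under stated assumptions: `crawl` and `fetchFunc` are hypothetical names, pages are fetched from an in-memory map instead of over HTTP, depth is counted as the number of levels fetched, and domain filtering is left out.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// fetchFunc returns the links found on a page (hypothetical stand-in
// for an HTTP fetch plus HTML link extraction).
type fetchFunc func(page string) []string

// crawl fetches pages level by level, up to maxDepth levels, using at
// most `workers` goroutines at a time. It returns every discovered link
// mapped to the sorted list of pages it was found on.
func crawl(seed string, maxDepth, workers int, fetch fetchFunc) map[string][]string {
	if workers < 1 {
		workers = 1
	}
	var mu sync.Mutex
	sources := map[string]map[string]bool{} // link -> set of source pages
	visited := map[string]bool{seed: true}  // pages already queued for fetching
	frontier := []string{seed}

	for depth := 0; depth < maxDepth && len(frontier) > 0; depth++ {
		jobs := make(chan string)
		var next []string
		var wg sync.WaitGroup
		for i := 0; i < workers; i++ {
			wg.Add(1)
			go func() {
				defer wg.Done()
				for page := range jobs {
					links := fetch(page)
					mu.Lock()
					for _, link := range links {
						if sources[link] == nil {
							sources[link] = map[string]bool{}
						}
						sources[link][page] = true // track every source page
						if !visited[link] {        // queue each link only once
							visited[link] = true
							next = append(next, link)
						}
					}
					mu.Unlock()
				}
			}()
		}
		for _, page := range frontier {
			jobs <- page
		}
		close(jobs)
		wg.Wait()
		frontier = next
	}

	out := make(map[string][]string, len(sources))
	for link, pages := range sources {
		list := make([]string, 0, len(pages))
		for p := range pages {
			list = append(list, p)
		}
		sort.Strings(list)
		out[link] = list
	}
	return out
}

func main() {
	// A tiny in-memory site standing in for real HTTP fetches.
	site := map[string][]string{
		"/":      {"/about", "/blog", "/missing"},
		"/about": {"/", "/missing"},
		"/blog":  {"/about"},
	}
	found := crawl("/", 3, 5, func(page string) []string { return site[page] })

	links := make([]string, 0, len(found))
	for link := range found {
		links = append(links, link)
	}
	sort.Strings(links)
	for _, link := range links {
		fmt.Printf("%s found on %v\n", link, found[link])
	}
}
```

Processing one level at a time keeps the depth limit trivial to enforce, at the cost of idle workers near the end of each level. A continuously fed queue would use the pool better but needs more bookkeeping to know when the crawl is finished.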
Checker: Validates links using HEAD requests with GET fallback. Per-host token bucket rate limiting prevents overwhelming target servers.
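The two ideas in the checker description, HEAD with GET fallback and a per-host token bucket, might look roughly like this. It is a sketch under assumptions rather than LinkScan's implementation: `requester`, `checkLink`, and `hostLimiter` are hypothetical names, the fallback fires when HEAD fails or returns a status of 400 or above, and the bucket starts full with a configurable burst.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
	"time"
)

// requester performs one HTTP request and returns its status code.
type requester func(method, rawURL string) (int, error)

// httpRequester adapts an *http.Client to the requester type.
func httpRequester(client *http.Client) requester {
	return func(method, rawURL string) (int, error) {
		req, err := http.NewRequest(method, rawURL, nil)
		if err != nil {
			return 0, err
		}
		resp, err := client.Do(req)
		if err != nil {
			return 0, err
		}
		defer resp.Body.Close()
		return resp.StatusCode, nil
	}
}

// checkLink tries HEAD first and repeats the request with GET when HEAD
// errors or returns 400 or above (assumed fallback condition).
func checkLink(do requester, rawURL string) (int, error) {
	status, err := do(http.MethodHead, rawURL)
	if err == nil && status < 400 {
		return status, nil
	}
	return do(http.MethodGet, rawURL)
}

type bucket struct {
	tokens float64
	last   time.Time
}

// hostLimiter keeps one token bucket per host.
type hostLimiter struct {
	mu      sync.Mutex
	rate    float64 // tokens added per second; 0 means unlimited
	burst   float64 // bucket capacity
	buckets map[string]*bucket
}

func newHostLimiter(rate, burst float64) *hostLimiter {
	if burst < 1 {
		burst = 1
	}
	return &hostLimiter{rate: rate, burst: burst, buckets: map[string]*bucket{}}
}

// Allow reports whether a request to host may be sent now, consuming
// one token if so.
func (l *hostLimiter) Allow(host string) bool {
	if l.rate <= 0 {
		return true
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	now := time.Now()
	b, ok := l.buckets[host]
	if !ok {
		b = &bucket{tokens: l.burst, last: now}
		l.buckets[host] = b
	}
	// Refill in proportion to elapsed time, capped at the burst size.
	b.tokens += now.Sub(b.last).Seconds() * l.rate
	if b.tokens > l.burst {
		b.tokens = l.burst
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	// Stub requester so the example runs offline: HEAD is rejected, GET succeeds.
	stub := func(method, rawURL string) (int, error) {
		if method == http.MethodHead {
			return http.StatusMethodNotAllowed, nil
		}
		return http.StatusOK, nil
	}
	limiter := newHostLimiter(2, 2)
	for i := 0; i < 3; i++ {
		if !limiter.Allow("example.com") {
			fmt.Println("rate limited, retry later")
			continue
		}
		status, err := checkLink(stub, "https://example.com/docs")
		fmt.Println(status, err)
	}
	// For real traffic, pass this to checkLink instead of the stub.
	_ = httpRequester(&http.Client{Timeout: 10 * time.Second})
}
```

A real checker would most likely wait for the next token instead of skipping the request. The `golang.org/x/time/rate` package provides a limiter with a blocking `Wait` method that could replace the hand-written bucket.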
Reports: All output formats are generated from a single in-memory ScanResult struct, ensuring consistency across HTML, CSV, JSON, and page-grouped reports.
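To illustrate why rendering from one struct keeps formats consistent, here is a small sketch that produces JSON and CSV from the same value. Only the name `ScanResult` comes from this README. Its fields, the `LinkResult` type, the CSV columns, and the `renderJSON` and `renderCSV` helpers are assumptions made for the example.

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// LinkResult is a hypothetical per-link record.
type LinkResult struct {
	URL     string   `json:"url"`
	Status  int      `json:"status"`
	Class   string   `json:"class"`
	Sources []string `json:"sources"`
}

// ScanResult holds everything the renderers need (assumed shape).
type ScanResult struct {
	Seed  string       `json:"seed"`
	Links []LinkResult `json:"links"`
}

// renderJSON serializes the whole result.
func renderJSON(r ScanResult) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// renderCSV writes one row per link, joining source pages with ";".
func renderCSV(r ScanResult) (string, error) {
	var sb strings.Builder
	w := csv.NewWriter(&sb)
	if err := w.Write([]string{"url", "status", "class", "sources"}); err != nil {
		return "", err
	}
	for _, l := range r.Links {
		row := []string{l.URL, strconv.Itoa(l.Status), l.Class, strings.Join(l.Sources, ";")}
		if err := w.Write(row); err != nil {
			return "", err
		}
	}
	w.Flush()
	return sb.String(), w.Error()
}

func main() {
	result := ScanResult{
		Seed: "https://example.com/",
		Links: []LinkResult{{
			URL:     "https://example.com/missing",
			Status:  404,
			Class:   "BrokenClient",
			Sources: []string{"https://example.com/", "https://example.com/about"},
		}},
	}
	js, err := renderJSON(result)
	if err != nil {
		fmt.Println("json error:", err)
		return
	}
	cs, err := renderCSV(result)
	if err != nil {
		fmt.Println("csv error:", err)
		return
	}
	fmt.Println(js)
	fmt.Print(cs)
}
```

Because both renderers read the same value and never re-run the scan, a link that appears in one report appears in the other with the same status and class.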