prabhatdotdev/linkscan

LinkScan

A fast, self-contained broken link scanner for developers and SEO teams.

What it does

LinkScan crawls a website from a seed URL, discovers every link on each page, and checks whether those links are reachable. It classifies failures by type (client error, server error, timeout, invalid URL) and tracks every page where each broken link was found. Use the CLI for quick scans and CI integration, or launch the built-in local web UI for an interactive experience with downloadable reports.
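One way to picture "tracks every page where each broken link was found" is an index from each link to the set of pages that reference it. The sketch below is illustrative only: `linkIndex`, `add`, and `sources` are hypothetical names, not taken from the LinkScan source.

```go
package main

import (
	"fmt"
	"sort"
)

// linkIndex maps each discovered link to the set of pages that reference it.
// Hypothetical type for illustration; not LinkScan's actual data structure.
type linkIndex map[string]map[string]struct{}

// add records that page contains link and reports whether the link is new,
// so a caller could schedule exactly one check per unique link.
func (idx linkIndex) add(link, page string) bool {
	pages, seen := idx[link]
	if !seen {
		pages = make(map[string]struct{})
		idx[link] = pages
	}
	pages[page] = struct{}{}
	return !seen
}

// sources returns the pages that reference link, sorted for stable output.
func (idx linkIndex) sources(link string) []string {
	out := make([]string, 0, len(idx[link]))
	for p := range idx[link] {
		out = append(out, p)
	}
	sort.Strings(out)
	return out
}

func main() {
	idx := linkIndex{}
	fmt.Println(idx.add("https://example.com/missing", "https://example.com/"))      // true
	fmt.Println(idx.add("https://example.com/missing", "https://example.com/about")) // false
	fmt.Println(idx.sources("https://example.com/missing"))
}
```

Keeping the sources as a set means a page that links to the same URL several times is recorded once.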

Install

Download binary

Download the latest release for your platform from GitHub Releases.

Build from source

git clone https://github.com/prabhatdotdev/linkscan
cd linkscan
go build -o linkscan .

Quick start

CLI

# Scan a site and print broken links
./linkscan scan https://example.com

# Limit requests to 2 per second per host to avoid 429s
./linkscan scan https://example.com --rate-limit 2

# Scan only internal links
./linkscan scan https://example.com --ignore-external

# Treat bot-blocking and rate-limit status codes as OK
./linkscan scan https://example.com --ignore-codes 403,429,999

# Output as JSON
./linkscan scan https://example.com --format json

# Save the scan result as JSON, then generate HTML, CSV, and JSON reports from it
./linkscan scan https://example.com --out result.json
./linkscan report --input result.json --out ./reports

Web UI

./linkscan ui --port 8080
# Open http://localhost:8080

All flags

Global flags (available to all subcommands)

Flag               Default         Description
--depth            3               Max crawl depth
--crawl-workers    5               Concurrent page fetching goroutines
--check-workers    10              Concurrent link checking goroutines
--timeout          10              Per-request timeout in seconds
--rate-limit       0               Max requests per second per host (0 = unlimited)
--user-agent       linkscan/1.0    Custom User-Agent string
--ignore-external  false           Skip checking external links
--ignore-codes     (none)          Comma-separated HTTP status codes to treat as OK (e.g. 403,429,999)
--quiet            false           Suppress progress logs, errors only
--verbose          false           Log every request for debugging

scan flags

Flag            Default  Description
--format        table    Output format: table, json
--out           (none)   Write JSON result to this file path
--report-style  flat     Report style for table output: flat, bypage

report flags

Flag      Default        Description
--input   (required)     Path to JSON scan result file
--out     ./reports      Output directory for report files
--format  html,csv,json  Comma-separated list of formats: html, csv, json, pagehtml

ui flags

Flag    Default  Description
--port  8080     Port to listen on

Report formats

Format    File                Use case
html      report.html         Human-readable flat report with summary and error table
csv       report.csv          Import into spreadsheets for filtering and sorting
json      report.json         Machine-readable, full scan result for CI pipelines
pagehtml  report-bypage.html  Page-centric HTML report grouped by source page

Link classes

Class         Meaning
OK            Link returned 2xx status
Redirect      Link returned 3xx status
BrokenClient  Link returned 4xx status (not found, forbidden, etc.)
BrokenServer  Link returned 5xx status (server error)
Timeout       Request timed out before a response was received
InvalidURL    URL could not be parsed or has an unsupported scheme
Skipped       Link was not checked (e.g. external link with --ignore-external)

How it works

  • Crawler: A bounded worker pool fetches pages concurrently, respecting domain boundaries and depth limits. Each discovered link is deduplicated and all source pages are tracked.
  • Checker: Validates links using HEAD requests with GET fallback. Per-host token bucket rate limiting prevents overwhelming target servers.
  • Reports: All output formats are generated from a single in-memory ScanResult struct, ensuring consistency across HTML, CSV, JSON, and page-grouped reports.

Contributing

  • Requires Go 1.26+
  • Run go test ./... to execute the test suite
  • External dependencies: cobra, goquery, golang.org/x/time/rate
  • PRs are welcome; please open an issue first for large changes

License

MIT
