A command-line tool for automated website link checking. It helps QA engineers and web developers find broken links across their websites by crawling pages and validating internal links and, optionally, external links.
- 🔍 Recursive website crawling
- 🌐 Domain-scoped link checking (internal links by default; external links optional)
- 📊 Generates both CSV and Markdown reports
- ⚡ Real-time progress tracking
- 🚨 Detailed error reporting
- ⏱️ Configurable navigation timeouts
# Install globally from npm
npm install -g qa-engineer
# Or install globally from source
git clone https://github.com/yourusername/qa-engineer.git
cd qa-engineer
npm install
npm install -g .
qa-engineer https://mywebsite.test [--external|-e]
Replace https://mywebsite.test with the website URL you want to check.
Options:
- --external or -e: Enable checking of external links (links to other domains)
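
The command-line contract above can be sketched as a small argument parser. This is only an illustration of the documented usage, not the tool's actual source: the `parseArgs` function and the `startUrl` and `checkExternal` field names are hypothetical.

```javascript
// Hypothetical helper: parse `qa-engineer <url> [--external|-e]`.
// `argv` is the argument list without the node binary and script path,
// e.g. process.argv.slice(2).
function parseArgs(argv) {
  const options = { startUrl: null, checkExternal: false };

  for (const arg of argv) {
    if (arg === "--external" || arg === "-e") {
      options.checkExternal = true;
    } else if (arg.startsWith("-")) {
      throw new Error(`Unknown option: ${arg}`);
    } else if (options.startUrl === null) {
      // new URL() throws a TypeError on malformed input, which doubles as validation.
      // It also normalizes the URL (a bare origin gains a trailing slash).
      options.startUrl = new URL(arg).href;
    } else {
      throw new Error(`Unexpected extra argument: ${arg}`);
    }
  }

  if (options.startUrl === null) {
    throw new Error("Usage: qa-engineer <url> [--external|-e]");
  }
  return options;
}

console.log(parseArgs(["https://mywebsite.test", "-e"]));
```

Rejecting unknown flags early is a design choice of this sketch; the real tool may ignore them instead.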
Starting link check from: https://mywebsite.test
Checking: https://mywebsite.test
Checking: https://mywebsite.test/about
Checking: https://mywebsite.test/contact
...
Scan completed!
Total URLs checked: 25
Broken links found: 3
Internal broken links: 1
External broken links: 2
CSV report generated: broken-links.csv
Markdown report generated: broken-links.md
The tool generates two types of reports:
- CSV Report (broken-links.csv):

      URL,Domain Type,Error
      "https://mywebsite.test/missing-page","Internal","Page not found (404)"
      "https://external-site.test/broken","External","Connection refused"
      "https://another-site.test/error","External","Server error (500)"

- Markdown Report (broken-links.md):

      # Broken Links Report

      | URL | Domain Type | Error |
      | --- | ----------- | ----- |
      | https://mywebsite.test/missing-page | Internal | Page not found (404) |
      | https://external-site.test/broken | External | Connection refused |
      | https://another-site.test/error | External | Server error (500) |

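
As a rough illustration of how both report formats could be produced from the same data, here is a minimal sketch. The record shape `{ url, domainType, error }` and the function names are assumptions made for this example; the tool's internals may differ.

```javascript
// Hypothetical record shape: { url, domainType, error }.
const brokenLinks = [
  { url: "https://mywebsite.test/missing-page", domainType: "Internal", error: "Page not found (404)" },
  { url: "https://external-site.test/broken", domainType: "External", error: "Connection refused" },
];

// Quote a CSV field and double any embedded quotes.
function csvField(value) {
  return `"${String(value).replace(/"/g, '""')}"`;
}

function toCsv(rows) {
  const header = "URL,Domain Type,Error";
  const lines = rows.map((r) => [r.url, r.domainType, r.error].map(csvField).join(","));
  return [header, ...lines].join("\n") + "\n";
}

// Escape pipes so a cell value cannot break the table layout.
function mdCell(value) {
  return String(value).replace(/\|/g, "\\|");
}

function toMarkdown(rows) {
  const lines = [
    "# Broken Links Report",
    "",
    "| URL | Domain Type | Error |",
    "| --- | ----------- | ----- |",
    ...rows.map((r) => `| ${mdCell(r.url)} | ${mdCell(r.domainType)} | ${mdCell(r.error)} |`),
  ];
  return lines.join("\n") + "\n";
}

console.log(toCsv(brokenLinks));
console.log(toMarkdown(brokenLinks));
```

Writing the two strings to broken-links.csv and broken-links.md would then be a matter of passing them to Node's `fs.writeFileSync`.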
- Starts at the provided base URL
- Uses Puppeteer to load and render the page
- Extracts all anchor tags (<a> elements)
- Filters for internal links (same domain)
- Recursively visits each unvisited link
- Records any errors or non-200 status codes
- Generates detailed reports of findings
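
The steps above can be sketched as a queue-based crawl loop. To keep the example runnable without a browser, page loading is delegated to a hypothetical `loadPage` callback that resolves to `{ status, links }`; in the real tool that role is played by Puppeteer. The function name, result shape, and error labels are assumptions of this sketch.

```javascript
// Sketch of the crawl loop. `loadPage(url)` is a hypothetical callback that
// loads one URL and resolves to { status, links } (links = raw href strings).
async function crawl(startUrl, loadPage) {
  const baseHost = new URL(startUrl).hostname;
  const queue = [new URL(startUrl).href];
  const visited = new Set(queue);
  const broken = [];

  while (queue.length > 0) {
    const url = queue.shift();
    let page;
    try {
      page = await loadPage(url);
    } catch (err) {
      // Network-level failure (timeout, DNS, SSL, ...).
      broken.push({ url, error: err.message });
      continue;
    }
    if (page.status !== 200) {
      broken.push({ url, error: `HTTP ${page.status}` });
      continue;
    }
    for (const href of page.links) {
      let link;
      try {
        link = new URL(href, url); // resolve relative hrefs against the current page
      } catch {
        continue; // skip hrefs that are not valid URLs
      }
      link.hash = ""; // a fragment points into the same page
      if (link.hostname !== baseHost) continue; // internal links only
      if (!visited.has(link.href)) {
        visited.add(link.href);
        queue.push(link.href);
      }
    }
  }
  return { checked: visited.size, broken };
}

// Usage with an in-memory stand-in for a website.
const fakeSite = {
  "https://mywebsite.test/": { status: 200, links: ["/about", "/missing-page", "https://external-site.test/"] },
  "https://mywebsite.test/about": { status: 200, links: ["/", "#team"] },
  "https://mywebsite.test/missing-page": { status: 404, links: [] },
};

crawl("https://mywebsite.test", async (url) => fakeSite[url]).then((result) => console.log(result));
```

Marking a URL as visited when it is enqueued, rather than when it is loaded, keeps the same link from entering the queue twice.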
- Built with Node.js and Puppeteer
- Handles dynamic JavaScript-rendered content
- Respects robots.txt through Puppeteer
- Smart error detection:
- HTTP status codes (404, 403, 500, etc.)
- Network issues (timeouts, DNS failures, SSL errors)
- Access restrictions
- Domain-aware link checking:
- Internal links (same domain as base URL)
- External links (different domains, optional)
- Configurable per-page navigation timeout (30 seconds by default)
- Memory-efficient using a queue-based crawler
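
To make the error detection and domain classification above more concrete, here is a small sketch. It assumes that "same domain" means an exact hostname match and that network failures surface as Chromium-style error strings such as `net::ERR_NAME_NOT_RESOLVED`; all function names are hypothetical, and the "Access forbidden (403)", "DNS lookup failed", "SSL certificate error", and "Navigation timeout" labels are invented for illustration (the other labels come from the sample reports).

```javascript
// Classify a link relative to the base URL.
// Assumption: exact hostname match; the tool may treat subdomains differently.
function domainType(link, baseUrl) {
  return new URL(link).hostname === new URL(baseUrl).hostname ? "Internal" : "External";
}

// Turn an HTTP status code into a readable message.
function describeStatus(status) {
  const known = {
    403: "Access forbidden (403)", // hypothetical label
    404: "Page not found (404)",
    500: "Server error (500)",
  };
  return known[status] || `HTTP error (${status})`;
}

// Turn a network-level error message into a readable message.
function describeNetworkError(message) {
  if (message.includes("ERR_NAME_NOT_RESOLVED")) return "DNS lookup failed";
  if (message.includes("ERR_CONNECTION_REFUSED")) return "Connection refused";
  if (message.includes("ERR_CERT_")) return "SSL certificate error";
  if (/timeout/i.test(message)) return "Navigation timeout";
  return message; // fall back to the raw message
}

console.log(domainType("https://external-site.test/broken", "https://mywebsite.test"));
console.log(describeStatus(404));
console.log(describeNetworkError("net::ERR_CONNECTION_REFUSED at https://external-site.test/broken"));
```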
# Clone the repository
git clone https://github.com/yourusername/qa-engineer.git
# Install dependencies
cd qa-engineer
npm install
# Make your changes
# Install globally to test
npm install -g .
License: ISC
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request