🐍 + 🤖 Python bot that crawls your website looking for dead stuff
This was written for my tutorial on building a dead link checker, so its scope has been kept quite small.

Broken Link Crawler

Let's say I have a website and I want to find any dead links and images on this website.

$ python deadseeker.py 'https://healeycodes.com/'
> 404 - https://docs.python.org/3/library/missing.html
> 404 - https://github.com/microsoft/solitare2

It's that simple. The website is crawled, and a request is sent to every URL found in an href or src attribute. Errors are reported. This bot doesn't observe robots.txt, but you should.
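To make the idea concrete, here is a minimal sketch of the three steps involved: pulling href and src values out of a page, requesting each one, and checking robots.txt first. It is not the code in deadseeker.py. The names `LinkExtractor`, `extract_links`, `check_url` and `allowed_by_robots` are made up for illustration, and it assumes only the Python standard library.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser


class LinkExtractor(HTMLParser):
    """Hypothetical helper: collect absolute URLs from href and src attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.found = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the page they were found on.
                self.found.add(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the set of absolute URLs referenced by a page."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.found


def check_url(url, timeout=10):
    """Return the HTTP status code for url, or None if it could not be reached."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        # 4xx and 5xx responses are raised as HTTPError; the code is the status.
        return error.code
    except URLError:
        return None


def allowed_by_robots(robots_txt, url, user_agent="*"):
    """Check a URL against the text of a robots.txt file."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)
```

A crawler built on this would call `extract_links` on each fetched page, skip anything `allowed_by_robots` rejects, and report every URL for which `check_url` returns `None` or a status of 400 or above.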

It is not a clever bot. But it is a good bot.


Accepting (small) PRs and issues!