Punishing bad robots for misbehaving.
.htaccess
README.md
index.php
maze1.jpg
robots.txt

That Which Cannot Be Crawled

"That Which Cannot Be Crawled" is a very simple website that traps web crawlers that ignore its robots.txt file. If a crawler visits a page that robots.txt tells it not to visit, it gets stuck in a never-ending labyrinth of links to random URLs.
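
To make the idea concrete, here is a minimal sketch of how such a labyrinth page could be generated. It is written in Python for illustration and is not the repository's index.php. The `/maze/` prefix, the function names, and the link count are all assumptions made up for this example. The sketch assumes robots.txt disallows everything under that prefix, so only crawlers that ignore the file ever request these pages.

```python
import random
import string

# Hypothetical path prefix, assumed to be disallowed in robots.txt, e.g.:
#   User-agent: *
#   Disallow: /maze/
TRAP_PREFIX = "/maze/"


def random_slug(rng, length=12):
    """Return a random lowercase alphanumeric path segment."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def maze_page(rng, link_count=10):
    """Build an HTML page whose links all point deeper into the trap."""
    links = [TRAP_PREFIX + random_slug(rng) for _ in range(link_count)]
    items = "\n".join(f'<li><a href="{url}">{url}</a></li>' for url in links)
    return f"<html><body><ul>\n{items}\n</ul></body></html>"


if __name__ == "__main__":
    # Every request under TRAP_PREFIX would be answered with a fresh page
    # like this one, so following any link only leads to more random links.
    print(maze_page(random.Random()))
```

Because every generated URL stays under the disallowed prefix, a well-behaved crawler never enters the maze, while one that ignores robots.txt keeps finding new links to follow.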

Yes, I understand that the artistic liberties I took make the trap easier for a web crawler to defeat. It's the concept that interested me. Anyone who really wanted to use it would need to make modifications anyway.

Click here for the live demo