quellhorst / robots

A robots.txt to block all those pesky bandwidth wasting bots

This URL has Read+Write access

quellhorst (author)
Sun Oct 05 21:16:13 -0700 2008
commit  e5aa9104e0f4926a242fd761e146db2ddf358216
tree    2066cdb2620ce4f90b4debe7df6210886f372ede
parent  3e6d6e9136ac8f264da3ffa6d17196b975c2440c
robots /
README.textile

Robots.txt Examples

Fend off those pesky robots!

This project resulted from having to specifically block useless bots that requests thousands of pages
per day and send no traffic. Several example bots files are included that could be renamed and copied to
robots.txt in the root of your web application. Your robots file should be accessible at
http://www.yourdomain.com/robots.txt

I have seen sites that had multiple servers just to handle excessive bot load.

Files included

robots.txt Standard bot file should be usable for most sites.
Only disallows know bad bots.

robots.major.txt Has a white-list for major search engine and blocks
everything else

robots.noarchive.txt same as above but disallows archive.org bot which uses
lots of traffic and doesn’t send much traffic

robots.wordpress.txt Entries for Wordpress blogs

robots.none.txt Block all bots

http://abtain.com/

Licensed under the MIT License