Crude SEO Spider

Provides a simple way of spidering a website to gather basic URL information to assist in Search Engine Optimisation.

Features

  • Detects duplicate content using MD5 hashes
  • Shows HTTP status codes for each URL
  • Displays the response time and page size
  • Follows redirects
  • Exports results to CSV format
  • Supports the Robots Exclusion Protocol (robots.txt)
  • Supports rel="nofollow" link attribute
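The duplicate-content detection can be pictured as follows. This is a minimal Python sketch of the idea only (the spider itself is written in Perl); the `check_duplicate` helper and the example pages are hypothetical:

```python
import hashlib

# Map each MD5 digest to the first URL it was seen at; any later URL
# with the same digest is flagged as duplicate content.
seen: dict[str, str] = {}

def check_duplicate(url: str, body: bytes):
    """Return the URL of the original page if `body` is a duplicate,
    otherwise record this page and return None."""
    digest = hashlib.md5(body).hexdigest()
    if digest in seen:
        return seen[digest]
    seen[digest] = url
    return None

# Hypothetical pages: the second serves identical content to the first.
print(check_duplicate("/a", b"<html>same</html>"))   # None
print(check_duplicate("/b", b"<html>same</html>"))   # /a
```

Hashing the body rather than comparing pages pairwise keeps the check O(1) per page, at the cost of missing near-duplicates that differ by even one byte.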

Usage

For usage parameters run

./spider.pl -h

  1. First, open the spider.pl script and set the full path to the lib directory at the top.

  2. Modify the options in the spider.conf file; each option is commented, so it should be self-explanatory.

  3. Run the spider, either by executing the script directly:

    ./spider.pl

    or by running it through perl:

    perl spider.pl

  4. While the script is running it will display information on the currently tracked URLs and write the results to the results.txt file.

Options

To output to a CSV file, provide the --csv=FILE parameter.
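As a rough illustration of the kind of rows a CSV export might contain, here is a Python sketch; the column names and sample values are assumptions for illustration, not the script's actual output format:

```python
import csv
import io

# Hypothetical spider results: URL, HTTP status, response time in
# seconds, page size in bytes, and the MD5 digest of the body.
results = [
    ("http://example.com/", 200, 0.12, 5120,
     "d41d8cd98f00b204e9800998ecf8427e"),
    ("http://example.com/missing", 404, 0.05, 512,
     "0cc175b9c0f1b6a831c399e269772661"),
]

out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["url", "status", "time", "size", "md5"])
writer.writerows(results)
print(out.getvalue())
```

Using a CSV writer (rather than joining fields with commas by hand) quotes any field that itself contains a comma, so URLs with query strings stay intact.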
