# SiteMapper


Map all links on a given site.
SiteMapper will try to respect `/robots.txt`.

Works great with Wayback Archiver, a gem that crawls your site and submits each URL to the Internet Archive (Wayback Machine).
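
If you want a hand-rolled version of that workflow, here is a minimal sketch (not the Wayback Archiver gem itself) that feeds every URL found by `SiteMapper.map` (see Usage below) to the Wayback Machine's public `https://web.archive.org/save/` endpoint:

```ruby
require 'site_mapper'
require 'net/http'
require 'uri'

# Ask the Wayback Machine to archive every URL SiteMapper discovers.
# web.archive.org/save/<url> is the public save endpoint; error handling omitted.
SiteMapper.map('example.com') do |new_url|
  save_uri = URI("https://web.archive.org/save/#{new_url}")
  response = Net::HTTP.get_response(save_uri)
  puts "Requested archiving of #{new_url} (HTTP #{response.code})"
end
```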

## Installation

Install the gem:

```
gem install site_mapper
```
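
Alternatively, if you use Bundler, the gem can be added to your project's `Gemfile` (a standard Bundler setup, not a project-specific requirement):

```ruby
# Gemfile
gem 'site_mapper'
```

Then run `bundle install`.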

## Usage

Command line usage:

```
# Crawl all links found on pages
# under the example.com domain
site_mapper example.com
```

Ruby usage:

```ruby
# Crawl all links found on pages
# under the example.com domain
require 'site_mapper'
SiteMapper.map('example.com') do |new_url|
  puts "New URL found: #{new_url}"
end
# Log to STDOUT
SiteMapper.map('example.com', logger: :system) do |new_url|
  puts "New URL found: #{new_url}"
end
```
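
Because the block is yielded each URL as it is found, you can collect the results however you like. For example, here is a minimal sketch (using the same block-based `SiteMapper.map` API as above, with `sitemap.txt` as an arbitrary output path) that writes every discovered URL to a plain-text file:

```ruby
require 'site_mapper'

# Collect every URL yielded during the crawl, then write one URL per line.
urls = []
SiteMapper.map('example.com') do |new_url|
  urls << new_url
end
File.write('sitemap.txt', urls.join("\n"))
```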

## Docs

You can find the docs online at RubyDoc.

This gem is documented using YARD (run from the root of this repository):

```
yard # Generates documentation to doc/
```

## Contributing

Contributions, feedback and suggestions are very welcome.

1. Fork it
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create a new Pull Request

## Notes

- Special thanks to the robots gem, which provided the bulk of the code in `lib/robots.rb`.
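
For context, the upstream robots gem is commonly used along the following lines. This is a sketch of that gem's documented style of check, not necessarily the API of the vendored `lib/robots.rb`:

```ruby
require 'robots'

# Check whether a URL may be crawled by our user agent
# according to the site's robots.txt.
robots = Robots.new('SiteMapper')
puts robots.allowed?('http://example.com/some/page') # => true or false
```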

## Alternatives

There are a couple of great alternatives that are more mature and have more features than this gem. Please feel free to check them out:

## License

MIT License
