Ruby gem to send URLs to Wayback Machine


Post URLs to Wayback Machine (Internet Archive), using a crawler, from Sitemap(s), or a list of URLs.

The Wayback Machine is a digital archive of the World Wide Web [...] The service enables users to see archived versions of web pages across time ...
- Wikipedia




Install the gem:

$ gem install wayback_archiver

Or add this line to your application's Gemfile:

gem 'wayback_archiver'

And then execute:

$ bundle



  • auto (the default) - Will try to
    1. Find Sitemap(s) defined in /robots.txt
    2. Then in common sitemap locations /sitemap-index.xml, /sitemap.xml etc.
    3. Fall back to crawling (using the excellent spidr gem)
  • sitemap - Parse Sitemap(s), supports index files (and gzip)
  • urls - Post URL(s)


First require the gem

require 'wayback_archiver'

Configuration (the values below are the defaults)

WaybackArchiver.concurrency = 5
WaybackArchiver.user_agent = WaybackArchiver::USER_AGENT
WaybackArchiver.logger = Logger.new(STDOUT)
WaybackArchiver.max_limit = -1 # unlimited
WaybackArchiver.adapter = WaybackArchiver::WaybackMachine # must implement #call(url)

For a more verbose log you can configure WaybackArchiver as such:

WaybackArchiver.logger = Logger.new(STDOUT).tap do |logger|
  logger.progname = 'WaybackArchiver'
  logger.level = Logger::DEBUG
end
Pro tip: If you're using the gem in a Rails app you can set WaybackArchiver.logger = Rails.logger.



# auto is the default

# or explicitly
WaybackArchiver.archive('', strategy: :auto)


WaybackArchiver.archive('',  strategy: :crawl)

Send only a single URL

WaybackArchiver.archive('', strategy: :url)

Send multiple URLs

WaybackArchiver.archive(%w[], strategy: :urls)

Send all URL(s) found in Sitemap

WaybackArchiver.archive('', strategy: :sitemap)

# works with Sitemap index files too
WaybackArchiver.archive('', strategy: :sitemap)
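
To show what distinguishes a Sitemap index file from a regular Sitemap, here is a rough sketch using REXML. The helper sitemap_locations and the example.com URLs are assumptions for illustration only; the gem's own parser may work differently.

```ruby
require 'rexml/document'

# Hypothetical helper (not the gem's API): classify a Sitemap document and
# return the URLs found in its <loc> elements.
def sitemap_locations(xml)
  root = REXML::Document.new(xml).root
  # An index file has a <sitemapindex> root, a regular Sitemap has <urlset>.
  kind = root.name == 'sitemapindex' ? :index : :urlset
  locs = []
  root.elements.each do |entry| # <sitemap> or <url> entries
    entry.elements.each do |child|
      locs << child.text.to_s.strip if child.name == 'loc'
    end
  end
  [kind, locs]
end

index_xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <sitemap><loc>https://example.com/sitemap-posts.xml</loc></sitemap>
    <sitemap><loc>https://example.com/sitemap-pages.xml</loc></sitemap>
  </sitemapindex>
XML

kind, locs = sitemap_locations(index_xml)
puts "#{kind}: #{locs.join(', ')}"
```

For an :index result, each returned location is itself a Sitemap that would need to be fetched and parsed in turn.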

Specify concurrency

WaybackArchiver.archive('', strategy: :auto, concurrency: 10)
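
For intuition about what a concurrency setting controls, here is a small worker-pool sketch built on Ruby's Thread and Queue. It is not the gem's implementation; the helper each_concurrently is hypothetical and only demonstrates capping the number of URLs processed at the same time.

```ruby
# Hypothetical sketch: run the block for every item using at most
# `concurrency` threads. Items must not be nil, since nil marks the end
# of the closed queue.
def each_concurrently(items, concurrency: 5)
  queue = Queue.new
  items.each { |item| queue << item }
  queue.close # pop returns nil once the queue is closed and drained

  results = Queue.new
  workers = Array.new(concurrency) do
    Thread.new do
      while (item = queue.pop)
        results << yield(item)
      end
    end
  end
  workers.each(&:join)

  # Results arrive in completion order, not input order.
  Array.new(results.size) { results.pop }
end

urls = %w[https://example.com/ https://example.com/about]
lengths = each_concurrently(urls, concurrency: 2) { |url| url.length }
puts lengths.sort.inspect
```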

Specify max number of URLs to be archived

WaybackArchiver.archive('', strategy: :auto, limit: 10)
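
The limit semantics can be sketched as follows, assuming (as the max_limit = -1 default suggests) that a negative value means "no limit". The helper apply_limit is made up for this example and is not part of the gem.

```ruby
# Hypothetical helper: keep at most `limit` URLs; a negative limit
# is treated as "unlimited".
def apply_limit(urls, limit = -1)
  limit.negative? ? urls : urls.first(limit)
end

urls = %w[https://example.com/a https://example.com/b https://example.com/c]
puts apply_limit(urls, 2).inspect
puts apply_limit(urls).length
```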

Each archive strategy can receive a block that will be called for each URL

WaybackArchiver.archive('', strategy: :auto) do |result|
  if result.success?
    puts "Successfully archived: #{result.archived_url}"
  else
    puts "Error (HTTP #{result.code}) when archiving: #{result.archived_url}"
  end
end
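
The block receives one result per URL, so it can also be used to build a summary. The sketch below uses a stand-in ArchiveResult struct that models only the three methods shown above (success?, code, archived_url); treating HTTP 200 as success is an assumption made for this example, not something the gem is confirmed to do.

```ruby
# Hypothetical stand-in for the object yielded to the block.
ArchiveResult = Struct.new(:archived_url, :code) do
  def success?
    code.to_s == '200' # assumption: only HTTP 200 counts as success
  end
end

# Split results into archived URLs and [url, code] pairs for failures.
def summarize(results)
  ok, failed = results.partition(&:success?)
  {
    archived: ok.map(&:archived_url),
    failed: failed.map { |result| [result.archived_url, result.code] }
  }
end

results = [
  ArchiveResult.new('https://example.com/', '200'),
  ArchiveResult.new('https://example.com/missing', '404')
]
puts summarize(results).inspect
```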

Use your own adapter for posting found URLs

WaybackArchiver.adapter = ->(url) { puts url } # anything that responds to #call
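
An adapter does not have to be a lambda. Here is a minimal sketch of a class-based adapter that records URLs instead of posting them, which can be handy for a dry run. The class name RecordingAdapter is made up for this example; the only requirement taken from the README is that the object responds to #call(url).

```ruby
# Hypothetical adapter: remembers every URL it is called with.
class RecordingAdapter
  attr_reader :urls

  def initialize
    @urls = []
    @lock = Mutex.new
  end

  # A mutex guards the array because the adapter may be called from
  # several threads when concurrency is greater than 1.
  def call(url)
    @lock.synchronize { @urls << url }
    url
  end
end

adapter = RecordingAdapter.new
adapter.call('https://example.com/')
puts adapter.urls.inspect

# With the gem loaded, it would be installed like this:
# WaybackArchiver.adapter = adapter
```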

ℹ️ This gem depends on the spidr gem. The version published to RubyGems has a bug that is fixed on spidr's master branch. To use the fixed version, add gem 'spidr', github: 'postmodern/spidr' to your Gemfile. See #25 for details.



wayback_archiver [<url>] [options]

Print full usage instructions

wayback_archiver --help



# auto is the default

# or explicitly
wayback_archiver --auto


wayback_archiver --crawl

Send only a single URL

wayback_archiver --url

Send multiple URLs

wayback_archiver --urls

Crawl multiple URLs

wayback_archiver --crawl

Send all URL(s) found in Sitemap


# works with Sitemap index files too

Most options

wayback_archiver --auto --concurrency=10 --limit=100 --log=output.log --verbose
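
If you want to drive the CLI from a Ruby script, the argument list can be assembled programmatically. This is only a sketch: the helper wayback_archiver_command and the example.com URL are assumptions, and it uses only the flags shown above.

```ruby
# Hypothetical helper: build an argv array for the wayback_archiver CLI,
# following the "wayback_archiver [<url>] [options]" usage line.
def wayback_archiver_command(url, strategy: :auto, concurrency: nil, limit: nil, log: nil, verbose: false)
  argv = ['wayback_archiver', url, "--#{strategy}"]
  argv << "--concurrency=#{concurrency}" if concurrency
  argv << "--limit=#{limit}" if limit
  argv << "--log=#{log}" if log
  argv << '--verbose' if verbose
  argv
end

argv = wayback_archiver_command('example.com', concurrency: 10, limit: 100, log: 'output.log', verbose: true)
puts argv.join(' ')

# To actually run it (requires the gem's executable on your PATH):
# system(*argv)
```

Passing the arguments to system as separate strings avoids shell quoting issues with URLs.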

View archive:*/ (replace with your desired domain).


You can find the docs online on RubyDoc.

This gem is documented using yard. Run the following from the root of this repository:

yard # Generates documentation to doc/


Contributions, feedback and suggestions are very welcome.

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request


MIT License

