Magic: The Gathering Oracle web scraper

Scapeshift

Scapeshift is a web-scraper RubyGem for the Magic: The Gathering Oracle "Gatherer" card index. Since Wizards doesn't want to make an API for this system for various reasons, I've gone ahead and made a pseudo-API here.

Scapeshift uses the delightful Nokogiri gem to parse and scrape the various Oracle pages, generating (most commonly) a SortedSet of Scapeshift::Card objects containing the card data. For expansion sets, formats, and similar metadata, Scapeshift returns a SortedSet of strings instead.

Usage

Usage is as simple as can be:

# Grab the complete list of expansion sets
@sets = Scapeshift::Crawler.crawl :meta, :type => :sets

# Grab the card set for an expansion
@alara_cards = Scapeshift::Crawler.crawl :cards, :set => 'Shards of Alara'

# Grab a single named card
@card = Scapeshift::Crawler.crawl :single, :name => 'Counterspell'

Configuration

The gem can be configured with a Scapeshift.configure block (currently only the cache store option is available):

Scapeshift.configure do |config|
  config.cache = :memory_store
end

Caching

By default, requests to the Gatherer website are cached in memory using ActiveSupport's MemoryStore, but the cache store can be changed.
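To make the default behaviour concrete, here is a minimal sketch of fetch-style caching. It assumes the gem reads pages through the store's fetch method, in the way ActiveSupport::Cache::Store#fetch works. TinyMemoryStore and fetch_page are hypothetical names invented for this illustration; neither is part of Scapeshift or ActiveSupport.

```ruby
# Hypothetical stand-in for an in-memory cache store. This is NOT
# Scapeshift's or ActiveSupport's actual implementation.
class TinyMemoryStore
  def initialize
    @data = {}
  end

  # Return the cached value for key, or run the block once and remember it.
  def fetch(key)
    return @data[key] if @data.key?(key)
    @data[key] = yield
  end
end

requests = 0
# Hypothetical stand-in for an HTTP request to Gatherer.
fetch_page = lambda do |name|
  requests += 1
  "<html>#{name}</html>"
end

store  = TinyMemoryStore.new
first  = store.fetch('Counterspell') { fetch_page.call('Counterspell') }
second = store.fetch('Counterspell') { fetch_page.call('Counterspell') }

puts first == second  # true: both calls return the same page
puts requests         # 1: the page was only requested once
```

The second call never runs its block, which is the point of the cache: repeated crawls for the same page should not hit the Gatherer website again.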

To switch to a memcached server:

Scapeshift.configure do |config|
  config.cache = :mem_cache_store, "cache-1.example.com", "cache-2.example.com"
end

You will need to install the memcache-client gem to do so.

You can also use an existing cache store by passing it as the cache option. For example, in a Rails application:

Scapeshift.configure do |config|
  config.cache = Rails.cache
end

To disable caching, DO NOT set the cache to nil, as that will break stuff. Instead, use ActiveSupport's NullStore, which stores nothing but still implements the ActiveSupport::Cache::Store API.
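A sketch of what that might look like, assuming Scapeshift accepts a store instance here just as it accepts Rails.cache above, and that your ActiveSupport version ships ActiveSupport::Cache::NullStore (it was added in ActiveSupport 3.2):

```ruby
require 'active_support'
require 'active_support/cache'
require 'scapeshift'

Scapeshift.configure do |config|
  # Assumption: a store instance is accepted here, as with Rails.cache.
  # NullStore never keeps anything, so every crawl goes to the network.
  config.cache = ActiveSupport::Cache::NullStore.new
end
```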

See the Rails Caching Guide for more info on configuring different cache stores.

Development

This gem uses Bundler to manage its dependencies for development:

$ sudo gem install bundler
$ cd /path/to/scapeshift
$ bundle install

Bundler is unlike RubyGems in that it doesn't automagically handle load paths for you. To make stuff work, you will need to start a subshell with:

$ bundle exec bash

Replace bash with the shell of your choice, of course.

TODO

Documentation

This gem uses Yardoc syntax for documentation. You can generate these docs with rake yard. Point any webserver at the docs/ directory to browse.

For example, with Thin:

$ cd /path/to/scapeshift
$ rake yard
$ cd docs/
$ thin -A file -d start

Copyright

Copyright (c) 2010 Josh Lindsey. See LICENSE for details.