Cut

A DSL for Scraping Websites

Installation

Add this line to your application's Gemfile:

gem 'cut'

And then execute:

$ bundle

Or install it yourself as:

$ gem install cut

Usage

Search Google:

class SearchResult

  include Cut

  url "http://google.com/search?q={{keywords}}"

  selector "li.g"

  map :title, String, to: "h3.r"
  map :url,   String, to: "div.s cite", operation: lambda {|str| str.upcase }

end
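
The `{{keywords}}` placeholder in the `url` line is filled from the arguments passed to `all` or `first`. The README does not show how Cut performs this substitution, so the following is only a minimal sketch of the idea. `expand_url` is a hypothetical helper, not part of Cut, and URL-encoding the value is an assumption.

```ruby
require "uri"

# Hypothetical helper, not part of Cut's documented API: replaces each
# {{name}} placeholder with the URL-encoded value of the matching parameter.
# Raises KeyError if a placeholder has no corresponding parameter.
def expand_url(template, params)
  template.gsub(/\{\{(\w+)\}\}/) do
    key = Regexp.last_match(1).to_sym
    URI.encode_www_form_component(params.fetch(key))
  end
end

expand_url("http://google.com/search?q={{keywords}}", keywords: "war and peace")
# => "http://google.com/search?q=war+and+peace"
```

Using `fetch` rather than `[]` makes a missing parameter fail loudly instead of silently producing a broken URL.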

Return results:

SearchResult.all(keywords: "war and peace")
#=> [#<SearchResult:0x007f94bbfaae90 @title="War and Peace - Wikipedia, the free encyclopedia", @url="HTTPS://EN.WIKIPEDIA.ORG/WIKI/WAR_AND_PEACE">,
#    #<SearchResult:0x007f94beed97c0 @title="War and Peace (Vintage Classics): Leo Tolstoy, Richard Pevear ...", @url="WWW.AMAZON.COM/WAR-PEACE-VINTAGE-CLASSICS.../DP/1400079985">,
#    #<SearchResult:0x007f94be95ee80 @title="War and Peace (1956) - IMDb", @url="WWW.IMDB.COM/TITLE/TT0049934/">,
#    #<SearchResult:0x007f94be9cb198 @title="SparkNotes: War and Peace", @url="WWW.SPARKNOTES.COM/LIT/WARANDPEACE/">,
#    #<SearchResult:0x007f94be9c7ea8 @title="War and Peace by graf Leo Tolstoy - Free Ebook - Project Gutenberg", @url="WWW.GUTENBERG.ORG/EBOOKS/2600">,
#    #<SearchResult:0x007f94bc83f218 @title="War and Peace by Leo Tolstoy - Reviews, Discussion, Bookclubs, Lists", @url="WWW.GOODREADS.COM/BOOK/SHOW/656.WAR_AND_PEACE">,
#    #<SearchResult:0x007f94bba7ee80 @title="War and Peace - The Literature Network", @url="WWW.ONLINE-LITERATURE.COM/TOLSTOY/WAR_AND_PEACE/">,
#    #<SearchResult:0x007f94bba7b820 @title="War and Peace - graf Leo Tolstoy - Google Books", @url="BOOKS.GOOGLE.COM/BOOKS/ABOUT/WAR_AND_PEACE.HTML?ID=2GOK4HJO2VKC">,
#    #<SearchResult:0x007f94bbed4ac0 @title="Images for war and peace", @url="">,
#    #<SearchResult:0x007f94bdda0eb8 @title="War and Peace - Shmoop", @url="WWW.SHMOOP.COM/WAR-AND-PEACE/">,
#    #<SearchResult:0x007f94bdd695d0 @title="War and Peace - Planet PDF", @url="WWW.PLANETPDF.COM/PLANETPDF/PDFS/FREE_EBOOKS/WAR_AND_PEACE_NT.PDF">,
#    #<SearchResult:0x007f94bdde53d8 @title="News for war and peace", @url="">]

SearchResult.first(keywords: "war and peace")
#=> #<SearchResult:0x007f94bdfbeb78 @title="War and Peace - Wikipedia, the free encyclopedia", @url="HTTPS://EN.WIKIPEDIA.ORG/WIKI/WAR_AND_PEACE">
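
Some entries in the output above, such as "Images for war and peace", come back with an empty `@url`. One way to discard them is sketched below. It uses a `Struct` stand-in instead of a live `SearchResult`, and it assumes the mapped attributes can be read through a `url` method, which the README does not confirm. `Result`, `SAMPLE_RESULTS`, and `drop_empty_urls` are hypothetical names used only for illustration.

```ruby
# Stand-in for the objects returned by SearchResult.all. The real class
# comes from including Cut; the attribute readers here are an assumption.
Result = Struct.new(:title, :url)

SAMPLE_RESULTS = [
  Result.new("War and Peace - Wikipedia, the free encyclopedia",
             "HTTPS://EN.WIKIPEDIA.ORG/WIKI/WAR_AND_PEACE"),
  Result.new("Images for war and peace", ""),
  Result.new("SparkNotes: War and Peace", "WWW.SPARKNOTES.COM/LIT/WARANDPEACE/")
].freeze

# Keep only the entries that actually carry a URL (nil is treated as empty).
def drop_empty_urls(results)
  results.reject { |result| result.url.to_s.empty? }
end

drop_empty_urls(SAMPLE_RESULTS).map(&:title)
# => ["War and Peace - Wikipedia, the free encyclopedia", "SparkNotes: War and Peace"]
```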

Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request