Obey robots.txt on top of em-http-request (Asynchronous HTTP Client)
Ruby
Latest commit 0dcba0d Sep 14, 2010 @edgar edgar Version bump to 0.2.3
Permalink
Failed to load latest commit information.
features Adjusted feature for new version of em-http-request thata handeld pro… Aug 30, 2010
lib bug fixed in verbose method Sep 13, 2010
spec Added optional specs that access in a concurrent way several real url… Sep 13, 2010
.document first commit Jul 13, 2010
.gitignore
LICENSE
README.rdoc fix a typo in README Jul 19, 2010
Rakefile Added optional specs that access in a concurrent way several real url… Sep 13, 2010
VERSION Version bump to 0.2.3 Sep 13, 2010

README.rdoc

R.Daneel

An EventMachine+Ruby library to fetch urls obeying robots.txt rules.

RDaneel is built it on top of @igrigorik's em-http-request

Features

  • Support following redirects, honoring robots.txt for each host in the redirect chain.

  • Support an external cache to store robots.txt

  • Compatible with all options defined in em-http-request

Install

$ gem install rdaneel

Examples

Following redirects

require 'rdaneel'

EM.run do
  r = RDaneel.new("http://bit.ly/cbEnpa")
  r.callback{
    puts r.http_client.response_header.status
    puts r.http_client.response[0,80]
    puts r.redirects
    puts r.uri
    EM.stop
  }
  r.errback{
    puts "should not happen"
    EM.stop
  }
  r.get(:redirects => 3)
end

=> 200
=> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
=> http://bit.ly:80/cbEnpa
=> http://github.com:80/hasmanydevelopers/RDaneel

Denied by robots.txt

require 'rdaneel'

EM.run do
  r = RDaneel.new("http://github.com/hasmanydevelopers/RDaneel/tarball/v0.0.0")
  r.callback{
    puts "should not happen"
    EM.stop
  }
  r.errback{
    puts r.error
    EM.stop
  }
  r.get(:redirects => 3)
end

=> robots denied

Why RDaneel?

R Daneel Olivaw is a fictional robot created by Isaac Asimov - en.wikipedia.org/wiki/R._Daneel_Olivaw

Acknowledge

To Ilya Grigorik (@igrigorik) for em-http-request lib and his support and advice.

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don't break it in a future version unintentionally.

  • Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)

  • Send me a pull request. Bonus points for topic branches.

Copyright

Copyright © 2010 has_many :developers. See LICENSE for details.