Obey robots.txt on top of em-http-request (Asynchronous HTTP Client)

R.Daneel

An EventMachine + Ruby library for fetching URLs while obeying robots.txt rules.

RDaneel is built on top of @igrigorik's em-http-request.

Features

  • Supports following redirects, honoring robots.txt for each host in the redirect chain.

  • Supports an external cache for storing robots.txt files.

  • Compatible with all options defined in em-http-request.
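
To make the first two features more concrete, here is a minimal, self-contained sketch of a robots.txt check backed by a cache. It is not RDaneel's implementation: the helpers parse_robots, allowed? and robots_allowed? are hypothetical names made up for this illustration, the cache is assumed to be any Hash-like object keyed by "host:port", and only plain Disallow prefix rules are handled (no Allow lines, no wildcards).

```ruby
require 'uri'

# Hypothetical helper (not part of RDaneel): turns robots.txt text into
# { downcased user agent => [disallowed path prefixes] }.
def parse_robots(text)
  rules = Hash.new { |h, k| h[k] = [] }
  agents = []
  in_rules = false
  text.each_line do |line|
    line = line.sub(/#.*/, '').strip            # drop comments
    next if line.empty?
    field, value = line.split(':', 2).map(&:strip)
    next if value.nil?
    case field.downcase
    when 'user-agent'
      agents = [] if in_rules                   # rules seen, so a new group starts
      in_rules = false
      agents << value.downcase
      rules[value.downcase]                     # register the agent, even with no rules
    when 'disallow'
      in_rules = true
      agents.each { |a| rules[a] << value unless value.empty? }
    end
  end
  rules
end

# Hypothetical helper: true if +path+ matches no Disallow prefix of the
# group for +user_agent+ (falling back to the "*" group).
def allowed?(rules, user_agent, path)
  ua = user_agent.downcase
  key = rules.keys.find { |k| k != '*' && ua.include?(k) } || '*'
  return true unless rules.key?(key)
  rules[key].none? { |prefix| path.start_with?(prefix) }
end

# Hypothetical helper: looks up robots.txt text in +cache+ and only calls
# the block (the fetcher) on a cache miss.
def robots_allowed?(cache, url, user_agent)
  uri = URI.parse(url)
  key = "#{uri.host}:#{uri.port}"
  cache[key] ||= yield(key)
  path = uri.path.empty? ? '/' : uri.path
  allowed?(parse_robots(cache[key]), user_agent, path)
end

cache = {}
# Stub fetcher standing in for a real HTTP request to /robots.txt.
fetch = lambda { |_host| "User-agent: *\nDisallow: /private\n" }
puts robots_allowed?(cache, "http://example.com/index.html", "RDaneel", &fetch) # true
puts robots_allowed?(cache, "http://example.com/private/a", "RDaneel", &fetch)  # false
```

When redirects are followed, the same check would be repeated for every host in the chain, so a shared cache saves fetching the same robots.txt more than once.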

Install

$ gem install rdaneel

Examples

Following redirects

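require 'rdaneel'

EM.run do
  r = RDaneel.new("http://bit.ly/cbEnpa")
  # Runs when the final URL was fetched and robots.txt allowed every hop
  r.callback{
    puts r.http_client.response_header.status
    puts r.http_client.response[0,80]
    puts r.redirects
    puts r.uri
    EM.stop
  }
  # Runs on failure, including a robots.txt denial
  r.errback{
    puts "should not happen"
    EM.stop
  }
  # Follow at most 3 redirects
  r.get(:redirects => 3)
end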

=> 200
=> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
=> http://bit.ly:80/cbEnpa
=> http://github.com:80/hasmanydevelopers/RDaneel

Denied by robots.txt

require 'rdaneel'

EM.run do
  r = RDaneel.new("http://github.com/hasmanydevelopers/RDaneel/tarball/v0.0.0")
  r.callback{
    puts "should not happen"
    EM.stop
  }
  # The errback fires because robots.txt disallows this URL
  r.errback{
    puts r.error
    EM.stop
  }
  r.get(:redirects => 3)
end

=> robots denied

Why RDaneel?

R. Daneel Olivaw is a fictional robot created by Isaac Asimov: en.wikipedia.org/wiki/R._Daneel_Olivaw

Acknowledgements

Thanks to Ilya Grigorik (@igrigorik) for the em-http-request library and for his support and advice.

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don't break it in a future version unintentionally.

  • Commit, and do not modify the Rakefile, VERSION, or history. (If you want to have your own version, that is fine, but bump the version in a separate commit that I can ignore when I pull.)

  • Send me a pull request. Bonus points for topic branches.

Copyright

Copyright © 2010 has_many :developers. See LICENSE for details.
