Skip to content
Find file
Fetching contributors…
Cannot retrieve contributors at this time
157 lines (114 sloc) 7.83 KB

Typhoeus

http://github.com/pauldix/typhoeus/tree/master

the mailing list

Thanks to my employer kgb for allowing me to release this as open source. Btw, we’re hiring and we work on cool stuff like this every day. Get a hold of me if you rock at rails/js/html/css or if you have experience in search, information retrieval, and machine learning.

Summary

Like a modern code version of the mythical beast with 100 serpent heads, Typhoeus runs HTTP requests in parallel while cleanly encapsulating handling logic. To be a little more specific, it’s a library for accessing web services in Ruby. It’s specifically designed for building RESTful service oriented architectures in Ruby that need to be fast enough to process calls to multiple services within the client’s HTTP request/response life cycle.

Some of the awesome features are parallel request execution, memoization of request responses (so you don’t make the same request multiple times in a single group), built in support for caching responses to memcached (or whatever), and a nifty DSL for creating classes that make http calls and process responses. It uses libcurl and libcurl-multi to work this speedy magic. I wrote the c bindings myself so it’s yet another Ruby libcurl library, but with some extra awesomeness added in.

Installation

For now Typhoeus exists only on github. It requires you to have a current version of libcurl installed. I’ve tested this with 7.19.4.


gem sources -a http://gems.github.com # if you haven’t already
gem install pauldix-typhoeus

If you’re on Debian or Ubuntu and getting errors while trying to install, it could be because you don’t have the latest version of libcurl installed. Do this to fix:

sudo apt-get install libcurl4-gnutls-dev

Another problem could be if you are running Mac Ports and you have libcurl installed through there. You need to uninstall it for Typhoeus to work! The version in Mac Ports is old and doesn’t play nice. You should download curl and build from source. Then you’ll have to install the gem again.

If you’re still having issues, please let me know on the mailing list.

There’s one other thing you should know. The Easy object (which is just a libcurl thing) allows you to set timeout values in milliseconds. However, for this to work you need to build libcurl with c-ares support built in. Unfortunately, I still haven’t been able to get this to work on either my Mac or Linux. I’ll have to figure that out.

Usage

# here's an example for twitter search
# Including Typhoeus adds http methods like get, put, post, and delete.
# What's more interesting though is the stuff to build up what I call
# remote_methods.
class Twitter
  include Typhoeus
  remote_defaults :on_success => lambda {|response| JSON.parse(response.body)},
                  :on_failure => lambda {|response| puts "error code: #{response.code}"},
                  :base_uri   => "http://search.twitter.com"

  define_remote_method :search, :path => '/search.json'
  define_remote_method :trends, :path => '/trends/:time_frame.json'
end

tweets = Twitter.search(:params => {:q => "railsconf"})

# if you look at the path argument for the :trends method, it has :time_frame.
# this tells it to add in a parameter called :time_frame that gets interpolated
# and inserted.
trends = Twitter.trends(:time_frame => :current)

# and then the calls don't actually happen until the first time you
# call a method on one of the objects returned from the remote_method
puts tweets.keys # it's a hash from parsed JSON

# you can also do things like override any of the default parameters
Twitter.search(:params => {:q => "hi"}, :on_success => lambda {|response| puts response.body})

# on_success and on_failure lambdas take a response object. 
# It has four accesssors: code, body, headers, and time

# here's and example of memoization
twitter_searches = []
10.times do
  twitter_searches << Twitter.search(:params => {:q => "railsconf"})
end

# this next part will actually make the call. However, it only makes one
# http request and parses the response once. The rest are memoized.
twitter_searches.each {|s| puts s.keys}

# you can also have it cache responses and do gets automatically
# here we define a remote method that caches the responses for 60 seconds
klass = Class.new do
  include Typhoeus
  
  define_remote_method :foo, :base_uri => "http://localhost:3001", :cache_responses => 60
end

klass.cache = some_memcached_instance_or_whatever
response = klass.foo 
puts response.body # makes the request

second_response = klass.foo
puts response.body # pulls from the cache without making a request

# you can also pass timeouts on the define_remote_method or as a parameter
# Note that timeouts are in milliseconds.
Twitter.trends(:time_frame => :current, :timeout => 2000)

# you also get the normal get, put, post, and delete methods
class Remote
  include Typhoeus
end

Remote.get("http://www.pauldix.net")
Remote.put("http://", :body => "this is a request body")
Remote.post("http://localhost:3001/posts.xml", 
  {:params => {:post => {:author => "paul", :title => "a title", :body => "a body"}}})
Remote.delete("http://localhost:3001/posts/1")

The best place to see the functionality of what including Typhoeus in a class gives you is to look at the “remote_spec.rb”

Benchmarks

I set up a benchmark to test how the parallel performance works vs Ruby’s built in NET::HTTP. The setup was a local evented HTTP server that would take a request, sleep for 500 milliseconds and then issued a blank response. I set up the client to call this 20 times. Here are the results:

  net::http  0.030000   0.010000   0.040000 ( 10.054327)
  typhoeus   0.020000   0.070000   0.090000 (  0.508817)

We can see from this that NET::HTTP performs as expected, taking 10 seconds to run 20 500ms requests. Typhoeus only takes 500ms (the time of the response that took the longest.) One other thing to note is that Typhoeus keeps a pool of libcurl Easy handles to use. For this benchmark I warmed the pool first. So if you test this out it may be a bit slower until the Easy handle pool has enough in it to run all the simultaneous requests. For some reason the easy handles can take quite some time to allocate.

Next Steps

  • Write up some more examples.
  • Create a SimpleDB client library using Typhoeus.
  • Create or get someone to create a CouchDB client library using Typhoeus.
  • wire up a simple method to mock out remote requests.
  • Add support for automatic retry, exponential back-off, and queuing for later.
  • Add in the support for custom get and set methods on the cache.
  • Add in support for integrated HTTP caching with Memcached.

LICENSE

(The MIT License)

Copyright © 2009:

Paul Dix

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
‘Software’), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Something went wrong with that request. Please try again.