h1. Typhoeus

"http://github.com/pauldix/typhoeus/tree/master":http://github.com/pauldix/typhoeus/tree/master

"the mailing list":http://groups.google.com/group/typhoeus

Thanks to my employer "kgbweb":http://kgbweb.com for allowing me to release this as open source. Btw, we're hiring and we work on cool stuff like this every day. Get a hold of me if you rock at rails/js/html/css or if you have experience in search, information retrieval, and machine learning.

I also wanted to thank Todd A. Fisher. I ripped a good chunk of the C libcurl-multi code from his update to Curb. Awesome stuff Todd!

h2. Summary

Like a modern code version of the mythical beast with 100 serpent heads, Typhoeus runs HTTP requests in parallel while cleanly encapsulating handling logic. To be a little more specific, it's a library for accessing web services in Ruby. It's specifically designed for building RESTful service-oriented architectures in Ruby that need to be fast enough to process calls to multiple services within the client's HTTP request/response life cycle.

Some of the awesome features are parallel request execution, memoization of request responses (so you don't make the same request multiple times in a single group), built in support for caching responses to memcached (or whatever), and mocking capability baked in. It uses libcurl and libcurl-multi to work this speedy magic. I wrote the C bindings myself so it's yet another Ruby libcurl library, but with some extra awesomeness added in.

h2. Installation

Typhoeus requires you to have a current version of libcurl installed. I've tested this with 7.19.4 and higher.
<pre>
gem install typhoeus --source http://gemcutter.org
</pre>
If you're on Debian or Ubuntu and getting errors while trying to install, it could be because you don't have the latest version of libcurl installed. Do this to fix:
<pre>
sudo apt-get install libcurl4-gnutls-dev
</pre>
There's also something built in so that if you have a super old version of curl that you can't get rid of for some reason, you can install a newer one in a user directory and specify that during installation like so:
<pre>
gem install typhoeus --source http://gemcutter.org -- --with-curl=/usr/local/curl/7.19.7/
</pre>

Another problem could be if you are running MacPorts and you have libcurl installed through there. You need to uninstall it for Typhoeus to work! The version in MacPorts is old and doesn't play nice. You should "download curl":http://curl.haxx.se/download.html and build from source. Then you'll have to install the gem again.

If you're still having issues, please let me know on "the mailing list":http://groups.google.com/group/typhoeus.

There's one other thing you should know. The Easy object (which is just a libcurl thing) allows you to set timeout values in milliseconds. However, for this to work you need to build libcurl with c-ares support built in.

h2. Usage

*Deprecation Warning!*
The old version of Typhoeus used a module that you included in your class to get functionality. That interface has been deprecated. Here is the new interface.

The primary interface for Typhoeus consists of three classes: Request, Response, and Hydra. Request represents an HTTP request, Response represents an HTTP response, and Hydra manages making parallel HTTP connections.

<pre>
require 'rubygems'
require 'typhoeus'
require 'json'

# the request object
request = Typhoeus::Request.new("http://www.pauldix.net",
                                :body          => "this is a request body",
                                :method        => :post,
                                :headers       => {:Accept => "text/html"},
                                :timeout       => 100,
                                :cache_timeout => 60,
                                :params        => {:field1 => "a field"})
# we can see from this that the first argument is the url. the second is a set of options.
# the options are all optional. The default for :method is :get. Timeout is measured in milliseconds.
# cache_timeout is measured in seconds.

# the response object will be set after the request is run
response = request.response
response.code    # http status code
response.time    # time in seconds the request took
response.headers # the http headers
response.body    # the response body
</pre>

*Making Quick Requests*
The request object has some convenience methods for performing single HTTP requests. The arguments are the same as those you pass into the request constructor.

<pre>
response = Typhoeus::Request.get("http://www.pauldix.net")
response = Typhoeus::Request.put("http://localhost:3000/posts/1", :body => "whoo, a body")
response = Typhoeus::Request.post("http://localhost:3000/posts", :params => {:title => "test post", :content => "this is my test"})
response = Typhoeus::Request.delete("http://localhost:3000/posts/1")
</pre>

*Making Parallel Requests*

<pre>
# Generally, you should be running requests through hydra. Here is how that looks
hydra = Typhoeus::Hydra.new

first_request = Typhoeus::Request.new("http://localhost:3000/posts/1.json")
first_request.on_complete do |response|
  post = JSON.parse(response.body)
  # get the first url in the post (this assumes the JSON has a "links" array)
  third_request = Typhoeus::Request.new(post["links"].first)
  third_request.on_complete do |third_response|
    # do something with that
  end
  hydra.queue third_request
  post # the last value of the block is what gets handed back; don't use return inside a block
end
second_request = Typhoeus::Request.new("http://localhost:3000/users/1.json")
second_request.on_complete do |response|
  JSON.parse(response.body)
end
hydra.queue first_request
hydra.queue second_request
hydra.run # this is a blocking call that returns once all requests are complete

first_request.handled_response # the value returned from the on_complete block
second_request.handled_response # the value returned from the on_complete block (parsed JSON)
</pre>

The execution of that code goes something like this. The first and second requests are built and queued. When hydra is run the first and second requests run in parallel. When the first request completes, the third request is then built and queued up. The moment it is queued Hydra starts executing it. Meanwhile the second request would continue to run (or it could have completed before the first). Once the third request is done, hydra.run returns.
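If the "handlers can queue more work" part is hard to picture, here is a toy run loop in plain Ruby. It is only a sketch: ToyQueue is a made-up class, nothing touches the network, and jobs run one after another rather than in parallel. The point is just that run keeps going until work queued from inside a handler is finished too.

```ruby
# A toy stand-in for a request queue. None of these names come from Typhoeus.
class ToyQueue
  attr_reader :log

  def initialize
    @pending = []
    @log = []
  end

  # Queue a named job, with an optional block to call when it completes.
  def queue(name, &on_complete)
    @pending << [name, on_complete]
  end

  # Returns only when nothing is pending, including jobs queued by handlers.
  def run
    until @pending.empty?
      name, on_complete = @pending.shift
      @log << name
      on_complete.call(self) if on_complete
    end
    @log
  end
end

toy = ToyQueue.new
toy.queue("first") { |q| q.queue("third") } # the handler queues follow-up work
toy.queue("second")
order = toy.run
puts order.inspect
```

The "third" job does not exist when run is called, yet run does not return until it has been processed.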

*Specifying Max Concurrency*

Hydra also limits how many requests run in parallel. Things will get flaky if you try to make too many requests at the same time. The built in limit is 200. When more requests than that are queued up, hydra will save them for later and start the requests as others are finished. You can raise or lower the concurrency limit through the Hydra constructor.

<pre>
hydra = Typhoeus::Hydra.new(:max_concurrency => 20) # keep from killing some servers
</pre>
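To get a feel for what a lower limit costs you, here is a back-of-the-envelope simulation in plain Ruby. It is a sketch under simple assumptions: simulated_wall_time is a made-up helper, every request takes exactly its stated duration, and a queued request starts the moment a slot frees up. No requests are made.

```ruby
# Simulated wall-clock time for running `durations` (in seconds) in queue
# order with at most `limit` requests in flight. Pure arithmetic.
def simulated_wall_time(durations, limit)
  slots = Array.new(limit, 0.0)   # the time at which each slot becomes free
  durations.each do |duration|
    i = slots.index(slots.min)    # the slot that frees up first
    slots[i] += duration          # the next queued request starts there
  end
  slots.max
end

half_second_requests = Array.new(20, 0.5)
unlimited = simulated_wall_time(half_second_requests, 200)
five_wide = simulated_wall_time(half_second_requests, 5)
one_wide  = simulated_wall_time(half_second_requests, 1)
puts unlimited, five_wide, one_wide
```

For twenty half-second requests this works out to 0.5 seconds when the limit is above the queue size, 2.0 seconds with five slots, and 10.0 seconds with a single slot.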

*Memoization*
Hydra memoizes requests within a single run call. You can also disable memoization.

<pre>
hydra = Typhoeus::Hydra.new
2.times do
  r = Typhoeus::Request.new("http://localhost:3000/users/1")
  hydra.queue r
end
hydra.run # this will result in a single request being issued. However, the on_complete handlers of both will be called.
hydra.disable_memoization
2.times do
  r = Typhoeus::Request.new("http://localhost:3000/users/1")
  hydra.queue r
end
hydra.run # this will result in two requests.
</pre>
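The idea behind memoization can be sketched in a few lines of plain Ruby. This is not how Hydra implements it; run_memoized and the fetch lambda are stand-ins invented for the illustration, and the memo here is keyed on the URL alone.

```ruby
# Illustrative only: a per-run memo keyed by URL. `fetch` stands in for the
# real HTTP call and is an assumption of this sketch, not a Typhoeus API.
def run_memoized(queued, fetch)
  memo = {}
  queued.each do |url, handler|
    memo[url] = fetch.call(url) unless memo.key?(url)
    handler.call(memo[url]) # every handler still runs
  end
end

calls = 0
fetch = lambda { |url| calls += 1; "body of #{url}" }
handled = []
queued = Array.new(2) do
  ["http://localhost:3000/users/1", lambda { |body| handled << body }]
end

run_memoized(queued, fetch)
puts "fetches: #{calls}, handlers run: #{handled.size}"
```

Two identical requests are queued, the fetch happens once, and both handlers receive the same body.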

*Caching*
Hydra includes built in support for creating cache getters and setters. In the following example, if there is a cache hit, the cached object is passed to the on_complete handler of the request object.

<pre>
hydra = Typhoeus::Hydra.new
hydra.cache_setter do |request|
  @cache.set(request.cache_key, request.response, request.cache_timeout)
end

hydra.cache_getter do |request|
  @cache.get(request.cache_key) rescue nil
end
</pre>
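The example above leaves @cache up to you. If you just want something to experiment with before wiring up memcached, a tiny in-memory store with the same call shape might look like this. TinyCache is a hypothetical class written for this sketch, it assumes a numeric timeout in seconds, and the injectable clock is only there so expiry can be shown without sleeping.

```ruby
# A minimal in-memory store answering set(key, value, ttl_seconds) and
# get(key). Hypothetical helper, not part of Typhoeus.
class TinyCache
  def initialize(clock = -> { Time.now.to_f })
    @clock = clock
    @entries = {}
  end

  def set(key, value, ttl)
    @entries[key] = [value, @clock.call + ttl]
    value
  end

  def get(key)
    value, expires_at = @entries[key]
    return nil if expires_at.nil?       # never stored
    if @clock.call >= expires_at        # stored, but too old
      @entries.delete(key)
      return nil
    end
    value
  end
end

now = 1000.0
cache = TinyCache.new(-> { now })       # fake clock we can move by hand
cache.set("users/1", "cached response", 60)
fresh = cache.get("users/1")
now += 61
stale = cache.get("users/1")
puts fresh.inspect, stale.inspect
```

The first lookup is a hit and the second, 61 seconds later on the fake clock, is a miss.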

*Stubbing*
Hydra allows you to stub out specific urls and patterns to avoid hitting remote servers while testing.

<pre>
hydra = Typhoeus::Hydra.new
# the stubbed body has to be valid JSON (double quotes) for JSON.parse below to work
response = Typhoeus::Response.new(:code => 200, :headers => "", :body => '{"name" : "paul"}', :time => 0.3)
hydra.stub(:get, "http://localhost:3000/users/1").and_return(response)

request = Typhoeus::Request.new("http://localhost:3000/users/1")
request.on_complete do |response|
  JSON.parse(response.body)
end
hydra.queue request
hydra.run
</pre>

The queued request will hit the stub. The on_complete handler will be called and will be passed the response object. You can also specify a regex to match urls.

<pre>
hydra.stub(:get, /http\:\/\/localhost\:3000\/users\/.*/).and_return(response)
# any requests for a user will be stubbed out with the pre built response.
</pre>
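Before handing a pattern to a stub it can help to check which urls it actually covers. The sketch below does that in plain Ruby, assuming the stub applies ordinary Regexp matching; the url list is made up. It also uses the %r{} literal, which spares you from escaping every slash.

```ruby
# %r{} avoids the backslash pile-up; \A pins the match to the start of the url.
user_url = %r{\Ahttp://localhost:3000/users/.*}

urls = [
  "http://localhost:3000/users/1",
  "http://localhost:3000/users/42.json",
  "http://localhost:3000/posts/1"
]

matched = urls.select { |url| url =~ user_url }
puts matched
```

Only the two user urls are selected; the posts url falls through and would hit the real server.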

*The Singleton*
All of the quick requests are done using the singleton hydra object. If you want to enable caching or stubbing on the quick requests, set those options on the singleton.

<pre>
hydra = Typhoeus::Hydra.hydra
hydra.stub(:get, "http://localhost:3000/users")
</pre>

*Basic Authentication*

<pre>
require 'base64'
# username and password are placeholders for your own credentials
credentials = Base64.strict_encode64("#{username}:#{password}") # Ruby 1.9+, never adds line breaks
response = Typhoeus::Request.get("http://twitter.com/statuses/followers.json",
                                 :headers => {"Authorization" => "Basic #{credentials}"})
</pre>
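The header value has to stay on a single line, which is easy to get wrong with long credentials. Here is a small sketch of a helper that builds it. basic_auth_header is a made-up name, and it uses pack("m0"), which is the core-Ruby call that Base64.strict_encode64 wraps, so it needs no require.

```ruby
# Build the value for an Authorization header. pack("m0") is base64 with no
# line breaks at all; pack("m") would wrap long input and append a newline.
def basic_auth_header(username, password)
  "Basic " + ["#{username}:#{password}"].pack("m0")
end

header = basic_auth_header("user", "pass")
long_header = basic_auth_header("u" * 80, "p" * 80)

puts header
puts long_header.include?("\n")
```

The short example prints Basic dXNlcjpwYXNz, and the long one stays free of newlines.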

*SSL*
SSL comes built in to libcurl so it's in Typhoeus as well. If you pass in a url with "https" it should just work assuming that you have your "cert bundle":http://curl.haxx.se/docs/caextract.html in order and the server is verifiable. You must also have libcurl built with SSL support enabled. You can check that by doing this:

<pre>
Typhoeus::Easy.new.curl_version # output should include OpenSSL/...
</pre>

Even if libcurl is built with OpenSSL, you may still have a messed up cert bundle, or you may be hitting a non-verifiable SSL server. In either case you'll have to disable peer verification to make SSL work, like this:

<pre>
Typhoeus::Request.get("https://mail.google.com/mail", :disable_ssl_peer_verification => true)
</pre>

*LibCurl*
Typhoeus also has a more raw libcurl interface. These are the Easy and Multi objects. If you're into accessing just the raw libcurl style, those are your best bet.

h2. NTLM authentication

Thanks for the authentication piece and this description go to Oleg Ivanov (morhekil). The major reason to start this fork was the need to perform NTLM authentication in Ruby. Now you can do it via the Typhoeus::Easy interface using the following API.

<pre>
e = Typhoeus::Easy.new
e.auth = {
  :username => 'username',
  :password => 'password',
  :method => Typhoeus::Easy::AUTH_TYPES[:CURLAUTH_NTLM]
}
e.url = "http://example.com/auth_ntlm"
e.method = :get
e.perform
</pre>

*Other authentication types*

The following authentication types are available:
* CURLAUTH_BASIC
* CURLAUTH_DIGEST
* CURLAUTH_GSSNEGOTIATE
* CURLAUTH_NTLM
* CURLAUTH_DIGEST_IE

*Query of available auth types*

After the initial request you can get the authentication types available on the server via the Typhoeus::Easy#auth_methods call. It returns a number that you'll need to decode yourself; please refer to the easy.rb source code to see the numeric values of the different auth types.
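Decoding that number is a matter of testing bits. The sketch below uses the bit values libcurl defines for its CURLAUTH_* constants in curl.h and assumes easy.rb uses the same numbers, so check the source before relying on it. AUTH_BITS and decode_auth_methods are names made up for this example.

```ruby
# Bit values as defined for libcurl's CURLAUTH_* constants.
# Assumption: easy.rb mirrors these; verify against the source.
AUTH_BITS = {
  :CURLAUTH_BASIC        => 1,
  :CURLAUTH_DIGEST       => 2,
  :CURLAUTH_GSSNEGOTIATE => 4,
  :CURLAUTH_NTLM         => 8,
  :CURLAUTH_DIGEST_IE    => 16
}.freeze

# Turn the numeric mask into the list of auth type names whose bit is set.
def decode_auth_methods(mask)
  AUTH_BITS.select { |_name, bit| (mask & bit) != 0 }.keys
end

offered = decode_auth_methods(9) # 9 = 1 + 8
puts offered.inspect
```

A value of 9 decodes to Basic plus NTLM. In real use you would pass in whatever auth_methods returned after the first request.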

h2. Verbose debug output

Sometimes it's useful to see verbose output from curl. You may now enable it:

<pre>
e = Typhoeus::Easy.new
e.verbose = 1
</pre>

Please note that libcurl prints its output to the console, so you'll need to run your scripts from the console to see the debug info.

h2. Benchmarks

I set up a benchmark to test how the parallel performance works vs Ruby's built in Net::HTTP. The setup was a local evented HTTP server that would take a request, sleep for 500 milliseconds and then issue a blank response. I set up the client to call this 20 times. Here are the results:

<pre>
net::http  0.030000   0.010000   0.040000 ( 10.054327)
typhoeus   0.020000   0.070000   0.090000 (  0.508817)
</pre>

We can see from this that Net::HTTP performs as expected, taking 10 seconds to run 20 500ms requests. Typhoeus only takes 500ms (the time of the response that took the longest). One other thing to note is that Typhoeus keeps a pool of libcurl Easy handles to use. For this benchmark I warmed the pool first. So if you test this out it may be a bit slower until the Easy handle pool has enough in it to run all the simultaneous requests. For some reason the easy handles can take quite some time to allocate.
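You can see the same shape without any server at all. The sketch below is not the benchmark above: it swaps in sleep for the slow server and plain Ruby threads for libcurl-multi, and it scales things down to five waits of 50ms so it finishes quickly. fake_request and elapsed are helpers invented for the example.

```ruby
# Stand-in for a request to a server that takes `delay` seconds to answer.
def fake_request(delay)
  sleep(delay)
  :ok
end

# Wall-clock seconds spent in the block, measured on the monotonic clock.
def elapsed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

count = 5
delay = 0.05

serial   = elapsed { count.times { fake_request(delay) } }
parallel = elapsed do
  Array.new(count) { Thread.new { fake_request(delay) } }.each(&:join)
end

puts format("serial:   %.2fs", serial)
puts format("parallel: %.2fs", parallel)
```

The serial run takes roughly count times the delay, while the overlapped run takes roughly one delay, which is the same relationship as 10 seconds versus 500ms in the numbers above.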

h2. Next Steps

* Add in ability to keep-alive requests and reuse them within hydra.
* Add support for automatic retry, exponential back-off, and queuing for later.

h2. LICENSE

(The MIT License)

Copyright (c) 2009:

"Paul Dix":http://pauldix.net

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
'Software'), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.