HttpSession

A useful yet still extremely light-weight web client built on top of Ruby's Net::HTTP. It keeps connection and cookie state internally in a session for each host/port used. Great for simple web page scraping or web service API usage.

Dependencies

  • just Ruby, versions 1.8.6 through 1.9.2

  • no third party gem dependencies!

Features

  • Extremely light-weight at just 100-200 lines of code

  • Supports GET and POST requests, including POST parameter data

  • Supports SSL connections in a natural way, and performs certificate verification using the same CA bundle as Mozilla

  • Automatically uses KeepAlives if the server supports them

  • Automatically remembers Cookies for each session (though currently ignores host/path/expires/secure options)

  • Automatically follows Redirects, up to a certain limit

  • Automatically retries, up to a limit, when certain possibly transient errors happen. This is useful, for example, when the KeepAlive limit is reached and the server hangs up on you: it simply retries and keeps going without missing a beat.

  • Supports additional headers for whatever simple additional HTTP features you may need.
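
The retry behaviour described above can be sketched with plain Ruby. This is only an illustration of the general technique, not HttpSession's actual code: the helper name with_retries, the TRANSIENT_ERRORS list, and the default limit are all assumptions made for the example.

```ruby
# Hypothetical helper, not part of HttpSession's API: runs a block and
# retries it when a possibly transient connection error is raised,
# allowing up to max_retries extra attempts before giving up.
TRANSIENT_ERRORS = [EOFError, Errno::ECONNRESET, Errno::EPIPE].freeze

def with_retries(max_retries = 2)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue *TRANSIENT_ERRORS
    # Re-run the begin block while retries remain, otherwise re-raise.
    retry if attempts <= max_retries
    raise
  end
end

# Simulate a server that hangs up on the first two attempts.
result = with_retries(2) do |attempt|
  raise EOFError, 'server closed the connection' if attempt < 3
  "succeeded on attempt #{attempt}"
end
puts result
```

Capping the number of attempts matters: without the limit, a server that is really down would keep the client looping forever instead of surfacing the error to the caller.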

Usage

Normal usage, where the URL string is parsed to determine the host, port, and whether to use SSL:

response_object = HttpSession.get_request_url(url_string)
response_object = HttpSession.get_request_url(url_string, headers_hash)
response_object = HttpSession.post_request_url(url_string, post_params_hash)
response_object = HttpSession.post_request_url(url_string, post_params_hash, headers_hash)
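
As a rough sketch of what parsing a URL string into its pieces involves, the standard library's URI class can do the work. The helper name split_url and the hash keys below are hypothetical, chosen only for this example; HttpSession's own parsing may differ.

```ruby
require 'uri'

# Hypothetical helper for illustration only: breaks a URL string into the
# kind of pieces a lower-level call would need.
def split_url(url_string)
  uri = URI.parse(url_string)
  {
    :host        => uri.host,
    :use_ssl     => uri.scheme == 'https',
    :port        => uri.port,        # URI fills in 80 or 443 when none is given
    :request_uri => uri.request_uri  # path plus query string, if any
  }
end

parts = split_url('https://example.com/search?q=ruby')
puts parts.inspect
```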

Lower-level examples, for when your URL is already broken down into pieces:

session_object  = HttpSession.use(host_string, use_ssl_boolean, port_integer)
response_object = session_object.request(uri_string, headers_hash, get_or_post_symbol, post_params_hash)

session_object  = HttpSession.use(host_string)  # defaults are: false, 80 (or 443 if use_ssl is true)
response_object = session_object.request        # defaults are: '/', {}, :get, {}

All the above return:

  • a Net::HTTPResponse object in all its glory

Or raise any of the following:

  • Timeout::Error, Errno::* (SystemCallError subclass), OpenSSL::SSL::SSLError, or EOFError - for various connection-related issues

  • Net::HTTP*Error/Exception (Net::ProtocolError subclass) - for HTTP response errors

Notes:

  • You do not need to check the response object for HTTP errors like 404 "Not Found"; they are raised automatically and can be caught when needed. In my opinion this is better than how Net::HTTP behaves.

  • A common pattern is to feed the response_object.body string into Nokogiri for further processing.
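
To illustrate the "raised automatically" idea from the first note, here is a small sketch using only Net::HTTP from the standard library. Net::HTTPResponse#value raises a Net::ProtocolError subclass for any non-2xx response. Whether HttpSession relies on this method internally is an assumption; the example only shows what catching such an error looks like.

```ruby
require 'net/http'

# Build a 404 response by hand so the example needs no network access.
not_found = Net::HTTPNotFound.new('1.1', '404', 'Not Found')

begin
  # Net::HTTPResponse#value raises unless the response is a 2xx success.
  not_found.value
  outcome = 'no error'
rescue Net::ProtocolError => e
  # The raised exception still carries the original response object.
  outcome = "HTTP error: #{e.response.code} #{e.response.message}"
end

puts outcome
```

Rescuing Net::ProtocolError, rather than one specific error class, matches the exception family listed above and catches both client (4xx) and server (5xx) errors.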

Contributing

If you think you found a bug or want a feature, get involved at github.com/dburry/http_session/issues. If you’d then like to contribute a patch, use GitHub’s wonderful fork and pull request features. Keep in mind that one of the main goals of this library is to remain light-weight, so changes made in this spirit are the most likely to be included.

To set up a full development environment:

  • git clone the repository,

  • have RVM and Bundler installed,

  • then cd into your repo (follow any RVM prompts if this is your first time using that),

  • and run bundle install to pull in all the rest of the development dependencies.

  • After that point, rake -T should be fairly self-explanatory.

  • Check out the rake -T rubies tasks for some neat multi-ruby-version setup and testing.

Note: If rake -T doesn’t show much or gives you warnings about missing libraries, then perhaps you did not install RVM or run bundle install correctly, or you do not have the right Ruby version or gemset installed/selected with RVM. You can either correct this the normal way, or (if you have RVM) run the rake rubies:setup task, which makes sure several different versions of Ruby are installed and sets up a nicely configured gemset in each for you. You can then use RVM to switch between them to test manually or run anything you want in any version (although the .rvmrc currently chooses Ruby 1.9.2 for you).

Alternatives

Light-weight libraries are great resource savers when you only need limited features. But they’re not for every situation. Here are some other great alternatives:

  • Mechanize - Much more fully-featured web page scraping library.

License

This library is distributed under the MIT license. Please see the LICENSE file.
