Ruby library to interact with HTTP Archive (.har) files.
Ruby JavaScript
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
.yardoc
doc
lib
spec
Gemfile
Gemfile.lock
LICENSE.txt
README.md
Rakefile
http_archive.gemspec

README.md

http_archive

Interaction with HTTP Archives, loosely following the interface of the Archive::HAR Perl module. This Ruby library provides an API to HTTP Archive (.har) files. .har files are JSON-formatted files which contain data about browser interaction with a web page, such as load data, cookie data, load times and URLs of artifacts.

More on HTTP Archives:
Wikipedia
HAR Specification

Installation

$ gem install http_archive

to use in your script:

require 'http_archive'

Usage

http_archive encapsulates all parts of a .har-file as an 'archive' object and fields of that object. To start, you create an HttpArchive::Archive object with a String or File as parameter like so:

archive = HttpArchive::Archive.new(File.open('/myfile.har', 'r'))

The 'archive' object now holds data from the archive as attributes, namely the following fields which are objects themselves:

  • A 'browser' object with 'name' and 'version' fields
  • A 'creator' object with 'name' and 'version' fields
  • A 'pages'-array with 'page' objects
  • An 'entries'-array with 'entry' objects

A 'page' object holds data about the page interaction as a whole: When the interaction started, page title, overall load time and so forth. An 'entry' object holds information about every interaction made for that page, such as: duration, request and response data, duration, server information...

To interact with that data, you query the 'archive' object. Examples:

puts archive.browser.name # => "Firefox"
puts archive.creator.name # => "Firebug"
all_pages = archive.pages # normally just one
puts all_pages.first.on_load # => 6745, 6.7 seconds load time

The 'entry' objects hold data of the requests and responses as objects as well, with the header data as a Hash:

first_entry = archive.entries.first
puts first_entry.request.http_method # => "GET"
puts first_entry.request.headers['Connection'] # => "keep-alive"
# More interesting stuff is here:
puts first_entry.response.content # => {"mimeType"=>"text/html", "size"=>0}
puts first_entry.response.status # => 200
puts page.entry.time # => 54

For more info, see the docs for each class. To summarize the data, http_archive can print out a table representation like the one you are used to from Firebug et cetera:

archive.print_table

gives you this:

Metrics for: 'Software is hard', 26 Requests, 0.36MB downloaded. Load time: 6.745s

GET http://www.janodvarko.cz/        302 Moved Temporarily    0.0 KB     0.054s
GET /index.php                       301 Moved Permanently    0.0 KB     0.469s
GET http://www.janodvarko.cz/blog/   200 OK                   52.95 KB   2.593s
GET /l10n.js?ver=20101110            200 OK                   0.31 KB    0.065s
GET /prototype.js?ver=1.6.1          200 OK                   139.85 KB  1.06s
GET /wp-scriptaculous.js?ver=1.8.3   200 OK                   2.94 KB    0.136s
GET /effects.js?ver=1.8.3            200 OK                   38.47 KB   0.665s
GET /geshi.css                       200 OK                   1.03 KB    0.261s
GET /lightbox.css                    200 OK                   1.42 KB    0.269s
GET /lightbox.js                     200 OK                   23.84 KB   0.55s
GET /rss.gif                         200 OK                   0.62 KB    1.114s
GET /x.png                           200 OK                   1.37 KB    1.151s
GET /useincommandline.png            200 OK                   21.55 KB   2.488s
GET /red-text.png                    200 OK                   18.97 KB   2.778s
GET /simple-log.png                  200 OK                   27.14 KB   3.135s
GET /start-button.png                200 OK                   11.29 KB   2.29s
GET /wordpress.gif                   200 OK                   0.52 KB    2.316s
GET /creativebits.gif                200 OK                   0.34 KB    2.323s
GET /urchin.js                       200 OK                   22.68 KB   1.476s
GET /style.css                       200 OK                   9.24 KB    0.43s
GET /quote.gif                       200 OK                   1.62 KB    1.716s
GET /sidebar_top.gif                 200 OK                   0.11 KB    1.767s
GET /sidebar_bottom.gif              200 OK                   0.11 KB    1.797s
GET /&utmac=UA-3586722-1&utmcc=__u   200 OK                   0.04 KB    1.318s
GET /loading.gif                     200 OK                   2.77 KB    0.071s
GET /closelabel.gif                  200 OK                   0.98 KB    0.089s

Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request