The Infinite Monkeywrench (IMW) is a Ruby framework that simplifies the tasks of acquiring, extracting, transforming, loading, and packaging data. It has the following goals:
Minimize programmer time even at the expense of increasing run time.
Take data through a full transformation from raw source to packaged purity in as few lines of code as possible.
Treat data records as objects as much as possible.
Reuse better code that already exists in other libraries instead of rewriting it (FasterCSV, I'm talkin' to you).
Make what's common easy without making what's uncommon impossible.
Work with messy data as well as clean data.
Let you incorporate your own tools wherever you choose to.
The Infinite Monkeywrench is a powerful tool but it is not always the right one to use. IMW is *not* designed for
IMW is hosted on Gemcutter so it's easy to install.
You'll have to set up Gemcutter
$ sudo gem install gemcutter
$ gem tumble
and then install IMW
$ sudo gem install imw
The central goal of IMW is to make the workflow involved in processing a dataset, from raw source to finished product, as simple as possible.
So consider that there exist two datasets that I want to combine. The first details the historical price of bananas over the past century and the second
require 'rubygems'
require 'imw'
IMW holds a registry of paths that you can define on the fly or store in a configuration file.
IMW.add_path :dropbox, "/var/www/public/dropbox"
IMW.add_path :raw, "/mnt/data/raw"
IMW.add_path :
This makes it easier
IMW.path_to :raw, "one/particular/dataset" #=> "/mnt/data/raw/one/particular/dataset"
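To make the idea of a path registry concrete, here is a minimal sketch in plain Ruby. It is not IMW's implementation: the PathRegistry module is a hypothetical stand-in that only mirrors the add_path and path_to calls shown above, and it assumes a registered root is simply joined with whatever segments follow it.

```ruby
# Hypothetical stand-in for a path registry; NOT IMW's actual implementation.
module PathRegistry
  @paths = {}

  # Register a named root directory.
  def self.add_path(name, root)
    @paths[name.to_sym] = root
  end

  # Join a registered root with any further path segments.
  def self.path_to(name, *segments)
    root = @paths.fetch(name.to_sym) do
      raise ArgumentError, "no path registered under #{name.inspect}"
    end
    File.join(root, *segments)
  end
end

PathRegistry.add_path :raw, "/mnt/data/raw"
puts PathRegistry.path_to(:raw, "one/particular/dataset")
```

Keeping the roots in one place means a script can move between machines by changing a handful of registrations rather than every hard-coded path.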
IMW makes it easy to manipulate compressed files and archives.
# Move a collection of files from a public dropbox to a processing directory raw
Dir["/public/*"].each do |path|
  file = IMW.open(path)
  case
  when file.compressed?
    file.decompress.mv_to_dir "/raw"
  when file.archive?
    FileUtils.cd("/raw") do
      file.extract
    end
  else
    file.mv_to_dir("/raw")
  end
end
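The compressed-versus-archive distinction in the example above can be illustrated without IMW at all. The sketch below is an assumption-laden stand-in: classify and the two extension lists are hypothetical, and looking only at the final file extension is a simplification, not a description of how IMW decides.

```ruby
# Hypothetical helper, NOT part of IMW: classify a path by its last extension.
COMPRESSED_EXTENSIONS = %w[.gz .bz2].freeze
ARCHIVE_EXTENSIONS    = %w[.tar .zip .rar].freeze

def classify(path)
  ext = File.extname(path).downcase
  if COMPRESSED_EXTENSIONS.include?(ext)
    :compressed   # a single compressed stream, e.g. data.csv.gz
  elsif ARCHIVE_EXTENSIONS.include?(ext)
    :archive      # a container of several files, e.g. bundle.tar
  else
    :plain
  end
end

%w[prices.csv.gz bundle.tar notes.txt backup.tar.gz].each do |name|
  puts "#{name}: #{classify(name)}"
end
```

Under this scheme a file such as backup.tar.gz counts as compressed first; once decompressed it would be a .tar file and classify as an archive, which matches the order of the branches in the example above.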