This demo is geared toward riak, but the sample code can easily be adapted to other databases, provided they support JavaScript/JSON. The sample data should work well for testing any MapReduce framework.
NoSQL MapReduce Demonstration

Author: Scott Gonyea, me(at)sgonyea(com)
Twitter: @acts_as
IRC: acts_as in #riak
Copyright: 2010
License: See the Source Code Attribution and Sample Data Attribution sections. Where not specified: MIT License.
Release Date: 2010-07-03


This project was created to highlight a few of the use cases for MapReduce, using the riak key/value store for demonstration. That said, the data, the demonstrated concepts, and even some of the code should prove both portable and useful as you explore other MapReduce implementations.

This project's data and source code will be used in my presentation at the LA Ruby Meetup (July 2010). I hope you find it useful! I appreciate any and all feedback.

Sample Data Attribution

Project: mrtoolkit
File: raw-logs
Author: Charles Hayden
License: GPLv3

Project: IP to Country
File: geo-ip
Author: Webnet77
License: GPLv3 (see the author's message)

Project: Windy City DB
File: Windy-City-DB-Dataset
Author: Stack Overflow
License: Creative Commons (see the author's message)

Source Code Attribution


Requirements

  • riak (to use the source code without modification)
  • curl
  • ruby 1.9.1 (not tested on anything else)
  • gem: rake
  • gem: riak-client (officially supported by Basho)


Please see the respective files for each chunk of sample data. They include instructions for loading the data into riak, as well as various MapReduce functions. You may also look at the included Keynote presentation (and its PDF export) to see the points I highlight.
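As a rough illustration of the kind of MapReduce function mentioned above, here is a minimal sketch in JavaScript. It follows the convention riak uses for JavaScript map functions, where the stored value arrives as a string in values[0].data, and it tallies records per country. The country field, the sample records, and the runLocally helper are assumptions made up for this example; they are not taken from the project's data files or source code.

```javascript
// Map phase, shaped the way riak passes objects to JavaScript map functions:
// the stored value is a string in object.values[0].data.
// The "country" field is a hypothetical attribute used for illustration.
function mapCountry(object, keyData, arg) {
  var entry = JSON.parse(object.values[0].data);
  var counts = {};
  counts[entry.country] = 1;
  return [counts];
}

// Reduce phase: merge partial tallies into one object.
// A reduce function may be applied to its own earlier output,
// so the input and output deliberately share the same shape.
function reduceSum(values, arg) {
  var totals = {};
  values.forEach(function (partial) {
    Object.keys(partial).forEach(function (key) {
      totals[key] = (totals[key] || 0) + partial[key];
    });
  });
  return [totals];
}

// Hypothetical local stand-in for a cluster, so the two functions
// can be tried without a running riak node.
function runLocally(records, mapFn, reduceFn) {
  var mapped = [];
  records.forEach(function (record) {
    var object = { values: [{ data: JSON.stringify(record) }] };
    mapped = mapped.concat(mapFn(object, null, null));
  });
  return reduceFn(mapped, null);
}

// Made-up sample records, not part of the project's data sets.
var sampleRecords = [{ country: "US" }, { country: "DE" }, { country: "US" }];
console.log(JSON.stringify(runLocally(sampleRecords, mapCountry, reduceSum)));
// prints [{"US":2,"DE":1}]
```

Because the reduce function accepts its own output as input, the same pair of functions should carry over to other MapReduce implementations with little more than a change to how the stored value is unpacked in the map phase.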