Skip to content

Commit

Permalink
Merge pull request #1 from gatesvp/master
Browse files Browse the repository at this point in the history
Map-reduce example for "pivot"
  • Loading branch information
gatesvp committed May 9, 2011
2 parents 49a2316 + 6eebfd6 commit 1ab5e5d
Showing 1 changed file with 74 additions and 0 deletions.
74 changes: 74 additions & 0 deletions content/patterns/pivot_data.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
---
title: Pivot Data with Map reduce
created_at: 2011-05-05 18:0:024.036546 -04:00
recipe: true
author: Gaetan Voyer-Perrault
description: How to use map-reduce to pivot table data.
filter:
- erb
- markdown
---

### Problem

You have a collection of Actors with an array of the Movies they've done.

You want to generate a collection of Movies with an array of Actors in each.

Some sample data

<% code 'javascript' do %>
db.actors.insert( { actor: "Richard Gere", movies: ['Pretty Woman', 'Runaway Bride', 'Chicago'] });
db.actors.insert( { actor: "Julia Roberts", movies: ['Pretty Woman', 'Runaway Bride', 'Erin Brockovich'] });
<% end %>

### Solution

We need to loop through each movie in the Actor document and emit each Movie individually.

The catch here is in the reduce phase. We cannot emit an array from the reduce phase, so we must build an Actors array inside of the "value" document that is returned.

#### The code

<% code 'javascript' do %>
map = function() {
for(var i in this.movies){
key = { movie: this.movies[i] };
value = { actors: [ this.actor ] };
emit(key, value);
}
}

reduce = function(key, values) {
actor_list = { actors: [] };
for(var i in values) {
actor_list.actors = values[i].actors.concat(actor_list.actors);
}
return actor_list;
}
<% end %>

Notice how actor_list is actually a javascript object that contains an array. Also notice that map emits the same structure.

Run the following to execute the command and print the result:

<% code 'javascript' do %>
printjson(db.foo.mapReduce(map, reduce, "pivot"));
db.pivot.find().forEach(printjson);
<% end %>

##### Reduce Step

The reduce function is trivial, as it simply performs a count:

<% code 'javascript' do %>
reduce = "function(key, values) {
var count = 0;

values.forEach(function(v) {
count += v['count'];
});

return {count: count};
}"
<% end %>

0 comments on commit 1ab5e5d

Please sign in to comment.