Commit

README
heynemann committed Jul 17, 2012
1 parent c52ac6f commit a2a3974
Showing 1 changed file with 21 additions and 0 deletions.
21 changes: 21 additions & 0 deletions README.md
@@ -153,7 +153,28 @@ mapper has with a specific input stream and with a specific reducer.
Reducing
--------

After all input streams have been mapped, it is time to reduce our data to one
coherent value. This is what the reducer does.

In the case of counting word occurrences, a sample implementation is as
follows:

    from collections import defaultdict

    class CountWordsReducer:
        job_type = 'count-words'

        def reduce(self, app, items):
            # Running total per word; unseen words start at 0.
            word_freq = defaultdict(int)
            # Each item is a sequence of (word, frequency) pairs.
            for line in items:
                for word, frequency in line:
                    word_freq[word] += frequency

            return word_freq

The `job_type` attribute is required. It specifies the relationship that this
reducer has with its mappers and with a specific input stream.

This reducer returns a dictionary that maps each word to the total number of
times it occurs in the given file.
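
To make the expected input shape concrete, here is a minimal sketch that calls `reduce` by hand. It assumes each mapped item is a list of `(word, frequency)` pairs, which is what the loop in the reducer implies. The `sample_items` data is made up for illustration, and `None` stands in for `app` because this reducer never uses it; in a real run the framework would presumably supply both arguments.

```python
from collections import defaultdict


class CountWordsReducer:
    job_type = 'count-words'

    def reduce(self, app, items):
        word_freq = defaultdict(int)
        for line in items:
            for word, frequency in line:
                word_freq[word] += frequency
        return word_freq


# Hypothetical mapper output: one list of (word, frequency) pairs per item.
sample_items = [
    [('the', 2), ('cat', 1)],
    [('the', 1), ('dog', 3)],
]

# `app` is unused by this reducer, so None is passed as a placeholder.
result = CountWordsReducer().reduce(None, sample_items)
print(dict(result))  # {'the': 3, 'cat': 1, 'dog': 3}
```

Because the result is a `defaultdict`, looking up a word that never appeared returns `0` rather than raising `KeyError`.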
