making the examples preformatted #2

Merged
2 participants

@yanick

add CRs in the doc so that the examples are seen by POD as preformatted.
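For context on why the blank lines matter: POD treats any paragraph that begins with whitespace as a verbatim (preformatted) paragraph, but paragraphs are delimited by blank lines. Without a blank line between the prose and the indented example, the example lines are absorbed into the ordinary paragraph and reflowed. A minimal illustration (hypothetical heading, same structure as the patched docs):

```pod
=head1 EXAMPLE

example mapper input:

    line1
    line2
    line3
```

The blank line after "example mapper input:" is exactly what this commit adds; the indentation was already present in the source.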

@spazm spazm merged commit ce87eec into spazm:master
@spazm
Owner

Finally pushed the updated changes to CPAN, Hadoop-Streaming-0.122420.

I have no excuse for taking this long; the total steps involved:

git pull
dzil release
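As a sketch, those two commands in context (assuming a Dist::Zilla-managed checkout with a dist.ini and CPAN upload credentials already configured):

```shell
# Fetch the merged pull request into the local checkout
git pull

# Dist::Zilla: build the tarball, run the tests, tag, and upload to CPAN
dzil release
```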
@yanick

\o/. Thanks! Both for the merge and the very handy module. :-)

Commits on Jul 7, 2012
  1. @yanick
Showing with 5 additions and 0 deletions.
  1. +5 −0 lib/Hadoop/Streaming.pm
lib/Hadoop/Streaming.pm
@@ -89,11 +89,13 @@ Reduce jobs are provided a stream of key\tvalue lines. multivalued keys appear
Hadoop::Mapper consumes and chomps lines from STDIN and calls map($line) once per line. This is initiated by the run() method.
example mapper input:
+
line1
line2
line3
Hadoop::Mapper transforms this into 3 calls to map()
+
map(line1)
map(line2)
map(line3)
@@ -103,6 +105,7 @@ Hadoop::Mapper transforms this into 3 calls to map()
Hadoop::Reducer abstracts this stream into an interface of (key, value-iterator). reduce() is called once per key, instead of once per line. The reduce job pulls values from the iterator and outputs key/value pairs to STDOUT. emit() is provided as a convenience for outputting key/value pairs.
example reducer input:
+
key1 value1
key2 valuea
key2 valuec
@@ -111,6 +114,7 @@ example reducer input:
key3 valuebar
Hadoop::Streaming::Reduce transforms this input into three calls to reduce():
+
reduce( key, iterator_over(qw(value1)) );
reduce( key2, iterator_over(qw(valuea valuec valueb)) );
reduce( key3, iterator_over(qw(valuefoo valuebarr)) );
@@ -118,6 +122,7 @@ Hadoop::Streaming::Reduce transforms this input into three calls to reduce():
=item Hadoop::Streaming::Combiner interface
The Hadoop::Streaming::Combiner interface is analogous to the Hadoop::Streaming::Reducer interface. combine() is called instead of reduce() for each key. The above example would produce three calls to combine():
+
combine( key, iterator_over(qw(value1)) );
combine( key2, iterator_over(qw(valuea valuec valueb)) );
combine( key3, iterator_over(qw(valuefoo valuebarr)) );
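Putting the interfaces described in the patched documentation together, a minimal word-count sketch. This is hedged: emit(), run(), map(), and reduce() follow the descriptions above, but the role-consumption style and the iterator's has_next()/next() method names are assumptions, not confirmed by this diff.

```perl
#!/usr/bin/env perl
# Word-count sketch against the Hadoop::Streaming interfaces described above.
# Assumptions: Mapper/Reducer are Moose roles, and the per-key value
# iterator exposes has_next()/next().
use strict;
use warnings;

package WordCount::Mapper;
use Moose;
with 'Hadoop::Streaming::Mapper';

# run() chomps lines from STDIN and calls map() once per line
sub map {
    my ( $self, $line ) = @_;
    $self->emit( $_, 1 ) for split /\s+/, $line;
}

package WordCount::Reducer;
use Moose;
with 'Hadoop::Streaming::Reducer';

# reduce() is called once per key with an iterator over that key's values
sub reduce {
    my ( $self, $key, $values ) = @_;
    my $count = 0;
    $count += $values->next while $values->has_next;
    $self->emit( $key, $count );
}

package main;
# Hadoop streaming runs the mapper and reducer as separate commands;
# pick the side from a (hypothetical) command-line argument.
my $mode = shift @ARGV // '';
WordCount::Mapper->run  if $mode eq 'map';
WordCount::Reducer->run if $mode eq 'reduce';
```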