#!/usr/bin/env ruby
require 'rubygems'
require 'wukong/script'

module Size
  #
  # Feed the entire dataset through wc and sum the results
  #
  class Script < Wukong::Script
    #
    # Don't implement a wukong script to do something if there's a unix command
    # that does it faster: just override map_command or reduce_command in your
    # subclass of Wukong::Script to return the complete command line
    #
    def map_command
      '/usr/bin/wc'
    end

    # Make all records go to one reducer
    def default_options
      super.merge :reduce_tasks => 1
    end
  end

  #
  # Sums the numeric value of each column in its input
  #
  class Reducer < Wukong::Streamer::Base
    attr_accessor :sums

    #
    # The unix +wc+ command uses whitespace, not tabs, so we'll recordize
    # accordingly.
    #
    def recordize line
      line.strip.split(/\s+/)
    end

    #
    # add each corresponding column in the input
    #
    def process *vals
      self.sums = vals.zip( sums || [] ).map{|val,sum| val.to_i + sum.to_i }
    end

    #
    # run through the whole reduction input and then output the total
    #
    def stream *args
      super(*args)
      emit sums
    end
  end
end

# Execute the script
Size::Script.new(
  nil,
  Size::Reducer,
  :reduce_tasks => 1
).run
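
=begin
The recordize/process pair in Size::Reducer can be tried without Hadoop or
Wukong. The sketch below is only an illustration: ColumnSummer is a
hypothetical stand-in class (not part of Wukong), and the two sample lines are
made-up values shaped like +wc+ output read from stdin (lines, words, bytes).
It copies the split-on-whitespace and zip-and-add logic from above to show how
the per-column totals accumulate across records.

```ruby
# Hypothetical stand-in for Size::Reducer: same recordize/process logic,
# but with no dependency on Wukong or Hadoop.
class ColumnSummer
  attr_reader :sums

  # Split one line of wc-style output on runs of whitespace.
  def recordize(line)
    line.strip.split(/\s+/)
  end

  # Add each column of this record to the running per-column totals.
  # On the first record sums is nil, so every column pairs with nil,
  # and nil.to_i is 0.
  def process(*vals)
    @sums = vals.zip(sums || []).map { |val, sum| val.to_i + sum.to_i }
  end
end

summer = ColumnSummer.new

# Made-up sample input: one line per mapper, as wc would print it.
sample_lines = [
  "      3      12      60\n",
  "      5      20     100\n"
]

sample_lines.each do |line|
  summer.process(*summer.recordize(line))
end

result = summer.sums
puts result.join("\t")
```

Because the totals live in a single object, this only gives a global sum when
every record reaches the same reducer, which is why the script above asks for
one reduce task.
=end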