Wukong makes using Hadoop so easy a chimpanzee can use it.

h2. How to write a Wukong script

    #!/usr/bin/env ruby
    require 'wukong'

    module WordCount
      class Mapper < Wukong::Streamer::LineStreamer
        # Emit each word in each line.
        def process line
          line.strip.split(/\W+/).each { |word| yield word unless word.empty? }
        end
      end

      class Reducer < Wukong::Streamer::UniqCountLinesReducer
      end
    end

    # Execute the script
    Script.new(WordCount::Mapper, WordCount::Reducer).run

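To see what the mapper's @process@ method hands back, here is a minimal plain-Ruby sketch of the same generator (yield) pattern. It does not use Wukong: @TinyLineStreamer@ and @WordMapper@ are hypothetical stand-ins, and the real @Wukong::Streamer::LineStreamer@ may well work differently.

```ruby
require 'stringio'

# Hypothetical stand-in for a line streamer (not Wukong's implementation):
# feeds each input line to #process and writes out whatever it yields.
class TinyLineStreamer
  def stream(input, output)
    input.each_line do |line|
      process(line.chomp) { |record| output.puts(record) }
    end
  end
end

# Same shape as the WordCount mapper: yield one word per record.
class WordMapper < TinyLineStreamer
  def process(line)
    line.strip.split(/\W+/).each do |word|
      yield word unless word.empty?
    end
  end
end

out = StringIO.new
WordMapper.new.stream(StringIO.new("the quick fox\nthe end\n"), out)
puts out.string
```

Because @process@ only yields, the caller decides where records go, which is what makes it easy to swap standard output for a string buffer in a test.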

h2. How to run a Wukong script

    your/script.rb --go path/to/input_files path/to/output_dir

All of the file paths are HDFS paths except your script path, of course, which
is on the local filesystem.

You can supply arbitrary command line arguments (they wind up as key-value pairs
in the options hash your mapper and reducer receive), and you can use the Hadoop
syntax to specify more than one input file:

    ./path/to/your/script.rb --any_specific_options --options=can_have_vals \
      --go "input_dir/part_*,input_file2.tsv,etc.tsv" path/to/output_dir

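As a rough illustration of how flags of that shape could become key-value pairs, here is a plain-Ruby sketch. @parse_flags@ is a hypothetical helper, not Wukong's actual option parser, and storing a bare flag as @true@ is an assumption made for the example.

```ruby
# Hypothetical helper (not part of Wukong): split an argument list into
# an options hash plus the remaining positional arguments.
def parse_flags(argv)
  options = {}
  rest = []
  argv.each do |arg|
    if arg =~ /\A--([\w-]+)(?:=(.*))?\z/m
      # "--key=val" stores the value; a bare "--key" is stored as true (assumed).
      options[$1.to_sym] = $2.nil? ? true : $2
    else
      rest << arg
    end
  end
  [options, rest]
end

options, paths = parse_flags(
  %w[--any_specific_options --options=can_have_vals --go input_dir/part_* path/to/output_dir]
)
puts options.inspect
puts paths.inspect
```

The positional arguments left over here correspond to the input and output paths in the command above.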

h2. How to test your scripts

To run the mapper on its own:

    cat ./local/test/input.tsv | ./examples/word_count.rb --map | more

or, if your test data lives on the HDFS:

    hdp-cat test/input.tsv | ./examples/word_count.rb --map | more

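Before involving Hadoop at all, the map, sort, and count steps can be imitated in plain Ruby to sanity-check the output you expect. This is a sketch under assumptions: @map_words@ and @uniq_count@ are hypothetical helpers that mimic what the word count example appears to do, not Wukong code.

```ruby
# Hypothetical helper: the mapper's logic, one word per output record.
def map_words(lines)
  lines.flat_map { |line| line.strip.split(/\W+/).reject(&:empty?) }
end

# Hypothetical helper: count runs of identical adjacent records,
# as a uniq-count style reducer would on sorted input.
def uniq_count(sorted_records)
  sorted_records.chunk_while { |a, b| a == b }.map { |run| [run.first, run.size] }
end

lines  = ["the quick fox", "the end"]
counts = uniq_count(map_words(lines).sort)
counts.each { |word, n| puts "#{word}\t#{n}" }
```

The sort in the middle matters: counting adjacent runs only gives correct totals when identical records arrive next to each other.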

h2. What's up with Wukong::AndPig?

@Wukong::AndPig@ is a small library that makes it easier to generate code for the
"Pig":http://hadoop.apache.org/pig data analysis language. See its
"README":wukong/and_pig/README.textile for more.

h2. Why is it called Wukong?

Hadoop, as you may know, is "named after a stuffed elephant":http://en.wikipedia.org/wiki/Hadoop.
Wukong (the Monkey King), known for his power and agility, is the hero of a
famous Chinese fairy tale in which he journeys to the land of the Elephant:

Quoting "Sun Wukong's Wikipedia entry:":http://en.wikipedia.org/wiki/Wukong

bq. Sun Wukong (traditional Chinese: 孫悟空;
simplified Chinese: 孙悟空; pinyin: Sūn Wùkōng; Wade-Giles: Sun1 Wu4-k'ung1;
Japanese 孫悟空 (Son Gokū?)), known in the West as the Monkey King, is the main
character in the classical Chinese epic novel Journey to the West. In the novel,
he accompanies the monk Xuanzang on the journey to retrieve Buddhist sutras from
India.

bq. Sun Wukong possesses incredible strength, being able to lift his 13,500 jīn
(8,100 kg) Ruyi Jingu Bang with ease. He also has superb speed, traveling
108,000 li (54,000 kilometers) in one somersault. Sun knows 72 transformations,
which allows him to transform into various animals and objects; he is, however,
shown with slight problems transforming into other people, since he is unable to
complete the transformation of his tail. He is a skilled fighter, capable of
holding his own against the best generals of heaven. Each of his hairs possesses
magical properties, and is capable of transforming into a clone of the Monkey
King himself, or various weapons, animals, and other objects. He also knows
various spells in order to command wind, part water, conjure protective circles
against demons, freeze humans, demons, and gods alike. (Journey to the West, Wu
Cheng'en (1500-1582), Translated by Foreign Languages Press, Beijing 1993.)

p. Sounds about right to us :) The "BBC-produced Jamie Hewlett / Damon Albarn
short":http://news.bbc.co.uk/sport1/hi/olympics/monkey made for the 2008
Olympics is highly recommended.