mrflip / wukong

Ruby libraries for efficient, effective Hadoop streaming


Philip (flip) Kromer (author)
Sun Nov 08 13:38:51 -0800 2009
commit  07912149a00fb0d3c206bdc1a8759516f6f250b1
tree    214c20986608c29db2438cb643d561d5f0774603
parent  e1fe6784fcd165ee6bff129df74dd9c38cf6f3e0
wukong / examples
type       name                     age                             message [author]
file       README.txt               Mon Jun 22 03:35:10 -0700 2009  Breadth first search [Philip (flip) Kromer]
file       apache_log_parser.rb     Thu Oct 29 03:00:41 -0700 2009  Documentation [Philip (flip) Kromer]
file       count_keys.rb            Fri Jul 24 08:48:43 -0700 2009  Making startup times faster. No longer loading ... [Philip (flip) Kromer]
file       count_keys_at_mapper.rb  Wed Apr 08 20:01:51 -0700 2009  Update example require paths to include lib [Empact]
directory  graph/                   Thu Oct 29 03:00:41 -0700 2009  Documentation [Philip (flip) Kromer]
file       package-local.rb         Wed Apr 08 20:01:51 -0700 2009  Update example require paths to include lib [Empact]
file       package.rb               Sat Jul 25 22:28:13 -0700 2009  Tweaking scripts to sync & package resources. [Philip (flip) Kromer]
directory  pagerank/                Fri Jul 10 23:15:21 -0700 2009  Added to readme [Philip (flip) Kromer]
file       rank_and_bin.rb          Tue Jul 28 10:20:47 -0700 2009  Made struct streamer keep the suffix (class_nam... [Philip (flip) Kromer]
file       run_all.sh               Wed Feb 18 02:00:33 -0800 2009  updated examples to work with new options struc... [Philip (flip) Kromer]
file       sample_records.rb        Sun Nov 08 13:37:57 -0800 2009  Added reuse_jvms option-- helps speed up things... [Philip (flip) Kromer]
file       size.rb                  Wed Apr 08 20:01:51 -0700 2009  Update example require paths to include lib [Empact]
file       word_count.rb            Tue Jun 23 17:14:36 -0700 2009  word count [Philip (flip) Kromer]
examples/README.txt
Examples:


* sample_records -- extract a random sample from a collection of data

* word_count -- count the occurrences of each word in a body of text

* apache_log_parser -- example of parsing standard Apache web server log files

* wordchains -- solving a word puzzle using breadth-first search of a graph

* graph -- some generic graph examples

* pagerank -- use the PageRank algorithm to find the most 'interesting'
  (central) nodes of a network graph
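
To make the first two examples concrete, here is a minimal plain-Ruby sketch of
the Hadoop streaming pattern they rely on: a mapper emits tab-separated
key/value lines, and a reducer receives those lines sorted by key. It does not
use Wukong's API and is not the code in word_count.rb or sample_records.rb; the
method names (word_count_map, word_count_reduce, sample_records) and the
tokenizing regex are assumptions made for illustration.

```ruby
# Plain-Ruby sketch of the Hadoop streaming contract; it does NOT use Wukong's API.
# All method names here are hypothetical, chosen for illustration.

# Mapper for a word count: emit one "word\t1" line per word in each input line.
def word_count_map(lines)
  lines.flat_map do |line|
    line.downcase.scan(/[a-z0-9']+/).map { |word| "#{word}\t1" }
  end
end

# Reducer for a word count: input lines must arrive sorted by key (as the
# shuffle between map and reduce guarantees), so all counts for one word
# are adjacent and can be summed group by group.
def word_count_reduce(sorted_lines)
  sorted_lines
    .map { |line| line.chomp.split("\t", 2) }
    .chunk_while { |(a, _), (b, _)| a == b }
    .map { |group| "#{group.first.first}\t#{group.sum { |_, n| n.to_i }}" }
end

# Map-only random sample: keep each line independently with probability `fraction`.
def sample_records(lines, fraction, rng = Random.new)
  lines.select { rng.rand < fraction }
end

if __FILE__ == $PROGRAM_NAME
  input  = ["the quick brown fox", "the lazy dog"]
  mapped = word_count_map(input)
  # Sorting stands in for the shuffle-and-sort step between map and reduce.
  puts word_count_reduce(mapped.sort)
end
```

In an actual streaming job the mapper and reducer would run as separate
processes, each reading lines from STDIN and writing lines to STDOUT; the
in-memory arrays above only stand in for those streams.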