Commit
Now using generator (yield()) semantics rather than crudely puts'ing results
Philip (flip) Kromer committed Feb 16, 2009
1 parent b323256, commit 0f51446
Showing 12 changed files with 159 additions and 89 deletions.
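The change described in the commit message can be illustrated with a minimal sketch. It assumes a streamer whose `process` method yields results to a driver loop instead of printing them itself. The class `SketchStreamer` and its `stream` and `emit` methods are hypothetical names for illustration, not the actual Wukong API.

```ruby
require 'stringio'

# Hypothetical stand-in for a streamer; not the actual Wukong::Streamer::Base.
class SketchStreamer
  # Turn a raw line into a record (an array of fields).
  def recordize(line)
    line.chomp.split("\t")
  end

  # Yield zero or more results per record instead of puts'ing them directly.
  def process(*fields)
    yield fields.reverse
  end

  # The driver loop decides what happens to each yielded result.
  def stream(input, output)
    input.each_line do |line|
      record = recordize(line) or next
      process(*record) { |result| emit(result, output) }
    end
  end

  # Write one result as a tab-separated line.
  def emit(result, output)
    output.puts Array(result).join("\t")
  end
end

out = StringIO.new
SketchStreamer.new.stream(StringIO.new("a\tb\nc\td\n"), out)
```

Because `process` only yields, the same streamer can be driven by a loop that writes to standard output, collects results in an array for testing, or passes them to another stage.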
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,11 @@ | ||
require 'wukong/streamer/base' | ||
require 'wukong/streamer/accumulating_reducer' | ||
require 'wukong/streamer/line_streamer' | ||
require 'wukong/streamer/struct_streamer' | ||
# | ||
require 'wukong/streamer/filter' | ||
# | ||
require 'wukong/streamer/accumulating_reducer' | ||
require 'wukong/streamer/list_reducer' | ||
require 'wukong/streamer/uniq_by_last_reducer' | ||
require 'wukong/streamer/uniq_count_keys_reducer' | ||
require 'wukong/streamer/uniq_count_lines_reducer' |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
module Wukong | ||
module Streamer | ||
class LineStreamer < Wukong::Streamer::Base | ||
# | ||
# Turns a flat line into a record for +#process+ to consume | ||
# | ||
def recordize line | ||
[line] | ||
end | ||
end | ||
end | ||
end |
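A minimal sketch of how a subclass might rely on this `recordize`, assuming the base class splats the returned record into `process` and collects whatever `process` yields. `SketchBase`, `each_result`, and `SketchLineStreamer` are hypothetical names, not part of Wukong.

```ruby
# Hypothetical minimal base class, assumed only for this sketch.
class SketchBase
  # Recordize each line, hand the record to #process, collect yielded results.
  def each_result(lines)
    results = []
    lines.each do |line|
      record = recordize(line.chomp) or next
      process(*record) { |result| results << result }
    end
    results
  end
end

class SketchLineStreamer < SketchBase
  # Same shape as LineStreamer#recordize: the whole line is the only field.
  def recordize(line)
    [line]
  end

  # Receives the untouched line; skips blank lines, upcases the rest.
  def process(line)
    yield line.upcase unless line.empty?
  end
end

results = SketchLineStreamer.new.each_result(["foo\tbar\n", "\n", "baz\n"])
```

Wrapping the line in a one-element array keeps tabs inside the line intact, so `process` sees the full text rather than split fields.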
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
module Wukong | ||
module Streamer | ||
# | ||
# Accumulate every record seen for each key into a list | ||
# | ||
class ListReducer < Wukong::Streamer::AccumulatingReducer | ||
attr_accessor :values | ||
|
||
# reset the accumulated values to an empty list | ||
def reset! | ||
super | ||
self.values = [] | ||
end | ||
|
||
# record one more for this key | ||
def accumulate *record | ||
self.values << record | ||
end | ||
end | ||
end | ||
end |
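The diff does not show `AccumulatingReducer`, so the following is only a sketch of how a list reducer of this shape might be driven. It assumes input sorted by key, a key taken from the first field, and a `finalize` hook that yields once per key. `SketchAccumulatingReducer`, `get_key`, `reduce`, and `finalize` are hypothetical names for illustration.

```ruby
# Hypothetical base: groups consecutive records that share a key.
class SketchAccumulatingReducer
  NONE = Object.new # sentinel meaning "no key seen yet"
  attr_accessor :key

  def reset!; end

  # Assumption: the key is the first field of each record.
  def get_key(*record)
    record.first
  end

  # Walk sorted records; finalize the old group whenever the key changes.
  def reduce(records)
    results = []
    current = NONE
    records.each do |record|
      this_key = get_key(*record)
      if this_key != current
        finalize { |result| results << result } unless current.equal?(NONE)
        self.key = current = this_key
        reset!
      end
      accumulate(*record)
    end
    finalize { |result| results << result } unless current.equal?(NONE)
    results
  end
end

class SketchListReducer < SketchAccumulatingReducer
  attr_accessor :values

  # Start each key with an empty list.
  def reset!
    super
    self.values = []
  end

  # Append the whole record to the list for the current key.
  def accumulate(*record)
    values << record
  end

  # Yield the key together with everything accumulated for it.
  def finalize
    yield [key, values]
  end
end

grouped = SketchListReducer.new.reduce([["a", 1], ["a", 2], ["b", 3]])
```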
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,30 +1,48 @@ | ||
module Wukong | ||
module Streamer | ||
# | ||
# Mix StructRecordizer into any streamer to make it accept a stream of | ||
# objects -- the first field in each line is turned into a class and used to | ||
# instantiate an object using the remaining fields on that line. | ||
# | ||
# | ||
class StructStreamer < Wukong::Streamer::Base | ||
def itemize line | ||
StructItemizer.itemize *super(line) | ||
end | ||
end | ||
|
||
# | ||
# | ||
# | ||
module StructItemizer | ||
module StructRecordizer | ||
def self.class_from_resource klass_name | ||
begin klass = klass_name.to_s.camelize.constantize | ||
# kill off all but class name | ||
klass_name = klass_name.gsub(/-.*$/, '') | ||
begin | ||
# convert it to class name | ||
klass = klass_name.to_s.camelize.constantize | ||
rescue ; warn "Bogus class name '#{klass_name}'" ; return ; end | ||
end | ||
|
||
def self.itemize klass_name, *vals | ||
# | ||
# Turns the first field into a class name, then uses the remaining fields | ||
# on that line to instantiate the object to process. | ||
# | ||
def self.recordize klass_name, *fields | ||
return if klass_name =~ /^(?:bogus-|bad_record)/ | ||
klass_name.gsub!(/-.*$/, '') # kill off all but class name | ||
klass = self.class_from_resource(klass_name) or return | ||
[ klass.new(*vals) ] | ||
klass = class_from_resource(klass_name) or return | ||
# instantiate the class using the remaining fields on that line | ||
[ klass.new(*fields) ] | ||
end | ||
|
||
# | ||
# | ||
# | ||
def recordize line | ||
StructRecordizer.recordize line.split("\t") | ||
end | ||
end | ||
|
||
# | ||
# Processes file as a stream of objects -- the first field in each line is | ||
# turned into a class and used to instantiate an object using the remaining | ||
# fields on that line. | ||
# | ||
# See [StructRecordizer] for more. | ||
# | ||
class StructStreamer < Wukong::Streamer::Base | ||
include StructRecordizer | ||
end | ||
end | ||
end |
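A self-contained sketch of the recordizing idea: the first tab-separated field names a class and the remaining fields instantiate it. Two assumptions differ from the diff. The sketch replaces ActiveSupport's `camelize` and `constantize` with a hand-rolled conversion and `Object.const_get`, and it splats the split fields into `recordize`, which appears to be the intended call. `Tweet`, `SketchStructRecordizer`, and `SketchStructStreamer` are hypothetical names.

```ruby
# Hypothetical record class for the sketch.
Tweet = Struct.new(:id, :text)

module SketchStructRecordizer
  # Map a resource name such as "tweet-20090216" to the Tweet class.
  def self.class_from_resource(klass_name)
    name = klass_name.to_s.sub(/-.*\z/, '') # kill off all but the class name
    camelized = name.split('_').map(&:capitalize).join
    Object.const_get(camelized)
  rescue NameError
    warn "Bogus class name '#{klass_name}'"
    nil
  end

  # First field picks the class; remaining fields instantiate it.
  def self.recordize(klass_name, *fields)
    return if klass_name =~ /\A(?:bogus-|bad_record)/
    klass = class_from_resource(klass_name) or return
    [klass.new(*fields)]
  end

  # Instance-level hook: split the line, then splat the fields.
  def recordize(line)
    SketchStructRecordizer.recordize(*line.chomp.split("\t"))
  end
end

class SketchStructStreamer
  include SketchStructRecordizer
end

streamer = SketchStructStreamer.new
good     = streamer.recordize("tweet-20090216\t1\thello\n")
skipped  = streamer.recordize("bogus-thing\tx\n")
unknown  = streamer.recordize("nonesuch\tx\n") # warns on stderr, returns nil
```

Keeping the logic in a module means any streamer can opt in with `include`, which matches the mix-in arrangement the diff moves to.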