Examples
cgrep is a wrapper around cmr-grep that provides an interface familiar to users of grep.
The general form of a cgrep command is:
cgrep [-flags] "<pattern>" "<fileglob>"
Find rock in any of the files in mountain.
cgrep "rock" "/mnt/gv0/user/hive/warehouse/mountain/*"
{"name":"rock", "type":"rock"}
Find rock in any of the files in mountain and output to rocks.out.
cgrep "rock" "/mnt/gv0/user/hive/warehouse/mountain/*" > rocks.out
Additionally, ignore case:
cgrep -i "rock" "/mnt/gv0/user/hive/warehouse/mountain/*"
{"name":"rock", "type":"rock"}
{"name":"Rockman", "type":"robot"}
-i, --ignore-case ignore case distinctions
-E, --extended-regexp PATTERN is an extended regular expression (ERE)
-e, --regexp PATTERN use PATTERN for matching
-P, --perl-regexp PATTERN is a Perl regular expression
-v, --invert-match select non-matching lines
-c, --count print only a count of matching lines per FILE
-o, --only-matching show only the part of a line matching PATTERN
-n, --line-number print line number with output lines
-q, --quiet suppress all normal output
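To make the effect of a few of these flags concrete, here is a minimal Python sketch that applies them to the two sample records from the examples above. The `grep_lines` helper is hypothetical and uses Python's `re` module, which is not necessarily the regex dialect cgrep uses; it only illustrates what `-i`, `-v`, and `-c` select.

```python
import re

# Sample lines taken from the examples above.
lines = [
    '{"name":"rock", "type":"rock"}',
    '{"name":"Rockman", "type":"robot"}',
]

def grep_lines(pattern, lines, ignore_case=False, invert=False):
    """Return lines that match pattern (or, with invert, that do not match)."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    return [ln for ln in lines if bool(regex.search(ln)) != invert]

# No flags: case-sensitive match, only the first record is selected.
print(grep_lines("rock", lines))

# Like -i: both records are selected.
print(grep_lines("rock", lines, ignore_case=True))

# Like -v combined with -c: count of non-matching lines.
print(len(grep_lines("rock", lines, invert=True)))
```

The case-sensitive call selects only the `rock` record, and the ignore-case call also selects `Rockman`, which mirrors the cgrep output shown above.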
cmr-grep is the client that cgrep is built on top of. It provides mostly the same functionality through a flag-based interface, which makes it slightly more verbose.
The general form of a cmr-grep command is:
cmr-grep --input "<input>" --pattern "<pattern>" -o "<output>" [--flags "<flags>"]
Find rock in any of the files in the mountain.
cmr-grep --pattern "rock" --input "/mnt/gv0/user/hive/warehouse/mountain/pdate=2014-01-01/ptime=0/*" --stdout
{"name":"rock", "type":"rock"}
Find rock in any of the files in the mountain and output to rocks.out.
cmr-grep --pattern "rock" --input "/mnt/gv0/user/hive/warehouse/mountain/*" --stdout > rocks.out
Additionally, ignore case:
cmr-grep --flags "-i" --pattern "rock" --input "/mnt/gv0/user/hive/warehouse/mountain/*" --stdout
{"name":"rock", "type":"rock"}
{"name":"Rockman", "type":"robot"}
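The cgrep and cmr-grep examples above come in pairs, so the correspondence between the two can be sketched as a small translation function. This is only an illustration inferred from those example pairs: `to_cmr_grep` is a hypothetical helper, and the real cgrep may build its cmr-grep command differently.

```python
import shlex

def to_cmr_grep(pattern, fileglob, flags=""):
    """Build a cmr-grep argument list mirroring `cgrep [flags] pattern fileglob`.

    Assumption: grep-style flags are passed through via --flags and results
    go to standard output via --stdout, as in the examples above.
    """
    argv = ["cmr-grep"]
    if flags:
        argv += ["--flags", flags]
    argv += ["--pattern", pattern, "--input", fileglob, "--stdout"]
    return argv

# Equivalent of: cgrep -i "rock" "/mnt/gv0/user/hive/warehouse/mountain/*"
argv = to_cmr_grep("rock", "/mnt/gv0/user/hive/warehouse/mountain/*", "-i")

# shlex.join quotes the glob so the shell does not expand it.
print(shlex.join(argv))
```

Building the command as a list rather than a string keeps the pattern and glob as single arguments, whatever characters they contain.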
cget is a wrapper around cmr specifically for querying JSON-formatted time-series data. It provides a simplified interface to a subset of the functionality that cmr provides. cget requires some specific configuration in order to function correctly. See Configuration for more information.
The general form of a cget command is:
cget --select "<fields>" --from "<table>" --filter "<filters>" --between "<start-date> and <end-date>"
Find out how many bananas were purchased each day during January of 2014.
cget --select "day" --from "purchases" --filter "+type:banana" --between "2014-01-01 and 2014-02-01"
2014-01-01 38
2014-01-02 42
2014-01-03 16
2014-01-04 54
... etc ...
2014-01-31 25
Find out how many bananas or oranges were purchased on the first of January 2014.
cget --select "day,type" --from "purchases" --filter "+type:banana,+type:orange" --between "2014-01-01 and 2014-01-02"
2014-01-01 banana 38
2014-01-01 orange 22
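The query above can be read as "filter, then count per selected field". The Python sketch below shows that reading on a handful of made-up records. It rests on two assumptions inferred from the examples rather than stated by them: several `+type:` filters are combined with OR, and the end date of `--between` is exclusive. The `count_by` helper and the records are hypothetical.

```python
from collections import Counter

# Hypothetical purchase records; the real data lives in JSON files on disk.
records = [
    {"day": "2014-01-01", "type": "banana"},
    {"day": "2014-01-01", "type": "banana"},
    {"day": "2014-01-01", "type": "orange"},
    {"day": "2014-01-01", "type": "apple"},
    {"day": "2014-01-02", "type": "banana"},
]

def count_by(records, select, allowed_types, start, end):
    """Count records per tuple of `select` fields.

    Keeps records whose type is in allowed_types (assumed OR semantics)
    and whose day satisfies start <= day < end (assumed exclusive end).
    ISO dates compare correctly as plain strings.
    """
    counts = Counter()
    for rec in records:
        if rec["type"] in allowed_types and start <= rec["day"] < end:
            counts[tuple(rec[field] for field in select)] += 1
    return counts

result = count_by(
    records, ["day", "type"], {"banana", "orange"}, "2014-01-01", "2014-01-02"
)
for key, n in sorted(result.items()):
    print(*key, n)
```

With these sample records the sketch prints one line per day and type, in the same column layout as the cget output above.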
cmr is the client that cget is built on top of. It provides an interface for performing map-reduce jobs. cmr is packaged along with a general JSON mapper and a reducer, which are used to implement cget.
The general form of a cmr command is:
cmr --input "<input>" --mapper "<mapper>" --reducer "<reducer>" -o "<output>"
Find out how many bananas were purchased each day during January of 2014.
cmr --input "/mnt/gv0/user/hive/warehouse/purchases/day=2014-01-*/*" --mapper "cmr_map_json day _1 +type:banana" --reducer "cmr_reduce s" --stdout
2014-01-01 38
2014-01-02 42
2014-01-03 16
2014-01-04 54
... etc ...
2014-01-31 25
Find out how many bananas or oranges were purchased on the first of January 2014.
cmr --input "/mnt/gv0/user/hive/warehouse/purchases/day=2014-01-01/*" --mapper "cmr_map_json day _1 +type:banana +type:orange" --reducer "cmr_reduce s" --stdout
2014-01-01 banana 38
2014-01-01 orange 22
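For readers new to map-reduce, the counting jobs above follow a common shape: the mapper emits a key and a count of 1 for each matching record, the pairs are sorted by key, and the reducer sums the counts per key. The Python sketch below shows that shape only. It is not the implementation of `cmr_map_json` or `cmr_reduce`, the meaning of their arguments is not assumed here, and the input lines are made up.

```python
import json
from itertools import groupby

# Hypothetical input lines, one JSON record per line.
input_lines = [
    '{"day": "2014-01-02", "type": "banana"}',
    '{"day": "2014-01-01", "type": "banana"}',
    '{"day": "2014-01-01", "type": "orange"}',
    '{"day": "2014-01-01", "type": "banana"}',
]

def mapper(line, wanted_type="banana"):
    """Yield (day, 1) for each record of the wanted type."""
    rec = json.loads(line)
    if rec.get("type") == wanted_type:
        yield rec["day"], 1

def reducer(pairs):
    """Sum the values for each key; pairs must already be sorted by key."""
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(value for _, value in group)

# Map every line, sort by key (the "shuffle" step), then reduce.
mapped = sorted(kv for line in input_lines for kv in mapper(line))
totals = list(reducer(mapped))
for day, total in totals:
    print(day, total)
```

The sort between the two steps matters: `groupby` only merges adjacent pairs, so unsorted input would produce several partial totals for the same day.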