Simple Byte Array Tools for Pig

This branch is 1 commit ahead of jkebinger:master

PigBAT

A set of user-defined functions (UDFs) for Pig that decode shorts, ints, and longs from byte arrays.

Requires hbase.jar on the classpath, because these UDFs pass through to org.apache.hadoop.hbase.util.Bytes.
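
As a rough illustration of what this decoding amounts to: HBase's `Bytes` utility reads integers in big-endian order, starting at a given byte offset. The sketch below reproduces that behaviour with only `java.nio.ByteBuffer`, so it runs without hbase.jar. It is not PigBAT's implementation; the class name `ByteDecodeSketch` and the row key layout (a 4-byte id followed by an 8-byte timestamp) are hypothetical and chosen only for the example.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of PigBAT: decodes big-endian numbers
// from a byte array at a given offset, as HBase's Bytes utility does.
final class ByteDecodeSketch {

    // Reads 2 bytes starting at offset as a big-endian short.
    static short toShort(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes, offset, Short.BYTES).getShort();
    }

    // Reads 4 bytes starting at offset as a big-endian int.
    static int toInt(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes, offset, Integer.BYTES).getInt();
    }

    // Reads 8 bytes starting at offset as a big-endian long.
    static long toLong(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes, offset, Long.BYTES).getLong();
    }

    public static void main(String[] args) {
        // Assumed row key layout for this example: int id, then long timestamp.
        byte[] rowKey = ByteBuffer.allocate(Integer.BYTES + Long.BYTES)
                .putInt(1234)
                .putLong(99L)
                .array();

        System.out.println("id = " + toInt(rowKey, 0));
        System.out.println("timestamp = " + toLong(rowKey, Integer.BYTES));
    }
}
```

The offset argument is what lets a single row key carry several fields: each field is read from the position where the previous one ends.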

The primary use case is decoding information encoded in an HBase row key, as below:

register 'hbase-0.90.4-cdh3u3.jar';
register 'zookeeper.jar';
register 'PigBAT-0.0.1-SNAPSHOT.jar';

counts_raw = LOAD 'hbase://some-table' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('0:cd', '-loadKey true')  AS (key:bytearray, data:bytearray);
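-- extract an int from the start of the row key (second argument 0, presumably the byte offset)
expanded = FOREACH counts_raw GENERATE com.kebinger.pigbat.BYTES_TO_INT(key, 0) AS some_id, data;
-- count rows per decoded id (Pig built-in function names are case-sensitive, hence COUNT)
grouped = GROUP expanded BY some_id;
counted = FOREACH grouped GENERATE group, COUNT(expanded);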