When you visit your Hive, bring your smoker and you'll get stung less

jashmenn/smoker

SMOKER

Custom Hive UDFs in Clojure

Usage

To compile the jar yourself and copy it to your server:

   # wherever you have your code
   lein compile
   lein uberjar
   scp build/smoker-1.0.0-SNAPSHOT-standalone.jar myserver:~/hive-jars/smoker-standalone.jar

Then use it within Hive:

   # on your server, start hive with auxpath
   hive --auxpath /home/nmurray/hive-jars

   -- tell hive about your jars (possibly optional)
   add jar /home/nmurray/hive-jars/smoker-standalone.jar;
   list jars;

   -- create your operations
   create temporary function my_lower as 'smoker.udf.MyLowerCase';
   select my_lower(my_column) from my_table where ds=20110101 limit 10;

List of Operations

Lower-case. The "hello-world" of UDFs

   create temporary function my_lower as 'smoker.udf.MyLowerCase';
   select my_lower(my_column) from my_table where ds=20110101 limit 10;

Tokenize. The "hello-cruel-world" of UDTFs. A UDF emits a single record for each input record; a UDTF can emit multiple records for a single input record.

   create temporary function tokenize as 'smoker.udf.MyTokenize';
   select tokenize(my_column) AS (word, count) from my_table where ds=20110101 limit 10;
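
The difference between the two kinds of function can be sketched outside Hive. The following is a plain-Python model, not the Clojure code in this repository: `my_lower` and `tokenize` here are hypothetical stand-ins, and the tokenizing rule (split on whitespace, count each word within one input value) is an assumption chosen only to match the `(word, count)` column pair in the query above.

```python
from collections import Counter
from typing import Iterator, Optional, Tuple


def my_lower(value: Optional[str]) -> Optional[str]:
    """UDF-style: one input value in, exactly one output value out."""
    return None if value is None else value.lower()


def tokenize(value: Optional[str]) -> Iterator[Tuple[str, int]]:
    """UDTF-style: one input value in, zero or more (word, count) rows out.

    Assumption: words are whitespace-separated and counted per input value.
    """
    if value is None:
        return  # a NULL input produces no output rows
    for word, count in Counter(value.split()).items():
        yield word, count


rows = ["Hello cruel world", "bee bee hive"]

# The UDF keeps the row count unchanged: two rows in, two rows out.
lowered = [my_lower(r) for r in rows]

# The UDTF can change the row count: two rows in, five rows out.
exploded = [pair for r in rows for pair in tokenize(r)]
```

Modelling the UDTF as a generator mirrors the idea that it forwards rows one at a time instead of returning a single value, which is why its output needs column aliases such as `AS (word, count)` in the query.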

Authors

References
