HBase Storage and PIG #57

Open
v5tech opened this issue Jan 16, 2015 · 0 comments

v5tech commented Jan 16, 2015

HBase Storage and PIG

  • Getting Started
export HBASE_HOME=/opt/hbase
export PIG_CLASSPATH="`${HBASE_HOME}/bin/hbase classpath`:$PIG_CLASSPATH"
export HADOOP_CLASSPATH="`${HBASE_HOME}/bin/hbase classpath`:$HADOOP_CLASSPATH"
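
The exports above assume that ${HBASE_HOME}/bin/hbase exists. If it does not, the backticks expand to an empty string and the problem may only surface later, when Pig cannot find the HBase classes. Below is a minimal sketch of a guarded version; the function name use_hbase_classpath is hypothetical and not part of the original post.

```shell
# Hypothetical helper: export Pig and Hadoop classpaths from an HBase install.
# Returns 1 (and exports nothing) if the hbase launcher is missing.
use_hbase_classpath() {
  uhc_home="${1:-/opt/hbase}"
  if [ ! -x "${uhc_home}/bin/hbase" ]; then
    echo "hbase launcher not found under ${uhc_home}/bin" >&2
    return 1
  fi
  # Ask HBase for its own classpath once and reuse the result.
  uhc_cp="$("${uhc_home}/bin/hbase" classpath)" || return 1
  export HBASE_HOME="${uhc_home}"
  # Append the existing value only when it is non-empty, to avoid a trailing ':'.
  export PIG_CLASSPATH="${uhc_cp}${PIG_CLASSPATH:+:${PIG_CLASSPATH}}"
  export HADOOP_CLASSPATH="${uhc_cp}${HADOOP_CLASSPATH:+:${HADOOP_CLASSPATH}}"
}
```

Source the file and call use_hbase_classpath /opt/hbase before running pig.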
  • Hello World

This example uses a simple Pig script to load data from a file and write it to an HBase table.

To begin, use the shell to create your table:

jhoover@jhoover2:~$ hbase shell
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 0.90.3-cdh3u1, r, Mon Jul 18 08:23:50 PDT 2011

hbase(main):002:0> create 'sample_names', 'info'
0 row(s) in 0.5580 seconds

Next, we'll put some simple data in a file 'sample_data.csv':

1, John, Smith
2, Jane, Doe
3, George, Washington
4, Ben, Franklin
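
The scan output further down shows each name stored with a leading space (value= John), because PigStorage(',') splits on the comma alone and keeps the space that follows it. One way to avoid that, sketched here, is to normalize the file before loading it. The sed expression and the sample_data.clean.csv filename are assumptions for illustration, not part of the original post.

```shell
# Write the four sample rows exactly as shown above (note the space after each comma).
cat > sample_data.csv <<'EOF'
1, John, Smith
2, Jane, Doe
3, George, Washington
4, Ben, Franklin
EOF

# Remove any whitespace that follows a comma, so a comma-delimited load
# yields "John" rather than " John".
sed 's/,[[:space:]]*/,/g' sample_data.csv > sample_data.clean.csv

cat sample_data.clean.csv
```

To use the cleaned data, point the LOAD statement at sample_data.clean.csv instead of sample_data.csv.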

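Then we'll write a simple script, saved as hbase_sample.pig, to extract this data and write it into fixed columns in HBase. HBaseStorage uses the first field of each tuple (listing_id here) as the row key, so only the remaining two fields need column names: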

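-- Load the comma-delimited file; every field is read as a string.
raw_data = LOAD 'sample_data.csv' USING PigStorage(',') AS (
    listing_id: chararray,
    fname: chararray,
    lname: chararray );

-- listing_id becomes the row key; fname and lname go to info:fname and info:lname.
STORE raw_data INTO 'hbase://sample_names' USING
    org.apache.pig.backend.hadoop.hbase.HBaseStorage(
    'info:fname info:lname');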

Then run the Pig script in local mode:

jhoover@jhoover2:~/hbase_sample$ pig -x local hbase_sample.pig
…
Success!

Job Stats (time in seconds):
JobId   Alias   Feature Outputs
job_local_0001  raw_data    MAP_ONLY    hbase://sample_names,

Input(s):
Successfully read records from: "file:///autohome/jhoover/hbase_sample/sample_data.csv"

Output(s):
Successfully stored records in: "hbase://sample_names"

Job DAG:
job_local_0001

You can then see the results of your script in the HBase shell:

hbase(main):001:0> scan 'sample_names'
ROW COLUMN+CELL
1 column=info:fname, timestamp=1356134399789, value= John
1 column=info:lname, timestamp=1356134399789, value= Smith
2 column=info:fname, timestamp=1356134399789, value= Jane
2 column=info:lname, timestamp=1356134399789, value= Doe
3 column=info:fname, timestamp=1356134399789, value= George
3 column=info:lname, timestamp=1356134399789, value= Washington
4 column=info:fname, timestamp=1356134399789, value= Ben
4 column=info:lname, timestamp=1356134399789, value= Franklin
4 row(s) in 0.4850 seconds
  • Sample Code

The sample code can be downloaded from the original blog post, linked below.

  • Source

http://blog.whitepages.com/2011/10/27/hbase-storage-and-pig/
