# About: Simple HBase query for Test

---

Verify that HBase queries work.

## *Operation Note*

*This is a cell for your own recording. Describe the background of this operation here.*

# Binding the Notebook to the Environment

Specify the bind target by its group name in the Ansible inventory.

In [1]:
target_group = 'hadoop_all_testcluster'

Check that the commands work. First, obtain the master node.

In [6]:
hosts_stdout = !ansible {target_group} -b -a 'cat /etc/hosts'
hosts_stdout = filter(lambda l: not l.strip().endswith('>>'), hosts_stdout)
hosts_stdout = map(lambda l: l.split(), hosts_stdout)
hosts_stdout = filter(lambda l: len(l) == 2, hosts_stdout)
machines = dict(map(lambda l: (l[1], l[0]), hosts_stdout))
zknode_stdout = !ansible -m ping -l {target_group} hadoop_zookeeperserver
zknodes = sorted([l.split()[0] for l in zknode_stdout if 'SUCCESS' in l])

from kazoo.client import KazooClient
zk = KazooClient(hosts='%s:2181' % zknodes[0], read_only=True)
zk.start()
(master_result,v) = zk.get("/hbase/master")
zk.stop()
for host, ip in machines.items():
    if host in master_result:
        hbase_master_host = ip
hbase_master_host

'XXX.XXX.XXX.70'
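
The lookup above can be sketched as two plain functions that run without a cluster. This is a minimal sketch, not the notebook's own code: the names `parse_hosts` and `find_master_ip`, the sample addresses, and the sample znode payload are hypothetical. It assumes, as the cell does, that the `/hbase/master` payload contains the master's hostname somewhere in its bytes.

```python
def parse_hosts(lines):
    """Map hostname -> IP from /etc/hosts-style lines that have exactly two fields."""
    machines = {}
    for line in lines:
        # Skip ansible header lines such as "host | SUCCESS | rc=0 >>".
        if line.strip().endswith('>>'):
            continue
        fields = line.split()
        if len(fields) == 2:
            ip, hostname = fields
            machines[hostname] = ip
    return machines


def find_master_ip(machines, master_znode):
    """Return the IP of the first host whose name occurs in the znode payload."""
    if isinstance(master_znode, bytes):
        # kazoo returns bytes on Python 3; decode leniently for a substring search.
        master_znode = master_znode.decode('utf-8', errors='replace')
    for hostname, ip in sorted(machines.items()):
        if hostname in master_znode:
            return ip
    return None


# Hypothetical sample data (documentation-range addresses, made-up payload).
sample = [
    'node1 | SUCCESS | rc=0 >>',
    '127.0.0.1 localhost localhost.localdomain',
    '192.0.2.10 master1',
    '192.0.2.11 worker1',
]
machines = parse_hosts(sample)
master_ip = find_master_ip(machines, b'\x00\x00master1\x00junk')
print(master_ip)
```

Substring matching is fragile when one hostname is a prefix of another (for example `master1` and `master10`), so a stricter comparison may be preferable on larger clusters.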

In [7]:
!ansible -a 'hbase' {hbase_master_host} -l {target_group}

XXX.XXX.XXX.70 | FAILED | rc=1 >>
Usage: hbase [<options>] <command> [<args>]
Options:
  --config DIR    Configuration direction to use. Default: ./conf
  --hosts HOSTS   Override the list in 'regionservers' file
  --auth-as-server Authenticate to ZooKeeper using servers configuration

Commands:
Some commands take arguments. Pass no args or -h for usage.
  shell           Run the HBase shell
  hbck            Run the hbase 'fsck' tool
  snapshot        Create a new snapshot of a table
  wal             Write-ahead-log analyzer
  hfile           Store file analyzer
  zkcli           Run the ZooKeeper shell
  upgrade         Upgrade hbase
  master          Run an HBase HMaster node
  regionserver    Run an HBase HRegionServer node
  zookeeper       Run a Zookeeper server
  rest            Run an HBase REST server
  thrift          Run the HBase Thrift server
  thrift2         Run the HBase Thrift2 server
  clean           Run the HBase clean up script
  clas

# Preparing the Dataset

In [8]:
import tempfile
work_dir = tempfile.mkdtemp()
work_dir

'/tmp/tmpszvmJY'

In [9]:
%env WORK_DIR={work_dir}

env: WORK_DIR=/tmp/tmpszvmJY


Any dataset will do for this test, so we load the Iris Data Set. [Iris Data Set - UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/Iris)

In [10]:
%%bash
cd ${WORK_DIR}
curl -O https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  4551  100  4551    0     0   8270      0 --:--:-- --:--:-- --:--:--  8274


In [11]:
!head {work_dir}/iris.data

5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa


# Running HBase Queries

Run an example that puts the Iris data into HBase and then reads a row back with `get`.

## Creating the Table

In [20]:
%%writefile {work_dir}/hbase.query
create 'iris', 'data'
list 'iris'

Overwriting /tmp/tmpszvmJY/hbase.query


In [21]:
!ansible -m copy -a 'src={work_dir}/hbase.query dest={hadoop_client_dir}/' {hbase_master_host} -l {target_group}
!ansible -m shell -a 'cat {hadoop_client_dir}/hbase.query | hbase shell' {hbase_master_host} -l {target_group}

XXX.XXX.XXX.70 | SUCCESS => {
    "changed": true, 
    "checksum": "ab69b5501a059b3bc3490d0634f58c66720723b8", 
    "dest": "/home/ansible/iris/hbase.query", 
    "gid": 500, 
    "group": "ansible", 
    "md5sum": "a5c2777fce421dd2618d5d1a43dff824", 
    "mode": "0664", 
    "owner": "ansible", 
    "size": 33, 
    "src": "/home/ansible/.ansible/tmp/ansible-tmp-1472094845.62-219286607431707/source", 
    "state": "file", 
    "uid": 500
}
XXX.XXX.XXX.70 | SUCCESS | rc=0 >>
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version XXX.XXX.XXX.2.4.2.0-258, rUnknown, Mon Apr 25 06:36:21 UTC 2016

create 'iris', 'data'
0 row(s) in 5.1530 seconds

Hbase::Table - iris
list 'iris'
TABLE
iris
1 row(s) in 0.0150 seconds

["iris"]
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/XXX.XXX.XXX.0-258/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class
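
The same copy-then-pipe pair of commands is repeated for every query in this notebook. A minimal sketch of a helper that builds both argument lists is shown here. The function name `build_query_commands` is hypothetical, and the remote directory `/home/ansible/iris` is an assumption taken from the `dest` field in the output above (the cell that defines `hadoop_client_dir` is not shown). The sketch only constructs the commands and does not run them.

```python
import os
import shlex


def build_query_commands(query_path, remote_dir, host, group):
    """Return two ansible argv lists: copy the query file, then pipe it to hbase shell."""
    remote_dir = remote_dir.rstrip('/')
    remote_file = '{}/{}'.format(remote_dir, os.path.basename(query_path))
    copy_cmd = ['ansible', '-m', 'copy', '-a',
                'src={} dest={}/'.format(query_path, remote_dir),
                host, '-l', group]
    shell_cmd = ['ansible', '-m', 'shell', '-a',
                 'cat {} | hbase shell'.format(shlex.quote(remote_file)),
                 host, '-l', group]
    return copy_cmd, shell_cmd


copy_cmd, shell_cmd = build_query_commands(
    '/tmp/tmpszvmJY/hbase.query',   # local query file written by %%writefile
    '/home/ansible/iris',           # assumed value of hadoop_client_dir
    'XXX.XXX.XXX.70',               # masked master host, as in the notebook
    'hadoop_all_testcluster')
print(' '.join(shlex.quote(arg) for arg in copy_cmd))
print(' '.join(shlex.quote(arg) for arg in shell_cmd))
```

Each list could then be passed to `subprocess.check_call`, which avoids a second layer of shell quoting on the notebook side.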

## Putting the Data

In [30]:
import os
import csv

with open(os.path.join(work_dir, 'hbase.query'), 'w') as qf:
    with open(os.path.join(work_dir, 'iris.data'), 'r') as f:
        for index, row in enumerate(filter(lambda l: len(l) == 5, csv.reader(f))):
            for value, colname in zip(row, ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class']):
                qf.write("put 'iris', 'iris-{}', 'data:{}', '{}'\n".format(index, colname, value))

!head {work_dir}/hbase.query

put 'iris', 'iris-0', 'data:sepal_length', '5.1'
put 'iris', 'iris-0', 'data:sepal_width', '3.5'
put 'iris', 'iris-0', 'data:petal_length', '1.4'
put 'iris', 'iris-0', 'data:petal_width', '0.2'
put 'iris', 'iris-0', 'data:class', 'Iris-setosa'
put 'iris', 'iris-1', 'data:sepal_length', '4.9'
put 'iris', 'iris-1', 'data:sepal_width', '3.0'
put 'iris', 'iris-1', 'data:petal_length', '1.4'
put 'iris', 'iris-1', 'data:petal_width', '0.2'
put 'iris', 'iris-1', 'data:class', 'Iris-setosa'
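
The generator cell above interpolates values directly into single-quoted strings. That is fine for the Iris data, but a value containing a quote would break the script. A minimal sketch of the same generation with escaping follows. The names `quote_hbase` and `build_puts` are hypothetical, and the sketch relies on the HBase shell being a JRuby shell, where only `\\` and `\'` are escapes inside single-quoted strings.

```python
import csv
import io

COLUMNS = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class']


def quote_hbase(value):
    """Wrap a value in single quotes, escaping backslashes and single quotes (Ruby rules)."""
    return "'" + value.replace('\\', '\\\\').replace("'", "\\'") + "'"


def build_puts(csv_text, table='iris', family='data', row_prefix='iris-'):
    """Yield one put command per cell, skipping rows that do not have five fields."""
    rows = (r for r in csv.reader(io.StringIO(csv_text)) if len(r) == len(COLUMNS))
    for index, row in enumerate(rows):
        for colname, value in zip(COLUMNS, row):
            yield 'put {}, {}, {}, {}'.format(
                quote_hbase(table),
                quote_hbase('{}{}'.format(row_prefix, index)),
                quote_hbase('{}:{}'.format(family, colname)),
                quote_hbase(value))


# Two data rows followed by a blank line, like the tail of iris.data.
sample = "5.1,3.5,1.4,0.2,Iris-setosa\n4.9,3.0,1.4,0.2,Iris-setosa\n\n"
commands = list(build_puts(sample))
print('\n'.join(commands))
```

Filtering on the field count mirrors the `len(l) == 5` check in the cell above, which drops the blank line at the end of the downloaded file.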


In [31]:
!ansible -m copy -a 'src={work_dir}/hbase.query dest={hadoop_client_dir}/' {hbase_master_host} -l {target_group}
!ansible -m shell -a 'cat {hadoop_client_dir}/hbase.query | hbase shell' {hbase_master_host} -l {target_group}

XXX.XXX.XXX.70 | SUCCESS => {
    "changed": true, 
    "checksum": "292b173a06f844388da2d70b30ecf09c8fee31aa", 
    "dest": "/home/ansible/iris/hbase.query", 
    "gid": 500, 
    "group": "ansible", 
    "md5sum": "9323ca568375ecb8ec3463d6491557c5", 
    "mode": "0664", 
    "owner": "ansible", 
    "size": 37900, 
    "src": "/home/ansible/.ansible/tmp/ansible-tmp-1472095648.93-223891743750175/source", 
    "state": "file", 
    "uid": 500
}
XXX.XXX.XXX.70 | SUCCESS | rc=0 >>
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version XXX.XXX.XXX.2.4.2.0-258, rUnknown, Mon Apr 25 06:36:21 UTC 2016

put 'iris', 'iris-0', 'data:sepal_length', '5.1'
0 row(s) in 0.3790 seconds

put 'iris', 'iris-0', 'data:sepal_width', '3.5'
0 row(s) in 0.0120 seconds

put 'iris', 'iris-0', 'data:petal_length', '1.4'
0 row(s) in 0.0090 seconds

put 'iris', 'iris-0', 'data:petal_width', '0.2'
0 row(s) in 0.0160 seconds

put 'iri

In [32]:
%%writefile {work_dir}/hbase.query
get 'iris', 'iris-16'

Overwriting /tmp/tmpszvmJY/hbase.query


In [33]:
!ansible -m copy -a 'src={work_dir}/hbase.query dest={hadoop_client_dir}/' {hbase_master_host} -l {target_group}
!ansible -m shell -a 'cat {hadoop_client_dir}/hbase.query | hbase shell' {hbase_master_host} -l {target_group}

XXX.XXX.XXX.70 | SUCCESS => {
    "changed": true, 
    "checksum": "d05ce041d53cfd5eb2d09f65c81735924bd4e135", 
    "dest": "/home/ansible/iris/hbase.query", 
    "gid": 500, 
    "group": "ansible", 
    "md5sum": "e0a827322c19c4e95ee8831d3d33ec0a", 
    "mode": "0664", 
    "owner": "ansible", 
    "size": 21, 
    "src": "/home/ansible/.ansible/tmp/ansible-tmp-1472095718.03-33518397094454/source", 
    "state": "file", 
    "uid": 500
}
XXX.XXX.XXX.70 | SUCCESS | rc=0 >>
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version XXX.XXX.XXX.2.4.2.0-258, rUnknown, Mon Apr 25 06:36:21 UTC 2016

get 'iris', 'iris-16'
COLUMN  CELL
 data:class timestamp=1472095664485, value=Iris-setosa
 data:petal_length timestamp=1472095664401, value=1.3
 data:petal_width timestamp=1472095664464, value=0.4
 data:sepal_length timestamp=1472095664371, value=5.4
 data:sepal_width timestamp=1472095664386, value=3.9
5 row(s) in 0.3
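
To check the result programmatically rather than by eye, the cell lines printed by `get` can be parsed into a dictionary. This is a minimal sketch that assumes the output layout shown above (`column timestamp=..., value=...`). The name `parse_get_output` and the regular expression are hypothetical, not part of HBase.

```python
import re

# One cell per line: " data:class timestamp=1472095664485, value=Iris-setosa"
CELL_RE = re.compile(r'^\s*(?P<column>\S+)\s+timestamp=(?P<ts>\d+),\s+value=(?P<value>.*)$')


def parse_get_output(lines):
    """Return {column: (timestamp, value)} for every line that looks like a cell."""
    cells = {}
    for line in lines:
        match = CELL_RE.match(line)
        if match:
            cells[match.group('column')] = (int(match.group('ts')), match.group('value'))
    return cells


# Lines copied from the output above.
sample = [
    "get 'iris', 'iris-16'",
    "COLUMN  CELL",
    " data:class timestamp=1472095664485, value=Iris-setosa",
    " data:petal_length timestamp=1472095664401, value=1.3",
    " data:petal_width timestamp=1472095664464, value=0.4",
    " data:sepal_length timestamp=1472095664371, value=5.4",
    " data:sepal_width timestamp=1472095664386, value=3.9",
]
cells = parse_get_output(sample)
print(cells['data:class'])
```

The parsed values can then be compared against the corresponding row of `iris.data` to confirm that the put round-tripped correctly.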

## Deleting the Table

In [34]:
%%writefile {work_dir}/hbase.query
disable 'iris'

Overwriting /tmp/tmpszvmJY/hbase.query


In [35]:
!ansible -m copy -a 'src={work_dir}/hbase.query dest={hadoop_client_dir}/' {hbase_master_host} -l {target_group}
!ansible -m shell -a 'cat {hadoop_client_dir}/hbase.query | hbase shell' {hbase_master_host} -l {target_group}

XXX.XXX.XXX.70 | SUCCESS => {
    "changed": true, 
    "checksum": "d9d79b3c44972cb33ec37aedec63e388f88d79bc", 
    "dest": "/home/ansible/iris/hbase.query", 
    "gid": 500, 
    "group": "ansible", 
    "md5sum": "85570a9c24629c937f248bb6ff1e8b5e", 
    "mode": "0664", 
    "owner": "ansible", 
    "size": 14, 
    "src": "/home/ansible/.ansible/tmp/ansible-tmp-1472095808.43-50685302333311/source", 
    "state": "file", 
    "uid": 500
}
XXX.XXX.XXX.70 | SUCCESS | rc=0 >>
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version XXX.XXX.XXX.2.4.2.0-258, rUnknown, Mon Apr 25 06:36:21 UTC 2016

disable 'iris'
0 row(s) in 3.1060 seconds
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/XXX.XXX.XXX.0-258/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/XXX.XXX.XXX.0-258/zookeeper/lib/slf4j-log4j1

In [36]:
%%writefile {work_dir}/hbase.query
drop 'iris'

Overwriting /tmp/tmpszvmJY/hbase.query


In [37]:
!ansible -m copy -a 'src={work_dir}/hbase.query dest={hadoop_client_dir}/' {hbase_master_host} -l {target_group}
!ansible -m shell -a 'cat {hadoop_client_dir}/hbase.query | hbase shell' {hbase_master_host} -l {target_group}

XXX.XXX.XXX.70 | SUCCESS => {
    "changed": true, 
    "checksum": "9816d0c0fd3f5cbd396cd46e43678d5995e678fb", 
    "dest": "/home/ansible/iris/hbase.query", 
    "gid": 500, 
    "group": "ansible", 
    "md5sum": "90b5f755bf6351bd82a48e6b85a0edd8", 
    "mode": "0664", 
    "owner": "ansible", 
    "size": 11, 
    "src": "/home/ansible/.ansible/tmp/ansible-tmp-1472095837.53-169767998480121/source", 
    "state": "file", 
    "uid": 500
}
XXX.XXX.XXX.70 | SUCCESS | rc=0 >>
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version XXX.XXX.XXX.2.4.2.0-258, rUnknown, Mon Apr 25 06:36:21 UTC 2016

drop 'iris'
0 row(s) in 2.7170 seconds
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/XXX.XXX.XXX.0-258/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/XXX.XXX.XXX.0-258/zookeeper/lib/slf4j-log4j12-
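
HBase requires a table to be disabled before it can be dropped, which is why the two steps above run in that order. A minimal sketch that produces both commands as a single query file is shown here. The name `build_drop_script` is hypothetical, and the table-name pattern is a deliberately conservative assumption rather than HBase's exact naming rule.

```python
import re

# Conservative assumption: letters, digits, '_', '.', '-', not starting with '.' or '-'.
TABLE_NAME_RE = re.compile(r'^[A-Za-z0-9_][A-Za-z0-9_.-]*$')


def build_drop_script(table):
    """Return HBase shell text that disables and then drops the given table."""
    if not TABLE_NAME_RE.match(table):
        raise ValueError('unexpected table name: {!r}'.format(table))
    return "disable '{0}'\ndrop '{0}'\n".format(table)


script = build_drop_script('iris')
print(script, end='')
```

Writing this text to `hbase.query` would replace the two separate `%%writefile` cells with one round trip to the master host.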

# Cleanup

Delete the temporary directory.

In [38]:
!rm -fr {work_dir}
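
The same cleanup can be done in pure Python, so the directory is removed even if an earlier step raises. This is a minimal sketch under that assumption: it creates its own throwaway directory and writes a placeholder query file, and the variable names `existed` and `removed` are hypothetical. It does not touch the copy of `hbase.query` on the remote host.

```python
import os
import shutil
import tempfile

work_dir = tempfile.mkdtemp()
try:
    # Stand-in for the notebook's work: write a query file into the directory.
    with open(os.path.join(work_dir, 'hbase.query'), 'w') as qf:
        qf.write("list 'iris'\n")
    existed = os.path.isdir(work_dir)
finally:
    # Equivalent of `rm -fr`: remove the directory tree and ignore errors.
    shutil.rmtree(work_dir, ignore_errors=True)

removed = not os.path.exists(work_dir)
print(existed, removed)
```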