# Working with HBase from Python

This walkthrough uses the happybase package to work with HBase from Python.

happybase documentation: http://happybase.readthedocs.io/en/latest/user.html

## Step 1: Install happybase

Install:

    pip install happybase

Verify:

    python -c "import happybase"

If the command prints no error, the installation succeeded.

If you see the error "thriftPy does not support generating module with path in protocol 'c'", see:
http://blog.csdn.net/sinolover/article/details/77714648
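
The same check can be done from inside a script instead of the shell. The sketch below uses only the standard library; the helper name `is_installed` is made up for illustration and is not part of happybase.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `module_name` can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# Reports whether happybase is importable in the current environment
print("happybase installed:", is_installed("happybase"))
```

This only confirms that the package can be located; it does not prove that a Thrift server is reachable.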


## Step 2: Use happybase

In [1]:
import happybase

1 Establish a connection

In [2]:
hbase_host = '192.168.1.254'

1.1 Plain connection
If a connection sits idle for 1 minute, the HBase server closes it automatically.

In [5]:
# The second argument is the Thrift server port (12345 in this setup)
conn = happybase.Connection(hbase_host, 12345)

In [6]:
# List the tables to verify that the connection works
print(conn.tables())

[b'myproject_mytable', b'mytable', b'test']
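
Because the server drops idle connections, a long-running script can end up holding a dead connection. One possible way to cope is sketched below: run the action and, if a connection-level error occurs, build a fresh connection and try again. The helper `run_with_reconnect` and its parameters are hypothetical, not part of happybase. Which exception types actually surface depends on the Thrift library in use, so the default `(OSError,)` is an assumption and you may need to add the Thrift transport exception class.

```python
def run_with_reconnect(connect, action, retry_on=(OSError,), retries=1):
    """Run action(conn); on a connection error, reconnect and try again.

    connect  -- zero-argument callable returning a fresh connection
    action   -- callable taking the connection and returning a result
    retry_on -- exception types treated as "the connection was dropped"
    """
    conn = connect()
    for attempt in range(retries + 1):
        try:
            return action(conn)
        except retry_on:
            if attempt == retries:
                raise  # out of retries: surface the original error
            conn = connect()  # the old connection is assumed to be dead
```

With happybase this might be called as `run_with_reconnect(lambda: happybase.Connection(hbase_host, 12345), lambda c: c.tables())`.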


1.2 Connection pool

In [10]:
pool = happybase.ConnectionPool(size=3, host=hbase_host)
with pool.connection() as conn:
    print(conn.tables())

[]
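
A pool is mainly useful when several threads need HBase access at once: each `with pool.connection()` block borrows a connection and hands it back on exit. The sketch below shows that pattern with the standard library's `ThreadPoolExecutor`. The function names are made up for illustration, and the code assumes `pool` is an already constructed `happybase.ConnectionPool` (or any object with a compatible `connection()` context manager).

```python
from concurrent.futures import ThreadPoolExecutor


def list_tables(pool):
    """Borrow a connection from the pool, list tables, then return it."""
    with pool.connection() as conn:
        return conn.tables()


def list_tables_concurrently(pool, n_workers=3):
    """Call list_tables from several threads that share one pool."""
    with ThreadPoolExecutor(max_workers=n_workers) as executor:
        futures = [executor.submit(list_tables, pool) for _ in range(n_workers)]
        return [future.result() for future in futures]
```

Keeping `n_workers` at or below the pool's `size` avoids threads waiting for a free connection.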


2 Working with tables
  Create a table

In [11]:
pool = happybase.ConnectionPool(size=3, host=hbase_host)
with pool.connection() as conn:
    print(conn.tables())
    conn.create_table(
        'mytable',
        {
            'cf1': dict(max_versions=10),
            'cf2': dict(max_versions=1, block_cache_enabled=False),
            'cf3': dict(),  # use the default column family options
        },
    )
    print(conn.tables())

[]


IOError: IOError(message=b'org.apache.hadoop.hbase.TableExistsException: mytable\n\tat sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)\n\tat sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)\n\tat sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)\n\tat java.lang.reflect.Constructor.newInstance(Constructor.java:423)\n\tat org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)\n\tat org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)\n\tat org.apache.hadoop.hbase.util.ForeignExceptionUtil.toIOException(ForeignExceptionUtil.java:45)\n\tat org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.convertResult(HBaseAdmin.java:4621)\n\tat org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.waitProcedureResult(HBaseAdmin.java:4579)\n\tat org.apache.hadoop.hbase.client.HBaseAdmin$ProcedureFuture.get(HBaseAdmin.java:4512)\n\tat org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:677)\n\tat org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:607)\n\tat org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.createTable(ThriftServerRunner.java:1208)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:498)\n\tat org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.invoke(HbaseHandlerMetricsProxy.java:67)\n\tat com.sun.proxy.$Proxy9.createTable(Unknown Source)\n\tat org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$createTable.getResult(Hbase.java:4022)\n\tat org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$createTable.getResult(Hbase.java:4006)\n\tat 
org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)\n\tat org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)\n\tat org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)\n\tat org.apache.hadoop.hbase.thrift.CallQueue$Call.run(CallQueue.java:64)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)\n\tat java.lang.Thread.run(Thread.java:745)\nCaused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.TableExistsException): mytable\n\tat org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.prepareCreate(CreateTableProcedure.java:299)\n\tat org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:106)\n\tat org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.executeFromState(CreateTableProcedure.java:58)\n\tat org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:119)\n\tat org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:498)\n\tat org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1147)\n\tat org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:942)\n\tat org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:895)\n\tat org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$400(ProcedureExecutor.java:77)\n\tat org.apache.hadoop.hbase.procedure2.ProcedureExecutor$2.run(ProcedureExecutor.java:497)\n')
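
The `TableExistsException` above is raised because a table called `mytable` already exists on the server. A minimal sketch of one way to guard against it is to look at `conn.tables()` first. The helper name `create_table_if_missing` is hypothetical, and the check is not atomic: another client could still create the table between the check and the `create_table` call.

```python
def create_table_if_missing(conn, name, families):
    """Create table `name` unless conn.tables() already lists it.

    Returns True if the table was created, False if it already existed.
    """
    # Under Python 3, conn.tables() returns the names as bytes
    existing = conn.tables()
    wanted = name.encode() if isinstance(name, str) else name
    if wanted in existing:
        return False
    conn.create_table(name, families)
    return True
```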

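When several users create tables that happen to share a name, the names collide. A namespace avoids the collision: pass the namespace as table_prefix when connecting, and tables are then created under the name namespace_tablename.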

In [42]:
pool = happybase.ConnectionPool(size=3, host=hbase_host, table_prefix='myproject')
with pool.connection() as conn:
    print(conn.tables())
    conn.create_table(
        'mytable',
        {
            'cf1': dict(max_versions=10),
            'cf2': dict(max_versions=1, block_cache_enabled=False),
            'cf3': dict(),  # use the default column family options
        },
    )
    print(conn.tables())

[b'mytable']
[b'mytable']
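
Note that the output shows `mytable` without the prefix: on a prefixed connection, happybase strips the prefix from the names it reports, while the server stores the table under the full name (the `myproject_mytable` entry seen in the first `conn.tables()` listing). The pure-Python sketch below illustrates that mapping. The function `full_table_name` is made up for illustration; the `_` default mirrors the default of happybase's `table_prefix_separator` argument.

```python
def full_table_name(prefix, name, separator="_"):
    """Return the server-side table name used by a prefixed connection."""
    return f"{prefix}{separator}{name}".encode()


print(full_table_name("myproject", "mytable"))  # b'myproject_mytable'
```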


3 Working with data

In [44]:
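    # Continues inside the `with` block above, hence the indentation
    table = conn.table('mytable')
    # 'cf1' is a column family created above; the value is a placeholder
    table.put(b'row-key', {b'cf1:col1': b'value1'})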

In [None]:
3.1 Store data
3.2 Batch operations
3.3 Use atomic counters
3.4 Retrieve data
3.5 Scan a table
3.6 Delete data
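
The operations listed in 3.1 to 3.6 can be sketched in one function, under some assumptions: `table` is a `happybase.Table` obtained from `conn.table(...)`, the table has a column family named `cf1`, and the row keys, column names, and values are placeholders. The function name `demo_data_operations` is made up for illustration.

```python
def demo_data_operations(table):
    """Exercise put, batch, counter, row, scan and delete on `table`."""
    # 3.1 Store one row: keys are b'family:qualifier', values are bytes
    table.put(b'row-1', {b'cf1:col1': b'value1'})

    # 3.2 Batch: mutations are buffered and sent when the block exits
    with table.batch() as batch:
        batch.put(b'row-2', {b'cf1:col1': b'value2'})
        batch.put(b'row-3', {b'cf1:col1': b'value3'})

    # 3.3 Atomic counter: counter_inc returns the value after incrementing
    hits = table.counter_inc(b'row-1', b'cf1:hits')

    # 3.4 Retrieve a single row as a dict of column -> value
    row = table.row(b'row-1', columns=[b'cf1:col1'])

    # 3.5 Scan all rows whose key starts with b'row-'
    keys = [key for key, data in table.scan(row_prefix=b'row-')]

    # 3.6 Delete a whole row
    table.delete(b'row-3')

    return hits, row, keys
```

Inside the earlier `with pool.connection() as conn:` block this could be called as `demo_data_operations(conn.table('mytable'))`.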