Antonio Piccolboni edited this page Jun 17, 2014 · 1 revision


This R package provides basic connectivity to HBase through the Thrift server. R programmers can browse, read, write, and modify tables stored in HBase. The following functions are part of this package:

  • Table Manipulation
    hb.delete.table, hb.describe.table, hb.set.table.mode, hb.regions.table
  • Read/Write
    hb.insert, hb.get, hb.delete, hb.scan, hb.scan.ex
  • Utility
  • Initialization
    hb.defaults, hb.init


  • Installing the package requires that you first install and build Thrift. Once the libraries are built, make sure they are in a path where the R client can find them (e.g. /usr/lib). This package was built and tested using Thrift 0.8.

    Here is an example for building the libraries on CentOS:

    1. Install all Thrift pre-requisites:
    2. Build Thrift according to instructions:
    3. Update PKG_CONFIG_PATH: export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib/pkgconfig/
    4. Verify that the pkg-config path is correct: pkg-config --cflags thrift should return -I/usr/local/include/thrift
    5. Copy the Thrift library: sudo cp /usr/local/lib/ /usr/lib/

  • The Thrift server by default starts on port 9090.

    [hbase-root]/bin/hbase thrift start

    If the Thrift server is running on a different hostname:port, you will have to change how the package is initialized:

    hb.init(host=, port=9090)
  • By default, rhbase uses "native" R serialization (serialize/unserialize) to read and write data from HBase. You can switch to "raw" serialization (i.e. treat everything as a string) by specifying serialization="raw" when initializing the package.


    See the sample /rhbase/pkg/inst/samples/StringSerializer.R for details
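
To make the difference between the two serialization modes concrete, the following base-R sketch shows what each one does to a value on its way to and from storage. It does not call rhbase or need an HBase connection; the helper names (`to_native`, `to_raw_string`, and their inverses) are hypothetical and used only for illustration, and the package's internal conversion may differ in detail.

```r
# Illustration only: base R, no HBase connection needed.

# "native" mode: values pass through serialize()/unserialize(),
# so R types and attributes survive a round trip.
to_native   <- function(x) serialize(x, connection = NULL)
from_native <- function(bytes) unserialize(bytes)

# "raw" mode as described above: everything is treated as a string.
to_raw_string   <- function(x) charToRaw(as.character(x))
from_raw_string <- function(bytes) rawToChar(bytes)

value <- 42L

native_back <- from_native(to_native(value))
string_back <- from_raw_string(to_raw_string(value))

print(class(native_back))  # the integer type is preserved
print(class(string_back))  # a character string; the caller converts it back
```

Bytes produced by `serialize` use R's own format, so cells written in "native" mode are generally only convenient to read from R, whereas plain strings are easier to share with other HBase clients.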

Hbase table scans - using the filterstring option in hb.scan.ex

In version 1.1 of rhbase, a new function, hb.scan.ex, was introduced. This function allows the use of a 'filterString' for HBase table scans (HBase 0.92 or later).

Please see the Apache docs for details on filterString syntax (be aware that, as of this writing, there are some inaccuracies in that documentation).
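
As a rough sketch of what a filter string looks like, the helpers below assemble one in base R. The filter names `PrefixFilter` and `SingleColumnValueFilter`, the `binary:` comparator prefix, and the `AND` operator come from the HBase filter language. The helper functions themselves and the row prefix, column family, qualifier, and value are hypothetical and chosen only for illustration.

```r
# Hypothetical helpers that build HBase filter-language strings.

# String arguments are wrapped in single quotes; a single quote inside
# an argument is escaped by doubling it.
quote_arg <- function(x) {
  paste0("'", gsub("'", "''", x, fixed = TRUE), "'")
}

prefix_filter <- function(prefix) {
  sprintf("PrefixFilter(%s)", quote_arg(prefix))
}

single_column_value_filter <- function(family, qualifier, op, value) {
  # op is a comparison operator such as "=", "!=", "<" or ">=".
  sprintf("SingleColumnValueFilter(%s, %s, %s, %s)",
          quote_arg(family), quote_arg(qualifier), op,
          quote_arg(paste0("binary:", value)))
}

# Filters can be combined with AND / OR.
filter_string <- paste(
  prefix_filter("user_"),
  single_column_value_filter("info", "country", "=", "IT"),
  sep = " AND "
)

print(filter_string)
```

The resulting string would then be passed to hb.scan.ex through its filterstring option. Value comparisons operate on the stored bytes, so they are most likely to behave as expected when the table was written with "raw" serialization.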

HBase/Thrift is very unforgiving if you get the syntax or spelling wrong. An exception will be thrown:

    rhbase<hbScannerOpenFilterEx>:: (TTransportException) No more data to read.

This basically means that the socket connection to the Thrift server is dead. The only way to recover is to reinitialize your connection (for example, by calling hb.init again).
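
One possible way to handle this in a script is to wrap each scan so that a failure reinitializes the connection before the error is reported. The sketch below uses only base R; `with_recovery` is a hypothetical helper, and `fake_bad_scan` and `fake_init` are stand-ins so the example runs without a Thrift server. In real use, `action` would wrap your hb.scan.ex call and `reconnect` would wrap your hb.init call.

```r
# Hypothetical helper: run `action`; if it fails, call `reconnect` so that
# later calls have a usable connection, then re-signal the original error.
with_recovery <- function(action, reconnect) {
  tryCatch(
    action(),
    error = function(e) {
      reconnect()
      stop(e)
    }
  )
}

# Stand-ins that simulate a scan with a malformed filter string.
state <- new.env()
state$connected <- TRUE

fake_bad_scan <- function() {
  state$connected <- FALSE  # the failed call leaves the socket dead
  stop("(TTransportException) No more data to read.")
}
fake_init <- function() {
  state$connected <- TRUE   # reinitializing restores the connection
}

outcome <- tryCatch(
  with_recovery(fake_bad_scan, fake_init),
  error = function(e) conditionMessage(e)
)

print(outcome)
print(state$connected)
```

The helper deliberately does not retry the failed call: a malformed filter string would simply kill the new connection again, so the filter should be corrected before the scan is repeated.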


An example of a filterstring has been added to the sample: