Sharkbite is a native client for key/value stores. With initial support for Apache Accumulo, the design can and has been used to support other key/value stores. Development began in 2014 and has slowly evolved. there is no structural specificity to Accumulo despite being the base implementation. Despite this the examples below will look very much like Accumulo due to aliasing. This is intentional.
Capabilities That will be supported in V0.7 :
- Pysharkbite, the native python implementation can be installed via pip install sharkbite for linux and osx
- Works with Accumulo 1.6.x, 1.7.x, 1.8.x, 1.9.x and 2.x
- Read/Write : Reading and writing data to Accumulo is currently supported.
- Table Operations
- Security Operations
- Python Iterators : Python iterators run natively at the tablet server. They are supported either through the labmda iterators in Python or the typical python interfaces.
About the name
Sharkbite's name originated from design as a connector that abstracted components in which we tightly coupled and gripped interfaces of the underlying datastore. With an abstraction layer for access, and using cross compatible objects, the underlying interfaces are heavily coupled to each database. As a result, Sharkbite became a fitting name since interfaces exist to abstract the high coupling that exists within implementations of the API.
GNU 4+ cmake make python3.7-dev ( for python bindings )
bison 3+ libboost. automake autotool autoconf libtool libevent-dev unzip
Lib Boost is a compile time dependency for thrift. It is not required at runtime, so will no be referenced via any compiled artifact.
For Native Python Iterators
JDK ( such as OpenJDK)
The following is a list of build options in cmake. Preface this with -D at the command line. An example command might be
cmake -DPYTHON_ITERATOR_SUPPORT=ON -DCMAKE_BUILD_TYPE=Release -D BUILD_TESTS=ON -D BUILD_PYTHON_TESTS=ON ..
|CMAKE Option||Default value||Purpose|
|BUILD_TESTS||ON||Build all tests|
|BUILD_PYTHON_TESTS||OFF||Builds python Integration tests that are used for verification and validation|
|NATIVE_ARCH||OFF||Builds using native architecture symbols.|
|PYTHON_ITERATOR_SUPPORT||OFF||Builds the python iterator jar|
mkdir build && cd build && cmake .. && cmake --build . ; make test This will build the package and library, which you can use. It will also build examples in the examples directory+
Building on OSX
Follow the same procedures as above, but you may need to force linking bison 3.x if you installed it via Homebrew
What we provide
Please note that the library of sharkbite consists of C bindings to allow you to create various connectors via our C interfaces and a Python binding built via Pybind11.
The Python library can be installed by simply typing pip install . into the root source directory. During this process the C++ library and python bindings will be built.
A Python example is included. This is your primary example of the Python bound sharkbite library.
We now support python iterators. If you don't have a need to build sharkbite and don't wish to use JNI for the python bindings, you can build the native-iterators-jni package without the rest of sharkbite. Simply run mvn package in the native-iterators-jni directory and the target directory will have the required JAR with all necessary dependencies.
Using JNI bindings for python ( not recommended )
By using the cmake option PYTHON_ITERATOR_SUPPORT ( cmake -DPYTHON_ITERATOR_SUPPORT=ON ) we will build the necessary infrastructure to support python iterators using either JNI or Jython.
Iterators can be defined as single function lambdas or by implementing the seek or next methods.
The first example implements the seek and onNext methods. seek is optional if you don't wish to adjust the range. Once keys are being iterated you may get the top key. You may call iterator.next() after or the infrastructure will do that for you.
class myIterator: def seek(iterator,soughtRange): range = Range("a") iterator.seek(range) def onNext(iterator): if (iterator.hasTop()): kv = KeyValue() key = iterator.getTopKey() cf = key.getColumnFamily() value = iterator.getTopValue() key.setColumnFamily("oh changed " + cf) iterator.next() return KeyValue(key,value) else: return None
If this is defined in a separate file, you may use it with the following code snippet
with open('test.iter', 'r') as file: iterator = file.read() ## name, iterator text, priority iterator = pysharkbite.PythonIterator("PythonIterator",iteratortext,100) scanner.addIterator(iterator)
Alternative you may use lambdas. The lambda you provide will be passed the KeyValue ( getKey() and getValue() return the constituent parts). A partial code example of setting it up is below. You may return a Key or KeyValue object. If you return the former an empty value will be return ed.
## define only the name and priority iterator = pysharkbite.PythonIterator("PythonIterator",100) ## define a lambda to ajust the column family. iterator = iterator.onNext("lambda x : Key( x.getKey().getRow(), 'new cf', x.getKey().getColumnQualifier()) ") scanner.addIterator(iterator)
You may either define a python iterator as a text implementation or a lambda. Both cannot be used simulaneously.
The C library is a wrapper around the C++ Code. We supply a target called "capi".
This python example is vestigial and only exists as an example of the C-API wrappers. Implementations can be written in Go, Rust, Ruby, etc.
A C++ Example can be found here as well as in src/examples/
A Python example can be found [here] (https://github.com/phrocker/sharkbite/blob/master/examples/pythonexample.py)