Working commits for Hive connector to Accumulo. This will eventually be checked directly into Accumulo.

Query data stored in Accumulo tables directly with HiveQL.

Pertains to issue:

Currently does not work with Hadoop 2.0/CDH4.

Getting Started Guide


How Iterator Predicate pushdown works

List of required AUX_JARS

ACLED examples:

Prerequisites: $ACCUMULO_HOME/bin, $HADOOP_HOME/bin, and $HIVE_HOME/bin must be on the environment PATH, and either wget or curl must be installed.

The query examples use a cleaned-up version of the structured ACLED Nigeria dataset.

  1. Navigate to src/test/hql/acled and run the ingest script. The script creates and loads data for both the Hive and Accumulo ACLED tables, named 'acled_nigeria' and 'acled' respectively. The ETL and data for both processes run standalone from the ingest directory.

  2. See query_acled.sql for a CREATE EXTERNAL TABLE example, the required aux jars, and several sample queries that use both the Hive and Accumulo tables. The number of Hive columns in the table definition must equal the number of entries in accumulo.column.mapping.

  3. Run the queries to see the different results. Make sure to configure the -hiveconf variables for your local Accumulo instance.
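
The column-count rule in step 2 can be checked before a table is created. The following is a minimal sketch, not part of this project: it assumes accumulo.column.mapping is a comma-separated list with one entry per Hive column, and the class name, helper names, and the `cf|a` style entries are hypothetical placeholders. See query_acled.sql for the real mapping syntax.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the storage handler.
class ColumnMappingCheck {

    // Split a comma-separated mapping string into trimmed, non-empty entries.
    // Assumption: one mapping entry per Hive column, separated by commas.
    static List<String> parseMapping(String mapping) {
        List<String> entries = new ArrayList<>();
        for (String part : mapping.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                entries.add(trimmed);
            }
        }
        return entries;
    }

    // True when the Hive column count equals the number of mapping entries.
    static boolean countsMatch(List<String> hiveColumns, String mapping) {
        return hiveColumns.size() == parseMapping(mapping).size();
    }

    public static void main(String[] args) {
        // Placeholder column names and mapping entries, for illustration only.
        List<String> hiveColumns = Arrays.asList("col_a", "col_b", "col_c");
        System.out.println(countsMatch(hiveColumns, "cf|a,cf|b,cf|c"));
        System.out.println(countsMatch(hiveColumns, "cf|a,cf|b"));
    }
}
```

A check like this only compares counts. It does not verify that each Hive column type matches the bytes stored in the corresponding Accumulo value.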

Known limitations:

  • Requires Hive 0.10 and Accumulo 1.5+, both of which use Thrift 0.9. Other versions are binary incompatible.
  • Requires Hadoop 1.0/0.20.2x/CDH3.
  • Supported Hive column types limited to int, double, string and bigint.
  • Hive column type mapping assumes value type consistency for the same qualifier across different rows. For example, r1/cf/q/v cannot hold an int while r2/cf/q/v is a double.
  • The Hive column types must match the Accumulo value types. An Accumulo value holding integer bytes should be mapped to a Hive column of type int.
  • Does not yet support INSERT.
  • Iterator pushdown only works on WHERE clauses consisting of purely conjunctive predicates. This is a known Hive limitation with the IndexPredicateAnalyzer.
  • The 'Like' CompareOpt is not considered decomposable by the predicate analyzer, because the Hive UDFLike does not extend GenericUDF.
  • Iterator pushdown applies only to the operators <, >, =, >=, <=, !=.
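
The pushdown rules above (purely conjunctive predicates, six comparison operators, no LIKE) can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the storage handler's code or Hive's IndexPredicateAnalyzer: the `Expr`, `Comparison`, and `BoolExpr` types and the `isPushdownEligible` method are hypothetical, and the model only answers whether an entire WHERE clause qualifies.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of a WHERE clause, for illustration only.
class PushdownCheck {

    // Operators listed in the limitations above as eligible for pushdown.
    static final Set<String> SUPPORTED_OPS =
            new HashSet<>(Arrays.asList("<", ">", "=", ">=", "<=", "!="));

    interface Expr {}

    // A leaf predicate such as: column op literal.
    static final class Comparison implements Expr {
        final String column;
        final String op;
        final String literal;

        Comparison(String column, String op, String literal) {
            this.column = column;
            this.op = op;
            this.literal = literal;
        }
    }

    // An inner node joining child predicates with "AND" or "OR".
    static final class BoolExpr implements Expr {
        final String connective;
        final List<Expr> children;

        BoolExpr(String connective, Expr... children) {
            this.connective = connective;
            this.children = Arrays.asList(children);
        }
    }

    // True only if every connective is AND and every leaf uses a supported operator.
    static boolean isPushdownEligible(Expr e) {
        if (e instanceof Comparison) {
            return SUPPORTED_OPS.contains(((Comparison) e).op);
        }
        if (e instanceof BoolExpr) {
            BoolExpr b = (BoolExpr) e;
            if (!"AND".equals(b.connective)) {
                return false;
            }
            for (Expr child : b.children) {
                if (!isPushdownEligible(child)) {
                    return false;
                }
            }
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Placeholder column names and literals.
        Expr conjunctive = new BoolExpr("AND",
                new Comparison("col_a", ">=", "10"),
                new Comparison("col_b", "=", "x"));
        Expr disjunctive = new BoolExpr("OR",
                new Comparison("col_a", ">=", "10"),
                new Comparison("col_b", "=", "x"));
        System.out.println(isPushdownEligible(conjunctive));
        System.out.println(isPushdownEligible(disjunctive));
    }
}
```

In this model a single OR or LIKE anywhere in the clause disqualifies the whole clause. The real analyzer may behave differently for mixed clauses, so treat the sketch as a reading aid for the limitations list only.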

Future enhancements:

  • Allow INSERT for field serialization to Accumulo. OutputFormat exists but is not wired to Serde or tested.
  • Serde property for setting fixed timestamp during mutations.
  • Allow per-qualifier type hints in the serde property, similar to the latest build of the HBase StorageHandler.
  • Support for the remaining Hive primitive column types.
  • Support for complex value types (Struct, Map, Array, Union).
  • Allow custom Authorizations to be supplied from an external source.


Licensed AS-IS under Apache License 2.0
