Fluo

Apache Fluo lets users make incremental updates to large data sets stored in Apache Accumulo.

Apache Fluo is an open source implementation of Percolator (which populates Google's search index) for Apache Accumulo. Fluo makes it possible to update the results of a large-scale computation, index, or analytic as new data is discovered. Check out the Fluo project website for news and general information.
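The idea of updating a result incrementally, rather than recomputing it from scratch, can be illustrated with a small in-memory sketch. This is plain Java and does not use the Fluo API, Accumulo, or transactions; the class name `IncrementalWordCount` and its methods are hypothetical and exist only for illustration. When a document changes, only the counts for words in that document's old and new text are adjusted.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: an in-memory stand-in for an incrementally maintained
// word count. It is NOT the Fluo API and provides no transactions.
public class IncrementalWordCount {
    // Latest text seen for each document ID.
    private final Map<String, String> documents = new HashMap<>();
    // Word counts across all current documents.
    private final Map<String, Integer> counts = new HashMap<>();

    /** Adds or replaces a document, adjusting only the counts it affects. */
    public void update(String docId, String text) {
        String previous = documents.put(docId, text);
        if (previous != null) {
            apply(previous, -1); // retract the old version's contribution
        }
        apply(text, 1);          // add the new version's contribution
    }

    /** Returns the current count for a word, or 0 if it is absent. */
    public int count(String word) {
        return counts.getOrDefault(word, 0);
    }

    // Adds delta to the count of every word in the text.
    private void apply(String text, int delta) {
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            counts.merge(word, delta, Integer::sum);
            counts.remove(word, 0); // drop entries whose count fell to zero
        }
    }

    public static void main(String[] args) {
        IncrementalWordCount wc = new IncrementalWordCount();
        wc.update("doc1", "fluo accumulo fluo");
        wc.update("doc2", "accumulo");
        System.out.println(wc.count("fluo") + " " + wc.count("accumulo")); // prints "2 2"

        // Replacing doc1 touches only the words in its old and new text.
        wc.update("doc1", "fluo");
        System.out.println(wc.count("fluo") + " " + wc.count("accumulo")); // prints "1 1"
    }
}
```

The work done per update is proportional to the size of the changed document, not to the size of the whole data set, which is the property that makes incremental processing attractive for large collections.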

Getting Started

  • Take the Fluo Tour if you are completely new to Fluo.
  • Read the install instructions to install Fluo and start a Fluo application in YARN on a cluster where Accumulo, Hadoop, and ZooKeeper are running. If you need help setting up these dependencies, see the related projects page for external projects that may help.

Applications

Below are helpful resources for Fluo application developers:

  • Instructions for creating Fluo applications
  • Fluo API javadocs
  • Fluo Recipes, a project that provides common code, built on the Fluo API, for Fluo application developers

Implementation

  • Architecture - Overview of Fluo's architecture
  • Contributing - Documentation for developers who want to contribute to Fluo
  • Metrics - Fluo metrics are visible via JMX by default but can be configured to report to Graphite or Ganglia