Uno (an Apache Fluo project) sets up Apache Accumulo or Apache Fluo on a single machine for development

Uno

Uno automates setting up Apache Accumulo or Apache Fluo (and their dependencies) on a single machine.

Uno makes it easy for a developer to experiment with Accumulo or Fluo in a realistic environment. Uno is designed for developers who need to frequently upgrade and test their code, and do not care about preserving data. While Uno makes it easy to set up a dev stack running Fluo or Accumulo, it also makes it easy to clear your data and set up your dev stack again. To avoid inadvertent data loss, Uno should not be used in production.

Check out Muchos for setting up Accumulo or Fluo on multiple machines.

Requirements

Uno requires the following software to be installed on your machine.

  • Java - JDK 8 is required for running Fluo.
  • wget - Needed for fetch command to download tarballs.
  • Maven - Only needed if fetch command builds tarball from local repo.

You should also be able to ssh to localhost without a passphrase. The following instructions can help you set up these requirements in your environment:
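
A quick pre-flight check along these lines can confirm the requirements before running Uno. This is a minimal sketch rather than part of Uno: the `missing_tools` helper is hypothetical, and the executable names in the usage comment (`java`, `wget`, `mvn`) are the usual ones for the software listed above.

```shell
# Hypothetical helper (not part of Uno): print each named command that is
# not found on the PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example usage (commented out so that sourcing this file has no side effects):
#   missing_tools java wget mvn
#   ssh -o BatchMode=yes localhost true && echo "passphraseless ssh works"
```

With `BatchMode=yes`, ssh fails instead of prompting, so the second usage line only succeeds when key-based login to localhost already works.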

Quickstart

The following commands will get you up and running with an Accumulo instance if you have satisfied the requirements mentioned above. Replace accumulo with fluo to set up a Fluo instance.

git clone https://github.com/apache/fluo-uno.git
cd fluo-uno
./bin/uno fetch accumulo            # Fetches binary tarballs of Accumulo and its dependencies
./bin/uno setup accumulo            # Sets up Accumulo and its dependencies (Hadoop & ZooKeeper)
eval "$(./bin/uno env)"             # Bash-specific command that sets up current shell

Accumulo is now ready to use. Verify your installation by checking the Accumulo Monitor and Hadoop NameNode status pages.

Note that the Accumulo shell can be accessed in one of two ways. The easiest method is to use the uno command.

./bin/uno ashell

You can also access the shell directly. The Accumulo installation is initialized using the username root and password secret (set in the uno.conf file). Therefore, the shell can be accessed directly using:

accumulo shell -u root -p secret

For a more complete understanding of Uno, please continue reading.

Installation

First, clone the Uno repo on a local disk with enough space to run Hadoop, Accumulo, etc:

git clone https://github.com/apache/fluo-uno.git

The uno command uses conf/uno.conf for its default configuration, which should be sufficient for most users.

Optionally, you can customize this configuration by modifying the uno.conf file for your environment. Inside this script the variable UNO_HOME defaults to the root of the Uno repository.

vim conf/uno.conf

All commands are run using the uno script in bin/. Uno has a command that helps you configure your shell so that you can run commands from any directory and easily set common environment variables in your shell for Uno, Hadoop, ZooKeeper, Fluo, and Spark. Run the following command to print this shell configuration. You can also add --paths or --vars to the command below to limit output to PATH or environment variable configuration:

uno env

You can either copy and paste this output into your shell or add the following (with a correct path) to your ~/.bashrc to automatically configure every new shell.

eval "$(/path/to/uno/bin/uno env)"
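
The `eval "$(...)"` form is needed because a child process cannot change its parent's environment: a command like `uno env` can only print shell statements, and the calling shell has to evaluate them itself. The sketch below imitates that with a hypothetical stand-in function; the variable name and path it prints are illustrative assumptions, not Uno's actual output.

```shell
# Hypothetical stand-in for `uno env`: it only prints an export statement.
print_demo_env() {
  echo 'export DEMO_INSTALL_DIR="/tmp/demo-install"'
}

# Calling it by itself only produces text; nothing is set in this shell.
print_demo_env >/dev/null

# Evaluating its output defines the variable in the current shell.
eval "$(print_demo_env)"
echo "$DEMO_INSTALL_DIR"
```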

With the uno script set up, you can now use it to download, configure, and run Fluo's dependencies.

Fetch command

The uno fetch <component> command fetches the tarballs of a component and its dependencies for later use by the setup command. By default, the fetch command downloads tarballs but you can configure it to build Fluo or Accumulo from a local git repo by setting FLUO_REPO or ACCUMULO_REPO in uno.conf.

If uno fetch all is run, all possible components will be either downloaded or built. If you would like to only fetch certain components, run uno fetch to see a list of possible components.

After the fetch command is run for the first time, it only needs to run again if you want to upgrade components and need to download/build the latest version.
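
For example, building Accumulo from a local checkout might use a setting like the one below. This is a sketch under assumptions: it presumes that `uno.conf` is a shell script in which variables are set with `export` (check the file's existing entries for the exact form), and the checkout path is a placeholder.

```shell
# In conf/uno.conf: point Uno at a local Accumulo checkout (placeholder path).
export ACCUMULO_REPO=/path/to/accumulo
```

With this set, running `uno fetch accumulo` would build the tarball from that repo instead of downloading it, which is the case where Maven is required.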

Setup command

The setup command will install the downloaded tarballs to the directory set by $INSTALL in your uno.conf and run your local development cluster. The command can be run in several different ways:

  1. Sets up Apache Accumulo and its dependencies (Hadoop and ZooKeeper). This starts all processes and will wipe Accumulo/Hadoop if this command was run previously.

     uno setup accumulo
    
  2. Sets up Apache Fluo along with Accumulo (and its dependencies). This command will wipe your cluster. While Fluo is set up, it does not start any Fluo applications.

     uno setup fluo
    
  3. For Fluo & Accumulo, you can set up the software again without wiping/setting up their underlying dependencies. You can upgrade Accumulo or Fluo by running uno fetch before running this command.

     uno setup fluo --no-deps
     uno setup accumulo --no-deps
    
  4. Sets up metrics service (InfluxDB + Grafana).

     uno setup metrics
    
  5. Sets up Apache Spark and starts Spark's History Server.

     uno setup spark
    
  6. Sets up all components (Fluo, Accumulo, Hadoop, ZooKeeper, Spark, metrics service).

     uno setup all
    

You can confirm that everything started by checking the monitoring pages below:

You can verify that Fluo was installed correctly by running the fluo command, which you can use to administer Fluo:

./install/fluo-1.0.0-beta-1/bin/fluo

If you run some tests and then want a fresh cluster, run the setup command again, which will kill all running processes, clear any data and logs, and restart your cluster.
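
The two cycles described in this section (a full reset, and an upgrade that leaves the underlying dependencies alone) can be wrapped in small shell functions. This is a sketch: the function names and the `UNO` variable are hypothetical conveniences, and it assumes `uno` is on your PATH, for example after evaluating the output of `uno env`.

```shell
# Command used to invoke Uno; override UNO to point at a specific checkout.
UNO="${UNO:-uno}"

# Full reset: rerun setup, which clears data and logs and restarts the cluster.
reset_accumulo() {
  "$UNO" setup accumulo
}

# Upgrade: fetch (download or build) Accumulo again, then set it up
# without wiping or setting up Hadoop and ZooKeeper.
upgrade_accumulo() {
  "$UNO" fetch accumulo && "$UNO" setup accumulo --no-deps
}
```

Because the second step is chained with `&&`, a failed fetch stops the upgrade before the running installation is touched.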

Running Apache Fluo applications

Before running an Apache Fluo application, it is recommended that you configure your shell using uno env. If this is done, many Fluo example applications (such as Webindex and Phrasecount) can be run by simply cloning their repo and executing their start scripts (which will use environment variables set in your shell by uno env).
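
Since the example start scripts depend on variables exported by `uno env`, a guard like the following can fail fast when the shell has not been configured. This is a sketch: `require_vars` is a hypothetical helper, and the variable names in the usage comment are assumptions, so check the output of `uno env` for the names it actually sets.

```shell
# Hypothetical helper: succeed only if every named variable is exported,
# since child processes such as start scripts only see exported variables.
require_vars() {
  local name status=0
  for name in "$@"; do
    if ! printenv "$name" >/dev/null; then
      echo "missing: $name (was the output of 'uno env' evaluated?)" >&2
      status=1
    fi
  done
  return "$status"
}

# Example usage (variable names are assumptions):
#   require_vars FLUO_HOME HADOOP_HOME || exit 1
```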

If you want to create your own Fluo application, you should mimic the scripts of example Fluo applications or follow the instructions starting at the Configure a Fluo application section of the Fluo install instructions. These instructions will guide you through the process of configuring, initializing, and starting your application.