ACCUMULO-4511 Adding examples from Accumulo repo
* Created run.sh script for running examples
* Fixed package names
mikewalch committed Dec 9, 2016
1 parent bb64db9 commit d96c6d968e64ddd126691bfb9022c1df8f83d470
Showing 90 changed files with 11,122 additions and 0 deletions.
@@ -0,0 +1,6 @@
/.classpath
/.project
/.settings/
/target/
/*.iml
/.idea
@@ -0,0 +1,31 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
language: java
notifications:
  irc:
    channels:
      - "chat.freenode.net#accumulo"
    use_notice: true
    on_success: change
    on_failure: always
    template:
      - "%{result} %{repository_slug} %{branch} (%{build_url}): %{message}"
cache:
  directories:
    - $HOME/.m2
jdk:
  - oraclejdk8
install: true
script: mvn clean verify -DskipITs
110 README.md
@@ -1 +1,111 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# Apache Accumulo Examples

## Setup instructions

Before running any of the examples, the following steps must be performed.

1. Install and run Accumulo by following the instructions in the [INSTALL.md] of Accumulo's tarball.
Remember the instance name. It will be referred to as "instance" throughout the examples. A
comma-separated list of ZooKeeper servers will be referred to as "zookeepers".

2. Create an Accumulo user (for help see the 'User Administration' section of the
[user manual][manual]), or use the root user. This user and their password should replace any
reference to "username" or "password" in the examples. This user needs the ability to create
tables.

3. Clone and build this repository.

        git clone https://github.com/apache/accumulo-examples.git
        cd accumulo-examples
        mvn clean package

4. Each Accumulo example has its own documentation and run instructions, which are linked in the
table of available examples below.
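
Step 3 relies on `git` and `mvn` being installed. As a minimal sketch (the `missing_tools` helper is hypothetical and not part of this repository), a pre-flight check could look like this:

```shell
#!/usr/bin/env bash
# Print each command from the argument list that cannot be found on PATH.
# Returns non-zero if at least one command is missing.
# missing_tools is a hypothetical helper, not part of the repository.
missing_tools() {
  local missing=0
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      missing=1
    fi
  done
  return "$missing"
}

# git and mvn are needed to clone and build; extend the list as required.
if missing_tools git mvn; then
  echo "all required tools found"
else
  echo "install the tools listed above before continuing" >&2
fi
```

The same function can be reused for any other commands an example expects to find on your `PATH`.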

When running the examples, remember the tips below:

* Examples are run using the `runex` command which is located in the `bin/` directory of this repo.
The `runex` command is a simple wrapper around the Maven Exec plugin.
* Any command that references Accumulo settings such as `instance`, `zookeepers`, `username`, or
`password` should be updated for your instance.
* Commands intended to be run in bash are prefixed by '$' and should be run from the root of this
repository.
* Several examples use the `accumulo` and `tool.sh` commands which are expected to be on your
`PATH`. These commands are found in the `bin/` and `contrib/` directories of your Accumulo
installation.
* Commands intended to be run in the Accumulo shell are prefixed by '>'.
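
To see what `runex` does with its arguments, the sketch below prints the Maven command instead of running it. The `runex_dry_run` function is a hypothetical helper written for illustration; it mirrors the way the script in this commit prefixes the class name with `org.apache.accumulo.examples.` and joins the remaining arguments into a single string.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the Maven command that a wrapper like bin/runex
# would execute, without invoking Maven.
# runex_dry_run is a hypothetical helper, not part of the repository.
runex_dry_run() {
  local main_class="$1"
  shift
  # "$*" joins the remaining arguments with single spaces.
  local main_args="$*"
  printf 'mvn -q exec:java -Dexec.mainClass="%s" -Dexec.args="%s"\n' \
    "org.apache.accumulo.examples.$main_class" "$main_args"
}

runex_dry_run client.RandomBatchScanner -i instance -z zookeepers -t batchtest1
# prints: mvn -q exec:java -Dexec.mainClass="org.apache.accumulo.examples.client.RandomBatchScanner" -Dexec.args="-i instance -z zookeepers -t batchtest1"
```

Because the arguments are joined into one space-separated string, an argument that itself contains spaces will likely lose its grouping, so values without embedded whitespace are the safer choice.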

## Available Examples

Each example below highlights a feature of Apache Accumulo.

| Example | Description |
|---------|-------------|
| [batch] | Using the batch writer and batch scanner |
| [bloom] | Creating a table with bloom filters enabled to improve query performance |
| [bulkIngest] | Ingesting bulk data using MapReduce jobs on Hadoop |
| [classpath] | Using per-table classpaths |
| [client] | Using table operations and reading and writing data in Java |
| [combiner] | Using the example StatsCombiner to find min, max, sum, and count |
| [compactionStrategy] | Configuring a compaction strategy |
| [constraints] | Using constraints with tables |
| [dirlist] | Storing filesystem information |
| [export] | Exporting and importing tables |
| [filedata] | Storing file data |
| [filter] | Using the AgeOffFilter to remove records more than 30 seconds old |
| [helloworld] | Inserting records both inside and outside MapReduce jobs, and reading records between two rows |
| [isolation] | Using the isolated scanner to ensure partial changes are not seen |
| [mapred] | Using MapReduce to read from and write to Accumulo tables |
| [maxmutation] | Limiting mutation size to avoid running out of memory |
| [regex] | Using MapReduce and Accumulo to find data using regular expressions |
| [reservations] | Using conditional mutations to implement a simple reservation system |
| [rgbalancer] | Using a balancer to spread groups of tablets within a table evenly |
| [rowhash] | Using MapReduce to read a table and write to a new column in the same table |
| [sample] | Building and using sample data in Accumulo |
| [shard] | Using the intersecting iterator with a term index partitioned by document |
| [tabletofile] | Using MapReduce to read a table and write one of its columns to a file in HDFS |
| [terasort] | Generating random data and sorting it using Accumulo |
| [visibility] | Using visibilities (or combinations of authorizations) and user permissions |

[manual]: https://accumulo.apache.org/latest/accumulo_user_manual/
[INSTALL.md]: https://github.com/apache/accumulo/blob/master/INSTALL.md
[batch]: docs/batch.md
[bloom]: docs/bloom.md
[bulkIngest]: docs/bulkIngest.md
[classpath]: docs/classpath.md
[client]: docs/client.md
[combiner]: docs/combiner.md
[compactionStrategy]: docs/compactionStrategy.md
[constraints]: docs/constraints.md
[dirlist]: docs/dirlist.md
[export]: docs/export.md
[filedata]: docs/filedata.md
[filter]: docs/filter.md
[helloworld]: docs/helloworld.md
[isolation]: docs/isolation.md
[mapred]: docs/mapred.md
[maxmutation]: docs/maxmutation.md
[regex]: docs/regex.md
[reservations]: docs/reservations.md
[rgbalancer]: docs/rgbalancer.md
[rowhash]: docs/rowhash.md
[sample]: docs/sample.md
[shard]: docs/shard.md
[tabletofile]: docs/tabletofile.md
[terasort]: docs/terasort.md
[visibility]: docs/visibility.md
@@ -0,0 +1,21 @@
#! /usr/bin/env bash

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

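# First argument: example class name relative to org.apache.accumulo.examples
# (for example, client.RandomBatchScanner).
main_class="$1"
# Remaining arguments are joined into one string and passed to the example.
main_args="${*:2}"

mvn -q exec:java -Dexec.mainClass="org.apache.accumulo.examples.$main_class" -Dexec.args="$main_args"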
@@ -0,0 +1,55 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# Apache Accumulo Batch Writing and Scanning Example

This tutorial uses the following Java classes, which can be found in `org.apache.accumulo.examples.client`:

* SequentialBatchWriter.java - writes mutations with sequential rows and random values
* RandomBatchWriter.java - used by SequentialBatchWriter to generate random values
* RandomBatchScanner.java - reads random rows and verifies their values

This is an example of how to use the batch writer and batch scanner. To compile
the example, run Maven and copy the produced jar into the Accumulo lib dir.
This is already done in the tar distribution.

Below are commands that add 10000 entries to Accumulo and then do 100 random
queries. The write command generates random 50-byte values.

Be sure to use the name of your instance (given as instance here) and the appropriate
list of zookeeper nodes (given as zookeepers here).

Before you run this, you must ensure that the user you are running as has the
"exampleVis" authorization. You can set this in the shell with "setauths -u username -s exampleVis":

    $ accumulo shell -u root -e "setauths -u username -s exampleVis"

You must also create the table, batchtest1, ahead of time. (In the shell, use "createtable batchtest1")

    $ accumulo shell -u username -e "createtable batchtest1"
    $ ./bin/runex client.SequentialBatchWriter -i instance -z zookeepers -u username -p password -t batchtest1 --start 0 --num 10000 --size 50 --batchMemory 20M --batchLatency 500 --batchThreads 20 --vis exampleVis
    $ ./bin/runex client.RandomBatchScanner -i instance -z zookeepers -u username -p password -t batchtest1 --num 100 --min 0 --max 10000 --size 50 --scanThreads 20 --auths exampleVis
    07 11:33:11,103 [client.CountingVerifyingReceiver] INFO : Generating 100 random queries...
    07 11:33:11,112 [client.CountingVerifyingReceiver] INFO : finished
    07 11:33:11,260 [client.CountingVerifyingReceiver] INFO : 694.44 lookups/sec 0.14 secs

    07 11:33:11,260 [client.CountingVerifyingReceiver] INFO : num results : 100

    07 11:33:11,364 [client.CountingVerifyingReceiver] INFO : Generating 100 random queries...
    07 11:33:11,370 [client.CountingVerifyingReceiver] INFO : finished
    07 11:33:11,416 [client.CountingVerifyingReceiver] INFO : 2173.91 lookups/sec 0.05 secs

    07 11:33:11,416 [client.CountingVerifyingReceiver] INFO : num results : 100
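
The writer and scanner commands share the same connection flags, so it can help to keep the settings in one place. The sketch below is one way to do that. The `EX_*` variable names and the `conn_args` function are hypothetical, and the script only prints the commands so they can be reviewed before being run.

```shell
#!/usr/bin/env bash
# Connection settings kept in one place. The EX_* variable names are
# hypothetical; the defaults are the placeholders used in the commands above.
EX_INSTANCE="${EX_INSTANCE:-instance}"
EX_ZOOKEEPERS="${EX_ZOOKEEPERS:-zookeepers}"
EX_USER="${EX_USER:-username}"
EX_PASSWORD="${EX_PASSWORD:-password}"
EX_TABLE="${EX_TABLE:-batchtest1}"

# Print the connection flags shared by the writer and the scanner.
conn_args() {
  printf '%s\n' "-i $EX_INSTANCE -z $EX_ZOOKEEPERS -u $EX_USER -p $EX_PASSWORD -t $EX_TABLE"
}

# Echo the two commands instead of running them, so they can be reviewed first.
echo "./bin/runex client.SequentialBatchWriter $(conn_args) --start 0 --num 10000 --size 50 --batchMemory 20M --batchLatency 500 --batchThreads 20 --vis exampleVis"
echo "./bin/runex client.RandomBatchScanner $(conn_args) --num 100 --min 0 --max 10000 --size 50 --scanThreads 20 --auths exampleVis"
```

Setting, for example, `EX_INSTANCE=myinstance` in the environment before running the script changes both printed commands at once.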
