ACCUMULO-4814 Added links to Java classes (#9)
* Cleaned up batch docs
mikewalch committed Feb 14, 2018
1 parent 309e0c9 commit e3616c40ee620b13dafeadac3ac57b97bee75330
Showing 6 changed files with 54 additions and 39 deletions.
@@ -16,32 +16,31 @@ limitations under the License.
-->
# Apache Accumulo Batch Writing and Scanning Example

This tutorial uses the following Java classes, which can be found in org.apache.accumulo.examples.client:
This tutorial uses the following Java classes:

* SequentialBatchWriter.java - writes mutations with sequential rows and random values
* RandomBatchWriter.java - used by SequentialBatchWriter to generate random values
* RandomBatchScanner.java - reads random rows and verifies their values
* [SequentialBatchWriter.java] - writes mutations with sequential rows and random values
* [RandomBatchWriter.java] - used by SequentialBatchWriter to generate random values
* [RandomBatchScanner.java] - reads random rows and verifies their values

This is an example of how to use the batch writer and batch scanner. To compile
the example, run maven and copy the produced jar into the accumulo lib dir.
This is already done in the tar distribution.
This is an example of how to use the BatchWriter and BatchScanner.

Below are commands that add 10000 entries to accumulo and then do 100 random
queries. The write command generates random 50 byte values.
First, ensure that the user you are running as (i.e., `myuser` below) has the
`exampleVis` authorization.

Be sure to use the name of your instance (given as instance here) and the appropriate
list of zookeeper nodes (given as zookeepers here).
$ accumulo shell -u root -e "setauths -u myuser -s exampleVis"

Before you run this, you must ensure that the user you are running has the
"exampleVis" authorization. (you can set this in the shell with "setauths -u username -s exampleVis")
Second, you must create the table, batchtest1, ahead of time.

$ accumulo shell -u root -e "setauths -u username -s exampleVis"
$ accumulo shell -u root -e "createtable batchtest1"

You must also create the table, batchtest1, ahead of time. (In the shell, use "createtable batchtest1")
The command below adds 10000 entries with random 50-byte values to Accumulo.

$ accumulo shell -u username -e "createtable batchtest1"
$ ./bin/runex client.SequentialBatchWriter -c ./examples.conf -t batchtest1 --start 0 --num 10000 --size 50 --batchMemory 20M --batchLatency 500 --batchThreads 20 --vis exampleVis

The command below will do 100 random queries.

$ ./bin/runex client.RandomBatchScanner -c ./examples.conf -t batchtest1 --num 100 --min 0 --max 10000 --size 50 --scanThreads 20 --auths exampleVis

07 11:33:11,103 [client.CountingVerifyingReceiver] INFO : Generating 100 random queries...
07 11:33:11,112 [client.CountingVerifyingReceiver] INFO : finished
07 11:33:11,260 [client.CountingVerifyingReceiver] INFO : 694.44 lookups/sec 0.14 secs
@@ -53,3 +52,7 @@ You must also create the table, batchtest1, ahead of time. (In the shell, use "c
07 11:33:11,416 [client.CountingVerifyingReceiver] INFO : 2173.91 lookups/sec 0.05 secs

07 11:33:11,416 [client.CountingVerifyingReceiver] INFO : num results : 100
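
The write-then-verify round trip above can be sketched without a running Accumulo instance. The following is a minimal illustration, not the code of the example classes: an in-memory `TreeMap` stands in for the table, and the `row_%010d` row format, the row-seeded value generator, and the names `BatchSketch`, `rowId`, `valueFor`, `write`, and `verify` are all assumptions made for the sketch. The idea it shows is that a value derived deterministically from its row number can be recomputed and checked by a reader.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.SortedMap;
import java.util.TreeMap;

// Stand-in for the batch write / batch scan round trip; no Accumulo classes are used.
final class BatchSketch {

  // Hypothetical row format: zero-padded so that rows sort in numeric order.
  static String rowId(long n) {
    return String.format("row_%010d", n);
  }

  // Derive the value from the row number so a reader can recompute it later.
  static byte[] valueFor(long n, int size) {
    Random random = new Random(n); // seeded: the same row always yields the same bytes
    byte[] value = new byte[size];
    random.nextBytes(value);
    for (int i = 0; i < value.length; i++) {
      // Map each byte into the printable ASCII range ' ' .. '{'.
      value[i] = (byte) (((value[i] & 0xff) % 92) + ' ');
    }
    return value;
  }

  // "Write" rows start .. start+num-1 into an in-memory sorted map.
  static SortedMap<String, byte[]> write(long start, int num, int size) {
    SortedMap<String, byte[]> table = new TreeMap<>();
    for (long n = start; n < start + num; n++) {
      table.put(rowId(n), valueFor(n, size));
    }
    return table;
  }

  // Look up `num` random rows in [min, max) and count those whose value matches.
  static int verify(SortedMap<String, byte[]> table, int num, long min, long max,
      int size, long seed) {
    Random picker = new Random(seed);
    int verified = 0;
    for (int i = 0; i < num; i++) {
      long n = min + (long) (picker.nextDouble() * (max - min));
      byte[] actual = table.get(rowId(n));
      if (actual != null && Arrays.equals(actual, valueFor(n, size))) {
        verified++;
      }
    }
    return verified;
  }

  public static void main(String[] args) {
    SortedMap<String, byte[]> table = write(0, 10000, 50);
    int ok = verify(table, 100, 0, 10000, 50, 42L);
    System.out.println("rows written: " + table.size() + ", lookups verified: " + ok);
  }
}
```

The parameters mirror the `--start`, `--num`, `--size`, `--min`, and `--max` options of the commands above; batching, threading, and visibility labels are deliberately left out.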

[SequentialBatchWriter.java]: ../src/main/java/org/apache/accumulo/examples/client/SequentialBatchWriter.java
[RandomBatchWriter.java]: ../src/main/java/org/apache/accumulo/examples/client/RandomBatchWriter.java
[RandomBatchScanner.java]: ../src/main/java/org/apache/accumulo/examples/client/RandomBatchScanner.java
@@ -16,22 +16,17 @@ limitations under the License.
-->
# Apache Accumulo Client Examples

This documents how you run the simplest java examples.
The following Java classes are examples of the Accumulo client API:

This tutorial uses the following Java classes, which can be found in org.apache.accumulo.examples.client:
* [Flush.java] - flushes a table
* [RowOperations.java] - reads and writes rows
* [ReadWriteExample.java] - creates a table, writes to it, and reads from it

* Flush.java - flushes a table
* RowOperations.java - reads and writes rows
* ReadWriteExample.java - creates a table, writes to it, and reads from it

Using the accumulo command, you can run the simple client examples by providing their
class name, and enough arguments to find your accumulo instance. For example,
the Flush class will flush a table:
The Flush class will flush a table:

$ ./bin/runex client.Flush -c ./examples.conf -t trace

The very simple RowOperations class demonstrates how to read and write rows using the BatchWriter
and Scanner:
The RowOperations class demonstrates how to read and write rows using the BatchWriter and Scanner:

$ ./bin/runex client.RowOperations -c ./examples.conf
2013-01-14 14:45:24,738 [client.RowOperations] INFO : This is everything
@@ -76,3 +71,6 @@ To create a table, write to it and read from it:
hello%08; datatypes:xml [LEVEL1|GROUP1] 1358192329450 false -> world
hello%09; datatypes:xml [LEVEL1|GROUP1] 1358192329450 false -> world

[Flush.java]: ../src/main/java/org/apache/accumulo/examples/client/Flush.java
[RowOperations.java]: ../src/main/java/org/apache/accumulo/examples/client/RowOperations.java
[ReadWriteExample.java]: ../src/main/java/org/apache/accumulo/examples/client/ReadWriteExample.java
@@ -18,7 +18,7 @@ limitations under the License.

This tutorial uses the following Java class, which can be found in org.apache.accumulo.examples.combiner:

* StatsCombiner.java - a combiner that calculates max, min, sum, and count
* [StatsCombiner.java] - a combiner that calculates max, min, sum, and count

This is a simple combiner example. To build this example run maven and then
copy the produced jar into the accumulo lib dir. This is already done in the
@@ -68,3 +68,5 @@ the column family stat and hstat. The stats combiner computes min,max,sum, and
count. It can be configured to use a different base or radix. In the example
above the column family stat is configured for base 10 and the column family
hstat is configured for base 16.
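
The reduction described above can be sketched in plain Java. This is an illustration under stated assumptions rather than the source of StatsCombiner.java: it works on plain strings instead of Accumulo keys and values, and it assumes each input is either a single number or a previously combined `min,max,sum,count` tuple, both written in the configured radix. The names `StatsSketch` and `reduce` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of a min/max/sum/count reduction with a configurable radix.
final class StatsSketch {

  // Combine raw numbers and partial "min,max,sum,count" tuples into one tuple.
  static String reduce(List<String> values, int radix) {
    long min = Long.MAX_VALUE;
    long max = Long.MIN_VALUE;
    long sum = 0;
    long count = 0;
    for (String value : values) {
      String[] stats = value.split(",");
      if (stats.length == 1) {
        // A raw number contributes one observation.
        long v = Long.parseLong(stats[0], radix);
        min = Math.min(min, v);
        max = Math.max(max, v);
        sum += v;
        count += 1;
      } else {
        // A previously combined tuple is merged field by field.
        min = Math.min(min, Long.parseLong(stats[0], radix));
        max = Math.max(max, Long.parseLong(stats[1], radix));
        sum += Long.parseLong(stats[2], radix);
        count += Long.parseLong(stats[3], radix);
      }
    }
    return String.join(",", Long.toString(min, radix), Long.toString(max, radix),
        Long.toString(sum, radix), Long.toString(count, radix));
  }

  public static void main(String[] args) {
    System.out.println(reduce(Arrays.asList("1", "2", "3"), 10));          // prints 1,3,6,3
    System.out.println(reduce(Arrays.asList("a", "ff", "1,10,11,2"), 16)); // prints 1,ff,11a,3
  }
}
```

Accepting partial tuples as input matters because a combiner may see its own earlier output again; merging field by field keeps the result the same regardless of how the inputs were grouped.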

[StatsCombiner.java]: ../src/main/java/org/apache/accumulo/examples/combiner/StatsCombiner.java
@@ -24,10 +24,10 @@ This example stores filesystem information in accumulo. The example stores the i

This example shows how to use Accumulo to store a file system history. It has the following classes:

* Ingest.java - Recursively lists the files and directories under a given path, ingests their names and file info into one Accumulo table, indexes the file names in a separate table, and the file data into a third table.
* QueryUtil.java - Provides utility methods for getting the info for a file, listing the contents of a directory, and performing single wild card searches on file or directory names.
* Viewer.java - Provides a GUI for browsing the file system information stored in Accumulo.
* FileCount.java - Computes recursive counts over file system information and stores them back into the same Accumulo table.
* [Ingest.java] - Recursively lists the files and directories under a given path, ingests their names and file info into one Accumulo table, indexes the file names in a separate table, and the file data into a third table.
* [QueryUtil.java] - Provides utility methods for getting the info for a file, listing the contents of a directory, and performing single wild card searches on file or directory names.
* [Viewer.java] - Provides a GUI for browsing the file system information stored in Accumulo.
* [FileCount.java] - Computes recursive counts over file system information and stores them back into the same Accumulo table.

To begin, ingest some data with Ingest.java.

@@ -114,3 +114,7 @@ Other column family : column qualifier pairs are "~chunk" : chunk size in bytes
There may exist multiple copies of the same file (with the same md5 hash) with different chunk sizes or different visibilities. There is an iterator that can be set on the data table that combines these copies into a single copy with a visibility taken from the visibilities of the file references, e.g. (vis from ref1)|(vis from ref2).

[vis]: visibility.md
[Ingest.java]: ../src/main/java/org/apache/accumulo/examples/dirlist/Ingest.java
[FileCount.java]: ../src/main/java/org/apache/accumulo/examples/dirlist/FileCount.java
[QueryUtil.java]: ../src/main/java/org/apache/accumulo/examples/dirlist/QueryUtil.java
[Viewer.java]: ../src/main/java/org/apache/accumulo/examples/dirlist/Viewer.java
@@ -16,10 +16,10 @@ limitations under the License.
-->
# Apache Accumulo Hello World Example

This tutorial uses the following Java classes, which can be found in org.apache.accumulo.examples.helloworld:
This tutorial uses the following Java classes:

* InsertWithBatchWriter.java - Inserts 10K rows (50K entries) into accumulo with each row having 5 entries
* ReadData.java - Reads all data between two rows
* [InsertWithBatchWriter.java] - Inserts 10K rows (50K entries) into accumulo with each row having 5 entries
* [ReadData.java] - Reads all data between two rows

Log into the accumulo shell:

@@ -45,3 +45,6 @@ To view the entries, use the shell to scan the table:
You can also use a Java class to scan the table:

$ ./bin/runex helloworld.ReadData -c ./examples.conf -t hellotable --startKey row_0 --endKey row_1001
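
Because rows are sorted lexicographically rather than numerically, a range such as `row_0` to `row_1001` matches fewer rows than the numbers suggest. The sketch below illustrates this with an in-memory `TreeMap` standing in for the table. The `row_<n>` row names, the `colqual_<j>` and `value_<i>_<j>` formats, and the names `RangeScanSketch`, `insert`, and `scan` are assumptions for illustration, not taken from the example classes.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory stand-in for a sorted table: row -> (column qualifier -> value).
final class RangeScanSketch {

  // Insert `rows` rows, each with `entriesPerRow` entries.
  static NavigableMap<String, NavigableMap<String, String>> insert(int rows, int entriesPerRow) {
    NavigableMap<String, NavigableMap<String, String>> table = new TreeMap<>();
    for (int i = 0; i < rows; i++) {
      NavigableMap<String, String> columns = new TreeMap<>();
      for (int j = 0; j < entriesPerRow; j++) {
        columns.put("colqual_" + j, "value_" + i + "_" + j);
      }
      table.put("row_" + i, columns);
    }
    return table;
  }

  // Return all rows between startRow and endRow, both inclusive, in sorted order.
  static NavigableMap<String, NavigableMap<String, String>> scan(
      NavigableMap<String, NavigableMap<String, String>> table, String startRow, String endRow) {
    return table.subMap(startRow, true, endRow, true);
  }

  public static void main(String[] args) {
    NavigableMap<String, NavigableMap<String, String>> table = insert(10000, 5);
    // prints [row_0, row_1, row_10, row_100, row_1000, row_1001]
    System.out.println(scan(table, "row_0", "row_1001").keySet());
  }
}
```

Zero-padding the numeric part of the row name is one common way to make lexicographic and numeric order agree.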

[InsertWithBatchWriter.java]: ../src/main/java/org/apache/accumulo/examples/helloworld/InsertWithBatchWriter.java
[ReadData.java]: ../src/main/java/org/apache/accumulo/examples/helloworld/ReadData.java
@@ -19,10 +19,10 @@ limitations under the License.
Accumulo has an iterator called the intersecting iterator which supports querying a term index that is partitioned by
document, or "sharded". This example shows how to use the intersecting iterator through these four programs:

* Index.java - Indexes a set of text files into an Accumulo table
* Query.java - Finds documents containing a given set of terms.
* Reverse.java - Reads the index table and writes a map of documents to terms into another table.
* ContinuousQuery.java Uses the table populated by Reverse.java to select N random terms per document. Then it continuously and randomly queries those terms.
* [Index.java] - Indexes a set of text files into an Accumulo table
* [Query.java] - Finds documents containing a given set of terms.
* [Reverse.java] - Reads the index table and writes a map of documents to terms into another table.
* [ContinuousQuery.java] - Uses the table populated by Reverse.java to select N random terms per document. Then it continuously and randomly queries those terms.
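
The idea of a document-partitioned ("sharded") term index can be sketched in memory as follows. This is only an analogy for what the programs above do with Accumulo tables and the intersecting iterator: the partitioning by `hashCode`, the per-partition maps, and the names `ShardSketch`, `index`, and `query` are assumptions made for the sketch. Each document lives in exactly one partition, so a multi-term query can intersect posting sets inside each partition independently and then merge the results.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// In-memory sketch of a term index partitioned by document.
final class ShardSketch {
  // One term -> document-id map per partition.
  private final List<Map<String, Set<String>>> partitions = new ArrayList<>();

  ShardSketch(int numPartitions) {
    for (int i = 0; i < numPartitions; i++) {
      partitions.add(new HashMap<>());
    }
  }

  // All terms of a document go to the single partition chosen by the document id.
  void index(String docId, String... terms) {
    Map<String, Set<String>> partition =
        partitions.get(Math.floorMod(docId.hashCode(), partitions.size()));
    for (String term : terms) {
      partition.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
    }
  }

  // Intersect the posting sets within each partition, then merge across partitions.
  Set<String> query(String... terms) {
    Set<String> results = new TreeSet<>();
    for (Map<String, Set<String>> partition : partitions) {
      Set<String> matches = null;
      for (String term : terms) {
        Set<String> docs = partition.getOrDefault(term, Collections.<String>emptySet());
        if (matches == null) {
          matches = new TreeSet<>(docs);
        } else {
          matches.retainAll(docs);
        }
        if (matches.isEmpty()) {
          break; // no document in this partition contains every term
        }
      }
      if (matches != null) {
        results.addAll(matches);
      }
    }
    return results;
  }

  public static void main(String[] args) {
    ShardSketch shards = new ShardSketch(4);
    shards.index("a.java", "public", "static", "void");
    shards.index("b.java", "public", "long");
    shards.index("c.java", "static", "long", "public");
    System.out.println(shards.query("public", "static")); // prints [a.java, c.java]
  }
}
```

Keeping every term of a document in one partition is what lets the intersection run locally in each partition, with no posting lists exchanged between partitions.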

To run these example programs, create two tables like below.

@@ -64,3 +64,8 @@ randomly selects one set of 5 terms and queries. It prints the number of matchin
[for, static, println, public, the] 55 0.211
[sleeptime, wrappingiterator, options, long, utilwaitthread] 1 0.057
[string, public, long, 0, wait] 12 0.132

[Index.java]: ../src/main/java/org/apache/accumulo/examples/shard/Index.java
[Query.java]: ../src/main/java/org/apache/accumulo/examples/shard/Query.java
[Reverse.java]: ../src/main/java/org/apache/accumulo/examples/shard/Reverse.java
[ContinuousQuery.java]: ../src/main/java/org/apache/accumulo/examples/shard/ContinuousQuery.java
