Jupiter

This project implements three data stores on top of Apache Accumulo: one for blobs, one for terms and facts (a fact is a predicate expression that makes a declarative statement about a problem domain), and one for JSON objects.

These data stores are not meant to be efficient; they are intended to be easy to use.

Adding Jupiter to your build

Jupiter's Maven group ID is com.computablefacts and its artifact ID is jupiter.

To add a dependency on Jupiter using Maven, use the following:

<dependency>
  <groupId>com.computablefacts</groupId>
  <artifactId>jupiter</artifactId>
  <version>0.x</version>
</dependency>

Snapshots

Snapshots of Jupiter built from the master branch are available through Sonatype using the following dependency:

<dependency>
  <groupId>com.computablefacts</groupId>
  <artifactId>jupiter</artifactId>
  <version>0.x-SNAPSHOT</version>
</dependency>

To download snapshots from Sonatype, add the following profile to your project's pom.xml:

<profiles>
    <profile>
        <id>allow-snapshots</id>
        <activation><activeByDefault>true</activeByDefault></activation>
        <repositories>
            <repository>
                <id>snapshots-repo</id>
                <url>https://s01.oss.sonatype.org/content/repositories/snapshots</url>
                <releases><enabled>false</enabled></releases>
                <snapshots><enabled>true</enabled></snapshots>
            </repository>
        </repositories>
    </profile>
</profiles>

Publishing a new version

Deploy a release to Maven Central with these commands:

$ git tag <version_number>
$ git push origin <version_number>

To update and publish the next SNAPSHOT version, just change and push the version:

$ mvn versions:set -DnewVersion=<version_number>-SNAPSHOT
$ git commit -am "Update to version <version_number>-SNAPSHOT"
$ git push origin master

Points of interest

BlobStore

The BlobStore API allows your application to persist data objects. Methods are available to write and read opaque Strings, JSON and Files.

Configurations configurations = ...;
BlobStore blobStore = new BlobStore(configurations, "blobs" /* table name */);

// Write blobs
Set<String> noBlobSpecificVizLabels = Sets.newHashSet();

try (BatchWriter writer = blobStore.writer()) {
    
    String str = ...;
    blobStore.putString(writer, "my_strings", UUID.randomUUID().toString(), noBlobSpecificVizLabels, str);

    Map<String, Object> json = ...;
    blobStore.putJson(writer, "my_jsons", UUID.randomUUID().toString(), noBlobSpecificVizLabels, json);

    File file = ...;
    blobStore.putFile(writer, "my_files", UUID.randomUUID().toString(), noBlobSpecificVizLabels, file);
}

// Read blobs
// For convenience, <dataset>_RAW_DATA authorizations are automatically added to each blob
Authorizations auths = new Authorizations("MY_STRINGS_RAW_DATA", "MY_JSONS_RAW_DATA", "MY_FILES_RAW_DATA");

blobStore.strings(auths, "my_strings", null, null).forEachRemaining(blob -> ...);
blobStore.jsons(auths, "my_jsons", null, null).forEachRemaining(blob -> ...);
blobStore.files(auths, "my_files", null, null).forEachRemaining(blob -> ...);
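
The authorization names used above follow the `<dataset>_RAW_DATA` pattern. A minimal sketch of that naming rule, assuming the label is simply the upper-cased dataset name plus the suffix (as the example suggests; `RawDataLabelSketch` and `rawDataLabel` are hypothetical names, not part of Jupiter, and the library may normalize other characters differently):

```java
import java.util.Locale;

/** Hypothetical helper, not part of Jupiter's API. */
final class RawDataLabelSketch {

    private RawDataLabelSketch() {}

    /** Assumption: the label is the upper-cased dataset name followed by "_RAW_DATA". */
    static String rawDataLabel(String dataset) {
        return dataset.toUpperCase(Locale.ROOT) + "_RAW_DATA";
    }
}
```

Under this assumption, `RawDataLabelSketch.rawDataLabel("my_strings")` yields `MY_STRINGS_RAW_DATA`, which matches the first authorization passed to `Authorizations` above.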

Note that it is possible to filter out JSON fields at the tserver level before returning the JSON object to the client.

Map<String, Object> json = new HashMap<>();
json.put("first_name", "john");
json.put("last_name", "doe");
json.put("email", "john.doe@gmail.com");
json.put("password", "&N?8LXtT7&f4@nH$");

try (BatchWriter writer = blobStore.writer()) {
    blobStore.putJson(writer, "my_jsons", UUID.randomUUID().toString(), Sets.newHashSet(), json);
}

Set<String> fieldsToKeep = Sets.newHashSet("first_name", "last_name", "email");
Optional<Value> blob = blobStore.jsons(scanner, "my_jsons", null, fieldsToKeep).first();

json.remove("password");
Assert.assertEquals(json, Codecs.asObject(blob.get().toString()));
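
To make the expected result of the filter concrete, here is a client-side sketch of the same "keep only these fields" semantics on a flat map. It only illustrates the outcome: Jupiter applies the filter on the tserver, and `FieldFilterSketch` and `keepFields` are hypothetical names that do not exist in the library.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration of field filtering, not Jupiter's server-side implementation. */
final class FieldFilterSketch {

    private FieldFilterSketch() {}

    /** Returns a copy of json that contains only the top-level keys listed in fieldsToKeep. */
    static Map<String, Object> keepFields(Map<String, Object> json, Set<String> fieldsToKeep) {
        Map<String, Object> filtered = new HashMap<>();
        for (Map.Entry<String, Object> entry : json.entrySet()) {
            if (fieldsToKeep.contains(entry.getKey())) {
                filtered.put(entry.getKey(), entry.getValue());
            }
        }
        return filtered;
    }
}
```

Applied to the map above with `first_name`, `last_name` and `email` as the fields to keep, the result no longer contains the `password` entry, and the input map is left unchanged.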

TermStore

The TermStore API allows your application to persist buckets of key-value pairs. Numbers and dates are automatically lexicoded to maintain their native Java sort order.

Configurations configurations = ...;
TermStore termStore = new TermStore(configurations, "terms" /* table name */);

Map<String, Object> bucket = new HashMap<>();
bucket.put("first_name", "john");
bucket.put("last_name", "doe");
bucket.put("age", 37);
bucket.put("last_seen", new Date());

String dataset = "my_buckets";
String bucketId = UUID.randomUUID().toString();
Set<String> bucketSpecificLabels = Sets.newHashSet("MY_BUCKETS_RAW_DATA");

// Write terms
try (BatchWriter writer = termStore.writer()) {
    
    bucket.entrySet().forEach(keyValuePair -> {
        
        String field = keyValuePair.getKey();
        Object value = keyValuePair.getValue();
        Set<String> fieldSpecificLabels = Sets.newHashSet();
        
        boolean isOk = termStore.put(writer, dataset, bucketId, field, value, 1, bucketSpecificLabels, fieldSpecificLabels);
    });
}

Authorizations auths = new Authorizations("MY_BUCKETS_RAW_DATA");

/* Get the number of distinct buckets containing a given term */
        
// Wildcard query
termStore.termCardinalityEstimationForBuckets(scanner, dataset, "joh*").forEachRemaining(estimation -> ...);

// Range query
termStore.termCardinalityEstimationForBuckets(scanner, dataset, null, 30, 40).forEachRemaining(estimation -> ...);

/* Get buckets ids containing a given term */

// Wildcard query
termStore.bucketsIds(scanner, dataset, "joh*").forEachRemaining(term -> ...);
        
// Range query    
termStore.bucketsIds(scanner, dataset, null, 30, 40, null).forEachRemaining(term -> ...);
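
The range queries above rely on numbers being lexicoded, that is, encoded as strings whose lexicographic order matches their numeric order. The sketch below shows one common way to do this for `long` values (flip the sign bit, then zero-pad to 16 hexadecimal digits) and why it makes a range lookup over sorted keys possible. This is an assumption made for illustration, not necessarily the encoding Jupiter uses; `LexicodeSketch` and its methods are hypothetical names. Dates could be handled the same way through their epoch milliseconds (`Date.getTime()`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical helper, not part of Jupiter's API. */
final class LexicodeSketch {

    private LexicodeSketch() {}

    /** Flips the sign bit so negatives sort before positives, then pads to 16 hex digits. */
    static String encode(long value) {
        return String.format(Locale.ROOT, "%016x", value ^ Long.MIN_VALUE);
    }

    /** Inverse of encode. */
    static long decode(String encoded) {
        return Long.parseUnsignedLong(encoded, 16) ^ Long.MIN_VALUE;
    }

    /** Returns the values in [min, max], selected by comparing encoded strings only. */
    static List<Long> range(Iterable<Long> values, long min, long max) {
        // The TreeMap keeps keys in String order, standing in for a sorted key-value table.
        NavigableMap<String, Long> sorted = new TreeMap<>();
        for (long value : values) {
            sorted.put(encode(value), value);
        }
        return new ArrayList<>(sorted.subMap(encode(min), true, encode(max), true).values());
    }
}
```

For example, `LexicodeSketch.range(Arrays.asList(37L, -5L, 120L), 30, 40)` returns a list containing only `37`, mirroring the 30 to 40 range queries above.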
