Heroic

A scalable time series database based on Bigtable, Cassandra, and Elasticsearch. Go to https://spotify.github.io/heroic/ for documentation.

This project adheres to the Open Code of Conduct. By participating, you are expected to honor this code.

Stability Disclaimer: Heroic is an evolving project, and should in its current state be considered unstable. Do not use in production unless you are willing to spend time with it, experiment and contribute. Not doing so might result in losing your data to goblins. It is currently not on a release schedule and is not versioned. At Spotify we rely on multiple release forks that we actively maintain and flip between.

Building

Java 8 is required.

Some repackaged dependencies have to be made available first. You do this by running tools/install-repackaged.

$ tools/install-repackaged
Installing repackaged/x
...

After this, the project is built using Gradle:

# full build, runs all tests and builds the shaded jar
./gradlew build

# only compile
./gradlew assemble

# build a single module
./gradlew heroic-metric-bigtable:build

The heroic-dist module can be used to produce a shaded jar that contains all required dependencies:

./gradlew heroic-dist:shadowJar

Running

After building, the entry point of the service is com.spotify.heroic.HeroicService. The following is an example of how this can be run:

./gradlew heroic-dist:runShadow <config>

which is the equivalent of doing:

java -jar $PWD/heroic-dist/build/libs/heroic-dist-0.0.1-SNAPSHOT-shaded.jar <config>

For help on how to write a configuration file, see the Configuration Section of the official documentation.

Heroic has been tested with the following services:

Cassandra (2.1.x, 3.5) when using metric/datastax.
Cloud Bigtable when using metric/bigtable.
Elasticsearch (5.x) when using metadata/elasticsearch or suggest/elasticsearch.
Kafka (0.8.x) when using consumer/kafka.

Building/Running via docker

A Docker image with the shaded jar is now available. To build the image:

$ docker build -t heroic:latest .

This is a multi-stage build and will first build Heroic via a ./gradlew clean build and then copy the resulting shaded jar into the runtime container.

To run Heroic via Docker:

$ docker run -d -p 8080:8080 -p 9091:9091 -v /path/to/config.yml:/heroic.yml spotify/heroic:latest

Logging

Logging is captured using SLF4J, and forwarded to Log4j.

To configure logging, define the -Dlog4j.configurationFile=<path> parameter. You can use docs/log4j2-file.xml as a base.

Testing

We run tests with Gradle:

# run unit tests
./gradlew test

# run integration tests
./gradlew integrationTest

or to run a more comprehensive set of checks:

./gradlew check

This will run:

unit tests
integration tests
Checkstyle
Coverage Reporting with Jacoco

It is strongly recommended that you run the full test suite before setting up a pull request; otherwise it will be rejected by Travis.

Remote Integration Tests

Integration tests are configured to run remotely depending on a set of system properties.

Elasticsearch

| Property | Description |
|---|---|
| `-D elasticsearch.version=<version>` | Use the given client version when building the project |
| `-D it.elasticsearch.remote=true` | Run Elasticsearch tests against a remote database |
| `-D it.elasticsearch.seed=<seed>` | Use the given seed (default: `localhost`) |
| `-D it.elasticsearch.clusterName=<clusterName>` | Use the given cluster name (default: `elasticsearch`) |

Datastax

| Property | Description |
|---|---|
| `-D datastax.version=<version>` | Use the given client version when building the project |
| `-D it.datastax.remote=true` | Run Datastax tests against a remote database |
| `-D it.datastax.seed=<seed>` | Use the given seed (default: `localhost`) |

Bigtable

| Property | Description |
|---|---|
| `-D bigtable.version=<version>` | Use the given client version when building the project |
| `-D it.bigtable.remote=true` | Run Bigtable tests against a remote database |
| `-D it.bigtable.project=<project>` | Use the given project |
| `-D it.bigtable.zone=<zone>` | Use the given zone |
| `-D it.bigtable.instance=<instance>` | Use the given instance |
| `-D it.bigtable.credentials=<credentials>` | Use the given credentials file |

The following is an example Elasticsearch remote integration test:

$> mvn -P integration-tests \
    -D elasticsearch.version=5.6.0 \
    -D it.elasticsearch.remote=true \
    clean verify
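
The switches listed in the tables are ordinary JVM system properties, so test code can read them with System.getProperty. The following is a minimal sketch of that pattern for the Elasticsearch properties. It is not Heroic's actual test code: the class name ElasticsearchItSettings is hypothetical, and treating an unset it.elasticsearch.remote as false is an assumption. The seed and cluster name defaults are the ones documented in the Elasticsearch table.

```java
import java.util.Properties;

/** Hypothetical holder for the it.elasticsearch.* switches; not a Heroic class. */
public final class ElasticsearchItSettings {
    public final boolean remote;
    public final String seed;
    public final String clusterName;

    private ElasticsearchItSettings(boolean remote, String seed, String clusterName) {
        this.remote = remote;
        this.seed = seed;
        this.clusterName = clusterName;
    }

    /** Reads the settings from the given properties, falling back to defaults. */
    public static ElasticsearchItSettings from(Properties props) {
        // Assumption: tests run locally unless the property is explicitly "true".
        boolean remote =
            Boolean.parseBoolean(props.getProperty("it.elasticsearch.remote", "false"));
        // Documented defaults: localhost and elasticsearch.
        String seed = props.getProperty("it.elasticsearch.seed", "localhost");
        String clusterName =
            props.getProperty("it.elasticsearch.clusterName", "elasticsearch");
        return new ElasticsearchItSettings(remote, seed, clusterName);
    }

    public static void main(String[] args) {
        // Picks up any -D flags that were passed to this JVM.
        ElasticsearchItSettings s = from(System.getProperties());
        System.out.println("remote=" + s.remote
            + " seed=" + s.seed
            + " clusterName=" + s.clusterName);
    }
}
```

Taking a Properties argument instead of calling System.getProperty directly keeps the lookup easy to exercise without touching global JVM state.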

PubSub

PubSub relies on having the PUBSUB_EMULATOR_HOST environment variable set instead of a system property. Detailed instructions are available in the Google PubSub emulator docs.
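
Because this setting is an environment variable rather than a system property, it is read with System.getenv and cannot be supplied through a -D flag. The sketch below illustrates the difference. The class PubSubEmulatorHost is hypothetical and is not how Heroic itself resolves the emulator; treating a blank value as unset is also an assumption.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper that looks up the PubSub emulator host; not a Heroic class. */
public final class PubSubEmulatorHost {
    private static final String VARIABLE = "PUBSUB_EMULATOR_HOST";

    /** Returns the trimmed emulator host, or empty if the variable is unset or blank. */
    public static Optional<String> resolve(Map<String, String> env) {
        String value = env.get(VARIABLE);
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(value.trim());
    }

    public static void main(String[] args) {
        // System.getenv() exposes the process environment, not -D system properties.
        Optional<String> host = resolve(System.getenv());
        System.out.println(host
            .map(h -> "Using PubSub emulator at " + h)
            .orElse(VARIABLE + " is not set"));
    }
}
```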

Full Cluster Tests

Full cluster tests are defined in heroic-dist/src/test/java.

This way, they have access to all the modules and parts of Heroic.

The JVM RPC module is specifically designed to allow for rapid execution of integration tests. It allows multiple cores to be defined and communicate with each other in the same JVM instance.

Coverage

There's an ongoing project to improve test coverage. The coverage report on codecov.io shows areas to focus on.

Building a Debian Package

This project does not provide a single Debian package, primarily because the current nature of the service (alpha state) does not mesh well with stable releases.

Instead, you are encouraged to build your own using the provided scripts in this project.

First run the prepare-sources script:

$ debian/bin/prepare-sources myrel 1

myrel will be the name of your release. It will be part of your package name (debian-myrel) and will also be suffixed to all helper tools (e.g. heroic-myrel).

For the next step you'll need a Debian environment:

$ dpkg-buildpackage -uc -us

If you encounter problems, you can troubleshoot the build with DH_VERBOSE:

$ env DH_VERBOSE=1 dpkg-buildpackage -uc -us

Contributing

Guidelines for contributing can be found in CONTRIBUTING.md.

Module Orientation

The Heroic project is split into a couple of modules.

The most critical one is heroic-component. It contains interfaces, value objects, and the basic set of dependencies necessary to glue different components together.

Submodules include metric, suggest, metadata, and aggregation. The first three contain various implementations of the given backend type, while the last provides aggregation methods.

heroic-core contains the com.spotify.heroic.HeroicCore class which is the central building block for setting up a Heroic instance.

heroic-elasticsearch-utils is a collection of utilities for interacting with Elasticsearch. This is separate since we have more than one backend that needs to talk with Elasticsearch.

heroic-parser provides an Antlr4 implementation of com.spotify.heroic.grammar.QueryParser, which is used to parse the Heroic DSL.

heroic-shell contains com.spotify.heroic.HeroicShell, a shell capable of either running standalone or connecting to an existing Heroic instance for administration.

heroic-all contains dependencies and references to all modules that make up a Heroic distribution. This is also where profiles are defined since they need to have access to all dependencies.

Anything in the repackaged directory is a dependency that includes one or more Java packages that must be relocated to avoid conflicts. These are exported under the com.spotify.heroic.repackaged groupId.

Finally there is heroic-dist, a small project that depends on heroic-all, heroic-shell, and a logging implementation. Here is where everything is bound together into a distribution — a shaded jar. It also provides the entry-point for services, namely com.spotify.heroic.HeroicService.

Bypassing Validation

To bypass automatic formatting and checkstyle validation you can use the following stanza:

// @formatter:off
final List<String> list = ImmutableList.of(
   "Welcome to...",
   "... The Wild West"
);
// @formatter:on

To bypass a FindBugs error, you should use the @SuppressFBWarnings annotation.

import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;

@SuppressFBWarnings(value = "FINDBUGS_ERROR_CODE", justification = "I Know Better Than FindBugs")
public class IKnowBetterThanFindbugs {
    // ...
}

HeroicShell

Heroic comes with a shell that contains many useful tasks. These can be run either in a readline-based shell with some basic completions and history, or standalone.

You can use the following helper script to run the shell directly from the project.

$ tools/heroic-shell [opts]

There are a few interesting options available, most notably --connect, which allows the shell to connect to a remote Heroic instance.

See -h for a full listing of options.

You can run individual tasks in standalone mode, which gives you a few more options (like redirecting output), as follows.

$ tools/heroic-shell <heroic-options> -- com.spotify.heroic.shell.task.<task-name> <task-options>

There are also profiles that can be activated with the -P <profile> switch. Available profiles are listed in --help.

Repackaged Dependencies

These are third-party dependencies that have to be repackaged to avoid binary incompatibilities with other dependencies.

Every time these are upgraded, they must be inspected for new conflicts. The easiest way to do this is to build the project and look at the warnings for the shaded jar.

$> ./gradlew clean assemble
...
[WARNING] foo-3.5.jar, foo-4.5.jar define 10 overlapping classes:
[WARNING]   - com.foo.ConflictingClass
...

This would indicate that there is a package called foo with overlapping classes.

You can find the culprit using the dependencies task.

$> ./gradlew <module>:dependencies
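
To make "overlapping classes" concrete: two jars overlap when they both contain a class file at the same path. The following is a rough sketch of that check using only the JDK, offered as an illustration rather than a replacement for the build warnings. The class OverlappingClasses and its methods are hypothetical names and are not part of Heroic or its build.

```java
import java.io.IOException;
import java.util.Collection;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper that lists class files present in two jars; not a Heroic class. */
public final class OverlappingClasses {
    /** Returns the sorted .class entry names that occur in both collections. */
    public static Set<String> overlap(Collection<String> first, Collection<String> second) {
        Set<String> result = new TreeSet<>();
        for (String name : first) {
            // Only class files matter; shared resources such as manifests are ignored.
            if (name.endsWith(".class") && second.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }

    /** Collects the name of every entry in the given jar file. */
    public static Set<String> entries(String jarPath) throws IOException {
        Set<String> names = new TreeSet<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> all = jar.entries();
            while (all.hasMoreElements()) {
                names.add(all.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: OverlappingClasses <first.jar> <second.jar>");
            return;
        }
        for (String name : overlap(entries(args[0]), entries(args[1]))) {
            System.out.println(name);
        }
    }
}
```

Running it against the two jars named in a warning prints one line per class path that both jars define.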
