Ruby client for Druid using the Kafka Indexing Service

druiddb-ruby


This documentation is a quick-start guide, not a comprehensive list of all available methods and configuration options. Please look through the source for more information; good places to start are DruidDB::Client and the DruidDB::Query modules, which expose most of the methods on the client.

This guide assumes significant knowledge of Druid. For more information, see http://druid.io/docs/latest/design/index.html

What Does it Do

druiddb-ruby provides a client for your Ruby application to push data to Druid through the Kafka Indexing Service. The client also provides an interface for querying and performing management tasks. It automatically finds and connects to Kafka and the Druid nodes through ZooKeeper, so you only need to provide the ZooKeeper host.

Install

$ gem install druiddb

Usage

Creating a Client

require 'druiddb'

client = DruidDB::Client.new

Note: There are many configuration options. Take a look at DruidDB::Configuration for more details.

Writing Data

Kafka Indexing Service

This gem leverages the Kafka Indexing Service for ingesting data. The gem pushes datapoints onto Kafka topics (typically named after the datasource). You can also use the gem to upload an ingestion spec, which is needed for Druid to consume the Kafka topic.

This repo contains a docker-compose.yml build that may help bootstrap development with Druid and the Kafka Indexing Service. It's what we use for integration testing.

Submitting an Ingestion Spec

path = 'path/to/spec.json'
client.submit_supervisor_spec(path)
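
If you do not have a spec file yet, one can be generated from Ruby. The sketch below writes a minimal Kafka supervisor spec to disk. The field layout follows the Kafka Indexing Service documentation for older Druid releases, and build_supervisor_spec, the datasource name, the dimension, the metric, and the broker address are all placeholders that are not part of this gem, so check every field against the Druid version you run.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical helper (not part of druiddb-ruby): builds a minimal Kafka
# supervisor spec as a Hash. Every value here is a placeholder.
def build_supervisor_spec(datasource:, topic:, bootstrap_servers:)
  {
    type: 'kafka',
    dataSchema: {
      dataSource: datasource,
      parser: {
        type: 'string',
        parseSpec: {
          format: 'json',
          timestampSpec: { column: 'timestamp', format: 'iso' },
          dimensionsSpec: { dimensions: ['foo'] }
        }
      },
      metricsSpec: [{ type: 'longSum', name: 'units', fieldName: 'units' }],
      granularitySpec: {
        type: 'uniform',
        segmentGranularity: 'DAY',
        queryGranularity: 'NONE'
      }
    },
    ioConfig: {
      topic: topic,
      consumerProperties: { 'bootstrap.servers' => bootstrap_servers }
    }
  }
end

spec = build_supervisor_spec(
  datasource: 'foo',
  topic: 'foo',
  bootstrap_servers: 'localhost:9092' # assumed broker address
)

# Write the spec where the client can read it from.
path = File.join(Dir.tmpdir, 'foo_supervisor_spec.json')
File.write(path, JSON.pretty_generate(spec))
```

The resulting path can then be passed to client.submit_supervisor_spec(path).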

Writing Datapoints

topic_name = 'foo'
datapoint = {
  timestamp: Time.now.utc.iso8601,
  foo: 'bar',
  units: 1
}
client.write_point(topic_name, datapoint)
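
Every datapoint needs a timestamp that Druid can parse, so it can help to build datapoints through one small function. A minimal sketch in plain Ruby: build_datapoint is a hypothetical helper, not part of the gem, and printing the event as JSON is only an assumption about how it looks on the topic.

```ruby
require 'json'
require 'time'

# Hypothetical helper: puts a UTC ISO 8601 timestamp in front of the
# event fields. A :timestamp key in fields overrides the generated one.
def build_datapoint(fields, at: Time.now)
  { timestamp: at.getutc.iso8601 }.merge(fields)
end

datapoint = build_datapoint({ foo: 'bar', units: 1 }, at: Time.utc(2017, 1, 2, 3, 4, 5))
puts JSON.generate(datapoint)
```

The returned hash has the same shape as the datapoint above and can be passed to client.write_point unchanged.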

Reading Data

Querying

client.query(
  queryType: 'timeseries',
  dataSource: 'foo',
  granularity: 'day',
  # Time#advance comes from ActiveSupport; both ends must be ISO 8601 strings.
  intervals: Time.now.utc.advance(days: -30).iso8601 + '/' + Time.now.utc.iso8601,
  aggregations: [{ type: 'longSum', name: 'baz', fieldName: 'baz' }]
)

The query method POSTs the query to Druid. For information on querying Druid, see http://druid.io/docs/latest/querying/querying.html. The method is intentionally simple so that all current, and hopefully all future, features of the Druid query language work without updating the gem.
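
Druid expects each interval as an ISO 8601 start and end separated by a slash. As a sketch, the interval string can also be built with plain Ruby when ActiveSupport is not loaded. last_days_interval is a hypothetical helper, not part of the gem.

```ruby
require 'time'

# Hypothetical helper: returns "<start>/<end>" covering the last `days`
# days, with both ends formatted as ISO 8601 in UTC.
def last_days_interval(days, now: Time.now)
  finish = now.getutc
  start = finish - (days * 86_400) # 86_400 seconds per day
  "#{start.iso8601}/#{finish.iso8601}"
end

puts last_days_interval(30, now: Time.utc(2017, 1, 31))
# prints 2017-01-01T00:00:00Z/2017-01-31T00:00:00Z
```

The result can be used directly as the intervals value in the query hash.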

Fill Empty Intervals

Currently, Druid will not fill empty intervals for which there are no points. Until Druid handles this more efficiently, use the experimental fill_value feature in your query. This ensures you get a result for every interval in intervals.

This has only been tested with 'timeseries' and single-dimension 'groupBy' queries with simple granularities.

client.query(
  queryType: 'timeseries',
  dataSource: 'foo',
  granularity: 'day',
  # Time#advance comes from ActiveSupport; both ends must be ISO 8601 strings.
  intervals: Time.now.utc.advance(days: -30).iso8601 + '/' + Time.now.utc.iso8601,
  aggregations: [{ type: 'longSum', name: 'baz', fieldName: 'baz' }],
  fill_value: 0
)
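
To make the intended result concrete, here is a plain-Ruby sketch of filling a day-granularity timeseries result. It is an illustration only and not the gem's implementation. fill_empty_days is a hypothetical helper, and the sample rows mimic the timestamp/result row shape that Druid returns for timeseries queries.

```ruby
require 'time'

# Hypothetical helper: returns one row per UTC day in [start, finish).
# Days without a row from Druid get fill_value for every aggregation name.
def fill_empty_days(rows, start:, finish:, names:, fill_value: 0)
  # Index the existing rows by calendar day (YYYY-MM-DD).
  by_day = rows.map { |row| [Time.iso8601(row['timestamp']).getutc.strftime('%F'), row] }.to_h

  filled = []
  day = start.getutc
  while day < finish
    empty_row = {
      'timestamp' => day.iso8601,
      'result' => names.map { |name| [name, fill_value] }.to_h
    }
    filled << (by_day[day.strftime('%F')] || empty_row)
    day += 86_400 # advance one day
  end
  filled
end

rows = [
  { 'timestamp' => '2017-01-01T00:00:00.000Z', 'result' => { 'baz' => 4 } },
  { 'timestamp' => '2017-01-03T00:00:00.000Z', 'result' => { 'baz' => 7 } }
]

filled = fill_empty_days(
  rows,
  start: Time.utc(2017, 1, 1),
  finish: Time.utc(2017, 1, 4),
  names: ['baz']
)
filled.each { |row| puts "#{row['timestamp']} baz=#{row['result']['baz']}" }
```

Here the missing day, 2017-01-02, is returned with baz set to the fill value, while the two rows that Druid returned are passed through untouched.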

Management

List datasources.

client.list_datasources

List supervisor tasks.

client.supervisor_tasks

Development

Docker Compose

This project uses docker-compose to provide a development environment.

  1. git clone the project
  2. cd into project
  3. docker-compose up - this will download necessary images and run all dependencies in the foreground.

When changes are made to the project, rebuild the Docker image with:

$ docker build -t <some_tag> .

Where <some_tag> is something like druiddb-ruby.

To interact with the newly changed project, run it with:

$ docker run -it --network=druiddbruby_druiddb <some_tag> <some_command>

Where <some_command> is a shell command that can be run on the Docker image (e.g. bash or anything in the bin folder).

Metabase

Viewing data in the database can be a bit annoying. A tool like Metabase makes this much easier, and it is what I personally use when developing.

Testing

Tests run inside the docker-compose environment.

  1. docker-compose up
  2. docker run -it --network=druiddbruby_druiddb <some_tag> bin/run_tests.sh

License

The gem is available as open source under the terms of the MIT License.