Apache Pinot (Incubating) - A realtime distributed OLAP datastore
akshayrai [TE] Enable Piwik tracking ref: #3839 (#3846)
* [TE] Enable Piwik tracking ref: #3839wq

* Fix build
Latest commit a3dd11b Feb 15, 2019
Type Name Latest commit message Commit time
config Add Pinot code style (#3705) Jan 18, 2019
contrib/pinot-druid-benchmark Auto-reformat all java source files (#3739) Jan 29, 2019
docs Add headers for docs (#3840) Feb 15, 2019
licenses-binary Update LICENSE and NOTICE for jersey version update (#3791) Feb 7, 2019
licenses Update LICENSE and NOTICE files (#3722) Jan 31, 2019
pinot-api Update pom files for preparing Apache release (#3772) Feb 1, 2019
pinot-azure-filesystem Fix wrong Pinot versions (0.016->0.1.0-SNAPSHOT) (#3778) Feb 1, 2019
pinot-broker Update pom files for preparing Apache release (#3772) Feb 1, 2019
pinot-common Refactor periodic task (#3819) Feb 13, 2019
pinot-controller Clarify all methods in PinotFS (#3836) Feb 14, 2019
pinot-core Refactor SegmentNameGenerators and integrate them into Hadoop (#3821) Feb 14, 2019
pinot-distribution Remove temp files from maven-release plugin from source tarbell (#3845) Feb 15, 2019
pinot-filesystem Clarify all methods in PinotFS (#3836) Feb 14, 2019
pinot-hadoop-filesystem Update pom files for preparing Apache release (#3772) Feb 1, 2019
pinot-hadoop Bug fix in SegmentCreationJob and SegmentCreationMapper (#3844) Feb 15, 2019
pinot-integration-tests Refactor periodic task (#3819) Feb 13, 2019
pinot-minion Update pom files for preparing Apache release (#3772) Feb 1, 2019
pinot-perf Add headers for docs (#3840) Feb 15, 2019
pinot-server Fix some indentation for the pom files (#3797) Feb 6, 2019
pinot-tools Refactor SegmentNameGenerators and integrate them into Hadoop (#3821) Feb 14, 2019
pinot-transport Fix some indentation for the pom files (#3797) Feb 6, 2019
thirdeye [TE] Enable Piwik tracking ref: #3839 (#3846) Feb 15, 2019
.codecov.yml Update license-maven-plugin setting to correctly exclude files (#3691) Jan 15, 2019
.codecov_bash Fix codecov ignore coverage for certain files (#1054) Feb 14, 2017
.gitignore Update license-maven-plugin setting to correctly exclude files (#3691) Jan 15, 2019
.travis.yml [TE] build - cut down CI build time (#3401) Oct 31, 2018
.travis_install.sh Update license header (#3664) Jan 8, 2019
.travis_test.sh Update license header (#3664) Jan 8, 2019
DISCLAIMER Update pom files for preparing Apache release (#3772) Feb 1, 2019
HEADER Cleaning up the license-maven-plugin (#3706) Jan 17, 2019
LICENSE Update LICENSE and NOTICE files (#3722) Jan 31, 2019
LICENSE-binary Update LICENSE and NOTICE for jersey version update (#3791) Feb 7, 2019
NOTICE Update pom files for preparing Apache release (#3772) Feb 1, 2019
NOTICE-binary Update LICENSE and NOTICE for jersey version update (#3791) Feb 7, 2019
README.md Add headers for docs (#3840) Feb 15, 2019
pom.xml Add headers for docs (#3840) Feb 15, 2019

README.md

Apache Pinot (incubating)

Join the chat at https://gitter.im/linkedin/pinot

Apache Pinot is a real-time distributed OLAP datastore that delivers scalable, low-latency analytics. It can ingest data from offline sources (such as Hadoop and flat files) as well as online sources (such as Kafka). Pinot is designed to scale horizontally.

These presentations on Pinot give an overview of Pinot:

Looking for the ThirdEye anomaly detection and root-cause analysis platform? Check out the Pinot/ThirdEye project

Key Features

  • A column-oriented database with various compression schemes, such as run-length and fixed-bit-length encoding
  • Pluggable indexing technologies - Sorted Index, Bitmap Index, Inverted Index
  • Ability to optimize query/execution plan based on query and segment metadata
  • Near-real-time ingestion from Kafka and batch ingestion from Hadoop
  • SQL-like language that supports selection, aggregation, filtering, group by, order by, and distinct queries on fact data
  • Support for multivalued fields
  • Horizontally scalable and fault tolerant
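
To make the compression and indexing terms above more concrete, here is a minimal Python sketch of run-length encoding, fixed-width dictionary ids, and an inverted index over a single column. It is only an illustration of the general techniques, not Pinot's implementation; the function names and the sample `country` column are assumptions made for this example.

```python
from collections import defaultdict
from math import ceil, log2


def run_length_encode(values):
    """Collapse consecutive repeats into [value, run_length] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


def dictionary_encode(values):
    """Replace each value with its id in a sorted dictionary of distinct values."""
    dictionary = sorted(set(values))
    ids = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [ids[value] for value in values]


def bits_per_value(cardinality):
    """Bits needed to store one dictionary id at a fixed width."""
    return max(1, ceil(log2(cardinality)))


def build_inverted_index(values):
    """Map each distinct value to the row ids where it occurs."""
    index = defaultdict(list)
    for row_id, value in enumerate(values):
        index[value].append(row_id)
    return dict(index)


# Hypothetical column, sorted so that equal values form long runs.
country = ["CA", "CA", "US", "US", "US", "IN"]
print(run_length_encode(country))
dictionary, ids = dictionary_encode(country)
print(dictionary, ids, bits_per_value(len(dictionary)))
print(build_inverted_index(country)["US"])
```

Run-length encoding pays off when a column has long runs of equal values, fixed-width ids keep every entry the same size so a row can be located by offset, and an inverted index lets a filter such as `country = 'US'` jump straight to the matching rows.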

The design choices made to achieve these goals impose certain limitations on Pinot:

  • Pinot is not a replacement for a database, i.e. it cannot be used as a source-of-truth store and cannot mutate data
  • Pinot is not a replacement for a search engine, i.e. full-text search and relevance ranking are not supported
  • A query cannot span multiple tables

Pinot works very well for querying time series data with many dimensions and metrics. For example, it can answer analytical questions about profile views or ad campaign performance, such as who viewed a given profile in recent weeks or how many ads were clicked per campaign.
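
The "ads clicked per campaign" question above has a simple shape: filter on a time column, group by a dimension, and aggregate a metric. The Python sketch below runs that computation over a few in-memory rows purely to illustrate the query pattern. The row layout, the column names (`day`, `campaign`, `clicks`), and the function name are hypothetical, and this is not how Pinot executes queries.

```python
from collections import defaultdict

# Hypothetical fact rows: a time column (days since epoch), a dimension, a metric.
AD_EVENTS = [
    {"day": 17940, "campaign": "spring_sale", "clicks": 3},
    {"day": 17941, "campaign": "spring_sale", "clicks": 5},
    {"day": 17941, "campaign": "new_users", "clicks": 2},
    {"day": 17900, "campaign": "new_users", "clicks": 7},  # outside the window below
]


def clicks_per_campaign(rows, first_day, last_day):
    """Filter on the time column, group by the dimension, and sum the metric."""
    totals = defaultdict(int)
    for row in rows:
        if first_day <= row["day"] <= last_day:
            totals[row["campaign"]] += row["clicks"]
    return dict(totals)


print(clicks_per_campaign(AD_EVENTS, first_day=17935, last_day=17941))
```

In Pinot, the same question would be expressed in its SQL-like query language against a single table, with the filter, group by, and aggregation evaluated across distributed segments.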

Instructions to build Pinot

More detailed instructions can be found in the Quick Demo section of the documentation.

# Clone the repo
$ git clone https://github.com/apache/incubator-pinot.git
$ cd incubator-pinot

# Build Pinot
$ mvn clean install -DskipTests -Pbin-dist

# Run Quick Demo
$ cd pinot-distribution/target/apache-pinot-incubating-<version>-SNAPSHOT-bin
$ bin/quick-start-offline.sh

Getting Involved

Documentation

Check out the Pinot documentation for a complete description of Pinot's features.

License

Apache Pinot is licensed under the Apache License, Version 2.0.