Cross-platform, multi-language implementation of multiple streaming percentile algorithms
CMake
cpp
doc
js
.gitignore
.travis.yml
CHANGELOG.md
CMakeLists.txt
CONTRIBUTING.md
LICENSE
README.md
build.bat
build.sh

streaming-percentiles

About the Library

This is a cross-platform library that implements several percentile algorithms for streams of data. These algorithms calculate approximate percentiles (e.g. the 50th or 95th percentile) in a single pass over a data set. They are particularly useful for very large data sets, for extremely high-throughput systems, and for near-real-time use cases.

The library supports the following languages:

  • C++
  • JavaScript

The library implements several streaming percentile algorithms, including Greenwald-Khanna, which the usage examples in this document rely on.

For more background on streaming percentiles, see Calculating Percentiles on Streaming Data.
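For contrast, here is a minimal sketch of the exact approach that streaming algorithms avoid: keep every value, sort, and pick by rank. The helper name exactQuantile and the nearest-rank definition are assumptions made for illustration; neither is part of this library.

```javascript
// Exact nearest-rank percentile. Needs the whole data set in memory.
// exactQuantile is a hypothetical helper, not part of streaming-percentiles.
function exactQuantile(values, phi) {
    if (values.length === 0)
        throw new Error('no data');
    // Copy before sorting so the caller's array is left untouched.
    var sorted = values.slice().sort(function (a, b) { return a - b; });
    // Nearest rank: the smallest value with at least phi * n values <= it.
    var rank = Math.max(1, Math.ceil(phi * sorted.length));
    return sorted[rank - 1];
}

var data = [];
for (var i = 1; i <= 100; ++i)
    data.push(i);
var exactP50 = exactQuantile(data, 0.5);  // 50
var exactP95 = exactQuantile(data, 0.95); // 95
```

This costs memory proportional to the number of values plus a sort, which is what becomes impractical for very large or unbounded streams.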

Obtaining the Library

The current version of the library is 3.0.0, and it was released December 21, 2018.

C++

You can download the latest release of the library from the streaming-percentiles latest release page.

JavaScript

If you use NPM, run npm install streaming-percentiles. Otherwise, download the latest release of the library from the streaming-percentiles latest release page.

For convenience, you can also load the latest release JS directly from sengelha.github.io; the Browser example under Using the Library shows the script tag.

Source Code

You can download the latest release's source code from the streaming-percentiles latest release page.

See CONTRIBUTING.md for instructions on how to build the release from source.

Historical Releases

Historical releases may be downloaded from the streaming-percentiles release page.

Using the Library

C++

Here's an example of how to use the Greenwald-Khanna streaming percentile algorithm from C++:

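#include <cstdlib> // rand
#include <stmpct/gk.hpp>

int main() {
    double epsilon = 0.1;
    stmpct::gk<double> g(epsilon);
    // Feed 1000 pseudo-random values into the summary, one at a time
    for (int i = 0; i < 1000; ++i)
        g.insert(rand());
    double p50 = g.quantile(0.5); // Approx. median
    double p95 = g.quantile(0.95); // Approx. 95th percentile
    return 0;
}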

JavaScript

Node.js

Here's an example of how to use the library from Node.js:

var sp = require('streaming-percentiles');

var epsilon = 0.1;
var g = new sp.GK(epsilon);
for (var i = 0; i < 1000; ++i)
    g.insert(Math.random());
var p50 = g.quantile(0.5); // Approx. median
var p95 = g.quantile(0.95); // Approx. 95th percentile
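One way to reason about the epsilon parameter is to measure how far a returned value's rank is from the target rank. The sketch below assumes that epsilon carries its usual Greenwald-Khanna meaning, namely that the rank of the returned value lies within epsilon * n of phi * n; this README does not state the guarantee itself. The helper rankError is hypothetical and not part of the library, and the candidate values are hard-coded stand-ins rather than real library output.

```javascript
// rankError is a hypothetical helper, not part of streaming-percentiles.
// It returns the distance, as a fraction of n, between the target rank
// phi * n and the ranks that `value` occupies in `values`.
function rankError(values, value, phi) {
    var n = values.length;
    var below = 0;     // count of elements strictly less than value
    var atOrBelow = 0; // count of elements less than or equal to value
    for (var i = 0; i < n; ++i) {
        if (values[i] < value) ++below;
        if (values[i] <= value) ++atOrBelow;
    }
    var target = phi * n;
    // value occupies ranks (below + 1) through atOrBelow.
    if (target < below + 1) return (below + 1 - target) / n;
    if (target > atOrBelow) return (target - atOrBelow) / n;
    return 0;
}

var sample = [];
for (var i = 1; i <= 1000; ++i)
    sample.push(i);

var epsilon = 0.1;
// Stand-in values for what an approximate median query might return.
var errNear = rankError(sample, 530, 0.5); // 30 ranks away from 500
var errFar = rankError(sample, 700, 0.5);  // 200 ranks away from 500
var nearOk = errNear <= epsilon; // true under the assumed guarantee
var farOk = errFar <= epsilon;   // false: outside the assumed bound
```

On small data sets, a check like this can be run against the value returned by g.quantile to see whether a chosen epsilon is tight enough for the use case.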

Browser

Here's an example of how to use the library from a browser. Note that the default module name is streamingPercentiles:

<script src="//sengelha.github.io/streaming-percentiles/streamingPercentiles.v1.min.js"></script>
<script>
var epsilon = 0.1;
var g = new streamingPercentiles.GK(epsilon);
for (var i = 0; i < 1000; ++i)
    g.insert(Math.random());
var p50 = g.quantile(0.5);
</script>

API Reference

See doc/api_reference/ for detailed API reference documentation.

License

This project is licensed under the MIT License. See LICENSE for more information.

Contributing

If you are interested in contributing to the library, please see CONTRIBUTING.md.