streamhist-cpp

A header-only C++ port of streamhist.

Overview

This project is a C++ port of the streamhist library, which is written in Python. It implements the streaming, one-pass histograms described in Ben-Haim's Streaming Parallel Decision Trees <http://jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf>. The histograms approximate the underlying dataset. Bins have no preset size: as values stream into the histogram, bins are added and merged dynamically so that the bin count stays bounded. One particularly nice feature of streaming histograms is that they can approximate quantiles without sorting (or even individually storing) the values. They can also be used for learning, visualization, discretization, or analysis. Histograms can be built independently and then merged, which makes them convenient for parallel and distributed algorithms.
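To make the add-and-merge behaviour concrete, here is a minimal, self-contained sketch of the idea. It is not this library's API: `SketchHistogram`, `Bin`, `update`, `merge`, `trim`, and `quantile` are hypothetical names chosen for illustration. The quantile routine is a simplified linear interpolation between bin centroids, not the trapezoid-based procedure from the paper.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Illustrative sketch only; every name here is hypothetical and is
// NOT taken from the streamhist-cpp headers.
struct Bin {
    double value;  // centroid of the points absorbed into this bin
    double count;  // number of points absorbed into this bin
};

class SketchHistogram {
public:
    explicit SketchHistogram(std::size_t max_bins) : max_bins_(max_bins) {
        assert(max_bins >= 2);
    }

    // Add a point as its own bin (or bump an exact match), then trim.
    void update(double x, double count = 1.0) {
        auto pos = std::lower_bound(
            bins_.begin(), bins_.end(), x,
            [](const Bin& b, double v) { return b.value < v; });
        if (pos != bins_.end() && pos->value == x) {
            pos->count += count;
        } else {
            bins_.insert(pos, Bin{x, count});
        }
        trim();
    }

    // Fold another histogram in by re-inserting its bins as weighted
    // points. `other` must be a different object from `*this`.
    void merge(const SketchHistogram& other) {
        for (const Bin& b : other.bins_) update(b.value, b.count);
    }

    double total() const {
        double sum = 0.0;
        for (const Bin& b : bins_) sum += b.count;
        return sum;
    }

    // Simplified quantile estimate (an assumption of this sketch): half
    // of each bin's mass is placed on either side of its centroid, and
    // the result is interpolated linearly between adjacent centroids.
    double quantile(double q) const {
        assert(!bins_.empty() && q >= 0.0 && q <= 1.0);
        const double target = q * total();
        double before = 0.0;  // mass of all bins left of the current one
        double prev_cum = 0.0, prev_val = 0.0;
        for (std::size_t i = 0; i < bins_.size(); ++i) {
            const double cum = before + bins_[i].count / 2.0;
            if (target <= cum) {
                if (i == 0) return bins_[0].value;
                const double frac = (target - prev_cum) / (cum - prev_cum);
                return prev_val + frac * (bins_[i].value - prev_val);
            }
            before += bins_[i].count;
            prev_cum = cum;
            prev_val = bins_[i].value;
        }
        return bins_.back().value;
    }

    const std::vector<Bin>& bins() const { return bins_; }

private:
    // While over the limit, merge the two closest adjacent bins into
    // one bin located at their count-weighted mean.
    void trim() {
        while (bins_.size() > max_bins_) {
            std::size_t best = 0;
            double best_gap = std::numeric_limits<double>::infinity();
            for (std::size_t i = 0; i + 1 < bins_.size(); ++i) {
                const double gap = bins_[i + 1].value - bins_[i].value;
                if (gap < best_gap) { best_gap = gap; best = i; }
            }
            Bin& a = bins_[best];
            const Bin& b = bins_[best + 1];
            const double sum = a.count + b.count;
            a.value = (a.value * a.count + b.value * b.count) / sum;
            a.count = sum;
            bins_.erase(bins_.begin() + static_cast<std::ptrdiff_t>(best) + 1);
        }
    }

    std::size_t max_bins_;
    std::vector<Bin> bins_;  // kept sorted by centroid value
};
```

In this sketch, a histogram limited to 3 bins that receives 1, 2, 3, and 10 merges the closest pair (1 and 2) into a single bin at 1.5 with count 2. Keeping the bins in a sorted vector makes each trim a linear scan, which is simple but not tuned for large bin limits.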

This C++ version is a port of the Python streamhist library, and its implementation of the algorithm combines ideas and code from BigML <https://bigml.com>'s Streaming Histograms for Clojure/Java <https://github.com/bigmlcom/histogram> and VividCortex <https://vividcortex.com>'s Streaming approximate histograms in Go <https://github.com/VividCortex/gohistogram>.

License

  • Copyright © 2020 Gunter Spöcker guntersp0@gmail.com
  • Copyright © 2015 Carson Farmer carsonfarmer@gmail.com
  • Copyright © 2013 VividCortex
  • All rights reserved. MIT Licensed.
  • Copyright © 2013 BigML
  • Licensed under the Apache License, Version 2.0
