A curated collection of papers on streaming algorithms
anomaly_detection
distinct_value_counting
distribution_functions
duplicate_detection
gradient_methods
summary_statistics
Finding Frequent Items in Data Streams.pdf
README.md


streaming-papers

A curated collection of papers on streaming algorithms

Please Contribute

If you have papers you want to add, make a pull request. Categories are wide open right now, so just put them in a folder that makes sense to you and we'll figure it out.

Distinct Value Counting

distinct_value_counting/Probabilistic_Multiplicity_Counting-Lieven2010a.pdf

Known Implementations:

===

Data Streams as Random Permutations: the Distinct Element Problem - Helmi, Lumbroso, Martinez, Viola

distinct_value_counting/data_streams_as_random_permutations.pdf

Known Implementations:

Distribution Functions

===

Dynamic Histograms: Capturing Evolving Data Sets - Donko Donjerkovic, Yannis Ioannidis, Raghu Ramakrishnan

distribution_functions/dynamic-histograms.pdf

Known Implementations:

===

The P² Algorithm for Dynamic Calculation of Quantiles and Histograms Without Storing Observations - Raj Jain, Imrich Chlamtac

distribution_functions/psqr.pdf

Known Implementations:

===

Effective Computation of Biased Quantiles over Data Streams - Cormode, Korn, Muthukrishnan, Srivastava

distribution_functions/bquant.pdf

Known Implementations:

===

Summary Statistics

Formulas for Robust, One-Pass Parallel Computation of Covariances and Arbitrary-Order Statistical Moments - Philippe Pébay

summary_statistics/one_pass_moments_Pebay.pdf

Known Implementations: Kitware/VTK (mirror) - C++ (check in filters/statistics/vtkStatisticsAlgorithm.h)
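
For readers new to the topic, here is a minimal Python sketch of the general idea behind one-pass, mergeable moment computation. It covers only the mean and variance, using the widely known single-observation update and pairwise merge formulas. It is not taken from the paper or from VTK, and the names `RunningMoments` and `summarize` are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RunningMoments:
    """Count, mean, and sum of squared deviations (m2) of a stream.

    Illustrative only: these names are assumptions, not an existing API.
    """
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def update(self, x: float) -> None:
        # Fold one observation into the summary without storing it.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningMoments") -> "RunningMoments":
        # Combine two partial summaries, e.g. from parallel workers.
        n = self.n + other.n
        if n == 0:
            return RunningMoments()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningMoments(n, mean, m2)

    @property
    def variance(self) -> float:
        # Unbiased sample variance; NaN for fewer than two observations.
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


def summarize(values):
    # Build a summary from any iterable in a single pass.
    acc = RunningMoments()
    for v in values:
        acc.update(v)
    return acc
```

Summaries built on separate chunks of a stream can be combined with `merge`, and the result should agree (up to floating-point rounding) with a single pass over all the data.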
