A set of real-time stream processing algorithms that can be used by big data streaming platforms

pranab/hoidla

Introduction

A set of reusable real-time streaming algorithms for big data. They can be used from Spark Streaming, Storm, or any other stream computation framework.

Philosophy

  • Plain Java API that can be used from any stream computation framework

Blogs

The following blog posts of mine are a good source of details. They are the only source of detailed documentation.

Solution

  • Probabilistic frequent item counting with sketches and count-based algorithms
  • Probabilistic cardinality (unique item count)
  • Probabilistic set inclusion
  • Different sampling methods
  • Windowing including simple stats
  • Pattern detection
  • Event cluster detection
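
To give a feel for the first item in the list, here is a minimal count-min sketch in plain Java. It is only an illustrative sketch and is not taken from this project: the class name `CountMinSketchExample`, its constructor parameters, and the hash mixing are all assumptions made for this example, and the actual hoidla classes may look quite different. The idea is that each item is hashed into one counter per row, and the estimate is the minimum of those counters, so the result can overcount but never undercount.

```java
import java.util.Random;

/** Illustrative count-min sketch. Hypothetical class, not part of the hoidla API. */
final class CountMinSketchExample {
    private final long[][] counts; // depth rows, each with width counters
    private final int[] seeds;     // one hash seed per row
    private final int width;

    CountMinSketchExample(int depth, int width, long seed) {
        if (depth <= 0 || width <= 0) {
            throw new IllegalArgumentException("depth and width must be positive");
        }
        this.counts = new long[depth][width];
        this.seeds = new int[depth];
        this.width = width;
        Random rnd = new Random(seed);
        for (int row = 0; row < depth; row++) {
            seeds[row] = rnd.nextInt();
        }
    }

    /** Maps an item to a counter index in the given row using a seeded bit mixer. */
    private int bucket(String item, int row) {
        int h = item.hashCode() ^ seeds[row];
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return Math.floorMod(h, width);
    }

    /** Records count occurrences of item by incrementing one counter per row. */
    void add(String item, long count) {
        for (int row = 0; row < counts.length; row++) {
            counts[row][bucket(item, row)] += count;
        }
    }

    /** Returns an upper-biased frequency estimate: the minimum counter across rows. */
    long estimate(String item) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < counts.length; row++) {
            min = Math.min(min, counts[row][bucket(item, row)]);
        }
        return min;
    }

    public static void main(String[] args) {
        CountMinSketchExample sketch = new CountMinSketchExample(4, 1024, 42L);
        String[] stream = {"a", "b", "a", "c", "a", "b"};
        for (String item : stream) {
            sketch.add(item, 1);
        }
        // The estimate is never below the true count, so this prints a value of at least 3.
        System.out.println("estimate(a) = " + sketch.estimate("a"));
    }
}
```

A wider table lowers the chance of collisions and more rows lower the chance that every row collides at once, so both parameters trade memory for accuracy. Because the class has no framework dependencies, an object like this could be held in the state of a Spark Streaming or Storm operator.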

Getting started

The project's resource directory has various tutorial documents for the use cases described in the blog posts.

Help

Please feel free to email me at pkghosh99@gmail.com
