Stream Processing

There are two ways to run the project (sbt).

Bloom Filters

Each item of the input collection is hashed (MurmurHash3) into a hash table that represents the input data. We can then check whether a given item is in the collection by looking it up in that table. This saves memory compared with storing the items themselves, at the cost of occasional false positives.

run BloomFilter <sizeOfHashTable: Int> <"seeds"> <elementsToCheck>
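The Bloom filter idea described above can be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: the names `BloomFilterSketch`, `add`, and `mightContain` are hypothetical, and it assumes that each seed selects one MurmurHash3 hash function and that items are strings.

```scala
import scala.collection.mutable
import scala.util.hashing.MurmurHash3

// Hypothetical sketch: class and method names are illustrative only.
final class BloomFilterSketch(size: Int, seeds: Seq[Int]) {
  require(size > 0, "size must be positive")

  // The "hash table": one bit per slot.
  private val bits = new mutable.BitSet(size)

  // One table position per seed; floorMod keeps the index non-negative.
  private def positions(item: String): Seq[Int] =
    seeds.map(seed => Math.floorMod(MurmurHash3.stringHash(item, seed), size))

  // Mark every position of the item as occupied.
  def add(item: String): Unit = positions(item).foreach(bits += _)

  // false: the item was definitely never added.
  // true: the item was probably added (false positives are possible).
  def mightContain(item: String): Boolean = positions(item).forall(bits.contains)
}

object BloomFilterDemo {
  def main(args: Array[String]): Unit = {
    val filter = new BloomFilterSketch(size = 1024, seeds = Seq(1, 2, 3))
    Seq("apple", "banana", "cherry").foreach(filter.add)
    println(filter.mightContain("banana")) // added items are always reported as present
    println(filter.mightContain("durian")) // usually false, but may be a false positive
  }
}
```

A larger table or a better-chosen number of seeds lowers the false-positive rate; the sketch never reports an added item as absent.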

Misra-Gries Algorithm

A frequency algorithm that finds the elements of the stream that occur more than streamLength/k times, where k is the parameter of the algorithm. At most k-1 elements are returned.

run MirsaGries <k: Int>
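A minimal sketch of the counter-based frequency algorithm, under the assumption that the stream is available as an `Iterator`. The object name `MisraGriesSketch` and the method `candidates` are hypothetical and are not taken from the repository.

```scala
import scala.collection.mutable

// Hypothetical sketch: names are illustrative only.
object MisraGriesSketch {
  // Returns at most k-1 candidate elements with their counters.
  // Each counter is a lower bound on the element's true frequency.
  def candidates[A](stream: Iterator[A], k: Int): Map[A, Int] = {
    require(k >= 2, "k must be at least 2")
    val counters = mutable.Map.empty[A, Int]
    stream.foreach { x =>
      if (counters.contains(x)) {
        counters(x) += 1              // already tracked: count it
      } else if (counters.size < k - 1) {
        counters(x) = 1               // free counter available: start tracking
      } else {
        // No free counter: decrement every counter, drop those that reach zero.
        counters.keys.toList.foreach { key =>
          val c = counters(key) - 1
          if (c == 0) counters.remove(key) else counters(key) = c
        }
      }
    }
    counters.toMap
  }

  def main(args: Array[String]): Unit = {
    val stream = Iterator("a", "b", "a", "c", "a", "d", "a")
    println(candidates(stream, k = 3))
  }
}
```

Every element that occurs more than streamLength/k times is guaranteed to be among the candidates, but a candidate is not guaranteed to be that frequent. A second pass over the stream is needed to confirm exact counts.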

HyperLogLog Algorithm

Estimates the number of distinct elements in the stream. (Parameter b is the number of leading bits of each hash that are used.)

run HyperLogLog <b: Int>
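A rough sketch of how such an estimator can work, assuming that the first b bits of a 32-bit MurmurHash3 hash select one of 2^b registers and that items are strings. The class name `HyperLogLogSketch` is hypothetical, and the large-range correction of the original algorithm is left out for brevity.

```scala
import scala.util.hashing.MurmurHash3

// Hypothetical sketch: names are illustrative only.
final class HyperLogLogSketch(b: Int) {
  require(b >= 4 && b <= 16, "b is assumed to be between 4 and 16")

  private val m = 1 << b                    // number of registers
  private val registers = new Array[Int](m)

  // Standard bias-correction constant for m registers.
  private val alpha: Double = m match {
    case 16 => 0.673
    case 32 => 0.697
    case 64 => 0.709
    case _  => 0.7213 / (1.0 + 1.079 / m)
  }

  def add(item: String): Unit = {
    val h = MurmurHash3.stringHash(item)    // 32-bit hash
    val idx = h >>> (32 - b)                // first b bits pick the register
    val rest = h << b                       // remaining bits, moved to the top
    // Position of the leftmost 1-bit in the remaining 32-b bits.
    val rank = math.min(Integer.numberOfLeadingZeros(rest), 32 - b) + 1
    registers(idx) = math.max(registers(idx), rank)
  }

  def estimate: Double = {
    val raw = alpha * m * m / registers.map(r => math.pow(2.0, -r)).sum
    val zeros = registers.count(_ == 0)
    // Small-range correction: fall back to linear counting.
    if (raw <= 2.5 * m && zeros > 0) m * math.log(m.toDouble / zeros)
    else raw
  }
}

object HyperLogLogDemo {
  def main(args: Array[String]): Unit = {
    val hll = new HyperLogLogSketch(b = 10)
    (0 until 10000).foreach(i => hll.add(s"item-$i"))
    println(hll.estimate) // an approximation of 10000, not an exact count
  }
}
```

A larger b uses more memory (2^b registers) but gives a lower relative error, roughly 1.04 / sqrt(2^b).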
