andreistan26/top-string

Top-String

Objective

How can we implement an efficient system for counting the most frequent strings among many records? We will implement and then benchmark the following solutions:

  • local data and local counter
  • remote data and local counter
  • remote data and distributed counter

We will test with different data volumes:

  • many small strings (10 to 100 million strings of 256 characters each)
  • fewer large strings (10k strings of 10 MB each)

The Base Model and Easy Benchmarking

The app should be reduced to interfaces as its main abstraction, so that each solution can be swapped in and benchmarked the same way. Essentially, we need two interfaces: a sender and a receiver.

Receiver

  • StartServer
  • Accept
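As a rough sketch of how the two Receiver methods listed above might look as a Go interface: the language choice, the method signatures, and the in-memory `chanReceiver` type are all assumptions made for illustration, not taken from the repository.

```go
package main

import "errors"

// Receiver mirrors the two methods named in this README.
// The signatures are assumptions; the source only lists the names.
type Receiver interface {
	StartServer() error      // prepare to receive records
	Accept() (string, error) // block until the next record arrives
}

// errClosed is returned by Accept once no more records will arrive.
var errClosed = errors.New("no more records")

// chanReceiver is a hypothetical in-memory Receiver backed by a channel,
// useful for exercising a counter without any network.
type chanReceiver struct {
	records chan string
	started bool
}

func newChanReceiver(buffer int) *chanReceiver {
	return &chanReceiver{records: make(chan string, buffer)}
}

// StartServer marks the receiver as ready; calling it twice is an error.
func (r *chanReceiver) StartServer() error {
	if r.started {
		return errors.New("receiver already started")
	}
	r.started = true
	return nil
}

// Accept returns the next record, or errClosed when the channel is closed
// and drained.
func (r *chanReceiver) Accept() (string, error) {
	if !r.started {
		return "", errors.New("receiver not started")
	}
	rec, ok := <-r.records
	if !ok {
		return "", errClosed
	}
	return rec, nil
}
```

A local, remote, or distributed receiver could then satisfy the same interface, so the benchmark harness would only ever depend on `Receiver`.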

Sender

Local data and local counter

For this we need only a basic hash map: each distinct string is a key and its number of occurrences is the value.
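A minimal sketch of such a hash-map counter, assuming Go and assuming all distinct strings fit in memory. The names `Counter`, `Entry`, `Add`, and `Top` are hypothetical and do not come from the repository.

```go
package main

import "sort"

// Entry pairs a string with the number of times it was seen.
type Entry struct {
	Value string
	Count int
}

// Counter is a hypothetical local counter backed by a plain map.
type Counter struct {
	counts map[string]int
}

func NewCounter() *Counter {
	return &Counter{counts: make(map[string]int)}
}

// Add records one occurrence of s.
func (c *Counter) Add(s string) {
	c.counts[s]++
}

// Top returns up to k entries, most frequent first.
// Ties are broken lexicographically so the result is deterministic.
func (c *Counter) Top(k int) []Entry {
	if k < 0 {
		k = 0
	}
	entries := make([]Entry, 0, len(c.counts))
	for v, n := range c.counts {
		entries = append(entries, Entry{Value: v, Count: n})
	}
	sort.Slice(entries, func(i, j int) bool {
		if entries[i].Count != entries[j].Count {
			return entries[i].Count > entries[j].Count
		}
		return entries[i].Value < entries[j].Value
	})
	if k < len(entries) {
		entries = entries[:k]
	}
	return entries
}
```

Calling `Add` once per record and then `Top(1)` yields the most frequent string. Sorting every distinct key costs O(n log n); for a small k over many distinct strings, a bounded heap would likely be cheaper, which is the kind of trade-off the benchmarks are meant to expose.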

Usage

For local usage:

top-string start local [PATH] [OPTIONS]

For remote server usage:

top-string start server [OPTIONS]

For remote sender usage:

top-string start send [IP:PORT] [PATH] [OPTIONS]

About

Benchmarked way of finding the most frequent string out of many records
