GitHub - msb1/triple-join: 3-way INNER JOINS with aggregation -- Python SQL, Pandas, Scala Kafka Streams, Scala Flink and Scala Spark Structured Streams

Triple Joins with Simulated Data

Cards are generated, each with an id and additional data
Verifications of cards by users are generated
Users are generated
First, verifications are joined to cards
Next, users are joined to the verified cards
Then an aggregation over users determines which cards each user verified
Finally, a filter keeps only users with more than a set number (200) of verifications
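
The steps above can be sketched in plain Python before looking at any particular engine. This is a minimal illustration, not the repository's code: the field names (`card_id`, `user_id`, `name`, `data`), the function names, and the default sizes are all assumptions made for the example.

```python
import random
from collections import defaultdict


def simulate(num_cards=1000, num_users=5, num_verifications=1500, seed=0):
    """Generate simulated cards, users, and verifications (hypothetical schema)."""
    rng = random.Random(seed)
    cards = {cid: {"card_id": cid, "data": f"card-{cid}"} for cid in range(num_cards)}
    users = {uid: {"user_id": uid, "name": f"user-{uid}"} for uid in range(num_users)}
    verifications = [
        {"card_id": rng.randrange(num_cards), "user_id": rng.randrange(num_users)}
        for _ in range(num_verifications)
    ]
    return cards, users, verifications


def users_over_threshold(cards, users, verifications, threshold=200):
    """Join verifications to cards, then to users, group by user, then filter."""
    cards_by_user = defaultdict(list)
    for v in verifications:
        card = cards.get(v["card_id"])  # join 1: verification -> card
        user = users.get(v["user_id"])  # join 2: verified card -> user
        if card is None or user is None:
            continue  # inner-join semantics: drop rows without a match
        cards_by_user[user["name"]].append(card["card_id"])
    # Keep only users with more than `threshold` verifications.
    return {name: ids for name, ids in cards_by_user.items() if len(ids) > threshold}
```

Calling `users_over_threshold(*simulate())` returns a dictionary that maps each qualifying user name to the list of card ids that user verified.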

Case 1: SQL with SQLite3 in Python (no ORM)
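
A rough sketch of how this case could look with the standard-library `sqlite3` module and no ORM. The table layout, column names, and the function name `users_over_threshold_sql` are assumptions for illustration and may differ from the repository's schema.

```python
import sqlite3


def users_over_threshold_sql(cards, verifications, users, threshold=200):
    """Run the 3-way INNER JOIN with aggregation in an in-memory SQLite database.

    cards: iterable of (card_id, data)
    verifications: iterable of (card_id, user_id)
    users: iterable of (user_id, name)
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(
            """
            CREATE TABLE cards (card_id INTEGER PRIMARY KEY, data TEXT);
            CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
            CREATE TABLE verifications (card_id INTEGER, user_id INTEGER);
            """
        )
        conn.executemany("INSERT INTO cards VALUES (?, ?)", cards)
        conn.executemany("INSERT INTO users VALUES (?, ?)", users)
        conn.executemany("INSERT INTO verifications VALUES (?, ?)", verifications)
        query = """
            SELECT u.name, COUNT(*) AS num_verified
            FROM verifications AS v
            INNER JOIN cards AS c ON c.card_id = v.card_id
            INNER JOIN users AS u ON u.user_id = v.user_id
            GROUP BY u.user_id, u.name
            HAVING COUNT(*) > ?
            ORDER BY u.user_id
        """
        return conn.execute(query, (threshold,)).fetchall()
    finally:
        conn.close()
```

The `HAVING` clause applies the filter after grouping, which corresponds to the final step of the pipeline; the threshold is passed as a bound parameter rather than formatted into the SQL string.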

Case 2: Pandas in Python with Dataframes
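
The same pipeline might be expressed with pandas roughly as follows. This is a sketch under assumed column names (`card_id`, `user_id`, `name`); the function name `users_over_threshold_df` is hypothetical and not taken from the repository.

```python
import pandas as pd


def users_over_threshold_df(cards, verifications, users, threshold=200):
    """3-way inner join with aggregation on DataFrames (assumed column names)."""
    # Join 1: verifications -> cards
    verified = verifications.merge(cards, on="card_id", how="inner")
    # Join 2: verified cards -> users
    joined = verified.merge(users, on="user_id", how="inner")
    # Aggregate: collect the cards verified by each user and count them
    grouped = joined.groupby(["user_id", "name"])["card_id"]
    per_user = pd.DataFrame(
        {"cards_verified": grouped.apply(list), "num_verified": grouped.size()}
    ).reset_index()
    # Filter: keep users with more than `threshold` verifications
    return per_user[per_user["num_verified"] > threshold].reset_index(drop=True)
```

Each `merge` call uses `how="inner"`, so verifications that reference an unknown card or user are dropped, matching the INNER JOIN behavior of the SQL case.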

Case 3: Scala Spark Structured Streaming with Kafka generated streams

Program 1 generates three simulated data streams and publishes them to three Kafka topics
Program 2 performs the triple join with aggregation

Case 4: Scala Kafka Streams

Uses the same Program 1 as Case 3 to publish data records to the Kafka topics
The program is a Scala Kafka Streams implementation of the triple join with aggregation

Case 5: Scala Flink with Kafka generated streams

Uses the same Program 1 as Case 3 to publish data records to the Kafka topics
The program is a Scala Flink implementation of the triple join with aggregation

Repository contents:
flink-scala
kafka-streams-scala
python-sql-pandas
spark-structured-streams-scala
.gitattributes
README.md