Sprouter: Dynamic Graph Processing over Data Streams at Scale - Apache Spark

Graph data is becoming dominant in many applications, such as social networks, targeted advertising, and web indexing. As a result, advances in machine learning and data mining depend heavily on the ability to process this data structure efficiently and reliably. Despite the importance of processing dynamic graphs in real time, maintaining such graphs and processing them over data streams remains a challenge. We propose Sprouter, an end-to-end framework that stores very large graphs, allows real-time updates, and supports efficient complex analytics in addition to simple OLTP queries. We demonstrate that the framework can ingest and process streaming data efficiently using a scalable multi-cluster distributed architecture, apply incremental graph updates, and store the dynamic graph for fast query performance. In our experiments, the system updated graphs with up to 100 million edges in under 50 seconds on a moderately sized cluster. This level of performance is essential for the framework to serve its purpose.

Folders and files:

- lib
- project
- src/main/scala
- .gitignore
- LICENSE
- README.md
- build.sbt

TariqAbughofa/sprouter
