TariqAbughofa/sprouter

Sprouter: Dynamic Graph Processing over Data Streams at Scale - Apache Spark

Graph data is becoming dominant in many applications, such as social networks, targeted advertising, and web indexing. As a result, advances in machine learning and data mining depend heavily on the ability to process this data structure efficiently and reliably. Despite the importance of processing dynamic graphs in real time, maintaining such graphs and processing them over data streams remains a challenge. We propose Sprouter, an end-to-end framework that stores very large graphs, allows real-time updates, and supports efficient complex analytics in addition to simple OLTP queries. We demonstrate that the framework can ingest and process streaming data efficiently using a scalable multi-cluster distributed architecture, apply incremental graph updates, and store the dynamic graph for fast query performance. Experiments showed that the system can update graphs with up to 100 million edges in under 50 seconds on a moderately sized cluster. This performance is essential for the framework to serve its purpose.
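To make the idea of incremental graph updates concrete, here is a minimal single-machine sketch in plain Python. It is not Sprouter's implementation and does not use Spark; the `DynamicGraph` class, the `apply_batch` method, and the `("add" | "remove", src, dst)` update format are all assumptions made for illustration. It only shows the general pattern of applying each micro-batch of edge changes to an existing graph instead of rebuilding the graph from scratch.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical update record: (operation, source vertex, destination vertex).
EdgeUpdate = Tuple[str, int, int]


class DynamicGraph:
    """Directed graph kept as adjacency sets and updated one micro-batch at a time."""

    def __init__(self) -> None:
        self.adjacency: Dict[int, Set[int]] = defaultdict(set)
        self.edge_count = 0

    def apply_batch(self, batch: Iterable[EdgeUpdate]) -> None:
        """Apply a batch of edge additions and removals in arrival order."""
        for op, src, dst in batch:
            if op == "add":
                # Adding an edge that already exists is a no-op.
                if dst not in self.adjacency[src]:
                    self.adjacency[src].add(dst)
                    self.edge_count += 1
            elif op == "remove":
                # Removing a missing edge is a no-op; .get avoids creating entries.
                if dst in self.adjacency.get(src, ()):
                    self.adjacency[src].remove(dst)
                    self.edge_count -= 1
            else:
                raise ValueError(f"unknown operation: {op}")

    def out_degree(self, vertex: int) -> int:
        """Number of outgoing edges currently stored for a vertex."""
        return len(self.adjacency.get(vertex, ()))
```

Each call to `apply_batch` touches only the edges named in the batch, so the cost of an update depends on the batch size and not on the size of the stored graph. A distributed system would additionally need to partition the graph and persist it, which this sketch leaves out.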
