Anomaly Prediction in Large Scale Distributed Systems




Performance anomaly prediction is crucial for long-running, large-scale distributed systems. Many existing monitoring systems analyze the logs produced by distributed systems for troubleshooting and problem diagnosis. However, inspecting such logs is non-trivial because text logs are difficult to convert into vectorized data, and it becomes infeasible as distributed systems grow in scale and complexity. A few other effective methods employ statistical learning to detect performance anomalies. However, most existing schemes assume labelled training data, which requires significant human effort to annotate, and can only handle previously seen anomalies. In this paper, we present two anomaly prediction algorithms based on Self Organizing Maps and Long Short-Term Memory networks. We implemented a prototype of our system on Amazon Web Services and conducted extensive experiments to model the system behavior of Cassandra. Our analysis and results show that both algorithms impose minimal overhead on the system, predict performance anomalies with high accuracy, and achieve sufficient lead time in the process.
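As a rough illustration of how a Self Organizing Map can flag unusual system behavior without labelled data, the sketch below trains a small map on "normal" metric vectors and scores a new sample by its distance to the closest map node (the quantization error). This is a minimal sketch under stated assumptions, not the code in the SOM folder: the function names `train_som` and `anomaly_score`, the map size, the decay schedules, and the two-metric input are all hypothetical choices made for the example.

```python
import math
import random


def train_som(samples, rows=4, cols=4, epochs=30, lr0=0.5, sigma0=2.0, seed=0):
    """Fit a small SOM to `samples` (a list of equal-length float vectors)."""
    rng = random.Random(seed)
    # Initialise each map node with a copy of a randomly chosen training sample.
    weights = {(r, c): list(rng.choice(samples))
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(samples, len(samples)):  # one shuffled pass
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            # Best matching unit: the node whose weight vector is closest to x.
            bmu = min(weights, key=lambda n: math.dist(weights[n], x))
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                # Pull the node toward x, scaled by its grid distance to the BMU.
                for i in range(len(w)):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def anomaly_score(weights, x):
    """Quantization error: distance from x to its best matching unit."""
    return min(math.dist(w, x) for w in weights.values())


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical normalised (cpu, memory) readings under normal load.
    normal = [[rng.gauss(0.3, 0.05), rng.gauss(0.4, 0.05)] for _ in range(200)]
    som = train_som(normal)
    # Assumed thresholding rule: the largest error seen on the training data.
    threshold = max(anomaly_score(som, x) for x in normal)
    sample = [0.95, 0.98]
    print(anomaly_score(som, sample) > threshold)
```

Using the largest training error as the threshold is only one possible rule; a percentile of the training errors would be less sensitive to outliers in the training window.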


The system setup, prediction models, and data files are divided into separate folders according to their function. Below are the links to each of those:


System Setup:

         1. Cassandra System Setup
         2. Prometheus
         3. CollectD

Load Generation:

         1. Load Generation

Fault Injection:

         1. Fault Injection

Models:

         1. Long Short-Term Memory
         2. Self Organizing Maps

Data Sets:

         1. Data sets
