This repository contains Python scripts to complete the assignment provided.
The dataset required for the assignment is available in the "files.zip" file, which can be downloaded from the following link: files.zip
The "files.zip" archive contains four pairs of CSV files. Each pair consists of a time series dataset and its corresponding anomaly labels:
- test.csv --> test_label.csv
- smap_test.csv --> smap_test_labels.csv
- msl_test.csv --> msl_test_labels.csv
- psm_test.csv --> psm_test_labels.csv
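The pairing above can be written down as a small lookup table so that a dataset is always loaded together with its labels. This is a minimal sketch, not the code in a_ReadTestLabel.py: the names `DATASET_PAIRS` and `load_pair` are hypothetical, and it assumes that every CSV file has a header row and that each label file has one row per row of its dataset.

```python
from pathlib import Path

import pandas as pd

# Dataset file -> label file, exactly as listed above.
DATASET_PAIRS = {
    "test.csv": "test_label.csv",
    "smap_test.csv": "smap_test_labels.csv",
    "msl_test.csv": "msl_test_labels.csv",
    "psm_test.csv": "psm_test_labels.csv",
}


def load_pair(data_dir, data_name):
    """Load one dataset and its labels from the extracted archive.

    Assumption: both files have a header row and the same number of rows.
    """
    data_dir = Path(data_dir)
    data = pd.read_csv(data_dir / data_name)
    labels = pd.read_csv(data_dir / DATASET_PAIRS[data_name])
    if len(data) != len(labels):
        raise ValueError(
            f"{data_name}: {len(data)} data rows but {len(labels)} label rows"
        )
    return data, labels
```

For example, `load_pair("files", "smap_test.csv")` would return the SMAP data and labels, assuming the archive was extracted into a directory named `files`.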
The assignment tasks and the corresponding Python scripts are as follows:
- Read test and label files: a_ReadTestLabel.py
  This script reads the time series data and its corresponding anomaly labels from the CSV files.
- Draw time series plots with anomaly regions: b_TimeSeriesPlots.py
  This script generates time series plots with highlighted anomaly regions based on the provided labels.
- Perform EDA and find out root cause: c_EdaRootCause.py
  This script performs exploratory data analysis (EDA) to identify significant variables contributing to anomalies and determines the root cause.
- Find out the variables which are the root cause for the anomaly: d_RootCauseVariables.py
  This script identifies the variables that are the root cause for anomalies based on the EDA results.
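Two of the tasks above can be illustrated with short helpers. The sketch below is not taken from the scripts in this repository, and both function names are hypothetical. `anomaly_regions` assumes the labels are a sequence of 0/1 flags (one per time step) and turns them into contiguous index ranges, which is the form needed to shade regions on a plot. `rank_variables` shows one possible way to shortlist root-cause candidates: it scores each numeric column by how far its mean shifts during anomalies, in units of its normal-period standard deviation. The actual scripts may use a different method.

```python
import numpy as np
import pandas as pd


def anomaly_regions(labels):
    """Return (start, end) index pairs, inclusive, for each run of truthy labels."""
    labels = list(labels)
    regions = []
    start = None
    for i, flag in enumerate(labels):
        if flag and start is None:
            start = i  # a new anomalous run begins here
        elif not flag and start is not None:
            regions.append((start, i - 1))  # the run ended on the previous step
            start = None
    if start is not None:
        # The series ends while still inside an anomalous run.
        regions.append((start, len(labels) - 1))
    return regions


def rank_variables(data, is_anomaly):
    """Rank numeric columns by |mean shift| during anomalies / normal std.

    Assumption: a large standardized shift marks a candidate root-cause
    variable. Columns that are constant in normal periods get NaN and sort last.
    """
    numeric = data.select_dtypes("number")
    mask = np.asarray(is_anomaly).astype(bool)
    shift = (numeric[mask].mean() - numeric[~mask].mean()).abs()
    scale = numeric[~mask].std().replace(0, float("nan"))
    return (shift / scale).sort_values(ascending=False)
```

Each `(start, end)` pair can then be passed to matplotlib's `Axes.axvspan` to highlight the region on a time series plot.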
To run the Python scripts, follow these steps:
- Download the "files.zip" dataset from the provided link.
- Extract the contents of the zip file.
- Ensure that Python is installed on your system along with the required libraries (e.g., pandas, matplotlib).
- Run each Python script using a Python interpreter (e.g., python script_name.py).
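The last step can also be automated. The following is a minimal sketch that runs the four scripts one after another with the current interpreter and stops at the first failure. `run_all` is a hypothetical helper that is not part of this repository, and running the scripts in alphabetical order is an assumption based on their a/b/c/d prefixes.

```python
import subprocess
import sys
from pathlib import Path

# Script names as listed in the task overview; the order is an assumption.
SCRIPTS = [
    "a_ReadTestLabel.py",
    "b_TimeSeriesPlots.py",
    "c_EdaRootCause.py",
    "d_RootCauseVariables.py",
]


def run_all(script_dir=".", scripts=SCRIPTS):
    """Run each script with the interpreter that is executing this helper."""
    for name in scripts:
        path = Path(script_dir) / name
        if not path.exists():
            raise FileNotFoundError(f"Script not found: {path}")
        # check=True raises CalledProcessError if a script exits with an error.
        subprocess.run([sys.executable, str(path)], check=True)
```

Calling `run_all()` from the repository root would then execute all four scripts in sequence.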