Anomaly detection project for turbofan engines using the NASA C-MAPSS (Commercial Modular Aero-Propulsion System Simulation) dataset. This project implements and compares multiple anomaly detection techniques: classical methods, unsupervised learning, and deep learning.
The C-MAPSS dataset contains turbofan engine degradation simulation data with multivariate time series from sensors. It includes 4 sub-datasets (FD001-FD004) with different operating conditions and failure modes.
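As a rough illustration of the data layout, the sketch below parses raw C-MAPSS rows using only the standard library. It assumes the usual published layout of 26 whitespace-separated values per row (unit id, cycle, 3 operational settings, 21 sensor readings). The column names and both function names are hypothetical and are not taken from `utils/load_dataset.py`. The RUL label assumes that each training unit runs until failure, so its last recorded cycle has zero remaining life.

```python
from collections import defaultdict

# Hypothetical column names for the 26 values in each row.
COLUMNS = (
    ["unit", "cycle"]
    + [f"op_setting_{i}" for i in range(1, 4)]
    + [f"sensor_{i}" for i in range(1, 22)]
)


def parse_cmapss(lines):
    """Parse whitespace-separated C-MAPSS rows into a list of dicts."""
    rows = []
    for line in lines:
        values = line.split()
        if not values:
            continue  # skip blank lines
        if len(values) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} values, got {len(values)}")
        row = dict(zip(COLUMNS, map(float, values)))
        row["unit"] = int(row["unit"])
        row["cycle"] = int(row["cycle"])
        rows.append(row)
    return rows


def add_training_rul(rows):
    """Add 'rul': cycles left until the unit's last recorded cycle."""
    last_cycle = defaultdict(int)
    for row in rows:
        last_cycle[row["unit"]] = max(last_cycle[row["unit"]], row["cycle"])
    for row in rows:
        row["rul"] = last_cycle[row["unit"]] - row["cycle"]
    return rows
```

For a real file, something like `parse_cmapss(open("data/train_FD001.txt"))` would feed the rows in. For the test split, the ground truth comes from `RUL_FD001.txt` instead of this computed label.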
Anomaly_detection/
├── data/                          # Raw CMAPSS dataset
│   ├── train_FD001.txt            # Training data
│   ├── test_FD001.txt             # Test data
│   ├── RUL_FD001.txt              # Remaining Useful Life (ground truth)
│   └── ...                        # FD002, FD003, FD004
│
├── utils/                         # Utilities and helper functions
│   ├── load_dataset.py            # Data loading and preprocessing
│   ├── metrics.py                 # Evaluation metrics
│   └── plots.py                   # Visualization functions
│
├── FD001/                         # FD001 experiments
│   ├── 01_exploration.ipynb       # Exploratory Data Analysis
│   ├── data/                      # Processed data
│   │   ├── train.csv
│   │   ├── test.csv
│   │   └── rul.csv
│   │
│   ├── clasic_methods/            # Classical statistical methods
│   │   ├── z-score.ipynb          # Z-score based detection
│   │   ├── PCA.ipynb              # Principal Component Analysis
│   │   └── outputs/               # Results and plots
│   │
│   ├── unsupervised_learning/     # Unsupervised learning methods
│   │   ├── isolation_forest.ipynb
│   │   ├── One_Class_SVM.ipynb
│   │   └── outputs/
│   │
│   └── deep_learning/             # Deep Learning approaches
│       ├── Autoencoder.ipynb      # Basic Autoencoder
│       ├── LSTM_autoencoder.ipynb # LSTM Autoencoder
│       ├── TCN-VAE.ipynb          # Temporal Convolutional Network + VAE
│       └── outputs/
│
├── FD002/                         # Same structure for FD002
├── FD003/                         # Same structure for FD003
├── FD004/                         # Same structure for FD004
│
├── data_extraction.py             # Script to download dataset
├── requirements.txt               # Project dependencies
└── README.md                      # This file
# Clone the repository
git clone <repository-url>
cd Anomaly_detection
# Create virtual environment
python3 -m venv .venv
# Activate virtual environment
# On Linux/Mac:
source .venv/bin/activate
# On Windows:
# .venv\Scripts\activate
# Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
uv is an ultra-fast Python package manager written in Rust.
# Install uv (if you don't have it)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone the repository
git clone <repository-url>
cd Anomaly_detection
# Create virtual environment with uv
uv venv
# Activate virtual environment
source .venv/bin/activate
# Install dependencies with uv (much faster)
uv pip install -r requirements.txt
The project uses the CMAPSS Jet Engine Simulated Data dataset from Kaggle. There are two ways to download it:
# Make sure your virtual environment is activated
python data_extraction.py
This script will:
- Automatically create the data/ directory if it doesn't exist
- Download the dataset using kagglehub
- Place all files in the data/ directory
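The three steps above can be sketched roughly as follows. This is not the contents of `data_extraction.py`. The function names are hypothetical, and the dataset handle is inferred from the Kaggle URL given in this README. `kagglehub.dataset_download` downloads to a local cache and returns its path, so the files are copied from there into `data/`.

```python
import shutil
from pathlib import Path

# Assumed handle, derived from the dataset's Kaggle URL.
DATASET_HANDLE = "palbha/cmapss-jet-engine-simulated-data"


def copy_files(src_dir, data_dir="data"):
    """Create data_dir if needed and copy every file under src_dir into it."""
    target = Path(data_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(src_dir).rglob("*")):
        if path.is_file():
            shutil.copy2(path, target / path.name)
            copied.append(path.name)
    return copied


def download_dataset(data_dir="data"):
    """Download the dataset with kagglehub, then copy it into data_dir."""
    import kagglehub  # imported lazily so copy_files works without it

    cache_dir = kagglehub.dataset_download(DATASET_HANDLE)
    return copy_files(cache_dir, data_dir)
```

Calling `download_dataset()` needs network access and, depending on the local setup, Kaggle credentials.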
- Visit: https://www.kaggle.com/datasets/palbha/cmapss-jet-engine-simulated-data
- Download the dataset manually
- Extract the files into the data/ directory
Start with the exploration notebook to understand the dataset:
jupyter notebook FD001/01_exploration.ipynb
Each subdirectory (FD001-FD004) contains notebooks organized by method type:
- Classical Methods: clasic_methods/
  - Z-score for outlier detection
  - PCA for dimensionality reduction
- Unsupervised Learning: unsupervised_learning/
  - Isolation Forest
  - One-Class SVM
- Deep Learning: deep_learning/
  - Basic Autoencoder
  - LSTM Autoencoder
  - TCN-VAE (Temporal Convolutional Network + Variational Autoencoder)
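To give a feel for the simplest of these methods, here is a minimal z-score sketch for a single sensor. It is not taken from `z-score.ipynb`. The function name, the threshold of 3, and the choice to estimate mean and standard deviation from a separate baseline window (for example, early cycles assumed healthy) are all illustrative assumptions.

```python
from statistics import mean, pstdev


def zscore_flags(values, baseline, threshold=3.0):
    """Flag values whose z-score against the baseline exceeds the threshold."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # A constant baseline carries no scale information; flag nothing.
        return [False] * len(values)
    return [abs((v - mu) / sigma) > threshold for v in values]


# Toy readings, not real C-MAPSS data.
baseline = [10.0, 10.2, 9.8, 10.1, 9.9]
print(zscore_flags([10.0, 12.0], baseline))
```

In practice the statistics would be computed per sensor, and a sensor with zero variance in the baseline would typically be dropped before detection.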
Results from each experiment are saved in the outputs/ folder inside that method's directory. These include:
- Detected anomaly plots
- Evaluation metrics
- Trained models (checkpoints)
The project includes experiments with all 4 sub-datasets:
- FD001: One operating condition, one failure mode
- FD002: Six operating conditions, one failure mode
- FD003: One operating condition, two failure modes
- FD004: Six operating conditions, two failure modes
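The list above maps naturally onto a small lookup table. The sketch below is a hypothetical helper, not part of `utils/`. It assumes the raw files for FD002-FD004 follow the same `train_`, `test_`, and `RUL_` naming shown for FD001 in the project tree.

```python
from pathlib import Path

# Operating conditions and failure modes per sub-dataset, as listed above.
SUBSETS = {
    "FD001": {"operating_conditions": 1, "failure_modes": 1},
    "FD002": {"operating_conditions": 6, "failure_modes": 1},
    "FD003": {"operating_conditions": 1, "failure_modes": 2},
    "FD004": {"operating_conditions": 6, "failure_modes": 2},
}


def subset_files(name, data_dir="data"):
    """Return the expected raw file paths for one sub-dataset."""
    if name not in SUBSETS:
        raise KeyError(f"unknown sub-dataset: {name}")
    root = Path(data_dir)
    return {
        "train": root / f"train_{name}.txt",
        "test": root / f"test_{name}.txt",
        "rul": root / f"RUL_{name}.txt",
    }
```

A loop such as `for name in SUBSETS: subset_files(name)` would then cover all four experiments with the same code path.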
To contribute to the project:
- Create a branch for your feature
- Implement your changes
- Make sure notebooks run correctly
- Create a Pull Request
This project is for academic and research purposes.