A predictive maintenance system that forecasts industrial equipment failures from sensor data using gradient boosting, survival analysis for remaining useful life estimation, and SHAP-based explanations.
Unplanned equipment downtime costs industrial operations thousands of dollars per hour. This project builds a failure prediction pipeline that ingests sensor readings (temperature, vibration, pressure, RPM, power consumption) from 50 machines and predicts whether a failure will occur within the next 7 days. Four classifiers are trained, a Weibull survival model estimates remaining useful life, and a cost-based threshold optimizer weighs the cost of a missed failure ($15K of unplanned downtime) against the cost of a false alarm ($1.5K of preventive maintenance).
Problem → Predicting equipment failures before they happen using sensor data
Solution → XGBoost with survival analysis, SHAP explanations, and cost-based threshold tuning
Impact → AUC 0.94, catches 91% of failures with optimized maintenance scheduling
| Metric | Value |
|---|---|
| AUC-ROC | 0.94 |
| Recall (failures caught) | 91% |
| Precision | 78% |
| PR-AUC | 0.76 |
| Best model | XGBoost |
┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│ Sensor data      │───▶│ Feature          │───▶│ Rolling          │
│ generation       │    │ extraction       │    │ aggregations     │
└──────────────────┘    └──────────────────┘    └────────┬─────────┘
                                                         │
          ┌──────────────────────────────────────────────┘
          ▼
┌──────────────────────┐    ┌──────────────────────┐
│ Model training       │───▶│ Survival analysis    │
│ (4 classifiers)      │    │ (Weibull AFT / RUL)  │
└──────────────────────┘    └──────────┬───────────┘
                                       │
          ┌────────────────────────────┘
          ▼
┌──────────────────────┐    ┌──────────────────────┐
│ SHAP explanations    │───▶│ Maintenance          │
│ + threshold tuning   │    │ dashboard            │
└──────────────────────┘    └──────────────────────┘
Project structure
project_21_predictive_maintenance/
├── data/
│ ├── sensor_readings.csv # Sensor dataset
│ └── generate_data.py # Synthetic data generator
├── src/
│ ├── __init__.py
│ ├── data_loader.py # Data generation and loading
│ └── model.py # Training, evaluation, SHAP, RUL
├── notebooks/
│ ├── 01_eda.ipynb # Exploratory data analysis
│ ├── 02_feature_engineering.ipynb # Rolling features, interactions
│ ├── 03_modeling.ipynb # Model training and CV
│ └── 04_evaluation.ipynb # ROC, SHAP, cost analysis
├── app.py # Streamlit dashboard
├── requirements.txt
└── README.md
# Clone and navigate
git clone https://github.com/guydev42/calgary-data-portfolio.git
cd calgary-data-portfolio/project_21_predictive_maintenance
# Install dependencies
pip install -r requirements.txt
# Generate sensor data
python data/generate_data.py
# Launch dashboard
streamlit run app.py

| Property | Details |
|---|---|
| Source | Synthetic industrial sensor data |
| Readings | 15,000 |
| Machines | 50 |
| Failure rate | ~8% (1,200 pre-failure readings) |
| Features | 11 (temperature, vibration, pressure, rpm, power, rolling stats) |
| Target | failure_within_7days (binary) |
Sensor feature engineering
- Rolling 24-hour statistics: mean of temperature and standard deviation of vibration
- Temperature-pressure ratio as an interaction feature
- Machine-level attributes: age, operating hours, maintenance history
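
The rolling and interaction features listed above could be computed roughly as follows. This is a minimal pandas sketch, not the project's actual code: the column names (`machine_id`, `timestamp`, `temperature`, `vibration`, `pressure`) and the output feature names are assumptions made for illustration.

```python
import pandas as pd


def add_rolling_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add per-machine 24h rolling stats and a temperature/pressure ratio.

    Column names are hypothetical; adjust them to the real dataset schema.
    """
    df = df.sort_values(["machine_id", "timestamp"]).reset_index(drop=True)
    parts = []
    for _, group in df.groupby("machine_id", sort=False):
        group = group.copy()
        # Time-based windows need a sorted DatetimeIndex.
        indexed = group.set_index("timestamp")
        group["temp_mean_24h"] = indexed["temperature"].rolling("24h").mean().to_numpy()
        # The sample standard deviation is NaN until the window holds two readings.
        group["vib_std_24h"] = indexed["vibration"].rolling("24h").std().to_numpy()
        parts.append(group)
    out = pd.concat(parts, ignore_index=True)
    # Interaction feature: temperature relative to pressure.
    out["temp_pressure_ratio"] = out["temperature"] / out["pressure"]
    return out
```

Computing the windows per machine keeps readings from one machine from leaking into another machine's statistics.
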
Model training
- Four classifiers: Logistic Regression, Random Forest, XGBoost, Gradient Boosting
- 5-fold StratifiedKFold cross-validation
- Class imbalance handled via class_weight and scale_pos_weight
- Metrics: AUC-ROC, precision, recall, F1, PR-AUC
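
As a rough illustration of the cross-validation setup described above, the sketch below evaluates two of the four classifiers with scikit-learn only. The function names, hyperparameters, and random seed are assumptions, not the project's settings. The `scale_pos_weight` helper shows the negative-to-positive ratio commonly passed to XGBoost's `scale_pos_weight` parameter for imbalanced data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate


def scale_pos_weight(y) -> float:
    """Ratio of negatives to positives, a common choice for XGBoost."""
    y = np.asarray(y)
    return float((y == 0).sum() / (y == 1).sum())


def evaluate_models(X, y, seed: int = 42) -> dict:
    """Mean 5-fold scores per model; hyperparameters are illustrative."""
    models = {
        "logistic_regression": LogisticRegression(class_weight="balanced", max_iter=1000),
        "random_forest": RandomForestClassifier(
            n_estimators=100, class_weight="balanced", random_state=seed
        ),
    }
    # Stratification keeps the ~8% positive rate similar in every fold.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    scoring = {"roc_auc": "roc_auc", "pr_auc": "average_precision", "recall": "recall"}
    results = {}
    for name, model in models.items():
        scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
        results[name] = {m: float(scores[f"test_{m}"].mean()) for m in scoring}
    return results
```

With an 8% positive rate, `scale_pos_weight` evaluates to about 11.5 (92 / 8).
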
Survival analysis
- Weibull Accelerated Failure Time (AFT) model from lifelines
- Covariates: machine age, mean temperature, mean vibration
- Outputs remaining useful life (RUL) estimates per machine
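
To make the RUL step concrete, here is a small standard-library sketch of how a fitted Weibull AFT model can be turned into a remaining-life estimate. It assumes the usual parameterization S(t | x) = exp(-(t / λ(x))^ρ) with λ(x) = exp(β₀ + β·x). The function names and any coefficients passed in are hypothetical; in practice the fitted values would come from the lifelines model.

```python
import math


def weibull_survival(t: float, scale: float, shape: float) -> float:
    """Probability that a machine survives beyond time t."""
    return math.exp(-((t / scale) ** shape))


def aft_scale(intercept: float, coefs: dict, covariates: dict) -> float:
    """Scale parameter lambda(x) = exp(intercept + sum of coef * covariate)."""
    return math.exp(intercept + sum(coefs[k] * covariates[k] for k in coefs))


def median_rul(age: float, scale: float, shape: float) -> float:
    """Median remaining life, given the machine has already survived to `age`.

    Solves S(age + r) / S(age) = 0.5 for r.
    """
    total = (age ** shape + (scale ** shape) * math.log(2)) ** (1.0 / shape)
    return total - age
```

Conditioning on the current age matters when the shape parameter is above 1 (wear-out behavior): an older machine then has a shorter median RUL than a new one with the same covariates.
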
SHAP explainability
- TreeExplainer for gradient boosting models
- Global feature importance via mean absolute SHAP values
- Waterfall plots for individual sensor reading explanations
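
The global importance ranking mentioned above reduces to a column-wise mean of absolute SHAP values. The sketch below assumes a 2-D array of SHAP values (rows are readings, columns are features) has already been computed, for example with `shap.TreeExplainer(model).shap_values(X)`. The function name is hypothetical.

```python
import numpy as np


def global_importance(shap_values, feature_names):
    """Rank features by mean absolute SHAP value, largest first."""
    mean_abs = np.abs(np.asarray(shap_values, dtype=float)).mean(axis=0)
    order = np.argsort(mean_abs)[::-1]
    return [(feature_names[i], float(mean_abs[i])) for i in order]
```

Taking the absolute value first prevents positive and negative contributions from cancelling each other out in the average.
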
Cost-optimized threshold
- Business cost model: FN cost ($15,000 unplanned downtime) vs FP cost ($1,500 preventive maintenance)
- Sweep thresholds from 0.05 to 0.95 to minimize total cost
- Result: 91% recall at the cost-minimizing threshold
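
The threshold sweep can be sketched in a few lines of NumPy. The cost constants come from the cost model above. The function names and the 0.01 step size are assumptions for illustration.

```python
import numpy as np

FN_COST = 15_000  # missed failure: unplanned downtime
FP_COST = 1_500   # false alarm: unnecessary preventive maintenance


def total_cost(y_true, y_prob, threshold: float) -> int:
    """Total cost of false negatives and false positives at one threshold."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_prob) >= threshold
    fn = int(np.sum((y_true == 1) & ~y_pred))
    fp = int(np.sum((y_true == 0) & y_pred))
    return fn * FN_COST + fp * FP_COST


def best_threshold(y_true, y_prob, thresholds=None):
    """Sweep thresholds from 0.05 to 0.95 and return (threshold, cost) at the minimum."""
    if thresholds is None:
        thresholds = np.linspace(0.05, 0.95, 91)  # step of 0.01 (assumed)
    costs = [total_cost(y_true, y_prob, t) for t in thresholds]
    i = int(np.argmin(costs))
    return float(thresholds[i]), costs[i]
```

Because a missed failure costs ten times as much as a false alarm, the minimum usually sits well below 0.5. For well-calibrated probabilities, the break-even point is roughly FP / (FP + FN) = 1,500 / 16,500 ≈ 0.09.
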
Built as part of the Calgary Data Portfolio.