# Predicting Remaining Useful Life (RUL) of Turbofan Engines with the NASA C‑MAPSS FD001 Dataset

## 1. Business understanding

Predictive maintenance aims to forecast when industrial equipment will fail so that maintenance can be scheduled proactively. For aircraft engines, accurately estimating the **remaining useful life (RUL)** helps airlines plan maintenance, reduce downtime and avoid catastrophic failures. NASA created the **Commercial Modular Aero‑Propulsion System Simulation (C‑MAPSS)** dataset to support research in prognostics and health management.  Each dataset consists of multivariate time‑series from a fleet of simulated turbofan engines that are run until failure.  The FD001 subset used here has 100 training trajectories and 100 test trajectories, one operating condition (sea level) and one fault mode (high‑pressure‑compressor degradation)【932307771945710†L75-L85】.  Each row represents a single operational cycle and includes an engine unit identifier, cycle number, three operational settings and 21 sensor measurements【932307771945710†L68-L73】.  The engines start in a healthy condition and gradually deteriorate; the training sets run until failure whereas the test sets stop before failure.  A separate file contains the true RUL for each test engine【932307771945710†L59-L66】.

In this notebook we follow the **CRISP‑DM** (Cross‑Industry Standard Process for Data Mining) framework:

1. **Business understanding** – articulate the problem and its value.
2. **Data understanding** – explore the structure and quality of the data.
3. **Data preparation** – clean and transform the data for modeling.
4. **Modeling** – train a random forest regression model to estimate RUL.
5. **Evaluation** – assess model performance using appropriate metrics.
6. **Deployment / Conclusion** – summarize findings and potential next steps.

Citations to the official dataset description and the damage propagation paper are provided throughout to ground the discussion.【932307771945710†L68-L73】【522820073572242†L241-L355】


## 2. Data understanding

The FD001 dataset is provided in three text files:

- **`train_FD001.txt`** – 100 run‑to‑failure trajectories with 26 space‑separated columns.  The first column is the engine unit number, the second is the cycle index and the next 3 columns are operational settings followed by 21 sensor measurements【932307771945710†L68-L73】.
- **`test_FD001.txt`** – 100 partial trajectories that end before failure.
- **`RUL_FD001.txt`** – a vector of the true remaining useful life for each test engine.

According to the data description, the engines operate under one sea‑level condition and a single fault mode【932307771945710†L75-L85】.  This subset therefore avoids confounding from multiple operating conditions.  Each engine starts with a different degree of initial wear and manufacturing variation, and the measurements are contaminated with sensor noise, so individual trajectories are not identical.

We load the data using `pandas` and assign meaningful column names.  The column list comprises `unit_id`, `time_cycles`, the three operational settings and sensor names `sensor_1` to `sensor_21`.  We compute the RUL for each observation in the training set by subtracting the current cycle from the engine’s maximum cycle.  A higher RUL means more time until failure, while RUL = 0 indicates failure at that cycle【173156099207161†L131-L144】.

The code cell below loads and inspects the data.


In [1]:
import pandas as pd
import numpy as np

# Path to the directory holding the dataset files (adjust to your environment)
DATA_PATH = '/home/oai/share'

# Column names: unit id, time cycles, 3 operational settings, 21 sensors
columns = ['unit_id', 'time_cycles', 'op_setting_1', 'op_setting_2', 'op_setting_3'] + [f'sensor_{i}' for i in range(1, 22)]

# Load the training, test and RUL files
train = pd.read_csv(f'{DATA_PATH}/train_FD001.txt', sep=r'\s+', header=None, names=columns)
test = pd.read_csv(f'{DATA_PATH}/test_FD001.txt', sep=r'\s+', header=None, names=columns)
rul_test = pd.read_csv(f'{DATA_PATH}/RUL_FD001.txt', sep=r'\s+', header=None, names=['RUL'])

# Compute RUL for training data: RUL = max_cycle - current_cycle per engine
train['RUL'] = train.groupby('unit_id')['time_cycles'].transform('max') - train['time_cycles']

# Peek at the data
print('Training set shape:', train.shape)
print('Test set shape:', test.shape)
print('')
print('First few rows of training data:')
print(train.head())
print('')

# Summary: number of cycles per engine
cycles_summary = train.groupby('unit_id')['time_cycles'].max().describe()
print('Summary of maximum cycles per engine in training set:')
print(cycles_summary)


Training set shape: (20631, 27)
Test set shape: (13096, 26)

First few rows of training data:
   unit_id  time_cycles  op_setting_1  op_setting_2  op_setting_3  sensor_1  \
0        1            1       -0.0007       -0.0004         100.0    518.67   
1        1            2        0.0019       -0.0003         100.0    518.67   
2        1            3       -0.0043        0.0003         100.0    518.67   
3        1            4        0.0007        0.0000         100.0    518.67   
4        1            5       -0.0019       -0.0002         100.0    518.67   

   sensor_2  sensor_3  sensor_4  sensor_5  ...  sensor_13  sensor_14  \
0    641.82   1589.70   1400.60     14.62  ...    2388.02    8138.62   
1    642.15   1591.82   1403.14     14.62  ...    2388.07    8131.49   
2    642.35   1587.99   1404.20     14.62  ...    2388.03    8133.23   
3    642.35   1582.79   1401.87     14.62  ...    2388.08    8133.83   
4    642.37   1582.85   1406.22     14.62  ...    2388.04    8133.80   
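
To make the RUL label concrete, here is a minimal sketch on a hypothetical two‑engine log (the `toy` frame and its values are invented for illustration and are not part of FD001). It applies the same rule as the cell above: the last observed cycle of each engine minus the current cycle.

```python
import pandas as pd

# Hypothetical run-to-failure log: engine 1 fails after 4 cycles, engine 2 after 3.
toy = pd.DataFrame({
    "unit_id":     [1, 1, 1, 1, 2, 2, 2],
    "time_cycles": [1, 2, 3, 4, 1, 2, 3],
})

# RUL = last observed cycle of the engine minus the current cycle.
last_cycle = toy.groupby("unit_id")["time_cycles"].transform("max")
toy["RUL"] = last_cycle - toy["time_cycles"]

print(toy["RUL"].tolist())  # [3, 2, 1, 0, 2, 1, 0]
```

Each engine's label counts down to zero at its final recorded cycle, regardless of how long that engine ran.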

## 3. Data preparation

### 3.1 Feature selection

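Some sensors carry little or no information about degradation because they stay constant, or nearly constant, across all cycles.  Following common practice, we compute the variance of each sensor across the training set and remove sensors whose variance falls below a threshold (0.01)【173156099207161†L154-L161】.  Because the variance is computed on the raw, unscaled readings, this criterion depends on each sensor's units: a sensor that varies only over a small numeric range is removed along with the truly constant ones.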
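
A fixed variance threshold behaves differently on raw and on rescaled readings. The sketch below uses synthetic columns (`constant`, `small_range`, `large_range` are hypothetical names, not FD001 sensors) to show that a trending signal with a small numeric range falls below 0.01 on raw values but not after min‑max scaling. Applying the threshold after scaling is an assumption made here for illustration, not the approach used in this notebook.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
n = 500
trend = np.linspace(0.0, 1.0, n)  # stand-in for gradual degradation

# Synthetic readings; names and magnitudes are invented for illustration.
toy = pd.DataFrame({
    "constant":    np.full(n, 14.62),
    "small_range": 2388.0 + 0.2 * trend + rng.normal(0, 0.01, n),
    "large_range": 8100.0 + 50.0 * trend + rng.normal(0, 5.0, n),
})

# Variance of the raw readings depends on each column's units and range.
raw_var = toy.var()

# Variance after min-max scaling is comparable across columns.
scaled = pd.DataFrame(MinMaxScaler().fit_transform(toy), columns=toy.columns)
scaled_var = scaled.var()

threshold = 0.01
dropped_raw = raw_var[raw_var < threshold].index.tolist()
dropped_scaled = scaled_var[scaled_var < threshold].index.tolist()

print("Dropped on raw values:   ", dropped_raw)
print("Dropped on scaled values:", dropped_scaled)
```

On the raw values both `constant` and `small_range` are dropped, while on the scaled values only `constant` is, even though `small_range` follows the same trend as `large_range`.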

### 3.2 Feature scaling

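Because the sensors measure different physical quantities (e.g., temperatures in °R and pressures in psia), their ranges differ widely【173156099207161†L175-L184】.  To put all features on a common [0, 1] range, we apply min‑max scaling to the operational settings and the retained sensor features, fitting the scaler on the training data only and then applying it to both the training and test sets.  Tree‑based models such as random forests are insensitive to monotonic rescaling of individual features, so this step matters mainly if the same pipeline is later reused with scale‑sensitive models.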

The following code performs feature selection and scaling, and prepares `X_train`, `y_train` for modeling.


In [2]:
from sklearn.preprocessing import MinMaxScaler

# Identify sensor columns
sensor_cols = [col for col in train.columns if col.startswith('sensor_')]

# Drop sensors with very low variance (< 0.01)
variances = train[sensor_cols].var()
drop_sensors = variances[variances < 0.01].index.tolist()

print(f'Dropping {len(drop_sensors)} constant/low‑variance sensors: {drop_sensors}')

# Prepare feature columns: operational settings + remaining sensors
feature_cols = ['op_setting_1', 'op_setting_2', 'op_setting_3'] + [c for c in sensor_cols if c not in drop_sensors]

# Initialize MinMaxScaler and fit on training data
scaler = MinMaxScaler()
train_scaled = train.copy()
test_scaled = test.copy()

train_scaled[feature_cols] = scaler.fit_transform(train[feature_cols])
test_scaled[feature_cols] = scaler.transform(test[feature_cols])

# Prepare training features and target
X_train = train_scaled[feature_cols]
y_train = train_scaled['RUL']


Dropping 10 constant/low‑variance sensors: ['sensor_1', 'sensor_5', 'sensor_6', 'sensor_8', 'sensor_10', 'sensor_13', 'sensor_15', 'sensor_16', 'sensor_18', 'sensor_19']
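
The scaler above is fitted on the training data and then reused on the test data. A minimal sketch with invented numbers shows what that implies: the training values map exactly onto [0, 1], while test values outside the training range map outside that interval.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Invented single-feature data: training range is [10, 30].
train_vals = np.array([[10.0], [20.0], [30.0]])
test_vals = np.array([[5.0], [25.0], [40.0]])

# Fit on training data only, then apply the same mapping to both sets.
toy_scaler = MinMaxScaler().fit(train_vals)
train_out = toy_scaler.transform(train_vals)
test_out = toy_scaler.transform(test_vals)

print(train_out.ravel())  # [0.  0.5 1. ]
print(test_out.ravel())   # [-0.25  0.75  1.5 ]
```

Fitting on the training set alone keeps information about the test distribution out of the preprocessing step, at the cost of occasional out‑of‑range values.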


## 4. Modeling

We choose a **Random Forest Regressor** to estimate the remaining useful life.  Random forests are ensemble models that average the predictions of multiple decision trees; they can capture nonlinear relationships and handle correlated features.  We split the training rows at random into a training set (80%) and a validation set (20%), train the model and evaluate its performance using root‑mean‑squared error (RMSE) and mean absolute error (MAE).  Because the split is made over rows rather than engines, cycles from the same engine appear in both sets, so the validation score should not be read as an estimate of performance on unseen engines.  Cross‑validation could also be used, at a higher computational cost.

The code below fits the model and evaluates it on a hold‑out validation set.  We also compute predictions for the test set and compare them against the provided `RUL_FD001.txt` values.


In [3]:
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Split into train and validation sets
data_X = X_train
data_y = y_train

X_tr, X_val, y_tr, y_val = train_test_split(data_X, data_y, test_size=0.2, random_state=42)

# Define RandomForestRegressor model
rf_model = RandomForestRegressor(n_estimators=200, random_state=42, n_jobs=-1)

# Fit model
rf_model.fit(X_tr, y_tr)

# Predict on validation set
y_pred_val = rf_model.predict(X_val)

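# Calculate evaluation metrics
# (the squared=False argument is deprecated in recent scikit-learn releases,
# so the square root is taken explicitly)
rmse = np.sqrt(mean_squared_error(y_val, y_pred_val))
mae = mean_absolute_error(y_val, y_pred_val)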
print(f'Validation RMSE: {rmse:.4f}')
print(f'Validation MAE: {mae:.4f}')

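# Predict RUL for the last cycle of each engine in the test set.
# Sort first so that tail(1) returns the latest cycle of each engine and the rows
# come out ordered by unit_id; row i of RUL_FD001.txt is assumed to belong to unit_id i.
test_last_cycles = test_scaled.sort_values(['unit_id', 'time_cycles']).groupby('unit_id').tail(1)
X_test_last = test_last_cycles[feature_cols]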

# Predict RUL on test data
y_test_pred = rf_model.predict(X_test_last)

# Compare with true RUL
true_rul = rul_test['RUL'].values

rmse_test = np.sqrt(mean_squared_error(true_rul, y_test_pred))
mae_test = mean_absolute_error(true_rul, y_test_pred)
print('Test set results (comparing predicted RUL on last cycle vs true RUL):')
print(f'Test RMSE: {rmse_test:.4f}')
print(f'Test MAE: {mae_test:.4f}')


Validation RMSE: 41.6561
Validation MAE: 29.8038
Test set results (comparing predicted RUL on last cycle vs true RUL):
Test RMSE: 34.0676
Test MAE: 25.2416
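
For reference, both metrics can be computed directly with NumPy. The sketch below uses invented predictions (not model output) to show why RMSE is never smaller than MAE: squaring gives large errors more weight.

```python
import numpy as np

# Invented true and predicted RUL values, in cycles.
y_true = np.array([100.0, 50.0, 20.0, 10.0])
y_pred = np.array([90.0, 60.0, 50.0, 10.0])

errors = y_pred - y_true                    # [-10, 10, 30, 0]
mae_toy = np.mean(np.abs(errors))           # mean of absolute errors
rmse_toy = np.sqrt(np.mean(errors ** 2))    # square root of mean squared error

print(f"MAE:  {mae_toy:.2f}")   # MAE:  12.50
print(f"RMSE: {rmse_toy:.2f}")  # RMSE: 16.58
```

The single 30‑cycle miss dominates the RMSE, which is one reason the two metrics are reported side by side.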




## 5. Evaluation

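On the hold‑out validation set the random forest reaches an RMSE of 41.66 cycles and an MAE of 29.80 cycles.  On the test set, where the prediction for the last available cycle of each engine is compared with the true RUL in `RUL_FD001.txt`, it reaches an RMSE of 34.07 cycles and an MAE of 25.24 cycles.  The two sets of figures are not directly comparable: the validation set samples rows from every stage of an engine's life, whereas the test evaluation uses a single row per engine.  The test metrics are the better indication of how the model generalizes to unseen engines, because the random validation split shares engines with the training set.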
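
One way to make the validation set consist of unseen engines is to split by engine rather than by row. The following is a minimal sketch using scikit‑learn's `GroupShuffleSplit` on a hypothetical frame of 10 engines. It is an alternative to the row‑wise `train_test_split` used above, not what this notebook ran.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical frame: 10 engines with 5 cycles each.
toy = pd.DataFrame({
    "unit_id":     np.repeat(np.arange(1, 11), 5),
    "time_cycles": np.tile(np.arange(1, 6), 10),
})

# Hold out 20% of the engines (not 20% of the rows).
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, val_idx = next(splitter.split(toy, groups=toy["unit_id"]))

train_units = set(toy.iloc[train_idx]["unit_id"])
val_units = set(toy.iloc[val_idx]["unit_id"])

print("Engines in training:  ", sorted(train_units))
print("Engines in validation:", sorted(val_units))
print("Shared engines:", train_units & val_units)
```

With this split, every cycle of a given engine lands on one side only, so the validation error reflects performance on engines the model has not seen.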

A future improvement would be to perform k‑fold cross‑validation and tune hyperparameters (number of trees, maximum depth, minimum samples per leaf) to improve accuracy.  Feature engineering (e.g., constructing trend‑based features or using sequences) may also boost performance.
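
As a sketch of what trend‑based features might look like, the example below adds a per‑engine rolling mean and a cycle‑to‑cycle difference for one sensor. The data, the window length of 3 and the new column names are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sensor readings for two engines.
toy = pd.DataFrame({
    "unit_id":     [1, 1, 1, 1, 2, 2, 2],
    "time_cycles": [1, 2, 3, 4, 1, 2, 3],
    "sensor_2":    [1.0, 2.0, 3.0, 4.0, 10.0, 10.0, 13.0],
})
toy = toy.sort_values(["unit_id", "time_cycles"]).reset_index(drop=True)

# Group by engine so that windows never span two different engines.
grouped = toy.groupby("unit_id")["sensor_2"]

# Rolling mean over the last 3 cycles (fewer at the start of each trajectory).
toy["sensor_2_roll_mean"] = grouped.transform(
    lambda s: s.rolling(window=3, min_periods=1).mean()
)

# Change since the previous cycle; the first cycle of each engine gets 0.
toy["sensor_2_diff"] = grouped.diff().fillna(0.0)

print(toy)
```

Features of this kind give a row‑wise model such as a random forest some access to recent history, which a single snapshot of sensor values does not contain.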


## 6. Deployment / Conclusion

This notebook followed the CRISP‑DM methodology to build a predictive maintenance model for turbofan engines using NASA’s C‑MAPSS FD001 data.  We described the business value of estimating remaining useful life, explored the structure of the data, prepared it by removing low‑variance sensors and scaling features, and trained a random forest regressor.  Evaluated against the provided test RUL values, the model reached an RMSE of about 34 cycles and an MAE of about 25 cycles, a baseline that leaves room for improvement.

In a production setting, such a model could be integrated into a maintenance management system to alert engineers when an engine is approaching end‑of‑life, enabling proactive maintenance.  Further work could involve using more sophisticated models (e.g., gradient boosting, deep learning), incorporating temporal sequence features and performing rigorous hyperparameter tuning.
