<a href="https://colab.research.google.com/github/arpdm/predictive-maintenance-platform/blob/main/PdM_NASA_ENGINES_003.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [12]:
"""
    Project Name: PdM Model for NASA Jet Engines
    Date: 10/9/2022
    Author: Arpi Derm
    Description: This model uses predictive maintenance (PdM) modeling and remaining useful life (RUL)
                 to predict failures at a point in time T in the near future.
                 It is built and trained specifically for NASA's PCoE Turbofan Engine Degradation (Run-To-Failure) dataset.
    Dataset Description:
                The dataset and its variables are described in the dataset paper by
                Manuel Arias Chao, Chetan Kulkarni, Kai Goebel, and Olga Fink. https://dx.doi.org/10.3390/data6010005
    Variable Name Descriptions:
            w = Scenario-descriptor operating conditions (inputs to system model)
            x_s = sensor signal measurements (physical properties)
            x_v = virtual sensor signals
            t = engine health parameters 
            y_rul = target output Y - RUL (Remaining Useful Life) of engine unit
            aux = Auxiliary data such as cycle count, unit id, ...
"""

# Colab data file preparation
from google.colab import drive

drive.mount("/content/drive")

!cp predictive-maintenance-platform/engine_data.py .
!cp predictive-maintenance-platform/plot_util.py .
!cp predictive-maintenance-platform/pdm_models.py .
!cp predictive-maintenance-platform/pcoe_engine_data_visualizer.py .


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [15]:
import tensorflow as tf

from engine_data import EngineData
from pcoe_engine_data_visualizer import PcoeEngingeVis
from plot_util import PlotUtil
from pdm_models import PdMModels

In [10]:
# Load all datasets
DS_001 = "/content/drive/MyDrive/Predictive_Maintenence_Fault_Detection/data_set/N-CMAPSS_DS01-005.h5"
DS_002 = "/content/drive/MyDrive/Predictive_Maintenence_Fault_Detection/data_set/N-CMAPSS_DS02-006.h5"
DS_003 = "/content/drive/MyDrive/Predictive_Maintenence_Fault_Detection/data_set/N-CMAPSS_DS03-012.h5"
DS_004 = "/content/drive/MyDrive/Predictive_Maintenence_Fault_Detection/data_set/N-CMAPSS_DS04.h5"
DS_005 = "/content/drive/MyDrive/Predictive_Maintenence_Fault_Detection/data_set/N-CMAPSS_DS05.h5"
DS_006 = "/content/drive/MyDrive/Predictive_Maintenence_Fault_Detection/data_set/N-CMAPSS_DS06.h5"
DS_007 = "/content/drive/MyDrive/Predictive_Maintenence_Fault_Detection/data_set/N-CMAPSS_DS07.h5"
DS_008 = "/content/drive/MyDrive/Predictive_Maintenence_Fault_Detection/data_set/N-CMAPSS_DS08a-009.h5"
DS_009 = "/content/drive/MyDrive/Predictive_Maintenence_Fault_Detection/data_set/N-CMAPSS_DS08c-008.h5"
DS_010 = "/content/drive/MyDrive/Predictive_Maintenence_Fault_Detection/data_set/N-CMAPSS_DS08d-010.h5"

In [None]:
# Load data set and prepare data frames
ed = EngineData()
pcoe_enging_vis = PcoeEngingeVis(ed)
plot_util = PlotUtil()
pdmModels = PdMModels()

# Load Data from selected dataset file
ed.load_hdf5_to_numpy_arr(DS_004)

In [None]:
# Generate heatmap of sensor measurements dataframe to find correlation between each feature
plot_util.generate_correlation_heatmap(ed.df_x_s_train)

In [None]:
# Add labels to training dataset and normalize data
ed.df_x_s_train = ed.add_labels_to_dataset(ed.df_x_s_train)
ed.df_x_s_train = ed.normalize_dataset(ed.df_x_s_train)

# Add labels to test dataset and normalize data
ed.df_x_s_test = ed.add_labels_to_dataset(ed.df_x_s_test)
ed.df_x_s_test = ed.normalize_dataset(ed.df_x_s_test)
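
The cell above calls `normalize_dataset` separately on the training and test frames. Its implementation is not shown here, so the following is an assumption: if the method derives its scaling statistics from whichever frame it receives, the two sets end up on slightly different scales. A minimal sketch of the common alternative, which fits min-max statistics on the training data and reuses them for the test data, might look like this. The helper names `fit_min_max` and `apply_min_max` are hypothetical and not part of `EngineData`.

```python
import numpy as np


def fit_min_max(train: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return the per-column minimum and range, computed from training data only."""
    col_min = train.min(axis=0)
    col_range = train.max(axis=0) - col_min
    # Guard against constant columns to avoid division by zero.
    col_range[col_range == 0] = 1.0
    return col_min, col_range


def apply_min_max(data: np.ndarray, col_min: np.ndarray, col_range: np.ndarray) -> np.ndarray:
    """Scale data with statistics that were fitted elsewhere (here: on the training set)."""
    return (data - col_min) / col_range


# Toy values standing in for two sensor columns (hypothetical data).
train = np.array([[10.0, 200.0], [20.0, 400.0], [30.0, 300.0]])
test = np.array([[15.0, 250.0], [35.0, 100.0]])

col_min, col_range = fit_min_max(train)
train_scaled = apply_min_max(train, col_min, col_range)
test_scaled = apply_min_max(test, col_min, col_range)
```

With this approach the training data spans exactly [0, 1], while test values may fall slightly outside that range when the test set contains readings the training set never saw.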

## MODELING

Traditional predictive maintenance models rely on feature engineering: the manual construction of suitable features using domain expertise and similar methods. This usually makes them hard to reuse, because the engineered features are specific to the problem scenario and the available data, both of which vary from one business to another. Perhaps the most attractive aspect of applying deep learning in the predictive maintenance domain is that these networks can automatically extract useful features from the data, eliminating much of the need for manual feature engineering.

The idea of using LSTMs is to let the model extract abstract features from the sequence of sensor values in the window rather than engineering them manually. The expectation is that if the sensor values within the window show a pattern prior to failure, the LSTM will encode that pattern.

One critical advantage of LSTMs is their ability to retain information across long sequences (large window sizes), which is hard to achieve with traditional feature engineering. For example, computing a rolling average over a window of 50 cycles may lose information, because the values are smoothed over such a long period; using all 50 values as input may give better results. While feature engineering over large windows may not make sense, LSTMs can work with larger windows and take all the information in the window as input.
LSTMs also capture long-term dependencies better than a plain RNN architecture.
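
To make the contrast concrete, here is a small sketch (not the notebook's `EngineData` implementation) that builds both representations from a toy sensor series: a 50-cycle rolling mean, which collapses each window to one number per sensor, and the full windows shaped `(samples, window, features)` that an LSTM would consume. The function names `make_windows` and `rolling_mean` are hypothetical.

```python
import numpy as np


def make_windows(series: np.ndarray, window: int) -> np.ndarray:
    """Stack every run of `window` consecutive rows into shape (samples, window, features)."""
    n_samples = series.shape[0] - window + 1
    return np.stack([series[i:i + window] for i in range(n_samples)])


def rolling_mean(series: np.ndarray, window: int) -> np.ndarray:
    """Average each window over time, leaving shape (samples, features)."""
    return make_windows(series, window).mean(axis=1)


# Toy data: 60 cycles of 3 sensors (random values, for shape illustration only).
rng = np.random.default_rng(0)
series = rng.normal(size=(60, 3))

windows = make_windows(series, window=50)
means = rolling_mean(series, window=50)

print(windows.shape)  # (11, 50, 3)
print(means.shape)    # (11, 3)
```

The rolling mean keeps one value per sensor per sample, while the windowed array keeps all 50, which is the extra information the LSTM is expected to exploit.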

In [None]:
# Get single engine data for training and test datasets
single_engine_train = ed.generate_data_frame_for_specific_engine(ed.df_x_s_train, 3)
single_engine_test = ed.generate_data_frame_for_specific_engine(ed.df_x_s_test, 7)

# window of 50 cycles prior to a failure point for engine id 3
engine_id3_windowed = ed.generate_data_frame_for_specific_engine(ed.df_x_s_train, 3, 50)
ax1 = engine_id3_windowed.plot(subplots=True, sharex=True, figsize=(40,40))

In [None]:
window = 40
horizon = 5
batch_size = 256
buffer_size = 150

epochs = 40
steps = 100

# Prepare X and Y training and test sets
x_train, y_train = ed.generate_data_frame_for_x_and_y(single_engine_train)
x_test, y_test = ed.generate_data_frame_for_x_and_y(single_engine_test)

features = x_train.shape[2]
model_mark_001 = pdmModels.create_binary_classifier_model(window, features, 100, 50, 0.2)
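
`create_binary_classifier_model` lives in `pdm_models.py`, which is not shown here. Judging only from the call above, one plausible reading of the arguments is two stacked LSTM layers with 100 and 50 units and a dropout rate of 0.2, ending in a single sigmoid unit. The sketch below follows that assumption and is not the author's implementation; the function name `build_lstm_binary_classifier` and the feature count of 14 are placeholders.

```python
import tensorflow as tf


def build_lstm_binary_classifier(window, features, units_1, units_2, dropout):
    """Assumed architecture: LSTM -> Dropout -> LSTM -> Dropout -> sigmoid output."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(window, features)),
        # First LSTM returns the full sequence so the second LSTM can consume it.
        tf.keras.layers.LSTM(units_1, return_sequences=True),
        tf.keras.layers.Dropout(dropout),
        # Second LSTM returns only its final state.
        tf.keras.layers.LSTM(units_2),
        tf.keras.layers.Dropout(dropout),
        # Single sigmoid unit: probability of the positive (failure) class.
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model


# Placeholder feature count; in the notebook it comes from x_train.shape[2].
sketch_model = build_lstm_binary_classifier(window=40, features=14, units_1=100, units_2=50, dropout=0.2)
```

Calling `sketch_model.summary()` prints the layer shapes, which is a quick way to check that the input matches the `(window, features)` layout of the prepared training data.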

In [None]:
# Fit the model
mark_001_history = pdmModels.fit_model(model_mark_001, x_train, y_train, epochs, steps, batch_size, 10)

In [None]:
# Generate accuracy and loss plots based on fitting history and save it to drive
MARK_001_Data = "/content/drive/MyDrive/Predictive_Maintenance_Fault_Detection/mark_001"
plot_util.generate_training_history_accuracy_and_loss_plot(mark_001_history, MARK_001_Data)