
# **Aircraft Engine Maintenance Prognosis**

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


# **Importing Libraries**

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import accuracy_score, classification_report

# **Loading Datasets**

df_train, df_test, df_truth: The training, test, and ground-truth datasets are read from Excel files containing aircraft engine operating settings and sensor readings.

In [3]:
df_train = pd.read_excel(r'/content/drive/MyDrive/PM_train.xlsx')
df_test = pd.read_excel(r'/content/drive/MyDrive/PM_test.xlsx')
df_truth = pd.read_excel(r'/content/drive/MyDrive/PM_truth.xlsx')

# **Data Concatenation**

Concatenation: Combining df_train and df_test into a single dataframe (df) to work with the entire dataset.

In [4]:
df = pd.concat([df_train, df_test], ignore_index=True)
df.head()

|   | id | cycle | setting1 | setting2 | setting3 | s1 | s2 | s3 | s4 | s5 | ... | s12 | s13 | s14 | s15 | s16 | s17 | s18 | s19 | s20 | s21 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | -0.0007 | -0.0004 | 100 | 518.67 | 641.82 | 1589.7 | 1400.6 | 14.62 | ... | 521.66 | 2388.02 | 8138.62 | 8.4195 | 0.03 | 392.0 | 2388 | 100 | 39.06 | 23.419 |
| 1 | 1 | 2 | 0.0019 | -0.0003 | 100 | 518.67 | 642.15 | 1591.82 | 1403.14 | 14.62 | ... | 522.28 | 2388.07 | 8131.49 | 8.4318 | 0.03 | 392.0 | 2388 | 100 | 39.0 | 23.4236 |
| 2 | 1 | 3 | -0.0043 | 0.0003 | 100 | 518.67 | 642.35 | 1587.99 | 1404.2 | 14.62 | ... | 522.42 | 2388.03 | 8133.23 | 8.4178 | 0.03 | 390.0 | 2388 | 100 | 38.95 | 23.3442 |
| 3 | 1 | 4 | 0.0007 | 0.0 | 100 | 518.67 | 642.35 | 1582.79 | 1401.87 | 14.62 | ... | 522.86 | 2388.08 | 8133.83 | 8.3682 | 0.03 | 392.0 | 2388 | 100 | 38.88 | 23.3739 |
| 4 | 1 | 5 | -0.0019 | -0.0002 | 100 | 518.67 | 642.37 | 1582.85 | 1406.22 | 14.62 | ... | 522.19 | 2388.04 | 8133.8 | 8.4294 | 0.03 | 393.0 | 2388 | 100 | 38.9 | 23.4044 |
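
One thing worth checking after the concatenation is whether the two files reuse the same engine ids. If they do (an assumption to verify with `df_train['id'].unique()` and `df_test['id'].unique()`), any later `groupby('id')` would pool rows from two different engines. The sketch below uses toy frames and a hypothetical `source` column, which is not part of the original data, to show the effect and one possible way to keep the engines apart.

```python
import pandas as pd

# Toy stand-ins for df_train and df_test; both reuse engine id 1.
toy_train = pd.DataFrame({"id": [1, 1, 1], "cycle": [1, 2, 3]})
toy_test = pd.DataFrame({"id": [1, 1], "cycle": [1, 2]})

# 'source' is a hypothetical helper column that records the file of origin.
toy_df = pd.concat(
    [toy_train.assign(source="train"), toy_test.assign(source="test")],
    ignore_index=True,
)

# Grouping by id alone pools both engines under id 1.
max_by_id = toy_df.groupby("id")["cycle"].max()

# Grouping by (source, id) keeps the two engines separate.
max_by_source_id = toy_df.groupby(["source", "id"])["cycle"].max()

print(max_by_id)
print(max_by_source_id)
```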


In [5]:
df.columns

Index(['id', 'cycle', 'setting1', 'setting2', 'setting3', 's1', 's2', 's3',
       's4', 's5', 's6', 's7', 's8', 's9', 's10', 's11', 's12', 's13', 's14',
       's15', 's16', 's17', 's18', 's19', 's20', 's21'],
      dtype='object')

# **Feature Engineering**

**RUL**:

Remaining Useful Life (RUL) is the number of operating cycles an asset is expected to complete before it fails. For an aircraft engine, it estimates when the engine will no longer meet performance requirements. With an RUL estimate, maintenance can be scheduled ahead of failure, which helps allocate resources, limit downtime, and reduce the risk of unexpected failures.

a) Remaining Useful Life (RUL): Calculated by subtracting the current cycle from the maximum recorded cycle for each engine. This treats the last recorded cycle of an engine as its point of failure.

b) Feature Columns: Defined a list of feature columns for model training.

In [6]:
# Pass the aggregation by name; recent pandas versions warn when given the built-in max
df['RUL'] = df.groupby('id')['cycle'].transform('max') - df['cycle']
feature_columns = ['setting1', 'setting2', 'setting3', 's1', 's2', 's3', 's4', 's5', 's6', 's7', 's8', 's9', 's10', 's11', 's12', 's13', 's14', 's15', 's16', 's17', 's18', 's19', 's20', 's21']
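
The RUL formula can be checked by hand on a small frame. The sketch below uses a toy DataFrame (not the project data) with two engines. It relies on the same assumption as the formula itself: the last recorded cycle is the failure point, which holds only for engines recorded all the way to failure.

```python
import pandas as pd

# Toy data: engine 1 runs for 3 cycles, engine 2 for 2 cycles.
toy = pd.DataFrame({"id": [1, 1, 1, 2, 2], "cycle": [1, 2, 3, 1, 2]})

# RUL = (last recorded cycle of the engine) - (current cycle)
toy["RUL"] = toy.groupby("id")["cycle"].transform("max") - toy["cycle"]

print(toy["RUL"].tolist())  # [2, 1, 0, 1, 0]
```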

# **Merging Truth Data**

Merge Operation: A left merge on the engine ID attaches the columns of the truth data (df_truth) to every row of df. Rows whose ID has no match in df_truth receive missing values in those columns.

In [7]:
df = pd.merge(df, df_truth, on='id', how='left')
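
A left merge can silently duplicate rows or leave unmatched ids, so a quick check may be useful. The sketch below runs on toy frames; the column name `more` is a hypothetical placeholder for whatever value column df_truth actually holds. The `validate` and `indicator` arguments of `pd.merge` raise an error on duplicate keys in the right frame and mark unmatched rows.

```python
import pandas as pd

toy_df = pd.DataFrame({"id": [1, 1, 2, 3], "cycle": [1, 2, 1, 1]})
# 'more' is a hypothetical column name standing in for the truth value.
toy_truth = pd.DataFrame({"id": [1, 2], "more": [112, 98]})

merged = pd.merge(
    toy_df,
    toy_truth,
    on="id",
    how="left",
    validate="many_to_one",  # raises if toy_truth has duplicate ids
    indicator=True,          # adds a '_merge' column describing each row's match
)

# Engine ids present in toy_df but missing from toy_truth.
unmatched = merged.loc[merged["_merge"] == "left_only", "id"].unique()
print(len(merged) == len(toy_df), list(unmatched))
```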

# **Labeling**

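Threshold Labeling: Creating a binary label ('label') that is 1 when RUL is at most 25 cycles and 0 otherwise, which frames the task as a classification problem. The regression model below is not trained on this column; the same threshold is applied to its predictions at evaluation time.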

In [8]:
threshold = 25
df['label'] = (df['RUL'] <= threshold).astype(int)
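
With a threshold this low relative to engine lifetimes, the positive class is likely to be rare, so it may help to look at the class shares before interpreting accuracy. A minimal sketch on toy RUL values (not the project data):

```python
import pandas as pd

threshold = 25
toy = pd.DataFrame({"RUL": [0, 10, 25, 26, 100, 150, 200, 250]})

# 1 when the engine is within `threshold` cycles of its last recorded cycle.
toy["label"] = (toy["RUL"] <= threshold).astype(int)

# Fraction of rows in each class.
share = toy["label"].value_counts(normalize=True)
print(share)
```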

# **Train-Test Split**

Splitting Data: Using the train-test split function to divide the data into training and testing sets. The model will be trained on the training set and evaluated on the testing set.

In [9]:
X_train, X_test, y_train, y_test = train_test_split(df[feature_columns], df['RUL'], test_size=0.2, random_state=42)
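
The split above shuffles individual rows, so cycles from the same engine can land in both the training and the testing set. One possible alternative, shown here as a sketch on synthetic data rather than as the method used in this notebook, is to split by engine with scikit-learn's `GroupShuffleSplit`. The column `s2` and the toy values are placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "id": np.repeat(np.arange(1, 11), 5),      # 10 engines, 5 cycles each
    "s2": rng.normal(size=50),                 # placeholder sensor values
    "RUL": np.tile(np.arange(4, -1, -1), 10),  # 4, 3, 2, 1, 0 per engine
})

# Hold out 20% of the engines (not 20% of the rows).
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, test_idx = next(
    splitter.split(toy[["s2"]], toy["RUL"], groups=toy["id"])
)

train_ids = set(toy.iloc[train_idx]["id"])
test_ids = set(toy.iloc[test_idx]["id"])
print(train_ids & test_ids)  # no engine appears in both sets
```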

# **Model Training**

Linear Regression Model: Creating and training a linear regression model using the feature columns defined earlier.

In [10]:
# Creating a linear regression model
model = LinearRegression()
# Train the model
model.fit(X_train, y_train)
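
After fitting, the learned coefficients can be inspected to see which inputs the model relies on. The sketch below fits a linear regression to synthetic data with known coefficients; the column names `s_a` and `s_b` are hypothetical. On the real model, the same pattern would be `pd.Series(model.coef_, index=feature_columns)`.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
toy_X = pd.DataFrame({
    "s_a": rng.normal(size=100),
    "s_b": rng.normal(size=100),
})
# Noise-free target with known coefficients 3 and -2 and intercept 5.
toy_y = 3.0 * toy_X["s_a"] - 2.0 * toy_X["s_b"] + 5.0

toy_model = LinearRegression().fit(toy_X, toy_y)

# Label each coefficient with its feature name.
coefs = pd.Series(toy_model.coef_, index=toy_X.columns)
print(coefs.round(3))
print(round(float(toy_model.intercept_), 3))
```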

# **Prediction and Evaluation**

Making Predictions: Applying the trained model to predict RUL on the test set.

Binary Conversion: Converting both the true and the predicted RUL values to binary labels (1 if RUL is at most the threshold, otherwise 0) so the regression output can be evaluated as a classification problem.

Evaluation Metrics: Calculating accuracy and a classification report to assess the model's performance.

In [11]:
# Make predictions on the test set
y_pred = model.predict(X_test)

# y_test holds the true RUL values and y_pred the predicted RUL values;
# both are thresholded to obtain binary labels
y_test_binary = (y_test <= threshold).astype(int)
y_pred_binary = (y_pred <= threshold).astype(int)

# Evaluate the model
accuracy = accuracy_score(y_test_binary, y_pred_binary)
print(f'Accuracy: {accuracy}')
print(classification_report(y_test_binary, y_pred_binary))

Accuracy: 0.9551734725207246
              precision    recall  f1-score   support

           0       0.96      0.99      0.98      5995
           1       0.89      0.50      0.64       519

    accuracy                           0.96      6514
   macro avg       0.92      0.75      0.81      6514
weighted avg       0.95      0.96      0.95      6514
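
The report above scores only the thresholded labels. Since the model predicts RUL as a number, regression metrics would describe the estimates more directly, and a majority-class baseline puts the accuracy in context. The sketch below uses small toy arrays in place of `y_test` and `y_pred`; the baseline is computed from the support counts printed in the report.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Toy values standing in for y_test and y_pred.
y_true = np.array([0, 10, 20, 30, 100, 150])
y_hat = np.array([5, 12, 30, 28, 90, 160])

mse = mean_squared_error(y_true, y_hat)
rmse = np.sqrt(mse)                      # same units as RUL (cycles)
mae = mean_absolute_error(y_true, y_hat)
r2 = r2_score(y_true, y_hat)
print(f"MSE={mse:.1f} RMSE={rmse:.2f} MAE={mae:.1f} R2={r2:.3f}")

# Accuracy of always predicting class 0, using the supports in the report:
# 5995 rows of class 0 out of 6514 test rows.
majority_baseline = 5995 / 6514
print(f"Majority-class accuracy: {majority_baseline:.4f}")
```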



# **Conclusion**

In this project, we aimed to predict the Remaining Useful Life (RUL) of aircraft engines using a regression approach. The dataset, consisting of sensor readings and engine parameters, was utilized to train a Linear Regression model. The key steps in the project include:

Data Preprocessing: Concatenating the training and test datasets, calculating RUL, and defining feature columns.

Model Training: A Linear Regression model was employed to predict the RUL based on the selected features.

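Evaluation: The true and predicted RUL values were converted to binary labels at a threshold of 25 cycles and scored with accuracy and a classification report. No regression metric such as Mean Squared Error (MSE) was computed.

The model reached an accuracy of about 0.96, but the classes are imbalanced: only 519 of the 6,514 test rows fall at or below the threshold, so always predicting the majority class would already score about 0.92. For the near-failure class, precision was 0.89 and recall was 0.50, meaning the model missed roughly half of the rows within 25 cycles of the last recorded cycle. No dedicated classification model was trained in this implementation; the binary labels were derived from the regression output.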

Future work may involve exploring classification models to predict binary outcomes related to engine health, such as whether an engine will fail within a certain time frame. This would provide a more nuanced understanding of maintenance needs and further enhance the practical applications of predictive maintenance strategies.
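
As a rough illustration of that direction, the sketch below trains a logistic regression classifier on synthetic, imbalanced data. It is not fitted to the engine dataset, and the choice of `class_weight="balanced"` and feature scaling are assumptions about what might help with a rare positive class, not results from this project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with roughly 8% positives, loosely mirroring the imbalance above.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.92, 0.08], random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scale the features, then fit a classifier that reweights the rare class.
clf = make_pipeline(
    StandardScaler(),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
clf.fit(X_tr, y_tr)

y_hat = clf.predict(X_te)
positive_recall = recall_score(y_te, y_hat)
print(classification_report(y_te, y_hat))
```

On the real data, `df['label']` would take the place of `y` and `df[feature_columns]` the place of `X`.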