### Quantitative Metrics for Model Evaluation

**Objective:**

  - Calculate key evaluation metrics to assess model prediction performance on the test set.
  - Understand RMSE (Root Mean Squared Error), MAE (Mean Absolute Error), and the R² score as the primary metrics for RUL prediction.
  - Implement metric calculations using the test dataset and previously trained model predictions.

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


#### 1. Imports and Setup

In [2]:
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
import tensorflow as tf

# Suppress warnings for clearer output
import warnings
warnings.filterwarnings('ignore')

#### 2. Explanation: Evaluation Metrics for Regression (RUL Prediction)


2.1 Root Mean Squared Error (RMSE):
- RMSE measures the square root of the average squared differences between predicted and actual values.
- It penalizes larger errors more heavily, making it sensitive to outliers.
- Lower RMSE indicates better model performance.

Formula: RMSE = sqrt( (1/N) * Σ(y_true_i - y_pred_i)^2 )
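
The RMSE formula above can be traced by hand on a tiny example. This is a minimal sketch using NumPy only; the four `y_true` / `y_pred` values are made-up illustration values, not project data.

```python
import numpy as np

# Hypothetical toy values for illustration (not taken from the test set)
y_true = np.array([100.0, 80.0, 60.0, 40.0])
y_pred = np.array([90.0, 85.0, 60.0, 20.0])

errors = y_true - y_pred        # [10, -5, 0, 20]
mse = np.mean(errors ** 2)      # (100 + 25 + 0 + 400) / 4 = 131.25
rmse = np.sqrt(mse)             # square root brings the value back to RUL units

print(f"MSE:  {mse:.2f}")
print(f"RMSE: {rmse:.4f}")
```

The single error of 20 contributes 400 of the 525 total squared error, which is why RMSE reacts strongly to a few large misses.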

2.2 Mean Absolute Error (MAE):
- MAE measures the average absolute differences between predicted and actual values.
- It provides a straightforward interpretation of average error magnitude.
- It is less sensitive to large errors than RMSE because errors are not squared.

Formula: MAE = (1/N) * Σ|y_true_i - y_pred_i|
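
For comparison, here is the same kind of hand calculation for MAE. As before, this is a sketch on made-up values (the same hypothetical arrays used for illustration, not project data).

```python
import numpy as np

# Hypothetical toy values for illustration (not taken from the test set)
y_true = np.array([100.0, 80.0, 60.0, 40.0])
y_pred = np.array([90.0, 85.0, 60.0, 20.0])

abs_errors = np.abs(y_true - y_pred)   # [10, 5, 0, 20]
mae = np.mean(abs_errors)              # (10 + 5 + 0 + 20) / 4 = 8.75

print(f"MAE: {mae:.2f}")
```

On these values MAE (8.75) is lower than RMSE (about 11.46). The gap between the two widens as the error distribution gets more uneven.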


#### 3. Load Test Data and Previous Model

In [5]:
# Load test features and labels
X_test = np.load("/content/drive/MyDrive/prognosAI-Infosys-intern-project/milestone_2/week_3/Day_10/rolling_window_sequences.npy")
metadata_test = pd.read_csv("/content/drive/MyDrive/prognosAI-Infosys-intern-project/milestone_2/week_3/Day_11/sequence_metadata_with_RUL.csv")
y_test = metadata_test['RUL'].values

# Print shapes
print("Test feature shape:", X_test.shape)
print("Test target shape:", y_test.shape)

Test feature shape: (152559, 30, 66)
Test target shape: (152559,)
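
Before predicting, it can help to confirm that features and targets agree on the number of samples, as the two printed shapes above do. The helper below is a minimal sketch: `check_aligned` is a hypothetical name, and the demo arrays are zero-filled placeholders that only mimic the `(N, 30, 66)` layout with a small `N`.

```python
import numpy as np

def check_aligned(X, y):
    """Return the sample count, or raise ValueError if X and y disagree."""
    if y.ndim != 1:
        raise ValueError(f"Expected 1-D targets, got shape {y.shape}")
    if X.shape[0] != y.shape[0]:
        raise ValueError(
            f"Sample count mismatch: X has {X.shape[0]}, y has {y.shape[0]}"
        )
    return X.shape[0]

# Placeholder arrays (assumption: same window length and feature count as above)
X_demo = np.zeros((5, 30, 66), dtype=np.float32)
y_demo = np.zeros(5)

n_samples = check_aligned(X_demo, y_demo)
print("Aligned samples:", n_samples)
```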


In [8]:
# Load the best saved model (bidirectional LSTM)
model = tf.keras.models.load_model('/content/drive/MyDrive/prognosAI-Infosys-intern-project/milestone_2/week_4/Day_16/best_model.keras')

#### 4. Generate Predictions on Test Set

In [9]:
y_pred = model.predict(X_test)
y_pred = y_pred.flatten() # Flatten to 1-D array for metric evaluations

[1m4768/4768[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m14s[0m 3ms/step
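
The flatten step matters if the metrics are ever computed by hand with NumPy. The sketch below assumes the model ends in a single output unit, so that `model.predict` returns an array of shape `(N, 1)`; the small arrays are made-up stand-ins for real predictions.

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.0])            # shape (3,), like y_test
y_pred_2d = np.array([[2.0], [5.0], [9.0]])   # shape (3, 1), assumed predict() layout

# Without flattening, NumPy broadcasts (3,) against (3, 1) into a (3, 3) matrix
bad = y_true - y_pred_2d
# After flattening, the subtraction is element-wise as intended
good = y_true - y_pred_2d.flatten()

print("Unflattened difference shape:", bad.shape)
print("Flattened difference shape:  ", good.shape)
```

The unflattened subtraction runs without error, so any statistic computed from it would be silently wrong. That is the reason to flatten first.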


#### 5. Calculate RMSE, MAE, and R²

In [None]:
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Root Mean Squared Error (RMSE): {rmse:.4f}")
print(f"Mean Absolute Error (MAE): {mae:.4f}")
print(f"R² Score: {r2:.4f}")

Root Mean Squared Error (RMSE): 33.9933
Mean Absolute Error (MAE): 25.2118
R² Score: 0.8696


In [10]:
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Root Mean Squared Error (RMSE): {rmse:.4f}")
print(f"Mean Absolute Error (MAE): {mae:.4f}")
print(f"R² Score: {r2:.4f}")

Root Mean Squared Error (RMSE): 22.6019
Mean Absolute Error (MAE): 16.7243
R² Score: 0.9424
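
The R² score printed above is not defined in Section 2. It compares the model's squared error with that of a baseline that always predicts the mean of the true values: R² = 1 - SS_res / SS_tot. The sketch below computes it by hand; `r2_manual` is a hypothetical helper name and the arrays are made-up illustration values.

```python
import numpy as np

def r2_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # model's squared error
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # mean-baseline squared error
    return 1.0 - ss_res / ss_tot

# Hypothetical toy values for illustration (not taken from the test set)
y_true = np.array([100.0, 80.0, 60.0, 40.0])
y_pred = np.array([90.0, 85.0, 60.0, 20.0])

r2_demo = r2_manual(y_true, y_pred)   # 1 - 525 / 2000
print(f"R^2: {r2_demo:.4f}")
```

A value of 1 means perfect predictions, 0 means no better than predicting the mean, and negative values mean worse than that baseline.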


#### 6. Observations

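* All three evaluation metrics improved substantially in the second run (`In [10]`) compared with the first.

* RMSE dropped from 33.99 to 22.60 (about 33.5%), which suggests fewer large prediction errors, since RMSE weights those most heavily.

* MAE dropped from 25.21 to 16.72 (about 33.7%), so the typical prediction error shrank by roughly 8.5 RUL units.

* R² increased from 0.87 to 0.94, meaning the model now explains about 94% of the variance in RUL on this test set.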
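
The relative changes between the two runs can be checked directly from the printed metrics. This is a small sketch; the numbers are copied from the two outputs above, and the dictionary names are only illustrative.

```python
# Metric values copied from the two printed outputs in Section 5
first_run = {"RMSE": 33.9933, "MAE": 25.2118, "R2": 0.8696}
second_run = {"RMSE": 22.6019, "MAE": 16.7243, "R2": 0.9424}

# Relative reduction for the error metrics (lower is better)
rmse_drop = (first_run["RMSE"] - second_run["RMSE"]) / first_run["RMSE"] * 100
mae_drop = (first_run["MAE"] - second_run["MAE"]) / first_run["MAE"] * 100
# Absolute gain for R² (higher is better)
r2_gain = second_run["R2"] - first_run["R2"]

print(f"RMSE reduction: {rmse_drop:.1f}%")
print(f"MAE reduction:  {mae_drop:.1f}%")
print(f"R2 gain:        {r2_gain:.4f}")
```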