# Power Transformer Oil Temperature Prediction using RNN

## Introduction

This notebook implements a Recurrent Neural Network (RNN) model for predicting the oil temperature (OT) of power transformers. 

**Dataset**: ETDataset (Electricity Transformer Temperature)
- Source: https://github.com/zhouhaoyi/ETDataset
- Introduced alongside the Informer model (AAAI 2021 Best Paper)

**Features**:
- HUFL: High UseFul Load
- HULL: High UseLess Load  
- MUFL: Middle UseFul Load
- MULL: Middle UseLess Load
- LUFL: Low UseFul Load
- LULL: Low UseLess Load

**Target**: OT (Oil Temperature)

**Goal**: Build an RNN model to predict oil temperature based on historical load data and demonstrate:
- Time series data preprocessing
- RNN architecture for sequence prediction
- Model training and evaluation
- Performance metrics and visualization

## Step 1: Import Required Libraries

We'll use:
- pandas/numpy for data manipulation
- sklearn for preprocessing and metrics
- tensorflow/keras for building the RNN model
- matplotlib/seaborn for visualization

In [None]:
# Data manipulation
import pandas as pd
import numpy as np

# Preprocessing and metrics
# (train_test_split is not imported: the time series is split chronologically, not randomly)
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Deep Learning
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Utilities
import warnings
warnings.filterwarnings('ignore')

# Set random seeds for reproducibility
np.random.seed(42)
tf.random.set_seed(42)

print(f"TensorFlow version: {tf.__version__}")
print(f"GPU Available: {tf.config.list_physical_devices('GPU')}")

## Step 2: Load and Explore Data

We'll load the transformer temperature dataset and perform initial exploration to understand:
- Data shape and structure
- Missing values
- Statistical properties
- Time series characteristics

In [None]:
def load_data(filepath):
    """
    Load the ETT dataset
    
    Parameters:
    -----------
    filepath : str
        Path to the CSV file
    
    Returns:
    --------
    df : pd.DataFrame
        Loaded dataframe with date as index
    """
    df = pd.read_csv(filepath)
    df['date'] = pd.to_datetime(df['date'])
    df = df.set_index('date')
    return df

# Load pre-split training and test data
train_filepath = '../dataset/processed_data/train.csv'
test_filepath = '../dataset/processed_data/test.csv'

df_train_full = load_data(train_filepath)
df_test_full = load_data(test_filepath)

# For initial exploration and model training, we'll use the training data
# The test data will be used only for final evaluation
df = df_train_full

print("Dataset loaded from pre-split files:")
print(f"  Training data: {train_filepath}")
print(f"  Test data: {test_filepath}")
print(f"\nTraining set shape: {df_train_full.shape}")
print(f"Test set shape: {df_test_full.shape}")
print(f"\nFirst few rows of training data:")
print(df.head())
print("\nDataset Info:")
print(df.info())
print("\nBasic Statistics:")
print(df.describe())
print("\nMissing Values:")
print(df.isnull().sum())

## Step 3: Data Visualization

Visualize the time series to understand patterns and relationships

In [None]:
# Plot all features over time
fig, axes = plt.subplots(4, 2, figsize=(15, 12))
fig.suptitle('Time Series of All Features', fontsize=16)

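# Pair each column with a subplot; zip stops at whichever runs out first
for ax, col in zip(axes.ravel(), df.columns):
    ax.plot(df.index[:2000], df[col].iloc[:2000])  # Plot first 2000 points for clarity
    ax.set_title(col)
    ax.set_xlabel('Time')
    ax.set_ylabel('Value')
    ax.grid(True, alpha=0.3)

# Hide any subplot left unused when there are fewer columns than grid cells
for ax in axes.ravel()[len(df.columns):]:
    ax.set_visible(False)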

plt.tight_layout()
plt.show()

# Correlation heatmap
plt.figure(figsize=(10, 8))
sns.heatmap(df.corr(), annot=True, cmap='coolwarm', center=0, 
            square=True, linewidths=1, fmt='.2f')
plt.title('Feature Correlation Heatmap')
plt.tight_layout()
plt.show()

## Step 4: Data Preprocessing

### 4.1 Feature and Target Separation

We'll separate features (load data) from target (OT - Oil Temperature)

In [None]:
def prepare_features_target(df):
    """
    Separate features and target variable
    
    Parameters:
    -----------
    df : pd.DataFrame
        Input dataframe
    
    Returns:
    --------
    X : np.ndarray
        Features (all columns except OT)
    y : np.ndarray
        Target (OT column)
    """
    # Features: all columns except OT
    X = df.drop('OT', axis=1).values
    # Target: OT (Oil Temperature)
    y = df['OT'].values
    
    return X, y

X, y = prepare_features_target(df)
print(f"Features shape: {X.shape}")
print(f"Target shape: {y.shape}")
print(f"\nFeature names: {df.drop('OT', axis=1).columns.tolist()}")

### 4.2 Data Normalization

Standardize features and target to zero mean and unit variance, which improves RNN training stability and convergence.
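
As a rough illustration of what the scaling step does, the sketch below reproduces the z-score arithmetic with plain NumPy. The helper names (`fit_standardizer`, `standardize`, `destandardize`) and the toy temperatures are hypothetical and only for illustration; unlike a full scaler, this sketch does not guard against a zero standard deviation.

```python
import numpy as np

def fit_standardizer(values):
    """Return the per-column mean and standard deviation of a 2-D array."""
    mean = values.mean(axis=0)
    std = values.std(axis=0)  # population standard deviation (ddof=0)
    return mean, std

def standardize(values, mean, std):
    """Map values to zero mean and unit variance using the given statistics."""
    return (values - mean) / std

def destandardize(scaled, mean, std):
    """Undo standardize(): map scaled values back to the original units."""
    return scaled * std + mean

# Hypothetical toy temperatures, not taken from the dataset
train_ot = np.array([[10.0], [20.0], [30.0]])
test_ot = np.array([[40.0]])

mean, std = fit_standardizer(train_ot)          # statistics from training data only
train_scaled = standardize(train_ot, mean, std)
test_scaled = standardize(test_ot, mean, std)   # reuse the training statistics
restored = destandardize(test_scaled, mean, std)

print(train_scaled.mean(), train_scaled.std())
print(restored)
```

Computing the statistics on the training data alone keeps information about the test distribution out of the model, and the same statistics are needed later to map predictions back to °C.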

In [None]:
def normalize_data(X, y):
    """
    Normalize features and target using StandardScaler
    
    Parameters:
    -----------
    X : np.ndarray
        Features
    y : np.ndarray
        Target
    
    Returns:
    --------
    X_scaled : np.ndarray
        Normalized features
    y_scaled : np.ndarray
        Normalized target
    scaler_X : StandardScaler
        Fitted scaler for features
    scaler_y : StandardScaler
        Fitted scaler for target
    """
    scaler_X = StandardScaler()
    scaler_y = StandardScaler()
    
    X_scaled = scaler_X.fit_transform(X)
    y_scaled = scaler_y.fit_transform(y.reshape(-1, 1)).flatten()
    
    return X_scaled, y_scaled, scaler_X, scaler_y

X_scaled, y_scaled, scaler_X, scaler_y = normalize_data(X, y)
print("Data normalized successfully!")
print(f"\nFeatures - Mean: {X_scaled.mean():.4f}, Std: {X_scaled.std():.4f}")
print(f"Target - Mean: {y_scaled.mean():.4f}, Std: {y_scaled.std():.4f}")

### 4.3 Create Time Series Sequences

Transform data into sequences for RNN input. We use a sliding window approach:
- Input: Past `seq_length` time steps
- Output: Next time step's oil temperature
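
A tiny sketch of the sliding window on made-up data may help make the shapes concrete. The arrays `X_toy` and `y_toy` and the window size of 3 are hypothetical and chosen only so the result is easy to check by hand.

```python
import numpy as np

# Hypothetical toy series: 6 time steps, 2 features, target = step index * 10
X_toy = np.arange(12).reshape(6, 2)
y_toy = np.arange(6) * 10
window = 3

n_samples = len(X_toy) - window
# Each sample holds `window` consecutive rows of features...
X_windows = np.array([X_toy[i:i + window] for i in range(n_samples)])
# ...and its label is the target at the step right after the window
y_next = np.array([y_toy[i + window] for i in range(n_samples)])

print(X_windows.shape)  # (3, 3, 2)
print(y_next)           # [30 40 50]
```

Note that with this construction the features at the predicted time step itself are not part of the input; only the preceding `window` steps are.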

In [None]:
def create_sequences(X, y, seq_length=24):
    """
    Create sequences for time series prediction
    
    Parameters:
    -----------
    X : np.ndarray
        Features array
    y : np.ndarray
        Target array
    seq_length : int
        Number of time steps to look back (default: 24 = 6 hours with 15-min intervals)
    
    Returns:
    --------
    X_seq : np.ndarray
        Sequences of features (samples, seq_length, n_features)
    y_seq : np.ndarray
        Target values for each sequence
    """
    X_seq = []
    y_seq = []
    
    for i in range(len(X) - seq_length):
        X_seq.append(X[i:i+seq_length])
        y_seq.append(y[i+seq_length])
    
    return np.array(X_seq), np.array(y_seq)

# Create sequences with lookback window of 24 time steps (6 hours)
seq_length = 24
X_seq, y_seq = create_sequences(X_scaled, y_scaled, seq_length)

print(f"Sequence length (lookback window): {seq_length}")
print(f"X_seq shape: {X_seq.shape} (samples, time_steps, features)")
print(f"y_seq shape: {y_seq.shape}")
print(f"\nExample:")
print(f"- Input: {seq_length} time steps of {X_seq.shape[2]} features")
print(f"- Output: 1 time step oil temperature prediction")

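### 4.4 Train/Validation/Test Split

The data is already split into training and test files, so we only need to carve a validation set out of the training data. Because this is a time series, the split is temporal (chronological), not random.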
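
The idea of a temporal split can be sketched as a single cut point in time order. The helper `chronological_split` and its optional `gap` parameter are hypothetical additions for illustration, not something this notebook defines; the gap is one possible way to keep windows on either side of the cut from sharing time steps.

```python
import numpy as np

def chronological_split(X, y, train_fraction=0.8, gap=0):
    """Split at one cut point so every held-out sample comes after every training sample.

    `gap` drops that many samples right after the cut (hypothetical option).
    """
    cut = int(len(X) * train_fraction)
    return X[:cut], X[cut + gap:], y[:cut], y[cut + gap:]

# Hypothetical toy data whose values equal their time index
X_toy = np.arange(10).reshape(10, 1)
y_toy = np.arange(10)

X_tr, X_va, y_tr, y_va = chronological_split(X_toy, y_toy)
print(len(X_tr), len(X_va))     # 8 2
print(y_tr.max() < y_va.min())  # True

X_tr_g, X_va_g, _, _ = chronological_split(X_toy, y_toy, train_fraction=0.6, gap=2)
print(len(X_tr_g), len(X_va_g))  # 6 2
```

A random shuffle would instead place future samples in the training set, which tends to make validation scores look better than they would be on genuinely unseen data.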

In [None]:
# Since data is already pre-split into train and test sets,
# we process them separately and create sequences for each

# Process training data
X_train_full, y_train_full = prepare_features_target(df_train_full)
X_train_scaled, y_train_scaled, scaler_X, scaler_y = normalize_data(X_train_full, y_train_full)
X_train_seq, y_train_seq = create_sequences(X_train_scaled, y_train_scaled, seq_length)

# Further split training data for validation (80% train, 20% validation)
val_split_idx = int(len(X_train_seq) * 0.8)
X_train = X_train_seq[:val_split_idx]
y_train = y_train_seq[:val_split_idx]
X_val = X_train_seq[val_split_idx:]
y_val = y_train_seq[val_split_idx:]

# Process test data (using fitted scalers from training)
X_test_full, y_test_full = prepare_features_target(df_test_full)
X_test_scaled = scaler_X.transform(X_test_full)
y_test_scaled = scaler_y.transform(y_test_full.reshape(-1, 1)).flatten()
X_test, y_test = create_sequences(X_test_scaled, y_test_scaled, seq_length)

print("Data prepared from pre-split files!")
print(f"\nTraining set (for model fitting):")
print(f"  X_train shape: {X_train.shape}")
print(f"  y_train shape: {y_train.shape}")
print(f"\nValidation set (from training data):")
print(f"  X_val shape: {X_val.shape}")
print(f"  y_val shape: {y_val.shape}")
print(f"\nTest set (held-out data):")
print(f"  X_test shape: {X_test.shape}")
print(f"  y_test shape: {y_test.shape}")
print(f"\nNote: Scalers fitted on training data and applied to test data")

## Step 5: Build RNN Model

We'll build a Simple RNN model with:
- Input layer: sequences of features
- RNN layers: to capture temporal dependencies
- Dropout layers: for regularization
- Dense output layer: for regression
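
To show what a simple RNN layer computes, here is a minimal NumPy sketch of the recurrence `h_t = tanh(x_t W_x + h_{t-1} W_h + b)`, plus the resulting parameter count. It assumes 6 input features (the six load columns listed in the introduction), 24 time steps, and layer sizes of 64, 32, 16 and 1; the function names and random weights are hypothetical, and this is not how Keras implements the layer internally.

```python
import numpy as np

def simple_rnn_forward(x_seq, W_x, W_h, b):
    """Apply h_t = tanh(x_t @ W_x + h_{t-1} @ W_h + b) over one sequence.

    Returns the hidden state at every time step, shape (time_steps, n_units).
    """
    h = np.zeros(W_h.shape[0])  # initial hidden state
    states = []
    for x_t in x_seq:
        h = np.tanh(x_t @ W_x + h @ W_h + b)
        states.append(h)
    return np.array(states)

def simple_rnn_param_count(n_inputs, n_units):
    """Input weights + recurrent weights + biases of one simple RNN layer."""
    return n_units * (n_inputs + n_units + 1)

rng = np.random.default_rng(0)
time_steps, n_features, n_units = 24, 6, 64  # assumed sizes
x_seq = rng.normal(size=(time_steps, n_features))
W_x = rng.normal(scale=0.1, size=(n_features, n_units))
W_h = rng.normal(scale=0.1, size=(n_units, n_units))
b = np.zeros(n_units)

states = simple_rnn_forward(x_seq, W_x, W_h, b)
print(states.shape)      # (24, 64): the full sequence, passed on to a stacked RNN layer
print(states[-1].shape)  # (64,): the last state only

# Two RNN layers, then dense layers 32 -> 16 and 16 -> 1 (dropout has no weights)
total = (simple_rnn_param_count(6, 64) + simple_rnn_param_count(64, 32)
         + (32 * 16 + 16) + (16 * 1 + 1))
print(total)  # 8193
```

If the actual input has a different number of features, the first term of the parameter count changes accordingly.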

In [None]:
def build_rnn_model(input_shape, units=(64, 32)):
    """
    Build a Simple RNN model for time series prediction
    
    Parameters:
    -----------
    input_shape : tuple
        Shape of input (time_steps, n_features)
    units : sequence of int
        Number of units in each of the two RNN layers
    
    Returns:
    --------
    model : keras.Model
        Compiled RNN model
    """
    model = Sequential([
        # Explicit input layer: (time_steps, n_features)
        keras.Input(shape=input_shape),
        
        # First RNN layer - returns sequences for next layer
        SimpleRNN(units[0], return_sequences=True),
        Dropout(0.2),
        
        # Second RNN layer - returns only last output
        SimpleRNN(units[1], return_sequences=False),
        Dropout(0.2),
        
        # Dense layers for final prediction
        Dense(16, activation='relu'),
        Dropout(0.1),
        
        # Output layer (single value - oil temperature)
        Dense(1)
    ])
    
    # Compile model
    model.compile(
        optimizer='adam',
        loss='mse',
        metrics=['mae']
    )
    
    return model

# Build model
input_shape = (X_train.shape[1], X_train.shape[2])  # (seq_length, n_features)
model = build_rnn_model(input_shape, units=[64, 32])

print("Model Architecture:")
model.summary()

## Step 6: Train the Model

Train the RNN with callbacks for:
- Early stopping: prevent overfitting
- Learning rate reduction: improve convergence
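
The patience logic behind these two callbacks can be sketched in plain Python. This is a deliberately simplified model, not the Keras implementation: it ignores options such as `min_delta` and `cooldown`, uses one shared "best loss" for both rules, and assumes a starting learning rate of `1e-3`. The function name and the loss curve are hypothetical.

```python
def simulate_callbacks(val_losses, stop_patience=10, lr_patience=5,
                       factor=0.5, initial_lr=1e-3, min_lr=1e-7):
    """Walk through validation losses, applying simplified patience rules."""
    best_loss = float("inf")
    best_epoch = None
    stop_wait = 0   # epochs since the last improvement (early stopping)
    lr_wait = 0     # epochs since the last improvement or learning-rate cut
    lr = initial_lr
    stopped_epoch = None

    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Improvement: remember it and reset both counters
            best_loss, best_epoch = loss, epoch
            stop_wait = 0
            lr_wait = 0
            continue

        stop_wait += 1
        lr_wait += 1
        if lr_wait >= lr_patience:
            lr = max(lr * factor, min_lr)  # cut the learning rate, never below min_lr
            lr_wait = 0
        if stop_wait >= stop_patience:
            stopped_epoch = epoch          # training would halt here
            break

    return {"best_epoch": best_epoch, "stopped_epoch": stopped_epoch, "final_lr": lr}

# Hypothetical loss curve: improves for three epochs, then plateaus
losses = [1.0, 0.8, 0.7] + [0.75] * 12
result = simulate_callbacks(losses)
print(result["best_epoch"], result["stopped_epoch"])  # 2 12
```

In this toy run the learning rate is halved twice during the plateau before training stops; restoring the best weights would then return the model to its state at epoch 2.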

In [None]:
# Define callbacks
early_stopping = EarlyStopping(
    monitor='val_loss',
    patience=10,
    restore_best_weights=True,
    verbose=1
)

reduce_lr = ReduceLROnPlateau(
    monitor='val_loss',
    factor=0.5,
    patience=5,
    min_lr=1e-7,
    verbose=1
)

# Train model using separate validation set
print("Training the RNN model...\n")
history = model.fit(
    X_train, y_train,
    epochs=100,
    batch_size=32,
    validation_data=(X_val, y_val),  # Use separate validation set
    callbacks=[early_stopping, reduce_lr],
    verbose=1
)

print("\nTraining completed!")

### 6.1 Training History Visualization

Plot training and validation loss curves to check for overfitting

In [None]:
def plot_training_history(history):
    """
    Plot training and validation metrics
    """
    fig, axes = plt.subplots(1, 2, figsize=(15, 5))
    
    # Loss
    axes[0].plot(history.history['loss'], label='Training Loss')
    axes[0].plot(history.history['val_loss'], label='Validation Loss')
    axes[0].set_title('Model Loss During Training')
    axes[0].set_xlabel('Epoch')
    axes[0].set_ylabel('Loss (MSE)')
    axes[0].legend()
    axes[0].grid(True, alpha=0.3)
    
    # MAE
    axes[1].plot(history.history['mae'], label='Training MAE')
    axes[1].plot(history.history['val_mae'], label='Validation MAE')
    axes[1].set_title('Model MAE During Training')
    axes[1].set_xlabel('Epoch')
    axes[1].set_ylabel('MAE')
    axes[1].legend()
    axes[1].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()

plot_training_history(history)

## Step 7: Model Evaluation

### 7.1 Make Predictions

Generate predictions on the training and test sets, then inverse-transform them to the original scale (°C).

In [None]:
# Make predictions
y_train_pred = model.predict(X_train)
y_test_pred = model.predict(X_test)

# Inverse transform to original scale
y_train_actual = scaler_y.inverse_transform(y_train.reshape(-1, 1)).flatten()
y_train_pred_inv = scaler_y.inverse_transform(y_train_pred).flatten()

y_test_actual = scaler_y.inverse_transform(y_test.reshape(-1, 1)).flatten()
y_test_pred_inv = scaler_y.inverse_transform(y_test_pred).flatten()

print("Predictions generated successfully!")
print(f"\nSample predictions (first 5):")
print(f"Actual:    {y_test_actual[:5]}")
print(f"Predicted: {y_test_pred_inv[:5]}")

### 7.2 Calculate Evaluation Metrics

Compute standard regression metrics to assess model performance
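
For reference, the usual definitions of these metrics can be written out directly in NumPy. The sketch below is an illustrative, hypothetical helper on made-up numbers; it computes MAPE only over non-zero targets, which is an assumption added here because a percentage error is undefined when the true value is zero.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE, R² and MAPE from their definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred

    mse = np.mean(errors ** 2)
    ss_res = np.sum(errors ** 2)                     # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    nonzero = y_true != 0                            # assumption: skip zero targets
    mape = np.mean(np.abs(errors[nonzero] / y_true[nonzero])) * 100

    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(errors)),
        "R2": 1 - ss_res / ss_tot,
        "MAPE": mape,
    }

# Hypothetical values: errors of -2, +2 and 0 degrees
m = regression_metrics([10.0, 20.0, 30.0], [12.0, 18.0, 30.0])
print(round(m["MAE"], 4), round(m["R2"], 4), round(m["MAPE"], 4))
```

RMSE and MAE are in the same units as the target (°C after the inverse transform), while R² and MAPE are unitless. MAPE can still become very large when true values are close to zero, so it is best read alongside MAE.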

In [None]:
def calculate_metrics(y_true, y_pred, set_name='Test'):
    """
    Calculate and display regression metrics
    """
    mse = mean_squared_error(y_true, y_pred)
    rmse = np.sqrt(mse)
    mae = mean_absolute_error(y_true, y_pred)
    r2 = r2_score(y_true, y_pred)
    
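    # MAPE (Mean Absolute Percentage Error)
    # Skip zero targets to avoid division by zero; values near zero can still inflate MAPE
    nonzero = y_true != 0
    mape = np.mean(np.abs((y_true[nonzero] - y_pred[nonzero]) / y_true[nonzero])) * 100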
    
    print(f"\n{set_name} Set Performance Metrics:")
    print("=" * 50)
    print(f"MSE (Mean Squared Error):        {mse:.4f}")
    print(f"RMSE (Root Mean Squared Error):  {rmse:.4f}")
    print(f"MAE (Mean Absolute Error):       {mae:.4f}")
    print(f"R² Score:                        {r2:.4f}")
    print(f"MAPE (Mean Absolute % Error):    {mape:.2f}%")
    print("=" * 50)
    
    return {'MSE': mse, 'RMSE': rmse, 'MAE': mae, 'R2': r2, 'MAPE': mape}

# Calculate metrics for both sets
train_metrics = calculate_metrics(y_train_actual, y_train_pred_inv, 'Training')
test_metrics = calculate_metrics(y_test_actual, y_test_pred_inv, 'Test')

## Step 8: Results Visualization

### 8.1 Actual vs Predicted Values

In [None]:
# Plot predictions vs actual
fig, axes = plt.subplots(2, 1, figsize=(15, 10))

# Test set - full view
axes[0].plot(y_test_actual, label='Actual', alpha=0.7, linewidth=1.5)
axes[0].plot(y_test_pred_inv, label='Predicted', alpha=0.7, linewidth=1.5)
axes[0].set_title('Oil Temperature Prediction - Test Set (Full View)', fontsize=14)
axes[0].set_xlabel('Time Step')
axes[0].set_ylabel('Oil Temperature (°C)')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# Test set - zoomed view (first 500 points)
zoom_range = 500
axes[1].plot(y_test_actual[:zoom_range], label='Actual', alpha=0.7, linewidth=1.5)
axes[1].plot(y_test_pred_inv[:zoom_range], label='Predicted', alpha=0.7, linewidth=1.5)
axes[1].set_title(f'Oil Temperature Prediction - Test Set (First {zoom_range} Points)', fontsize=14)
axes[1].set_xlabel('Time Step')
axes[1].set_ylabel('Oil Temperature (°C)')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

### 8.2 Scatter Plot: Predicted vs Actual

In [None]:
# Scatter plot
plt.figure(figsize=(10, 8))
plt.scatter(y_test_actual, y_test_pred_inv, alpha=0.5, s=20)
plt.plot([y_test_actual.min(), y_test_actual.max()], 
         [y_test_actual.min(), y_test_actual.max()], 
         'r--', lw=2, label='Perfect Prediction')
plt.xlabel('Actual Oil Temperature (°C)', fontsize=12)
plt.ylabel('Predicted Oil Temperature (°C)', fontsize=12)
plt.title('Predicted vs Actual Oil Temperature (Test Set)', fontsize=14)
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

### 8.3 Residual Analysis

In [None]:
# Calculate residuals
residuals = y_test_actual - y_test_pred_inv

fig, axes = plt.subplots(1, 2, figsize=(15, 5))

# Residual plot
axes[0].scatter(y_test_pred_inv, residuals, alpha=0.5, s=20)
axes[0].axhline(y=0, color='r', linestyle='--', linewidth=2)
axes[0].set_xlabel('Predicted Oil Temperature (°C)')
axes[0].set_ylabel('Residuals (°C)')
axes[0].set_title('Residual Plot')
axes[0].grid(True, alpha=0.3)

# Residual distribution
axes[1].hist(residuals, bins=50, edgecolor='black', alpha=0.7)
axes[1].axvline(x=0, color='r', linestyle='--', linewidth=2)
axes[1].set_xlabel('Residuals (°C)')
axes[1].set_ylabel('Frequency')
axes[1].set_title('Distribution of Residuals')
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print(f"Residual Statistics:")
print(f"Mean: {residuals.mean():.4f}°C")
print(f"Std: {residuals.std():.4f}°C")
print(f"Min: {residuals.min():.4f}°C")
print(f"Max: {residuals.max():.4f}°C")

## Step 9: Conclusion and Summary

### Model Performance Summary

In [None]:
# Create summary comparison
summary_df = pd.DataFrame({
    'Metric': ['MSE', 'RMSE', 'MAE', 'R²', 'MAPE (%)'],
    'Training': [
        train_metrics['MSE'],
        train_metrics['RMSE'],
        train_metrics['MAE'],
        train_metrics['R2'],
        train_metrics['MAPE']
    ],
    'Test': [
        test_metrics['MSE'],
        test_metrics['RMSE'],
        test_metrics['MAE'],
        test_metrics['R2'],
        test_metrics['MAPE']
    ]
})

print("\n" + "=" * 60)
print("RNN MODEL PERFORMANCE SUMMARY")
print("=" * 60)
print(summary_df.to_string(index=False))
print("=" * 60)

print("\n### Key Observations:")
print(f"1. The model achieves R² score of {test_metrics['R2']:.4f} on test set")
print(f"2. Average prediction error (MAE): {test_metrics['MAE']:.4f}°C")
print(f"3. MAPE: {test_metrics['MAPE']:.2f}%")

# Overfitting shows up as training R² exceeding test R²; a higher test R² is not a concern
if train_metrics['R2'] - test_metrics['R2'] < 0.05:
    print("4. Model shows good generalization (minimal overfitting)")
else:
    print("4. Consider regularization or more data to reduce overfitting")

print("\n### Model Architecture:")
print(f"- Sequence Length: {seq_length} time steps (6 hours)")
print(f"- RNN Units: {[64, 32]}")
print(f"- Total Parameters: {model.count_params():,}")

print("\n### Next Steps:")
print("1. Compare with other models (Linear Regression, Random Forest, MLP)")
print("2. Try advanced architectures (LSTM, GRU)")
print("3. Experiment with different sequence lengths")
print("4. Feature engineering to improve performance")
print("5. Compare with SOTA model (Informer)")

### Optional: Save the Model

In [None]:
# Uncomment to save the model
# model.save('../models/rnn_model.h5')
# print("Model saved to ../models/rnn_model.h5")