# Introduction

This project focuses on predicting battery voltage using deep learning techniques, particularly **Long Short-Term Memory (LSTM)** networks. 

**Dataset used:** NASA Battery Dataset

**Aim:**
To build a robust predictive model that forecasts the measured battery voltage (`Voltage_measured`) from the signals logged in the dataset:
- Voltage_measured (the prediction target)
- Current_measured
- Temperature_measured
- Current_load
- Voltage_load
- Time

# Imports

In [None]:
import pandas as pd
import numpy as np
import glob
import os
import seaborn as sns
import matplotlib.pyplot as plt
import torch

# Dataset

In [None]:
# for dirname, _, filenames in os.walk('/kaggle/input'):
#     for filename in filenames:
#         print(os.path.join(dirname, filename))

# Read & Merge all valid CSVs with Battery_ID

- Reads and merges NASA Battery Dataset CSVs
- Filters valid files with key features
- Assigns unique Battery_ID
- Sorts data by time
- Combines into a single dataset
- Enables time-series analysis and ML (e.g., LSTMs) for battery health and voltage prediction

In [None]:
directory_path = '/kaggle/input/cleaned_dataset/data/'

required_features = ['Voltage_measured', 'Current_measured', 'Temperature_measured', 
                     'Current_load', 'Voltage_load', 'Time']

In [None]:
def merge_time_series_with_id(directory_path):
    # Sort the paths so that Battery_ID assignment is reproducible across runs
    all_files = sorted(glob.glob(os.path.join(directory_path, '*.csv')))
    print(f"Found {len(all_files)} CSV files in {directory_path}")

    if not all_files:
        raise ValueError(f"No CSV files found in {directory_path}")

    selected_files = []
    for file in all_files:
        try:
            df = pd.read_csv(file, nrows=1)
            if set(required_features).issubset(df.columns):
                selected_files.append(file)
            else:
                pass
                # print(f"File {file} skipped: Missing some required features {set(required_features) - set(df.columns)}")
        except Exception as e:
            print(f"Error reading {file}: {e}")

    if not selected_files:
        raise ValueError(f"No files contain all required features: {required_features}")

    data_list = []
    for i, file in enumerate(selected_files):
        try:
            df = pd.read_csv(file, usecols=required_features)
            df['Battery_ID'] = f'Battery_{i+1}'
            df = df.sort_values(by=["Battery_ID", "Time"]).reset_index(drop=True)
            data_list.append(df)
            # print(f"Processed {file} as Battery_{i+1}")
        except Exception as e:
            print(f"Error processing {file}: {e}")

    if not data_list:
        raise ValueError("No dataframes to concatenate after processing")

    merged_df = pd.concat(data_list, ignore_index=True)
    # Report the files that were actually merged (some may have failed while processing)
    print(f"\nSuccessfully merged {len(data_list)} files.")
    return merged_df

Example usage: Merge all valid files

In [None]:
final_df = merge_time_series_with_id(directory_path)

# Exploratory Data Analysis (EDA) and Preprocessing

- **Exploratory Data Analysis (EDA)**:
  - Understand the dataset by visualizing trends.
  - Identify patterns in battery behavior.
  - Detect anomalies in battery behavior.
  - Analyze key features (voltage, current, temperature).
  - Gain insights into feature distributions and correlations.

- **Preprocessing**:
  - Clean the data.
  - Handle missing values.
  - Normalize numerical features.
  - Structure data into a time-series format for LSTM models.
  - Ensure the dataset is well-prepared for training.
  - Improve model performance.
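
The cells below do not show an explicit missing-value step, so the following is only a sketch of what "handle missing values" could look like. The toy frame, its values, and the choice of per-battery linear interpolation are all assumptions made for illustration; another strategy (dropping rows, forward fill) may suit the real data better.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for one battery's log (values are made up)
toy = pd.DataFrame({
    "Time": [0.0, 1.0, 2.0, 3.0],
    "Voltage_measured": [4.20, np.nan, 4.10, 4.05],
    "Battery_ID": ["Battery_1"] * 4,
})

# Count missing values per column before deciding how to treat them
missing_per_column = toy.isna().sum()

# Assumption: fill gaps by linear interpolation within each battery,
# so values from one battery never leak into another
toy["Voltage_measured"] = (
    toy.groupby("Battery_ID")["Voltage_measured"]
       .transform(lambda s: s.interpolate(limit_direction="both"))
)

print(missing_per_column)
print(toy)
```

On the real data, `final_df.isna().sum()` would give the same per-column count.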

In [None]:
final_df.head()

In [None]:
final_df.info()

In [None]:
final_df.describe()

Create Charge_Discharge Feature (1 for Charging, 0 for Discharging)

In [None]:
# Vectorized comparison: 1 where the current is positive (charging), 0 otherwise
final_df['Charge_Discharge'] = (final_df['Current_measured'] > 0).astype(int)

Convert Current_measured to absolute value

In [None]:
final_df['Current_measured'] = final_df['Current_measured'].abs()

In [None]:
final_df.head()

## Plot

In [None]:
plt.figure(figsize=(12,6))
sns.boxplot(data=final_df[['Voltage_measured', 'Current_measured', 'Temperature_measured', 'Current_load', 'Voltage_load']])
plt.xticks(rotation=45)
plt.title("Boxplot for Outlier Detection")
plt.show()

In [None]:
for col in ['Voltage_measured', 'Current_measured', 'Current_load']:
    final_df[col] = final_df[col].clip(lower=final_df[col].quantile(0.05),
                                       upper=final_df[col].quantile(0.95))

In [None]:
q_low, q_high = final_df['Voltage_load'].quantile([0.001, 0.999])
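# .copy() makes the filtered frame independent, so later column assignments do not hit a view of the original
final_df = final_df[(final_df['Voltage_load'] >= q_low) & (final_df['Voltage_load'] <= q_high)].copy()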

In [None]:
final_df.describe()

In [None]:
plt.figure(figsize=(12,6))
sns.boxplot(data=final_df[['Voltage_measured', 'Current_measured', 'Temperature_measured', 'Current_load', 'Voltage_load']])
plt.xticks(rotation=45)
plt.title("Boxplot After Outlier Handling")
plt.show()

In [None]:
plt.figure(figsize=(10, 5))
sns.histplot(final_df["Voltage_measured"], kde=True, bins=50, color="blue", alpha=0.7)
plt.title("Voltage Distribution Across Batteries")
plt.xlabel("Voltage (V)")
plt.ylabel("Frequency")
plt.show()

In [None]:
plt.figure(figsize=(12, 6))
sns.scatterplot(data=final_df, x="Time", y="Temperature_measured", hue="Battery_ID", alpha=0.5, legend=False)
plt.title("Temperature vs. Time for Different Batteries")
plt.xlabel("Time (Seconds)")
plt.ylabel("Temperature (°C)")
plt.show()

## Explanation of the Feature Correlation Heatmap
The heatmap produced by the cells below shows the pairwise Pearson correlation between the numeric features; values range from -1 to 1.

### Stronger Correlations (Closer to 1 or -1)
1. Current_measured and Current_load have a very high positive correlation (0.97): the two features move together almost perfectly and carry largely redundant information.
2. Voltage_measured and Current_measured show a moderate negative correlation (-0.35), suggesting that higher current tends to coincide with lower voltage.
3. Time has a moderate negative correlation with Current_measured (-0.45) and Current_load (-0.44), indicating that these values tend to decrease as time progresses.

### Weaker Correlations (Closer to 0)
1. Charge_Discharge has very weak correlations with most variables, so the charge/discharge flag has little linear relationship with the other numeric features.
2. Temperature_measured shows mild correlations with Current_measured (0.31) and Current_load (0.29), indicating that temperature tends to rise slightly with current.

Overall, the heatmap shows which features are strongly related to each other, which guides feature selection and model optimization.

In [None]:
numeric_df = final_df.select_dtypes(include=['number'])

In [None]:
plt.figure(figsize=(10, 6))
sns.heatmap(numeric_df.corr(), annot=True, cmap="coolwarm", fmt=".2f")
plt.title("Feature Correlation Heatmap")
plt.show()

# Feature Importance with RandomForestRegressor

In [None]:
from sklearn.ensemble import RandomForestRegressor

X = final_df.drop(columns=['Voltage_measured', 'Time', 'Battery_ID'])  # Predicting Voltage, so drop it
y = final_df['Voltage_measured']

model = RandomForestRegressor()
model.fit(X, y)

# Plot feature importance
importances = model.feature_importances_
features = X.columns
sorted_idx = np.argsort(importances)

plt.figure(figsize=(8,5))
plt.barh(range(len(importances)), importances[sorted_idx], align='center')
plt.yticks(range(len(importances)), [features[i] for i in sorted_idx])
plt.xlabel("Feature Importance Score")
plt.title("Feature Importance (Random Forest)")
plt.show()

In [None]:
final_df = final_df.drop(columns=['Temperature_measured', 'Charge_Discharge'])

In [None]:
final_df

In [None]:
corr_matrix = final_df[['Current_measured', 'Current_load']].corr()
print(corr_matrix)

In [None]:
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', fmt=".2f")
plt.title("Correlation between Current_measured & Current_load")
plt.show()

In [None]:
final_df = final_df.drop(columns=['Current_load'])

In [None]:
final_df

In [None]:
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
final_df['Battery_ID'] = encoder.fit_transform(final_df['Battery_ID'])

In [None]:
final_df

In [None]:
from sklearn.preprocessing import MinMaxScaler

num_features = ['Voltage_measured', 'Current_measured', 'Voltage_load', 'Time','Battery_ID']

scaler = MinMaxScaler()
final_df[num_features] = scaler.fit_transform(final_df[num_features])

In [None]:
final_df

Define target variable (Voltage to predict)

In [None]:
target_col = 'Voltage_measured'

Define features (excluding target)

In [None]:
feature_cols = ['Current_measured', 'Voltage_load', 'Time', 'Battery_ID']

In [None]:
X = final_df[feature_cols].values
y = final_df[target_col].values

In [None]:
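# 80% train, 20% test, split by row position (no shuffling), so train_test_split is not needed.
# Rows are ordered battery by battery, so the test set consists of the last batteries in the merged frame.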
train_size = int(0.8 * len(final_df))

X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

print(f"Train Size: {len(X_train)}, Test Size: {len(X_test)}...")
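
The scaler earlier in the notebook is fit on the full dataset before this split, so the test rows contribute to the min/max statistics. One common alternative, sketched here on synthetic numbers, is to fit the scaler on the training portion only and reuse it for the test portion. Keeping a separate scaler for the target also makes it possible to map predictions back to volts. The name `target_scaler` and the synthetic voltage range are assumptions for illustration, not part of the original pipeline.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Synthetic "voltage" column in volts (made-up range, illustration only)
rng = np.random.default_rng(0)
voltage = rng.uniform(2.5, 4.2, size=(100, 1))

# Same positional 80/20 split as in the notebook
split = int(0.8 * len(voltage))
train, test = voltage[:split], voltage[split:]

# Fit on the training portion only, then reuse the fitted statistics for the test portion
target_scaler = MinMaxScaler()
train_scaled = target_scaler.fit_transform(train)
test_scaled = target_scaler.transform(test)

# Normalized values (for example model predictions) can be mapped back to volts
test_volts = target_scaler.inverse_transform(test_scaled)

print(train_scaled.min(), train_scaled.max())
print(np.allclose(test_volts, test))
```

With this arrangement, test values may fall slightly outside the [0, 1] range, which is expected when the test set contains values the training set never saw.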

In [None]:
sequence_length = 30  # Using 30 past time steps

import numpy as np

def create_sequences(X, y, seq_length):
    Xs, ys = [], []
    for i in range(len(X) - seq_length):
        Xs.append(X[i : i + seq_length])  # Select past `seq_length` values
        ys.append(y[i + seq_length])  # Predict the next value
    return np.array(Xs), np.array(ys)

# Convert train and test sets into sequences
X_train_seq, y_train_seq = create_sequences(X_train, y_train, sequence_length)
X_test_seq, y_test_seq = create_sequences(X_test, y_test, sequence_length)

print(f"X_train shape: {X_train_seq.shape}, y_train shape: {y_train_seq.shape}")
print(f"X_test shape: {X_test_seq.shape}, y_test shape: {y_test_seq.shape}")
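
`create_sequences` slides over the concatenated array, so some windows span the boundary between two batteries. A possible variant, sketched below, builds the windows separately for each battery. The function name `create_sequences_per_group`, the `groups` argument, and the toy arrays are hypothetical; the sketch assumes rows are already sorted by time within each group.

```python
import numpy as np

def create_sequences_per_group(X, y, groups, seq_length):
    """Build sliding windows per group so that no window mixes two batteries."""
    Xs, ys = [], []
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)   # row positions of this group, in original order
        Xg, yg = X[idx], y[idx]
        for i in range(len(Xg) - seq_length):
            Xs.append(Xg[i : i + seq_length])   # past `seq_length` rows of this group
            ys.append(yg[i + seq_length])       # the value right after the window
    return np.array(Xs), np.array(ys)

# Toy data: 10 rows, 2 features, two "batteries" of 5 rows each
X_toy = np.arange(20, dtype=float).reshape(10, 2)
y_toy = np.arange(10, dtype=float)
groups_toy = np.array([0] * 5 + [1] * 5)

X_seq_toy, y_seq_toy = create_sequences_per_group(X_toy, y_toy, groups_toy, seq_length=3)
print(X_seq_toy.shape, y_seq_toy)
```

Each toy group of 5 rows yields 2 windows of length 3, so none of the 4 resulting windows crosses from the first group into the second.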

# LSTM for Battery Voltage Prediction

- **Why LSTMs?**
  - Specialized RNNs for sequential dependencies in time-series data.
  - Overcome vanishing gradient problem.
  - Retain information over long time steps.
  - Ideal for battery voltage forecasting.

- **Model Architecture**
  - **LSTM Layers**:
    - 64 units → 32 units: Captures short- and long-term dependencies.
  - **Dropout**:
    - 0.2 & 0.4: Reduces overfitting.
  - **Dense Layers**:
    - 16 units → 1 unit: Learns patterns and outputs voltage prediction.
  - **Optimizer**:
    - Adam with adaptive learning rate: Ensures stable and efficient training.

- **Optimizations Used**
  - **Early Stopping**: Stops training when validation loss stops improving.
  - **ReduceLROnPlateau**: Lowers learning rate if training stagnates.
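
The architecture list above maps onto PyTorch roughly as follows. This is a sketch rather than the notebook's own class: the name `LastStepLSTM` and its constructor defaults are assumptions, and it keeps only the last time step of the second LSTM's output, which is one way to obtain a single voltage prediction per input window.

```python
import torch
import torch.nn as nn

class LastStepLSTM(nn.Module):
    """Hypothetical 64 -> 32 LSTM stack with a 16 -> 1 dense head."""

    def __init__(self, input_size, hidden1=64, hidden2=32, dense=16):
        super().__init__()
        self.lstm1 = nn.LSTM(input_size, hidden1, batch_first=True)
        self.dropout1 = nn.Dropout(0.2)
        self.lstm2 = nn.LSTM(hidden1, hidden2, batch_first=True)
        self.dropout2 = nn.Dropout(0.4)
        self.head = nn.Sequential(
            nn.Linear(hidden2, dense),
            nn.ReLU(),
            nn.Linear(dense, 1),
        )

    def forward(self, x):
        x, _ = self.lstm1(x)              # (batch, seq_len, hidden1)
        x = self.dropout1(x)
        x, _ = self.lstm2(x)              # (batch, seq_len, hidden2)
        x = self.dropout2(x[:, -1, :])    # keep the last time step: (batch, hidden2)
        return self.head(x).squeeze(-1)   # (batch,), same shape as a 1-D target

demo_model = LastStepLSTM(input_size=4)
demo_out = demo_model(torch.randn(8, 30, 4))   # 8 windows, 30 steps, 4 features
print(demo_out.shape)
```

Returning a `(batch,)` tensor keeps predictions and targets the same shape, so `nn.MSELoss` compares them element by element without broadcasting.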

## Define the LSTM model

In [None]:
import torch
import torch.nn as nn

class LSTMModel(nn.Module):
    def __init__(self, input_size, sequence_length):
        super(LSTMModel, self).__init__()
        # nn.LSTM has no Keras-style return_sequences flag: it always returns the
        # output for every time step, shape (batch, seq_len, hidden_size)
        self.lstm1 = nn.LSTM(input_size, 64, batch_first=True)
        self.dropout1 = nn.Dropout(0.2)
        self.lstm2 = nn.LSTM(64, 32, batch_first=True)
        self.dropout2 = nn.Dropout(0.4)
        self.dense1 = nn.Linear(32, 16)
        self.relu = nn.ReLU()
        self.dense2 = nn.Linear(16, 1)
        
    def forward(self, x):
        x, _ = self.lstm1(x)
        x = self.dropout1(x)
        x, _ = self.lstm2(x)
        x = self.dropout2(x)
        x = self.dense1(x)
        x = self.relu(x)
        x = self.dense2(x)
        return x  # shape (batch, seq_len, 1); the evaluation code keeps only the last time step

# Initialize model, loss, and optimizer
input_size = X_train_seq.shape[2]  # Assuming X_train_seq is defined
sequence_length = X_train_seq.shape[1]
model = LSTMModel(input_size, sequence_length)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Model summary
print(model)

In [None]:
import torch
import torch.nn as nn
from torch.optim.lr_scheduler import ReduceLROnPlateau
import numpy as np

# Assuming model, X_train_seq, y_train_seq, X_test_seq, y_test_seq are defined
# Convert data to PyTorch tensors
X_train_seq = torch.tensor(X_train_seq, dtype=torch.float32)
y_train_seq = torch.tensor(y_train_seq, dtype=torch.float32)
X_test_seq = torch.tensor(X_test_seq, dtype=torch.float32)
y_test_seq = torch.tensor(y_test_seq, dtype=torch.float32)

# Create DataLoader for batch processing
from torch.utils.data import TensorDataset, DataLoader
train_dataset = TensorDataset(X_train_seq, y_train_seq)
test_dataset = TensorDataset(X_test_seq, y_test_seq)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)

# Initialize optimizer
initial_learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=initial_learning_rate)

# Loss function
criterion = nn.MSELoss()

# Learning rate scheduler
# `verbose` is omitted because it is deprecated in recent PyTorch releases
lr_scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.5, patience=5, min_lr=1e-6)
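
To make the scheduler settings concrete, the following toy run feeds `ReduceLROnPlateau` a validation loss that never improves and records the learning rate after each call. The dummy parameter and the constant loss of 1.0 are assumptions for illustration only.

```python
import torch
from torch.optim.lr_scheduler import ReduceLROnPlateau

# A single dummy parameter is enough to construct an optimizer
dummy_param = torch.nn.Parameter(torch.zeros(1))
demo_optimizer = torch.optim.Adam([dummy_param], lr=0.001)
demo_scheduler = ReduceLROnPlateau(demo_optimizer, mode='min', factor=0.5,
                                   patience=5, min_lr=1e-6)

lrs = []
for epoch in range(8):
    demo_scheduler.step(1.0)   # pretend the validation loss is stuck at 1.0
    lrs.append(demo_optimizer.param_groups[0]['lr'])

print(lrs)
```

With `factor=0.5`, the learning rate is halved once the loss has stalled for more than `patience` epochs, and it is never lowered below `min_lr`.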

## Early stopping implementation

In [None]:
class EarlyStopping:
    def __init__(self, patience=10, min_delta=0, restore_best_weights=True):
        self.patience = patience
        self.min_delta = min_delta
        self.restore_best_weights = restore_best_weights
        self.best_loss = float('inf')
        self.best_model = None
        self.counter = 0
        
    def __call__(self, val_loss, model):
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.counter = 0
            if self.restore_best_weights:
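                # Clone the tensors: state_dict() returns references that keep changing as training continues
                self.best_model = {k: v.detach().clone() for k, v in model.state_dict().items()}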
        else:
            self.counter += 1
            if self.counter >= self.patience:
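                print(f'Early stopping triggered after {self.counter} epochs without improvement')
                # Only restore if a best state was actually recorded
                if self.restore_best_weights and self.best_model is not None:
                    model.load_state_dict(self.best_model)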
                return True
        return False

early_stopping = EarlyStopping(patience=10, restore_best_weights=True)
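
One detail worth checking in any early-stopping helper that restores weights: the tensors returned by `state_dict()` share memory with the live parameters, so a stored reference keeps changing as training continues. The small experiment below, on a throwaway `nn.Linear` layer (an assumption for illustration), contrasts a plain reference with a deep copy.

```python
import copy
import torch
import torch.nn as nn

layer = nn.Linear(2, 1)

live_ref = layer.state_dict()                  # tensors share storage with the parameters
snapshot = copy.deepcopy(layer.state_dict())   # independent copy of the current weights

# Simulate a further training step by changing the weights in place
with torch.no_grad():
    layer.weight.add_(1.0)

ref_tracks_model = torch.equal(live_ref["weight"], layer.weight)
snapshot_unchanged = not torch.equal(snapshot["weight"], layer.weight)

print(ref_tracks_model, snapshot_unchanged)
```

Only the copied snapshot still holds the weights from the moment it was taken, which is what restoring the "best" model relies on.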

## Checkpointing

In [None]:
import os

checkpoint_dir = "checkpoints"
os.makedirs(checkpoint_dir, exist_ok=True)

# Initialize best validation loss for checkpointing
best_val_loss = float('inf')
best_model_path = os.path.join(checkpoint_dir, 'best_model.pth')

# Check for existing checkpoint
start_epoch = 0
checkpoint_path = os.path.join(checkpoint_dir, 'latest_checkpoint.pth')
if os.path.exists(checkpoint_path):
    checkpoint = torch.load(checkpoint_path)
    model.load_state_dict(checkpoint['model_state_dict'])
    optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
    start_epoch = checkpoint['epoch'] + 1
    best_val_loss = checkpoint['best_val_loss']
    lr_scheduler.load_state_dict(checkpoint['scheduler_state_dict'])
    early_stopping.best_loss = checkpoint['early_stopping_best_loss']
    early_stopping.best_model = checkpoint['early_stopping_best_model']
    early_stopping.counter = checkpoint['early_stopping_counter']
    print(f"Resuming from epoch {start_epoch}")
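
The resume logic above relies on a checkpoint dictionary surviving a save/load round trip. A minimal sketch of that round trip, using a throwaway model and a temporary directory (both assumptions for illustration), looks like this:

```python
import os
import tempfile

import torch
import torch.nn as nn

demo_net = nn.Linear(3, 1)
demo_opt = torch.optim.Adam(demo_net.parameters(), lr=0.001)

# Same key names as the notebook's checkpoint, reduced to the essentials
demo_checkpoint = {
    'epoch': 4,
    'model_state_dict': demo_net.state_dict(),
    'optimizer_state_dict': demo_opt.state_dict(),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'latest_checkpoint.pth')
    torch.save(demo_checkpoint, demo_path)
    # map_location lets a checkpoint written on a GPU machine load on a CPU-only one
    restored = torch.load(demo_path, map_location='cpu')

demo_net.load_state_dict(restored['model_state_dict'])
demo_opt.load_state_dict(restored['optimizer_state_dict'])
resume_epoch = restored['epoch'] + 1   # continue with the epoch after the saved one

print(resume_epoch)
```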

## Train the LSTM model

In [None]:
# Training loop
num_epochs = 50
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)

# Per-epoch loss history, used later for the loss-curve plots
train_losses, val_losses = [], []

for epoch in range(start_epoch, num_epochs):
    model.train()
    train_loss = 0.0
    train_mae = 0.0
    
    for X_batch, y_batch in train_loader:
        X_batch, y_batch = X_batch.to(device), y_batch.to(device)
        
        optimizer.zero_grad()
        # The model returns (batch, seq_len, 1); keep only the last time step so the
        # prediction has the same shape as y_batch, (batch,), and MSELoss does not broadcast
        outputs = model(X_batch)[:, -1, 0]
        loss = criterion(outputs, y_batch)
        loss.backward()
        optimizer.step()
        
        train_loss += loss.item() * X_batch.size(0)
        train_mae += torch.mean(torch.abs(outputs.detach() - y_batch)).item() * X_batch.size(0)
    
    train_loss /= len(train_loader.dataset)
    train_mae /= len(train_loader.dataset)
    
    # Validation (the test split doubles as the validation set here)
    model.eval()
    val_loss = 0.0
    with torch.no_grad():
        for X_batch, y_batch in test_loader:
            X_batch, y_batch = X_batch.to(device), y_batch.to(device)
            outputs = model(X_batch)[:, -1, 0]
            val_loss += criterion(outputs, y_batch).item() * X_batch.size(0)
    
    val_loss /= len(test_loader.dataset)
    train_losses.append(train_loss)
    val_losses.append(val_loss)
    
    print(f'Epoch {epoch+1}/{num_epochs}, Train Loss: {train_loss:.6f}, Train MAE: {train_mae:.6f}, Val Loss: {val_loss:.6f}')
    
    # Save best model if validation loss improves
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        torch.save(model.state_dict(), best_model_path)
        print(f"New best model saved with val_loss: {best_val_loss:.4f}")
    
    # Update learning rate
    lr_scheduler.step(val_loss)
    
    # Update early-stopping state before checkpointing so the saved state is current
    stop_training = early_stopping(val_loss, model)
    
    # Build the checkpoint every epoch so the "latest" file is never stale or undefined
    checkpoint = {
        'epoch': epoch,
        'model_state_dict': model.state_dict(),
        'optimizer_state_dict': optimizer.state_dict(),
        'scheduler_state_dict': lr_scheduler.state_dict(),
        'best_val_loss': best_val_loss,
        'early_stopping_best_loss': early_stopping.best_loss,
        'early_stopping_best_model': early_stopping.best_model,
        'early_stopping_counter': early_stopping.counter
    }
    torch.save(checkpoint, checkpoint_path)
    
    # Keep an additional numbered checkpoint every 10 epochs
    if (epoch + 1) % 10 == 0:
        torch.save(checkpoint, os.path.join(checkpoint_dir, f'checkpoint_epoch_{epoch+1}.pth'))
        print(f"Saved checkpoint at epoch {epoch+1}")
    
    if stop_training:
        break

## Load the best model for evaluation

In [None]:
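# map_location keeps loading robust when the weights were saved on a different device
model.load_state_dict(torch.load(os.path.join(checkpoint_dir, 'best_model.pth'), map_location=device))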

## Evaluate the model

In [None]:
import numpy as np
import matplotlib.pyplot as plt

model.eval()
y_pred = []
with torch.no_grad():
    for X_batch, _ in test_loader:
        X_batch = X_batch.to(device)
        outputs = model(X_batch)  # Shape: (batch_size, sequence_length, 1)
        # Extract the prediction from the last timestep
        outputs = outputs[:, -1, :]  # Shape: (batch_size, 1)
        y_pred.append(outputs.cpu().numpy())

# Concatenate predictions and flatten to 1D
y_pred = np.concatenate(y_pred, axis=0).flatten()  # Shape: (n_samples,)

# Ensure y_test_seq is 1D
y_test_seq_np = y_test_seq.cpu().numpy().flatten()  # Shape: (n_samples,)

# Plot actual vs predicted
plt.figure(figsize=(10, 5))
plt.plot(y_test_seq_np[:100], label="Actual Voltage", marker="o")
plt.plot(y_pred[:100], label="Predicted Voltage", marker="x")
plt.legend()
plt.xlabel("Time Step")
plt.ylabel("Voltage")
plt.title("Actual vs Predicted Voltage")
plt.show()

In [None]:
# model.eval()
# test_loss = 0.0
# test_mae = 0.0

# with torch.no_grad():
#     for X_batch, y_batch in test_loader:
#         X_batch, y_batch = X_batch.to(device), y_batch.to(device)
#         outputs = model(X_batch)
#         test_loss += criterion(outputs, y_batch).item() * X_batch.size(0)
#         test_mae += torch.mean(torch.abs(outputs - y_batch)).item() * X_batch.size(0)

# test_loss /= len(test_loader.dataset)
# test_mae /= len(test_loader.dataset)

# print(f"Test Loss: {test_loss:.4f}, Test MAE: {test_mae:.4f}")

## Predict on test data

In [None]:
# model.eval()
# y_pred = []
# with torch.no_grad():
#     for X_batch, _ in test_loader:
#         X_batch = X_batch.to(device)
#         outputs = model(X_batch)
#         y_pred.append(outputs.cpu().numpy())
# y_pred = np.concatenate(y_pred, axis=0)

# # Plot actual vs predicted
# plt.figure(figsize=(10, 5))
# plt.plot(y_test_seq[:100], label="Actual Voltage", marker="o")
# plt.plot(y_pred[:100], label="Predicted Voltage", marker="x")
# plt.legend()
# plt.xlabel("Time Step")
# plt.ylabel("Voltage")
# plt.title("Actual vs Predicted Voltage")
# plt.show()

## Calculate prediction errors

In [None]:
errors = y_test_seq.cpu().numpy() - y_pred.flatten()

### Plot histogram of errors

In [None]:
plt.hist(errors, bins=50, alpha=0.7, color='blue')
plt.xlabel("Prediction Error")
plt.ylabel("Frequency")
plt.title("Distribution of Prediction Errors")
plt.show()

In [None]:
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Convert tensors to numpy arrays and flatten
actual = y_test_seq.cpu().numpy().flatten()
predicted = y_pred.flatten()

# Calculate metrics
mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
rmse = np.sqrt(mse)
r2 = r2_score(actual, predicted)

# Print results
print(f"MAE: {mae}")
print(f"MSE: {mse}")
print(f"RMSE: {rmse}")
print(f"R² Score: {r2}")
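
For reference, the four metrics printed above can be written out by hand. The sketch below uses made-up toy numbers (not results from this model) and computes each metric directly from the residuals, then compares it with the scikit-learn value.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Toy values for illustration only
actual_toy = np.array([3.0, 3.5, 4.0, 4.2])
predicted_toy = np.array([3.1, 3.4, 4.1, 4.0])

residuals = actual_toy - predicted_toy

mae_manual = np.mean(np.abs(residuals))    # mean absolute error
mse_manual = np.mean(residuals ** 2)       # mean squared error
rmse_manual = np.sqrt(mse_manual)          # same units as the target
# R^2: 1 minus residual sum of squares over total sum of squares
r2_manual = 1.0 - np.sum(residuals ** 2) / np.sum((actual_toy - actual_toy.mean()) ** 2)

print(np.isclose(mae_manual, mean_absolute_error(actual_toy, predicted_toy)))
print(np.isclose(mse_manual, mean_squared_error(actual_toy, predicted_toy)))
print(np.isclose(r2_manual, r2_score(actual_toy, predicted_toy)))
```

Because the notebook's target is MinMax-scaled, MAE and RMSE computed on it are in normalized units rather than volts; R² is unaffected by that scaling.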

In [None]:
# Select a random sample from the test set
sample_index = np.random.randint(0, X_test_seq.shape[0])
sample_input = X_test_seq[sample_index]

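# Add a batch dimension for the LSTM: (seq_len, features) -> (1, seq_len, features)
# X_test_seq is already a float32 tensor, so it does not need to be wrapped in torch.tensor again
sample_input = sample_input.unsqueeze(0).to(device)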

In [None]:
# Predict voltage for the sample
model.eval()
with torch.no_grad():
    # Model output is (1, seq_len, 1); take the last time step as the prediction
    predicted_voltage = model(sample_input)[0, -1, 0].item()

actual_voltage = y_test_seq[sample_index].item()

# Both values are MinMax-scaled, not volts
print(f"Predicted Voltage (normalized): {predicted_voltage:.4f}")
print(f"Actual Voltage (normalized): {actual_voltage:.4f}")

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(6, 4))
plt.bar(["Actual Voltage", "Predicted Voltage"], [actual_voltage, predicted_voltage], color=['blue', 'orange'])
plt.ylabel("Voltage (Normalized)")
plt.title("Actual vs. Predicted Voltage")
plt.show()

In [None]:
plt.figure(figsize=(8, 5))
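# PyTorch has no Keras-style History object; this assumes the training loop
# recorded per-epoch losses in the lists train_losses and val_losses
plt.plot(train_losses, label='Training Loss')
plt.plot(val_losses, label='Validation Loss')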
plt.xlabel("Epochs")
plt.ylabel("Loss (MSE)")
plt.title("Training vs Validation Loss Curve")
plt.legend()
plt.show()

In [None]:
# Reuse y_pred and y_test_seq_np from the evaluation cell above;
# a PyTorch nn.Module has no Keras-style .predict() method

# Scatter plot of actual vs predicted
plt.figure(figsize=(6, 6))
plt.scatter(y_test_seq_np, y_pred, alpha=0.5, color='blue')
plt.plot([y_test_seq_np.min(), y_test_seq_np.max()], [y_test_seq_np.min(), y_test_seq_np.max()], color='red', linestyle='--')
plt.xlabel("Actual Voltage")
plt.ylabel("Predicted Voltage")
plt.title("Actual vs. Predicted Voltage")
plt.show()

In [None]:
# --- 5. Evaluation ---
model.eval() # Ensure model is in evaluation mode (important if dropout/batchnorm are used)
test_loss = 0.0
test_mae = 0.0
all_preds = []
all_targets = []

with torch.no_grad():
    for X_batch, y_batch in test_loader:
        X_batch, y_batch = X_batch.to(device), y_batch.to(device)
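        # The model returns (batch, seq_len, 1); keep the last time step to match y_batch, shape (batch,)
        outputs = model(X_batch)[:, -1, 0]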

        # Accumulate loss and MAE
        test_loss += criterion(outputs, y_batch).item() * X_batch.size(0)
        test_mae += torch.mean(torch.abs(outputs - y_batch)).item() * X_batch.size(0)

        # Store predictions and targets for later analysis (e.g., R2 score)
        all_preds.append(outputs.cpu().numpy())
        all_targets.append(y_batch.cpu().numpy())

# Calculate final metrics
final_test_loss = test_loss / len(test_loader.dataset)
final_test_mae = test_mae / len(test_loader.dataset)

# Concatenate predictions and targets from all batches
y_pred_np = np.concatenate(all_preds, axis=0)
y_test_np = np.concatenate(all_targets, axis=0)

# Calculate other metrics using sklearn
final_test_rmse = np.sqrt(mean_squared_error(y_test_np, y_pred_np))
final_r2_score = r2_score(y_test_np, y_pred_np)

print(f"\n--- Evaluation Results ---")
print(f"Test Loss (MSE): {final_test_loss:.6f}")
print(f"Test MAE: {final_test_mae:.6f}")
print(f"Test RMSE: {final_test_rmse:.6f}")
print(f"Test R² Score: {final_r2_score:.6f}")


# --- 6. Plotting and Prediction Example ---

# Plot actual vs predicted for a subset of test data
plt.figure(figsize=(12, 6))
plt.plot(y_test_np[:200], label="Actual Voltage", marker="o", linestyle='-', markersize=4)
plt.plot(y_pred_np[:200], label="Predicted Voltage", marker="x", linestyle='--', markersize=4)
plt.legend()
plt.xlabel("Time Step (in test set sample)")
plt.ylabel("Normalized Voltage") # Assuming data was normalized
plt.title("Actual vs Predicted Voltage (First 200 Test Samples)")
plt.grid(True)
plt.show()

# Plot training & validation loss curves
plt.figure(figsize=(10, 5))
plt.plot(range(1, len(train_losses) + 1), train_losses, label='Training Loss (MSE)')
plt.plot(range(1, len(val_losses) + 1), val_losses, label='Validation Loss (MSE)')
plt.xlabel("Epochs")
plt.ylabel("Loss (MSE)")
plt.title("Training vs Validation Loss Curve")
plt.legend()
plt.grid(True)
plt.show()

# Plot histogram of prediction errors
errors = y_test_np - y_pred_np
plt.figure(figsize=(10, 5))
plt.hist(errors, bins=50, alpha=0.7, color='blue', edgecolor='black')
plt.xlabel("Prediction Error (Actual - Predicted)")
plt.ylabel("Frequency")
plt.title("Distribution of Prediction Errors")
plt.grid(axis='y', alpha=0.75)
plt.show()

# Scatter plot of actual vs predicted
plt.figure(figsize=(6, 6))
plt.scatter(y_test_np, y_pred_np, alpha=0.5, color='blue', label='Predictions')
# Add identity line (y=x)
min_val = min(y_test_np.min(), y_pred_np.min())
max_val = max(y_test_np.max(), y_pred_np.max())
plt.plot([min_val, max_val], [min_val, max_val], color='red', linestyle='--', label='Ideal Fit (y=x)')
plt.xlabel("Actual Voltage (Normalized)")
plt.ylabel("Predicted Voltage (Normalized)")
plt.title("Actual vs. Predicted Voltage Scatter Plot")
plt.legend()
plt.grid(True)
plt.axis('equal') # Ensure axes have the same scale for better interpretation
plt.show()


# --- Example: Predict on a single sample ---
# Select a random sample from the test set
# X_test_seq / y_test_seq are the test tensors created before training
sample_index = np.random.randint(0, X_test_seq.shape[0])
sample_input_tensor = X_test_seq[sample_index] # Shape: (sequence_length, num_features)
actual_voltage_sample = y_test_seq[sample_index].item() # Get scalar value

# Reshape for LSTM input (batch_size=1) -> (1, sequence_length, num_features)
sample_input_tensor = sample_input_tensor.unsqueeze(0).to(device)

# Predict voltage for the sample
model.eval()
with torch.no_grad():
    # Output is (1, seq_len, 1); take the last time step and convert it to a Python scalar
    predicted_voltage_sample = model(sample_input_tensor)[0, -1, 0].item()

print(f"\n--- Single Sample Prediction ---")
print(f"Sample Index: {sample_index}")
print(f"Predicted Voltage: {predicted_voltage_sample:.4f}")
print(f"Actual Voltage:    {actual_voltage_sample:.4f}")

# Optional: Bar plot for single prediction
plt.figure(figsize=(6, 4))
plt.bar(["Actual Voltage", "Predicted Voltage"], [actual_voltage_sample, predicted_voltage_sample], color=['blue', 'orange'])
plt.ylabel("Voltage (Normalized)")
plt.title(f"Actual vs. Predicted Voltage (Sample {sample_index})")
plt.ylim(min_val, max_val) # Use limits from scatter plot for consistency
plt.show()

In [None]:
# import torch
# import torch.nn as nn
# from torch.optim.lr_scheduler import ReduceLROnPlateau
# import numpy as np
# from torch.utils.data import TensorDataset, DataLoader
# from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
# import matplotlib.pyplot as plt

# # Assuming X_train_seq, y_train_seq, X_test_seq, y_test_seq are NumPy arrays
# # defined earlier in your script (e.g., after create_sequences function)
# # Example shapes (replace with your actual shapes):
# # X_train_seq.shape: (num_train_samples, sequence_length, num_features)
# # y_train_seq.shape: (num_train_samples,)
# # X_test_seq.shape: (num_test_samples, sequence_length, num_features)
# # y_test_seq.shape: (num_test_samples,)

# # --- 1. LSTM Model Definition ---
# class LSTMModel(nn.Module):
#     def __init__(self, input_size, hidden_size1=64, hidden_size2=32, dense_size=16, dropout1=0.2, dropout2=0.4):
#         super(LSTMModel, self).__init__()
#         # First LSTM layer
#         self.lstm1 = nn.LSTM(input_size, hidden_size1, batch_first=True) # Note: PyTorch LSTM doesn't have return_sequences, it's controlled by batch_first and output slicing
#         self.dropout1 = nn.Dropout(dropout1)
#         # Second LSTM layer
#         self.lstm2 = nn.LSTM(hidden_size1, hidden_size2, batch_first=True)
#         self.dropout2 = nn.Dropout(dropout2)
#         # Dense layers
#         self.dense1 = nn.Linear(hidden_size2, dense_size)
#         self.relu = nn.ReLU()
#         self.dense2 = nn.Linear(dense_size, 1) # Output layer for voltage prediction

#     def forward(self, x):
#         # LSTM layers
#         # lstm1 output shape: (batch, seq_len, hidden_size1)
#         # hn/cn shape: (num_layers * num_directions, batch, hidden_size)
#         lstm_out1, _ = self.lstm1(x)
#         # Take the output of the last time step from the first LSTM layer
#         # In PyTorch with batch_first=True, output is (batch, seq_len, feature), so we need the last sequence element
#         # However, the second LSTM layer expects the full sequence output from the first
#         x = self.dropout1(lstm_out1)

#         lstm_out2, (hn, cn) = self.lstm2(x)
#         # Use the hidden state of the last time step from the second LSTM layer
#         # hn shape is (num_layers*num_directions, batch, hidden_size2), we want the last layer's hidden state
#         # For a single-layer, non-bidirectional LSTM, hn[-1] gets the hidden state for the last time step implicitly because return_sequences=False equivalent behavior
#         x = self.dropout2(hn[-1]) # hn[-1] takes the hidden state of the last layer for all items in the batch

#         # Dense layers
#         x = self.dense1(x)
#         x = self.relu(x)
#         x = self.dense2(x)
#         # Ensure output is squeezed to match target shape (batch_size,) if y_batch is (batch_size,)
#         # If criterion expects (batch_size, 1), remove the .squeeze()
#         return x.squeeze(-1) # Squeeze the last dimension if target is 1D

# # --- 2. Data Preparation ---
# # Assuming sequence_length and feature dimensions are known
# # Replace dummy data with your actual X_train_seq, y_train_seq, etc.
# sequence_length = 30
# num_features = 4 # Example: ['Current_measured', 'Voltage_load', 'Time', 'Battery_ID']
# # Dummy Data (Replace with your actual loaded and preprocessed data)
# X_train_seq = np.random.rand(1000, sequence_length, num_features)
# y_train_seq = np.random.rand(1000)
# X_test_seq = np.random.rand(200, sequence_length, num_features)
# y_test_seq = np.random.rand(200)


# # Convert data to PyTorch tensors
# X_train_tensor = torch.tensor(X_train_seq, dtype=torch.float32)
# y_train_tensor = torch.tensor(y_train_seq, dtype=torch.float32)
# X_test_tensor = torch.tensor(X_test_seq, dtype=torch.float32)
# y_test_tensor = torch.tensor(y_test_seq, dtype=torch.float32)

# # Create DataLoader for batch processing
# batch_size = 64
# train_dataset = TensorDataset(X_train_tensor, y_train_tensor)
# test_dataset = TensorDataset(X_test_tensor, y_test_tensor)
# train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
# test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

# # --- 3. Model Initialization and Training Setup ---
# input_size = X_train_tensor.shape[2] # Number of features
# model = LSTMModel(input_size=input_size)

# # Loss function
# criterion = nn.MSELoss()

# # Optimizer
# initial_learning_rate = 0.001
# optimizer = torch.optim.Adam(model.parameters(), lr=initial_learning_rate)

# # Learning rate scheduler
# lr_scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.5, patience=5, min_lr=1e-6, verbose=True)

# # Early stopping implementation
# class EarlyStopping:
#     def __init__(self, patience=10, min_delta=0, restore_best_weights=True):
#         self.patience = patience
#         self.min_delta = min_delta
#         self.restore_best_weights = restore_best_weights
#         self.best_loss = float('inf')
#         self.best_model_state_dict = None # Store the state_dict
#         self.counter = 0

#     def __call__(self, val_loss, model):
#         if val_loss < self.best_loss - self.min_delta:
#             self.best_loss = val_loss
#             self.counter = 0
#             if self.restore_best_weights:
#                 # Save the model's state_dict instead of the model itself
#                 self.best_model_state_dict = model.state_dict()
#         else:
#             self.counter += 1
#             print(f"EarlyStopping counter: {self.counter} out of {self.patience}")
#             if self.counter >= self.patience:
#                 print(f'Early stopping triggered after {self.patience} epochs of no improvement.')
#                 if self.restore_best_weights and self.best_model_state_dict:
#                     print("Restoring best model weights.")
#                     model.load_state_dict(self.best_model_state_dict)
#                 return True
#         return False

# early_stopping = EarlyStopping(patience=10, restore_best_weights=True)

# # --- 4. Training Loop ---
# num_epochs = 50
# device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# print(f"Using device: {device}")
# model = model.to(device)

# train_losses = []
# val_losses = []
# train_maes_list = [] # To store MAE per epoch

# for epoch in range(num_epochs):
#     model.train() # Set model to training mode
#     epoch_train_loss = 0.0
#     epoch_train_mae = 0.0

#     for i, (X_batch, y_batch) in enumerate(train_loader):
#         X_batch, y_batch = X_batch.to(device), y_batch.to(device)

#         # Zero gradients
#         optimizer.zero_grad()

#         # Forward pass
#         outputs = model(X_batch)

#         # Calculate loss
#         loss = criterion(outputs, y_batch)

#         # Backward pass and optimization
#         loss.backward()
#         optimizer.step()

#         # Accumulate loss and MAE for the epoch
#         epoch_train_loss += loss.item() * X_batch.size(0)
#         # Calculate MAE for the batch
#         mae_batch = torch.mean(torch.abs(outputs.detach() - y_batch)).item() # Use detach() for metrics
#         epoch_train_mae += mae_batch * X_batch.size(0)


#     # Calculate average loss and MAE for the epoch
#     avg_train_loss = epoch_train_loss / len(train_loader.dataset)
#     avg_train_mae = epoch_train_mae / len(train_loader.dataset)
#     train_losses.append(avg_train_loss)
#     train_maes_list.append(avg_train_mae)

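#     # --- Validation ---
#     # Note: this reuses test_loader, so early stopping and LR scheduling are tuned on the
#     # test set; a separate validation split would keep the final test metrics unbiased.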
#     model.eval() # Set model to evaluation mode
#     epoch_val_loss = 0.0
#     with torch.no_grad(): # Disable gradient calculation for validation
#         for X_batch, y_batch in test_loader:
#             X_batch, y_batch = X_batch.to(device), y_batch.to(device)
#             outputs = model(X_batch)
#             loss = criterion(outputs, y_batch)
#             epoch_val_loss += loss.item() * X_batch.size(0)

#     avg_val_loss = epoch_val_loss / len(test_loader.dataset)
#     val_losses.append(avg_val_loss)

#     print(f'Epoch {epoch+1}/{num_epochs}, Train Loss: {avg_train_loss:.6f}, Train MAE: {avg_train_mae:.6f}, Val Loss: {avg_val_loss:.6f}')

#     # Update learning rate based on validation loss
#     lr_scheduler.step(avg_val_loss)

#     # Check for early stopping based on validation loss
#     if early_stopping(avg_val_loss, model):
#         break # Stop training

# print("Training finished.")

# # --- 5. Evaluation ---
# model.eval() # Ensure model is in evaluation mode (important if dropout/batchnorm are used)
# test_loss = 0.0
# test_mae = 0.0
# all_preds = []
# all_targets = []

# with torch.no_grad():
#     for X_batch, y_batch in test_loader:
#         X_batch, y_batch = X_batch.to(device), y_batch.to(device)
#         outputs = model(X_batch)

#         # Accumulate loss and MAE
#         test_loss += criterion(outputs, y_batch).item() * X_batch.size(0)
#         test_mae += torch.mean(torch.abs(outputs - y_batch)).item() * X_batch.size(0)

#         # Store predictions and targets for later analysis (e.g., R2 score)
#         all_preds.append(outputs.cpu().numpy())
#         all_targets.append(y_batch.cpu().numpy())

# # Calculate final metrics
# final_test_loss = test_loss / len(test_loader.dataset)
# final_test_mae = test_mae / len(test_loader.dataset)

# # Concatenate predictions and targets from all batches
# y_pred_np = np.concatenate(all_preds, axis=0)
# y_test_np = np.concatenate(all_targets, axis=0)

# # Calculate other metrics using sklearn
# final_test_rmse = np.sqrt(mean_squared_error(y_test_np, y_pred_np))
# final_r2_score = r2_score(y_test_np, y_pred_np)

# print("\n--- Evaluation Results ---")
# print(f"Test Loss (MSE): {final_test_loss:.6f}")
# print(f"Test MAE: {final_test_mae:.6f}")
# print(f"Test RMSE: {final_test_rmse:.6f}")
# print(f"Test R² Score: {final_r2_score:.6f}")


# # --- 6. Plotting and Prediction Example ---

# # Plot actual vs predicted for a subset of test data
# plt.figure(figsize=(12, 6))
# plt.plot(y_test_np[:200], label="Actual Voltage", marker="o", linestyle='-', markersize=4)
# plt.plot(y_pred_np[:200], label="Predicted Voltage", marker="x", linestyle='--', markersize=4)
# plt.legend()
# plt.xlabel("Time Step (in test set sample)")
# plt.ylabel("Normalized Voltage") # Assuming data was normalized
# plt.title("Actual vs Predicted Voltage (First 200 Test Samples)")
# plt.grid(True)
# plt.show()

# # Plot training & validation loss curves
# plt.figure(figsize=(10, 5))
# plt.plot(range(1, len(train_losses) + 1), train_losses, label='Training Loss (MSE)')
# plt.plot(range(1, len(val_losses) + 1), val_losses, label='Validation Loss (MSE)')
# plt.xlabel("Epochs")
# plt.ylabel("Loss (MSE)")
# plt.title("Training vs Validation Loss Curve")
# plt.legend()
# plt.grid(True)
# plt.show()

# # Plot histogram of prediction errors
# errors = y_test_np - y_pred_np
# plt.figure(figsize=(10, 5))
# plt.hist(errors, bins=50, alpha=0.7, color='blue', edgecolor='black')
# plt.xlabel("Prediction Error (Actual - Predicted)")
# plt.ylabel("Frequency")
# plt.title("Distribution of Prediction Errors")
# plt.grid(axis='y', alpha=0.75)
# plt.show()

# # Scatter plot of actual vs predicted
# plt.figure(figsize=(6, 6))
# plt.scatter(y_test_np, y_pred_np, alpha=0.5, color='blue', label='Predictions')
# # Add identity line (y=x)
# min_val = min(y_test_np.min(), y_pred_np.min())
# max_val = max(y_test_np.max(), y_pred_np.max())
# plt.plot([min_val, max_val], [min_val, max_val], color='red', linestyle='--', label='Ideal Fit (y=x)')
# plt.xlabel("Actual Voltage (Normalized)")
# plt.ylabel("Predicted Voltage (Normalized)")
# plt.title("Actual vs. Predicted Voltage Scatter Plot")
# plt.legend()
# plt.grid(True)
# plt.axis('equal') # Ensure axes have the same scale for better interpretation
# plt.show()


# # --- Example: Predict on a single sample ---
# # Select a random sample from the test set
# sample_index = np.random.randint(0, X_test_tensor.shape[0])
# sample_input_tensor = X_test_tensor[sample_index] # Shape: (sequence_length, num_features)
# actual_voltage_sample = y_test_tensor[sample_index].item() # Get scalar value

# # Reshape for LSTM input (batch_size=1) -> (1, sequence_length, num_features)
# sample_input_tensor = sample_input_tensor.unsqueeze(0).to(device)

# # Predict voltage for the sample
# model.eval()
# with torch.no_grad():
#     predicted_voltage_sample = model(sample_input_tensor).item() # Get scalar value

# print("\n--- Single Sample Prediction ---")
# print(f"Sample Index: {sample_index}")
# print(f"Predicted Voltage: {predicted_voltage_sample:.4f}")
# print(f"Actual Voltage:    {actual_voltage_sample:.4f}")

# # Optional: Bar plot for single prediction
# plt.figure(figsize=(6, 4))
# plt.bar(["Actual Voltage", "Predicted Voltage"], [actual_voltage_sample, predicted_voltage_sample], color=['blue', 'orange'])
# plt.ylabel("Voltage (Normalized)")
# plt.title(f"Actual vs. Predicted Voltage (Sample {sample_index})")
# plt.ylim(min_val, max_val) # Use limits from scatter plot for consistency
# plt.show()

# Conclusion & Scope for Improvement

- **Conclusion**:
  - Implemented an LSTM-based deep learning model for battery voltage prediction using the NASA Battery Dataset.
  - Preprocessed the data and carried out exploratory data analysis (EDA).
  - Trained the model with an adaptive learning rate, dropout layers for regularization, and early stopping to limit overfitting.
  - Evaluated the model with MAE, MSE, RMSE, and R² score; these metrics indicate that it captures the battery's voltage behavior.
  - The approach is promising for battery health monitoring and predictive maintenance.

- **Scope for Improvement**:
  - **Architecture and Hyperparameter Optimization**:
    - Experiment with bidirectional LSTMs or GRUs as alternative recurrent architectures.
    - Tune hyperparameters (batch size, dropout rate, sequence length) using Bayesian optimization or grid search.
  - **Feature Engineering**:
    - Introduce derived features (e.g., voltage change rate, cumulative charge/discharge cycles) to enhance predictions.
  - **Handling Imbalanced Data**:
    - Address datasets with uneven charge/discharge cycles using resampling techniques for better generalization.
  - **External Data Integration**:
    - Incorporate environmental factors (humidity, pressure) and real-world usage profiles for more robust predictions.
  - **Model Deployment & Real-Time Predictions**:
    - Serve the model behind a Flask or FastAPI endpoint for real-time monitoring.
    - Deploy on Kaggle, Hugging Face, or cloud platforms for live battery voltage inference.
  - **Outcome**:
    - These improvements could make the model more generalizable, efficient, and deployable in industrial battery monitoring systems.
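
The four evaluation metrics named in the conclusion (MAE, MSE, RMSE, R²) are closely related, and a small dependency-free sketch may make the relationships easier to see. The function name `regression_metrics` and the sample values below are illustrative assumptions, not taken from the notebook; in practice `sklearn.metrics` would normally be used instead.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute MAE, MSE, RMSE and R² for two equal-length sequences (illustrative helper)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # same units as the target

    # R² compares the residual sum of squares with the variance around the mean.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Made-up values for demonstration only (not results from the notebook).
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
print(metrics)
```

Because RMSE is the square root of MSE, it is in the same units as the target, whereas R² is unitless and measures error relative to a constant mean-value predictor.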
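
The derived features suggested under Feature Engineering could be sketched roughly as follows. This is a minimal illustration, assuming `Time` is in seconds and increases within each `Battery_ID`; the function name `add_derived_features` and the new column names `Voltage_change_rate` and `Cumulative_charge` are hypothetical, and the rectangle-rule integration of current is a deliberate simplification.

```python
import pandas as pd


def add_derived_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add per-battery voltage change rate and cumulative charge (illustrative sketch)."""
    out = df.sort_values(["Battery_ID", "Time"]).copy()
    grouped = out.groupby("Battery_ID")

    # Step sizes within each battery; the first row of every group is NaN.
    dt = grouped["Time"].diff()
    dv = grouped["Voltage_measured"].diff()

    # Voltage change rate (V/s). Non-positive time steps and first rows become 0.
    out["Voltage_change_rate"] = (dv / dt.where(dt > 0)).fillna(0.0)

    # Cumulative charge (ampere-seconds): running sum of current * time step.
    increment = (out["Current_measured"] * dt).fillna(0.0)
    out["Cumulative_charge"] = increment.groupby(out["Battery_ID"]).cumsum()
    return out


# Tiny made-up frame for demonstration only (not data from the NASA dataset).
demo = pd.DataFrame({
    "Battery_ID": [1, 1, 1, 2, 2],
    "Time": [0.0, 10.0, 20.0, 0.0, 5.0],
    "Voltage_measured": [4.0, 3.9, 3.7, 4.2, 4.1],
    "Current_measured": [-2.0, -2.0, -2.0, -1.0, -1.0],
})
features = add_derived_features(demo)
print(features[["Battery_ID", "Voltage_change_rate", "Cumulative_charge"]])
```

Grouping by `Battery_ID` before differencing keeps the first sample of one battery from being compared with the last sample of another.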