# LSTM Classification Model

This notebook loads semicolon-delimited CSV files from a directory, preprocesses them, defines an LSTM classifier, trains it, and evaluates it on a held-out test split.

## Setup

Run the following cell to install the required libraries.

In [37]:
%pip install torch torchvision torchaudio
%pip install pandas scikit-learn
%pip install wandb onnx -Uq



## Import Libraries and Set the Seed
Import the libraries needed for data processing, model building, training, and evaluation. Fixing the seed makes random number generation consistent across runs, which helps reproducibility.

In [38]:
import os
import random

import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from torch.utils.data import DataLoader, TensorDataset

import wandb

def set_seed(seed):
    np.random.seed(seed)
    torch.manual_seed(seed)
    random.seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
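
To illustrate what seeding buys, here is a minimal sketch using only Python's `random` module and NumPy. The helper `draw` is hypothetical and not part of the notebook; `torch.manual_seed` plays the same role for PyTorch's generator.

```python
import random

import numpy as np


def draw(seed):
    """Seed Python's and NumPy's generators, then draw one number from each."""
    random.seed(seed)
    np.random.seed(seed)
    return random.random(), float(np.random.rand())


first = draw(40)
second = draw(40)   # same seed: identical draws
other = draw(41)    # different seed: different draws

print(first == second)
print(first == other)
```

Re-seeding with the same value replays the same sequence, which is why `set_seed` is called once at the start of each run.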

In [39]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [40]:
wandb.login()

True

## Load Data from GitHub Repository


In [41]:
## Remove PIC-PAPER-01 folder:
!rm -rf PIC-PAPER-01

# # Download Github Repo (Private) https://stackoverflow.com/questions/74532852/clone-github-repo-with-fine-grained-token/78280453#78280453
# !git clone --no-checkout https://<GITHUB_TOKEN>@github.com/danimp94/PIC-PAPER-01.git

# # To clone data folder only:
# %cd PIC-PAPER-01 # Navigate to the repository directory
# !git sparse-checkout init --cone # Initialize sparse-checkout
# !git sparse-checkout set data # Set the sparse-checkout to include only the data/ folder
# !git checkout # Checkout the specified folder

In [42]:
input_path = '/content/drive/MyDrive/PhD/Colab Notebooks/'

data_frames = []
for file in os.listdir(input_path):
    if file.endswith('.csv'):
        df = pd.read_csv(os.path.join(input_path, file), delimiter=';', header=0)
        data_frames.append(df)
data = pd.concat(data_frames, ignore_index=True)

print(data)
print(data.shape)

        Sample  Frequency (GHz)     LG (mV)    HG (mV)  Thickness (mm)
0           A1            100.0   -7.080942  -0.854611             0.2
1           A1            100.0   67.024785   0.244141             0.2
2           A1            100.0  124.893178  -1.098776             0.2
3           A1            100.0   91.075571   0.000000             0.2
4           A1            100.0   48.956174   0.122094             0.2
...        ...              ...         ...        ...             ...
2737958    REF            600.0    0.366256  16.237333             0.0
2737959    REF            600.0    0.000000  -7.080942             0.0
2737960    REF            600.0   -0.244170  15.260652             0.0
2737961    REF            600.0    0.366256  20.021975             0.0
2737962    REF            600.0    0.122085  13.185203             0.0

[2737963 rows x 5 columns]
(2737963, 5)
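
`os.listdir` returns entries in arbitrary order, so the row order of the concatenated frame can differ between machines. The following is a minimal sketch of one way to make the order deterministic by sorting the file names. The helper `load_csv_folder` and the two toy files are assumptions for illustration, not part of the notebook.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_csv_folder(folder, delimiter=";"):
    """Concatenate every .csv file in `folder`, visiting files in sorted name order."""
    paths = sorted(Path(folder).glob("*.csv"))
    if not paths:
        raise FileNotFoundError(f"no .csv files found in {folder}")
    frames = [pd.read_csv(p, delimiter=delimiter, header=0) for p in paths]
    return pd.concat(frames, ignore_index=True)


# Toy demonstration with two made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "b.csv").write_text("Sample;LG (mV)\nB1;2.0\n")
    Path(tmp, "a.csv").write_text("Sample;LG (mV)\nA1;1.0\n")
    demo = load_csv_folder(tmp)

print(demo)
```

The explicit check for an empty folder also gives a clearer error than the one `pd.concat` raises on an empty list.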


## Preprocessing Data
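Define the preprocessing functions. The first averages the raw readings over fixed-size windows within each (sample, frequency) group and records each window's mean and standard deviation. The second encodes the categorical labels, standardizes the features, and converts both to PyTorch tensors.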

In [43]:
def calculate_averages_and_dispersion(data, data_percentage=3.7):
    df = data
    results = []
    for (sample, freq), group in df.groupby(['Sample', 'Frequency (GHz)']):
        window_size = max(1, int(len(group) * data_percentage / 100))
        # print(f"Processing sample: {sample}, frequency: {freq} with window size: {window_size}")
        for start in range(0, len(group), window_size):
            window_data = group.iloc[start:start + window_size]
            mean_values = window_data[['LG (mV)', 'HG (mV)']].mean()
            std_deviation_values = window_data[['LG (mV)', 'HG (mV)']].std()
            results.append({
                'Frequency (GHz)': freq,
                'LG (mV) mean': mean_values['LG (mV)'],
                'HG (mV) mean': mean_values['HG (mV)'],
                'LG (mV) std deviation': std_deviation_values['LG (mV)'],
                'HG (mV) std deviation': std_deviation_values['HG (mV)'],
                'Thickness (mm)': window_data['Thickness (mm)'].iloc[0],
                'Sample': sample,
            })
    results_df = pd.DataFrame(results)
    # results_df.to_csv(output_file, sep=';', index=False)
    # print(f"Processed {input_file} and saved to {output_file}")
    print(results_df)
    return results_df
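
The window arithmetic can be checked in isolation. The sketch below uses a hypothetical helper, `window_layout`, that mirrors the `window_size` expression above and reports how many windows a group of a given length yields and how long the final window is.

```python
import math


def window_layout(group_len, data_percentage=3.7):
    """Return (window_size, n_windows, last_window_len) for one (Sample, Frequency) group."""
    window_size = max(1, int(group_len * data_percentage / 100))
    n_windows = math.ceil(group_len / window_size)
    last_window_len = group_len - (n_windows - 1) * window_size
    return window_size, n_windows, last_window_len


for n in (1000, 2000, 27):
    print(n, window_layout(n))
```

With the default 3.7%, a group of 1000 rows gives 28 windows, and the last one holds a single row. pandas returns NaN for the sample standard deviation of a single row, so it may be worth checking the windowed frame for NaN values before scaling.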

In [44]:
def preprocess_data(data):
    # Windowing the data
    data = calculate_averages_and_dispersion(data)
    print(data.shape)

    # 'Sample' is the last column of the windowed frame and serves as the target;
    # the six columns before it are the features
    X = data.iloc[:, :-1].values
    y = data.iloc[:, -1].values

    # Encode the target variable if it's categorical
    if y.dtype == 'object':
        le = LabelEncoder()
        y = le.fit_transform(y)

    # Standardize the features
    scaler = StandardScaler()
    X = scaler.fit_transform(X)

    # Convert to PyTorch tensors
    X = torch.tensor(X, dtype=torch.float32)
    y = torch.tensor(y, dtype=torch.long)

    return X, y
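
Note that `preprocess_data` fits the `StandardScaler` on every row, while the train/test split happens later in `make()`, so test-set statistics influence the scaling. The sketch below shows one common alternative: split first, then fit the scaler on the training rows only. The arrays are made-up stand-ins, not the notebook's data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(40)
X_raw = rng.normal(loc=50.0, scale=10.0, size=(200, 6))  # made-up features
y_raw = rng.integers(0, 17, size=200)                    # made-up class ids

X_train, X_test, y_train, y_test = train_test_split(
    X_raw, y_raw, test_size=0.2, random_state=40
)

scaler = StandardScaler().fit(X_train)     # statistics come from the training rows only
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)   # reuse the training mean and std

print(X_train_scaled.mean(axis=0).round(6))
print(X_test_scaled.mean(axis=0).round(3))
```

The training columns end up with zero mean and unit variance, while the test columns are only approximately standardized, which is the expected behavior when the test set is treated as unseen data.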

In [45]:
# Load and preprocess data
X, y = preprocess_data(data)

       Frequency (GHz)  LG (mV) mean  HG (mV) mean  LG (mV) std deviation  \
0                100.0     54.879155     -0.022198              29.958659   
1                100.0     54.511665      0.048093              28.096155   
2                100.0     55.099894     -0.118380              26.833871   
3                100.0     48.387674      0.103588              28.843498   
4                100.0     49.932853     -0.071525              23.093397   
...                ...           ...           ...                    ...   
24271            600.0     -0.006143     10.237498               0.890999   
24272            600.0     -0.006910     10.949278               0.858148   
24273            600.0      0.029178     10.547702               0.842082   
24274            600.0      0.065266     10.051683               0.891632   
24275            600.0     -0.228910      9.766817               0.856929   

       HG (mV) std deviation  Thickness (mm) Sample  
0                   0

## Config

In [46]:
config = dict(
    epochs=2000,
    seed = 40,
    classes = data['Sample'].nunique(), # Each different sample is a different class
    batch_size=128,
    learning_rate=0.001,
    dataset="experiment_1",
    architecture="LSTM")

print(config)

{'epochs': 2000, 'seed': 40, 'classes': 17, 'batch_size': 128, 'learning_rate': 0.001, 'dataset': 'experiment_1', 'architecture': 'LSTM'}


## Define Model
Define the LSTM model architecture: a single LSTM layer, dropout, and a linear classification head.

In [47]:
class LSTMModel(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        self.hidden_dim = hidden_dim
        # batch_first defaults to False, so the LSTM expects (seq_len, batch, input_size)
        self.lstm = nn.LSTM(input_dim, hidden_dim)
        self.dropout = nn.Dropout(0.2)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # (n_rows, input_size) -> (n_rows, 1, input_size). The LSTM reads this as
        # (seq_len=n_rows, batch=1, input_size): the rows of a mini-batch become
        # the time steps of a single sequence.
        x = x.unsqueeze(1)

        # Zero initial hidden and cell states: (num_layers=1, batch=1, hidden_dim)
        h_0 = torch.zeros(1, x.size(1), self.hidden_dim).to(x.device)
        c_0 = torch.zeros(1, x.size(1), self.hidden_dim).to(x.device)

        # out: (seq_len=n_rows, batch=1, hidden_dim)
        out, _ = self.lstm(x, (h_0, c_0))

        # Select the single batch element, giving one hidden vector per row
        out = self.dropout(out[:, -1, :])
        out = self.fc(out)
        return out
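
Because `nn.LSTM` defaults to `batch_first=False`, the `LSTMModel` above reads the `(n_rows, 1, features)` tensor as one sequence whose time steps are the rows of the mini-batch, so the hidden state carries over from one row to the next. If the rows are meant to be classified independently, one possible variant is sketched below. `PerSampleLSTM` is a hypothetical name, and this is an alternative under that assumption, not the model that produced the results in this notebook.

```python
import torch
import torch.nn as nn


class PerSampleLSTM(nn.Module):
    """Hypothetical variant: every row is its own sequence of length 1."""

    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.dropout = nn.Dropout(0.2)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        x = x.unsqueeze(1)                 # (batch, features) -> (batch, seq_len=1, features)
        out, _ = self.lstm(x)              # initial states default to zeros
        out = self.dropout(out[:, -1, :])  # last (only) time step: (batch, hidden)
        return self.fc(out)


model = PerSampleLSTM(input_dim=6, hidden_dim=64, output_dim=17).eval()
batch = torch.randn(128, 6)
with torch.no_grad():
    logits = model(batch)
    alone = model(batch[:1])  # score the first row on its own

print(logits.shape)
print(torch.allclose(logits[:1], alone, atol=1e-5))
```

In this variant a row's logits do not depend on which other rows share its batch, so shuffling the training loader does not change what the model sees for any individual row.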

## Train Model
Define a function that trains the model and logs the mean training loss of each epoch to W&B.

In [48]:
def train_model(model, train_loader, criterion, optimizer, device, num_epochs=10):
    for epoch in range(num_epochs):
        model.train()
        running_loss = 0.0
        for X_batch, y_batch in train_loader:
            X_batch, y_batch = X_batch.to(device), y_batch.to(device)

            outputs = model(X_batch)
            loss = criterion(outputs, y_batch)

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            running_loss += loss.item()

        # Log the mean per-batch training loss for this epoch to W&B
        epoch_loss = running_loss / len(train_loader)
        wandb.log({"epoch": epoch, "train_loss": epoch_loss})
        print(f"Epoch [{epoch+1}/{num_epochs}], Train Loss: {epoch_loss:.4f}")




## Evaluate Model


In [49]:
def evaluate_model(model, test_loader, device):
    model.eval()
    with torch.no_grad():
        correct = 0
        total = 0
        for X_batch, y_batch in test_loader:
            X_batch, y_batch = X_batch.to(device), y_batch.to(device)
            outputs = model(X_batch)
            _, predicted = torch.max(outputs.data, 1)
            total += y_batch.size(0)
            correct += (predicted == y_batch).sum().item()

        accuracy = correct / total
        print(f'Test Accuracy: {accuracy:.4f}')
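
As a small worked example of the accuracy computation, the sketch below applies the same `torch.max` call to four hand-written logit rows. When given a dimension, `torch.max` returns both the maximum values and their indices; only the indices are used as predicted classes. The logits and labels are made up for illustration.

```python
import torch

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0],
                       [1.5, 1.6, 0.0],
                       [0.0, 4.0, 1.0]])
labels = torch.tensor([0, 2, 0, 1])

_, predicted = torch.max(logits, 1)  # index of the largest logit in each row
correct = (predicted == labels).sum().item()
accuracy = correct / labels.size(0)

print(predicted.tolist())  # [0, 2, 1, 1]
print(accuracy)            # 0.75
```

Three of the four rows are classified correctly; the third row is assigned class 1 because 1.6 exceeds 1.5.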

In [50]:
def make(config):
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=config.seed)
    # X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=config.seed)

    train_dataset = TensorDataset(X_train, y_train)
    # val_dataset = TensorDataset(X_val, y_val)
    test_dataset = TensorDataset(X_test, y_test)

    train_loader = DataLoader(train_dataset, batch_size=config.batch_size, shuffle=True)
    # val_loader = DataLoader(val_dataset, batch_size=config.batch_size, shuffle=False)
    test_loader = DataLoader(test_dataset, batch_size=config.batch_size, shuffle=False)

    # Define the model
    model = LSTMModel(X_train.shape[1], 64, config['classes']).to(device)

    # Loss and optimizer
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=config.learning_rate)

    return model, train_loader, test_loader, criterion, optimizer
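
The commented-out lines in `make()` hint at a validation split. As a sketch of the resulting proportions, two successive 20% splits on toy arrays give roughly 64% training, 16% validation, and 20% test rows. The arrays below are made-up stand-ins for `X` and `y`.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X_demo = np.arange(1000 * 6, dtype=np.float32).reshape(1000, 6)  # made-up features
y_demo = np.arange(1000) % 17                                     # made-up class ids

# First split: hold out 20% of all rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=40
)
# Second split: hold out 20% of the remaining rows for validation.
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=40
)

print(len(X_train), len(X_val), len(X_test))
```

For 1000 rows this yields 640 training, 160 validation, and 200 test rows.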

In [51]:
def model_pipeline(hyperparameters):

    with wandb.init(project="PIC-PAPER-01-exp-1", config=hyperparameters):
        config = wandb.config
        set_seed(config.seed)
        # print(config['seed'])

        # Create data loaders and model
        model, train_loader, test_loader, criterion, optimizer = make(config)
        print(model)

        # Train the model
        train_model(model, train_loader, criterion, optimizer, device, config.epochs)

        # Evaluate the model
        evaluate_model(model, test_loader, device)

    return model


## Run Training

In [52]:
model = model_pipeline(config)

LSTMModel(
  (lstm): LSTM(6, 64)
  (dropout): Dropout(p=0.2, inplace=False)
  (fc): Linear(in_features=64, out_features=17, bias=True)
)
Epoch [1/2000], Train Loss: 2.6490
Epoch [2/2000], Train Loss: 2.2570
Epoch [3/2000], Train Loss: 2.1174
Epoch [4/2000], Train Loss: 1.9824
Epoch [5/2000], Train Loss: 1.8653
Epoch [6/2000], Train Loss: 1.7684
Epoch [7/2000], Train Loss: 1.6865
Epoch [8/2000], Train Loss: 1.6141
Epoch [9/2000], Train Loss: 1.5529
Epoch [10/2000], Train Loss: 1.4991
Epoch [11/2000], Train Loss: 1.4469
Epoch [12/2000], Train Loss: 1.4038
Epoch [13/2000], Train Loss: 1.3605
Epoch [14/2000], Train Loss: 1.3263
Epoch [15/2000], Train Loss: 1.2869
Epoch [16/2000], Train Loss: 1.2543
Epoch [17/2000], Train Loss: 1.2235
Epoch [18/2000], Train Loss: 1.1933
Epoch [19/2000], Train Loss: 1.1670
Epoch [20/2000], Train Loss: 1.1369
Epoch [21/2000], Train Loss: 1.1118
Epoch [22/2000], Train Loss: 1.0915
Epoch [23/2000], Train Loss: 1.0689
Epoch [24/2000], Train Loss: 1.0490
Epoch [2

W&B run history (sparklines not reproduced): epoch, train_loss

W&B run summary:
epoch: 1999.0
train_loss: 0.34643
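
One way to put the logged loss values in context is to compare them with the cross-entropy of a classifier that assigns equal probability to all 17 classes, which is ln(17). The sketch below does that arithmetic, using only the class count from the printed config and the final `train_loss` from the run summary.

```python
import math

n_classes = 17         # from the printed config
final_loss = 0.34643   # train_loss in the run summary

# Cross-entropy of a model that assigns probability 1/17 to every class.
chance_loss = math.log(n_classes)

# exp(-loss) is the geometric-mean probability given to the correct class.
final_confidence = math.exp(-final_loss)

print(f"chance-level loss: {chance_loss:.4f}")
print(f"geometric-mean p(correct) at the end of training: {final_confidence:.3f}")
```

The chance-level loss is about 2.83, so the first-epoch value of 2.6490 is only slightly better than uniform guessing, while the final value corresponds to a geometric-mean probability of about 0.71 on the correct class. These are training-set figures computed with dropout active, so they say nothing yet about test performance.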


## Save the model

In [53]:
# Save the model
torch.save(model.state_dict(), 'lstm_model.pth')

# # Save the model as onnx
# torch.onnx.export(model, X_train, 'lstm_model.onnx')
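
A `state_dict` holds only parameter tensors, so the architecture has to be rebuilt before the weights can be restored. The sketch below shows the round trip with a small `nn.Linear` standing in for the trained `LSTMModel`. The file name and temporary directory are assumptions for illustration, and `weights_only=True` assumes a PyTorch version that supports that argument.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(6, 17)  # stand-in for the trained LSTMModel

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_model.pth")
    torch.save(model.state_dict(), path)

    restored = nn.Linear(6, 17)  # must be built with the same architecture first
    state = torch.load(path, map_location="cpu", weights_only=True)
    restored.load_state_dict(state)

# Every restored tensor should match the original exactly.
same = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
print(same)
```

`map_location` lets a checkpoint saved on a GPU be loaded on a CPU-only machine.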

## Run inference

In [54]:
# Run model Inference

# Load test data



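# Load pretrained model
input_path = '/content/lstm_model.pth'
# map_location keeps loading device-agnostic; weights_only=True restricts
# unpickling to tensors and plain containers
model.load_state_dict(torch.load(input_path, map_location=device, weights_only=True))
model.eval()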

with torch.no_grad():
    X = X.to(device)
    outputs = model(X)
    _, predicted = torch.max(outputs.data, 1)
print(predicted)

tensor([ 5,  5,  0,  ..., 16, 16, 16], device='cuda:0')


