<a href="https://colab.research.google.com/github/yeoanni/Stats-507-final-project/blob/main/project_notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Welcome to this notebook, where we’ll walk through how to use a Transformer-based model to detect anomalies in multivariate time-series data. The notebook builds on the codebase available on [GitHub](https://github.com/yeoanni/Stats-507-final-project). Let’s get started!


# Setup

First, we need to set up our environment. This includes installing necessary libraries and cloning the repository.


In [2]:
!pip install torch numpy matplotlib scikit-learn
!git clone https://github.com/yeoanni/Stats-507-final-project.git
%cd Stats-507-final-project

Cloning into 'Stats-507-final-project'...
remote: Enumerating objects: 160, done.
remote: Counting objects: 100% (160/160), done.
remote: Compressing objects: 100% (150/150), done.
remote: Total 160 (delta 64), reused 0 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (160/160

Now, let’s reinstall PyTorch from the CUDA 11.8 wheel index and import the libraries and modules that we’ll use throughout this notebook.

In [None]:
!pip uninstall -y torch
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
import torch  # needed below for torch.nn, torch.optim and torch.no_grad
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve, roc_auc_score
from data_loader import create_dataloaders
from model import TransformerAnomaly

Found existing installation: torch 2.5.1+cu121
Uninstalling torch-2.5.1+cu121:
  Successfully uninstalled torch-2.5.1+cu121
Looking in indexes: https://download.pytorch.org/whl/cu118
Collecting torch
  Downloading https://download.pytorch.org/whl/cu118/torch-2.5.1%2Bcu118-cp310-cp310-linux_x86_64.whl (838.3 MB)
Collecting nvidia-cuda-nvrtc-cu11==11.8.89 (from torch)
  Downloading https://download.pytorch.org/whl/cu118/nvidia_cuda_nvrtc_cu11-11.8.89-py3-none-manylinux1_x86_64.whl (23.2 MB)
Collecting nvidia-cuda-runtime-cu11==11.8.89 (from torch)
  Downloading https://download.pytorch.org/whl/cu118/nvidia_cuda_runtime_cu11-11.8.89-py3-none-manylinux1_x86_64.whl (875 kB)

# Dataset Exploration

Before we dive into modeling, let’s explore the dataset. This helps us understand the structure and characteristics of the data.


In [1]:
train_loader, val_loader, test_loader = create_dataloaders(
    dataset_name="realAWSCloudwatch",
    batch_size=16
)


### Display summary statistics

In [None]:
for data, _ in train_loader:
    print("Shape of training data:", data.shape)
    break
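
The cell above only prints the batch shape. As a minimal sketch of what per-feature summary statistics could look like, the helper below works on a NumPy array and assumes batches are shaped `(batch, seq_len, n_features)`. Both `summarize_batch` and the synthetic `demo_batch` are hypothetical and not part of the repository.

```python
import numpy as np

def summarize_batch(batch):
    """Per-feature statistics for an array shaped (batch, seq_len, n_features)."""
    # Merge the batch and time axes so each column is one feature
    flat = batch.reshape(-1, batch.shape[-1])
    return {
        "mean": flat.mean(axis=0),
        "std": flat.std(axis=0),
        "min": flat.min(axis=0),
        "max": flat.max(axis=0),
    }

# Synthetic stand-in for one batch: 16 windows, 50 time steps, 1 feature
rng = np.random.default_rng(0)
demo_batch = rng.normal(loc=5.0, scale=2.0, size=(16, 50, 1))
stats = summarize_batch(demo_batch)
for name, value in stats.items():
    print(f"{name}: {value}")
```

With a real batch from the loader, the call would be `summarize_batch(data.numpy())`, provided the tensor has the assumed layout.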

### Visualize a sample time-series

In [None]:
sample_data, _ = next(iter(train_loader))
plt.plot(sample_data[0].numpy())
plt.title("Sample Time-Series")
plt.xlabel("Time Steps")
plt.ylabel("Values")
plt.show()

The dataset includes multivariate time-series data. Above, we visualized one example to get a sense of what it looks like.


# Model Architecture

Our model is based on a Transformer architecture, which excels at capturing long-term dependencies in sequential data. The class itself lives in `model.py`; here’s how we instantiate it:


In [None]:
model = TransformerAnomaly(input_dim=1, d_model=64, n_heads=8, num_layers=3, output_dim=1)
print(model)

The model uses self-attention mechanisms to identify subtle patterns that may signify anomalies.
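
To make the self-attention idea concrete, here is a minimal NumPy sketch of single-head scaled dot-product attention. It illustrates the general mechanism only and is not the code in `model.py`; the function name and the random input are assumptions, and the learned query, key, and value projections are left out for brevity.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Single-head attention for arrays shaped (seq_len, d_k)."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                 # (seq_len, seq_len) similarities
    scores -= scores.max(axis=-1, keepdims=True)    # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 8))  # 10 time steps, 8-dimensional embeddings

# Self-attention: queries, keys and values all come from the same sequence
out, attn = scaled_dot_product_attention(x, x, x)
print(out.shape, attn.shape)
```

Each row of `attn` says how much one time step draws on every other time step, which is what lets the model relate distant points in a window. If the model follows the usual multi-head convention, `d_model=64` with `n_heads=8` would give each head 8 dimensions, as in this example.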


# Training the Model

Let’s set up our training loop. We’ll optimize the model using Mean Squared Error (MSE) loss and the Adam optimizer.


In [None]:
criterion = torch.nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

### Training loop

In [None]:
def train_model(model, train_loader, val_loader, epochs=20):
    train_losses, val_losses = [], []
    for epoch in range(epochs):
        model.train()
        train_loss = 0
        for data, _ in train_loader:
            optimizer.zero_grad()
            outputs = model(data)
            loss = criterion(outputs, data)
            loss.backward()
            optimizer.step()
            train_loss += loss.item()
        train_loss /= len(train_loader)
        train_losses.append(train_loss)

        # Validation phase
        model.eval()
        val_loss = 0
        with torch.no_grad():
            for data, _ in val_loader:
                outputs = model(data)
                loss = criterion(outputs, data)
                val_loss += loss.item()
        val_loss /= len(val_loader)
        val_losses.append(val_loss)

        print(f"Epoch {epoch+1}/{epochs}, Train Loss: {train_loss:.4f}, Val Loss: {val_loss:.4f}")
    return train_losses, val_losses

### Train the model

In [None]:
train_losses, val_losses = train_model(model, train_loader, val_loader)

### Plot training and validation losses

In [None]:
plt.plot(train_losses, label="Train Loss")
plt.plot(val_losses, label="Validation Loss")
plt.legend()
plt.title("Training and Validation Losses")
plt.xlabel("Epochs")
plt.ylabel("Loss")
plt.show()

If the model is learning, both curves should fall over the epochs. A validation loss that flattens or rises while the training loss keeps falling is a sign of overfitting.
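
One simple way to read the loss curves is to find the epoch with the lowest validation loss. The helper below is a small sketch of that check; `best_epoch` is a hypothetical name and the loss values are made up for illustration.

```python
def best_epoch(val_losses):
    """Return (1-based epoch, loss) for the lowest validation loss."""
    idx = min(range(len(val_losses)), key=val_losses.__getitem__)
    return idx + 1, val_losses[idx]

# Made-up validation losses: they improve, then start rising again
demo_val = [0.95, 0.70, 0.55, 0.52, 0.56, 0.61]
epoch, loss = best_epoch(demo_val)
print(f"Lowest validation loss {loss:.2f} at epoch {epoch}")
```

Applied to the real run, `best_epoch(val_losses)` gives a rough indication of where further training stopped helping.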


# Model Evaluation

Next, we evaluate the model on the test set to measure its performance.

In [None]:
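from sklearn.metrics import average_precision_score

def evaluate_model(model, test_loader):
    y_true, y_pred = [], []
    model.eval()
    with torch.no_grad():
        for data, labels in test_loader:
            outputs = model(data)
            # Anomaly score: mean squared reconstruction error per window
            scores = ((outputs - data) ** 2).flatten(start_dim=1).mean(dim=1)
            # Assumption: a window is anomalous if any of its labels is non-zero
            window_labels = (labels.reshape(len(labels), -1) != 0).any(dim=1)
            y_true.extend(window_labels.numpy())
            y_pred.extend(scores.numpy())
    pr_auc = average_precision_score(y_true, y_pred)
    print(f"Precision-Recall AUC (average precision): {pr_auc:.4f}")
    return y_true, y_pred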

### Run evaluation

In [None]:
y_true, y_pred = evaluate_model(model, test_loader)

The Precision-Recall AUC summarizes how well the model’s scores separate anomalous from normal behavior. It is usually more informative than accuracy when anomalies are rare.
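
Because the training loop optimizes the model to reconstruct its input, a common way to obtain one anomaly score per window is the reconstruction error. The NumPy sketch below shows the idea on synthetic data; `reconstruction_scores` is a hypothetical helper and the `(n_windows, seq_len, n_features)` layout is an assumption.

```python
import numpy as np

def reconstruction_scores(inputs, reconstructions):
    """Mean squared error per window for arrays shaped (n_windows, seq_len, n_features)."""
    sq_err = (inputs - reconstructions) ** 2
    return sq_err.reshape(len(inputs), -1).mean(axis=1)

rng = np.random.default_rng(0)
inputs = rng.normal(size=(4, 20, 1))
# Near-perfect reconstructions for every window...
recon = inputs + rng.normal(scale=0.05, size=inputs.shape)
# ...except window 2, which we pretend the model failed to reconstruct
recon[2] = 0.0

scores = reconstruction_scores(inputs, recon)
print(scores.round(3))
```

Windows the model reconstructs poorly receive high scores, and those scores are what a ranking metric such as Precision-Recall AUC expects as input.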


# Results Visualization

Visualizing the results helps us interpret the model’s predictions and understand its behavior.


In [None]:
precision, recall, _ = precision_recall_curve(y_true, y_pred)
plt.plot(recall, precision)
plt.title("Precision-Recall Curve")
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.show()

In [None]:
plt.hist(y_pred, bins=30, alpha=0.7)
plt.axvline(x=0.5, color="red", linestyle="--", label="Threshold")
plt.legend()
plt.title("Anomaly Score Distribution")
plt.xlabel("Anomaly Score")
plt.ylabel("Frequency")
plt.show()

The Precision-Recall Curve illustrates the trade-off between recall and precision, while the histogram highlights the distribution of anomaly scores.
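
The histogram draws its threshold line at a fixed 0.5, which may not suit the scale of the scores. One alternative, sketched here under the assumption that labels are binary and higher scores mean "more anomalous", is to pick the threshold that maximizes F1. `best_f1_threshold` and the demo values are hypothetical.

```python
import numpy as np

def best_f1_threshold(y_true, scores):
    """Return (threshold, f1) maximizing F1 over the observed score values."""
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    best_thr, best_f1 = None, -1.0
    for thr in np.unique(scores):
        pred = scores >= thr  # flag everything at or above the threshold
        tp = np.sum(pred & y_true)
        fp = np.sum(pred & ~y_true)
        fn = np.sum(~pred & y_true)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp > 0 else 0.0
        if f1 > best_f1:
            best_thr, best_f1 = float(thr), float(f1)
    return best_thr, best_f1

# Made-up labels and scores for illustration only
demo_labels = [0, 0, 0, 1, 0, 1, 1]
demo_scores = [0.1, 0.2, 0.15, 0.8, 0.3, 0.7, 0.25]
thr, f1 = best_f1_threshold(demo_labels, demo_scores)
print(f"threshold={thr:.2f}, F1={f1:.2f}")
```

The returned value could replace `x=0.5` in `plt.axvline`. Choosing the threshold on the test set makes the reported F1 optimistic, so in practice it is better to pick it on the validation set.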


# Conclusion

In this notebook, we walked through the full workflow for anomaly detection with a Transformer-based model: loading the data, defining and training the model, and evaluating it on the test set. The evaluation metrics and plots above show how well the model separates normal from anomalous behavior on this dataset. Future work can explore domain-specific tuning and other unsupervised learning methods to further enhance the model’s effectiveness.
