
# Time Series Anomaly Detection Project

Welcome to the Time Series Anomaly Detection project! In this notebook, we will walk through the implementation step-by-step, explaining each part of the code and how it contributes to the overall goal of detecting anomalies in time series data.

# Setup

First, install the required libraries: NumPy, pandas, scikit-learn, PyTorch, and Matplotlib (used by the evaluation and analysis scripts for plotting).

In [1]:
!pip install numpy pandas scikit-learn torch matplotlib



# preprocess.py

**Description**: This file contains the data preprocessing code for the time series anomaly detection project. It defines two main classes: TimeSeriesDataset and TimeSeriesPreprocessor. The TimeSeriesDataset class is a custom PyTorch dataset that returns each sequence and its label as float tensors. The TimeSeriesPreprocessor class adds cyclical time features, rolling statistics, and lagged values to the raw data, cuts the series into windows with a specified window size and stride, and generates PyTorch DataLoader objects for training and validation. Labels are derived rather than loaded: a window is marked anomalous when the value immediately after it lies more than two standard deviations from the series mean, and each anomalous window is duplicated once with added Gaussian noise.

**Guidance**: Create an instance of the TimeSeriesPreprocessor class and call the create_dataloaders method. With no arguments it downloads the NAB ec2_cpu_utilization_825cc2.csv series, preprocesses it, and returns PyTorch DataLoader objects for training and validation. To use your own data, pass a DataFrame with a datetime timestamp column and a numeric value column.
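
The cyclical time features mentioned above can be illustrated with a minimal sketch. It applies the same sine/cosine formula as add_time_features to a single hour value; the helper name encode_hour is hypothetical and not part of the project code.

```python
import numpy as np

def encode_hour(hour):
    """Map an hour in [0, 24) to a point on the unit circle."""
    angle = 2 * np.pi * hour / 24
    return np.array([np.sin(angle), np.cos(angle)])

# As raw numbers, 23:00 and 00:00 are 23 units apart,
# but on the circle they are as close as 00:00 and 01:00.
dist_23_to_0 = np.linalg.norm(encode_hour(23) - encode_hour(0))
dist_0_to_1 = np.linalg.norm(encode_hour(0) - encode_hour(1))
dist_0_to_12 = np.linalg.norm(encode_hour(0) - encode_hour(12))

print(dist_23_to_0, dist_0_to_1, dist_0_to_12)
```

Midnight and noon end up at opposite points of the circle, which is why the encoding needs both the sine and the cosine component.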

In [2]:
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
import torch
from torch.utils.data import Dataset, DataLoader

class TimeSeriesDataset(Dataset):
    def __init__(self, sequences, labels):
        self.sequences = sequences
        self.labels = labels

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, idx):
        return torch.FloatTensor(self.sequences[idx]), torch.FloatTensor([self.labels[idx]])

class TimeSeriesPreprocessor:
    def __init__(self, window_size=288, stride=12):
        self.window_size = window_size
        self.stride = stride
        self.scaler = MinMaxScaler()

    def add_time_features(self, df):
        df = df.copy()

        df['hour'] = df.timestamp.dt.hour
        df['day_of_week'] = df.timestamp.dt.dayofweek
        df['day_of_month'] = df.timestamp.dt.day

        df['hour_sin'] = np.sin(2 * np.pi * df['hour']/24)
        df['hour_cos'] = np.cos(2 * np.pi * df['hour']/24)
        df['day_sin'] = np.sin(2 * np.pi * df['day_of_week']/7)
        df['day_cos'] = np.cos(2 * np.pi * df['day_of_week']/7)

        df['rolling_mean'] = df['value'].rolling(window=12, min_periods=1).mean()
        df['rolling_std'] = df['value'].rolling(window=12, min_periods=1).std()
        df['rolling_max'] = df['value'].rolling(window=12, min_periods=1).max()
        df['rolling_min'] = df['value'].rolling(window=12, min_periods=1).min()

        df['lag_1'] = df['value'].shift(1)
        df['lag_6'] = df['value'].shift(6)
        df['lag_12'] = df['value'].shift(12)

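        # Backfill the NaNs left by shift() and the first rolling std, then
        # forward-fill any remainder (fillna(method=...) is deprecated in pandas 2.x)
        df = df.bfill().ffill()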

        return df

    def create_sequences(self, df):
        feature_columns = ['value', 'rolling_mean', 'rolling_std', 'rolling_max',
                         'rolling_min', 'lag_1', 'lag_6', 'lag_12',
                         'hour_sin', 'hour_cos', 'day_sin', 'day_cos']

        scaled_features = self.scaler.fit_transform(df[feature_columns])

        sequences = []
        labels = []

        mean = df['value'].mean()
        std = df['value'].std()
        upper_threshold = mean + 2*std
        lower_threshold = mean - 2*std

        for i in range(0, len(df) - self.window_size, self.stride):
            seq = scaled_features[i:i + self.window_size]
            next_val = df['value'].iloc[i + self.window_size]

            is_anomaly = (next_val > upper_threshold) or (next_val < lower_threshold)

            sequences.append(seq)
            labels.append(float(is_anomaly))

            if is_anomaly:
                noise = np.random.normal(0, 0.1, seq.shape)
                augmented_seq = seq + noise
                sequences.append(augmented_seq)
                labels.append(1.0)

        return np.array(sequences), np.array(labels)

    def create_dataloaders(self, df=None, batch_size=16, train_split=0.8):
        if df is None:
            base_url = "https://raw.githubusercontent.com/numenta/NAB/master/data/"
            file_path = f"{base_url}realAWSCloudwatch/ec2_cpu_utilization_825cc2.csv"
            df = pd.read_csv(file_path)
            df['timestamp'] = pd.to_datetime(df['timestamp'])

        df = self.add_time_features(df)
        sequences, labels = self.create_sequences(df)

        indices = np.random.permutation(len(sequences))
        sequences = sequences[indices]
        labels = labels[indices]

        train_size = int(len(sequences) * train_split)

        train_sequences = sequences[:train_size]
        train_labels = labels[:train_size]
        val_sequences = sequences[train_size:]
        val_labels = labels[train_size:]

        train_dataset = TimeSeriesDataset(train_sequences, train_labels)
        val_dataset = TimeSeriesDataset(val_sequences, val_labels)

        train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
        val_loader = DataLoader(val_dataset, batch_size=batch_size)

        return train_loader, val_loader
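
The windowing and labeling logic in create_sequences can be sketched on a toy series. This is a simplified illustration, not the project code: it works on a single value column, skips scaling and noise augmentation, and the function name make_windows is hypothetical.

```python
import numpy as np

def make_windows(values, window_size, stride):
    """Return (windows, labels): each label flags the value right after its window."""
    values = np.asarray(values, dtype=float)
    mean, std = values.mean(), values.std(ddof=1)  # ddof=1 matches pandas' default .std()
    upper, lower = mean + 2 * std, mean - 2 * std

    windows, labels = [], []
    for start in range(0, len(values) - window_size, stride):
        next_val = values[start + window_size]
        windows.append(values[start:start + window_size])
        labels.append(float(next_val > upper or next_val < lower))
    return np.array(windows), np.array(labels)

# Toy series: flat at 1.0 with a single spike at index 8.
series = [1.0] * 12
series[8] = 10.0
windows, labels = make_windows(series, window_size=4, stride=2)

print(windows.shape)
print(labels)
```

Only the window covering indices 4 to 7 is labeled anomalous, because the spike at index 8 is the value that immediately follows it.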

# transformer.py

**Description**: This file contains the implementation of the Transformer-based anomaly detection model. It defines two main classes: PositionalEncoding and TransformerAnomaly. The PositionalEncoding class adds positional information to the input sequences, which the Transformer needs because self-attention on its own ignores the order of the time steps. The TransformerAnomaly class defines the architecture of the anomaly detection model, including the input projection, positional encoding, Transformer encoder, and output layers.

**Guidance**: Create an instance of TransformerAnomaly with the desired hyperparameters (input_dim, d_model, nhead, num_layers, dropout) and call the model on a batch of shape (batch_size, 288, input_dim). It returns a tensor of shape (batch_size, 1) holding one anomaly probability per input sequence. The first fully connected layer is sized for 288 time steps, so the window size must stay at 288 unless that layer is changed as well.

In [3]:
import torch
import torch.nn as nn
import math

class PositionalEncoding(nn.Module):
    def __init__(self, d_model, max_seq_length=288):
        super().__init__()
        position = torch.arange(max_seq_length).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))

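        # Shape (1, max_seq_length, d_model) so the table broadcasts over batch-first input
        pe = torch.zeros(1, max_seq_length, d_model)
        pe[0, :, 0::2] = torch.sin(position * div_term)
        pe[0, :, 1::2] = torch.cos(position * div_term)

        self.register_buffer('pe', pe)

    def forward(self, x):
        # x: (batch, seq_len, d_model); slice along the sequence axis, not the batch axis
        return x + self.pe[:, :x.size(1)]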

class TransformerAnomaly(nn.Module):
    def __init__(self, input_dim=12, d_model=32, nhead=4, num_layers=1, dropout=0.1):
        super().__init__()

        self.input_projection = nn.Linear(input_dim, d_model)

        self.pos_encoder = PositionalEncoding(d_model)

        encoder_layers = nn.TransformerEncoderLayer(
            d_model=d_model,
            nhead=nhead,
            dim_feedforward=d_model*4,
            dropout=dropout,
            batch_first=True
        )
        self.transformer_encoder = nn.TransformerEncoder(encoder_layers, num_layers)

        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(d_model * 288, 64)
        self.dropout = nn.Dropout(dropout)
        self.fc2 = nn.Linear(64, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, src):
        src = self.input_projection(src)
        src = self.pos_encoder(src)
        output = self.transformer_encoder(src)
        output = self.flatten(output)
        output = torch.relu(self.fc1(output))
        output = self.dropout(output)
        output = self.fc2(output)
        output = self.sigmoid(output)
        return output
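
To make the positional encoding step concrete, here is a minimal NumPy sketch of the same sinusoidal formula applied to a batch-first array. It assumes an even d_model and the (batch, seq_len, d_model) layout that batch_first=True implies; the function name sinusoidal_encoding is hypothetical.

```python
import numpy as np

def sinusoidal_encoding(seq_len, d_model):
    """Return a (seq_len, d_model) table of sinusoidal position encodings (d_model even)."""
    position = np.arange(seq_len)[:, None]  # (seq_len, 1)
    div_term = np.exp(np.arange(0, d_model, 2) * (-np.log(10000.0) / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(position * div_term)  # even feature indices
    pe[:, 1::2] = np.cos(position * div_term)  # odd feature indices
    return pe

batch = np.zeros((2, 5, 8))        # (batch, seq_len, d_model), batch-first
pe = sinusoidal_encoding(5, 8)
encoded = batch + pe[None, :, :]   # broadcast the same table over the batch axis

print(encoded.shape)
```

Every sequence in the batch receives the same table, and each time step within a sequence receives a different row of it.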

# train.py

**Description**: This file contains the training code for the Transformer-based anomaly detection model. It loads and preprocesses the data using the TimeSeriesPreprocessor, initializes the TransformerAnomaly model, and defines the loss function (WeightedBCELoss) and optimizer (Adam). The script trains for a fixed 100 epochs and saves the model weights whenever the validation loss reaches a new minimum; it does not stop early. The per-epoch training and validation losses are saved as NumPy arrays at the end.

**Guidance**: To train the model, run this script from the project root so that the utils and models packages can be imported. The best model is saved to models/saved/best_model_realAWS.pth, and the loss histories are saved to results/train_losses_realAWS.npy and results/val_losses_realAWS.npy.
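
The effect of the class weighting can be checked with a small NumPy sketch of the same loss formula. The constant 0.0327 is taken from train.py, where it presumably represents the fraction of anomalous windows; the function name weighted_bce is hypothetical.

```python
import numpy as np

def weighted_bce(pred, target, pos_weight, eps=1e-10):
    """Binary cross-entropy where positive samples count pos_weight times."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    weight = np.where(target == 1, pos_weight, 1.0)
    loss = -weight * (target * np.log(pred + eps) + (1 - target) * np.log(1 - pred + eps))
    return loss.mean()

anomaly_rate = 0.0327  # constant from train.py (assumed to be the positive-class fraction)
pos_weight = (1 - anomaly_rate) / anomaly_rate
print(round(pos_weight, 2))  # about 29.58

# The same confident mistake costs far more on a positive than on a negative.
miss_positive = weighted_bce([0.1], [1], pos_weight)
miss_negative = weighted_bce([0.9], [0], pos_weight)
print(miss_positive / miss_negative)
```

The ratio of the two losses equals pos_weight, so a missed anomaly is penalized roughly 30 times as heavily as a false alarm of the same confidence.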

In [5]:
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from utils.preprocess import TimeSeriesPreprocessor
from models.transformer import TransformerAnomaly
import numpy as np
from pathlib import Path
import logging

class WeightedBCELoss(nn.Module):
    def __init__(self, pos_weight):
        super().__init__()
        self.pos_weight = pos_weight

    def forward(self, pred, target):
        # Weight positive samples by pos_weight and negatives by 1; deriving the
        # weight from target keeps it on the same device as target
        weight = 1.0 + (self.pos_weight - 1.0) * target
        loss = -(weight * (target * torch.log(pred + 1e-10) +
                (1 - target) * torch.log(1 - pred + 1e-10)))
        return loss.mean()

def main():
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger(__name__)

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    logger.info(f"Using device: {device}")

    logger.info("Loading and preprocessing data...")
    preprocessor = TimeSeriesPreprocessor()
    train_loader, val_loader = preprocessor.create_dataloaders(batch_size=16)

    logger.info("Initializing model...")
    model = TransformerAnomaly(input_dim=12, d_model=64, nhead=4, num_layers=3, dropout=0.2).to(device)

    pos_weight = torch.tensor((1 - 0.0327) / 0.0327)
    criterion = WeightedBCELoss(pos_weight=pos_weight)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.00005)

    num_epochs = 100
    best_val_loss = float('inf')

    train_losses = []
    val_losses = []

    logger.info("Starting training...")
    for epoch in range(num_epochs):
        model.train()
        total_train_loss = 0

        for batch, labels in train_loader:
            batch = batch.to(device)
            labels = labels.to(device)

            optimizer.zero_grad()
            outputs = model(batch)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            total_train_loss += loss.item()

        avg_train_loss = total_train_loss / len(train_loader)
        train_losses.append(avg_train_loss)

        model.eval()
        total_val_loss = 0

        with torch.no_grad():
            for batch, labels in val_loader:
                batch = batch.to(device)
                labels = labels.to(device)

                outputs = model(batch)
                loss = criterion(outputs, labels)
                total_val_loss += loss.item()

        avg_val_loss = total_val_loss / len(val_loader)
        val_losses.append(avg_val_loss)

        logger.info(f"Epoch {epoch + 1}/{num_epochs}")
        logger.info(f"Train Loss: {avg_train_loss:.4f}, Validation Loss: {avg_val_loss:.4f}")

        if avg_val_loss < best_val_loss:
            best_val_loss = avg_val_loss
            logger.info("Saving new best model...")
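            save_dir = Path('models/saved')
            save_dir.mkdir(parents=True, exist_ok=True)  # torch.save does not create directories
            torch.save(model.state_dict(), save_dir / 'best_model_realAWS.pth')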

    Path('results').mkdir(exist_ok=True, parents=True)
    np.save('results/train_losses_realAWS.npy', train_losses)
    np.save('results/val_losses_realAWS.npy', val_losses)
    logger.info("Training complete.")

if __name__ == "__main__":
    main()
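
The training loop above runs for all 100 epochs. A minimal sketch of how early stopping with a minimum-improvement threshold could be added is shown below; the class name EarlyStopping and the parameters patience and min_delta are hypothetical and do not appear in the project code.

```python
class EarlyStopping:
    """Stop when validation loss has not improved by min_delta for `patience` epochs."""

    def __init__(self, patience=5, min_delta=1e-4):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Simulated validation losses: improvement stalls after the third epoch.
stopper = EarlyStopping(patience=2, min_delta=0.01)
losses = [0.90, 0.70, 0.60, 0.595, 0.60, 0.59]
stopped_at = None
for epoch, loss in enumerate(losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break

print(stopped_at)
```

In the training loop, step would be called once per epoch with avg_val_loss, and the loop would break when it returns True.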



# evaluate.py

**Description**: This file contains code for evaluating the trained anomaly detection model using precision-recall and ROC curves. It loads the best model from the models/saved directory, scores every window produced by the preprocessor (including the noise-augmented copies of anomalous windows), and calculates the precision, recall, and false positive rates at different threshold values. It then plots the precision-recall and ROC curves, calculates the area under each curve (PR-AUC and ROC-AUC), saves the plot in the results directory, and prints both metrics.

**Guidance**: To evaluate the model, run this script after training the model. The performance curves are saved to results/performance_curves.png, and the metrics are printed to the console.

In [None]:
import torch
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve, roc_curve, auc
from utils.preprocess import TimeSeriesPreprocessor
from models.transformer import TransformerAnomaly
from pathlib import Path

# Default path matches the checkpoint written by train.py
def evaluate_model(model_path='models/saved/best_model_realAWS.pth'):

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    print(f"Using device: {device}")

    print("Loading and preprocessing data...")
    preprocessor = TimeSeriesPreprocessor()
    base_url = "https://raw.githubusercontent.com/numenta/NAB/master/data/"
    file_path = f"{base_url}realAWSCloudwatch/ec2_cpu_utilization_825cc2.csv"
    df = pd.read_csv(file_path)
    df['timestamp'] = pd.to_datetime(df['timestamp'])

    print("Preparing evaluation data...")
    df_eval = preprocessor.add_time_features(df)
    sequences, labels = preprocessor.create_sequences(df_eval)

    print("Loading model...")
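    # Hyperparameters must match those used in train.py, or load_state_dict
    # fails with a shape mismatch
    model = TransformerAnomaly(input_dim=12, d_model=64, nhead=4, num_layers=3, dropout=0.2).to(device)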
    checkpoint = torch.load(model_path, map_location=device)
    if isinstance(checkpoint, dict) and 'model_state_dict' in checkpoint:
        model.load_state_dict(checkpoint['model_state_dict'])
    else:
        model.load_state_dict(checkpoint)
    model.eval()

    print("Making predictions...")
    predictions = []
    batch_size = 32

    with torch.no_grad():
        for i in range(0, len(sequences), batch_size):
            batch = sequences[i:i + batch_size]
            batch_tensor = torch.FloatTensor(batch).to(device)
            outputs = model(batch_tensor)
            predictions.extend(outputs.cpu().numpy())

    predictions = np.array(predictions).flatten()

    precision, recall, _ = precision_recall_curve(labels, predictions)
    fpr, tpr, _ = roc_curve(labels, predictions)
    pr_auc = auc(recall, precision)
    roc_auc = auc(fpr, tpr)

    print("Generating visualizations...")
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 5))

    # PR Curve
    ax1.plot(recall, precision)
    ax1.set_title(f'Precision-Recall Curve (AUC = {pr_auc:.3f})')
    ax1.set_xlabel('Recall')
    ax1.set_ylabel('Precision')
    ax1.grid(True)

    # ROC Curve
    ax2.plot(fpr, tpr)
    ax2.set_title(f'ROC Curve (AUC = {roc_auc:.3f})')
    ax2.set_xlabel('False Positive Rate')
    ax2.set_ylabel('True Positive Rate')
    ax2.grid(True)

    plt.tight_layout()

    Path('results').mkdir(exist_ok=True)

    plt.savefig('results/performance_curves.png')
    print("Results saved to results/performance_curves.png")

    print("\nEvaluation Metrics:")
    print(f"PR-AUC: {pr_auc:.3f}")
    print(f"ROC-AUC: {roc_auc:.3f}")


if __name__ == "__main__":
    evaluate_model()
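
To show what the ROC-AUC reported by this script measures, here is a hand-rolled NumPy sketch that builds the ROC points from sorted scores and integrates them with the trapezoid rule. It is an illustration under the assumption that all scores are distinct, not a replacement for the scikit-learn functions; the names roc_points and trapezoid_area are hypothetical.

```python
import numpy as np

def roc_points(labels, scores):
    """Return (fpr, tpr) obtained by lowering the threshold past each score in turn.

    Assumes all scores are distinct; tied scores would need to be grouped.
    """
    labels = np.asarray(labels)
    order = np.argsort(scores)[::-1]      # highest score first
    sorted_labels = labels[order]
    tps = np.cumsum(sorted_labels == 1)   # true positives at each threshold
    fps = np.cumsum(sorted_labels == 0)   # false positives at each threshold
    tpr = np.concatenate([[0.0], tps / tps[-1]])
    fpr = np.concatenate([[0.0], fps / fps[-1]])
    return fpr, tpr

def trapezoid_area(x, y):
    """Area under the piecewise-linear curve through (x, y)."""
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2))

labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(labels, scores)
roc_auc = trapezoid_area(fpr, tpr)
print(roc_auc)  # 0.75
```

A value of 0.75 here means that a randomly chosen anomalous sample outranks a randomly chosen normal one in three of the four possible pairs.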

# analyze_results.py

**Description**: This file contains code for analyzing the performance of the trained anomaly detection model. It loads the best model from the models/saved directory and makes predictions on the entire dataset. It then generates a comprehensive analysis report, including a time series plot with detected anomalies, a plot of anomaly scores over time, and a histogram of the anomaly score distribution. The report also includes key metrics such as the total number of windows analyzed, the number and percentage of detected anomalies, and summary statistics of the anomaly scores.

**Guidance**: To perform the analysis, run this script after training the model. Windows with an anomaly score above the fixed threshold of 0.3 are flagged as anomalies, and the plot and text report are saved in the results/analysis_report directory.

In [None]:
import torch
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from utils.preprocess import TimeSeriesPreprocessor
from models.transformer import TransformerAnomaly
from pathlib import Path

class ModelAnalyzer:
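    def __init__(self, model_path='models/saved/best_model_realAWS.pth'):
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.preprocessor = TimeSeriesPreprocessor()
        # Hyperparameters must match those used in train.py, or load_state_dict fails
        self.model = TransformerAnomaly(input_dim=12, d_model=64, nhead=4, num_layers=3, dropout=0.2).to(self.device)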

        checkpoint = torch.load(model_path, map_location=self.device)
        if isinstance(checkpoint, dict) and 'model_state_dict' in checkpoint:
            self.model.load_state_dict(checkpoint['model_state_dict'])
        else:
            self.model.load_state_dict(checkpoint)
        self.model.eval()

    def analyze_performance(self):
        print("Loading data...")
        base_url = "https://raw.githubusercontent.com/numenta/NAB/master/data/"
        file_path = f"{base_url}realAWSCloudwatch/ec2_cpu_utilization_825cc2.csv"
        df = pd.read_csv(file_path)
        df['timestamp'] = pd.to_datetime(df['timestamp'])

        print("Making predictions...")
        df_eval = self.preprocessor.add_time_features(df)
        sequences, labels = self.preprocessor.create_sequences(df_eval)

        predictions = []
        batch_size = 32
        with torch.no_grad():
            for i in range(0, len(sequences), batch_size):
                batch = sequences[i:i + batch_size]
                batch_tensor = torch.FloatTensor(batch).to(self.device)
                outputs = self.model(batch_tensor)
                predictions.extend(outputs.cpu().numpy())

        predictions = np.array(predictions).flatten()

        print("Generating visualizations...")
        self.generate_report(predictions, labels, df)

    def generate_report(self, predictions, labels, df):
        save_path = 'results/analysis_report'
        Path(save_path).mkdir(parents=True, exist_ok=True)

        # 1. Time Series Plot
        plt.figure(figsize=(15, 10))

        plt.subplot(3, 1, 1)
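        window_size = self.preprocessor.window_size
        # NOTE: create_sequences steps by `stride` and appends augmented copies of
        # anomalous windows, so this contiguous slice only approximates where each
        # prediction falls on the time axis.
        valid_timestamps = df['timestamp'][window_size:len(predictions)+window_size]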
        plt.plot(df['timestamp'], df['value'], label='Original', alpha=0.7)

        threshold = 0.3
        anomaly_mask = predictions > threshold
        anomaly_times = valid_timestamps[anomaly_mask]
        anomaly_values = df['value'][window_size:len(predictions)+window_size][anomaly_mask]
        plt.scatter(anomaly_times, anomaly_values, color='red', label=f'Detected Anomalies (threshold={threshold})')
        plt.title('CPU Utilization with Detected Anomalies')
        plt.legend()

        # 2. Prediction Scores
        plt.subplot(3, 1, 2)
        plt.plot(valid_timestamps, predictions, label='Anomaly Score')
        plt.axhline(y=threshold, color='r', linestyle='--', label='Threshold')
        plt.title('Anomaly Scores Over Time')
        plt.legend()

        # 3. Score Distribution
        plt.subplot(3, 1, 3)
        plt.hist(predictions, bins=50, density=True, alpha=0.7)
        plt.axvline(x=threshold, color='r', linestyle='--', label='Threshold')
        plt.title('Distribution of Anomaly Scores')
        plt.legend()

        plt.tight_layout()
        plt.savefig(f'{save_path}/anomaly_detection_results.png')
        plt.close()

        with open(f'{save_path}/detection_report.txt', 'w') as f:
            f.write("Anomaly Detection Report\n")
            f.write("======================\n\n")
            f.write(f"Total windows analyzed: {len(predictions)}\n")
            f.write(f"Number of anomalies detected: {np.sum(anomaly_mask)}\n")
            f.write(f"Percentage of anomalies: {100 * np.sum(anomaly_mask) / len(predictions):.2f}%\n")
            f.write(f"\nThreshold used: {threshold}\n")

            mean_score = np.mean(predictions)
            std_score = np.std(predictions)
            f.write(f"\nScore Statistics:\n")
            f.write(f"Mean score: {mean_score:.4f}\n")
            f.write(f"Standard deviation: {std_score:.4f}\n")
            f.write(f"Min score: {np.min(predictions):.4f}\n")
            f.write(f"Max score: {np.max(predictions):.4f}\n")

def main():
    analyzer = ModelAnalyzer()
    analyzer.analyze_performance()
    print("Analysis complete! Check results/analysis_report/ for detailed findings.")

if __name__ == "__main__":
    main()
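
When plotting scores against time, it helps to know which row of the source data each window's score refers to. The sketch below computes that mapping under the assumption that no augmented windows are added to the sequence list (for example, if augmentation were disabled for evaluation); the function name window_target_indices and the row count of 1000 are illustrative only.

```python
def window_target_indices(num_rows, window_size, stride):
    """Row index of the value that each window's label (and score) refers to."""
    return [start + window_size for start in range(0, num_rows - window_size, stride)]

# With the preprocessor defaults, consecutive predictions are 12 rows apart.
targets = window_target_indices(num_rows=1000, window_size=288, stride=12)
print(targets[:3], len(targets))
```

Indexing the timestamp column with these positions would place each score at the time step it actually describes.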

# baseline_comparison.py

**Description**: This file contains the two baselines used for comparison with the Transformer-based anomaly detection model: an LSTM-based model and a simple moving average model. It includes the implementation of the LSTMAnomalyDetector class, which defines the LSTM-based anomaly detection model. The script generates synthetic data, trains the LSTM model, and evaluates both the LSTM and moving average models using PR-AUC and ROC-AUC metrics. It does not load or run the Transformer model.

**Guidance**: Run the script as is to compare the performance of the LSTM and moving average models on synthetic data. The script trains the LSTM model for 10 epochs and then evaluates both models using the precision_recall_curve and roc_curve functions from scikit-learn, printing the PR-AUC and ROC-AUC scores for each model. Note that create_synthetic_data draws the labels independently of the features, so there is no signal for either baseline to learn, and scores close to chance (ROC-AUC near 0.5, PR-AUC near the 5% anomaly rate) are expected. To test the models on a real dataset, replace the synthetic data generation with your own data loading function.

In [None]:
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset
from sklearn.metrics import precision_recall_curve, roc_curve, auc
import matplotlib.pyplot as plt
import numpy as np


class LSTMAnomalyDetector(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super(LSTMAnomalyDetector, self).__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_dim)

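    def forward(self, x):
        # nn.LSTM treats a 2D input as one unbatched sequence, so give
        # (batch, features) input an explicit sequence axis of length 1
        if x.dim() == 2:
            x = x.unsqueeze(1)
        out, _ = self.lstm(x)
        out = out[:, -1, :]  # output at the last time step
        return self.fc(out)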


def create_synthetic_data(num_samples=1000, num_features=12, anomaly_rate=0.05):
    data = np.random.rand(num_samples, num_features)
    labels = (np.random.rand(num_samples) < anomaly_rate).astype(int)
    return data, labels


class TimeSeriesDataset(Dataset):
    def __init__(self, data, labels):
        self.data = data
        self.labels = labels

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx], self.labels[idx]


def create_dataloaders(data, labels, batch_size=32):
    data = torch.tensor(data, dtype=torch.float32)
    labels = torch.tensor(labels, dtype=torch.float32)

    dataset = TimeSeriesDataset(data, labels)
    dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

    return dataloader


def evaluate_model(model, dataloader, device):
    model.eval()
    y_true, y_scores = [], []
    for inputs, targets in dataloader:
        inputs, targets = inputs.to(device), targets.to(device)
        with torch.no_grad():
            outputs = model(inputs).squeeze(-1)  # keep the batch axis even for a batch of one
        y_true.extend(targets.cpu().numpy())
        y_scores.extend(outputs.cpu().numpy())

    precision, recall, _ = precision_recall_curve(y_true, y_scores)
    pr_auc = auc(recall, precision)
    fpr, tpr, _ = roc_curve(y_true, y_scores)
    roc_auc = auc(fpr, tpr)
    return {'PR-AUC': pr_auc, 'ROC-AUC': roc_auc}


if __name__ == "__main__":
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    data, labels = create_synthetic_data()
    dataloader = create_dataloaders(data, labels, batch_size=16)

    lstm_model = LSTMAnomalyDetector(input_dim=12, hidden_dim=64, num_layers=2, output_dim=1).to(device)
    criterion = nn.BCEWithLogitsLoss()
    optimizer = torch.optim.Adam(lstm_model.parameters(), lr=0.001)

    lstm_model.train()
    for epoch in range(10):
        epoch_loss = 0
        for inputs, targets in dataloader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            outputs = lstm_model(inputs).squeeze(-1)  # keep the batch axis even for a batch of one
            loss = criterion(outputs, targets)
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        print(f"Epoch {epoch + 1}/10, Loss: {epoch_loss:.4f}")

    lstm_results = evaluate_model(lstm_model, dataloader, device)
    print(f"LSTM Results: PR-AUC = {lstm_results['PR-AUC']:.3f}, ROC-AUC = {lstm_results['ROC-AUC']:.3f}")

    data_flat = data[:, -1]
    moving_avg = np.convolve(data_flat, np.ones(5)/5, mode='same')
    anomaly_scores = np.abs(data_flat - moving_avg)
    precision, recall, _ = precision_recall_curve(labels, anomaly_scores)
    pr_auc = auc(recall, precision)
    fpr, tpr, _ = roc_curve(labels, anomaly_scores)
    roc_auc = auc(fpr, tpr)
    print(f"Moving Average Results: PR-AUC = {pr_auc:.3f}, ROC-AUC = {roc_auc:.3f}")
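
The moving average baseline can be sanity-checked on a series with a known anomaly. The sketch below uses the same np.convolve approach as the script on a flat toy series with one spike; the function name moving_average_scores is hypothetical.

```python
import numpy as np

def moving_average_scores(series, window=5):
    """Anomaly score = absolute deviation from a centred moving average."""
    series = np.asarray(series, dtype=float)
    kernel = np.ones(window) / window
    smoothed = np.convolve(series, kernel, mode="same")
    return np.abs(series - smoothed)

# Flat series with one spike at index 10.
series = np.ones(20)
series[10] = 6.0
scores = moving_average_scores(series)

print(int(np.argmax(scores)))  # 10
```

Because mode="same" pads the series with zeros, the first and last two points also receive small non-zero scores even though the series is flat there.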


# visualize_results.py

**Description**: This file contains code for visualizing and comparing the performance of different anomaly detection models using bar plots. It plots the Precision-Recall AUC (PR-AUC) and ROC-AUC scores for three models: Transformer, LSTM, and Moving Average. The bar plots provide a clear visual comparison of the models' performance on these metrics.

**Guidance**: To use this code, make sure you have the necessary performance metrics (PR-AUC and ROC-AUC scores) for each model you want to compare. Update the models list with the names of your models, and the pr_auc_scores and roc_auc_scores lists with their corresponding scores. Run the script to generate the bar plots, which will help you quickly assess and compare the performance of different anomaly detection models.

In [None]:
import matplotlib.pyplot as plt

models = ["Transformer", "LSTM", "Moving Average"]
pr_auc_scores = [0.857, 0.057, 0.045]
roc_auc_scores = [0.976, 0.545, 0.497]

# Bar Plot for PR-AUC
plt.figure(figsize=(12, 6))
plt.bar(models, pr_auc_scores, color="blue", alpha=0.7)
plt.title("Precision-Recall AUC Comparison")
plt.ylabel("PR-AUC")
plt.xlabel("Models")
plt.ylim(0, 1)
plt.show()

# Bar Plot for ROC-AUC
plt.figure(figsize=(12, 6))
plt.bar(models, roc_auc_scores, color="green", alpha=0.7)
plt.title("ROC-AUC Comparison")
plt.ylabel("ROC-AUC")
plt.xlabel("Models")
plt.ylim(0, 1)
plt.show()


# Conclusion

The code files in this notebook implement a pipeline for anomaly detection in time series data using a Transformer-based approach. The pipeline covers data preprocessing, model architecture definition, training, evaluation, results analysis, visualization, and comparison with baseline models. The main focus is the Transformer-based anomaly detection model, which uses self-attention to relate every time step in a window to every other and so can capture dependencies across the full 288-step window.

The pipeline is modular: preprocessing, model definition, training, and evaluation live in separate files, so each can be adapted to a different dataset or model architecture. It is built on PyTorch and scikit-learn.

The baseline comparison code provides LSTM and moving average detectors against which the Transformer-based model can be assessed, which helps users judge how effective the Transformer architecture is at detecting anomalies when selecting a model.