[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ELTE-DSED/Intro-Data-Security/blob/main/module_01_foundations/Lab1_DNN_Training_and_Robust_Models.ipynb)

# **Lab 1: Deep Neural Network Training & Robust Models**

**Course:** Introduction to Data Security Pr. (Master's Level)  
**Module 1:** Foundations  
**Estimated Time:** 90-120 minutes

---
In this notebook, we will use the basic training functionalities of [SecML-Torch](https://secml-torch.readthedocs.io/) to train a regular PyTorch Deep Neural Network (DNN) classifier.

SecML-Torch (SecMLT) is an open-source Python library designed to facilitate research in the area of Adversarial Machine Learning (AML) and robustness evaluation. The library provides a simple yet powerful interface for generating various types of adversarial examples, as well as tools for evaluating the robustness of machine learning models against such attacks.

## **Learning Objectives**

By the end of this lab, you will be able to:

1. **Train** deep neural networks on a standard image classification dataset (MNIST)
2. **Evaluate** model performance using standard metrics (accuracy, loss)
3. **Understand** the difference between standard models and robust models
4. **Load** and compare pre-trained robust models
5. **Establish** baseline models for subsequent security labs

## **Table of Contents**

1. [Setup & Imports](#setup)
2. [Part 1: Dataset Loading & Preprocessing](#part1)
3. [Part 2: Training a Standard DNN](#part2)
4. [Part 3: Evaluating Model Performance](#part3)
5. [Part 4: Loading Pre-trained & Robust Models](#part4)
6. [Conclusion & Next Steps](#conclusion)

## **Setup & Imports** <a name="setup"></a>

First, we'll install necessary libraries and import required modules.

In [None]:
# Install required packages
%pip install torch torchvision matplotlib seaborn numpy scikit-learn tqdm secml-torch -q

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from tqdm import tqdm
import warnings
warnings.filterwarnings('ignore')

# Set random seeds for reproducibility
torch.manual_seed(42)
np.random.seed(42)

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

## **Part 1: Dataset Loading & Preprocessing** <a name="part1"></a>

We'll work with **MNIST** (handwritten digits) as our primary dataset. MNIST is a standard benchmark for:
- Image classification
- Neural network training
- Adversarial robustness research

**Dataset Details:**
- **Training samples:** 60,000
- **Test samples:** 10,000
- **Image size:** 28×28 grayscale
- **Classes:** 10 (digits 0-9)

We load the MNIST training and test sets from `torchvision` and pass each of them to its own data loader.

In [None]:
# Data preprocessing transformations
transform = transforms.Compose([
    transforms.ToTensor(),  # Convert to tensor and scale to [0, 1]
    transforms.Normalize((0.1307,), (0.3081,))  # Normalize with MNIST mean and std
])

# Load MNIST dataset
train_dataset = torchvision.datasets.MNIST(
    root='./data',
    train=True,
    transform=transform,
    download=True
)

test_dataset = torchvision.datasets.MNIST(
    root='./data',
    train=False,
    transform=transform,
    download=True
)

# Create data loaders
batch_size = 128
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=2)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False, num_workers=2)

print(f"Training samples: {len(train_dataset)}")
print(f"Test samples: {len(test_dataset)}")
print(f"Batch size: {batch_size}")
print(f"Number of batches (train): {len(train_loader)}")
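
To make the effect of `transforms.Normalize((0.1307,), (0.3081,))` concrete, the short sketch below applies the same arithmetic to single pixel values in plain Python. The helper names `normalize` and `denormalize` are illustrative and not part of torchvision. After normalization, pixels that were in [0, 1] lie roughly in the range -0.42 to 2.82.

```python
MNIST_MEAN = 0.1307  # mean used in the transform above
MNIST_STD = 0.3081   # standard deviation used in the transform above


def normalize(pixel: float) -> float:
    """Map a pixel in [0, 1] to the normalized scale seen by the model."""
    return (pixel - MNIST_MEAN) / MNIST_STD


def denormalize(value: float) -> float:
    """Invert normalize(), e.g. before plotting an image."""
    return value * MNIST_STD + MNIST_MEAN


for pixel in (0.0, 0.5, 1.0):
    z = normalize(pixel)
    print(f"pixel={pixel:.1f} -> normalized={z:.4f} -> back={denormalize(z):.1f}")
```

The inverse mapping is the one needed whenever a normalized image is displayed with `imshow`.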

### **Visualize Sample Images**

In [None]:
# Visualize some training samples
def show_images(dataset, num_samples=10):
    fig, axes = plt.subplots(2, 5, figsize=(12, 5))
    axes = axes.ravel()
    
    for i in range(num_samples):
        image, label = dataset[i]
        # Denormalize for visualization
        image = image * 0.3081 + 0.1307
        axes[i].imshow(image.squeeze(), cmap='gray')
        axes[i].set_title(f'Label: {label}')
        axes[i].axis('off')
    
    plt.tight_layout()
    plt.show()

show_images(train_dataset)

## **Part 2: Training a Standard DNN** <a name="part2"></a>

We will train a classifier for the MNIST dataset. First, we define the model as a `torch.nn.Module`, as is usual in PyTorch. The model is a simple **fully connected network (MLP)** with three layers.

In [None]:
class MNISTNet(torch.nn.Module):
    """Simple fully connected network for MNIST classification."""

    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(784, 200)
        self.fc2 = torch.nn.Linear(200, 200)
        self.fc3 = torch.nn.Linear(200, 10)

    def forward(self, x):
        x = x.flatten(1)
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)

# Initialize model
net = MNISTNet().to(device)
print(net)
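
As a quick sanity check on the architecture, the number of trainable parameters can be worked out by hand: each `Linear(in, out)` layer has `in * out` weights plus `out` biases. The sketch below does this arithmetic for the layer sizes of `MNISTNet`. The helper `count_mlp_parameters` is an illustrative name, not a library function.

```python
layer_sizes = [784, 200, 200, 10]  # input, two hidden layers, output


def count_mlp_parameters(sizes):
    """Sum weights and biases over consecutive fully connected layers."""
    total = 0
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    return total


n_params = count_mlp_parameters(layer_sizes)
print(f"Trainable parameters: {n_params:,}")
```

In the notebook, the same number should be returned by `sum(p.numel() for p in net.parameters())`.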

### **Train the Model**

We first initialize the optimizer used to train the model. We use Adam with a learning rate of 1e-3 and a weight decay of 1e-4, which is a reasonable default for this small network. We then use the `SecML-Torch` functionalities to train the previously defined model on the MNIST dataset.

In [None]:
# Ensure gradients are enabled (just in case they were disabled by a previous cell)
torch.set_grad_enabled(True)

# Define optimizer
optimizer = optim.Adam(net.parameters(), lr=0.001, weight_decay=1e-4)

We will use the class `secmlt.models.pytorch.base_pytorch_trainer.BasePyTorchTrainer` to prepare a training loop. This class implements a regular training loop: it iterates over the batches of samples and performs optimization steps (with the optimizer of choice) for a given number of epochs, passed as an input parameter. The trainer handles the forward pass, loss computation, backward pass, and parameter updates automatically.

We wrap the model in a `secmlt.models.pytorch.base_pytorch_nn.BasePytorchClassifier`, which provides the APIs needed to use models subclassing `torch.nn.Module` within SecML-Torch. This wrapper doesn't modify your model but adds methods that integrate with SecML-Torch's ecosystem for attacks and defenses. Then, we can train our model by calling `model.train(dataloader=train_loader)`. This single line replaces the typical PyTorch training loop boilerplate.

In [None]:
from secmlt.models.pytorch.base_pytorch_nn import BasePytorchClassifier
from secmlt.models.pytorch.base_pytorch_trainer import BasePyTorchTrainer


# Training MNIST model
trainer = BasePyTorchTrainer(optimizer=optimizer, epochs=1)
model = BasePytorchClassifier(model=net, trainer=trainer)

model.train(dataloader=train_loader)
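
For comparison, the boilerplate that the trainer replaces looks roughly like the loop below. This is a minimal sketch, not SecML-Torch's actual implementation: it assumes a cross-entropy loss, and the function `train_epochs` and the synthetic `toy_*` objects are hypothetical stand-ins so that the example runs without downloading MNIST.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_epochs(model, loader, optimizer, epochs=1, loss_fn=None):
    """Plain PyTorch training loop; returns the mean loss of each epoch."""
    loss_fn = loss_fn or nn.CrossEntropyLoss()  # assumed loss function
    device = next(model.parameters()).device
    history = []
    model.train()
    for _ in range(epochs):
        running, n_seen = 0.0, 0
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()        # clear gradients of the previous step
            loss = loss_fn(model(x), y)  # forward pass and loss computation
            loss.backward()              # backward pass
            optimizer.step()             # parameter update
            running += loss.item() * x.size(0)
            n_seen += x.size(0)
        history.append(running / n_seen)
    return history


# Synthetic data with MNIST-like shapes (random, for illustration only)
torch.manual_seed(0)
toy_x = torch.randn(256, 1, 28, 28)
toy_y = torch.randint(0, 10, (256,))
toy_loader = DataLoader(TensorDataset(toy_x, toy_y), batch_size=64, shuffle=True)

toy_net = nn.Sequential(nn.Flatten(), nn.Linear(784, 32), nn.ReLU(), nn.Linear(32, 10))
toy_opt = torch.optim.Adam(toy_net.parameters(), lr=1e-3)

losses = train_epochs(toy_net, toy_loader, toy_opt, epochs=2)
print(losses)
```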

## **Part 3: Evaluating Model Performance** <a name="part3"></a>
We can check how the model performs on the testing dataset by using the `secmlt.metrics.classification.Accuracy` wrapper. This provides the accuracy scoring loop that queries the model with all the batches and counts how many predictions are correct. The metric automatically handles device placement and batch aggregation, returning a single accuracy value for the entire test set.

In [None]:
from secmlt.metrics.classification import Accuracy

accuracy = Accuracy()(model, test_loader)
print(f"Test Accuracy: {accuracy * 100:.2f}%")
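
The counting logic described above can be sketched in a few lines of plain Python. This is an illustration of the idea, not the library's code, and `dataset_accuracy` is a hypothetical helper. It also shows why correct predictions are summed over the whole dataset instead of averaging per-batch accuracies: the last batch is often smaller (with 10,000 test samples and a batch size of 128, there are 78 full batches and one batch of 16), so a plain mean over batches would give it too much weight.

```python
def dataset_accuracy(batches):
    """batches: iterable of (predicted_labels, true_labels) pairs."""
    correct = 0
    total = 0
    for preds, labels in batches:
        correct += sum(int(p == t) for p, t in zip(preds, labels))
        total += len(labels)
    return correct / total


toy_batches = [
    ([7, 2, 1, 0], [7, 2, 1, 0]),  # 4 of 4 correct
    ([4, 1, 4, 9], [4, 1, 4, 8]),  # 3 of 4 correct
    ([5], [6]),                    # smaller last batch, 0 of 1 correct
]

acc = dataset_accuracy(toy_batches)            # 7 correct out of 9 samples
mean_of_batch_acc = (4 / 4 + 3 / 4 + 0 / 1) / 3  # biased by the small batch
print(f"dataset accuracy: {acc:.4f}, mean of batch accuracies: {mean_of_batch_acc:.4f}")
```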

Finally, we can save our model weights with the standard `torch` saving functionality. The underlying `torch.nn.Module` is available through the `model` attribute of the `secmlt.models.pytorch.base_pytorch_nn.BasePytorchClassifier`.

In [None]:
from pathlib import Path

model_path = Path("models/mnist")
# exist_ok=True makes a separate existence check unnecessary
model_path.mkdir(parents=True, exist_ok=True)
torch.save(model.model.state_dict(), model_path / "mnist_model.pt")

After saving the model, we can load it for further evaluation: read the weights with `torch.load`, load them into a fresh `MNISTNet`, and wrap the network again with `secmlt.models.pytorch.base_pytorch_nn.BasePytorchClassifier`.

In [None]:
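trained_net = MNISTNet()
model_weights_path = model_path / "mnist_model.pt"
model_weights = torch.load(model_weights_path, map_location="cpu")
trained_net.load_state_dict(model_weights)
trained_net.to(device)
trained_net.eval()  # switch to inference mode after loading the weights
# No trainer is passed: the old trainer's optimizer is bound to `net`, not `trained_net`
trained_model = BasePytorchClassifier(model=trained_net)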
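
A simple way to convince yourself that saving and loading preserved the weights is to compare the outputs of the original and the restored network on the same input. The sketch below shows the round trip on a small `nn.Linear` layer, used here as a hypothetical stand-in for `MNISTNet`, and on an in-memory buffer instead of a file.

```python
import io

import torch
from torch import nn

torch.manual_seed(0)
original = nn.Linear(4, 3)  # stand-in for the trained network

# Save the state dict to an in-memory buffer (a file path works the same way)
buffer = io.BytesIO()
torch.save(original.state_dict(), buffer)
buffer.seek(0)

# A freshly initialized network has different random weights until we load them
restored = nn.Linear(4, 3)
restored.load_state_dict(torch.load(buffer, map_location="cpu"))
restored.eval()

x = torch.randn(5, 4)
with torch.no_grad():
    same_outputs = torch.equal(original(x), restored(x))
print(f"Outputs identical after reload: {same_outputs}")
```

In the notebook, the analogous check would compare `net` and `trained_net` on one batch, or recompute the test accuracy with `trained_model`.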

## **Part 4: Loading Pre-trained & Robust Models** <a name="part4"></a>

We will now use the basic functionalities of SecML-Torch to import a pre-trained model from torchvision and a robust model from RobustBench. This demonstrates how SecML-Torch can wrap any PyTorch model, whether it's a standard pre-trained model or one specifically trained for adversarial robustness.

### **4.1 Loading and Preprocessing an Image**

We'll download a sample image and prepare it for inference using the standard ImageNet preprocessing pipeline: resize to 256, center-crop to 224, and convert to tensor.

In [None]:
import io
import json
import requests
from PIL import Image
from torchvision import transforms
from torchvision.models import get_model
from secmlt.models.pytorch.base_pytorch_nn import BasePytorchClassifier

imagenet_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

img_url = (
    "https://raw.githubusercontent.com/ajschumacher/imagen/master/imagen/"
    "n02342885_10908_hamster.jpg"
)
labels_url = (
    "https://raw.githubusercontent.com/"
    "anishathalye/imagenet-simple-labels/master/"
    "imagenet-simple-labels.json"
)

headers = {"User-Agent": "Mozilla/5.0"}

# Download hamster image with timeout
resp = requests.get(img_url, headers=headers, timeout=30)
resp.raise_for_status()

# Download ImageNet labels with timeout
labels_resp = requests.get(labels_url, headers=headers, timeout=30)
labels_resp.raise_for_status()

imagenet_labels = json.loads(labels_resp.text)
img = Image.open(io.BytesIO(resp.content)).convert("RGB")

plt.imshow(img)
plt.axis('off')
plt.title("Sample ImageNet Image")
plt.show()

imagenet_input_tensor = imagenet_transform(img)

### **4.2 Importing a Pre-trained Model from Torchvision**

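Now we'll load a pre-trained Vision Transformer (ViT) model from torchvision. The model was trained on ImageNet-1K and expects 224×224 RGB inputs like the one prepared above. Note that torchvision's reference preprocessing for these weights also normalizes inputs with the ImageNet channel mean and standard deviation, a step the pipeline above leaves out. We wrap the network with `BasePytorchClassifier` to integrate it with SecML-Torch's functionality.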

In [None]:
imagenet_net = get_model("vit_b_16", weights="IMAGENET1K_V1")
imagenet_net.to(device)
imagenet_net.eval()

imagenet_model = BasePytorchClassifier(imagenet_net)
print("Model loaded successfully")
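
torchvision's reference preprocessing for ImageNet-trained weights normalizes each channel with a fixed mean and standard deviation, whereas the pipeline in 4.1 stops at `ToTensor()`. One possible way to keep inputs in [0, 1] and still feed the network normalized data is to make normalization the first layer of the model. The sketch below is an assumption-laden illustration, not part of the original lab: `ChannelNormalize` and `with_normalization` are hypothetical names, and an `nn.Identity` stands in for the ViT so that nothing is downloaded.

```python
import torch
from torch import nn

IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


class ChannelNormalize(nn.Module):
    """Normalize a batch of images of shape (N, C, H, W) channel by channel."""

    def __init__(self, mean, std):
        super().__init__()
        # Buffers follow the module when it is moved with .to(device)
        self.register_buffer("mean", torch.tensor(mean).view(1, -1, 1, 1))
        self.register_buffer("std", torch.tensor(std).view(1, -1, 1, 1))

    def forward(self, x):
        return (x - self.mean) / self.std


def with_normalization(net):
    """Prepend ImageNet normalization to a network that expects normalized inputs."""
    return nn.Sequential(ChannelNormalize(IMAGENET_MEAN, IMAGENET_STD), net)


stand_in = nn.Identity()  # placeholder for a real classifier
wrapped = with_normalization(stand_in).eval()

x = torch.full((1, 3, 2, 2), 0.5)  # a gray image in [0, 1]
out = wrapped(x)
print(out[0, :, 0, 0])  # one normalized value per channel
```

In the notebook, `with_normalization(imagenet_net)` could be passed to `BasePytorchClassifier` in place of `imagenet_net`. Whether this changes the prediction for the sample image has not been checked here.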

### **4.3 Making Predictions with the Pre-trained Model**

Let’s use the wrapped model to classify our image. The `predict` method handles the forward pass and returns the predicted class index. We’ll map this to the human-readable ImageNet label.

In [None]:
imagenet_pred = imagenet_model.predict(imagenet_input_tensor.unsqueeze(0).to(device))
imagenet_label = imagenet_labels[imagenet_pred.item()]
print(f"Predicted class index: {imagenet_pred.item()}")
print(f"Predicted class label: {imagenet_label}")
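
`predict` returns only the most likely class. To see how confident the network is and which other classes come close, one option is to take the raw logits from the underlying `torch.nn.Module` and rank them after a softmax. The sketch below uses made-up logits and labels. `top_k_labels` is a hypothetical helper, and in the notebook the logits would come from something like `imagenet_net(x)[0]`.

```python
import torch


def top_k_labels(logits, labels, k=5):
    """Return the k most probable (label, probability) pairs for one sample."""
    probs = torch.softmax(logits, dim=-1)
    top_probs, top_idx = torch.topk(probs, k)
    return [(labels[i], p) for i, p in zip(top_idx.tolist(), top_probs.tolist())]


toy_labels = ["cat", "dog", "hamster", "fox"]      # illustrative labels
toy_logits = torch.tensor([0.5, 1.0, 3.0, -1.0])   # illustrative logits

ranking = top_k_labels(toy_logits, toy_labels, k=3)
for name, prob in ranking:
    print(f"{name}: {prob:.3f}")
```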

### **4.4 Loading a Robust Model from RobustBench**

RobustBench provides models trained for adversarial robustness. We'll load a model robust to $L_\infty$ perturbations and wrap it the same way.

Before using models from RobustBench, we need to install the RobustBench package and its dependencies.

In [None]:
%%capture
%pip install git+https://github.com/RobustBench/robustbench.git

In [None]:
from robustbench.utils import load_model

robust_net = load_model(model_name="Salman2020Do_R18", dataset="imagenet", threat_model="Linf")
robust_net.to(device)
robust_net.eval()

robust_model = BasePytorchClassifier(robust_net)
print("Robust model loaded successfully")
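
The $L_\infty$ threat model bounds the largest change an attacker may apply to any single pixel: a perturbed image $x'$ is allowed only if $\max_i |x'_i - x_i| \le \epsilon$. The sketch below illustrates this constraint with NumPy by projecting a noisy image back into the allowed region. The budget `eps = 4 / 255` is an assumption chosen to match the value commonly used for ImageNet $L_\infty$ evaluations, and `project_linf` is a hypothetical helper, not a RobustBench or SecML-Torch function.

```python
import numpy as np


def project_linf(x_adv, x_clean, eps):
    """Clip x_adv so that it stays within eps of x_clean and inside [0, 1]."""
    delta = np.clip(x_adv - x_clean, -eps, eps)  # enforce the per-pixel budget
    return np.clip(x_clean + delta, 0.0, 1.0)    # keep a valid image


eps = 4 / 255  # assumed perturbation budget
rng = np.random.default_rng(0)

x_clean = rng.random((3, 4, 4))  # toy image with values in [0, 1]
x_noisy = x_clean + rng.normal(scale=0.1, size=x_clean.shape)
x_proj = project_linf(x_noisy, x_clean, eps)

linf_distance = np.abs(x_proj - x_clean).max()
print(f"L-inf distance after projection: {linf_distance:.4f} (budget {eps:.4f})")
```

A model that is robust under this threat model is expected to keep its prediction for every image inside that region, not only for the clean input.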

### **4.5 Comparing Predictions**

Robust models may trade clean accuracy for resilience, but on this example both models should agree.

In [None]:
robust_pred = robust_model.predict(imagenet_input_tensor.unsqueeze(0).to(device))
robust_label = imagenet_labels[robust_pred.item()]
print(f"Predicted class index: {robust_pred.item()}")
print(f"Predicted class label: {robust_label}")

## **Conclusion & Next Steps** <a name="conclusion"></a>
---

### **What You Learned**

- **Neural Network Training:** Built and trained a DNN from scratch  
- **Model Evaluation:** Measured test accuracy with SecML-Torch's `Accuracy` metric  
- **Robust Models:** Understood the concept of adversarial robustness  
- **Model Persistence:** Saved and loaded trained models  
- **Baseline Establishment:** Created standard models for future attack labs  

### **Key Takeaways**

1. **Standard models** achieve high clean accuracy but are vulnerable to adversarial attacks
2. **Robust models** trade some clean accuracy for adversarial resilience
3. **Adversarial training** is among the most effective empirical defenses but is computationally expensive
4. **Model architecture** affects both performance and robustness

### **Preparing for Upcoming Labs**

- **Module 2:** Implement and defend against evasion attacks
- **Module 3-4:** Execute and detect poisoning attacks
- **Module 5:** Create and mitigate sponge attacks
- **Module 6:** Launch and prevent privacy attacks
- **Module 7:** Generate and evaluate synthetic data
- **Module 8:** Deploy comprehensive defense systems

### **Additional Resources**

**Foundational Papers:**
- [Explaining and Harnessing Adversarial Examples (Goodfellow et al., 2015)](https://arxiv.org/abs/1412.6572)
- [Towards Deep Learning Models Resistant to Adversarial Attacks (Madry et al., 2018)](https://arxiv.org/abs/1706.06083)
- [Adversarial Examples Are Not Bugs, They Are Features (Ilyas et al., 2019)](https://arxiv.org/abs/1905.02175)
- [SoK: Security and Privacy in Machine Learning (Papernot et al., 2018)](https://ieeexplore.ieee.org/document/8406613)
- [Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations (NIST AI 100-2 E2023)](https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-2e2023.pdf)

**Industry Standards:**
- MITRE ATLAS: Adversarial Threat Landscape for AI Systems
- OWASP Machine Learning Security Top 10
- ISO/IEC 24029: Assessment of the Robustness of Neural Networks

**Tools & Frameworks:**
- [SecML-Torch](https://secml-torch.readthedocs.io/)
- [Microsoft Threat Modeling Tool](https://www.microsoft.com/en-us/securityengineering/sdl/threatmodeling)
- [Adversarial Robustness Toolbox (ART)](https://github.com/Trusted-AI/adversarial-robustness-toolbox)