Isaac Hus <br>
Personal Research Project <br>
Winter 2024 <br>
MNIST NN Model Stealing <br>

In [18]:
# Import the required dependencies
import torch
from torch import nn
from torch import optim
from torchvision import datasets, transforms
from torch.utils.data import random_split, DataLoader

In [19]:
# Check for CUDA availability
# Print the CUDA details
print('CUDA Version: ',torch.version.cuda)
print('CUDA devices (GPU\'s) found: ',torch.cuda.device_count())
assert(torch.cuda.is_available())

CUDA Version:  11.8
CUDA devices (GPU's) found:  1
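
The assert above stops the notebook on a machine without a GPU. A minimal sketch of a device-agnostic alternative, assuming a CPU fallback is acceptable for this project. The name `device` is hypothetical and is not used elsewhere in this notebook.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU.
# `device` is a hypothetical name, not part of the original notebook.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Tensors and modules can then be moved with .to(device) instead of .cuda()
example = torch.zeros(2, 3).to(device)
print(example.device.type)
```

With this pattern, calls such as `model.cuda()` would become `model.to(device)`.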


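The next code block defines a class that outlines the structure of the NN <br>
I am using a fully connected NN with two hidden layers of 64 units each, plus a skip connection that adds the first hidden layer's output to the second's <br>
The forward function is used to get the NN output (logits) from an input 'x' <br>
The model is fully connected; look into changing it to a convolutional one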

In [20]:
# define the NN architecture as a class
class ResNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.l1 = nn.Linear(28 * 28, 64)
        self.l2 = nn.Linear(64,64)
        self.l3 = nn.Linear(64, 10)
        # dropout module with 0.1 drop probability
        # helps in preventing overfitting
        self.do = nn.Dropout(0.1)
    
    def forward(self,x):
        h1 = nn.functional.relu(self.l1(x))
        h2 = nn.functional.relu(self.l2(h1))
        do = self.do(h2 + h1)
        logits = self.l3(do)
        # return an array of logits
        return logits
model = ResNet().cuda()
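
As a starting point for the convolutional variant mentioned above, here is a rough sketch of a small CNN for 28x28 grayscale images. The class name `SmallCNN` and the layer sizes are assumptions chosen for illustration, not a tuned architecture.

```python
import torch
from torch import nn

class SmallCNN(nn.Module):  # hypothetical name and layer sizes
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # 1x28x28 -> 16x28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 16x14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # -> 32x14x14
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 32x7x7
        )
        self.classifier = nn.Linear(32 * 7 * 7, 10)

    def forward(self, x):
        h = self.features(x)
        # flatten everything except the batch axis before the linear layer
        return self.classifier(h.flatten(start_dim=1))

cnn = SmallCNN()
dummy = torch.zeros(4, 1, 28, 28)  # a fake batch of four images
cnn_logits = cnn(dummy)
print(cnn_logits.shape)  # torch.Size([4, 10])
```

Because this network expects images of shape (batch, 1, 28, 28), the `x.view(batch_size, -1)` flattening used in the training loop would be skipped for it.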


Optimiser using Stochastic Gradient Descent (SGD) <br>
A common and computationally cheap optimiser for minimising the loss

In [21]:
# optimiser using SGD
optimiser = optim.SGD(model.parameters(), lr=1e-2)
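
To make the update rule concrete, here is a toy sketch of a single SGD step on a two-element parameter. Plain SGD without momentum updates each parameter as w <- w - lr * grad. The names `w`, `opt` and `toy_loss` are illustrative only.

```python
import torch
from torch import optim

# A toy parameter and a toy loss, sum(w^2), whose gradient is 2*w
w = torch.tensor([1.0, 2.0], requires_grad=True)
opt = optim.SGD([w], lr=1e-2)

toy_loss = (w ** 2).sum()
toy_loss.backward()  # w.grad is now [2, 4]
opt.step()           # w <- w - lr * grad

print(w.detach())    # approximately [0.98, 1.96]
```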

Loss function <br>
Using cross entropy because it's a classification problem

In [22]:
#loss
loss = nn.CrossEntropyLoss()
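
`nn.CrossEntropyLoss` takes raw logits, applies a log-softmax internally, and averages the negative log-probability of the correct class over the batch. The sketch below checks this against a manual computation on made-up logits; all values are illustrative.

```python
import torch
from torch import nn

# Made-up logits for two samples and three classes
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

builtin = nn.CrossEntropyLoss()(logits, targets)

# Manual version: log-softmax, then pick the log-probability of the true class
log_probs = logits - torch.logsumexp(logits, dim=1, keepdim=True)
manual = -log_probs[torch.arange(2), targets].mean()

print(torch.allclose(builtin, manual))  # True
```

This is why the model returns logits directly and has no softmax layer of its own.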

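Load in the dataset <br>
I'm using the MNIST dataset that PyTorch provides through torchvision <br>
The 60,000 training images are split into 55,000 for training and 5,000 for validation <br>
Open question: where is the test set? None is loaded here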

In [23]:
data = datasets.MNIST('data', train=True, download=True, transform=transforms.ToTensor())
train, val = random_split(data, [55000, 5000])
training_loader = DataLoader(train, batch_size=32)
val_loader = DataLoader(val, batch_size=32)

Training loop, details in comments <br>
Each batch is moved to the GPU, so the forward pass, loss computation and backpropagation all run there
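
The loop flattens every batch before passing it to the fully connected model. A small sketch of that reshape on a stand-in tensor (zeros instead of real images):

```python
import torch

# Stand-in for one batch from the DataLoader: 32 grayscale 28x28 images
x = torch.zeros(32, 1, 28, 28)

# Same reshape as in the training loop: keep the batch axis, merge the rest
flat = x.view(x.size(0), -1)
print(flat.shape)  # torch.Size([32, 784])
```

The 784 columns match the `28 * 28` input size of the first linear layer.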

In [24]:
# training loop

# go through the data set 5 times
num_epoch = 5
for epoch in range(num_epoch):
  losses = []

  # step through the data set batch by batch
  for batch in training_loader:

    # get the data and labels
    # x is an image
    # y is a label
    x, y = batch

    batch_size = x.size(0)
    x = x.view(batch_size, -1).cuda()

    #1 forward x through model to get logits
    logit = model(x)

    #2 compute loss
    objective = loss(logit, y.cuda())

    #3 cleaning gradient
    model.zero_grad()

    #4 back propagation to compute gradient
    objective.backward()

    #5 apply updates (step against the gradient to reduce the loss)
    optimiser.step()


    losses.append(objective.item())


  print(f'Epoch {epoch + 1}, train loss: {torch.tensor(losses).mean():.2f}')

  # switch to evaluation mode so dropout is disabled during validation
  model.eval()
  losses = []
  accuracy = []
  for batch in val_loader:
    x, y = batch
    batch_size = x.size(0)
    x = x.view(batch_size, -1).cuda()

    # no gradients are needed for validation
    with torch.no_grad():
      logit = model(x)
      objective = loss(logit, y.cuda())
    losses.append(objective.item())
    accuracy.append(y.eq(logit.argmax(dim=1).cpu()).float().mean())

  # switch back to training mode for the next epoch
  model.train()

  print(f'Epoch {epoch + 1}, validation loss: {torch.tensor(losses).mean():.2f}')
  print(f'Epoch {epoch + 1}, validation accuracy: {torch.tensor(accuracy).mean():.2f}')


Epoch 1, train loss: 0.85
Epoch 1, validation loss: 0.43
Epoch 1, validation accuracy: 0.88
Epoch 2, train loss: 0.37
Epoch 2, validation loss: 0.34
Epoch 2, validation accuracy: 0.90
Epoch 3, train loss: 0.30
Epoch 3, validation loss: 0.29
Epoch 3, validation accuracy: 0.91
Epoch 4, train loss: 0.26
Epoch 4, validation loss: 0.26
Epoch 4, validation accuracy: 0.92
Epoch 5, train loss: 0.23
Epoch 5, validation loss: 0.24
Epoch 5, validation accuracy: 0.93
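
The validation accuracy printed above is the mean of per-batch accuracies. With 5,000 validation samples and a batch size of 32, the last batch holds only 8 samples but is weighted like a full one. A sketch of a hypothetical helper, `evaluate`, that counts correct predictions over the whole set instead and runs the model in evaluation mode; the `device` argument and the toy data are assumptions for illustration.

```python
import torch
from torch import nn

def evaluate(model, loader, device='cpu'):
    """Hypothetical helper: fraction of correctly classified examples."""
    was_training = model.training
    model.eval()  # disable dropout while measuring accuracy
    correct, total = 0, 0
    with torch.no_grad():
        for x, y in loader:
            x = x.view(x.size(0), -1).to(device)
            pred = model(x).argmax(dim=1).cpu()
            correct += (pred == y).sum().item()
            total += y.size(0)
    model.train(was_training)  # restore the previous mode
    return correct / total

# Tiny demonstration with an identity "classifier" and three samples
toy_model = nn.Linear(3, 3, bias=False)
with torch.no_grad():
    toy_model.weight.copy_(torch.eye(3))
toy_loader = [(torch.eye(3), torch.tensor([0, 1, 0]))]
toy_accuracy = evaluate(toy_model, toy_loader)
print(toy_accuracy)  # two of the three toy labels match
```

For the model in this notebook the call would look like `evaluate(model, val_loader, device='cuda')`.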


In [25]:
torch.save(model.state_dict(), 'target.pth')
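
A saved `state_dict` contains only the parameters, so the architecture has to be instantiated again before loading. A minimal round-trip sketch using a stand-in `nn.Linear` model and a temporary file; for this notebook the stand-in would be a fresh `ResNet()` and the path `'target.pth'`.

```python
import os
import tempfile
import torch
from torch import nn

# Stand-in model; in the notebook this would be a fresh ResNet() instance
saved = nn.Linear(4, 2)
path = os.path.join(tempfile.mkdtemp(), 'demo.pth')
torch.save(saved.state_dict(), path)

# Rebuild the same architecture, then load the stored parameters into it
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(path, map_location='cpu'))
restored.eval()  # evaluation mode for inference

same = all(torch.equal(a, b) for a, b in
           zip(saved.state_dict().values(), restored.state_dict().values()))
print(same)  # True
```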

# Motivation for study
Many current defences against model stealing attacks involve some sort of data poisoning. They aim to interfere with a possible shadow model while keeping the normal public from getting wrong answers. The field is currently trying to optimise the trade-off between strong defence and low error for legitimate users. I plan to explore how model stealing attacks work and then to analyse different non-compromising defences.
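
To illustrate what a shadow model is, here is a toy sketch of the basic extraction idea: query a target model, record its output probabilities, and train a copy to match them. It is only an assumption about the general shape of such an attack, not necessarily the one this project will study. The linear stand-in models, random queries, and all names are hypothetical.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

target = nn.Linear(20, 5)     # stand-in for the deployed (victim) model
surrogate = nn.Linear(20, 5)  # the attacker's shadow model

# The attacker only sees the target's output probabilities for its queries
queries = torch.randn(256, 20)
with torch.no_grad():
    soft_labels = torch.softmax(target(queries), dim=1)

# KLDivLoss expects log-probabilities as input and probabilities as target
kl = nn.KLDivLoss(reduction='batchmean')
opt = optim.SGD(surrogate.parameters(), lr=0.1)

def distill_loss():
    return kl(torch.log_softmax(surrogate(queries), dim=1), soft_labels)

loss_before = distill_loss().item()
for _ in range(200):
    opt.zero_grad()
    step_loss = distill_loss()
    step_loss.backward()
    opt.step()
loss_after = distill_loss().item()

# Fraction of queries on which the copy predicts the same class as the target
with torch.no_grad():
    agreement = (surrogate(queries).argmax(dim=1)
                 == target(queries).argmax(dim=1)).float().mean().item()
print(loss_before, loss_after, agreement)
```

A defence that perturbs the returned probabilities would change `soft_labels` in this picture, which is where the trade-off between protection and accuracy for legitimate users comes from.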

# Reference Material #
https://www.frontiersin.org/articles/10.3389/fdata.2021.729663/full  (Look at backdoor watermarking)<br>
https://openaccess.thecvf.com/content_CVPR_2020/papers/Kariyappa_Defending_Against_Model_Stealing_Attacks_With_Adaptive_Misinformation_CVPR_2020_paper.pdf <br>
https://ieeexplore.ieee.org/abstract/document/8806737