# **How to train a Deep Learning model with Differential Privacy using PyTorch Opacus**


This step-by-step tutorial shows how to train a simple PyTorch classification model on the MNIST dataset using a differentially private stochastic gradient descent (DP-SGD) optimizer.

Link to library: https://github.com/pytorch/opacus

## **Step 1: Importing PyTorch and Opacus**

In [39]:
import torch
from torchvision import datasets, transforms
import numpy as np
from opacus import PrivacyEngine
import torch.nn as nn
import torch.optim as optim
from tqdm import tqdm

In [16]:
!pip install -e .

Obtaining file:///home/kritika/opacus
Installing collected packages: opacus
  Attempting uninstall: opacus
    Found existing installation: opacus 0.1
    Uninstalling opacus-0.1:
      Successfully uninstalled opacus-0.1
  Running setup.py develop for opacus
Successfully installed opacus


## **Step 2: Loading MNIST data**

In [19]:
# Same preprocessing for both splits: convert to tensor, then normalize
# with the MNIST mean and standard deviation.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),
])

train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../mnist', train=True, download=True, transform=transform),
    batch_size=64,
    shuffle=True,
    num_workers=1,
    pin_memory=True,
)

test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../mnist', train=False, transform=transform),
    batch_size=1024,
    shuffle=True,
    num_workers=1,
    pin_memory=True,
)

print(len(train_loader.dataset))

60000
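
As a quick sanity check on the loader settings, the plain-Python arithmetic below (no PyTorch required) shows how many batches per epoch a batch size of 64 yields on 60,000 samples, and what `Normalize((0.1307,), (0.3081,))` does to a pixel value. The helper names `batches_per_epoch` and `normalize` are illustrative only and are not part of torchvision.

```python
import math

def batches_per_epoch(num_samples: int, batch_size: int) -> int:
    # DataLoader keeps the last partial batch by default (drop_last=False),
    # so the batch count is rounded up.
    return math.ceil(num_samples / batch_size)

def normalize(pixel: float, mean: float = 0.1307, std: float = 0.3081) -> float:
    # Normalize applies (x - mean) / std to every pixel of the channel.
    return (pixel - mean) / std

print(batches_per_epoch(60000, 64))  # 938
print(normalize(0.0), normalize(1.0))  # darkest and brightest pixel after ToTensor
```

The value 938 is the batch count that the progress bars report during training.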


## **Step 3: Creating a Neural Network Classification Model and Optimizer**

In [34]:
model = nn.Sequential(
        nn.Conv2d(1, 16, 8, 2, padding=3),
        nn.ReLU(),
        nn.MaxPool2d(2, 1),
        nn.Conv2d(16, 32, 4, 2),
        nn.ReLU(),
        nn.MaxPool2d(2, 1),
        nn.Flatten(),
        nn.Linear(32 * 4 * 4, 32),
        nn.ReLU(),
        nn.Linear(32, 10))

optimizer = optim.SGD(model.parameters(), lr=0.05)
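
The input size of the first linear layer, `32 * 4 * 4`, follows from how each layer shrinks the 28x28 MNIST image. The sketch below traces one spatial dimension through the network using the standard output-size formula for convolution and pooling layers (dilation 1, floor rounding). The helper `conv_out` is a hypothetical name introduced here for illustration.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    # Output length along one spatial dimension for Conv2d / MaxPool2d
    # with dilation 1 and floor rounding.
    return (size + 2 * padding - kernel) // stride + 1

size = 28                                              # MNIST images are 28x28
size = conv_out(size, kernel=8, stride=2, padding=3)   # Conv2d(1, 16, 8, 2, padding=3) -> 14
size = conv_out(size, kernel=2, stride=1)              # MaxPool2d(2, 1) -> 13
size = conv_out(size, kernel=4, stride=2)              # Conv2d(16, 32, 4, 2) -> 5
size = conv_out(size, kernel=2, stride=1)              # MaxPool2d(2, 1) -> 4
flattened = 32 * size * size                           # 32 channels after the second conv
print(size, flattened)  # 4 512
```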

## **Step 4: Creating and Attaching a Differential Privacy Engine to the Optimizer**

In [35]:
privacy_engine = PrivacyEngine(
    model,
    batch_size=64,
    sample_size=60000,
    alphas=[1.01, 10, 100],
    noise_multiplier=1.3,
    max_grad_norm=1.0,
)

privacy_engine.attach(optimizer)
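
To make the roles of `max_grad_norm` and `noise_multiplier` concrete, here is a minimal pure-Python sketch of the standard DP-SGD gradient computation: clip each per-sample gradient, sum, add Gaussian noise, and average. It is not Opacus code. The functions `clip` and `private_gradient` are hypothetical, gradients are plain lists of floats, and the library's internals may differ in detail.

```python
import math
import random

def clip(grad, max_grad_norm):
    # Scale one per-sample gradient so its L2 norm is at most max_grad_norm.
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_grad_norm / (norm + 1e-6))
    return [g * scale for g in grad]

def private_gradient(per_sample_grads, max_grad_norm, noise_multiplier, rng):
    # 1. Clip every per-sample gradient.
    clipped = [clip(g, max_grad_norm) for g in per_sample_grads]
    # 2. Sum the clipped gradients coordinate by coordinate.
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # 3. Add Gaussian noise with std = noise_multiplier * max_grad_norm.
    std = noise_multiplier * max_grad_norm
    noisy = [s + rng.gauss(0.0, std) for s in summed]
    # 4. Average over the batch.
    return [v / len(per_sample_grads) for v in noisy]

rng = random.Random(0)
grads = [[3.0, 4.0], [0.1, 0.2]]  # toy per-sample gradients (illustrative values)
print(private_gradient(grads, max_grad_norm=1.0, noise_multiplier=1.3, rng=rng))
print(64 / 60000)  # sampling rate implied by batch_size and sample_size
```

Clipping bounds how much any single example can move the model, and the noise is calibrated to that bound, which is why the two parameters are set together.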

## **Step 5: Creating a training function**

In [36]:
def train(model, train_loader, optimizer, epoch, device, delta):
    """Run one training epoch and report the privacy budget spent so far."""
    model.train()
    criterion = nn.CrossEntropyLoss()
    losses = []
    for _batch_idx, (data, target) in enumerate(tqdm(train_loader)):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        # With the privacy engine attached, step() clips per-sample
        # gradients and adds noise before updating the weights.
        optimizer.step()
        losses.append(loss.item())

    # Query the cumulative (ε, δ) guarantee for the given δ.
    epsilon, best_alpha = optimizer.privacy_engine.get_privacy_spent(delta)
    print(
        f"Train Epoch: {epoch} \t"
        f"Loss: {np.mean(losses):.6f} "
        f"(ε = {epsilon:.2f}, δ = {delta}) for α = {best_alpha}"
    )
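
The call to `get_privacy_spent(delta)` returns an ε together with the Rényi order α that produced it. As a rough illustration of that step, the sketch below applies the classic conversion from a Rényi DP guarantee at order α to an (ε, δ) guarantee and picks the best order. This is an assumption about the kind of conversion involved, not a copy of the library's code. The function names are hypothetical and the RDP values in the example are made up.

```python
import math

def rdp_to_epsilon(rdp: float, alpha: float, delta: float) -> float:
    # Classic conversion: an (alpha, rdp)-RDP guarantee implies
    # (rdp + log(1/delta) / (alpha - 1), delta)-DP.
    return rdp + math.log(1.0 / delta) / (alpha - 1.0)

def best_epsilon(rdp_by_alpha: dict, delta: float):
    # Return (epsilon, alpha) for the order that gives the smallest epsilon.
    return min((rdp_to_epsilon(r, a, delta), a) for a, r in rdp_by_alpha.items())

# Conversion overhead alone (rdp = 0) for the three orders passed as `alphas`.
for alpha in [1.01, 10, 100]:
    print(alpha, rdp_to_epsilon(0.0, alpha, delta=1e-5))

# Made-up RDP values, only to show how the best order is selected.
print(best_epsilon({1.01: 0.0, 10: 0.01, 100: 5.0}, delta=1e-5))
```

Under this conversion, the overhead term for α = 10 and δ = 1e-5 is about 1.28 on its own. With only three orders available, the reported ε cannot fall below that value while α = 10 is selected, and a finer grid of `alphas` can only give an equal or tighter estimate.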

## **Step 6: Training the private model over multiple epochs**

In [37]:
for epoch in range(1, 11):
    train(model, train_loader, optimizer, epoch, device="cpu", delta=1e-5)

100%|██████████| 938/938 [00:33<00:00, 28.16it/s]
  0%|          | 0/938 [00:00<?, ?it/s]

Train Epoch: 1 	Loss: 1.290623 (ε = 1.28, δ = 1e-05) for α = 10.0


100%|██████████| 938/938 [00:46<00:00, 20.27it/s]
  0%|          | 0/938 [00:00<?, ?it/s]

Train Epoch: 2 	Loss: 0.542662 (ε = 1.29, δ = 1e-05) for α = 10.0


100%|██████████| 938/938 [00:42<00:00, 22.25it/s]
  0%|          | 0/938 [00:00<?, ?it/s]

Train Epoch: 3 	Loss: 0.489965 (ε = 1.29, δ = 1e-05) for α = 10.0


100%|██████████| 938/938 [00:40<00:00, 23.32it/s]
  0%|          | 0/938 [00:00<?, ?it/s]

Train Epoch: 4 	Loss: 0.467170 (ε = 1.30, δ = 1e-05) for α = 10.0


100%|██████████| 938/938 [00:40<00:00, 23.44it/s]
  0%|          | 0/938 [00:00<?, ?it/s]

Train Epoch: 5 	Loss: 0.446736 (ε = 1.30, δ = 1e-05) for α = 10.0


100%|██████████| 938/938 [00:40<00:00, 23.36it/s]
  0%|          | 0/938 [00:00<?, ?it/s]

Train Epoch: 6 	Loss: 0.464741 (ε = 1.31, δ = 1e-05) for α = 10.0


100%|██████████| 938/938 [00:40<00:00, 23.30it/s]
  0%|          | 0/938 [00:00<?, ?it/s]

Train Epoch: 7 	Loss: 0.467884 (ε = 1.31, δ = 1e-05) for α = 10.0


100%|██████████| 938/938 [00:40<00:00, 23.36it/s]
  0%|          | 0/938 [00:00<?, ?it/s]

Train Epoch: 8 	Loss: 0.480682 (ε = 1.31, δ = 1e-05) for α = 10.0


100%|██████████| 938/938 [00:40<00:00, 23.34it/s]
  0%|          | 0/938 [00:00<?, ?it/s]

Train Epoch: 9 	Loss: 0.479481 (ε = 1.32, δ = 1e-05) for α = 10.0


100%|██████████| 938/938 [00:40<00:00, 23.22it/s]

Train Epoch: 10 	Loss: 0.489091 (ε = 1.32, δ = 1e-05) for α = 10.0
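
To interpret the final numbers, recall the definition of (ε, δ)-differential privacy: for any two datasets that differ in one record, the probability of any outcome can change by at most a factor of e^ε, plus δ. The short sketch below evaluates that bound for the values reported after epoch 10. The helper `dp_bound` and the example probability 0.01 are illustrative assumptions.

```python
import math

def dp_bound(p_neighbor: float, epsilon: float, delta: float) -> float:
    # (epsilon, delta)-DP: Pr[M(D) in S] <= exp(epsilon) * Pr[M(D') in S] + delta
    # for datasets D and D' that differ in one record. Capped at 1.
    return min(1.0, math.exp(epsilon) * p_neighbor + delta)

epsilon, delta = 1.32, 1e-5          # values reported after epoch 10
print(round(math.exp(epsilon), 2))   # 3.74
print(dp_bound(0.01, epsilon, delta))
```

In other words, with ε = 1.32, adding or removing a single training example changes the probability of any particular trained model by at most a factor of roughly 3.74, up to the small additive slack δ.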




