In [1]:
%load_ext autoreload
%autoreload 2
import torch
from utils import get_mnist_data, get_device
from models import ConvNN
from training_and_evaluation import evaluate_robustness_smoothing

# Project 2, part 4: Randomized smoothing certification (20 pt)
In this notebook we compare the robustness of the classifiers from Parts 1-3 via randomized smoothing.

## Your task
Complete the missing code in the respective files, i.e. `models.py`, `training_and_evaluation.py`, `attacks.py`, and this notebook. Make sure that all the functions follow the provided specification, i.e. the output of the function exactly matches the description in the docstring. 

Specifically, for this part you will have to implement the following functions / classes:  
**`training_and_evaluation.py`**:
* `evaluate_robustness_smoothing` (20pt)

## General remarks

Do not add or modify any code outside of the following comment blocks, or where otherwise explicitly stated.
``` python
##########################################################
# YOUR CODE HERE
...
##########################################################
```
After you fill in all the missing code, restart the kernel and re-run all the cells in the notebook.

The following things are **NOT** allowed:
- Using additional `import` statements
- Copying / reusing code from other sources (e.g. code by other students)

Note that plagiarising even a single project task will make you ineligible for the bonus this semester.

In [2]:
mnist_testset = get_mnist_data(train=False)
device = get_device()

model = ConvNN()
model.to(device)
    
num_samples_1 = int(1e3)  # reduce this to 1e2 in case it takes too long, e.g. 
                          # because you don't have CUDA
num_samples_2 = int(1e4)  # reduce this to 1e3 in case it takes too long, e.g. 
                          # because you don't have CUDA
certification_batch_size = int(5e3)  # reduce this to 5e2 if required (e.g. not 
                                     # enough memory)
sigma = 1
alpha = 0.05

In [3]:
training_types = ["standard_training", 
                  "adversarial_training", 
                  "randomized_smoothing"]

### Robustness certification
Here we first load the checkpoints of the base classifiers trained with the different methods from Parts 1-3. We then certify the robustness of the corresponding smooth classifiers via randomized smoothing. For this, you need to implement `evaluate_robustness_smoothing` in `training_and_evaluation.py`; follow the docstring in that file.
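
For orientation, the per-sample certification step can be sketched as below. This is a minimal sketch of the commonly used procedure (a one-sided Clopper-Pearson lower bound on the top-class probability, followed by the radius `sigma * Phi^{-1}(p_lower)`), not the reference solution; the docstring in `training_and_evaluation.py` remains authoritative. The function names `binomial_upper_tail`, `lower_confidence_bound` and `certified_radius` are hypothetical, the sketch uses only the Python standard library, and counting the noisy votes with the model is left out.

```python
import math
from statistics import NormalDist


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), with each term computed in log-space."""
    total = 0.0
    for i in range(k, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_term)
    return total


def lower_confidence_bound(k: int, n: int, alpha: float) -> float:
    """One-sided Clopper-Pearson lower bound (confidence 1 - alpha) on the
    success probability, given k successes in n trials (hypothetical helper)."""
    if k == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    # Bisection: the bound is the p at which P(X >= k) equals alpha,
    # and P(X >= k) increases monotonically in p.
    for _ in range(60):
        mid = (lo + hi) / 2
        if binomial_upper_tail(k, n, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return lo


def certified_radius(sigma: float, p_lower: float):
    """L2 radius sigma * Phi^{-1}(p_lower); None means the classifier abstains."""
    if p_lower <= 0.5:
        return None
    return sigma * NormalDist().inv_cdf(p_lower)


# Example: 990 of 1000 noisy samples vote for the predicted class.
p_lower = lower_confidence_bound(k=990, n=1000, alpha=0.05)
print(certified_radius(sigma=1.0, p_lower=p_lower))
```

In terms of the notebook's parameters, `n` would correspond to `num_samples_2`, while `num_samples_1` is assumed to be the smaller sample used only to pick the candidate class.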

In [4]:
results = {}

for training_type in training_types:
    model.load_state_dict(torch.load(f"models/{training_type}.checkpoint"))
    certification_results = \
        evaluate_robustness_smoothing(model, sigma, mnist_testset, device,
                                      num_samples_1=num_samples_1,
                                      num_samples_2=num_samples_2, 
                                      alpha=alpha, 
                                      certification_batch_size=certification_batch_size)
    results[training_type] = certification_results


### Robustness comparison
Compare the robustness of the different training types. With a correct implementation, you should see that robust training via randomized smoothing leads to the best certified robustness.

Note that the number of certified points will be lower if you had to reduce the number of samples for performance reasons.

In [None]:
for k, v in results.items():
    print(f"{k}: correct_certified {v['correct_certified']}, avg. certifiable "
          f"radius: {v['avg_radius']}")