# Team Ares -- Task 1 Report -- Fall 2020
## Contributions:
### Cody Shearer
- Code:
  - Generated BIM AEs
  - Evaluated BIM AEs
- Report:
  - Background
  - BIM Attack and Evaluation
- Created/managed team repository.
- Helped set up development environments.
- Organized team meetings.

### Zhymir Thompson
- Performed experiments, gathered results for Carlini Wagner attack.

### Mahmudul Hasan
- Performed the JSMA experiment and evaluation, and wrote the JSMA report.

### Vincent Davidson
- Co-managed team meetings
- Co-managed/organized individual contributions for each team member
- Performed experiments, evaluation and analysis on PGD attacks. 

___
## Background
In their work on ATHENA, Meng et al. (2020) approach adversarial defense not as a single technique but as a framework: a variable number of weak adversarial defenses (an ensemble) are trained, and their collective predictions are used to respond to adversarial attacks. Robustness and overhead trade off against each other, and both are controlled by the number of weak defenses.
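As a rough illustration of the ensemble idea, the sketch below combines hard labels from several weak defenses by majority vote. The function name `majority_vote` and the toy prediction array are assumptions made for illustration; this is not ATHENA's actual implementation, and majority voting is only one possible way to combine predictions.

```python
import numpy as np

def majority_vote(weak_predictions: np.ndarray, num_classes: int) -> np.ndarray:
    """Combine hard labels from several weak defenses by majority vote.

    weak_predictions has shape (num_defenses, num_samples) and holds integer labels.
    Ties are broken in favor of the smallest label (np.argmax behavior).
    """
    num_samples = weak_predictions.shape[1]
    votes = np.zeros((num_samples, num_classes), dtype=int)
    for defense_labels in weak_predictions:
        # Each defense casts one vote per sample for the label it predicted.
        votes[np.arange(num_samples), defense_labels] += 1
    return votes.argmax(axis=1)

# Toy example: three hypothetical weak defenses labeling four inputs.
preds = np.array([
    [9, 4, 1, 7],
    [9, 9, 1, 7],
    [1, 4, 1, 2],
])
ensemble_labels = majority_vote(preds, num_classes=10)
```

Here a single weak defense being fooled (for example the second defense on the second input) does not change the ensemble's answer, which is the intuition behind using many diverse weak defenses.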

In the following report, we compare the robustness of ATHENA's ensemble with PGD-ADT and an undefended (control) model by subjecting them to several variations of different adversarial attack methods.

## BIM Attack and Evaluation
### Introduction
The basic iterative method (BIM) is a white-box adversarial attack developed by researchers at Google Brain and OpenAI. In their paper, Kurakin et al. (2016) demonstrate transferability of adversarial examples from a lab setting to a real-world setting. In particular, they show that adversarial examples generated by attackers who have direct access to an image classifier can still fool that same model when the images are seen through a physical camera. Furthermore, they found that no modification of the attack was needed to account for the camera.

### Experimental Setting
Here we consider an adversarial attack on a convolutional neural network (CNN) trained on a subset (10%) of the MNIST dataset using ten variations of the [basic iterative method](https://arxiv.org/pdf/1607.02533.pdf) (BIM). We first hold epsilon constant at 0.10 while varying the maximum number of iterations, then hold the maximum number of iterations at 70 and vary epsilon, to reveal how these parameters influence the error rate of the undefended model (UM), an ATHENA ensemble, and PGD-ADT.

First experiment:
- epsilon: 0.10
- max_iter: 100, 90, 80, 70, 60

Second experiment:
- epsilon: 0.20, 0.30, 0.40, 0.50, 0.60
- max_iter: 70
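To make the roles of `epsilon` and `max_iter` concrete, here is a minimal NumPy sketch of a BIM-style attack. Everything in it is an assumption for illustration: the victim is a random linear softmax classifier rather than the CNN used in the experiments, the step size defaults to `eps / max_iter`, and the helper names (`loss_gradient`, `bim`) are hypothetical. It is not the code used to generate the AEs in this report.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def loss_gradient(x, y, W, b):
    """Gradient of the cross-entropy loss w.r.t. the input x for a linear softmax model."""
    p = softmax(W @ x + b)
    p[y] -= 1.0  # softmax output minus one-hot label
    return W.T @ p

def bim(x, y, W, b, eps, max_iter, eps_step=None):
    """Iterated signed-gradient steps under an L-infinity budget eps."""
    if eps_step is None:
        eps_step = eps / max_iter  # assumed step size, not taken from the report
    x_adv = x.copy()
    for _ in range(max_iter):
        x_adv = x_adv + eps_step * np.sign(loss_gradient(x_adv, y, W, b))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay within eps of the original
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay a valid image
    return x_adv

rng = np.random.default_rng(0)
W, b = rng.normal(size=(10, 784)), np.zeros(10)  # toy model with MNIST-sized input
x = rng.uniform(size=784)
y = int(np.argmax(W @ x + b))  # treat the model's own prediction as the label
x_adv = bim(x, y, W, b, eps=0.10, max_iter=70)
```

In this formulation `epsilon` bounds how far any pixel may move, while `max_iter` only controls how finely that budget is spent, which is one reason changing `max_iter` alone may have a limited effect.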

Using the following configurations, we generate AEs and evaluate their effectiveness against the UM, the ensemble model, and PGD-ADT, using `notebooks/Task1_GenerateAEs_ZeroKnowledgeModel.ipynb` for the first
- `src/configs/task1/athena-mnist.json`
- `src/configs/task1/attack-bim-mnist.json`
- `src/configs/task1/data-bim-mnist.json`

The AEs can be found at: 
- `AE-mnist-cnn-clean-bim_eps0.1_maxiter60.npy`
- `AE-mnist-cnn-clean-bim_eps0.1_maxiter70.npy`
- `AE-mnist-cnn-clean-bim_eps0.1_maxiter80.npy`
- `AE-mnist-cnn-clean-bim_eps0.1_maxiter90.npy`
- `AE-mnist-cnn-clean-bim_eps0.1_maxiter100.npy`

### Undefended Model Results
We find that the error rate drops only for the UM, and only twice. We would expect such drops to occur only at the lowest settings (70 and 60), perhaps as some upper bound is reached. Interestingly, the drops instead occur from 100 to 90 and from 70 to 60; the error rate is the same for 90, 80, and 70.

### Ensemble and PGD-ADT Results
The ensemble and PGD-ADT have nearly the same error rate, about 2.2% and 2.5% respectively, and neither changes with the maximum number of iterations.

BIM error rate (epsilon = 0.1):

| Max Iterations         | UM          | Ensemble    | PGD-ADT     | 9->1                                        | 4->9                                        |
|------------------------|-------------|-------------|-------------|---------------------------------------------|---------------------------------------------|
| 100                    | 0.933534743 | 0.022155086 | 0.025176234 | ![](figures/bim_eps0.1_maxiter90_9to1.png)  | ![](figures/bim_eps0.1_maxiter100_4to9.png) |
| 90                     | 0.930513595 | 0.022155086 | 0.025176234 | ![](figures/bim_eps0.1_maxiter90_9to1.png)  | ![](figures/bim_eps0.1_maxiter90_4to9.png)  |
| 80                     | 0.930513595 | 0.022155086 | 0.025176234 | ![](figures/bim_eps0.1_maxiter80_9to1.png)  | ![](figures/bim_eps0.1_maxiter80_4to9.png)  |
| 70                     | 0.930513595 | 0.022155086 | 0.025176234 | ![](figures/bim_eps0.1_maxiter70_9to1.png)  | ![](figures/bim_eps0.1_maxiter70_4to9.png)  |
| 60                     | 0.926485398 | 0.022155086 | 0.025176234 | ![](figures/bim_eps0.1_maxiter60_9to1.png)  | ![](figures/bim_eps0.1_maxiter60_4to9.png)  |

In conclusion, BIM is only effective against the UM, with the error rates of the ensemble model and PGD-ADT being around 2%. Changes to the maximum iterations for BIM only have a (slight) effect on the UM, with the ensemble model and PGD-ADT defenses seeing no change.
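For reference, the error rates in the table can be read as the fraction of adversarial examples that a model misclassifies. The sketch below assumes that plain definition; the project's evaluation code may apply additional filtering (for example, only counting samples the model classifies correctly when clean), and the `error_rate` helper here is a hypothetical stand-in.

```python
import numpy as np

def error_rate(y_pred, y_true) -> float:
    """Fraction of samples whose predicted label differs from the true label."""
    y_pred = np.asarray(y_pred)
    y_true = np.asarray(y_true)
    if y_pred.shape != y_true.shape:
        raise ValueError("y_pred and y_true must have the same shape")
    return float(np.mean(y_pred != y_true))

# Toy labels (not taken from the experiments): 3 of 4 predictions are wrong.
rate = error_rate(np.array([1, 9, 3, 7]), np.array([9, 4, 3, 2]))
```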
___


## CW Attack and Evaluation

For the CW attack, we varied the learning rate and the norm (L2 or LINF). There are 10 variations in total: the same 5 learning rates are used with each norm. The files for this attack are stored in `~/src/task1/attack2`.
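The 10 variations form a simple 2 x 5 grid. The sketch below enumerates it, following the AE file naming listed under Files Used; it assumes that the `Lf` tag in those file names denotes the LINF norm, and the dictionary keys are hypothetical rather than the keys used in `attack-config.json`.

```python
from itertools import product

norms = ["L2", "Lf"]  # assumption: "Lf" is the file-name tag for the LINF norm
# Learning rates are kept as strings so that 0.00002 is not rendered as "2e-05".
learning_rates = ["0.2", "0.02", "0.002", "0.00002", "0.99"]

variations = [
    {"norm": norm, "learning_rate": float(lr), "file": f"cw-{norm}-lr{lr}.npy"}
    for norm, lr in product(norms, learning_rates)
]
```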

### Files Used

Configs:
* athena-mnist.json
* attack-config.json
* data-config.json
* model-config.json
* ./results/sub-data-config.json

sub-samples:

* sublabels-10-ratio_0.001-477207.062.npy
* subsamples-10-ratio_0.001-477207.062.npy

AEs:

* cw-L2-lr0.2.npy
* cw-L2-lr0.02.npy
* cw-L2-lr0.002.npy
* cw-L2-lr0.00002.npy
* cw-L2-lr0.99.npy
* cw-Lf-lr0.2.npy
* cw-Lf-lr0.02.npy
* cw-Lf-lr0.002.npy
* cw-Lf-lr0.00002.npy
* cw-Lf-lr0.99.npy

_\*AEs located in ./results folder_

### Results

#### Summary and Analysis

Comparing the L2 and LINF norms, LINF produced a higher error rate in most cases.
The data also suggests that high learning rates have a slight edge over low learning rates, but the difference is small and could easily disappear with a larger sample size.
The attack was evaluated on a sample size of only 10 because run time grew substantially when the sample size was increased by a factor of 10.
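To put the sample-size caveat in numbers: with 10 samples, an error rate can only take values in steps of 0.1, and its sampling uncertainty is large. The sketch below uses the standard binomial standard error as a rough guide; treating each sample as an independent trial is an assumption, and the helper name is hypothetical.

```python
import math

def binomial_standard_error(p: float, n: int) -> float:
    """Standard error of an observed proportion p estimated from n independent samples."""
    return math.sqrt(p * (1.0 - p) / n)

n = 10
resolution = 1 / n                                 # smallest possible change in error rate
se_small = binomial_standard_error(0.5, n)         # about 0.158 with 10 samples
se_large = binomial_standard_error(0.5, 10 * n)    # 0.05 with 100 samples
```

At an observed error rate of 0.5, a difference of 0.1 between two settings is therefore smaller than one standard error when only 10 samples are used.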

#### Data

| Norm | LR      | UM  | Ensemble | PGD-ADT |
|------|---------|-----|----------|---------|
| L2   | 0.02    | 0.4 | 0.0      | 0.0     |
| L2   | 0.2     | 0.5 | 0.0      | 0.1     |
| L2   | 0.002   | 0.5 | 0.0      | 0.0     |
| L2   | 0.99    | 0.6 | 0.0      | 0.1     |
| L2   | 0.00002 | 0.8 | 0.8      | 0.8     |
| LINF | 0.02    | 0.8 | 0.8      | 0.8     |
| LINF | 0.2     | 0.8 | 0.8      | 0.8     |
| LINF | 0.002   | 0.8 | 0.8      | 0.8     |
| LINF | 0.99    | 0.8 | 0.8      | 0.8     |
| LINF | 0.00002 | 0.7 | 0.8      | 0.8     |


## JSMA Attack and Evaluation
We ran the JSMA attack against a convolutional neural network (CNN), using the following values for gamma:

gamma: 0.30, 0.40, 0.50, 0.60, 0.70
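Assuming gamma has its usual JSMA meaning, the maximum fraction of input features the attack may perturb, the values above translate into the following pixel budgets for a 28x28 MNIST image. This is a back-of-the-envelope sketch with a hypothetical helper; actual implementations may round differently or count pixels in pairs.

```python
NUM_PIXELS = 28 * 28  # MNIST images are 28x28 grayscale, i.e. 784 features

def max_perturbed_pixels(gamma: float, num_features: int = NUM_PIXELS) -> int:
    """Upper bound on the number of features JSMA may change for a given gamma."""
    return int(gamma * num_features)  # round down to a whole number of pixels

budgets = {g: max_perturbed_pixels(g) for g in (0.30, 0.40, 0.50, 0.60, 0.70)}
```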

Using `notebooks/Task1_GenerateAEs_ZeroKnowledgeModel.ipynb` with the following configurations:
* `src/practice task1/at.json`
* `src/practice task1/md.json`
* `src/practice task1/dt.json`

we generated the AEs at:
* `215272.937.npy`
* `215373.062.npy`
* `215373.156.npy`
* `215373.25.npy`
* `215373.343.npy`

### Undefended Model Results
The error rate for the UM was reduced in all five variations, where the predicted labels (Y) have shape 10. For each of the AE files (`215272.937.npy`, `215373.062.npy`, `215373.156.npy`, `215373.25.npy`, and `215373.343.npy`), the reported values for the UM, the ensemble, and JSMA are all zero.



## PGD Attack and Evaluation 
We evaluated the PGD attack on a subset of the MNIST dataset using five variations of the attack. The only parameter varied was epsilon, which took values between 0 and 1.
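For context, PGD is commonly described as signed-gradient ascent with a random start, followed by a projection back onto the epsilon-ball after every step. The sketch below shows only that structure on a toy loss; the function names, the step size, and the toy gradient are assumptions for illustration and do not reproduce the configuration in `attack-zk-mnist.json`.

```python
import numpy as np

def project_linf(x_adv, x, eps):
    """Project x_adv onto the L-infinity ball of radius eps around x, within [0, 1]."""
    return np.clip(np.clip(x_adv, x - eps, x + eps), 0.0, 1.0)

def pgd(x, grad_fn, eps, eps_step, max_iter, rng):
    # Random start inside the eps-ball around the clean input.
    x_adv = project_linf(x + rng.uniform(-eps, eps, size=x.shape), x, eps)
    for _ in range(max_iter):
        # Signed-gradient ascent step, then projection back into the budget.
        x_adv = project_linf(x_adv + eps_step * np.sign(grad_fn(x_adv)), x, eps)
    return x_adv

rng = np.random.default_rng(0)
x = np.full(4, 0.5)
# Toy loss sum(x): its gradient is all ones, so PGD pushes every pixel upward.
x_adv = pgd(x, lambda v: np.ones_like(v), eps=0.3, eps_step=0.1, max_iter=10, rng=rng)
```

In this toy case every pixel ends at the edge of the budget, `x + eps`, which shows how epsilon directly caps the size of the perturbation.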

### Files Used
Configs: 	
* athena-mnist.json
* attack-zk-mnist.json
* data-mnist.json
* model-mnist.json
* ./results/sample.json

Sub-samples:
* sublabels-10-ratio_0.001-5.142948666.npy
* subsamples-10-ratio_0.001-5.142948666.npy

AEs:
* pgd-eps0.1.npy
* pgd-eps0.8.npy
* pgd-eps0.7.npy
* pgd-eps0.5.npy
* pgd-eps0.3.npy

_\*AEs located in ./results folder_

### Summary and Analysis

Across all epsilon values, the error rate stayed in the range 0.7 to 1.0. The error rate is the fraction of inputs that fool the model, so the closer the value is to 1.0, the more effective the attack.

### Data

### Ensemble and PGD-ADT Results

PGD error rate:

| Epsilon | UM  | Ensemble | PGD-ADT |
|---------|-----|----------|---------|
| 0.1     | 0.9 | 1.0      | 1.0     |
| 0.3     | 0.8 | 0.9      | 1.0     |
| 0.5     | 0.7 | 0.9      | 0.8     |
| 0.7     | 0.7 | 0.9      | 0.7     |
| 0.8     | 0.7 | 0.9      | 0.7     |

## Citations
- [Dong, Yinpeng, et al. (2020)](https://arxiv.org/pdf/2002.05999.pdf) "Adversarial Distributional Training for Robust Deep Learning." Advances in Neural Information Processing Systems 33.
- [Kurakin, Alexey, Ian Goodfellow, and Samy Bengio. (2016)](https://arxiv.org/pdf/1607.02533.pdf) "Adversarial examples in the physical world." arXiv preprint arXiv:1607.02533.
- [Meng, Ying, et al. (2020)](https://arxiv.org/abs/2001.00308) "Ensembles of many diverse weak defenses can be strong: defending deep neural networks against adversarial attacks." arXiv preprint arXiv:2001.00308.
- [LeCun, Yann, and Corinna Cortes. (2010)](http://yann.lecun.com/exdb/mnist/) "MNIST handwritten digit database."