# Team Horus - Task 1


# Introduction

Adversarial examples (AEs) are typically created by adding small, often imperceptible perturbations to benign samples so that a machine learning model misclassifies them.

For this task, we generated adversarial examples in the context of the Zero-knowledge threat model. In this model, the adversary has full knowledge of the architecture and settings of the Undefended Model, but is unaware that a defense exists and therefore knows nothing about the weak defenses or the ensemble strategy.


# Background
In this task, we implemented three different adversarial attacks:
1. Fast Gradient Sign Method (FGSM)
2. Projected Gradient Descent (PGD)
3. Basic Iterative Method (BIM)

## 1. Fast Gradient Sign Method ([FGSM](https://arxiv.org/abs/1412.6572)) Attack 

The Fast Gradient Sign Method ([FGSM](https://arxiv.org/abs/1412.6572)) attack is one of the most popular adversarial attacks. Whereas training minimizes the loss function by adjusting the weights along the backpropagated gradients, this attack perturbs the original input so as to increase the loss, using the gradient of the loss with respect to the input. That is, it produces an adversarial example by adding to the original input image ($x$) a perturbation of magnitude $\epsilon$ in the direction of the sign of the gradient of the loss function ($J$) with respect to the input ($\nabla_xJ(\theta,x,y)$).

It is an attack for an $l_\infty$-bounded adversary that computes an adversarial example using the following expression:

$$ x' = x + \epsilon \cdot \operatorname{sign}(\nabla_xJ(\theta,x,y))$$
where:
* $x'$ is the adversarial image
* $x$ is the original input image 
* $y$ is the true label of the original input
* $\theta$ denotes the model parameters
* $J$ is the loss function
* $\epsilon$ is the magnitude of the perturbation
  
In essence, this attack can be described as a simple one-step scheme for maximizing the inner part of the saddle-point (min-max) formulation.
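As a minimal NumPy sketch of the update rule above: the helper name `fgsm_step`, the hand-written gradient and the `[0, 1]` pixel range are assumptions made for illustration, not the code used in this project.

```python
import numpy as np

def fgsm_step(x, grad, eps, clip_min=0.0, clip_max=1.0):
    """One FGSM step: shift every pixel by eps along the sign of the loss gradient."""
    x_adv = x + eps * np.sign(grad)
    # Keep the result a valid image (the [0, 1] pixel range is an assumption).
    return np.clip(x_adv, clip_min, clip_max)

# Toy 2x2 "image" and a made-up gradient of the loss with respect to it.
x = np.array([[0.0, 0.5], [0.9, 1.0]])
grad = np.array([[0.3, -2.0], [1.5, 0.0]])
x_adv = fgsm_step(x, grad, eps=0.2)
print(x_adv)
```

Only the sign of each gradient entry matters, so no pixel moves by more than $\epsilon$, which is what makes the perturbation $l_\infty$-bounded.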

## 2. Projected Gradient Descent ([PGD](https://arxiv.org/pdf/1706.06083.pdf)) Attack

The Projected Gradient Descent ([PGD](https://arxiv.org/pdf/1706.06083.pdf)) attack is a more powerful adversary than FGSM. As its name suggests, it applies projected gradient descent to the negative loss function. In other words, it iterates to find the perturbation that maximizes the loss of a model on a particular input while keeping the size of the perturbation within a specified bound, $\epsilon$.
The method updates the adversarial example as follows until a stopping criterion is satisfied:
 
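$$ x^{t+1} = \Pi_{x+S} (x^t + \alpha \cdot \operatorname{sign}(\nabla_xJ(\theta,x^t,y)))$$
where:
* $t$ is the iteration index
* $\alpha$ is the step size
* $\Pi_{x+S}$ is the projection onto $x+S$, the set of allowed perturbed inputs around $x$ (here the $l_\infty$ ball of radius $\epsilon$)

and the other symbols are as defined above.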

It is similar to the BIM attack (also known as Iterative [FGSM](https://arxiv.org/abs/1412.6572)). What differentiates them is that PGD initializes the example to a random point within the allowed set, as determined by the $l_\infty$ norm, and may also perform random restarts.

The [PGD](https://arxiv.org/pdf/1706.06083.pdf) attack is considered one of the most complete first-order adversaries, as it places no constraint on the time and effort the adversary spends iterating toward the best attack.
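The random start and the projection step can be sketched as follows. This is an illustrative NumPy example on a toy logistic model whose input gradient is known in closed form. The names `loss_grad` and `pgd`, the toy weights and the `[0, 1]` pixel range are assumptions, not the project's implementation.

```python
import numpy as np

def loss_grad(x, y, w, b):
    """Gradient of the binary cross-entropy loss w.r.t. the input of a logistic model."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return (p - y) * w

def pgd(x, y, w, b, eps, alpha, steps, rng):
    # Random start inside the l_inf ball of radius eps around x.
    x_adv = np.clip(x + rng.uniform(-eps, eps, size=x.shape), 0.0, 1.0)
    for _ in range(steps):
        # Ascent step on the loss, using the gradient at the current iterate.
        x_adv = x_adv + alpha * np.sign(loss_grad(x_adv, y, w, b))
        # Projection onto x + S: back into the eps-ball, then into the pixel range.
        x_adv = np.clip(x_adv, x - eps, x + eps)
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv

rng = np.random.default_rng(0)
w, b = np.array([1.0, -2.0, 0.5]), 0.0   # toy model parameters
x, y = np.array([0.5, 0.5, 0.5]), 1.0    # toy input and label
x_adv = pgd(x, y, w, b, eps=0.3, alpha=0.1, steps=10, rng=rng)
print(x_adv)
```

For this toy model the gradient sign never changes, so every coordinate ends on the boundary of the $\epsilon$-ball regardless of the random start. On a real network the gradient is re-evaluated at each iterate and the path is less regular.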
 
## 3. Basic Iterative Method ([BIM](https://arxiv.org/abs/1607.02533)) ($l_\infty$-norm) Attack

This attack is an extension of FGSM in which FGSM is applied multiple times with a small step size $\alpha$, keeping the total perturbation bounded by $\epsilon$, instead of taking one big jump of size $\epsilon$. Hence, it is commonly referred to as iterative FGSM (I-FGSM).

It begins by initializing the adversarial example as the original input image $x$ and computes the iteration steps as follows:

$$ x^{adv}_0 = x, \quad x^{adv}_{N+1} = \operatorname{clip}_{x,\epsilon}(x^{adv}_N + \alpha \cdot \operatorname{sign}(\nabla_xJ(x^{adv}_N,y_{true})))$$
where:
* $\operatorname{clip}_{x,\epsilon}(\beta)$ represents the element-wise clipping of $\beta$ to within $\epsilon$ of $x$
* $J$ denotes the loss function of the model
* $N$ denotes the iteration index
* $y_{true}$ is the true label of $x$
* $\alpha$ is the step size of each perturbation update

The clip function ensures that the generated adversarial examples remain within both the $\epsilon$-ball around $x$ and the valid input space.
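A small sketch of the clipping and the iteration, assuming a `[0, 1]` pixel range and a caller-supplied gradient function. The names `clip_eps`, `bim` and `grad_fn` are hypothetical and chosen only for this example.

```python
import numpy as np

def clip_eps(x_adv, x, eps, lo=0.0, hi=1.0):
    """Element-wise clip into the eps-ball around x, then into the valid pixel range."""
    return np.clip(np.clip(x_adv, x - eps, x + eps), lo, hi)

def bim(x, grad_fn, eps, alpha, n_iter):
    x_adv = x.copy()  # x_0^adv = x: BIM starts from the clean input, with no random start
    for _ in range(n_iter):
        x_adv = clip_eps(x_adv + alpha * np.sign(grad_fn(x_adv)), x, eps)
    return x_adv

# Toy gradient that always points upward, so the effect of the clipping is easy to read.
x = np.array([0.1, 0.5, 0.95])
x_adv = bim(x, lambda z: np.ones_like(z), eps=0.2, alpha=0.05, n_iter=10)
print(x_adv)
```

In this toy run the first two pixels stop at $x + \epsilon$ and the third stops at the upper pixel bound. Once every pixel sits on a bound, further iterations leave the example unchanged.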




# Experiment 
 
## Objective
To generate adversarial examples in the context of the **Zero-knowledge** threat model, evaluate the performance of ensemble defenses built with [Vanilla Athena](https://arxiv.org/pdf/2001.00308.pdf) against the crafted AEs, and compare it to the performance of the Undefended Model and the state-of-the-art *PGD Adversarial Training* (PGD-ADT) defense.


## Subsampling
The input files for this project are images of handwritten digits in the set `{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}` and their true labels, taken from the **MNIST** dataset provided. The provided dataset contains **10000** samples that could be used to generate the adversarial examples. However, due to limitations in computational time and resources, a smaller set of samples and labels was drawn from it at a ratio of 1:10 (`0.1`). The same subsamples and their corresponding labels were used for every attack, to ensure consistency and allow for proper comparison.

The generated subsample files (containing 1000 samples) for the benign samples and their respective labels are located at:
* samples - ``data/subsamples-mnist-ratio_0.1-289385.328.npy`` 
* labels  - ``data/sublabels-mnist-ratio_0.1-289385.328.npy``

These subsample files were used to generate the adversarial examples (AEs) for FGSM, PGD and BIM adversarial attacks.
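One way such a 1:10 subsample could be drawn is sketched below. The function name `subsample`, the fixed seed, the uniform sampling without replacement and the stand-in array shapes are all assumptions. The report does not state how the actual subsample files were produced.

```python
import numpy as np

def subsample(data, labels, ratio=0.1, seed=0):
    """Draw round(ratio * N) samples uniformly at random, without replacement."""
    rng = np.random.default_rng(seed)
    n = int(round(len(data) * ratio))
    idx = rng.choice(len(data), size=n, replace=False)
    # Index data and labels with the same indices so that pairs stay aligned.
    return data[idx], labels[idx]

# Stand-in arrays with MNIST-like shapes (placeholder values, not the real dataset).
data = np.zeros((10000, 28, 28), dtype=np.float32)
labels = np.arange(10000) % 10
sub_x, sub_y = subsample(data, labels, ratio=0.1)
print(sub_x.shape, sub_y.shape)  # (1000, 28, 28) (1000,)
```

Saving the result once (for example with `np.save`) and reusing the same files for every attack keeps the comparison between attacks and defenses consistent.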


## Crafting of Adversarial Examples

The AEs were generated as follows:
* For **FGSM attack**, 6 AEs were generated by using the values of epsilon ($\epsilon$): ``0.08``, ``0.15``, ``0.20``, ``0.25``, ``0.30`` and ``0.40``.


* For **PGD attack**, 17 AEs were created by varying three different parameters as follows:
    1. Epsilon ($\epsilon$) of values: ``0.08``, ``0.15``, ``0.20``, ``0.25``, ``0.30`` and ``0.40``
    1. Epsilon-step of values: ``0.015``, ``0.02``, ``0.03``, ``0.06``, ``0.10`` and ``0.15``
    1. Maximum iterations: ``5``, ``7``, ``10``, ``15`` and ``20``.

  When one parameter was varied, the remaining two were kept constant, giving 6 + 6 + 5 = 17 AEs.


* For **BIM attack**, 12 AEs were generated in total: 6 values of epsilon for each of two values of maximum iterations, as shown below:
    1. For ``60`` maximum iterations, Epsilon ($\epsilon$) of values: ``0.08``, ``0.15``, ``0.20``, ``0.25``, ``0.30`` and ``0.40``
    1. For ``100`` maximum iterations, Epsilon ($\epsilon$) of values: ``0.08``, ``0.15``, ``0.20``, ``0.25``, ``0.30`` and ``0.40``

The settings for these attacks are defined in the ``config/attack-zk-mnist.json`` file.
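The parameter sweeps listed above can be enumerated programmatically, which gives a quick check on the 6, 17 and 12 counts. This is only a sketch. The dictionary keys are hypothetical and do not reflect the schema of ``config/attack-zk-mnist.json``, and the PGD baseline values are left as `None` because the report does not state them.

```python
EPS = [0.08, 0.15, 0.20, 0.25, 0.30, 0.40]
EPS_STEP = [0.015, 0.02, 0.03, 0.06, 0.10, 0.15]
MAX_ITER = [5, 7, 10, 15, 20]

def fgsm_grid():
    return [{"attack": "fgsm", "eps": e} for e in EPS]

def pgd_grid(base):
    """One-at-a-time sweep: vary a single parameter, keep the other two at `base`."""
    grid = []
    for name, values in (("eps", EPS), ("eps_step", EPS_STEP), ("max_iter", MAX_ITER)):
        for v in values:
            grid.append({"attack": "pgd", **base, name: v})
    return grid

def bim_grid():
    return [{"attack": "bim", "eps": e, "max_iter": m} for m in (60, 100) for e in EPS]

# Hypothetical baseline: None marks values that the report does not state.
base = {"eps": None, "eps_step": None, "max_iter": None}
grids = fgsm_grid() + pgd_grid(base) + bim_grid()
print(len(fgsm_grid()), len(pgd_grid(base)), len(bim_grid()), len(grids))  # 6 17 12 35
```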

## Evaluations of Crafted AEs and Ensemble Strategies
The generated AEs were evaluated on the undefended model, the model with the [Vanilla Athena](https://arxiv.org/pdf/2001.00308.pdf) defense and the model with the state-of-the-art PGD-ADT defense.
The [Vanilla Athena](https://arxiv.org/pdf/2001.00308.pdf) defense was tested by changing the number of weak defenses (10 and 20) and the ensemble strategy (Average Output based-on Probability, AVEP, and Majority Voting, MV).
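To make the difference between the two ensemble strategies concrete, here is a small NumPy sketch. It assumes each weak defense outputs a probability vector per sample, stacked as an array of shape `(n_wds, n_samples, n_classes)`. The function names and the tie-breaking rule (lowest class index wins) are assumptions, and Athena's own implementation may differ.

```python
import numpy as np

def avep(probs):
    """Average the WD probability vectors, then take the arg-max class."""
    return probs.mean(axis=0).argmax(axis=-1)

def majority_vote(probs):
    """Each WD votes for its arg-max class; the most frequent class wins."""
    votes = probs.argmax(axis=-1)                 # shape: (n_wds, n_samples)
    n_classes = probs.shape[-1]
    counts = np.eye(n_classes)[votes].sum(axis=0) # shape: (n_samples, n_classes)
    return counts.argmax(axis=-1)                 # ties go to the lowest class index

# Three WDs, one sample, three classes: one confident WD against two lukewarm ones.
probs = np.array([
    [[0.9, 0.1, 0.0]],
    [[0.4, 0.6, 0.0]],
    [[0.4, 0.6, 0.0]],
])
print(avep(probs), majority_vote(probs))  # [0] [1]
```

The two strategies disagree only when a confident minority outweighs a hesitant majority, as in this constructed case. When the weak defenses mostly agree, both return the same label.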

The configuration numbers of the selected weak defenses are:
* [Vanilla Athena](https://arxiv.org/pdf/2001.00308.pdf) with 10 WDs: [``1``, ``10``, ``12``, ``20``, ``25``, ``30``, ``39``, ``48``, ``57``, ``68``]
* [Vanilla Athena](https://arxiv.org/pdf/2001.00308.pdf) with 20 WDs: [``1``, ``4``, ``8``, ``10``, ``12``, ``17``, ``20``, ``25``, ``30``, ``35``, ``39``, ``43``, ``48``, ``50``, ``55``, ``57``, ``60``, ``65``, ``68``, ``72``]

The settings for WDs are specified in the ``config/athena-mnist.json`` file. 

Error rates for all three types of models were calculated to determine the success rate of the adversary in the Zero-knowledge threat model context.
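For reference, an error rate in its simplest form is the fraction of samples whose predicted label differs from the true label. The sketch below uses that plain definition. Whether the evaluation in this project excluded samples that were already misclassified in their benign form is not stated here, so treat this as an assumption.

```python
import numpy as np

def error_rate(y_pred, y_true):
    """Fraction of samples whose predicted label differs from the true label."""
    y_pred = np.asarray(y_pred)
    y_true = np.asarray(y_true)
    return float(np.mean(y_pred != y_true))

# Made-up labels for illustration: two of the five predictions are wrong.
y_true = [7, 2, 1, 0, 4]
y_pred = [7, 2, 8, 0, 9]
print(error_rate(y_pred, y_true))  # 0.4
```

A higher error rate on the AEs means a more successful adversary, so a defense is better when its curve lies lower.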

## Relevant Files
* Subsamples and sublabels generated at a 1:10 ratio can be found at:
  * samples - `task1/data/subsamples-mnist-ratio_0.1-289385.328.npy`
  * labels - `task1/data/sublabels-mnist-ratio_0.1-289385.328.npy`

* Attack configuration file for the generation of AEs can be found at:
  * `/task1/config/attack-zk-mnist.json`

* Ensemble configuration for [Vanilla Athena](https://arxiv.org/pdf/2001.00308.pdf) can be found at:
  * `/task1/config/athena-mnist.json`

# Results and Discussion



# FGSM Generated AEs

### Plot of Sample FGSM AEs
$\epsilon$: 0.08  |  $\epsilon$: 0.15 | $\epsilon$: 0.20 | $\epsilon$: 0.25 | $\epsilon$: 0.30  | $\epsilon$: 0.40
:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|
![](images/FGSM_eps0.08_task1.png)  |  ![](images/FGSM_eps0.15_task1.png) |  ![](images/FGSM_eps0.20_task1.png) |  ![](images/FGSM_eps0.25_task1.png) |  ![](images/FGSM_eps0.30_task1.png) |  ![](images/FGSM_eps0.40_task1.png)

10 WDs AVEP Ensemble Strategy  |  20 WDs AVEP Ensemble Strategy
:-------------------------:|:-------------------------:|
![](images/FGSM_AVEP_10.png)  |  ![](images/FGSM_AVEP_20.png) |

10 WDs MV Ensemble Strategy  |  20 WDs MV Ensemble Strategy
:-------------------------:|:-------------------------:|
![](images/FGSM_MV_10.png)  |  ![](images/FGSM_MV_20.png) |

Comparison of All Defence Ensembles of Athena Used  |
:-------------------------:|
![](images/FGSM_compare.png)  |

### Discussion of FGSM AEs Evaluation Results
The AEs generated by FGSM appear more distorted with the increasing value of epsilon. 
For all variations of ensemble strategy and number of weak defenses, [Athena](https://arxiv.org/pdf/2001.00308.pdf) has a lower error rate than PGD-ADT for epsilon values up to around ``0.20``. Beyond ``0.20``, the error rate of the [Athena](https://arxiv.org/pdf/2001.00308.pdf) defense rises above that of PGD-ADT. Consequently, for the FGSM attack, Athena is the better defense for epsilon values up to around ``0.20``, after which PGD-ADT outperforms Athena.




# PGD Generated AEs

## Variation in Epsilon

$\epsilon$: 0.08  |  $\epsilon$: 0.15 | $\epsilon$: 0.20 | $\epsilon$: 0.25 | $\epsilon$: 0.30  | $\epsilon$: 0.40
:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|
![](images/PGD_eps0.08_Veps_task1.png)  |  ![](images/PGD_eps0.15_Veps_task1.png) |  ![](images/PGD_eps0.20_Veps_task1.png) |  ![](images/PGD_eps0.25_Veps_task1.png) |  ![](images/PGD_eps0.30_Veps_task1.png) |  ![](images/PGD_eps0.40_Veps_task1.png)

### Comparison of Average Output based-on Probability (AVEP) and Majority Voting (MV) Ensemble Strategies

10 WDs AVEP Ensemble Strategy  |  20 WDs AVEP Ensemble Strategy
:-------------------------:|:-------------------------:|
![](images/PGD_eps_AVEP_10.png)  |  ![](images/PGD_eps_AVEP_20.png) |

10 WDs MV Ensemble Strategy  |  20 WDs MV Ensemble Strategy
:-------------------------:|:-------------------------:|
![](images/PGD_eps_MV_10.png)  |  ![](images/PGD_eps_MV_20.png) |

Comparison of All Defence Ensembles of Athena Used  |
:-------------------------:|
![](images/PGD_eps_compare.png)  |

### Discussion of the Effects of  Variation in Epsilon on PGD AEs' Evaluation

The PGD AEs generated with variation in epsilon values are plotted above, and it can be observed that the distortion in the images increases with the increasing value of epsilon. 

Changing the ensemble strategy does not seem to affect the error rate of the [Athena](https://arxiv.org/pdf/2001.00308.pdf) defense. However, when the number of WDs is increased from 10 to 20, the error rates of the [Vanilla Athena](https://arxiv.org/pdf/2001.00308.pdf) defense are lower than those of the PGD-ADT defense. Hence, the diversity of the WDs matters when defending against PGD attacks with increasing epsilon values.


## Variation in Epsilon-step

$\epsilon  step$: 0.015  |  $\epsilon  step$: 0.02 | $\epsilon  step$: 0.03 | $\epsilon  step$: 0.06 | $\epsilon  step$: 0.10  | $\epsilon  step$: 0.15
:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|
![](images/PGD_eps0.30_estep_0.015_task1.png)  |  ![](images/PGD_eps0.30_estep_0.02_task1.png) |  ![](images/PGD_eps0.30_estep_0.03_task1.png) |  ![](images/PGD_eps0.30_estep_0.06_task1.png) |  ![](images/PGD_eps0.30_estep_0.10_task1.png) |  ![](images/PGD_eps0.30_estep_0.15_task1.png)

### Comparison of Average Output based-on Probability (AVEP) and Majority Voting (MV) Ensemble Strategies

10 WDs AVEP Ensemble Strategy  |  20 WDs AVEP Ensemble Strategy
:-------------------------:|:-------------------------:|
![](images/PGD_estep_AVEP_10.png)  |  ![](images/PGD_estep_AVEP_20.png) |

10 WDs MV Ensemble Strategy  |  20 WDs MV Ensemble Strategy
:-------------------------:|:-------------------------:|
![](images/PGD_estep_MV_10.png)  |  ![](images/PGD_estep_MV_20.png) |

Comparison of All Defence Ensembles of Athena Used  |
:-------------------------:|
![](images/PGD_estep_compare.png)  |

### Discussion of the Effects of  Variation in Epsilon-step on PGD AEs' Evaluation

The AEs generated by PGD plotted above appear more blurry as the value of epsilon-step increases. 

As with the variation in epsilon, the number of WDs does seem to affect the [Vanilla Athena](https://arxiv.org/pdf/2001.00308.pdf) defense against this type of PGD attack, for both the MV and AVEP ensemble strategies. For this type of PGD attack, the PGD-ADT defense is better than the [Athena](https://arxiv.org/pdf/2001.00308.pdf) defense for **epsilon-step values** greater than **``0.03``**.

We have also observed that a change in the ensemble strategy from MV to AVEP (or vice-versa) has little or no impact on the effectiveness of the adversarial examples in fooling the three models.

## Variation in Maximum Iterations

Max. iterations: 5  |  Max. iterations: 7 | Max. iterations: 10 | Max. iterations: 15 | Max. iterations: 20 
:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|
![](images/PGD_eps0.30_max_iter_5_task1.png)  |  ![](images/PGD_eps0.30_max_iter_7_task1.png) |  ![](images/PGD_eps0.30_max_iter_10_task1.png) |  ![](images/PGD_eps0.30_max_iter_15_task1.png) |  ![](images/PGD_eps0.30_max_iter_20_task1.png)

### Comparison of Average Output based-on Probability (AVEP) and Majority Voting (MV) Ensemble Strategies

10 WDs AVEP Ensemble Strategy  |  20 WDs AVEP Ensemble Strategy
:-------------------------:|:-------------------------:|
![](images/PGD_maxiter_AVEP_10.png)  |  ![](images/PGD_maxiter_AVEP_20.png) |

10 WDs MV Ensemble Strategy  |  20 WDs MV Ensemble Strategy
:-------------------------:|:-------------------------:|
![](images/PGD_maxiter_MV_10.png)  |  ![](images/PGD_maxiter_MV_20.png) |

Comparison of All Defence Ensembles of Athena Used  |
:-------------------------:|
![](images/PGD_maxiter_compare.png)  |

### Discussion of the Effects of  Maximum Iterations on PGD AEs' Evaluation

The AEs generated by PGD plotted above appear more distorted as the maximum number of iterations increases. Hence, the PGD attack is able to craft stronger AEs when allowed a higher number of iterations.

Here, the [Athena](https://arxiv.org/pdf/2001.00308.pdf) defense responds much as it did to the epsilon variations of the PGD attack. Increasing the number of WDs decreased the error rate of Athena, which outperformed the PGD-ADT defense for all values of maximum iterations except 20.



# BIM Generated AEs

## Variation in Epsilon for Maximum Iteration of 60

$\epsilon$: 0.08  |  $\epsilon$: 0.15 | $\epsilon$: 0.20 | $\epsilon$: 0.25 | $\epsilon$: 0.30  | $\epsilon$: 0.40
:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|
![](images/BIM_eps0.08_max_iter_60_task1.png)  |  ![](images/BIM_eps0.15_max_iter_60_task1.png) |  ![](images/BIM_eps0.20_max_iter_60_task1.png) |  ![](images/BIM_eps0.25_max_iter_60_task1.png) |  ![](images/BIM_eps0.30_max_iter_60_task1.png) |  ![](images/BIM_eps0.40_max_iter_60_task1.png)

### Comparison of Average Output based-on Probability (AVEP) and Majority Voting (MV) Ensemble Strategies

10 WDs AVEP Ensemble Strategy  |  20 WDs AVEP Ensemble Strategy
:-------------------------:|:-------------------------:|
![](images/BIM_60_iter_AVEP_10.png)  |  ![](images/BIM_60_iter_AVEP_20.png) |

10 WDs MV Ensemble Strategy  |  20 WDs MV Ensemble Strategy
:-------------------------:|:-------------------------:|
![](images/BIM_60_iter_MV_10.png)  |  ![](images/BIM_60_iter_MV_20.png) |

Comparison of All Defence Ensembles of Athena Used  |
:-------------------------:|
![](images/BIM_60_compare.png)  |



## Variation in Epsilon for Maximum Iteration of 100

$\epsilon$: 0.08  |  $\epsilon$: 0.15 | $\epsilon$: 0.20 | $\epsilon$: 0.25 | $\epsilon$: 0.30  | $\epsilon$: 0.40
:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|
![](images/BIM_eps0.08_max_iter_100_task1.png)  |  ![](images/BIM_eps0.15_max_iter_100_task1.png) |  ![](images/BIM_eps0.20_max_iter_100_task1.png) |  ![](images/BIM_eps0.25_max_iter_100_task1.png) |  ![](images/BIM_eps0.30_max_iter_100_task1.png) |  ![](images/BIM_eps0.40_max_iter_100_task1.png)

### Comparison of Average Output based-on Probability (AVEP) and Majority Voting (MV) Ensemble Strategies

10 WDs AVEP Ensemble Strategy  |  20 WDs AVEP Ensemble Strategy
:-------------------------:|:-------------------------:|
![](images/BIM_100_iter_AVEP_10.png)  |  ![](images/BIM_100_iter_AVEP_20.png) |

10 WDs MV Ensemble Strategy  |  20 WDs MV Ensemble Strategy
:-------------------------:|:-------------------------:|
![](images/BIM_100_iter_MV_10.png)  |  ![](images/BIM_100_iter_MV_20.png) |


Comparison of All Defence Ensembles of Athena Used  |
:-------------------------:|
![](images/BIM_100_compare.png)  |

### Discussion of the BIM Attack AEs' Evaluation

The AEs generated by BIM, shown above, appear more distorted with increasing values of epsilon.

The number of WDs does not seem to affect the performance of the Athena defense against BIM with the settings used. For both values of maximum iterations, the [Vanilla Athena](https://arxiv.org/pdf/2001.00308.pdf) model performed better than the PGD-ADT defense only when epsilon was ``0.08``, ``0.15`` or ``0.40``.

Since we obtained similar error rates for maximum iterations of ``60`` and ``100``, we can conclude that, for all values of epsilon, the BIM iteration converged on or before the 60th iteration. Hence, the optimum number of iterations for this attack is at most 60.

# Conclusion

The following conclusions can be made in the context of Zero-knowledge threat model:

 * Increasing the number of weak defenses from 10 to 20 improved the defense of Athena by decreasing the error rate, most clearly for the PGD attacks (no such effect was observed for BIM). Hence, at the cost of additional computation and time, adding more weak defenses can improve Athena's defense so that fewer AEs are able to fool the model.

 * Changing the ensemble strategy from Average Output based-on Probability (AVEP) to Majority Voting (MV) had little or no noticeable effect on Athena's defense.

 * The relative performance of the Athena and PGD-ADT defenses depends on the type of attack and the values of the attack parameters.

 * For iterative attacks like BIM and PGD, the adversarial examples become stronger with a higher number of iterations until the attack converges.


# Contribution

All members contributed to the success of task1.

``Rasika (Rasika-prog)``
 * Tuning of codes to generate Adversarial Examples
 * Code debugging
 * Compilation and Analysis of results and report
 * Management of Group meetings
 
``Olajide (42n8dzydoo)``
 * Tuning of codes to generate Adversarial Examples
 * Code debugging
 * Compilation of results and report
 * Plotting of the graphs
 * Formatting of report in Markdown
 
``Joshua (ojihjo)``
 * Tuning of codes to generate Adversarial Examples
 * Code debugging
 * Compilation and Analysis of results and report
 * Co-management of Group meetings
 
``Kaveh (kavehshariati)``
 * Tuning of codes to generate Adversarial Examples
 * Code debugging
 * Compilation and Analysis of results and report
 * Co-management of Group meetings

# References

* Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. In _International Conference on Learning Representations (ICLR)_. [_arXiv:1412.6572_](https://arxiv.org/pdf/1412.6572.pdf), 2015.


* Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and Vladu, A. Towards Deep Learning Models Resistant to Adversarial Attacks. [_arXiv:1706.06083_](https://arxiv.org/pdf/1706.06083.pdf), 2019.


* Alexey Kurakin, Ian J. Goodfellow, Samy Bengio. Adversarial Machine Learning at Scale. In _International Conference on Learning Representations (ICLR)_. [_arXiv:1611.01236v2_](https://arxiv.org/pdf/1611.01236.pdf), 2017.


* Ying Meng, Jianhai Su, Jason M. O’Kane, Pooyan Jamshidi. [ATHENA](https://arxiv.org/pdf/2001.00308.pdf): A Framework based on Diverse Weak Defenses for Building Adversarial Defense. [_arXiv:2001.00308v2_](https://arxiv.org/pdf/2001.00308.pdf), 2020.
     * code GitHub repo: [ATHENA](https://github.com/csce585-mlsystems/project-athena)
