# A Learning Based Strategy for Athena

## Introduction

Athena offers several strategies for combining the predictions of its ensemble of weak defenses into a final prediction. These strategies are relatively simple, which leaves room for a learned alternative. Our strategy passes the predictions of the ensemble through a neural network trained to produce the final prediction.

### Approach

The outline of our approach is as follows:
1. Generate subset of benign samples and labels for training and validation
2. Collect predictions from the ensemble for training and validation
3. Train model with ensemble predictions and confirm data fitting
4. Evaluate model on adversarial examples
5. Compare results with other available strategies

### 1. Generate Subset

This step separates the test data from the training data, and helps avoid over-fitting the model to any particular data. The benign samples were not part of the comparison and final results, which avoids biasing the evaluation. Labels were saved immediately, while the samples went through further processing.

- **Total training data: 8000**
- **Total validation data: 2000**

#### Relevant files
- Data Configurations (benign samples and subsamples): configs/experiment/data-mnist.json
- Subsampling: learning_based_strategy/utils/data.py
- Subset Generated in: learning_based_strategy/collect_raws.py

#### Parameters

- Data set *Data Configurations*: bs_file, label_file
- Subsamples output directory *Data Configurations*: training_dir
- Subsample output *Data Configurations*: training_labels_file, validation_labels_file
- Subsample ratio: 80% training, 20% validation (ratio=0.2)
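
The 80/20 split can be sketched as follows. This is a minimal illustration and not the code in `learning_based_strategy/utils/data.py`; the function name `split_train_validation`, the `seed` argument, and the stand-in arrays are assumptions made for the example.

```python
import numpy as np

def split_train_validation(samples, labels, ratio=0.2, seed=0):
    """Shuffle paired samples/labels and hold out `ratio` of them for validation."""
    assert len(samples) == len(labels)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(samples))
    n_val = int(round(len(samples) * ratio))
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (samples[train_idx], labels[train_idx]), (samples[val_idx], labels[val_idx])

# Stand-in data with MNIST-like shapes (hypothetical, not the real benign samples).
samples = np.zeros((10000, 28, 28), dtype=np.uint8)
labels = np.arange(10000) % 10

(train_x, train_y), (val_x, val_y) = split_train_validation(samples, labels, ratio=0.2)
print(train_x.shape[0], val_x.shape[0])  # 8000 2000
```

Indexing samples and labels with the same index arrays keeps each image paired with its label after shuffling.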

### 2. Collect Predictions

To train the model, we need the predictions of each weak defense used. We take the raw predictions: for each image, a list of ten float values, each giving the confidence that its index is the correct label. Since this data is numerical rather than categorical, it is easier for the network to process. The raw predictions are gathered for training and validation, and saved to their respective files.<br> <br>**An important note:** This design requires a fixed number of weak defenses, specified in the *transformation configurations file*. In this experiment we used the first 30 weak defenses supplied by Athena. This number was chosen to maximize the number of weak defenses without pushing the limits of our hardware. We wanted to use as many weak defenses as possible to gauge the model's ability to use their respective predictions, in the hope that more information would increase the effectiveness of the model.

#### Relevant files

- Data Configurations: configs/experiment/data-mnist.json
- Athena Ensemble Configuration: configs/experiment/athena-mnist.json
- WD Model Configurations: configs/experiment/model-mnist.json
- Raw predictions generated in: learning_based_strategy/collect_raws.py

#### Parameters

- Weak defenses *Athena configuration*: 30
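
To make the shape of the collected data concrete, the sketch below flattens raw predictions into one row per image. The axis order `(num_wds, n_samples, 10)` and the helper name `flatten_raw_predictions` are assumptions for illustration; the layout used in `collect_raws.py` may differ.

```python
import numpy as np

def flatten_raw_predictions(raw_preds):
    """Turn (num_wds, n_samples, 10) raw predictions into (n_samples, num_wds * 10)."""
    raw_preds = np.asarray(raw_preds)
    num_wds, n_samples, n_classes = raw_preds.shape
    # Put the sample axis first so each row holds every weak defense's
    # confidence vector for one image, weak defense by weak defense.
    return raw_preds.transpose(1, 0, 2).reshape(n_samples, num_wds * n_classes)

rng = np.random.default_rng(0)
raw = rng.random((30, 5, 10))          # 30 weak defenses, 5 images, 10 classes
features = flatten_raw_predictions(raw)
print(features.shape)                  # (5, 300)
```

In each row, columns 0 to 9 come from the first weak defense, columns 10 to 19 from the second, and so on.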

### 3. Train Model

After generating our dataset, we are ready to train the model. But first we will go over the model design:<br>
<br>
**Model Design**<br>
The model is sequential, with 1 input layer, 3 hidden layers, and 1 output layer. The input from the weak defenses is flattened to one dimension of length *num_wds*x*10*. The number of weak defenses is passed as a parameter to the model, allowing this architecture to be used with any number of weak defenses. This was mostly for ease of testing the code, but it also allows for customization should the experiment be reproduced. The layers use *tanh*, *relu*, and *sigmoid* activations (listed under Parameters below); since the input and output values both lie in (0,1), other choices could potentially give better results. The loss function is sparse categorical crossentropy, which lets the integer labels be compared directly against the length 10 output layer.
<br><br>
**Training**<br>
The model is trained on the previously collected raw predictions, with the correct labels as targets. We used 10 epochs with a batch size of 100. We found this to be a good middle ground: in our runs, a smaller batch size kept the model from settling into a minimum, and more epochs pushed it back out of one.

#### Relevant files

- Neural network design and training: learning_based_strategy/models/nn.py
- Neural network saved to: learning_based_strategy/learning-strategy-nn.h5
- Parse training data and send for training: learning_based_strategy/train_model.py

#### Parameters

- Input shape: num_wds * 10
- Output shape: 10 (with sparse categorical crossentropy, each label is 1 integer)
- Activations: tanh (input), relu (hidden), sigmoid (output)
- Epochs: 10
- Batch Size: 100
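
The parameters above can be illustrated with a plain NumPy forward pass and loss computation. This is only a sketch and not the Keras model in `learning_based_strategy/models/nn.py`: the layer widths, the random weights, and the row normalisation inside the loss are all assumptions made so the example is self-contained.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """tanh on the input layer, relu on the hidden layers, sigmoid on the output layer."""
    h = np.tanh(x @ weights[0] + biases[0])
    for w, b in zip(weights[1:-1], biases[1:-1]):
        h = np.maximum(0.0, h @ w + b)
    return sigmoid(h @ weights[-1] + biases[-1])

def sparse_categorical_crossentropy(outputs, labels, eps=1e-7):
    """Mean of -log p[label], with each output row normalised to sum to one (assumption)."""
    probs = outputs / outputs.sum(axis=1, keepdims=True)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(np.clip(picked, eps, 1.0)).mean())

num_wds = 30
# Input width, one tanh layer, three relu layers, 10 outputs; widths are hypothetical.
layer_sizes = [num_wds * 10, 256, 128, 64, 32, 10]
rng = np.random.default_rng(0)
weights = [rng.normal(0, 0.1, (a, b)) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(b) for b in layer_sizes[1:]]

x = rng.random((100, num_wds * 10))     # one batch of 100 flattened prediction rows
y = rng.integers(0, 10, size=100)       # integer labels, no one-hot encoding needed
out = forward(x, weights, biases)
loss = sparse_categorical_crossentropy(out, y)
pred = out.argmax(axis=1)               # final predicted digit per image
print(out.shape, pred.shape)            # (100, 10) (100,)
```

The loss indexes the output row with the integer label directly, which is why the labels do not need to be one-hot encoded.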

### 4. Evaluate Model on Adversarial Examples

The model was evaluated on all the adversarial examples (AEs) provided with Athena. To do this, each AE file was subsampled together with the label file, and the AEs were tested on all relevant models (more on this in the next section). After all predictions were made and the results stored, the next AE file was loaded and a new subsample generated with the label file. While not every AE file had exactly the same subsample, the aim of the experiment was to gauge the effectiveness of the model as a strategy for Athena. To support this, a new ENSEMBLE_STRATEGY was created within Athena that takes the raw predictions of the ensemble and passes them to the previously trained model for the final prediction.
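
A rough sketch of what such a strategy does at prediction time is shown here. It is not the code in `learning_based_strategy/models/athena.py`: `learned_strategy`, the `(num_wds, n_samples, 10)` input layout, and `stand_in_model` (a placeholder for the trained network) are all hypothetical.

```python
import numpy as np

def learned_strategy(raw_preds, model):
    """Final labels from ensemble raw predictions of shape (num_wds, n_samples, 10)."""
    raw_preds = np.asarray(raw_preds)
    num_wds, n_samples, n_classes = raw_preds.shape
    features = raw_preds.transpose(1, 0, 2).reshape(n_samples, num_wds * n_classes)
    scores = model(features)            # expected shape: (n_samples, 10)
    return np.argmax(scores, axis=1)

def stand_in_model(features):
    """Placeholder for the trained network: sums each class over all weak defenses."""
    return features.reshape(len(features), -1, 10).sum(axis=1)

raw = np.zeros((30, 4, 10))
raw[:, 0, 3] = 1.0   # every weak defense votes class 3 for image 0
raw[:, 1, 7] = 1.0
raw[:, 2, 0] = 1.0
raw[:, 3, 9] = 1.0
print(learned_strategy(raw, stand_in_model))   # [3 7 0 9]
```

In the real pipeline the `model` argument would be the network loaded from `learning-strategy-nn.h5`, so the strategy itself only has to reshape the ensemble output and take the arg max of the network's scores.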

#### Relevant files

- Evaluation of models: learning_based_strategy/test_benign_training.py
- Subsampling: learning_based_strategy/utils/data.py
- Athena: learning_based_strategy/models/athena.py
- Model: learning_based_strategy/learning-strategy-nn.h5

#### Parameters

- Sampling ratio: 20% of AEs from each file were used (ratio=0.8)

### 5. Compare Results

Steps 4 and 5 occur almost simultaneously. For every AE file and its generated subset, predictions were made by four models and the results gathered: the undefended model, the Athena ensemble using AVEP (with the same 30 weak defenses used to train the learning based strategy), Athena using the learning based strategy, and PGD-ADT. Each AE type has its own output file containing the error rates of all models on each variation of that AE.
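
The comparison boils down to computing one error rate per model on the same subsample. The sketch below assumes that AVEP averages the weak defenses' per-class confidences before taking the arg max; the helper names and all prediction values are made up for illustration and are not results from the experiment.

```python
import numpy as np

def avep(raw_preds):
    """Average the per-class confidences over weak defenses, then take the arg max."""
    return np.mean(raw_preds, axis=0).argmax(axis=1)

def error_rate(predicted, labels):
    """Fraction of samples whose predicted label differs from the true label."""
    return float(np.mean(np.asarray(predicted) != np.asarray(labels)))

labels = np.array([3, 7, 0, 9])
raw = np.full((30, 4, 10), 0.05)        # (num_wds, n_samples, 10), toy values
for i, y in enumerate(labels):
    raw[:, i, y] = 0.55
raw[:, 3, 9] = 0.05
raw[:, 3, 4] = 0.55                     # the ensemble is fooled on the last image

results = {
    "ensemble_avep": error_rate(avep(raw), labels),
    "undefended": error_rate([3, 1, 0, 4], labels),   # made-up predictions
}
print(results)   # {'ensemble_avep': 0.25, 'undefended': 0.5}
```

Collecting the rates in one dictionary per AE variant mirrors the idea of writing one results file per attack type.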

#### Relevant files

- Evaluation: learning_based_strategy/test_benign_training.py
- Data configurations (contain AE's): configs/experiment/data-mnist.json
- Output specifications: configs/experiment/results.json
- Output directory: learning_based_strategy/results/

### Results
- Plots Generated with learning_based_strategy/generate_plots.py

<table> <tr>
    <td> <img src="../src/learning_based_strategy/results/fgsm.png" width="400"/> </td>
    <td> <img src="../src/learning_based_strategy/results/bim_ord2.png" width="400"/> </td>
    <td> <img src="../src/learning_based_strategy/results/bim_ordinf.png" width="400"/> </td>
</tr> </table>
<table> <tr>
    <td> <img src="../src/learning_based_strategy/results/cw_l2.png" width="400"/> </td>
    <td> <img src="../src/learning_based_strategy/results/deepfool_l2.png" width="400"/> </td>
    <td> <img src="../src/learning_based_strategy/results/jsma.png" width="400"/> </td>
</tr> </table>
<table> <tr>
    <td> <img src="../src/learning_based_strategy/results/pgd.png" width="400"/> </td>
    <td> <img src="../src/learning_based_strategy/results/mim.png" width="400"/> </td>
    <td> <img src="../src/learning_based_strategy/results/onepixel.png" width="400"/> </td>
</tr> </table>

The plots above show the error rate of each model on each attack type and on each variant of that attack type. In general, the learning based strategy outperforms the other models for low degrees of perturbation, i.e. lower epsilon values. There are some anomalies, for example JSMA and CW, where the learning based strategy *appears* to improve with higher perturbation; it is unclear what causes this. All models seem to perform extremely poorly on DeepFool, even at lower levels of perturbation. For onepixel and FGSM, the ensemble is superior, and for FGSM PGD-ADT is the best. On the remaining attack types, the learning based strategy stays roughly level with the ensemble in terms of error rate, and both have lower error rates than PGD-ADT.

### Conclusion

Our implementation of a learning based strategy for Athena appears to offer a small advantage in certain scenarios and for specific attack types, for example at low levels of perturbation. In general, the predictions of weak defenses provide an advantage over no defense or PGD-ADT (at least for this configuration of weak defenses, and not against FGSM). One problem could be the loss of semantic meaning of the data inside the model: the data from the weak defenses is all mixed together, and there may not be much of a pattern to find in it. Using more weak defenses would be ideal, but the loss of information in the training data would remain an issue. One solution could be to separate the input layers by weak defense, allowing each weak defense's prediction to be processed individually before being concatenated. Other activations should also be tested, preferably sigmoid, to match the range of the input and output data, as this may improve performance. 8000 training samples may also not be enough to train the model fully; the full 60k dataset (or at least a larger percentage of it) would be ideal. A final remark concerns the potential of using adversarial examples to train the model: if at least a portion of the training samples were slightly perturbed, this could help the model find a pattern in the weak defense input. This introduces a risk of over-fitting, but a large and varied dataset should mitigate it.

Contributions:
- Approach: Miles Ziemer, Max Corbel, Shuge Lei, Safi Hoque
- Code: Miles Ziemer, Max Corbel
- Data gathering: Miles Ziemer
- Testing and interpretation/validation of results: Miles Ziemer, Max Corbel, Shuge Lei, Safi Hoque
- Report: Miles Ziemer, Max Corbel, Shuge Lei, Safi Hoque