# Tensorflow MNIST-LeNet Patch Demo

This notebook demonstrates the adversarial patch attack applied on the LeNet model, as well as preprocessing and adversarial training defenses.

The following two sections cover experiment setup and is similar across all demos. 


## Setup: Experiment Name and MNIST Dataset

Here we will import the necessary Python modules and ensure the proper environment variables are set so that all the code blocks will work as expected.

### Important: Users will need to verify or update the following parameters:
- Ensure that the `USERNAME` parameter is set to your own name.
- Ensure that the `DATASET_DIR` parameter is set to the location of the MNIST dataset directory. Currently set to `/nfs/data/Mnist` as the default location.
- (Optional) Set the `EXPERIMENT_NAME` parameter to your own preferred experiment name.

Other parameters can be modified to alter the RESTful API and MLFlow tracking addresses. 

In [72]:
# Import packages from the Python standard library
import os
import pprint
import time
import warnings
from pathlib import Path
from typing import Tuple

# Please enter custom username here.
USERNAME = "howard"

# Experiment name (note the username_ prefix convention)
EXPERIMENT_NAME = f"{USERNAME}_mnist_adversarial_patches"

# Ensure that the MNIST dataset location is properly set here.
DATASET_DIR = "/nfs/data/Mnist"

# Filter out warning messages
warnings.filterwarnings("ignore")

# Default address for accessing the RESTful API service
RESTAPI_ADDRESS = "http://localhost:30080"

# Base API address
RESTAPI_API_BASE = f"{RESTAPI_ADDRESS}/api"

# Default address for accessing the MLFlow Tracking server
MLFLOW_TRACKING_URI = "http://localhost:35000"

# Path to workflows archive
WORKFLOWS_TAR_GZ = Path("workflows.tar.gz")

# Set MLFLOW_TRACKING_URI variable, used to connect to MLFlow Tracking service
if os.getenv("MLFLOW_TRACKING_URI") is None:
    os.environ["MLFLOW_TRACKING_URI"] = MLFLOW_TRACKING_URI

# Import third-party Python packages
import numpy as np
import requests
from mlflow.tracking import MlflowClient

# Import utils.py file
import utils

# Create random number generator
rng = np.random.default_rng(54399264723942495723666216079516778448)

Check that the Makefile works in your environment by executing the `bash` code block below,

In [71]:
%%bash

# Running this will list the available rules defined in the demo's Makefile.
make

[1mAvailable rules:[m

[36mclean              [m Remove temporary files 
[36mdata               [m Download and prepare MNIST dataset 
[36minitdb             [m Initialize the RESTful API database 
[36mservices           [m Launch the Minio S3 and MLFlow Tracking services 
[36mteardown           [m Destroy service containers 
[36mworkflows          [m Create workflows tarball 


### MNIST Dataset Overview:
The training and testing images in this directory are saved as PNG files and are organized into the following folder structure,

    Mnist
    ├── testing
    │   ├── 0
    │   ├── 1
    │   ├── 2
    │   ├── 3
    │   ├── 4
    │   ├── 5
    │   ├── 6
    │   ├── 7
    │   ├── 8
    │   └── 9
    └── training
        ├── 0
        ├── 1
        ├── 2
        ├── 3
        ├── 4
        ├── 5
        ├── 6
        ├── 7
        ├── 8
        └── 9

The subfolders under `Mnist/training/` and `Mnist/testing/` are the classification labels for the images in the dataset.

This folder structure is a standardized way to encode the label information and many libraries can make use of it, including the Tensorflow library that we are using for this particular demo.

## Setup: REST API and Source Code

The jobs that we will be running are implemented in the Python source files under `src/`, which will be executed using the entrypoints defined in the `MLproject` file.
To get this information into the architecture, we need to package those files up into an archive and upload it to the lab API.
For convenience, the `Makefile` provides a rule for creating the archive file, just run `make workflows`,

In [32]:
%%bash

# Create the workflows.tar.gz file
make workflows

make: Nothing to be done for 'workflows'.


To connect with the endpoint, we will use a client class defined in the `utils.py` file that is able to connect with the lab's RESTful API using the HTTP protocol.
We connect using the client below,

In [73]:
restapi_client = utils.SecuringAIClient(address=RESTAPI_API_BASE)

We need to register an experiment under which to collect our job runs.
The code below checks if the relevant experiment exists.
If it does, then it just returns info about the experiment, if it doesn't, it then registers the new experiment.

In [77]:
response_experiment = restapi_client.get_experiment_by_name(name=EXPERIMENT_NAME)

if response_experiment is None or "Not Found" in response_experiment.get("message", []):
    response_experiment = restapi_client.register_experiment(name=EXPERIMENT_NAME)

response_experiment

{'createdOn': '2020-11-05T08:45:14.946823',
 'experimentId': 10,
 'lastModified': '2020-11-05T08:45:14.946823',
 'name': 'howard_mnist_adversarial_patches'}

### The following sections will now cover the adversarial patch demo.






# Adversarial Patches: Baseline MNIST Training

Now, we will train our baseline LeNet-5 model on the MNIST dataset. 
We will be submitting our jobs to the `"tensorflow_gpu"` queue.

Once the experiment is finished, we will examine the accuracy results of our model.

In [94]:
# Create and submit training job.

response_le_net_train = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="train",
    entry_point_kwargs=" ".join([
        "-P batch_size=256",
        "-P register_model=True",
        "-P model_architecture=le_net",
        "-P epochs=30",
        f"-P data_dir_train={DATASET_DIR}/training",
        f"-P data_dir_test={DATASET_DIR}/testing",
    ]),
    queue="tensorflow_gpu",
    timeout="1h",
)

print("Training job for LeNet-5 neural network submitted")
print("")
pprint.pprint(response_le_net_train)

Training job for LeNet-5 neural network submitted

{'createdOn': '2021-02-11T00:39:25.356043',
 'dependsOn': None,
 'entryPoint': 'train',
 'entryPointKwargs': '-P batch_size=256 -P register_model=True -P '
                     'model_architecture=le_net -P epochs=30 -P '
                     'data_dir_train=/nfs/data/Mnist/training -P '
                     'data_dir_test=/nfs/data/Mnist/testing',
 'experimentId': 10,
 'jobId': 'e6df0b0d-ec6b-4087-a7b1-4ab97064cfc6',
 'lastModified': '2021-02-11T00:39:25.356043',
 'mlflowRunId': None,
 'queueId': 2,
 'status': 'queued',
 'timeout': '1h',
 'workflowUri': 's3://workflow/b8f949bd5661440b877e12683807557f/workflows.tar.gz'}


In [143]:
# The following helper functions will recheck the job responses until the job is completed or a run ID is available. 
# The run ID is needed to link dependencies between jobs.

def mlflow_run_id_is_not_known(job_response):
    return job_response["mlflowRunId"] is None and job_response["status"] not in [
        "failed",
        "finished",
    ]

def get_run_id(job_response):
    while mlflow_run_id_is_not_known(job_response):
        time.sleep(1)
        job_response = restapi_client.get_job_by_id(job_response["jobId"])
        
    return job_response

def wait_until_finished(job_response):
    # First make sure job has started.
    job_response = get_run_id(job_response)
    
    # Next re-check job until it has stopped running.
    while (job_response["status"] not in ["failed", "finished"]):
        time.sleep(1)
        job_response = restapi_client.get_job_by_id(job_response["jobId"])
    
    return job_response

    

In [100]:
# Now wait for the job to complete before proceeding to next steps.
response_le_net_train = wait_until_finished(response_le_net_train)
print("Training job for LeNet-5 neural network")
pprint.pprint(response_le_net_train)

Training job for LeNet-5 neural network
{'createdOn': '2021-02-11T00:39:25.356043',
 'dependsOn': None,
 'entryPoint': 'train',
 'entryPointKwargs': '-P batch_size=256 -P register_model=True -P '
                     'model_architecture=le_net -P epochs=30 -P '
                     'data_dir_train=/nfs/data/Mnist/training -P '
                     'data_dir_test=/nfs/data/Mnist/testing',
 'experimentId': 10,
 'jobId': 'e6df0b0d-ec6b-4087-a7b1-4ab97064cfc6',
 'lastModified': '2021-02-11T00:39:28.968235',
 'mlflowRunId': 'fdc1fbd940bf4f2794a6d1c6b789d5b2',
 'queueId': 2,
 'status': 'started',
 'timeout': '1h',
 'workflowUri': 's3://workflow/b8f949bd5661440b877e12683807557f/workflows.tar.gz'}


## Checking baseline MNIST job accuracy.

Once the job has finished running we can view the results either through the MLflow URI or by accessing the job via MLflow client.

Here we will show the baseline accuracy results from the previous training job.

Please see [Querying the MLFlow Tracking Service](#querying_cell) section for more details. 

In [140]:
def get_mlflow_results(job_response):
    mlflow_client = MlflowClient()
    job_response = wait_until_finished(job_response)
    if(job_response['status']=="failed"):
        return {}
    
    run = mlflow_client.get_run(job_response["mlflowRunId"])  
    while(len(run.data.metrics) == 0):
        time.sleep(1)
        run = mlflow_client.get_run(job_response["mlflowRunId"])
        
    return run
        
results = get_mlflow_results(response_le_net_train)
pprint.pprint(results.data.metrics)

{'accuracy': 0.9820223450660706,
 'auc': 0.9992511868476868,
 'fn': 975.0,
 'fp': 752.0,
 'loss': 0.05927685966999091,
 'precision': 0.9842615127563477,
 'recall': 0.9796891808509827,
 'restored_epoch': 3.0,
 'stopped_epoch': 8.0,
 'tn': 431284.0,
 'tp': 47029.0,
 'training_time_in_minutes': 5.73557385,
 'val_accuracy': 0.9863287806510925,
 'val_auc': 0.9989493489265442,
 'val_fn': 175.0,
 'val_fp': 147.0,
 'val_loss': 0.046896340751505276,
 'val_precision': 0.9877172708511353,
 'val_recall': 0.9854118227958679,
 'val_tn': 107817.0,
 'val_tp': 11821.0}


With the baseline training settings, it appears that the LeNet model has been properly trained on MNIST dataset.

## Deploying and Testing Adversarial Patches

Now we will create and apply the adversarial patches over our test set and evaluate the performance of the baseline model on the adversarial patches.

We will also apply the patches over the training set for the adversarial training defense evaluation.

### Patch Generation:
The following job will generate the adversarial patches. 

Feel free to adjust the input parameters to see how they impact the effectiveness of the patch attack.

In [102]:
# Create Patches
response_le_net_patches = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="gen_patch",
    entry_point_kwargs=" ".join(
        [
            f"-P model={EXPERIMENT_NAME}_le_net/1",
            "-P model_architecture=le_net",
            f"-P data_dir={DATASET_DIR}/training",
            "-P num_patch_gen_samples=40",
            "-P num_patch=3",
            "-P patch_target=5",
        ]
    ),
    queue="tensorflow_gpu",
    depends_on=response_le_net_train["jobId"],
)

print("Patch attack (LeNet-5 architecture) job submitted")
print("")
pprint.pprint(response_le_net_patches)
print("")

# Wait for Patch attack to finish.
response_le_net_patches = wait_until_finished(response_le_net_patches)

Patch attack (LeNet-5 architecture) job submitted

{'createdOn': '2021-02-11T00:58:39.351362',
 'dependsOn': 'e6df0b0d-ec6b-4087-a7b1-4ab97064cfc6',
 'entryPoint': 'gen_patch',
 'entryPointKwargs': '-P model=howard_mnist_adversarial_patches_le_net/1 -P '
                     'model_architecture=le_net -P '
                     'data_dir=/nfs/data/Mnist/training -P '
                     'num_patch_gen_samples=40 -P num_patch=3 -P '
                     'patch_target=5',
 'experimentId': 10,
 'jobId': 'e07c4f80-422f-4655-8821-9fb7d9143009',
 'lastModified': '2021-02-11T00:58:39.351362',
 'mlflowRunId': None,
 'queueId': 2,
 'status': 'queued',
 'timeout': '24h',
 'workflowUri': 's3://workflow/dd5cb95d32e244bba00b63821ef958b6/workflows.tar.gz'}



### Patch Deployment:

The following jobs will deploy the patches over the training and test sets.

In [103]:
# Deploy Patch attack on training set.
response_deploy_le_net_patches_mnist_training = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="deploy_patch",
    entry_point_kwargs=" ".join(
        [
            f"-P run_id={response_le_net_patches['mlflowRunId']}",
            f"-P model={EXPERIMENT_NAME}_le_net/1",
            "-P model_architecture=le_net",
            f"-P data_dir={DATASET_DIR}/training",
        ]
    ),
    queue="tensorflow_gpu",
    depends_on=response_le_net_patches["jobId"],
)

print("Patch deployment (LeNet-5 architecture) job submitted")
print("")
pprint.pprint(response_deploy_le_net_patches_mnist_training)
print("")

# Get the run ID of the test set.
response_deploy_le_net_patches_mnist_training = get_run_id(response_deploy_le_net_patches_mnist_training)

Patch deployment (LeNet-5 architecture) job submitted

{'createdOn': '2021-02-11T01:01:03.350803',
 'dependsOn': 'e07c4f80-422f-4655-8821-9fb7d9143009',
 'entryPoint': 'deploy_patch',
 'entryPointKwargs': '-P run_id=a29a543b7d2443268af4a6c97254db30 -P '
                     'model=howard_mnist_adversarial_patches_le_net/1 -P '
                     'model_architecture=le_net -P '
                     'data_dir=/nfs/data/Mnist/training',
 'experimentId': 10,
 'jobId': 'f8a6ea17-fe13-4809-940d-9edddb315219',
 'lastModified': '2021-02-11T01:01:03.350803',
 'mlflowRunId': None,
 'queueId': 2,
 'status': 'queued',
 'timeout': '24h',
 'workflowUri': 's3://workflow/2c372deb736d4158aee53c56254a4e56/workflows.tar.gz'}



In [105]:
# Deploy Patch attack on test set.
response_deploy_le_net_patches_mnist_testing = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="deploy_patch",
    entry_point_kwargs=" ".join(
        [
            f"-P run_id={response_le_net_patches['mlflowRunId']}",
            f"-P model={EXPERIMENT_NAME}_le_net/1",
            "-P model_architecture=le_net",
            f"-P data_dir={DATASET_DIR}/testing",
            "-P patch_deployment_method=corrupt"
        ]
    ),
    queue="tensorflow_gpu",
    depends_on=response_le_net_patches["jobId"],
)

print("Patch deployment (LeNet-5 architecture) job submitted")
print("")
pprint.pprint(response_deploy_le_net_patches_mnist_testing)
print("")

# Get the run ID of the training set.
response_deploy_le_net_patches_mnist_testing = get_run_id(response_deploy_le_net_patches_mnist_testing)

Patch deployment (LeNet-5 architecture) job submitted

{'createdOn': '2021-02-11T01:02:02.769583',
 'dependsOn': 'e07c4f80-422f-4655-8821-9fb7d9143009',
 'entryPoint': 'deploy_patch',
 'entryPointKwargs': '-P run_id=a29a543b7d2443268af4a6c97254db30 -P '
                     'model=howard_mnist_adversarial_patches_le_net/1 -P '
                     'model_architecture=le_net -P '
                     'data_dir=/nfs/data/Mnist/testing -P '
                     'patch_deployment_method=corrupt',
 'experimentId': 10,
 'jobId': 'e24f6ccb-3ddc-4ba6-8027-17ac45198b4c',
 'lastModified': '2021-02-11T01:02:02.769583',
 'mlflowRunId': None,
 'queueId': 2,
 'status': 'queued',
 'timeout': '24h',
 'workflowUri': 's3://workflow/c7936ae6b0ba4a82b7daa36ba15a4257/workflows.tar.gz'}



## Patch Attack Evaluation: Baseline MNIST Model
Now we will run an inference step to check the patch-attacked dataset with our MNIST-trained model.

In [119]:
# Check patched dataset results   
response_le_net_infer_le_net_patch = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="infer",
    entry_point_kwargs=" ".join(
        [
            f"-P run_id={response_deploy_le_net_patches_mnist_testing['mlflowRunId']}",
            f"-P model={EXPERIMENT_NAME}_le_net/1",
            "-P model_architecture=le_net",
            "-P batch_size=512",
            "-P dataset_tar_name=adversarial_patch_dataset.tar.gz",
            "-P dataset_name=adv_patch_dataset",
        ]
    ),
    queue="tensorflow_gpu",
    depends_on=response_deploy_le_net_patches_mnist_testing["jobId"],
)

print("Patch evaluation (LeNet-5 architecture) job submitted")
print("")
pprint.pprint(response_deploy_le_net_patches_mnist_testing)
print("")

Patch evaluation (LeNet-5 architecture) job submitted

{'createdOn': '2021-02-11T01:02:02.769583',
 'dependsOn': 'e07c4f80-422f-4655-8821-9fb7d9143009',
 'entryPoint': 'deploy_patch',
 'entryPointKwargs': '-P run_id=a29a543b7d2443268af4a6c97254db30 -P '
                     'model=howard_mnist_adversarial_patches_le_net/1 -P '
                     'model_architecture=le_net -P '
                     'data_dir=/nfs/data/Mnist/testing -P '
                     'patch_deployment_method=corrupt',
 'experimentId': 10,
 'jobId': 'e24f6ccb-3ddc-4ba6-8027-17ac45198b4c',
 'lastModified': '2021-02-11T01:02:43.752820',
 'mlflowRunId': 'a137b8b5d62b46e8a4b4227af145405a',
 'queueId': 2,
 'status': 'finished',
 'timeout': '24h',
 'workflowUri': 's3://workflow/c7936ae6b0ba4a82b7daa36ba15a4257/workflows.tar.gz'}



In [126]:
# Wait for the job to finish
response_le_net_infer_le_net_patch = wait_until_finished(response_le_net_infer_le_net_patch)

# Check on the patch evaluation results
results = get_mlflow_results(response_le_net_infer_le_net_patch)
print("Baseline model results on adversarially patched dataset: ")
pprint.pprint(results.data.metrics)

Baseline model results on adversarially patched dataset: 
{'accuracy': 0.4181690812110901,
 'auc': 0.798317015171051,
 'fn': 6413.0,
 'fp': 375.0,
 'loss': 1.650673319513981,
 'precision': 0.9049670696258545,
 'recall': 0.3576722741127014,
 'tn': 89481.0,
 'tp': 3571.0}



We can see that the adversarial patch attack causes a noticeable decrease in the model's accuracy scores.

We will now test various defenses against the patch attacked images.

# Defenses: Image Preprocessing and Adversarial Training

The next part of the adversarial patch demo focuses on investigating effective defenses against the attack.

## Preprocessing Defenses: Spatial Smoothing, JPEG Compression, Gaussian Augmentation

Here we will investigate three preprocessing defenses that can be applied over the images before inference.

### Spatial Smoothing Defense:

Here, we can adjust the `spatial_smoothing_window_size` parameter to increase or decrease the sliding window of the smoothing defense.

Larger values will create more noticeable distortions but can also help mask any adversarial perturbations.

In [130]:
response_le_net_spatial_smoothing_test_set = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="spatial_smoothing",
    entry_point_kwargs=" ".join(
        [
            f"-P model={EXPERIMENT_NAME}_le_net/1",
            "-P model_architecture=le_net",
            f"-P data_dir={DATASET_DIR}/training",
            "-P batch_size=20",
            "-P load_dataset_from_mlruns=true",
            "-P spatial_smoothing_window_size=2",
            f"-P dataset_run_id={response_deploy_le_net_patches_mnist_testing['mlflowRunId']}",
            "-P dataset_tar_name=adversarial_patch_dataset.tar.gz",
            "-P dataset_name=adv_patch_dataset",
            
        ]
    ),
    queue="tensorflow_gpu",
    depends_on=response_deploy_le_net_patches_mnist_testing["jobId"],
)

print("Spatial smoothing defense job submitted")
print("")
pprint.pprint(response_le_net_spatial_smoothing_test_set)
print("")

response_le_net_spatial_smoothing_test_set = get_run_id(response_le_net_spatial_smoothing_test_set)

Spatial smoothing defense job submitted

{'createdOn': '2021-02-11T03:25:31.027419',
 'dependsOn': 'e24f6ccb-3ddc-4ba6-8027-17ac45198b4c',
 'entryPoint': 'spatial_smoothing',
 'entryPointKwargs': '-P model=howard_mnist_adversarial_patches_le_net/1 -P '
                     'model_architecture=le_net -P '
                     'data_dir=/nfs/data/Mnist/training -P batch_size=20 -P '
                     'load_dataset_from_mlruns=true -P '
                     'spatial_smoothing_window_size=2 -P '
                     'dataset_run_id=a137b8b5d62b46e8a4b4227af145405a -P '
                     'dataset_tar_name=adversarial_patch_dataset.tar.gz -P '
                     'dataset_name=adv_patch_dataset',
 'experimentId': 10,
 'jobId': '4a03aab8-fc59-464e-a543-b3731808d562',
 'lastModified': '2021-02-11T03:25:31.027419',
 'mlflowRunId': None,
 'queueId': 2,
 'status': 'queued',
 'timeout': '24h',
 'workflowUri': 's3://workflow/cfa8e701e68241db97a214241242fb54/workflows.tar.gz'}



In [133]:
# Wait for defense to complete, then check baseline model on defended test set with adversarial patches.

response_evaluate_spatial_smoothing_images = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="infer",
    entry_point_kwargs=" ".join(
        [
            f"-P run_id={response_le_net_spatial_smoothing_test_set['mlflowRunId']}",
            f"-P model={EXPERIMENT_NAME}_le_net/None",
            "-P model_architecture=le_net",
            "-P batch_size=20",
            "-P dataset_tar_name=spatial_smoothing_dataset.tar.gz",
            "-P dataset_name=adv_testing",
        ]
    ),
    queue="tensorflow_gpu",
    depends_on=response_le_net_spatial_smoothing_test_set["jobId"],
)

print("Inference job submitted")
print("")
pprint.pprint(response_evaluate_spatial_smoothing_images)
print("")

response_evaluate_spatial_smoothing_images = wait_until_finished(response_evaluate_spatial_smoothing_images)
results = get_mlflow_results(response_evaluate_spatial_smoothing_images)
print("Adversarial Patches with Spatial Smoothing Results: \n")
pprint.pprint(results.data.metrics)

Inference job submitted

{'createdOn': '2021-02-11T03:29:24.413596',
 'dependsOn': '4a03aab8-fc59-464e-a543-b3731808d562',
 'entryPoint': 'infer',
 'entryPointKwargs': '-P run_id=b3897d27c63f451d88ec9a24e89ff6c3 -P '
                     'model=howard_mnist_adversarial_patches_le_net/None -P '
                     'model_architecture=le_net -P batch_size=20 -P '
                     'dataset_tar_name=spatial_smoothing_dataset.tar.gz -P '
                     'dataset_name=adv_testing',
 'experimentId': 10,
 'jobId': '6d647ac8-f297-43c8-9e84-07852bfacedc',
 'lastModified': '2021-02-11T03:29:24.413596',
 'mlflowRunId': None,
 'queueId': 2,
 'status': 'queued',
 'timeout': '24h',
 'workflowUri': 's3://workflow/8187d9d456914f4aa70c3c38df037302/workflows.tar.gz'}

Adversarial Patches with Spatial Smoothing Results: 

{'accuracy': 0.4125250577926636,
 'auc': 0.7922819256782532,
 'fn': 6543.0,
 'fp': 592.0,
 'loss': 1.6688883025676777,
 'precision': 0.8530652523040771,
 'recall': 0.3443887829

It appears that spatial smoothing does not protect well against adversarial patches.

### JPEG Compression Defense:

We can adjust the image compression quality by modifying the `jpeg_compression_quality` field. 
Enter any value between 1 (worst) to 95 (best).

In [145]:
response_le_net_jpeg_compression_test_set = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="jpeg_compression",
    entry_point_kwargs=" ".join(
        [
            f"-P model={EXPERIMENT_NAME}_le_net/1",
            "-P model_architecture=le_net",
            f"-P data_dir=/{DATASET_DIR}/training",
            "-P batch_size=20",
            "-P load_dataset_from_mlruns=true",
            f"-P dataset_run_id={response_deploy_le_net_patches_mnist_testing['mlflowRunId']}",
            "-P jpeg_compression_quality=30",
            "-P dataset_tar_name=adversarial_patch_dataset.tar.gz",
            "-P dataset_name=adv_patch_dataset",
            
        ]
    ),
    queue="tensorflow_gpu",
    depends_on=response_deploy_le_net_patches_mnist_testing["jobId"],
)

print("JPEG compression defense job submitted")
print("")
pprint.pprint(response_le_net_jpeg_compression_test_set)
print("")

response_le_net_jpeg_compression_test_set = get_run_id(response_le_net_jpeg_compression_test_set)

JPEG compression defense job submitted

{'createdOn': '2021-02-11T04:04:29.125973',
 'dependsOn': 'e24f6ccb-3ddc-4ba6-8027-17ac45198b4c',
 'entryPoint': 'jpeg_compression',
 'entryPointKwargs': '-P model=howard_mnist_adversarial_patches_le_net/1 -P '
                     'model_architecture=le_net -P '
                     'data_dir=/nfs/data/ImageNet-Kaggle-2017/images/ILSVRC/Data/CLS-LOC/val-sorted-1000 '
                     '-P batch_size=20 -P load_dataset_from_mlruns=true -P '
                     'dataset_run_id=a137b8b5d62b46e8a4b4227af145405a -P '
                     'jpeg_compression_quality=30 -P '
                     'dataset_tar_name=adversarial_patch_dataset.tar.gz -P '
                     'dataset_name=adv_patch_dataset',
 'experimentId': 10,
 'jobId': 'd9b6063d-f27d-473a-96ec-26de8fb48d8a',
 'lastModified': '2021-02-11T04:04:29.125973',
 'mlflowRunId': None,
 'queueId': 2,
 'status': 'queued',
 'timeout': '24h',
 'workflowUri': 's3://workflow/6e83977d6e5a48978a8a44ce

In [146]:
# Wait for defense to complete, then check baseline model on defended test set with adversarial patches.

response_evaluate_jpeg_compression_images = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="infer",
    entry_point_kwargs=" ".join(
        [
            f"-P run_id={response_le_net_jpeg_compression_test_set['mlflowRunId']}",
            f"-P model={EXPERIMENT_NAME}_le_net/None",
            "-P model_architecture=le_net",
            "-P batch_size=20",
            "-P dataset_tar_name=jpeg_compression_dataset.tar.gz",
            "-P dataset_name=adv_testing",
        ]
    ),
    queue="tensorflow_gpu",
    depends_on=response_le_net_jpeg_compression_test_set["jobId"],
)

print("Inference job submitted")
print("")
pprint.pprint(response_evaluate_jpeg_compression_images)
print("")

response_evaluate_jpeg_compression_images = wait_until_finished(response_evaluate_jpeg_compression_images)
results = get_mlflow_results(response_evaluate_jpeg_compression_images)
print("Adversarial Patches with JPEG Compression Results: \n")
pprint.pprint(results.data.metrics)


Inference job submitted

{'createdOn': '2021-02-11T04:04:33.250324',
 'dependsOn': 'd9b6063d-f27d-473a-96ec-26de8fb48d8a',
 'entryPoint': 'infer',
 'entryPointKwargs': '-P run_id=2233c7370cbc4a7dac3d390b0161ddcd -P '
                     'model=howard_mnist_adversarial_patches_le_net/None -P '
                     'model_architecture=le_net -P batch_size=20 -P '
                     'dataset_tar_name=jpeg_compression_dataset.tar.gz -P '
                     'dataset_name=adv_testing',
 'experimentId': 10,
 'jobId': '1c6c6456-0e4d-4261-9b69-c1c37248d15e',
 'lastModified': '2021-02-11T04:04:33.250324',
 'mlflowRunId': None,
 'queueId': 2,
 'status': 'queued',
 'timeout': '24h',
 'workflowUri': 's3://workflow/e385f2da8f8e4192a91a80eb6d5dd242/workflows.tar.gz'}

Adversarial Patches with JPEG Compression Results: 

{'accuracy': 0.4106212556362152,
 'auc': 0.7881637215614319,
 'fn': 6438.0,
 'fp': 1159.0,
 'loss': 1.8545305996369092,
 'precision': 0.753456711769104,
 'recall': 0.354909807443

### Gaussian Augmentation Defense:

We can adjust the amount of noise created by the defense by adjusting the `gaussian_augmentation_sigma` parameter. 
Please enter any positive value for sigma.

In [147]:
response_le_net_gaussian_augmentation_test_set = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="gaussian_augmentation",
    entry_point_kwargs=" ".join(
        [
            f"-P model={EXPERIMENT_NAME}_le_net/1",
            "-P model_architecture=le_net",
            f"-P data_dir=/{DATASET_DIR}/training",
            "-P batch_size=20",
            "-P load_dataset_from_mlruns=true",
            f"-P dataset_run_id={response_deploy_le_net_patches_mnist_testing['mlflowRunId']}",
            "-P dataset_tar_name=adversarial_patch_dataset.tar.gz",
            "-P dataset_name=adv_patch_dataset",
            "-P gaussian_augmentation_sigma=0.3"
            
        ]
    ),
    queue="tensorflow_gpu",
    depends_on=response_deploy_le_net_patches_mnist_testing["jobId"],
)


print("Gaussian Augmentation defense job submitted")
print("")
pprint.pprint(response_le_net_gaussian_augmentation_test_set)
print("")

response_le_net_gaussian_augmentation_test_set = get_run_id(response_le_net_gaussian_augmentation_test_set)

Gaussian Augmentation defense job submitted

{'createdOn': '2021-02-11T04:09:31.863074',
 'dependsOn': 'e24f6ccb-3ddc-4ba6-8027-17ac45198b4c',
 'entryPoint': 'gaussian_augmentation',
 'entryPointKwargs': '-P model=howard_mnist_adversarial_patches_le_net/1 -P '
                     'model_architecture=le_net -P '
                     'data_dir=/nfs/data/ImageNet-Kaggle-2017/images/ILSVRC/Data/CLS-LOC/val-sorted-1000 '
                     '-P batch_size=20 -P load_dataset_from_mlruns=true -P '
                     'dataset_run_id=a137b8b5d62b46e8a4b4227af145405a -P '
                     'dataset_tar_name=adversarial_patch_dataset.tar.gz -P '
                     'dataset_name=adv_patch_dataset -P '
                     'gaussian_augmentation_sigma=0.3',
 'experimentId': 10,
 'jobId': '099c20a4-6549-4c7c-9153-6b90d3dc8c19',
 'lastModified': '2021-02-11T04:09:31.863074',
 'mlflowRunId': None,
 'queueId': 2,
 'status': 'queued',
 'timeout': '24h',
 'workflowUri': 's3://workflow/bfb1fe1e31

In [148]:
# Wait for defense to complete, then check baseline model on defended test set with adversarial patches.

response_evaluate_gaussian_augmentation_images = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="infer",
    entry_point_kwargs=" ".join(
        [
            f"-P run_id={response_le_net_gaussian_augmentation_test_set['mlflowRunId']}",
            f"-P model={EXPERIMENT_NAME}_le_net/None",
            "-P model_architecture=le_net",
            "-P batch_size=20",
            "-P dataset_tar_name=gaussian_augmentation_dataset.tar.gz",
            "-P dataset_name=adv_testing",
        ]
    ),
    queue="tensorflow_gpu",
    depends_on=response_le_net_gaussian_augmentation_test_set["jobId"],
)

print("Inference job submitted")
print("")
pprint.pprint(response_evaluate_gaussian_augmentation_images)
print("")

response_evaluate_gaussian_augmentation_images = wait_until_finished(response_evaluate_gaussian_augmentation_images)
results = get_mlflow_results(response_evaluate_gaussian_augmentation_images)
print("Adversarial Patches with Gaussian Augmentation Results: \n")
pprint.pprint(results.data.metrics)


Inference job submitted

{'createdOn': '2021-02-11T04:10:53.605176',
 'dependsOn': '099c20a4-6549-4c7c-9153-6b90d3dc8c19',
 'entryPoint': 'infer',
 'entryPointKwargs': '-P run_id=a3d8b1d2fbd9430b916b3072f5295667 -P '
                     'model=howard_mnist_adversarial_patches_le_net/None -P '
                     'model_architecture=le_net -P batch_size=20 -P '
                     'dataset_tar_name=gaussian_augmentation_dataset.tar.gz -P '
                     'dataset_name=adv_testing',
 'experimentId': 10,
 'jobId': '1896a83c-648f-4f1c-9070-5e88696542b2',
 'lastModified': '2021-02-11T04:10:53.605176',
 'mlflowRunId': None,
 'queueId': 2,
 'status': 'queued',
 'timeout': '24h',
 'workflowUri': 's3://workflow/daf5e513206a44988ccc04adb32631de/workflows.tar.gz'}

Adversarial Patches with Gaussian Augmentation Results: 

{'accuracy': 0.35891783237457275,
 'auc': 0.7413415312767029,
 'fn': 7019.0,
 'fp': 2055.0,
 'loss': 2.278626961585803,
 'precision': 0.5903109908103943,
 'recall': 0.2

It appears that of the three preprocessing defenses, gaussian augmentation might actually improve patch attach effectiveness.

Since preprocessing defenses appear to be ineffective at stopping adversarial patches, let's try another approach.

# Adversarial Training Defense:

Finally, we will train a new copy of the LeNet model on training set that contains adversarial patches.
In doing so, the model learns to ignore the adversarial patches.

In [150]:
# Finally, train and retest patched dataset.

response_deploy_le_net_patches_mnist_adv_training = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="train",
    entry_point_kwargs=" ".join(
        [
            f"-P testing_dataset_run_id={response_deploy_le_net_patches_mnist_testing['mlflowRunId']}",
            f"-P training_dataset_run_id={response_deploy_le_net_patches_mnist_training['mlflowRunId']}",
            "-P batch_size=256",
            "-P register_model=True",
            "-P model_architecture=le_net",
            "-P model_tag=adversarial_patch",
            "-P epochs=30",
            f"-P data_dir_train={DATASET_DIR}/training",
            f"-P data_dir_test={DATASET_DIR}/testing",
            "-P load_dataset_from_mlruns=True",
        ]
    ),
    queue="tensorflow_gpu",
    depends_on=response_deploy_le_net_patches_mnist_training["jobId"],
)


print("Patch adversarial training (LeNet-5 architecture) job submitted")
print("")
pprint.pprint(response_deploy_le_net_patches_mnist_adv_training)
print("")

response_deploy_le_net_patches_mnist_adv_training = get_run_id(response_deploy_le_net_patches_mnist_adv_training)

Patch adversarial training (LeNet-5 architecture) job submitted

{'createdOn': '2021-02-11T05:17:29.520627',
 'dependsOn': 'f8a6ea17-fe13-4809-940d-9edddb315219',
 'entryPoint': 'train',
 'entryPointKwargs': '-P '
                     'testing_dataset_run_id=a137b8b5d62b46e8a4b4227af145405a '
                     '-P '
                     'training_dataset_run_id=b1886b21e58845759fb43537709f99b4 '
                     '-P batch_size=256 -P register_model=True -P '
                     'model_architecture=le_net -P model_tag=adversarial_patch '
                     '-P epochs=30 -P data_dir_train=/nfs/data/Mnist/training '
                     '-P data_dir_test=/nfs/data/Mnist/testing -P '
                     'load_dataset_from_mlruns=True',
 'experimentId': 10,
 'jobId': 'deba691a-edf1-49ab-9c3a-46eea3254923',
 'lastModified': '2021-02-11T05:17:29.520627',
 'mlflowRunId': None,
 'queueId': 2,
 'status': 'queued',
 'timeout': '24h',
 'workflowUri': 's3://workflow/1f8071a6e81b4a07ad9e4

In [152]:
response_evaluate_adv_training = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="infer",
    entry_point_kwargs=" ".join(
        [
            f"-P run_id={response_deploy_le_net_patches_mnist_testing['mlflowRunId']}",
            f"-P model={EXPERIMENT_NAME}_adversarial_patch_le_net/None",
            "-P model_architecture=le_net",
            "-P batch_size=20",
            "-P dataset_tar_name=adversarial_patch_dataset.tar.gz",
            "-P dataset_name=adv_patch_dataset",
        ]
    ),
    queue="tensorflow_gpu",
    depends_on=response_deploy_le_net_patches_mnist_adv_training["jobId"],
)

print("Patch evaluation job submitted")
print("")
pprint.pprint(response_evaluate_adv_training)
print("")

response_evaluate_adv_training = wait_until_finished(response_evaluate_adv_training)
results = get_mlflow_results(response_evaluate_adv_training)
print("Adversarial Training Results:")

Patch evaluation job submitted

{'createdOn': '2021-02-11T05:20:36.058379',
 'dependsOn': 'deba691a-edf1-49ab-9c3a-46eea3254923',
 'entryPoint': 'infer',
 'entryPointKwargs': '-P run_id=a137b8b5d62b46e8a4b4227af145405a -P '
                     'model=howard_mnist_adversarial_patches_adversarial_patch_le_net/None '
                     '-P model_architecture=le_net -P batch_size=20 -P '
                     'dataset_tar_name=adversarial_patch_dataset.tar.gz -P '
                     'dataset_name=adv_patch_dataset',
 'experimentId': 10,
 'jobId': 'bf197423-b636-4788-9081-e55ff7c97e5d',
 'lastModified': '2021-02-11T05:20:36.058379',
 'mlflowRunId': None,
 'queueId': 2,
 'status': 'queued',
 'timeout': '24h',
 'workflowUri': 's3://workflow/152cae393b1e41f88015dfdce644afa8/workflows.tar.gz'}

Adversarial Training Results:


### We can see that for adversarial patches, adversarial training appears to be a good option.
### Some other interesting approaches that users may want to investigate by modfifying this demo are:

 - Generating a separate set of patches for training and test data. 
    - To do so, add a secondary patch generation job and link the run_id of this new job to one of the existing test/training patch deployment jobs. 
    
 - Experimenting with alternate adversarially trained models. 
     - If both the MNIST patch demo and MNIST FGM demos are run together, then users can swap the `model={EXPERIMENT_NAME}_adversarial_patch_le_net/None` with the model name of the FGM experiment.
     - Doing so can let users quickly check which adversarial training offers robust protection against multiple types of attacks. 
     
### Please consult the README documentation for more information regarding available entrypoints and attack/defense parameters. 

<a id='querying_cell'></a>
## Querying the MLFlow Tracking Service

Currently the lab API can only be used to register experiments and start jobs, so if users wish to extract their results programmatically, they can use the `MlflowClient()` class from the `mlflow` Python package to connect and query their results.
Since we captured the run ids generated by MLFlow, we can easily retrieve the data logged about one of our jobs and inspect the results.
To start the client, we simply need to run,

In [154]:
mlflow_client = MlflowClient()

The client uses the environment variable `MLFLOW_TRACKING_URI` to figure out how to connect to the MLFlow Tracking Service, which we configured near the top of this notebook.
To query the results of one of our runs, we just need to pass the run id to the client's `get_run()` method.
As an example, let's query the run results for the patch attack applied to the LeNet-5 architecture,

In [155]:
run_le_net = mlflow_client.get_run(response_le_net_patches["mlflowRunId"])

If the request completed successfully, we should now be able to query data collected during the run.
For example, to review the collected metrics, we just use,

In [None]:
pprint.pprint(run_le_net.data.metrics)

To review the run's parameters, we use,

In [None]:
pprint.pprint(run_le_net.data.params)

To review the run's tags, we use,

In [None]:
pprint.pprint(run_le_net.data.tags)

There are many things you can query using the MLFlow client.
[The MLFlow documentation gives a full overview of the methods that are available](https://www.mlflow.org/docs/latest/python_api/mlflow.tracking.html#mlflow.tracking.MlflowClient).