# TensorFlow MNIST Classifier demo

This notebook contains an end-to-end demonstration of Dioptra that can be run on any modern laptop.
Please see the [example README](README.md) for instructions on how to prepare your environment for running this example.

## Setup

Below we import the necessary Python modules and ensure the proper environment variables are set so that all the code blocks will work as expected.

In [None]:
EXPERIMENT_NAME = "mnist_fgm"
EXPERIMENT_DESC = "applying the fast gradient method (FGM) attack to a classifier trained on MNIST"
QUEUE_NAME = 'tensorflow_cpu'
QUEUE_DESC = 'Tensorflow CPU Queue'
PLUGIN_FILES = '../task-plugins/dioptra_custom/fgm_mnist_demo/'
MODEL_NAME = "mnist_classifier"

# Default address for accessing the RESTful API service
RESTAPI_ADDRESS = "http://localhost:20080"

# Default address for accessing the MLflow Tracking server
MLFLOW_TRACKING_URI = "http://localhost:35000"

In [None]:
# Import packages from the Python standard library
import importlib.util
import logging
import os
import pprint
import sys
import time
import warnings
from pathlib import Path

# Import third-party packages
import structlog
from IPython.display import clear_output, display

# Filter out warning messages
warnings.filterwarnings("ignore")
structlog.configure(
    wrapper_class=structlog.make_filtering_bound_logger(logging.ERROR),
)

def register_python_source_file(module_name: str, filepath: Path) -> None:
    """Import a source file directly.

    Args:
        module_name: The module name to associate with the imported source file.
        filepath: The path to the source file.

    Notes:
        Adapted from the following implementation in the Python documentation:
        https://docs.python.org/3/library/importlib.html#importing-a-source-file-directly
    """
    spec = importlib.util.spec_from_file_location(module_name, str(filepath))
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module
    spec.loader.exec_module(module)


# Register the examples/scripts directory as a Python module
register_python_source_file("scripts", Path("..", "scripts", "__init__.py"))

# Import the client and helper functions from the registered module
from scripts.client import DioptraClient
from scripts.utils import make_tar
from scripts.setup import upload_experiment, run_experiment, delete_all

# Set DIOPTRA_RESTAPI_URI variable if not defined, used to connect to RESTful API service
if os.getenv("DIOPTRA_RESTAPI_URI") is None:
    os.environ["DIOPTRA_RESTAPI_URI"] = RESTAPI_ADDRESS

# Set MLFLOW_TRACKING_URI variable, used to connect to MLFlow Tracking service
if os.getenv("MLFLOW_TRACKING_URI") is None:
    os.environ["MLFLOW_TRACKING_URI"] = MLFLOW_TRACKING_URI
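
The two `if os.getenv(...) is None` checks above can also be expressed with `os.environ.setdefault`, which assigns a value only when the variable is not already defined. The following is a minimal sketch of that alternative, not a change to the notebook's behavior; the default addresses are copied from the configuration cell at the top of this notebook.

```python
import os

# Defaults copied from the configuration cell at the top of this notebook.
RESTAPI_ADDRESS = "http://localhost:20080"
MLFLOW_TRACKING_URI = "http://localhost:35000"

# setdefault() assigns only when the variable is not already defined,
# so a value exported in the shell before launching Jupyter still wins.
restapi_uri = os.environ.setdefault("DIOPTRA_RESTAPI_URI", RESTAPI_ADDRESS)
mlflow_uri = os.environ.setdefault("MLFLOW_TRACKING_URI", MLFLOW_TRACKING_URI)

print(f"Dioptra REST API: {restapi_uri}")
print(f"MLflow Tracking:  {mlflow_uri}")
```

Either form leaves a pre-existing value untouched, which is useful if your services listen on non-default ports.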

## Dataset

We obtained a copy of the MNIST dataset when we ran the `download_data.py` script. If you have not done so already, see [How to Obtain Common Datasets](https://pages.nist.gov/dioptra/getting-started/acquiring-datasets.html).
The training and testing images for the MNIST dataset are stored within the `/dioptra/data/Mnist` directory as PNG files that are organized into the following folder structure:

    Mnist
    ├── testing
    │   ├── 0
    │   ├── 1
    │   ├── 2
    │   ├── 3
    │   ├── 4
    │   ├── 5
    │   ├── 6
    │   ├── 7
    │   ├── 8
    │   └── 9
    └── training
        ├── 0
        ├── 1
        ├── 2
        ├── 3
        ├── 4
        ├── 5
        ├── 6
        ├── 7
        ├── 8
        └── 9

The subfolders under `training/` and `testing/` are the classification labels for the images in the dataset.
This folder structure is a standardized way to encode the label information and many libraries can make use of it, including the Tensorflow library that we are using for this particular demo.
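
To illustrate how the folder names carry the labels, the sketch below walks a directory laid out like `Mnist/training` and pairs each PNG file with the integer label taken from its parent folder. It builds a small stand-in directory with empty placeholder files so that it runs without the real dataset; the helper name `list_labeled_images` and the placeholder file names are assumptions made for this example.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def list_labeled_images(split_dir: Path) -> List[Tuple[Path, int]]:
    """Return (image_path, label) pairs, reading each label from the parent folder name."""
    pairs = []
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        label = int(class_dir.name)  # folder "7" becomes label 7
        for image_path in sorted(class_dir.glob("*.png")):
            pairs.append((image_path, label))
    return pairs


# Build a tiny stand-in for Mnist/training with one empty placeholder file per class.
with tempfile.TemporaryDirectory() as tmp:
    training_dir = Path(tmp, "Mnist", "training")
    for digit in range(10):
        class_dir = training_dir / str(digit)
        class_dir.mkdir(parents=True)
        (class_dir / f"example_{digit}.png").touch()

    labeled = list_labeled_images(training_dir)
    labels = [label for _, label in labeled]
    print(labels)
```

Library helpers such as `tf.keras.utils.image_dataset_from_directory` infer labels from subdirectory names in a similar way, which is why no separate label file is needed.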

## Submit and run jobs

To connect with the endpoint, we will use a client class defined in the `examples/scripts/client.py` file that communicates with the Dioptra RESTful API over HTTP.
We connect using the client below.
The client reads the environment variable `DIOPTRA_RESTAPI_URI`, which we configured at the top of the notebook, to determine the address of the Dioptra RESTful API.

In [None]:
client = DioptraClient()

You must log in to the REST API before you can perform any operations. Here we create a user if it does not already exist, and then log in as that user.

In [None]:
try:
    client.users.create('pluginuser', 'pluginuser@dioptra.nccoe.nist.gov', 'pleasemakesuretoPLUGINthecomputer', 'pleasemakesuretoPLUGINthecomputer')
except Exception:
    pass  # ignore the error if the user already exists
client.auth.login('pluginuser', 'pleasemakesuretoPLUGINthecomputer')

`wait_for_job` blocks until the given job has finished, which is useful for jobs that depend on the output of other jobs.

In [None]:
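def wait_for_job(job, job_name):
    """Poll the REST API about once per second until `job` reports the 'finished' status.

    Note that this loops indefinitely if the job never reaches 'finished'.
    """
    n = 0
    while job['status'] != 'finished':
        job = client.jobs.get_by_id(job['id'])
        time.sleep(1)
        clear_output(wait=True)
        display("Waiting for job." + "." * (n % 3))
        n += 1
    clear_output(wait=True)
    display(f"Job finished. Starting {job_name} job.")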
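
Because the loop above only exits on the `'finished'` status, a job that fails or hangs would keep the notebook waiting. The following is a minimal sketch of a polling helper with a timeout and failure detection. The helper name `poll_until_finished` and the status names `"queued"`, `"started"`, and `"failed"` are assumptions made for illustration; only `'finished'` appears in this notebook, so check the statuses your Dioptra deployment actually reports.

```python
import time
from typing import Callable, Tuple


def poll_until_finished(
    get_status: Callable[[], str],
    timeout_s: float = 3600.0,
    interval_s: float = 1.0,
    failure_states: Tuple[str, ...] = ("failed",),  # assumed status name, not confirmed
) -> str:
    """Poll get_status() until it returns 'finished', a failure state, or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == "finished":
            return status
        if status in failure_states:
            raise RuntimeError(f"job ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_s} s")
        time.sleep(interval_s)


# Usage with a fake status source that finishes on the third poll.
# The intermediate status names are placeholders.
fake_statuses = iter(["queued", "started", "finished"])
result = poll_until_finished(lambda: next(fake_statuses), interval_s=0.01)
print(result)
```

With the client used in this notebook, the status source could be written as `lambda: client.jobs.get_by_id(job['id'])['status']`, mirroring the call inside `wait_for_job`.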
    

In this step, we upload all of our entrypoints and the plugins they rely on to the Dioptra server.

In [63]:
#delete_all(client)
experiment_id, train_ep, queue_id = upload_experiment(client, 'src/train.yml','train','training a classifier on MNIST', PLUGIN_FILES, QUEUE_NAME, QUEUE_DESC, EXPERIMENT_NAME, EXPERIMENT_DESC)
experiment_id, fgm_ep, queue_id = upload_experiment(client, 'src/fgm.yml','fgm','generating examples on mnist_classifier using the fgm attack', PLUGIN_FILES, QUEUE_NAME, QUEUE_DESC, EXPERIMENT_NAME, EXPERIMENT_DESC)
experiment_id, infer_ep, queue_id = upload_experiment(client, 'src/infer.yml','infer','evaluating performance of mnist_classifier on generated fgm examples', PLUGIN_FILES, QUEUE_NAME, QUEUE_DESC, EXPERIMENT_NAME, EXPERIMENT_DESC)
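
The three `upload_experiment` calls above differ only in the YAML path, entrypoint name, and description, so they could also be driven from a small table. The sketch below shows one possible way to do that. The helper `upload_all` and the stand-in `fake_upload` are hypothetical and exist only so the example runs without a Dioptra server; the argument order and the three-item return value are inferred from the calls in the cell above.

```python
from typing import Callable, Dict, List, Tuple

# (YAML path, entrypoint name, description), copied from the cell above.
ENTRYPOINTS: List[Tuple[str, str, str]] = [
    ("src/train.yml", "train", "training a classifier on MNIST"),
    ("src/fgm.yml", "fgm", "generating examples on mnist_classifier using the fgm attack"),
    ("src/infer.yml", "infer", "evaluating performance of mnist_classifier on generated fgm examples"),
]


def upload_all(upload_fn: Callable[..., tuple], client, specs, *shared_args):
    """Call upload_fn once per spec and collect the entrypoint ids by name."""
    entrypoint_ids: Dict[str, object] = {}
    experiment_id = queue_id = None
    for yaml_path, name, description in specs:
        experiment_id, entrypoint_id, queue_id = upload_fn(
            client, yaml_path, name, description, *shared_args
        )
        entrypoint_ids[name] = entrypoint_id
    return experiment_id, entrypoint_ids, queue_id


def fake_upload(client, yaml_path, name, description, *shared_args):
    """Stand-in for upload_experiment that returns made-up ids."""
    return 1, f"ep-{name}", 2


experiment_id, eps, queue_id = upload_all(
    fake_upload, None, ENTRYPOINTS,
    "plugins/", "queue", "queue description", "experiment", "experiment description",
)
print(eps)
```

In the notebook itself, `fake_upload` would be replaced by `upload_experiment` and the placeholder strings by `PLUGIN_FILES`, `QUEUE_NAME`, `QUEUE_DESC`, `EXPERIMENT_NAME`, and `EXPERIMENT_DESC`.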

Next, we need to train our model. This particular entrypoint uses a LeNet-5 model.
Depending on the specs of your computer, it can take 5-20 minutes or longer to complete.
If you are fortunate enough to have access to a dedicated GPU, then the training time will be much shorter.

In [64]:
job_time_limit = '1h'

training_job = client.experiments.create_jobs_by_experiment_id(
    experiment_id, 
    f"training job for {experiment_id}", 
    queue_id,
    train_ep, 
    {"epochs":"1"}, 
    job_time_limit
)

Now that we have trained a model, next we will apply the fast-gradient method (FGM) evasion attack on it to generate adversarial images.
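
For intuition, one common form of FGM (the L-infinity variant, often called the fast gradient sign method) moves every input feature by a fixed step `eps` in the direction of the sign of the loss gradient. The sketch below applies that single step to a toy logistic model in pure Python. The model, weights, input, and `eps` value are illustrative assumptions and are not taken from the Dioptra plugin, whose norm and parameters are not shown in this notebook.

```python
import math


def sign(v: float) -> int:
    """Return -1, 0, or +1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def loss(w, x, y):
    """Logistic loss for a label y in {-1, +1} and a linear score w . x."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return math.log(1.0 + math.exp(-y * score))


def loss_grad_wrt_input(w, x, y):
    """Gradient of the logistic loss with respect to the input x."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    coeff = -y / (1.0 + math.exp(y * score))
    return [coeff * wi for wi in w]


def fgm_linf(w, x, y, eps):
    """One-step attack: shift each feature by eps in the direction that raises the loss."""
    grad = loss_grad_wrt_input(w, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]


# Toy model and input (illustrative values only).
w = [2.0, -1.0, 0.5]
x = [0.5, 0.2, 0.1]
y = 1
eps = 0.3

x_adv = fgm_linf(w, x, y, eps)
clean_loss = loss(w, x, y)
adv_loss = loss(w, x_adv, y)
print(f"clean loss: {clean_loss:.4f}")
print(f"adversarial loss: {adv_loss:.4f}")
```

The perturbation is bounded by `eps` per feature, yet the loss rises. For images, an implementation would typically also clip the result back into the valid pixel range.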

This specific workflow is an example of jobs that contain dependencies, as the metric evaluation jobs cannot start until the adversarial image generation jobs have completed, and the adversarial image generation job cannot start until the training job has completed.
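
The dependency chain described above can be written down as a small graph and ordered with the standard library's `graphlib` module (Python 3.9 or later). This is only a sketch of the ordering logic; the notebook itself enforces the order by calling `wait_for_job` before each submission. The job names reuse the entrypoint names `train`, `fgm`, and `infer`.

```python
from graphlib import TopologicalSorter

# Map each job to the set of jobs that must finish before it can start.
dependencies = {
    "train": set(),
    "fgm": {"train"},
    "infer": {"fgm"},
}

# static_order() yields every job after all of its prerequisites.
order = list(TopologicalSorter(dependencies).static_order())
print(order)  # ['train', 'fgm', 'infer']
```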

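Note that we wait on `training_job` before submitting the FGM job, and we pass the model name together with a model version of `-1` (the latest version) to tell the FGM attack which model to generate examples against.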

In [65]:
job_time_limit = '1h'

wait_for_job(training_job, 'fgm')
fgm_job = client.experiments.create_jobs_by_experiment_id(
    experiment_id,
    f"fgm job for {experiment_id}",
    queue_id,
    fgm_ep,
    {"model_name": MODEL_NAME, "model_version": str(-1)}, # -1 means get the latest
    job_time_limit
)

'Job finished. Starting fgm job.'

Finally, we can test out the results of our adversarial attack on the model we trained earlier. This will wait for the FGM job to finish, and then evaluate the model's performance on the adversarial examples. Note that we pass the `fgm_job` id together with the model name and version, so that this entrypoint knows which run's adversarial examples to test against which model.

The previous runs are all stored in Dioptra as well, so you can always go back later and retrieve examples, models, and even the code used to create them.

In [66]:
job_time_limit = '1h'
wait_for_job(fgm_job, 'infer')
infer_job = client.experiments.create_jobs_by_experiment_id(
    experiment_id,
    f"infer job for {experiment_id}",
    queue_id,
    infer_ep,
    {"fgm_job_id": str(fgm_job['id']), "model_name": MODEL_NAME, "model_version": str(-1)},
    job_time_limit
)

'Job finished. Starting infer job.'

Once the inference job has finished, we can retrieve the metrics that the training and inference jobs logged to the MLflow Tracking server.

In [70]:
from uuid import UUID

from mlflow.tracking import MlflowClient

mlflow_client = MlflowClient()

# Convert the UUID returned by the REST API to its hyphen-free hex form before looking up the run
mlflow_runid = UUID(client.jobs.get_mlflow_run_id(training_job['id'])['mlflowRunId']).hex
mlflow_run = mlflow_client.get_run(mlflow_runid)
print("Training metrics:")
pprint.pprint(mlflow_run.data.metrics)

# Repeat the lookup for the inference job, which evaluated the FGM adversarial examples
mlflow_runid = UUID(client.jobs.get_mlflow_run_id(infer_job['id'])['mlflowRunId']).hex
mlflow_run = mlflow_client.get_run(mlflow_runid)
print("FGM metrics:")
pprint.pprint(mlflow_run.data.metrics)

Training metrics:
{'accuracy': 0.9760833382606506,
 'auc': 0.9990718364715576,
 'loss': 0.07672422379255295,
 'precision': 0.9799415469169617,
 'recall': 0.9721999764442444,
 'training_time_in_minutes': 0.3090300166666667}
FGM metrics:
{'accuracy': 0.16326121985912323,
 'auc': 0.6759902238845825,
 'loss': 2.7856907844543457,
 'precision': 0.09174499660730362,
 'recall': 0.044971954077482224}
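
To put the two accuracy values in perspective, the short calculation below computes the drop between the accuracy reported by the training job and the accuracy on the FGM examples, and compares the latter with the chance level for a ten-class problem. The numbers are copied from the output above; your own run will produce different values.

```python
# Accuracy values copied from the recorded output above.
training_accuracy = 0.9760833382606506
adversarial_accuracy = 0.16326121985912323
num_classes = 10  # digits 0 through 9

accuracy_drop = training_accuracy - adversarial_accuracy
chance_level = 1.0 / num_classes

print(f"Accuracy drop: {accuracy_drop:.3f}")
print(f"Chance level:  {chance_level:.3f}")
```

In this recorded run the attack removes roughly 0.81 of accuracy, leaving the classifier only slightly above the 0.10 expected from uniform random guessing.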
