# PyTorch Detectron2 Road Signs Poisoning Demo

This notebook contains an end-to-end demonstration of Dioptra that can be run on any modern laptop.

## Setup

Below we import the necessary Python modules and ensure the proper environment variables are set so that all the code blocks will work as expected:

In [None]:
# Import packages from the Python standard library
import os
import pprint
import time
import warnings
from pathlib import Path
from typing import Tuple

# Filter out warning messages
warnings.filterwarnings("ignore")

# Address for connecting the docker container to exposed ports on the host device
HOST_DOCKER_INTERNAL = "host.docker.internal"
# HOST_DOCKER_INTERNAL = "172.17.0.1"

# Dioptra API ports
RESTAPI_PORT = "30080"
MLFLOW_TRACKING_PORT = "35000"

# Default address for accessing the RESTful API service
RESTAPI_ADDRESS = (
    f"http://{HOST_DOCKER_INTERNAL}:{RESTAPI_PORT}"
    if os.getenv("IS_JUPYTER_SERVICE")
    else f"http://localhost:{RESTAPI_PORT}"
)

# Override the DIOPTRA_RESTAPI_URI variable, used to connect to RESTful API service
os.environ["DIOPTRA_RESTAPI_URI"] = RESTAPI_ADDRESS

# Default address for accessing the MLFlow Tracking server
MLFLOW_TRACKING_URI = (
    f"http://{HOST_DOCKER_INTERNAL}:{MLFLOW_TRACKING_PORT}"
    if os.getenv("IS_JUPYTER_SERVICE")
    else f"http://localhost:{MLFLOW_TRACKING_PORT}"
)

# Override the MLFLOW_TRACKING_URI variable, used to connect to MLFlow Tracking service
os.environ["MLFLOW_TRACKING_URI"] = MLFLOW_TRACKING_URI

# Base API address
RESTAPI_API_BASE = f"{RESTAPI_ADDRESS}/api"

# Path to workflows archive
WORKFLOWS_TAR_GZ = Path("workflows.tar.gz")

# Path to custom task plugins archive
CUSTOM_PLUGINS_PYTORCH_D2_TAR_GZ = Path("custom-plugins-pytorch-mi.tar.gz")

# Experiment name (note the username_ prefix convention)
USERNAME = "user"
EXPERIMENT_NAME = "{}_stop_signs_demo".format(USERNAME)

# Path to dataset
data_path = "/nfs/data/Road-Sign-Detection-v2-balanced-div/"

# Import third-party Python packages
import numpy as np
import requests
from mlflow.tracking import MlflowClient

# Import utils.py file
import utils

# Create random number generator
rng = np.random.default_rng(54399264723942495723666216079516778448)
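
The address selection in the setup cell follows one rule: inside the Jupyter container, `localhost` refers to the container itself, so the services published on the host are reached through `host.docker.internal` instead. A minimal sketch of that rule as a reusable function is shown below; the function name `service_url` and its `env` argument are illustrative and not part of the notebook.

```python
import os


def service_url(port, env=None):
    """Return the base URL for a service published on the given host port."""
    env = os.environ if env is None else env
    # IS_JUPYTER_SERVICE is set when the notebook runs inside the Jupyter container
    host = "host.docker.internal" if env.get("IS_JUPYTER_SERVICE") else "localhost"
    return f"http://{host}:{port}"


print(service_url("30080", env={}))                           # http://localhost:30080
print(service_url("35000", env={"IS_JUPYTER_SERVICE": "1"}))  # http://host.docker.internal:35000
```

Passing the environment as an argument keeps the function easy to check without modifying `os.environ`.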

## Submit and run jobs

The entrypoints that we will be running in this example are implemented in the Python source files under `src/` and the `MLproject` file.
To run these entrypoints within Dioptra's architecture, we need to package those files up into an archive and submit it to the Dioptra RESTful API to create a new job.
For convenience, the `Makefile` provides a rule for creating the archive file for this example. To use it, run `make workflows`:

In [None]:
%%bash

# Create the workflows.tar.gz file
make workflows
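
The `Makefile` rule itself is not shown in this notebook. As a rough illustration of what such a packaging step involves, the sketch below builds a gzip-compressed tarball from a `src/` directory and an `MLproject` file using only the standard library. The helper name `build_archive`, the member list, and the throwaway project are assumptions for the example; the real rule may include other files.

```python
import tarfile
import tempfile
from pathlib import Path


def build_archive(project_dir, archive_path, members=("src", "MLproject")):
    """Pack the listed files and directories into a .tar.gz and return its member names."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in members:
            # arcname keeps the paths inside the archive relative to the project root
            tar.add(project_dir / name, arcname=name)
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())


# Demonstrate on a throwaway project rather than the real example directory
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "project"
    (project / "src").mkdir(parents=True)
    (project / "src" / "train.py").write_text("print('train')\n")
    (project / "MLproject").write_text("name: demo\n")
    names = build_archive(project, Path(tmp) / "workflows.tar.gz")

print(names)
```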

To connect with the endpoint, we will use a client class defined in the `utils.py` file that is able to connect with the Dioptra RESTful API using the HTTP protocol.
We connect using the client below, which reads the environment variable `DIOPTRA_RESTAPI_URI` to determine how to reach the Dioptra RESTful API:

In [None]:
restapi_client = utils.DioptraClient()

We need to register an experiment under which to collect our job runs.
The code below checks if the relevant experiment exists.
If it does, the code returns information about the experiment; if it doesn't, the code registers the new experiment.

In [None]:
response_experiment = restapi_client.get_experiment_by_name(name=EXPERIMENT_NAME)

if response_experiment is None or "Not Found" in response_experiment.get("message", []):
    response_experiment = restapi_client.register_experiment(name=EXPERIMENT_NAME)

response_experiment
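
The lookup-then-register pattern above can be wrapped in a helper so that re-running the cell never creates a duplicate. The sketch below exercises that logic against an in-memory stand-in; `InMemoryClient` and the `experimentId` field are hypothetical and only mimic the two client methods used in this notebook.

```python
class InMemoryClient:
    """Hypothetical stand-in for the real client; stores experiments in a dict."""

    def __init__(self):
        self.experiments = {}

    def get_experiment_by_name(self, name):
        # Mimic an API that answers unknown names with a "Not Found" message
        return self.experiments.get(name, {"message": "Not Found"})

    def register_experiment(self, name):
        record = {"experimentId": len(self.experiments) + 1, "name": name}
        self.experiments[name] = record
        return record


def get_or_register_experiment(client, name):
    """Return the existing experiment, registering it first if it is missing."""
    response = client.get_experiment_by_name(name=name)
    if response is None or "Not Found" in response.get("message", ""):
        response = client.register_experiment(name=name)
    return response


client = InMemoryClient()
first = get_or_register_experiment(client, "user_stop_signs_demo")
second = get_or_register_experiment(client, "user_stop_signs_demo")
print(first == second)  # True: the second call finds the existing record
```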

We should also check which queues are available for running our jobs to make sure that the resources that we need are available.
The code below queries the Lab API and returns a list of active queues.

In [None]:
restapi_client.list_queues()

This example also makes use of the `pytorch_d2` custom task plugin package stored locally under the `task-plugins/dioptra_custom/pytorch_d2` directory.
To register these custom task plugins, we first need to package them up into an archive.
For convenience, the `Makefile` provides a rule for creating the custom task plugins archive file. To use it, run `make custom-plugins`:

In [None]:
%%bash

# Create the custom task plugins archive file
make custom-plugins

Now that the `pytorch_d2` custom task plugin package is packaged into an archive file, we register it by uploading the file to the REST API.
Note that we need to provide a name for the custom task plugin package, and this name must be unique within the custom task plugins namespace.
For a full list of the custom task plugins, use `restapi_client.list_custom_task_plugins()`.

In [None]:
response_custom_plugins = restapi_client.get_custom_task_plugin(name="pytorch_d2")

if response_custom_plugins is None or "Not Found" in response_custom_plugins.get("message", []):
    response_custom_plugins = restapi_client.upload_custom_plugin_package(
        custom_plugin_name="pytorch_d2",
        custom_plugin_file=CUSTOM_PLUGINS_PYTORCH_D2_TAR_GZ,
    )

response_custom_plugins

If at any point you need to update one or more files within the `pytorch_d2` plugin package, you will need to unregister/delete the custom task plugin first using the REST API.
This can be done as follows:

In [None]:

# Delete the 'pytorch_d2' custom task plugin package
#restapi_client.delete_custom_task_plugin(name="pytorch_d2")


After you have deleted the task plugin from Dioptra, re-run the `make custom-plugins` code block to update the package archive, then upload the updated plugin by re-running the `restapi_client.upload_custom_plugin_package` block.

Next, we need to train our model.
We will be using the V100 GPUs that are available on the DGX Workstation, which we can use by submitting our job to the `"pytorch_gpu"` queue.
We will train two RetinaNet models with a ResNet-50 backbone on the dataset: one on regular data and the other on poisoned data.

In [None]:
# "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"
# "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"
model_base = "COCO-Detection/retinanet_R_50_FPN_3x.yaml"

response_model_train = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="train",
    entry_point_kwargs=" ".join([
        "-P data_dir_train=/nfs/data/Road-Sign-Detection-v2-balanced-div/training",
        "-P data_dir_test=/nfs/data/Road-Sign-Detection-v2-balanced-div/testing",
        "-P dataset_type=pasal_voc",
        "-P model_architecture={}".format(model_base),
        "-P class_names=stop,crosswalk,speedlimit,trafficlight",
        "-P register_model_name={}_ret_model".format(EXPERIMENT_NAME),
        "-P max_iter=3000",
    ]),
    queue="pytorch_gpu",
    timeout="1h",
)

print("Training job submitted.")
print("")
pprint.pprint(response_model_train)
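
The `entry_point_kwargs` argument is a single string of space-separated `-P key=value` pairs. When the parameter list grows, it can be easier to keep the parameters in a dictionary and render the string from it. The helper below is a small sketch of that idea; the name `format_entry_point_kwargs` is illustrative and not part of `utils.py`.

```python
def format_entry_point_kwargs(params):
    """Render a parameter mapping as a string of `-P key=value` pairs."""
    return " ".join(f"-P {key}={value}" for key, value in params.items())


kwargs = format_entry_point_kwargs({
    "model_architecture": "COCO-Detection/retinanet_R_50_FPN_3x.yaml",
    "class_names": "stop,crosswalk,speedlimit,trafficlight",
    "max_iter": 3000,
})
print(kwargs)
```

Because the pairs are joined with spaces, this simple version assumes that no value contains whitespace.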

## Training with Stop Sign Poisoning

For this part of the demo, we will train a new custom model on poisoned training data.
This poison will force stop sign predictions to redirect to speedlimit predictions whenever the trigger is present in the test data.
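
A trigger of this kind is typically a small colored patch placed on the object. The precise meaning of the `poison_scale` and `poison_rel_y_location` parameters used in the next cell is defined by the `pytorch_d2` plugin, which is not shown here. The sketch below assumes one plausible reading: a square patch whose side is a fraction of the bounding box's shorter edge, centered horizontally and positioned vertically at a relative height measured from the top of the box. The function `trigger_box` is hypothetical.

```python
def trigger_box(bbox, scale=0.4, rel_y=0.8):
    """Return (x_min, y_min, x_max, y_max) of a square patch inside bbox.

    Assumed semantics (not taken from the plugin source):
      - side length = scale * shorter edge of the bounding box
      - horizontally centered in the box
      - vertical center at rel_y of the box height, measured from the top
    """
    x_min, y_min, x_max, y_max = bbox
    width, height = x_max - x_min, y_max - y_min
    side = scale * min(width, height)
    center_x = x_min + width / 2
    center_y = y_min + rel_y * height
    # Clamp vertically so the patch never leaves the bounding box
    top = min(max(center_y - side / 2, y_min), y_max - side)
    left = center_x - side / 2
    return (left, top, left + side, top + side)


patch = trigger_box((100, 50, 200, 150))
print(patch)
```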

In [None]:
# "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"
# "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"
model_base = "COCO-Detection/retinanet_R_50_FPN_3x.yaml"


response_le_net_train = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="poison_train",
    entry_point_kwargs=" ".join([
        "-P data_dir_train={}/training".format(data_path),
        "-P data_dir_test={}/testing".format(data_path),
        "-P dataset_type=pasal_voc",
        "-P poison_class_label=2",
        "-P poison_class_target=0",
        "-P poison_rel_y_location=0.8",
        "-P poison_scale=0.4",
        "-P poison_color=55,230,225",
        "-P model_architecture={}".format(model_base),
        "-P class_names=stop,crosswalk,speedlimit,trafficlight",
        "-P register_model_name={}_poisoned_ret_model".format(EXPERIMENT_NAME),
        "-P max_iter=3000",
    ]),
    queue="pytorch_gpu",
    timeout="1h",
)

print("Poisoned training job submitted.")
print("")
pprint.pprint(response_le_net_train)
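
Jobs submitted to a queue run asynchronously, so the results are only available once the workers have finished. A generic polling loop such as the one sketched below can be used to wait for a job. The function `wait_for_job` is illustrative, and the status names `"finished"` and `"failed"` are assumptions; `get_status` stands for whatever call returns the current job status from the API.

```python
import time


def wait_for_job(get_status, poll_seconds=10.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Call get_status() repeatedly until it reports a terminal state."""
    terminal_states = {"finished", "failed"}  # assumed status names
    waited = 0.0
    while True:
        status = get_status()
        if status in terminal_states:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"job still '{status}' after {waited:.0f} s")
        sleep(poll_seconds)
        waited += poll_seconds


# Simulated status sequence standing in for real API responses
statuses = iter(["queued", "started", "started", "finished"])
final_status = wait_for_job(lambda: next(statuses), poll_seconds=0.0, sleep=lambda _: None)
print(final_status)  # finished
```

The default timeout of 3600 seconds mirrors the `timeout="1h"` value passed to `submit_job` above.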

If we examine the results in MLflow, we will see that the average precision (AP) score for the stop sign class drops on the poisoned portion of the test results.
This highlights the ease with which attackers can control the attack: only signs that include the trigger produce an adversarial prediction.
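
To make the comparison concrete, the per-class metrics of the two runs can be subtracted. The sketch below works on plain dictionaries; in practice they could be read with `MlflowClient().get_run(run_id).data.metrics`. The metric names and values here are placeholders chosen for illustration, not results from this demo.

```python
def ap_changes(clean, poisoned):
    """Return poisoned-minus-clean for every metric present in both runs."""
    shared = sorted(clean.keys() & poisoned.keys())
    return {name: poisoned[name] - clean[name] for name in shared}


# Placeholder metric names and values, for illustration only
clean_metrics = {"AP-stop": 80.0, "AP-speedlimit": 75.0}
poisoned_metrics = {"AP-stop": 30.0, "AP-speedlimit": 74.0}

changes = ap_changes(clean_metrics, poisoned_metrics)
for name, delta in changes.items():
    print(f"{name}: {delta:+.1f}")
```

A large negative change for one class alongside small changes for the others is the pattern described above.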