# PyTorch Detectron2 Road Signs Poisoning Demo

This notebook **assumes that you have a copy of the Road Signs dataset and have access to a CUDA-compatible GPU (this demo cannot be run on a typical personal computer)**.
Please see the [example README](README.md) for instructions on how to prepare your environment for running this example.


## Setup

Below we import the necessary Python modules and ensure the proper environment variables are set so that all the code blocks will work as expected.

In [None]:
# Import packages from the Python standard library
import importlib.util
import os
import sys
import pprint
import time
import warnings
from pathlib import Path

def register_python_source_file(module_name: str, filepath: Path) -> None:
    """Import a source file directly.

    Args:
        module_name: The module name to associate with the imported source file.
        filepath: The path to the source file.

    Notes:
        Adapted from the following implementation in the Python documentation:
        https://docs.python.org/3/library/importlib.html#importing-a-source-file-directly
    """
    spec = importlib.util.spec_from_file_location(module_name, str(filepath))
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module
    spec.loader.exec_module(module)

# Filter out warning messages
warnings.filterwarnings("ignore")

# Default address for accessing the RESTful API service
RESTAPI_ADDRESS = "http://localhost:80"

# Set DIOPTRA_RESTAPI_URI variable if not defined, used to connect to RESTful API service
if os.getenv("DIOPTRA_RESTAPI_URI") is None:
    os.environ["DIOPTRA_RESTAPI_URI"] = RESTAPI_ADDRESS

# Default address for accessing the MLFlow Tracking server
MLFLOW_TRACKING_URI = "http://localhost:35000"

# Set MLFLOW_TRACKING_URI variable, used to connect to MLFlow Tracking service
if os.getenv("MLFLOW_TRACKING_URI") is None:
    os.environ["MLFLOW_TRACKING_URI"] = MLFLOW_TRACKING_URI

# Path to workflows archive
WORKFLOWS_TAR_GZ = Path("workflows.tar.gz")

EXPERIMENT_NAME = "stop_signs_demo"

# Register the examples/scripts directory as a Python module
register_python_source_file("scripts", Path("..", "scripts", "__init__.py"))

# Import third-party Python packages
import numpy as np
from mlflow.tracking import MlflowClient

from scripts.client import DioptraClient
from scripts.utils import make_tar

# Create random number generator
rng = np.random.default_rng(54399264723942495723666216079516778448)
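
The "set only if not already defined" pattern used for the two environment variables in the setup cell can also be written with `os.environ.setdefault`. The following is a minimal sketch that reuses the default addresses from this notebook; the helper name `ensure_env_defaults` is hypothetical and is not part of Dioptra.

```python
import os

# Default addresses copied from the setup cell of this notebook.
DEFAULTS = {
    "DIOPTRA_RESTAPI_URI": "http://localhost:80",
    "MLFLOW_TRACKING_URI": "http://localhost:35000",
}


def ensure_env_defaults(defaults: dict) -> dict:
    """Set each variable only if it is not already defined.

    Returns the effective value of every variable after the call.
    (Hypothetical helper, for illustration only.)
    """
    return {
        name: os.environ.setdefault(name, value)
        for name, value in defaults.items()
    }


effective = ensure_env_defaults(DEFAULTS)
for name, value in effective.items():
    print(f"{name}={value}")
```

A value already exported in the shell before starting Jupyter takes precedence over the defaults, which matches the behavior of the `os.getenv(...) is None` checks.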

## Submit and run jobs

The entrypoints that we will be running in this example are implemented in the Python source files under `src/` and the `src/MLproject` file. To run these entrypoints within Dioptra's architecture, we need to package those files into an archive and submit it to the Dioptra RESTful API to create a new job. For convenience, we provide the `make_tar` helper function defined in `examples/scripts/utils.py`.



In [None]:
make_tar(["src"], WORKFLOWS_TAR_GZ)
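
For readers curious about what packaging the workflow involves, the sketch below shows one way a directory can be bundled into a `.tar.gz` archive with the standard library. It is an assumption about the general technique, not the actual implementation of `make_tar` in `examples/scripts/utils.py`, which may differ; `make_tar_sketch` is a hypothetical name.

```python
import tarfile
import tempfile
from pathlib import Path


def make_tar_sketch(source_dirs, tarball_path):
    """Bundle each directory in source_dirs into a gzip-compressed tar archive.

    Hypothetical stand-in for illustration; not Dioptra's make_tar.
    """
    tarball_path = Path(tarball_path)
    with tarfile.open(tarball_path, "w:gz") as archive:
        for source_dir in source_dirs:
            source_dir = Path(source_dir)
            # arcname stores the directory under its own name (e.g. "src")
            # rather than under its full path on disk.
            archive.add(source_dir, arcname=source_dir.name)
    return tarball_path


# Demonstrate on a throwaway directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp, "src")
    src.mkdir()
    (src / "MLproject").write_text("name: demo\n")

    archive_path = make_tar_sketch([src], Path(tmp, "workflows.tar.gz"))
    with tarfile.open(archive_path, "r:gz") as archive:
        member_names = sorted(archive.getnames())

print(member_names)
```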

To connect with the endpoint, we will use a client class defined in the `examples/scripts/client.py` file that is able to connect with the Dioptra RESTful API using the HTTP protocol.
We connect using the client below.
The client uses the environment variable `DIOPTRA_RESTAPI_URI`, which we configured at the top of the notebook, to figure out how to connect to the Dioptra RESTful API.

In [None]:
restapi_client = DioptraClient()

We need to register an experiment under which to collect our job runs.
The code below checks if the relevant experiment exists.
If it does, the code simply returns information about the experiment; if it doesn't, the code registers the new experiment.

In [None]:
response_experiment = restapi_client.get_experiment_by_name(name=EXPERIMENT_NAME)

if response_experiment is None or "Not Found" in response_experiment.get("message", []):
    response_experiment = restapi_client.register_experiment(name=EXPERIMENT_NAME)

response_experiment

Next, we need to train our model on a GPU, which we do by submitting our job to the `"pytorch_gpu"` queue.
We will train two models with ResNet-50 backbones on the dataset: one on regular data and the other on poisoned data.

There are three baseline models available in this demo. To change the model architecture being trained, replace the value of `model_base` with one of the commented-out options at the top of the cell.

In [None]:
# "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"
# "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"
model_base = "COCO-Detection/retinanet_R_50_FPN_3x.yaml"

response_model_train = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="train",
    entry_point_kwargs=" ".join([
        "-P dataset_type=pasal_voc",
        f"-P model_architecture={model_base}",
        "-P class_names=stop,crosswalk,speedlimit,trafficlight",
        f"-P register_model_name={EXPERIMENT_NAME}_ret_model",
        "-P max_iter=3000",
    ]),
    queue="pytorch_gpu",
    timeout="1h",
)

print("Training job submitted.")
print("")
pprint.pprint(response_model_train)
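
The `entry_point_kwargs` argument is a single string of space-separated `-P key=value` flags. When there are many parameters, it can be easier to keep them in a dictionary and render the string from it. The helper below is a hypothetical convenience, not part of the Dioptra client, and it assumes that no value contains spaces.

```python
def build_entry_point_kwargs(params: dict) -> str:
    """Render a parameter mapping as space-separated `-P key=value` flags.

    Hypothetical helper; assumes values contain no whitespace.
    """
    return " ".join(f"-P {key}={value}" for key, value in params.items())


# A subset of the training parameters used in this notebook.
train_params = {
    "model_architecture": "COCO-Detection/retinanet_R_50_FPN_3x.yaml",
    "class_names": "stop,crosswalk,speedlimit,trafficlight",
    "max_iter": 3000,
}

kwargs_string = build_entry_point_kwargs(train_params)
print(kwargs_string)
```

The resulting string could then be passed as `entry_point_kwargs=kwargs_string` in the `submit_job` call.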

## Viewing Results

The following helper functions will recheck the job responses until the job is completed or a run ID is available. 
The run ID is needed to link dependencies between jobs.

In [None]:
def mlflow_run_id_is_not_known(job_response):
    return job_response["mlflowRunId"] is None and job_response["status"] not in [
        "failed",
        "finished",
    ]


def get_run_id(job_response):
    while mlflow_run_id_is_not_known(job_response):
        time.sleep(1)
        job_response = restapi_client.get_job_by_id(job_response["jobId"])
        
    return job_response


def wait_until_finished(job_response):
    # First make sure job has started.
    job_response = get_run_id(job_response)
    
    # Next re-check job until it has stopped running.
    while job_response["status"] not in ["failed", "finished"]:
        time.sleep(1)
        job_response = restapi_client.get_job_by_id(job_response["jobId"])
    
    return job_response    

# Helper function for viewing MLflow results.
def get_mlflow_results(job_response):
    """Wait for the job to stop, then return its MLflow run once metrics exist.

    Returns an empty dict if the job failed.
    """
    mlflow_client = MlflowClient()
    job_response = wait_until_finished(job_response)

    if job_response["status"] == "failed":
        return {}

    run = mlflow_client.get_run(job_response["mlflowRunId"])

    # Poll until at least one metric has been logged for the run.
    while len(run.data.metrics) == 0:
        time.sleep(1)
        run = mlflow_client.get_run(job_response["mlflowRunId"])
        
    return run
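
The polling loops above run until the job reaches a terminal status, with no upper bound on how long they wait. If a bounded wait is preferred, the same idea can be expressed as a generic polling function with a deadline. This is a minimal sketch: `poll_until` is a hypothetical helper, and the fake fetch function and its `"started"` status stand in for the REST API so that the example runs without a Dioptra deployment.

```python
import time


def poll_until(fetch, is_done, interval=1.0, timeout=3600.0):
    """Call fetch() repeatedly until is_done(result) is true.

    Raises TimeoutError if the condition is not met within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Stand-in for the REST API: reports "started" twice, then "finished".
fake_statuses = iter(["started", "started", "finished"])


def fake_fetch():
    return {"jobId": "hypothetical-id", "status": next(fake_statuses)}


final = poll_until(
    fake_fetch,
    is_done=lambda job: job["status"] in ("failed", "finished"),
    interval=0.01,
    timeout=5.0,
)
print(final["status"])
```

With the notebook's client, `fetch` could be a small lambda that calls `restapi_client.get_job_by_id` with the job's ID.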

Once our initial training run completes, we can see that the overall Average Precision (AP) score is around 65% and the per-class AP scores range from 62% to 84%.

In [None]:
results = get_mlflow_results(response_model_train)
pprint.pprint(results.data.metrics)
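
Note that `get_mlflow_results` returns an empty dictionary when the job fails, in which case `results.data.metrics` raises an `AttributeError`. One way to guard against this is sketched below. The helper `extract_metrics` is hypothetical, and the run-like object and its metric name are made up so that the example runs without an MLflow server.

```python
from types import SimpleNamespace


def extract_metrics(run) -> dict:
    """Return the run's metrics, or an empty dict when no run is available."""
    data = getattr(run, "data", None)
    return dict(getattr(data, "metrics", {}) or {})


# Stand-ins: a failed job (empty dict) and a run-like object with a
# made-up metric name and value.
failed_result = {}
fake_run = SimpleNamespace(data=SimpleNamespace(metrics={"hypothetical_metric": 0.5}))

print(extract_metrics(failed_result))
print(extract_metrics(fake_run))
```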

Now let's examine the impacts of poisoning during training.

## Training with Stop Sign Poisoning

For this part of the demo, we will train a new custom model on poisoned training data.
The poison causes stop sign predictions to be redirected to `speedlimit` predictions whenever the trigger is present in the test data.

There are three baseline models available in this demo. To change the model architecture being trained, replace the value of `model_base` with one of the commented-out options at the top of the cell.

In [None]:
# "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"
# "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"
model_base = "COCO-Detection/retinanet_R_50_FPN_3x.yaml"


response_poison_train = restapi_client.submit_job(
    workflows_file=WORKFLOWS_TAR_GZ,
    experiment_name=EXPERIMENT_NAME,
    entry_point="poison_train",
    entry_point_kwargs=" ".join([
        "-P dataset_type=pasal_voc",
        "-P poison_class_label=2",
        "-P poison_class_target=0",
        "-P poison_rel_y_location=0.8",
        "-P poison_scale=0.4",
        "-P poison_color=55,230,225",
        f"-P model_architecture={model_base}",
        "-P class_names=stop,crosswalk,speedlimit,trafficlight",
        f"-P register_model_name={EXPERIMENT_NAME}_poisoned_ret_model",
        "-P max_iter=3000",
    ]),
    queue="pytorch_gpu",
    timeout="1h",
)

print("Training job submitted")
print("")
pprint.pprint(response_poison_train)
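
The `poison_class_label` and `poison_class_target` parameters are integers, so it can help to read them against the `class_names` list. The sketch below assumes that class indices are zero-based and follow the order of the names in `class_names`, which this notebook does not state explicitly. How the entry point interprets the two parameters is defined by the code under `src/`.

```python
# Class list as passed to the entry point in this notebook.
class_names = "stop,crosswalk,speedlimit,trafficlight".split(",")

# Assumption: indices are zero-based and follow the order of class_names.
index_to_name = dict(enumerate(class_names))

for index, name in index_to_name.items():
    print(index, name)
```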

If we examine the results in MLflow, we will immediately notice that the AP score for stop signs drops for the poisoned portion of the test results.
This highlights the ease with which attackers can control the attack: only signs that include the trigger signal result in an adversarial prediction.

In [None]:
results = get_mlflow_results(response_poison_train)
pprint.pprint(results.data.metrics)
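
To compare the clean and poisoned runs side by side, the two metric dictionaries can be differenced for every metric name they share. The helper `metric_deltas` is hypothetical, and the metric names and numbers below are made up purely for illustration; they are not results from this demo.

```python
def metric_deltas(baseline: dict, other: dict) -> dict:
    """Return other - baseline for every metric name present in both mappings."""
    shared_names = sorted(baseline.keys() & other.keys())
    return {name: other[name] - baseline[name] for name in shared_names}


# Made-up values for illustration only; not results from this demo.
clean_metrics = {"metric_a": 60.0, "metric_b": 80.0}
poisoned_metrics = {"metric_a": 58.0, "metric_b": 20.0, "metric_c": 1.0}

deltas = metric_deltas(clean_metrics, poisoned_metrics)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f}")
```

In the notebook, the two inputs could be the `run.data.metrics` dictionaries returned for `response_model_train` and `response_poison_train`.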