# Setup

Perform all necessary imports up front.

In [None]:
!pip install --quiet --quiet armory-library armory-examples[huggingface]

In [None]:
# Python standard imports
from pprint import pprint

# armory-library imports
import armory.engine
import armory.evaluation
import armory.utils

# armory-examples imports
import armory.examples.image_classification.food101 as food101
from armory.examples.utils.display import display_image_classification_results

# Define the Evaluation

In [None]:
evaluation = armory.evaluation.Evaluation(
    name="image-classification-food101",
    description="Image classification of food-101",
    author="TwoSix",
)

## Model

From our `food101` example, we will load a model from HuggingFace that has
already been fine-tuned on the food-101 dataset. We also wrap this model in an
Adversarial Robustness Toolbox (ART) estimator so that we can use an ART attack
against the model.

In [None]:
with evaluation.autotrack():
    model, art_estimator = food101.load_model()
evaluation.use_model(model)

## Dataset

From our `food101` example, we will load the food-101 dataset from HuggingFace.

To get a variety of classes in this demonstration, we shuffle the dataset with
a fixed seed, which keeps the sample order reproducible between runs.
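
To illustrate why a fixed seed gives a repeatable order, here is a minimal standard-library sketch. It is not how `food101.load_huggingface_dataset` shuffles internally; `shuffled_indices` is a hypothetical helper used only for this illustration.

```python
import random


def shuffled_indices(n, seed):
    """Return a reproducible permutation of range(n) for the given seed.

    Hypothetical helper for illustration; not part of armory-library.
    """
    rng = random.Random(seed)  # private generator, global random state untouched
    indices = list(range(n))
    rng.shuffle(indices)
    return indices


first = shuffled_indices(10, seed=8675309)
second = shuffled_indices(10, seed=8675309)
print(first == second)  # prints True: same seed, same order
```

Because the order depends only on the seed, the same samples land in the first batches every time the notebook is rerun.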

In [None]:
with evaluation.autotrack():
    dataset, labels = food101.load_huggingface_dataset(
        batch_size=2,
        shuffle=True,
        seed=8675309,
    )
evaluation.use_dataset(dataset)

## Attack

From our `food101` example, we create a Projected Gradient Descent (PGD) attack
using the Adversarial Robustness Toolbox (ART).
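
For intuition about what a PGD attack does, the sketch below runs an L-infinity PGD loop against a toy logistic-regression model in plain Python. It is an assumption-laden illustration, not the ART implementation and not what `food101.create_pgd_attack` configures; the model weights, `eps`, `step`, and `iters` values are all made up for the example.

```python
import math


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def loss_gradient(x, y, w, b):
    """Gradient of the logistic loss with respect to the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
    return [(p - y) * wi for wi in w]


def pgd_linf(x, y, w, b, eps=0.1, step=0.02, iters=10):
    """Toy L-infinity PGD: signed gradient ascent with projection."""
    x_adv = list(x)
    for _ in range(iters):
        grad = loss_gradient(x_adv, y, w, b)
        # Step in the direction that increases the loss.
        x_adv = [xa + step * sign(g) for xa, g in zip(x_adv, grad)]
        # Project back into the eps-ball around the original input.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
        # Keep values in the valid [0, 1] pixel range.
        x_adv = [min(max(xa, 0.0), 1.0) for xa in x_adv]
    return x_adv


# Illustrative values only.
x = [0.5, 0.5, 0.5]
w = [1.0, -2.0, 0.5]
x_adv = pgd_linf(x, y=1, w=w, b=0.0)
print(x_adv)
```

The projection step is what distinguishes PGD from plain iterated gradient ascent: however many steps are taken, no input value moves more than `eps` away from its original.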

In [None]:
with evaluation.autotrack():
    attack = food101.create_pgd_attack(art_estimator)

## Metrics

From our `food101` example, we create the metrics to be collected during the
evaluation. These include the L-infinity norm distance between the unperturbed
and perturbed inputs, and the categorical accuracy of the predicted labels
measured against the natural (ground-truth) labels.
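
As a rough guide to what these two quantities measure, the following plain-Python sketch computes them on small hand-made values. The function names are hypothetical and this is not the code behind `food101.create_metrics`.

```python
def linf_distance(original, perturbed):
    """Largest absolute per-element change between two flat inputs."""
    return max(abs(o - p) for o, p in zip(original, perturbed))


def categorical_accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted class equals the true class."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


print(linf_distance([0.0, 0.5, 1.0], [0.0, 0.75, 0.875]))  # 0.25
print(categorical_accuracy([3, 7, 7, 1], [3, 7, 2, 1]))  # 0.75
```

A small L-infinity distance combined with a large drop in accuracy indicates that a barely visible perturbation is enough to change the model's predictions.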

In [None]:
evaluation.use_metrics(
    food101.create_metrics()
)

## Exporters

From our `food101` example, we create the exporters used to record sample images
during the evaluation.

In [None]:
evaluation.use_exporters(
    food101.create_exporters(model, export_every_n_batches=1)
)

## Evaluation Chains

We will define two perturbation chains: `benign` and `attack`. The benign chain
does not apply any perturbations to the data, giving us the intrinsic
performance of the model. The attack chain will give us the performance of the
model under adversarial attack.
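
Conceptually, a chain can be thought of as an ordered list of perturbations applied to each batch, with the benign chain being the empty list. The sketch below models that idea in plain Python; `run_chain` and `add_offset` are hypothetical names, and this is not how armory-library implements chains.

```python
def run_chain(batch, perturbations):
    """Apply each perturbation in order; an empty list leaves the batch as-is."""
    for perturb in perturbations:
        batch = perturb(batch)
    return batch


def add_offset(batch):
    """Hypothetical stand-in for an attack: brighten and clip to [0, 1]."""
    return [min(x + 0.1, 1.0) for x in batch]


chains = {"benign": [], "attack": [add_offset]}
batch = [0.2, 0.95]
outputs = {name: run_chain(batch, perts) for name, perts in chains.items()}
print(outputs)
```

Running the same batch through both chains is what lets the benign and attacked results be compared sample by sample.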

In [None]:
with evaluation.add_chain("benign"):
    pass

with evaluation.add_chain("attack") as chain:
    chain.add_perturbation(attack)

# Execute the Evaluation

We create an evaluation engine, which applies all perturbations, obtains
predictions from the model, collects metrics, and exports samples.

In [None]:
engine = armory.engine.EvaluationEngine(
    evaluation,
    limit_test_batches=2,
)
results = engine.run()

In [None]:
pprint(results)

In [None]:
display_image_classification_results(
    chains={name: res["run_id"] for name, res in results.items()},
    batch_idx=0,
    batch_size=2,
    labels=labels,
)

In [None]:
display_image_classification_results(
    chains={name: res["run_id"] for name, res in results.items()},
    batch_idx=1,
    batch_size=2,
    labels=labels,
)

# Additional Code

To keep this notebook concise, some helper functions have been imported
from [armory.examples.image_classification.food101](https://github.com/twosixlabs/armory-library/blob/master/examples/src/armory/examples/image_classification/food101.py).
