### Setup Instructions

Please follow the [setup instructions](/docs/cv-tutorial-setup) to prepare your environment if you haven't yet.  
This tutorial will be referencing this [Notebook](https://github.com/outerbounds/tutorials/blob/main/cv/cv-intro-5.ipynb).

[Tagging](https://docs.metaflow.org/scaling/tagging#tagging) lets you categorize and organize the runs of your flows, so you can mark the models they produce however you wish. This can be done via the Metaflow UI or programmatically. Tags are useful for tasks such as marking which models are production candidates and analyzing which model architectures converge.

Since tagging is fundamentally about interpreting the results of flows, let's start by loading run data from the `TuningFlow` you built in [lesson 4](/docs/cv-tutorial-L4), along with the runs of `ModelComparisonFlow`. The data can be accessed in any Python environment using Metaflow's Client API:

In [1]:
import pandas as pd
from metaflow import Flow
model_comparison_flow = Flow('ModelComparisonFlow')
tuning_flow = Flow('TuningFlow')

Next we define a function to parse the data in the runs. 
This `get_stats` function progressively builds up a dictionary called `stats`.
Each new entry in the `stats` dictionary contains the metrics and metadata of the best model from one successful run, such as a run of `TuningFlow`.

In [2]:
def get_stats(stats, run):
    # Skip failed runs and runs that never stored a `results` artifact.
    if run.successful and hasattr(run.data, 'results'):
        results = run.data.results
        # idxmax returns an index label, so look the row up with .loc rather than .iloc.
        best_run = results.loc[results['test accuracy'].idxmax()]
        stats['flow id'].append(run.id)
        stats['flow name'].append(run.parent.pathspec)
        stats['model name'].append(best_run['model'])
        stats['test accuracy'].append(best_run['test accuracy'])
        stats['test loss'].append(best_run['test loss'])
    return stats
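
One detail in `get_stats` is worth a closer look: `idxmax` returns an index label, not a row position. The two coincide on a default integer index, but they diverge once rows have been filtered or re-indexed, in which case the label-based `.loc` is the safe lookup. The sketch below illustrates this with a small made-up results table; the model names, accuracies, and index labels are hypothetical and not taken from the tutorial's flows.

```python
import pandas as pd

# Hypothetical results table with non-default index labels,
# such as those left behind after filtering rows out of a larger frame.
results = pd.DataFrame(
    {
        "model": ["model_a", "model_b", "model_c"],
        "test accuracy": [0.91, 0.99, 0.95],
    },
    index=[10, 20, 30],
)

# idxmax gives the label of the row with the highest accuracy.
best_label = results["test accuracy"].idxmax()

# .loc looks rows up by label, so it finds the intended row.
best_row = results.loc[best_label]
print(best_label, best_row["model"])

# .iloc looks rows up by position; position 20 does not exist in a 3-row frame.
try:
    results.iloc[best_label]
    iloc_failed = False
except IndexError:
    iloc_failed = True
print("iloc lookup failed:", iloc_failed)
```

If the `results` artifact always carries a default `0..n-1` index, both lookups return the same row.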

Next we loop through runs of `TuningFlow` and `ModelComparisonFlow` and aggregate `stats`:

In [4]:
stats = {
    'flow id': [],
    'flow name': [],
    'model name': [],
    'test accuracy': [],
    'test loss': []
}

for run in tuning_flow.runs():
    stats = get_stats(stats, run)
    
for run in model_comparison_flow.runs():
    stats = get_stats(stats, run)

In [5]:
best_models = pd.DataFrame(stats)
best_models

|   | flow id | flow name | model name | test accuracy | test loss |
|---|---|---|---|---|---|
| 0 | 1665537445969331 | TuningFlow | CNN | 0.9866 | 0.041102 |
| 1 | 1665536981222257 | ModelComparisonFlow | CNN | 0.9913 | 0.026922 |
| 2 | 1665533644512368 | ModelComparisonFlow | CNN | 0.9917 | 0.025901 |
| 3 | 1665532478711797 | ModelComparisonFlow | CNN | 0.9894 | 0.031949 |


With the `best_models` table, we can sort by `test accuracy` in descending order and take the first row to find the run containing the best model.

In [7]:
from metaflow import Run
sorted_models = best_models.sort_values(by='test accuracy', ascending=False).iloc[0]
run = Run("{}/{}".format(sorted_models['flow name'], sorted_models['flow id']))
run

Run('ModelComparisonFlow/1665533644512368')
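
To make the selection step concrete without needing pandas or Metaflow, here is a minimal sketch that picks the same run directly from a `stats`-style dictionary of lists. The values are copied from the table above; the variable names `best_position` and `best_pathspec` are introduced here for illustration only.

```python
# Values copied from the best_models table shown earlier.
stats = {
    "flow id": [
        "1665537445969331",
        "1665536981222257",
        "1665533644512368",
        "1665532478711797",
    ],
    "flow name": [
        "TuningFlow",
        "ModelComparisonFlow",
        "ModelComparisonFlow",
        "ModelComparisonFlow",
    ],
    "test accuracy": [0.9866, 0.9913, 0.9917, 0.9894],
}

# Position of the highest test accuracy across all collected runs.
accuracies = stats["test accuracy"]
best_position = max(range(len(accuracies)), key=accuracies.__getitem__)

# A Run pathspec has the form "<flow name>/<run id>".
best_pathspec = "{}/{}".format(
    stats["flow name"][best_position], stats["flow id"][best_position]
)
print(best_pathspec)  # ModelComparisonFlow/1665533644512368
```

The resulting string is the pathspec that the `Run(...)` constructor receives in the cell above.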

Next, we can use the model to make predictions and compare them with the true targets as a sanity check:

In [19]:
from tensorflow import keras
import numpy as np

# get data samples
((x_train, y_train), (x_test, y_test)) = keras.datasets.mnist.load_data()
x_test = np.expand_dims(x_test.astype("float32") / 255, -1)

# use best_model from the Metaflow run
logits = run.data.best_model.predict(x_test)
softmax = keras.layers.Softmax(axis=1)
probs = softmax(logits).numpy()
pred = probs.argmax(axis=1)



In [21]:
print("Model predicts {}".format(pred))
print("  True targets {}".format(y_test))

Model predicts [7 2 1 ... 4 5 6]
  True targets [7 2 1 ... 4 5 6]
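
The printed arrays are truncated, so they only confirm agreement on the first and last few samples. A fuller check is the fraction of predictions that match the targets. The sketch below walks through the same post-processing steps (softmax, then argmax, then comparison) in plain Python on a tiny hand-made batch; the logits and targets are hypothetical values chosen for illustration, not outputs of the tutorial's model.

```python
import math


def softmax(logits):
    """Convert one row of logits into probabilities that sum to 1."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Return the position of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical logits for three samples over three classes.
batch_logits = [
    [2.0, 0.5, -1.0],
    [0.1, 0.2, 3.0],
    [-0.5, 1.5, 1.0],
]
true_targets = [0, 2, 2]

# Predicted class = most probable class for each sample.
predictions = [argmax(softmax(row)) for row in batch_logits]

# Fraction of samples where the prediction matches the target.
matches = sum(p == t for p, t in zip(predictions, true_targets))
accuracy = matches / len(true_targets)
print(predictions, accuracy)
```

With the NumPy arrays from the cells above, the equivalent check is `(pred == y_test).mean()`. Because softmax preserves the ordering of the logits, taking the argmax of the logits directly yields the same predicted classes.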


Now that you can search through the flows and the models trained in them, it is time to leverage tagging. 
You can call `add_tag` on runs that meet any condition you find suitable.
In this case we look for models with a `test accuracy` above `threshold = 0.985`. 
Runs whose best model exceeds this threshold are tagged as `production`.

In [23]:
threshold = 0.985
for run in tuning_flow:
    if run.successful and hasattr(run.data, 'results'):
        if run.data.results['test accuracy'].max() > threshold:
            run.add_tag('production')
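
The tagging rule can also be factored into a small helper that is easy to test away from a live Metaflow deployment. The following is a sketch under stated assumptions: `StubRun` is a hypothetical stand-in that mimics only the `id`, `successful`, and `add_tag` parts of a Metaflow run, and `tag_runs_above` and its `accuracy_of` parameter are names introduced here, not part of Metaflow.

```python
class StubRun:
    """Hypothetical stand-in for a Metaflow Run, for illustration only."""

    def __init__(self, run_id, successful, best_accuracy):
        self.id = run_id
        self.successful = successful
        self.best_accuracy = best_accuracy
        self.tags = set()

    def add_tag(self, tag):
        self.tags.add(tag)


def tag_runs_above(runs, accuracy_of, threshold, tag="production"):
    """Tag successful runs whose accuracy is strictly above the threshold.

    accuracy_of is a callable that extracts the accuracy from a run.
    Returns the ids of the runs that were tagged.
    """
    tagged_ids = []
    for run in runs:
        if not run.successful:
            continue  # failed runs are never production candidates
        if accuracy_of(run) > threshold:
            run.add_tag(tag)
            tagged_ids.append(run.id)
    return tagged_ids


runs = [
    StubRun("1", successful=True, best_accuracy=0.9866),
    StubRun("2", successful=True, best_accuracy=0.985),  # equal, not above
    StubRun("3", successful=False, best_accuracy=0.99),  # failed run
]
tagged_ids = tag_runs_above(runs, lambda run: run.best_accuracy, threshold=0.985)
print(tagged_ids)  # ['1']
```

With real runs, `accuracy_of` could be `lambda run: run.data.results['test accuracy'].max()`, applied after filtering out runs that lack a `results` artifact. Note that the comparison is strict, so a run sitting exactly at the threshold is not tagged.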

Now runs can be accessed by filtering on this tag:

In [24]:
from metaflow import Flow
production_runs = Flow('TuningFlow').runs('production')
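
`Flow.runs` returns an iterator, so the filtered runs still need to be consumed. As a rough sketch, the helper below collects the pathspec and tags of every run carrying a given tag. The function name `production_run_summaries` and its parameters are hypothetical; it assumes Metaflow is installed and that the named flow has been run in the current Metaflow environment.

```python
def production_run_summaries(flow_name="TuningFlow", tag="production"):
    """Return one summary dict per run of flow_name that carries tag."""
    # Imported inside the function so the helper can be defined
    # even in an environment where Metaflow is not installed.
    from metaflow import Flow

    summaries = []
    for run in Flow(flow_name).runs(tag):
        summaries.append(
            {
                "pathspec": run.pathspec,  # "<flow name>/<run id>"
                "tags": sorted(run.tags),  # includes system tags as well
            }
        )
    return summaries
```

Calling `production_run_summaries()` after the tagging cell above should list only the runs that were marked `production`. When several tags are passed to `runs`, only runs that carry all of them are returned.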

In this lesson you saw how to load and analyze results of your flows. 
You added tags to runs that met your requirements for production quality.
In the next lesson, you will see how to use the models that meet these requirements in a prediction flow.