Alexander S. Lundervold, 22.04.22

# Introduction

In the previous notebooks, we've set up a pipeline that starts with ingesting data and ends with training and tuning a machine learning model. 

Here's what we have so far: 

<img width=100% src="assets/pipeline_4.png">

Now, we want to **investigate the performance** of the model, and look into how we can do **model validation**, using TensorFlow Extended.

For this, we'll use the **TensorFlow Model Analysis** library, and the TFX components `Evaluator` and `Pusher`. 

The `Evaluator` component will use the examples from our `ExampleGen` component and the saved model from our `Trainer`. It produces analysis and validation results, which are tracked as artifacts in the ML Metadata (MLMD) store. 

If the results are sufficiently good (with thresholds specified by us or based on an automatic comparison with a baseline model), the model is "blessed". It is then passed to the `Pusher` whose job it is to push the model to a deployment target.

> Note that this will be a very quick peek at the challenges and solutions related to model analysis. As always, have a look at the [TFX guide](https://tensorflow.google.cn/tfx/guide) and links therein for additional details.

# Setup

Import basic libraries:

In [None]:
%matplotlib inline
import os
from pathlib import Path

Check whether we're running on Colab:

In [None]:
try:
    # The Colab runtime ships the `google.colab` package
    import google.colab
    colab = True
except ImportError:
    colab = False

Set up data directories:

In [None]:
if colab:
    from google.colab import drive
    drive.mount('./gdrive')
    DATA = Path('./gdrive/MyDrive/ColabData/petfinder-mini/csv')
else:
    NB_DIR = Path.cwd()
    DATA = NB_DIR/'..'/'data'/'petfinder-mini'/'csv'
    
SPLIT_DATA = DATA/'..'/'split_csv'

In [None]:
# To use a specific GPU in a multi-GPU setup.
# Remove this if you're using a single-GPU system.
# This must be set before TensorFlow is imported.
os.environ["CUDA_VISIBLE_DEVICES"] = "2"

Install TFX and import components:

In [None]:
if colab:
    !pip install -U tfx

> If on Colab, restart the runtime after running the above cell

In [None]:
import tensorflow as tf

In [None]:
import tfx

Set up the interactive context for running TFX components:

In [None]:
from tfx.orchestration.experimental.interactive.interactive_context import InteractiveContext

In [None]:
context = InteractiveContext()

# Recreate the previous pipeline

In [None]:
from tfx.components import CsvExampleGen
from tfx.components import StatisticsGen
from tfx.components import SchemaGen
from tfx.components import ExampleValidator
from tfx.components import Transform
from tfx.components import Trainer

In [None]:
from tfx.proto import trainer_pb2

In [None]:
# Generate examples
example_gen = CsvExampleGen(input_base=str(DATA)+'/')

# Generate statistics
statistics_gen = StatisticsGen(examples=example_gen.outputs['examples'])

# Automatic data schema (in a more realistic setting we would have 
# used a manually modified schema saved to disk)
schema_gen = SchemaGen(statistics=statistics_gen.outputs['statistics'])

# Validate examples
example_validator = ExampleValidator(
    statistics=statistics_gen.outputs['statistics'],
    schema=schema_gen.outputs['schema'])

# Preprocess
pets_transform_file = 'pets_transforms.py'

transform = Transform(
    examples=example_gen.outputs['examples'],
    schema=schema_gen.outputs['schema'],
    module_file=os.path.abspath(pets_transform_file))

# Train a model
trainer = Trainer(
    module_file=os.path.abspath('module.py'),
    transformed_examples=transform.outputs['transformed_examples'],
    schema=schema_gen.outputs['schema'],
    transform_graph=transform.outputs['transform_graph'],
    train_args=trainer_pb2.TrainArgs(splits=['train'], num_steps=200),
    eval_args=trainer_pb2.EvalArgs(splits=['eval'], num_steps=100)
)

## Run the components

In [None]:
for component in [example_gen, statistics_gen, schema_gen, example_validator, transform, trainer]:
    context.run(component)

# Define the evaluation setup

## `Resolver`

We want to set some requirements that our model must meet before it is "blessed" and pushed to production. Perhaps we want it to exceed a performance threshold, or to outperform a previously "blessed" model in some sense. 

This can be achieved by using a `Resolver`.
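
To make the idea concrete, here is a minimal conceptual sketch of what "resolve the latest blessed model" means. It is plain Python and not how TFX implements the strategy: `ModelRecord`, its fields, and `latest_blessed_model` are hypothetical names made up for this illustration, and we assume that a larger `model_id` means a more recent model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelRecord:
    """Hypothetical stand-in for a model artifact and its blessing status."""
    model_id: int   # assumption: larger id = more recent model
    uri: str
    blessed: bool


def latest_blessed_model(models: List[ModelRecord]) -> Optional[ModelRecord]:
    """Return the most recent blessed model, or None if no model is blessed."""
    blessed = [m for m in models if m.blessed]
    return max(blessed, key=lambda m: m.model_id) if blessed else None


# Made-up history of earlier pipeline runs
history = [
    ModelRecord(1, "models/1", blessed=True),
    ModelRecord(2, "models/2", blessed=False),
    ModelRecord(3, "models/3", blessed=True),
    ModelRecord(4, "models/4", blessed=False),
]

baseline = latest_blessed_model(history)
print(baseline.uri if baseline else "no baseline yet")
```

Note the empty case: on the very first run there is no blessed model yet, so there is nothing to use as a baseline.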

In [None]:
from tfx.v1.dsl import Resolver

In [None]:
from tfx.types import Channel
from tfx.types.standard_artifacts import Model, ModelBlessing

In [None]:
model_resolver = Resolver(
    strategy_class=tfx.v1.dsl.experimental.LatestBlessedModelStrategy,
    model=Channel(type=Model),
    model_blessing=Channel(type=ModelBlessing)
).with_id('latest_blessed_model_resolver')

In [None]:
context.run(model_resolver)

## Evaluation config

The evaluation config is defined using the TensorFlow Model Analysis library.

In [None]:
import tensorflow_model_analysis as tfma

In [None]:
import pets_transforms

In [None]:
LABEL_KEY = pets_transforms.LABEL_KEY
LABEL_KEY

In [None]:
# An example list of metrics for a multi-class problem.
# Note: this list is shown for illustration only. The eval config 
# below specifies its metrics directly using `tfma.MetricConfig`.
metrics = [
    tfma.metrics.ExampleCount(name='example_count'),
    tf.keras.metrics.SparseCategoricalCrossentropy(
        name='sparse_categorical_crossentropy'),
    tf.keras.metrics.SparseCategoricalAccuracy(name='accuracy'),
    tf.keras.metrics.Precision(name='precision', top_k=1),
    tf.keras.metrics.Precision(name='precision', top_k=2),
    tf.keras.metrics.Recall(name='recall', top_k=1),
    tf.keras.metrics.Recall(name='recall', top_k=2),
]

In [None]:
eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key=LABEL_KEY)],

    metrics_specs=[
        tfma.MetricsSpec(
            # The metrics added here are in addition to those saved with the
            # model (assuming either a keras model or EvalSavedModel is used).
            # Any metrics added into the saved model (for example using
            # model.compile(..., metrics=[...]), etc) will be computed
            # automatically.
            # Can use Keras metrics or TFMA metrics (including ones you create yourself).
            # You can find a list of metrics here:
            # https://github.com/tensorflow/model-analysis/blob/master/g3doc/metrics.md
            metrics=[
                tfma.MetricConfig(class_name='ExampleCount'),
                tfma.MetricConfig(
                    class_name='SparseCategoricalAccuracy',
                    threshold=tfma.MetricThreshold(
                        # Absolute requirement on the metric value
                        value_threshold=tfma.GenericValueThreshold(
                            lower_bound={'value': 0.3}),
                        # Requirement relative to the baseline model
                        change_threshold=tfma.GenericChangeThreshold(
                            direction=tfma.MetricDirection.HIGHER_IS_BETTER,
                            absolute={'value': -1e-10})))
            ]
        )
    ],

    slicing_specs=[
        # Overall metrics, computed over the entire evaluation set
        tfma.SlicingSpec(),
        # Metrics computed separately for each value of the `Type` feature
        tfma.SlicingSpec(feature_keys=['Type'])
    ])

In [None]:
eval_config
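
The two thresholds on `SparseCategoricalAccuracy` can be read as a simple decision rule. The sketch below is a simplified, plain-Python reading of that rule and not TFMA's actual validation code: `is_blessed` is a hypothetical helper, we assume inclusive comparisons at the boundaries, and we assume that the change threshold is skipped when no baseline model is available (as on the first run).

```python
from typing import Optional


def is_blessed(candidate: float,
               baseline: Optional[float],
               lower_bound: float = 0.3,
               min_absolute_change: float = -1e-10) -> bool:
    """Simplified reading of the value and change thresholds configured above."""
    # Value threshold: the metric itself must reach the lower bound
    if candidate < lower_bound:
        return False
    # Assumption: without a baseline, only the value threshold applies
    if baseline is None:
        return True
    # Change threshold (higher is better): the candidate must not be
    # worse than the baseline by more than the allowed margin
    return (candidate - baseline) >= min_absolute_change


print(is_blessed(0.35, None))   # first run, above the lower bound
print(is_blessed(0.35, 0.40))   # worse than the baseline
print(is_blessed(0.40, 0.35))   # better than the baseline
```

With a margin as small as `-1e-10`, the candidate effectively has to be at least as accurate as the baseline to be blessed.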

# Set up the `Evaluator` component

In [None]:
from tfx.components import Evaluator

In [None]:
evaluator = Evaluator(
        examples=example_gen.outputs['examples'],
        model=trainer.outputs['model'],
        baseline_model=model_resolver.outputs['model'],
        eval_config=eval_config
)

In [None]:
context.run(evaluator)

# Inspect the results

Did the model pass the validation?

In [None]:
tfma.load_validation_result(evaluator.outputs['evaluation'].get()[0].uri)

We can inspect the metrics using TFMA:

In [None]:
evaluator.outputs['evaluation'].get()[0].uri

In [None]:
eval_results = tfma.load_eval_result(evaluator.outputs['evaluation'].get()[0].uri)

In [None]:
# We may need to install some notebook extensions to make the below 
# visualizations work
!jupyter nbextension install --py --symlink tensorflow_model_analysis --sys-prefix
!jupyter nbextension enable tensorflow_model_analysis --py --sys-prefix

In [None]:
tfma.view.render_slicing_metrics(eval_results)
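
To see what the slices in the visualization represent, here is a small plain-Python sketch that computes accuracy both overall and per value of `Type`, mirroring the two slicing specs. The records are made-up toy data, not results from our model, and `accuracy_by_slice` is a hypothetical helper written for this illustration.

```python
from collections import defaultdict

# Made-up toy records: (Type, true label, predicted label)
records = [
    ("Cat", 2, 2),
    ("Cat", 4, 1),
    ("Dog", 3, 3),
    ("Dog", 0, 0),
    ("Dog", 1, 2),
    ("Dog", 4, 4),
]


def accuracy_by_slice(rows):
    """Compute accuracy for the overall slice and for each `Type` value."""
    counts = defaultdict(lambda: [0, 0])  # slice name -> [n_correct, n_total]
    for pet_type, label, prediction in rows:
        # Every record counts towards the overall slice and its own Type slice
        for key in ("Overall", pet_type):
            counts[key][0] += int(label == prediction)
            counts[key][1] += 1
    return {key: correct / total for key, (correct, total) in counts.items()}


slice_accuracy = accuracy_by_slice(records)
for name, acc in slice_accuracy.items():
    print(f"{name}: {acc:.2f}")
```

A model can look acceptable overall while performing noticeably worse on one slice, which is why we look at per-slice metrics in addition to the overall numbers.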

# Set up a `Pusher`

We can use a `Pusher` to push models to a deployment target. Importantly, it only pushes models that are blessed by our validation setup.
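
The gating idea can be sketched in a few lines of plain Python. This is a conceptual illustration under our own assumptions, not the `Pusher` implementation: `push_if_blessed` is a hypothetical helper, the timestamp-based version folder is an assumed naming scheme, and the "model" is just a placeholder file in a temporary directory.

```python
import shutil
import tempfile
import time
from pathlib import Path
from typing import Optional


def push_if_blessed(model_dir: Path, blessed: bool,
                    base_directory: Path) -> Optional[Path]:
    """Copy the model into a new version folder, but only if it is blessed."""
    if not blessed:
        return None
    # Assumed naming scheme: one sub-folder per pushed version
    destination = base_directory / str(int(time.time()))
    shutil.copytree(model_dir, destination)
    return destination


with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)

    # A placeholder standing in for an exported model
    model_dir = tmp_dir / "trained_model"
    model_dir.mkdir()
    (model_dir / "saved_model.pb").write_bytes(b"placeholder")

    serving_dir = tmp_dir / "serving_model_dir"

    # Not blessed: nothing is copied
    skipped = push_if_blessed(model_dir, blessed=False, base_directory=serving_dir)
    # Blessed: the model is copied to the serving directory
    pushed = push_if_blessed(model_dir, blessed=True, base_directory=serving_dir)
    pushed_files = sorted(p.name for p in pushed.iterdir())

print(skipped, pushed_files)
```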

In [None]:
from tfx.components import Pusher

In [None]:
from tfx.proto import pusher_pb2

In [None]:
# We'll store the pushed model locally on disk.
# In practice, this would be a location that the 
# serving instance also has access to, 
# for example a cloud storage bucket. 
serving_model_dir = 'serving_model_dir' 

In [None]:
pusher = Pusher(
  model=trainer.outputs['model'],
  model_blessing=evaluator.outputs['blessing'],
  infra_blessing=None, # We haven't looked at this
  push_destination=pusher_pb2.PushDestination(
    filesystem=pusher_pb2.PushDestination.Filesystem(
        base_directory=serving_model_dir)
  )
)

In [None]:
context.run(pusher)

# What have we done so far?

Here's our current pipeline:

<img width=100% src="assets/pipeline_5.png">

# What's next?

The next and final step in our story is **model deployment**. In the lecture, we'll point to how that can be done using **TensorFlow Serving**. 