# Model analysis

This notebook briefly demonstrates some techniques for analyzing and comparing trained models produced by the pipeline.

> **Note**: At the time of writing, this notebook must be run in the classic Jupyter Notebook interface to produce the expected output, i.e.:
```shell
# In your terminal, execute:

$ jupyter notebook

# Then launch this file
```

## Install Jupyter extensions

> **Note**: If running in a local Jupyter notebook, these Jupyter extensions must be installed in the environment before starting Jupyter. Check which versions of JupyterLab and TFMA you are running, because the extension versions in the commands below are pinned.

```shell
jupyter nbextension enable --py widgetsnbextension
jupyter nbextension install --py --symlink tensorflow_model_analysis
jupyter nbextension enable --py tensorflow_model_analysis
jupyter labextension install tensorflow_model_analysis@0.36.0
jupyter labextension install @jupyter-widgets/jupyterlab-manager@3
```

Uncomment to execute the commands:

In [1]:
# !jupyter nbextension enable --py widgetsnbextension
# !jupyter nbextension install --py --symlink tensorflow_model_analysis
# !jupyter nbextension enable --py tensorflow_model_analysis
# !jupyter labextension install tensorflow_model_analysis@0.36.0
# !jupyter labextension install @jupyter-widgets/jupyterlab-manager@3

List extensions:

In [2]:
!jupyter nbextension list
!jupyter labextension list

Known nbextensions:
  config dir: /Users/viktor.eriksson2/.jupyter/nbconfig
    notebook section
      jupyter-js-widgets/extension enabled
      - Validating: OK
      tensorflow_model_analysis/extension enabled
      - Validating: OK
  config dir: /Users/viktor.eriksson2/Documents/github/tfx-pipeline/.venv/etc/jupyter/nbconfig
    notebook section
      jupyter-js-widgets/extension enabled
      - Validating: OK
JupyterLab v3.2.9
/Users/viktor.eriksson2/Documents/github/tfx-pipeline/.venv/share/jupyter/labextensions
        @jupyter-widgets/jupyterlab-manager v3.0.1 enabled OK (python, jupyterlab_widgets)

Other labextensions (built into JupyterLab)
   app dir: /Users/viktor.eriksson2/Documents/github/tfx-pipeline/.venv/share/jupyter/lab
        tensorflow_model_analysis v0.36.0 enabled OK



In [3]:
# Import required libs
import glob
import os

import tensorflow as tf
import tensorflow_model_analysis as tfma
from tfx.orchestration import metadata
from tfx.types import standard_artifacts

# import witwidget
# from witwidget.notebook.visualization import WitWidget, WitConfigBuilder

from utils.mlmd_helpers import get_latest_artifacts

from pipeline.configs import PIPELINE_NAME

print(f'TF version: {tf.version.VERSION}')
print(f'TFMA version: {tfma.version.VERSION}')

TF version: 2.7.0
TFMA version: 0.36.0
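
The version check matters because the `labextension` commands above pin `tensorflow_model_analysis@0.36.0`. The following is a minimal sketch of how the installed versions could be read without importing the packages, assuming (as the pinned commands suggest) that the extension version should match the installed TFMA version. The helper names `installed_version` and `labextension_spec` are hypothetical, and `importlib.metadata` is aliased so that it does not shadow the `metadata` module imported from `tfx.orchestration`.

```python
from importlib import metadata as importlib_metadata


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return importlib_metadata.version(distribution)
    except importlib_metadata.PackageNotFoundError:
        return None


def labextension_spec(tfma_version):
    """Build the `name@version` argument for `jupyter labextension install`.

    Assumption: the extension version should equal the TFMA package version.
    """
    return f"tensorflow_model_analysis@{tfma_version}"


# Report the versions relevant to the extension setup
for dist in ("tensorflow-model-analysis", "jupyterlab", "notebook"):
    print(f"{dist}: {installed_version(dist) or 'not installed'}")

tfma_version = installed_version("tensorflow-model-analysis")
if tfma_version is not None:
    print("labextension to install:", labextension_spec(tfma_version))
```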


## Metadata artifacts

To investigate the outputs of the pipeline components, we first need to fetch the relevant artifacts.

We start by fetching the artifacts (if generated) from `Tuner`, `Trainer`, and `Evaluator`.

In [4]:
# Read artifact information from metadata store.

# Metadata store path
METADATA_PATH = os.path.abspath(os.path.join(os.getcwd(), '..',
                                             'outputs/tfx_metadata',
                                             PIPELINE_NAME,
                                             'metadata.db'))

# Metadata store config
metadata_connection_config = metadata.sqlite_metadata_connection_config(
              METADATA_PATH)

with metadata.Metadata(metadata_connection_config) as store:
    # Load Tuner artifacts
    try:
        tuner_artifacts = get_latest_artifacts(store, PIPELINE_NAME, 'Tuner')
    except AttributeError:
        print('Tuner not available')
        
    # Load Model artifacts
    try:
        model_artifacts = get_latest_artifacts(store, PIPELINE_NAME, 'Trainer')
    except AttributeError:
        print('Trainer not available')
    
    # Load Evaluator artifacts
    try:
        model_eval_artifacts = get_latest_artifacts(store, PIPELINE_NAME, 'Evaluator')
    except AttributeError:
        print('Evaluator not available')

We need the URIs of the artifacts.

In [5]:
# Configure output paths

# Exact paths to output artifacts can be found in the execution logs
# or KFP Web UI if you are using kubeflow.

try:
    tuner_run_path = os.path.abspath(os.path.join('..', tuner_artifacts['best_hyperparameters'][-1].uri))
    tuner_run_id = tuner_artifacts['best_hyperparameters'][-1].id
    print(f'Generated tuner result: {tuner_run_path}')
    print(f'Tuner execution id: {tuner_run_id}')
except NameError:
    print('Tuner not defined')

print('-' * 10)

try:
    model_run_path = os.path.abspath(os.path.join('..', model_artifacts['model_run'][-1].uri))
    model_run_id = model_artifacts['model_run'][-1].id
    print(f'Generated model result: {model_run_path}')
    print(f'Model execution id: {model_run_id}')
except NameError:
    print('Model not defined')

print('-' * 10)

try:
    model_blessed_path = os.path.abspath(os.path.join('..', model_eval_artifacts['blessing'][-1].uri))
    model_blessed_id = model_eval_artifacts['blessing'][-1].id
    print(f'Generated model blessing result: {model_blessed_path}')
    print(f'Blessing execution id: {model_blessed_id}')
except NameError:
    print('Model blessing not defined')

print('-' * 10)

try:
    model_eval_path = os.path.abspath(os.path.join('..', model_eval_artifacts['evaluation'][-1].uri))
    model_eval_id = model_eval_artifacts['evaluation'][-1].id
    print(f'Generated model evaluation result: {model_eval_path}')
    print(f'Evaluator execution id: {model_eval_id}')
except NameError:
    print('Model evaluation not defined')

Generated tuner result: /Users/viktor.eriksson2/Documents/github/tfx-pipeline/outputs/tfx_pipeline_output/taxi_pipeline/Tuner/best_hyperparameters/6
Tuner execution id: 13
----------
Generated model result: /Users/viktor.eriksson2/Documents/github/tfx-pipeline/outputs/tfx_pipeline_output/taxi_pipeline/Trainer/model_run/14
Model execution id: 16
----------
Generated model blessing result: /Users/viktor.eriksson2/Documents/github/tfx-pipeline/outputs/tfx_pipeline_output/taxi_pipeline/Evaluator/blessing/52
Blessing execution id: 22
----------
Generated model evaluation result: /Users/viktor.eriksson2/Documents/github/tfx-pipeline/outputs/tfx_pipeline_output/taxi_pipeline/Evaluator/evaluation/52
Evaluator execution id: 23
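
The four `try`/`except NameError` blocks above repeat the same lookup. One possible way to factor it out is sketched below. It assumes that `get_latest_artifacts` returns a dict-like mapping from output key to a list of artifacts exposing `.uri` and `.id` (which is how the cell above indexes it), and that a missing component is represented as `None` instead of an undefined name. `describe_artifact` and the stand-in data are hypothetical.

```python
import os
from types import SimpleNamespace


def describe_artifact(artifacts, key, root=".."):
    """Return (absolute_path, artifact_id) for the newest artifact under `key`.

    Returns None when `artifacts` is None/empty or has no entry for `key`.
    """
    if not artifacts or not artifacts.get(key):
        return None
    latest = artifacts[key][-1]  # the cell above also takes the last entry
    return os.path.abspath(os.path.join(root, latest.uri)), latest.id


# Hypothetical stand-in for the mapping returned by `get_latest_artifacts`;
# the URI and id are illustrative values only.
fake_artifacts = {
    "model_run": [SimpleNamespace(uri="outputs/Trainer/model_run/14", id=16)],
}

print(describe_artifact(fake_artifacts, "model_run"))
print(describe_artifact(None, "model_run"))  # component did not run
```

With this shape, each variable would be initialised to `None` before the `with metadata.Metadata(...)` block, so later cells can test for `None` rather than catching `NameError`.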


## Model training investigation

The `model_run` output acts as the working directory and can hold non-model-related output (e.g., TensorBoard logs).

In [4]:
# Load model results into TensorBoard
%load_ext tensorboard
%tensorboard --logdir {model_run_path}

## Model evaluation

The `blessing` output simply states whether the candidate model was blessed. The artifact URI contains a `BLESSED` or `NOT_BLESSED` file depending on the result. Note that on a first run the candidate passes the evaluation because there is no baseline model to compare against yet.  

The `evaluation` output, on the other hand, contains the evaluation logs and can be used to visualize the global metrics on the entire evaluation set.
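
Since the blessing result is encoded as a marker file, it can be checked programmatically. The snippet below is a minimal sketch that relies only on the `BLESSED` / `NOT_BLESSED` file names mentioned above. The function name `blessing_status` is hypothetical, and a temporary directory stands in for the real blessing URI.

```python
import tempfile
from pathlib import Path


def blessing_status(blessing_dir):
    """Return 'BLESSED', 'NOT_BLESSED', or None if neither marker file exists."""
    directory = Path(blessing_dir)
    for marker in ("BLESSED", "NOT_BLESSED"):
        if (directory / marker).exists():
            return marker
    return None


# Demonstration on a temporary directory standing in for the blessing URI
with tempfile.TemporaryDirectory() as tmp:
    print(blessing_status(tmp))        # no marker file written yet
    (Path(tmp) / "BLESSED").touch()    # simulate a blessed model
    print(blessing_status(tmp))
```

In this notebook the call would be `blessing_status(model_blessed_path)`, provided the Evaluator artifacts were loaded.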

Load evaluation results.

In [6]:
eval_result = tfma.load_eval_result(model_eval_path)

print(tfma.load_validation_result(model_eval_path))

Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`



validation_ok: true
validation_details {
  slicing_details {
    slicing_spec {
    }
    num_matching_slices: 128
  }
}



### Rendering Metrics

You can view the metrics with the [`tfma.view.render_slicing_metrics()`](https://www.tensorflow.org/tfx/model_analysis/api_docs/python/tfma/view/render_slicing_metrics) method. By default, the views will display the `Overall` slice. To view a particular slice you can pass in a feature name to the `slicing_column` argument as shown below. You can visualize the different metrics through the `Show` dropdown menu and you can hover over the bar charts to show the exact value measured. 

We encourage you to try the different options you see and also modify the command. Here are some examples:

* Removing the `slicing_column` argument will produce the overall slice.
* You can also pass in any other feature that was specified as a slice in the eval config (for example, `trip_start_month`) to see the results for that particular slice.
* Raising the `Examples (Weighted) Threshold` slider hides every slice that has fewer examples than the selected threshold.
* Toggling the `View` dropdown to `Metrics Histogram` shows the results divided into buckets. For example, if your slicing column has only two distinct values and the `Histogram Type` dropdown is set to `Slice Counts`, you will see slices in at most two of the 10 (default) buckets. The x-axis shows the values of the metric chosen in the `Select Metric` dropdown. This is the default view when the number of slices is large.
* At the bottom of the screen, the measurements are also presented in tabular format. You can sort the table by clicking on the feature name headers.
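
Conceptually, the `Examples (Weighted) Threshold` slider is a filter on the per-slice example count. The sketch below illustrates that idea on plain Python data. It is not TFMA's implementation, `visible_slices` is a hypothetical helper, and the counts are made-up illustrative values, not results from this pipeline.

```python
def visible_slices(example_counts, threshold):
    """Keep only the slices whose example count is at least `threshold`."""
    return {name: n for name, n in example_counts.items() if n >= threshold}


# Made-up example counts per slice (illustrative only)
counts = {
    "trip_start_month:5": 400,
    "trip_start_month:9": 120,
    "trip_start_month:12": 35,
}

# Slices with fewer than 100 examples are hidden
print(visible_slices(counts, 100))
```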

In [7]:
# Render overall results
tfma.view.render_slicing_metrics(eval_result)

SlicingMetricsViewer(config={'weightedExamplesColumn': 'example_count'}, data=[{'slice': 'Overall', 'metrics':…

In [11]:
# Slice on column
tfma.view.render_slicing_metrics(eval_result,
                                 slicing_column='trip_start_month')
                                 # slicing_spec=tfma.slicer.SingleSliceSpec(columns=['trip_start_hour']))

SlicingMetricsViewer(config={'weightedExamplesColumn': 'example_count'}, data=[{'slice': 'trip_start_month:5',…

In [13]:
# Slice on feature crosses
tfma.view.render_slicing_metrics(
    eval_result,
    slicing_spec=tfma.SlicingSpec(
        feature_keys=['trip_start_day', 'trip_start_month']))

SlicingMetricsViewer(config={'weightedExamplesColumn': 'example_count'}, data=[{'slice': 'trip_start_day_X_tri…

Crossing the two columns generates a lot of combinations! Let's narrow down our cross...

In [14]:
tfma.view.render_slicing_metrics(
    eval_result,
    slicing_spec=tfma.SlicingSpec(
        feature_keys=['trip_start_day'], feature_values={'trip_start_month': '9'}))

SlicingMetricsViewer(config={'weightedExamplesColumn': 'example_count'}, data=[{'slice': 'trip_start_day_X_tri…
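
To see why fixing one feature value helps: the number of slices in a feature cross is bounded by the product of the distinct values of each feature. The small calculation below assumes that `trip_start_day` is a day of the week (1 to 7) and `trip_start_month` a calendar month (1 to 12). Both ranges are assumptions about the dataset, and only combinations that actually occur in the evaluation data produce a slice.

```python
from itertools import product

days = range(1, 8)      # assumption: day of week, 1..7
months = range(1, 13)   # assumption: calendar month, 1..12

# Full cross of both features versus the cross restricted to month 9
full_cross = list(product(days, months))
narrowed = [(day, month) for day, month in full_cross if month == 9]

print(f"full cross: up to {len(full_cross)} slices")
print(f"month fixed to 9: up to {len(narrowed)} slices")
```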

In [21]:
# Render plots
tfma.view.render_plot(
    eval_result,
    tfma.SlicingSpec(feature_values={'trip_start_hour': '1'}))

ValueError: No slice matching slicing spec is found.
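
The error above indicates that no computed slice matches `trip_start_hour` = `1`. A plausible cause is that the Evaluator was not configured to slice on that feature or did not compute plots for it. Before rendering, it can help to check which slices exist. The sketch below assumes that slice keys are tuples of `(feature, value)` pairs, with `()` denoting the overall slice. In a live session they could presumably be collected with something like `[key for key, _ in eval_result.slicing_metrics]`, assuming that attribute is a list of `(slice_key, metrics)` pairs. `slices_with_feature` and the sample keys are hypothetical.

```python
def slices_with_feature(slice_keys, feature, value=None):
    """Return the slice keys containing `feature` (and `value`, if given).

    Values are compared as strings, since a spec may pass '1' where the
    key holds the integer 1.
    """
    matches = []
    for key in slice_keys:
        for name, val in key:
            if name == feature and (value is None or str(val) == str(value)):
                matches.append(key)
                break
    return matches


# Hypothetical slice keys standing in for the computed slices
computed = [
    (),  # overall slice
    (("trip_start_month", 5),),
    (("trip_start_day", 3), ("trip_start_month", 9)),
]

print(slices_with_feature(computed, "trip_start_month", "9"))
print(slices_with_feature(computed, "trip_start_hour", "1"))  # nothing to plot
```

An empty result for the requested feature means the slicing spec needs to be changed, or the Evaluator rerun with that slice configured.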

## Using the What-if Tool to interpret your model

Once your model has been deployed, you're ready to connect it to the What-if Tool using the `WitWidget`.  
>**Note**: You can ignore the message `TypeError(unsupported operand type(s) for -: 'int' and 'list')` while creating a What-if Tool visualization.

In [None]:
# Format a subset of the test data to send to the What-if Tool for visualization
# Append ground truth label value to training data

# This is the number of examples you want to display in the What-if Tool
# num_wit_examples = 500
# test_examples = np.hstack((x_test[:num_wit_examples].values,y_test[:num_wit_examples].reshape(-1,1)))

In [None]:
# Create a What-if Tool visualization, it may take a minute to load
# See the cell below this for exploration ideas

# # This prediction adjustment function is needed as this xgboost model's
# # prediction returns just a score for the positive class of the binary
# # classification, whereas the What-If Tool expects a list of scores for each
# # class (in this case, both the negative class and the positive class).
# def adjust_prediction(pred):
#   return [1 - pred, pred]
# 
# config_builder = (WitConfigBuilder(test_examples.tolist(), data.columns.tolist() + ['mortgage_status'])
#   .set_ai_platform_model(GCP_PROJECT, MODEL_NAME, VERSION_NAME, adjust_prediction=adjust_prediction)
#   .set_target_feature('mortgage_status')
#   .set_label_vocab(['denied', 'approved']))
# WitWidget(config_builder, height=800)