# Analyzing and Validating ML models

After training a model, the most important step is to analyze and validate it. This helps us improve the model and identify issues in it. To help with this in production systems, TFX provides TensorFlow Model Analysis (TFMA).

Model analysis starts with our choice of metrics. We choose the metrics that match our requirements and evaluate the model against them.

These metrics include precision, recall, F1 score, mean absolute error, mean absolute percentage error, mean squared error, etc. (these are provided by TFMA).
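
To make two of the regression metrics named above concrete, here is a minimal plain-Python sketch of their textbook definitions. The function names and sample values are made up for illustration, and this is not how TFMA computes them internally.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences |y - y_hat|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences (y - y_hat)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical ground-truth values and model predictions.
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
print(f"MAE={mae:.2f}  MSE={mse:.2f}")
```

Because the errors are squared, the mean squared error penalizes the single large miss (20) much more heavily than the mean absolute error does.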

In a TFX pipeline, TFMA calculates the metrics we define using the saved model exported by the Trainer component. With TensorBoard, we only get approximations extrapolated from measurements on mini-batches, whereas TFMA calculates the metrics over the whole evaluation set.
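
One way a mini-batch measurement can drift from the whole-set value is sketched below in plain Python. The batches and the `precision` helper are hypothetical; the point is only that averaging a ratio metric over mini-batches need not equal the same metric computed once over all examples.

```python
def precision(labels, preds):
    """TP / (TP + FP); returns 0.0 when nothing was predicted positive."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Two made-up mini-batches of (labels, predictions).
batches = [
    ([1, 0, 0, 0], [1, 0, 0, 0]),  # 1 TP, 0 FP
    ([1, 1, 0, 0], [1, 1, 1, 1]),  # 2 TP, 2 FP
]

# Average of the per-batch precision values.
batch_average = sum(precision(y, p) for y, p in batches) / len(batches)

# Precision computed once over every example.
all_labels = [y for labels, _ in batches for y in labels]
all_preds = [p for _, preds in batches for p in preds]
whole_set = precision(all_labels, all_preds)

print(f"per-batch average={batch_average:.2f}  whole set={whole_set:.2f}")
```

The first batch contributes only one positive prediction but carries the same weight as the second in the per-batch average, which is why the two numbers disagree.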

<center>


**pip install tensorflow-model-analysis**
</center>

For model analysis, TFMA expects two inputs: a saved model and an evaluation set. Below is an example of using TFMA with our previously built model.


In [1]:
import tensorflow_model_analysis as tfma
import tensorflow as tf

# Suppress TensorFlow warnings; only errors are logged
import logging
logger = tf.get_logger()
logger.setLevel(logging.ERROR)

# Load the saved model exported by the Trainer component
eval_model = tfma.default_eval_shared_model(
    eval_saved_model_path='data/tfx/Trainer/model/6/Format-Serving',
    tags=[tf.saved_model.SERVING])


Before running the analysis, we need to tell TFMA what to measure, provide any additional specifications, and, importantly, specify the target label.

In [2]:
from google.protobuf import text_format

# Setup tfma.EvalConfig settings
eval_config = text_format.Parse("""
                ## Model information
                model_specs {
                    # For keras (and serving models) we need to add a `label_key`.
                    label_key: "consumer_disputed"
                }

                metrics_specs {
                    metrics { class_name: "BinaryAccuracy" }
                    metrics { class_name: "Precision" }
                    metrics { class_name: "Recall" }
                    metrics { class_name: "ExampleCount" }
                    metrics { class_name: "FalsePositives" }
                    metrics { class_name: "TruePositives" }
                    metrics { class_name: "FalseNegatives" }
                    metrics { class_name: "TrueNegatives" }
                }

                ## Slicing information
                slicing_specs {}  # overall slice
                
                """, tfma.EvalConfig())

In [3]:
eval_result = tfma.run_model_analysis(
    eval_shared_model=eval_model,
    eval_config=eval_config,
    data_location='data/eval_inputs/data_tfrecord-00000-of-00001',
    output_path='data/eval_outputs',
    file_format='tfrecords')





Note that rendering TFMA visualizations in Jupyter notebooks requires running some additional setup commands first. Please refer to the [documentation](https://www.tensorflow.org/tfx/model_analysis/install).

In [4]:
tfma.view.render_slicing_metrics(eval_result)

SlicingMetricsViewer(config={'weightedExamplesColumn': 'example_count'}, data=[{'slice': 'Overall', 'metrics':…
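
The confusion-matrix metrics requested in the `EvalConfig` are closely related to one another. As a plain-Python sketch (not TFMA code), using made-up labels and scores and an assumed decision threshold of 0.5, the four counts determine accuracy, precision, and recall:

```python
# Hypothetical labels and model scores; the threshold is an assumption.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1, 0.4, 0.7]
THRESHOLD = 0.5

# Turn scores into hard 0/1 predictions.
preds = [1 if s > THRESHOLD else 0 for s in scores]

# The four confusion-matrix counts.
tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)

# Metrics derived from the counts.
example_count = tp + fp + fn + tn
binary_accuracy = (tp + tn) / example_count
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"accuracy={binary_accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```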

The above shows only basic usage of TFMA. More details can be found in the documentation.

A major concern in many commercial, productionized ML systems is fairness. This covers issues related to attributes such as race and gender that could negatively affect both the ML system and its users. We should therefore detect such problems early and fix them.

To do that, we can use the slicing option provided by TFMA. It lets us separate out the groups we are interested in and then check the metrics on those slices.

For example, below we define a slice specification on the `product` column of our dataset.

In [5]:
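# Use `slice_spec` as the name so Python's built-in `slice` is not shadowed.
slice_spec = [
    tfma.slicer.SingleSliceSpec(),  # Empty spec: the overall slice (whole dataset).
    tfma.slicer.SingleSliceSpec(columns=['product']),  # One slice per `product` value.
]

# Note: the `slice_spec` argument is marked deprecated in recent TFMA releases
# in favor of `slicing_specs` in the EvalConfig.
eval_result = tfma.run_model_analysis(
    eval_shared_model=eval_model,
    eval_config=eval_config,
    data_location='data/eval_inputs/data_tfrecord-00000-of-00001',
    output_path='data/eval_outputs',
    file_format='tfrecords',
    slice_spec=slice_spec)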

tfma.view.render_slicing_metrics(eval_result)



SlicingMetricsViewer(config={'weightedExamplesColumn': 'example_count'}, data=[{'slice': 'Overall', 'metrics':…
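
Conceptually, slicing on a column means grouping the evaluation examples by that column's value and computing each metric per group, alongside the overall value. The plain-Python sketch below illustrates the idea on made-up records. The product names, labels, and predictions are hypothetical, and TFMA's actual implementation (built on Apache Beam) differs.

```python
from collections import defaultdict

# Hypothetical evaluation records with a label and a hard prediction.
examples = [
    {"product": "Mortgage", "label": 1, "pred": 1},
    {"product": "Mortgage", "label": 0, "pred": 0},
    {"product": "Mortgage", "label": 1, "pred": 0},
    {"product": "Credit card", "label": 1, "pred": 1},
    {"product": "Credit card", "label": 0, "pred": 1},
]


def accuracy(rows):
    """Fraction of rows whose prediction matches the label."""
    return sum(r["label"] == r["pred"] for r in rows) / len(rows)


def slice_by(rows, column):
    """Group rows by the value found in `column`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    return dict(groups)


# The overall slice first, then one entry per product value.
metrics = {"Overall": accuracy(examples)}
for value, rows in slice_by(examples, "product").items():
    metrics[f"product:{value}"] = accuracy(rows)

for name, value in metrics.items():
    print(f"{name}: accuracy={value:.3f}")
```

A large gap between slices, as between the two hypothetical products here, is the kind of signal that an overall metric hides and that slicing is meant to surface.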