# ML with TensorFlow Extended (TFX) -- Part 4
The purpose of this tutorial is to show how to do end-to-end ML with TFX libraries on Google Cloud Platform. This tutorial covers:
1. Data analysis and schema generation with **TF Data Validation**.
2. Data preprocessing with **TF Transform**.
3. Model training with **TF Estimator**.
4. Model evaluation with **TF Model Analysis**.

## 0. Setup Python and Cloud environment

This notebook has been tested in Jupyter on the Deep Learning VM.
Before running it, complete one setup step:
* SSH to your machine from https://cloud.console.google.com/compute/instances and run:
```
    cd /home/jupyter/data-science-on-gcp/updates/tfx
    sudo ./install_tfma.sh
```
This will pip install the tensorflow-model-analysis components and restart the Jupyter service.

In [3]:
import tensorflow_model_analysis as tfma
import tensorflow as tf

In [4]:
PROJECT = 'cloud-training-demos'    # Replace with your PROJECT
BUCKET = 'cloud-training-demos-ml'  # Replace with your BUCKET
REGION = 'us-central1'              # Choose an available region for Cloud MLE

import os

os.environ['PROJECT'] = PROJECT
os.environ['BUCKET'] = BUCKET
os.environ['REGION'] = REGION

In [5]:
%%bash
gcloud config set project $PROJECT
gcloud config set compute/region $REGION

# Ensure local predictions use the current Python environment
gcloud config set ml_engine/local_python `which python`

Updated property [core/project].
Updated property [compute/region].
Updated property [ml_engine/local_python].


<img valign="middle" src="https://github.com/GoogleCloudPlatform/data-science-on-gcp/blob/master/updates/tfx/images/tfx.jpeg?raw=1">

### Flights dataset

We'll use the flights dataset from the book [Data Science on Google Cloud Platform](http://shop.oreilly.com/product/0636920057628.do).

In [6]:
DATA_BUCKET = "gs://cloud-training-demos/flights/chapter8/output/"
TRAIN_DATA_PATTERN = DATA_BUCKET + "train*"
EVAL_DATA_PATTERN = DATA_BUCKET + "test*"

In [7]:
CSV_COLUMNS = ('ontime,dep_delay,taxiout,distance,avg_dep_delay,avg_arr_delay' + 
               ',carrier,dep_lat,dep_lon,arr_lat,arr_lon,origin,dest').split(',')
TARGET_FEATURE_NAME = 'ontime'
DEFAULTS = [[0.0], [0.0], [0.0], [0.0], [0.0], [0.0],
            ['na'], [0.0], [0.0], [0.0], [0.0], ['na'], ['na']]
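
`CSV_COLUMNS` and `DEFAULTS` are positional: the n-th default supplies both the fallback value and the type of the n-th column. The following is a minimal pure-Python sketch of that pairing, not the code the earlier notebooks use for reading the data. The helper `parse_csv_line` and the sample row are hypothetical and exist only for illustration.

```python
import csv

CSV_COLUMNS = ('ontime,dep_delay,taxiout,distance,avg_dep_delay,avg_arr_delay'
               ',carrier,dep_lat,dep_lon,arr_lat,arr_lon,origin,dest').split(',')
DEFAULTS = [[0.0], [0.0], [0.0], [0.0], [0.0], [0.0],
            ['na'], [0.0], [0.0], [0.0], [0.0], ['na'], ['na']]


def parse_csv_line(line, columns=CSV_COLUMNS, defaults=DEFAULTS):
    """Hypothetical helper: parse one CSV line into a dict keyed by column name."""
    if len(columns) != len(defaults):
        raise ValueError('columns and defaults must have the same length')
    fields = next(csv.reader([line]))
    if len(fields) != len(columns):
        raise ValueError('expected {} fields, got {}'.format(len(columns), len(fields)))
    record = {}
    for name, raw, default in zip(columns, fields, defaults):
        default_value = default[0]
        if raw == '':
            # Empty field: fall back to the column's default.
            record[name] = default_value
        else:
            # Cast to the type of the default: float for numbers, str otherwise.
            record[name] = type(default_value)(raw)
    return record


# Made-up row for illustration only (not taken from the flights dataset).
# The last field (dest) is left empty to show the default being applied.
sample = '1.0,5.0,12.0,500.0,6.5,4.2,AA,33.6,-84.4,40.7,-73.9,ATL,'
row = parse_csv_line(sample)
print(row['ontime'], row['carrier'], row['dest'])
```

The length check at the top is the useful part in practice: a column added to `CSV_COLUMNS` without a matching entry in `DEFAULTS` would otherwise shift every later default by one position.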

In [8]:
# from previous steps
RAW_SCHEMA_LOCATION = 'raw_schema.pbtxt'
MODEL_DIR = 'gs://{}/flights/tfx/models/dnn_classifier'.format(BUCKET)
os.environ['MODEL_DIR'] = MODEL_DIR

## 4. Model Evaluation
To assess the quality of our trained model, we use [TensorFlow Model Analysis](https://www.tensorflow.org/tfx/guide/tfma) to perform the following:
1. **Export** the evaluation saved model (we did this in notebook #3 when we trained the model)
2. Define **data slices** for analysis
3. **Visualize** and evaluate the model



In [9]:
!gcloud storage ls {MODEL_DIR}/export
eval_model_dir = os.path.join(MODEL_DIR, "export/evaluate")

gs://cloud-training-demos-ml/flights/tfx/models/dnn_classifier/export/
gs://cloud-training-demos-ml/flights/tfx/models/dnn_classifier/export/1554183658/
gs://cloud-training-demos-ml/flights/tfx/models/dnn_classifier/export/evaluate/
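
Exports are written to subdirectories named by timestamp, so picking one usually means picking the largest numeric name. The sketch below shows one way to do that from a listing such as the one above. The helper `latest_export_dir` is hypothetical, and it assumes the listing has already been obtained as a list of path strings (for example from the `gcloud storage ls` output).

```python
def latest_export_dir(listing, parent):
    """Hypothetical helper: return the path under `parent` with the largest numeric name."""
    parent = parent.rstrip('/') + '/'
    candidates = []
    for path in listing:
        if not path.startswith(parent):
            continue
        name = path[len(parent):].strip('/')
        # Keep only direct children whose name is a pure integer timestamp.
        if name.isdigit():
            candidates.append((int(name), path.rstrip('/')))
    if not candidates:
        raise ValueError('no timestamped export found under ' + parent)
    return max(candidates)[1]


# The listing printed by the cell above.
export_root = 'gs://cloud-training-demos-ml/flights/tfx/models/dnn_classifier/export/'
listing = [
    export_root,
    export_root + '1554183658/',
    export_root + 'evaluate/',
]
print(latest_export_dir(listing, export_root))
```

The same helper could be pointed at a listing of `export/evaluate/` to find the evaluation model directory instead of hard-coding its timestamp.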


### 4.2 Prepare data slices for evaluation

In [19]:
slice_spec = [
  tfma.slicer.SingleSliceSpec(),
  tfma.slicer.SingleSliceSpec(columns=['origin']),
  tfma.slicer.SingleSliceSpec(columns=['dest']),
  tfma.slicer.SingleSliceSpec(columns=['carrier'])
]

eval_model_location = os.path.join(eval_model_dir, '1554184743')
print(eval_model_location)
eval_result_location = os.path.join(MODEL_DIR, "tfma")

eval_result = tfma.run_model_analysis(
    eval_shared_model=tfma.default_eval_shared_model(
        eval_saved_model_path=eval_model_location),
    data_location=EVAL_DATA_PATTERN,
    file_format='text',
    slice_spec=slice_spec,
    output_path=eval_result_location
)

gs://cloud-training-demos-ml/flights/tfx/models/dnn_classifier/export/evaluate/1554184743




INFO:tensorflow:Restoring parameters from gs://cloud-training-demos-ml/flights/tfx/models/dnn_classifier/export/evaluate/1554184743/variables/variables


Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`
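
To make the slice specs concrete: `SingleSliceSpec()` with no arguments denotes the overall dataset, while `SingleSliceSpec(columns=['origin'])` yields one slice per distinct value of `origin`. The following pure-Python sketch imitates that grouping for a single metric (accuracy). It is a conceptual illustration only, not how TFMA computes metrics; the function `accuracy_by_slice` and the example rows are invented for this purpose.

```python
from collections import defaultdict


def accuracy_by_slice(examples, column=None):
    """Hypothetical helper: accuracy per value of `column`, or overall if column is None."""
    counts = defaultdict(lambda: [0, 0])  # slice key -> [correct, total]
    for example in examples:
        # An empty tuple stands for the overall slice; otherwise the key is
        # a tuple of (column, value) pairs, like the keys TFMA prints.
        key = () if column is None else ((column, example[column]),)
        counts[key][0] += int(example['label'] == example['prediction'])
        counts[key][1] += 1
    return {key: correct / float(total)
            for key, (correct, total) in counts.items()}


# Invented examples for illustration only.
examples = [
    {'origin': 'ATL', 'carrier': 'DL', 'label': 1, 'prediction': 1},
    {'origin': 'ATL', 'carrier': 'AA', 'label': 0, 'prediction': 1},
    {'origin': 'GJT', 'carrier': 'AA', 'label': 1, 'prediction': 1},
    {'origin': 'GJT', 'carrier': 'DL', 'label': 0, 'prediction': 0},
]

overall = accuracy_by_slice(examples)
by_origin = accuracy_by_slice(examples, column='origin')
print(overall)
print(by_origin)
```

Slicing matters because an acceptable overall metric can hide a slice that performs much worse, as the `ATL` rows do in this toy data.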



In [10]:
## Load eval result from the saved path
eval_result_location = os.path.join(MODEL_DIR, "tfma")
eval_result = tfma.load_eval_result(eval_result_location)

Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`


In [11]:
print(eval_result.slicing_metrics[1])

((('origin', 'GJT'),), {u'accuracy_baseline': {'doubleValue': 0.8982387185096741}, u'auc': {'doubleValue': 0.9606323838233948}, u'recall': {'doubleValue': 0.9727668762207031}, u'precision': {'doubleValue': 0.9695982336997986}, u'prediction/mean': {'doubleValue': 0.8739368319511414}, u'label/mean': {'doubleValue': 0.8982387185096741}, u'average_loss': {'doubleValue': 0.1376912146806717}, u'post_export_metrics/example_count': {'doubleValue': 1022.0}, u'auc_precision_recall': {'doubleValue': 0.993732750415802}, u'accuracy': {'doubleValue': 0.9481409192085266}})
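
Each element of `eval_result.slicing_metrics`, as printed above, appears to be a `(slice_key, metrics)` pair in which every metric value is wrapped in a `{'doubleValue': ...}` dict. Assuming that structure (it may differ in other TFMA versions), a small helper can unwrap the values into plain floats for easier inspection. The helper `flatten_metrics` is hypothetical, and the sample entry is a subset of the values printed above.

```python
def flatten_metrics(entry):
    """Hypothetical helper: unwrap {'doubleValue': x} metric values into plain floats."""
    slice_key, metrics = entry
    flat = {name: value['doubleValue'] for name, value in metrics.items()}
    return slice_key, flat


# A subset of the entry printed above, copied for illustration.
entry = (
    (('origin', 'GJT'),),
    {
        'accuracy': {'doubleValue': 0.9481409192085266},
        'auc': {'doubleValue': 0.9606323838233948},
        'post_export_metrics/example_count': {'doubleValue': 1022.0},
    },
)

slice_key, flat = flatten_metrics(entry)
column, value = slice_key[0]
print('{}={}: accuracy={:.4f} over {:.0f} examples'.format(
    column, value, flat['accuracy'], flat['post_export_metrics/example_count']))
```

With the real result, the same helper could be applied to every element of `eval_result.slicing_metrics` and the output sorted by a metric to find the weakest slices.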


### 4.3 Visualize and analyze evaluation results

In [12]:
tfma.view.render_slicing_metrics(
    result=eval_result
)

SlicingMetricsViewer(config={'weightedExamplesColumn': 'post_export_metrics/example_count'}, data=[{'metrics':…


In [13]:
tfma.view.render_slicing_metrics(
    result=eval_result,
    slicing_column='origin'
)

SlicingMetricsViewer(config={'weightedExamplesColumn': 'post_export_metrics/example_count'}, data=[{'metrics':…


## License

Copyright 2019 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

---
This is not an official Google product. The sample code is provided for educational purposes only.
---