# Baseline model training & evaluation

## Abstract

This notebook trains a baseline model and evaluates it.

The chosen baseline model is a linear regression with the following features:
- `TripStartYear`
- `TripStartMonth`
- `TripStartDay`
- `TripStartHour`
- `TripStartMinute`
- `historical_tripDuration`
- `histOneWeek_tripDuration`
- `historical_tripDistance`
- `histOneWeek_tripDistance`
- `rawDistance`
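To make the model form concrete, here is a minimal sketch of what a linear regression over these ten features computes: a bias plus a weighted sum of the inputs. It treats every feature as a plain numeric input, which the actual training script may not do. The `predict` helper and the weights, bias and feature values in the example are hypothetical illustrations, not learned parameters.

```python
# Feature names as listed above.
FEATURES = [
    'TripStartYear', 'TripStartMonth', 'TripStartDay', 'TripStartHour',
    'TripStartMinute', 'historical_tripDuration', 'histOneWeek_tripDuration',
    'historical_tripDistance', 'histOneWeek_tripDistance', 'rawDistance',
]


def predict(features, weights, bias):
    """Linear regression prediction: bias + sum(weight * feature value)."""
    missing = [name for name in FEATURES if name not in features]
    if missing:
        raise KeyError('missing features: {}'.format(missing))
    return bias + sum(weights[name] * features[name] for name in FEATURES)


# Hypothetical numbers, chosen only to keep the arithmetic easy to follow.
weights = {name: 0.0 for name in FEATURES}
weights['rawDistance'] = 2.0
example = {name: 0.0 for name in FEATURES}
example['rawDistance'] = 3.0

prediction = predict(example, weights, bias=1.0)
print(prediction)
```

With these made-up values only `rawDistance` contributes, so the prediction is the bias plus `2.0 * 3.0`.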

Once the baseline model is trained, we evaluate it with [TensorFlow Model Analysis](https://www.tensorflow.org/tfx/model_analysis/get_started) (TFMA).
<br>More specifically, TFMA runs the model on the test set for final evaluation and provides a visual interface that shows where its predictions are weak.

## Training

In [3]:
%%bash

sh train_linear.sh

jobId: chicago_taxi_ml_train_baseline_20191009_173412
state: QUEUED


Job [chicago_taxi_ml_train_baseline_20191009_173412] submitted successfully.
Your job is still active. You may view the status of your job with the command

  $ gcloud ai-platform jobs describe chicago_taxi_ml_train_baseline_20191009_173412

or continue streaming the logs with the command

  $ gcloud ai-platform jobs stream-logs chicago_taxi_ml_train_baseline_20191009_173412


## Run TFMA

In [None]:
%%bash

sh model_analysis/tfma_baseline_dataflow.sh

_Because running the above command from the notebook crashed several times, we ran it directly from a console instead._

TFMA needs notebook extensions to render its visualizations.
<br>To enable them on AI Platform, switch from the standard JupyterLab interface to the classic Jupyter Notebook.
<br>To do so, go to `Help > Launch Classic Notebook`.

Furthermore, TFMA visuals cannot be saved, either in the notebook or in an HTML export.
<br>The cells must be re-run every time we want to visualize the metrics.

In [1]:
%%bash

jupyter nbextension enable --py widgetsnbextension
jupyter nbextension install --py --symlink tensorflow_model_analysis --user
jupyter nbextension enable --py tensorflow_model_analysis

Enabling notebook extension jupyter-js-widgets/extension...
      - Validating: OK
Installing /home/jupyter/.local/lib/python3.5/site-packages/tensorflow_model_analysis/static -> tfma_widget_js
- Validating: OK

    To initialize this nbextension in the browser every time the notebook (or other app) loads:
    
          jupyter nbextension enable tensorflow_model_analysis --user --py
    
Enabling notebook extension tfma_widget_js/extension...
      - Validating: OK


In [1]:
import tensorflow_model_analysis as tfma

print('TFMA version: {}'.format(tfma.version.VERSION_STRING))

  'You are using Apache Beam with Python 2. '


TFMA version: 0.14.0


In [2]:
train_result = tfma.load_eval_result(output_path='gs://szilard_aliz_sandbox/pierre_tasks/demo1/tfma_baseline/train/')
eval_result = tfma.load_eval_result(output_path='gs://szilard_aliz_sandbox/pierre_tasks/demo1/tfma_baseline/eval/')
test_result = tfma.load_eval_result(output_path='gs://szilard_aliz_sandbox/pierre_tasks/demo1/tfma_baseline/test/')
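The three output paths differ only in their last component, so they can be built from one base path. The sketch below shows one way to do that; the `eval_output_paths` helper and the `BASE` and `SPLITS` names are hypothetical, and the TFMA call is left as a comment so the snippet runs without TFMA installed.

```python
import posixpath

# Base output location and split names taken from the cell above.
BASE = 'gs://szilard_aliz_sandbox/pierre_tasks/demo1/tfma_baseline'
SPLITS = ('train', 'eval', 'test')


def eval_output_paths(base=BASE, splits=SPLITS):
    """Map each split name to its TFMA output directory (with trailing slash)."""
    return {split: posixpath.join(base, split) + '/' for split in splits}


paths = eval_output_paths()
for split in SPLITS:
    print(split, paths[split])

# With TFMA available, the results could then be loaded in one pass:
# results = {s: tfma.load_eval_result(output_path=p) for s, p in paths.items()}
```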

Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`


### Train set

In [3]:
tfma.view.render_slicing_metrics(train_result)

SlicingMetricsViewer(config={'weightedExamplesColumn': 1}, data=[{'metrics': {u'label/mean': {'doubleValue': 1…


### Eval set

In [4]:
tfma.view.render_slicing_metrics(eval_result)

SlicingMetricsViewer(config={'weightedExamplesColumn': 1}, data=[{'metrics': {u'label/mean': {'doubleValue': 1…


### Test set

In [5]:
tfma.view.render_slicing_metrics(test_result)

SlicingMetricsViewer(config={'weightedExamplesColumn': 1}, data=[{'metrics': {u'label/mean': {'doubleValue': 1…


In [6]:
tfma.view.render_slicing_metrics(test_result, slicing_column='TripStartMonth')

SlicingMetricsViewer(config={'weightedExamplesColumn': 1}, data=[{'metrics': {u'label/mean': {'doubleValue': 1…


In [7]:
tfma.view.render_slicing_metrics(test_result, slicing_column='TripStartDay')

SlicingMetricsViewer(config={'weightedExamplesColumn': 1}, data=[{'metrics': {u'label/mean': {'doubleValue': 1…


In conclusion, there is no noticeable discrepancy in model performance across the training, evaluation and test sets.

The model's __RMSE__ on each set:
- training: __2.859__
- evaluation: __2.866__
- test: __2.892__
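For reference, the sketch below spells out how RMSE is computed and how far the evaluation and test figures sit from the training figure in relative terms. The `rmse` and `relative_gap` helpers are illustrative names, not part of the training or TFMA code; only the three reported values come from this notebook.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError('inputs must be non-empty and of equal length')
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def relative_gap(reference, value):
    """Relative difference of `value` with respect to `reference`."""
    return (value - reference) / reference


# RMSE values reported above.
reported = {'training': 2.859, 'evaluation': 2.866, 'test': 2.892}

for name in ('evaluation', 'test'):
    gap = relative_gap(reported['training'], reported[name])
    print('{}: {:+.2%} vs training'.format(name, gap))
```

The gaps come out at roughly 0.24% for evaluation and 1.15% for test, which is the basis for saying the three figures are close.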

The model behaves well and shows no sign of overfitting.
<br>Furthermore, for training, evaluation and test alike, the model's performance is very stable across the slices of `TripStartMonth` and `TripStartDay`. The one exception is __Saturday__, where performance degrades: __RMSE__ rises from __~2.85__ to __3.388__.
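To put the Saturday figure in proportion, the sketch below flags any slice whose RMSE exceeds a reference value by more than a relative tolerance. The `flag_slices` helper and the 10% tolerance are assumptions made for illustration; only the Saturday value and the ~2.85 reference come from the text, and the other slices are left out rather than guessed.

```python
def flag_slices(slice_rmse, reference_rmse, tolerance=0.10):
    """Return {slice: relative increase} for slices above the tolerance."""
    flagged = {}
    for name, value in slice_rmse.items():
        increase = (value - reference_rmse) / reference_rmse
        if increase > tolerance:
            flagged[name] = increase
    return flagged


# Only the Saturday RMSE is quoted above; the reference is the approximate
# RMSE of the other slices (~2.85).
flagged = flag_slices({'Saturday': 3.388}, reference_rmse=2.85)
for name, increase in flagged.items():
    print('{}: RMSE {:.1%} above reference'.format(name, increase))
```

Under these assumptions the Saturday slice sits roughly 19% above the reference RMSE.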