[View in Colaboratory](https://colab.research.google.com/github/sthalles/tensorflow-tutorials/blob/master/day_1_2/Day_1_Pre_Made_Estimators.ipynb)

# Pre-Made Estimators

In [0]:
import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt  # needed before touching plt.rcParams below
from sklearn.model_selection import train_test_split
plt.rcParams["axes.grid"] = False

The Estimator is a high-level TensorFlow API.
- It is built on top of the TensorFlow core API.

It follows a **train-evaluate-predict** loop.

Its goal is to handle the routine steps of training an ML model, such as:

- Creating the computational graph
- Initializing Variables
- Training, testing, and making predictions
- Visualizing training-specific variables (learning rate, trainable variables, and performance measures)
- Saving the model


![alt text](https://www.tensorflow.org/images/tensorflow_programming_environment.png)

To write a TensorFlow program based on pre-made Estimators, you must perform the following tasks:

1. Create one or more **input functions**.
2. Define the model's **feature columns**.
3. Instantiate an Estimator, specifying the feature columns and various hyperparameters.
4. Train, evaluate, and test the model.

# Loading Data

In [0]:
"""NO CHANGES NEEDED"""
def maybe_download():
    train_path = tf.keras.utils.get_file(TRAIN_URL.split('/')[-1], TRAIN_URL)
    test_path = tf.keras.utils.get_file(TEST_URL.split('/')[-1], TEST_URL)

    return train_path, test_path

In [0]:
"""NO CHANGES NEEDED"""
TRAIN_URL = "http://download.tensorflow.org/data/iris_training.csv"
TEST_URL = "http://download.tensorflow.org/data/iris_test.csv"

CSV_COLUMN_NAMES = ['SepalLength', 'SepalWidth',
                    'PetalLength', 'PetalWidth', 'Species']
SPECIES = ['Setosa', 'Versicolor', 'Virginica']


def load_dataset():
  train_path, test_path = maybe_download()
  train_data = pd.read_csv(train_path, names=CSV_COLUMN_NAMES, header=0)
  test_data = pd.read_csv(test_path, names=CSV_COLUMN_NAMES, header=0)
  
  train_input = train_data[CSV_COLUMN_NAMES[0:-1]]
  train_labels = train_data[CSV_COLUMN_NAMES[-1]]
  
  test_input = test_data[CSV_COLUMN_NAMES[0:-1]]
  test_labels = test_data[CSV_COLUMN_NAMES[-1]]
  
  return (train_input, train_labels), (test_input, test_labels)

![alt text](https://www.tensorflow.org/images/iris_three_species.jpg)

In [0]:
(X_train, y_train), (X_test, y_test) = load_dataset()

In [0]:
print("Visualize the Dataset shapes")
print("Train input:", X_train.shape)
print("Train labels:", y_train.shape)
print("Test input:", X_test.shape)
print("Test labels:", y_test.shape)

Visualize the Dataset shapes
Train input: (120, 4)
Train labels: (120,)
Test input: (30, 4)
Test labels: (30,)


The Iris data set contains 4 features and 1 label. 

The 4 **features** identify the following botanical characteristics of individual Iris flowers:

- sepal length
- sepal width
- petal length
- petal width

![alt text](http://s5047.pcdn.co/wp-content/uploads/2015/04/iris_petal_sepal.png)

Our model will represent these features as float32 numerical data.

The **label** identifies the Iris species, which must be one of the following:

- Iris setosa (0)
- Iris versicolor (1)
- Iris virginica (2)

Our model will represent the label as categorical data of type int.

In [0]:
X_train.head()

Unnamed: 0,SepalLength,SepalWidth,PetalLength,PetalWidth
0,6.4,2.8,5.6,2.2
1,5.0,2.3,3.3,1.0
2,4.9,2.5,4.5,1.7
3,4.9,3.1,1.5,0.1
4,5.7,3.8,1.7,0.3


In [0]:
y_train.head()

0    2
1    1
2    2
3    0
4    0
Name: Species, dtype: int64

##  Exercise: Creating Validation (Dev) set

The Iris dataset is divided into Training and Testing sets.

This time, let's **NOT** use the Test set for development. 

The idea is to keep a separate set on which we can perform **hyperparameter tuning** during training.

That is the Dev or **Validation set**.

- Separate the training set into 2 sets: **training** and **validation**.

### Tip
- *The size of the validation set may vary. 15 - 20% are good choices for this dataset.*
- Use sklearn [train_test_split()](http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html).


In [0]:
# CODE GOES HERE
X_train, X_val, y_train, y_val = ...
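One possible way to approach the split is sketched below. It runs on stand-in random data with the same shape as the Iris training set, so the `demo_*` names, the `random_state` value, and the use of `stratify` are assumptions made for illustration rather than the notebook's intended answer.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Stand-in data shaped like the Iris training set (120 rows, 4 features).
# The values are random, not real measurements.
rng = np.random.default_rng(0)
demo_X = pd.DataFrame(
    rng.random((120, 4)),
    columns=['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth'])
demo_y = pd.Series(np.repeat([0, 1, 2], 40), name='Species')

# Hold out 15% of the rows for validation.
# stratify keeps the class proportions similar in both parts.
demo_X_train, demo_X_val, demo_y_train, demo_y_val = train_test_split(
    demo_X, demo_y, test_size=0.15, random_state=42, stratify=demo_y)

print("Train:", demo_X_train.shape, "Validation:", demo_X_val.shape)
```

With 120 rows, a 15% hold-out leaves 102 rows for training and 18 for validation.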

In [0]:
print("Training/validation shapes")
print("Train input:", X_train.shape)
print("Train labels:", y_train.shape)
print("Validation input:", X_val.shape)
print("Validation labels:", y_val.shape)

Training/validation shapes
Train input: (102, 4)
Train labels: (102,)
Validation input: (18, 4)
Validation labels: (18,)


# Creating Input Functions

**Input functions are responsible for feeding data to ML Models created with the TF Estimators API.**

The Estimator expects an *input_function()* to return a [tf.data.Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) or a tuple of *(features, labels)*.

Let's use **tf.data.Dataset** as our input pipeline.
-  [tf.data.Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) is the recommended TensorFlow input pipeline.

Hint: [tf.data.Dataset template](https://www.tensorflow.org/get_started/datasets_quickstart)

## Exercise

Go ahead and complete the input functions for training and validation.

In [0]:
def train_input_fn(features, labels, batch_size):
  """
  Provides input data for training
  Returns: A 'tf.data.Dataset' object that yields (features, labels) tuples,
           or a plain (features, labels) tuple.
  """
  
  # Create the dataset object and return it
  
  dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
  
  # Make sure you shuffle and define the batch size as given by the 'batch_size' parameter
  # CODE GOES HERE
  
  return dataset # return the dataset object

In [0]:
def eval_input_fn(features, labels, batch_size):
  
  features = dict(features)
  if labels is None:
    inputs = features
  else:
    inputs = (features, labels)  # 'features' is already a dict here
  
  # Create the tf.data.Dataset() object from 'inputs' and return it.
  # Set the number of epochs to 1 and define the batch size.
  # Do we need to shuffle the data?
  # CODE GOES HERE
  return dataset
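The difference between a training and an evaluation pipeline can be sketched as follows. This is a minimal illustration, assuming TensorFlow 2.x with eager execution so the batches can be inspected directly. The helper `make_dataset`, its `training` flag, and the toy arrays are hypothetical and not part of the notebook.

```python
import numpy as np
import tensorflow as tf

def make_dataset(features, labels, batch_size, training):
    """Build a tf.data.Dataset of (features, labels) batches."""
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    if training:
        # Shuffle across the whole set and repeat indefinitely, so the
        # number of training steps decides when training stops.
        dataset = dataset.shuffle(buffer_size=len(labels)).repeat()
    # Evaluation keeps the original order and makes a single pass.
    return dataset.batch(batch_size)

# Tiny made-up stand-in for two Iris feature columns.
toy_features = {
    "SepalLength": np.array([5.1, 6.4, 5.9, 4.9, 6.0], dtype=np.float32),
    "PetalLength": np.array([1.4, 5.6, 4.2, 1.5, 5.1], dtype=np.float32),
}
toy_labels = np.array([0, 2, 1, 0, 2], dtype=np.int64)

train_ds = make_dataset(toy_features, toy_labels, batch_size=2, training=True)
eval_ds = make_dataset(toy_features, toy_labels, batch_size=2, training=False)

first_features, first_labels = next(iter(train_ds))
eval_batches = list(eval_ds)
print("Training batch label shape:", first_labels.shape)
print("Evaluation batches in one pass:", len(eval_batches))
```

Shuffling only matters for training, where the order of examples affects the gradient updates. Evaluation metrics are averages over the whole set, so the order makes no difference there.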

# Defining the Feature Columns

Now, we have to convert our data into **Tensors** so that our Model can use it.

Feature columns describe how our model should interpret the data it receives.
- Is this column Numerical, Categorical or what?

![alt text](https://www.tensorflow.org/images/feature_columns/some_constructors.jpg)

**Feature Columns** define the **type of features** we are going to feed into our Models.

The choice of Feature Columns depends on the Variable and Model type.

Since the 4 features of the Iris dataset are represented as continuous values, we need to specify that when creating the feature columns.

To do that, we use the [tf.feature_column.numeric_column()](https://www.tensorflow.org/api_docs/python/tf/feature_column/numeric_column) constructor.

  - **tf.feature_column.numeric_column()** represents real-valued or numerical features.
  
Info++: [Feature Engineering](https://www.tensorflow.org/get_started/feature_columns)
  
 


In [0]:
X_train.keys()

Index(['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth'], dtype='object')

In [0]:
# feature columns define how to use the feature data
my_feature_columns = []

# tf.feature_column.numeric_column(key='<COLUMN_NAME>')

# CODE GOES HERE
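A minimal sketch of building one numeric column per feature is shown below. It assumes a TensorFlow version that still ships the `tf.feature_column` module (it is deprecated in recent releases). The names `feature_names` and `example_feature_columns` are hypothetical, chosen so they do not clash with the exercise variables.

```python
import tensorflow as tf

# Column names as reported by X_train.keys() in the notebook.
feature_names = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth']

# One numeric column per feature; the key must match the name used in the
# features dictionary returned by the input function.
example_feature_columns = [
    tf.feature_column.numeric_column(key=name) for name in feature_names
]

print([column.key for column in example_feature_columns])
```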

# Hyperparameters

1. Tune the hyperparameters below.

In [0]:
learning_rate =
number_of_classes = # number of classes from the Iris dataset
batch_size = # number of examples to feed to the network per step
max_step = # maximum number of training steps
regularization = # regularization strength
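How `batch_size` and `max_step` interact can be checked with a little arithmetic. The values below are illustrative starting points only and are not taken from the notebook, apart from the three Iris classes and the 102-example training split reported earlier.

```python
import math

# Illustrative values only; tune them for your own runs.
example_hparams = {
    "learning_rate": 0.01,
    "number_of_classes": 3,   # Setosa, Versicolor, Virginica
    "batch_size": 32,
    "max_step": 1000,
    "regularization": 0.001,
}

num_train_examples = 102  # training split size after holding out validation data

# One epoch is one full pass over the training data.
steps_per_epoch = math.ceil(num_train_examples / example_hparams["batch_size"])
approx_epochs = example_hparams["max_step"] / steps_per_epoch

print("Steps per epoch:", steps_per_epoch)
print("Approximate epochs covered by max_step:", approx_epochs)
```

A smaller batch size means more steps per epoch, so the same `max_step` covers fewer passes over the data.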


# Building the Estimator

An Estimator encapsulates all the necessary parts of a model. 

Some of the available Estimators include:

1. **BoostedTrees** Classifier/Regressor
2. **DNN Classifier**/Regressor
3. **DNNLinearCombined** Classifier/Regressor
4. **Linear** Classifier/Regressor

Check out: [tf.estimator](https://www.tensorflow.org/api_docs/python/tf/estimator)

## Exercise

1. Use the [tf.estimator.LinearClassifier](https://www.tensorflow.org/api_docs/python/tf/estimator/LinearClassifier) to classify the Iris Dataset.

Things to keep in mind.
  - Linear models are very simple. For this case, pay special attention to the **number of classes** and to tuning the **learning rate**.
  - Play with different configurations of **batch size**; it can have a dramatic effect on how quickly the model converges.

In [0]:
# CODE GOES HERE
classifier = ...

### Tip

You do NOT need to define or control the TensorFlow Session.
*Estimators take care of that for you.*

## Exercise

Use the *classifier.train()* method to train the built classifier.
- Pass the  *train_input_fn()* as a lambda so you can control its input parameters.
- Set the *steps* parameter to the maximum number of steps you defined to train your model.

In [0]:
classifier.train(input_fn=, 
                 steps= #  Number of steps for which to train model.
                )

## Exercise - Validation

Use *classifier.evaluate()* together with *eval_input_fn()* to measure the performance of your model.

- Use the **Validation set** to evaluate how good your model is becoming. 
- Here you are allowed to use this dataset to **change the model's hyperparameters**.





In [0]:
eval_results = classifier.evaluate(input_fn=...)

## Exercise - Test

Now, we want to use our **Testing set** to see how good our model really is.
- Use the Unseen Test set now.
- Use the same *eval_input_fn()* method you used for validation.

In [0]:
predictions = classifier.predict(input_fn=...)

In [0]:
# Visualize the predictions and confidence for each of the Test set records
SPECIES = ['Setosa', 'Versicolor', 'Virginica']
template = ('\nPrediction is "{}" ({:.1f}%), expected "{}"')
prediction_values = []
for pred_dict, expec in zip(predictions, y_test):
  
  # get the prediction class for each instance and save it
  class_id = pred_dict['class_ids'][0]
  prediction_values.append(class_id)
  
  # get and display the probabilities for each instance classification
  probability = pred_dict['probabilities'][class_id]

  print(template.format(SPECIES[class_id],
                    100 * probability, SPECIES[expec]))
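To see what the loop above reads from each prediction, here is a small sketch with a hand-made dictionary. The dictionary only mimics the two keys used above, and its numbers are invented for illustration. The names `species_names` and `example_pred` are hypothetical.

```python
import numpy as np

species_names = ['Setosa', 'Versicolor', 'Virginica']

# Hand-made stand-in for one prediction: the predicted class id and the
# per-class probabilities.
example_pred = {
    'class_ids': np.array([2]),
    'probabilities': np.array([0.05, 0.15, 0.80]),
}

example_class_id = int(example_pred['class_ids'][0])
example_probability = float(example_pred['probabilities'][example_class_id])

# In this example the predicted class is the one with the highest probability.
most_likely = int(np.argmax(example_pred['probabilities']))

message = 'Prediction is "{}" ({:.1f}%)'.format(
    species_names[example_class_id], 100 * example_probability)
print(message)
```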

In [0]:
import matplotlib.pyplot as plt
import itertools
def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):
    """
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    """
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')

    print(cm)

    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

In [0]:
from sklearn.metrics import confusion_matrix
print(prediction_values)  # 'predictions' is a generator that the loop above has already consumed
cnf_matrix = confusion_matrix(y_test, prediction_values)
plot_confusion_matrix(cnf_matrix, classes=SPECIES,
                      title='Confusion matrix, without normalization')
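The row normalization used in `plot_confusion_matrix` can be checked on a tiny hand-made example. The label lists below are invented for illustration and are not results from the Iris model.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up true and predicted class ids for six examples.
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 0, 1, 2, 2, 2]

# Rows are true classes, columns are predicted classes.
toy_cm = confusion_matrix(toy_true, toy_pred)

# Divide each row by its total so every row sums to 1.
toy_cm_normalized = toy_cm.astype('float') / toy_cm.sum(axis=1)[:, np.newaxis]

print(toy_cm)
print(toy_cm_normalized)
```

After normalization, each diagonal entry is the fraction of that class that was predicted correctly.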

# Deep Neural Networks

### Definition:

"*...a computing system made up of a number of simple, highly interconnected processing elements, which process information by their dynamic state response to external inputs.*"

- Function approximator
- Very powerful for non-linear relationships
- Represents a function as the composition of many functions.
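The "composition of many functions" idea can be sketched in plain NumPy. This is an illustration with random, untrained weights and an arbitrary hidden size of 10. It is not how the Estimator builds its network internally.

```python
import numpy as np

def relu(z):
    """Element-wise non-linearity applied after the hidden layer."""
    return np.maximum(z, 0.0)

def softmax(z):
    """Turn scores into probabilities that sum to 1 per row."""
    shifted = z - z.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Random weights: 4 input features -> 10 hidden units -> 3 classes.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 10)), np.zeros(10)
W2, b2 = rng.normal(size=(10, 3)), np.zeros(3)

def hidden_layer(x):
    return relu(x @ W1 + b1)

def output_layer(h):
    return softmax(h @ W2 + b2)

# One flower: sepal length, sepal width, petal length, petal width.
x = np.array([[6.4, 2.8, 5.6, 2.2]])

# The network is the composition output_layer(hidden_layer(x)).
probs = output_layer(hidden_layer(x))
print(probs.shape, probs.sum())
```

Adding more hidden layers simply adds more functions to the chain.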

## Exercise

1. Change the Linear Model Estimator to a Deep Neural Network.
- Head over to the TensorFlow documentation for [tf.estimator.DNNClassifier](https://www.tensorflow.org/api_docs/python/tf/estimator/DNNClassifier) and check it out.

Think about:

- How many layers you need
- The number of units in each layer
- The activation function used by default
- The Gradient Descent Optimizer. 

![alt text](https://www.tensorflow.org/images/custom_estimators/full_network.png)

## Architecting your network

![LeNet-5](http://cs231n.github.io/assets/nn1/layer_sizes.jpeg)

Neural nets with more hidden layers are able to represent more complex functions.

- With more power come more complicated decision boundaries.

Beware of **Overfitting**!

- It occurs when a model with **high capacity** fits the noise in the data instead of the (assumed) underlying relationship.


# Regularization

The figure below shows the effect of regularization: the decision boundaries of the same DNN (20 hidden units) under different regularization penalties.

Note that more regularization smooths the decision boundary.

- It fights **Overfitting**.

![alt text](http://cs231n.github.io/assets/nn1/reg_strengths.jpeg)
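To make "regularization strength" concrete, here is a minimal sketch assuming an L2 penalty, which the text above does not specify. The function name `l2_penalty` and the weight values are invented for illustration.

```python
import numpy as np

def l2_penalty(weights, strength):
    """Penalty added to the training loss: strength * sum of squared weights."""
    return strength * float(np.sum(np.square(weights)))

small_weights = np.array([0.5, -0.5, 0.25])
large_weights = np.array([5.0, -5.0, 2.5])

strength = 0.01
small_penalty = l2_penalty(small_weights, strength)
large_penalty = l2_penalty(large_weights, strength)

print("Penalty for small weights:", small_penalty)
print("Penalty for large weights:", large_penalty)
```

Large weights are penalized far more than small ones, which pushes the optimizer toward smaller weights and smoother decision boundaries. Raising `strength` makes that pressure stronger.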