[View in Colaboratory](https://colab.research.google.com/github/sthalles/tensorflow-tutorials/blob/master/Day_1_Pre_Made_Estimators_Solutions.ipynb)

# Pre-Made Estimators

In [1]:
import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

The Estimator is a high-level TensorFlow API. 
- It is built on top of the TensorFlow core API.

It follows a **train-evaluate-predict** loop.

It handles the routine steps of training an ML model, such as:

- Creating the computational graph
- Initializing Variables
- Training, testing, and making predictions
- Visualizing training-specific variables (learning rate, trainable variables, and performance measures)
- Saving the model


![alt text](https://www.tensorflow.org/images/tensorflow_programming_environment.png)

To write a TensorFlow program based on pre-made Estimators, you must perform the following tasks:

1.  Create one or more **input functions**.
2.  Define the model's **feature columns**.
3. Instantiate an Estimator, specifying the feature columns and various hyperparameters.
4. Train and Evaluate

# Loading Data

In [2]:
"""DO NOT NEED CHANGES"""
def maybe_download():
    train_path = tf.keras.utils.get_file(TRAIN_URL.split('/')[-1], TRAIN_URL)
    test_path = tf.keras.utils.get_file(TEST_URL.split('/')[-1], TEST_URL)

    return train_path, test_path

In [3]:
"""DO NOT NEED CHANGES"""
TRAIN_URL = "http://download.tensorflow.org/data/iris_training.csv"
TEST_URL = "http://download.tensorflow.org/data/iris_test.csv"

CSV_COLUMN_NAMES = ['SepalLength', 'SepalWidth',
                    'PetalLength', 'PetalWidth', 'Species']
SPECIES = ['Setosa', 'Versicolor', 'Virginica']


def load_dataset():
  train_path, test_path = maybe_download()
  train_data = pd.read_csv(train_path, names=CSV_COLUMN_NAMES, header=0)
  test_data = pd.read_csv(test_path, names=CSV_COLUMN_NAMES, header=0)
  
  train_input = train_data[CSV_COLUMN_NAMES[0:-1]]
  train_labels = train_data[CSV_COLUMN_NAMES[-1]]
  
  test_input = test_data[CSV_COLUMN_NAMES[0:-1]]
  test_labels = test_data[CSV_COLUMN_NAMES[-1]]
  
  return (train_input, train_labels), (test_input, test_labels)

![alt text](https://www.tensorflow.org/images/iris_three_species.jpg)

In [4]:
(X_train, y_train), (X_test, y_test) = load_dataset()

In [5]:
print("Visualize the Dataset shapes")
print("Train input:", X_train.shape)
print("Train labels:", y_train.shape)
print("Test input:", X_test.shape)
print("Test labels:", y_test.shape)

Visualize the Dataset shapes
Train input: (120, 4)
Train labels: (120,)
Test input: (30, 4)
Test labels: (30,)


The Iris data set contains 4 features and 1 label. 

The 4 **features** identify the following botanical characteristics of individual Iris flowers:

- sepal length
- sepal width
- petal length
- petal width

![alt text](http://s5047.pcdn.co/wp-content/uploads/2015/04/iris_petal_sepal.png)

Our model will represent these features as float32 numerical data.

The **label** identifies the Iris species, which must be one of the following:

- Iris setosa (0)
- Iris versicolor (1)
- Iris virginica (2)

Our model will represent the label as int32 categorical data.
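
As a small illustration of this encoding, the integer labels can be mapped back to species names with the `SPECIES` list defined in the loading cell. The helper `label_to_species` is a hypothetical name introduced here, not part of the original notebook.

```python
# Species names indexed by their integer label (same list as in the loading cell).
SPECIES = ['Setosa', 'Versicolor', 'Virginica']


def label_to_species(label):
    """Map an integer class label (0, 1, or 2) to its species name."""
    if not 0 <= label < len(SPECIES):
        raise ValueError("unknown label: %r" % (label,))
    return SPECIES[label]


# Example labels, one per class plus a repeat.
example_labels = [2, 1, 2, 0]
print([label_to_species(label) for label in example_labels])
```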

In [6]:
X_train.head()

|   | SepalLength | SepalWidth | PetalLength | PetalWidth |
|---|---|---|---|---|
| 0 | 6.4 | 2.8 | 5.6 | 2.2 |
| 1 | 5.0 | 2.3 | 3.3 | 1.0 |
| 2 | 4.9 | 2.5 | 4.5 | 1.7 |
| 3 | 4.9 | 3.1 | 1.5 | 0.1 |
| 4 | 5.7 | 3.8 | 1.7 | 0.3 |


In [7]:
y_train.head()

0    2
1    1
2    2
3    0
4    0
Name: Species, dtype: int64

##  Exercise: Creating Validation (Dev set)

The Iris dataset is divided into Training and Testing sets.

This time, let's **NOT** use the Test set for development. 

The idea is to hold out a separate set on which we can perform **hyperparameter tuning** while developing the model.

That is the Dev or **Validation set**.

- Separate the training set into 2 sets: **training** and **validation**.

### Tip
- *The size of the validation set may vary. 10-20% of the training data is a good choice for this dataset.*
- Use sklearn [train_test_split()](http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html).
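
To make the idea of a holdout split concrete, here is a minimal pure-Python sketch. It only mirrors the concept and is not sklearn's implementation: `holdout_split` is a hypothetical helper, rounding the validation size up is an assumption, and the selected indices will differ from what `train_test_split()` picks with the same seed.

```python
import math
import random


def holdout_split(n_examples, val_fraction, seed):
    """Return (train_indices, val_indices) for a shuffled holdout split."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)           # reproducible shuffle
    n_val = math.ceil(val_fraction * n_examples)   # assumption: round the validation size up
    return indices[n_val:], indices[:n_val]


# 120 training examples, 15% held out for validation.
train_idx, val_idx = holdout_split(n_examples=120, val_fraction=0.15, seed=42)
print(len(train_idx), len(val_idx))
```

Every example ends up in exactly one of the two sets, so the validation set stays unseen during training.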


In [8]:
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.15, random_state=42)

In [9]:
print("Training/validation shapes")
print("Train input:", X_train.shape)
print("Train labels:", y_train.shape)
print("Validation input:", X_val.shape)
print("Validation labels:", y_val.shape)

Training/validation shapes
Train input: (102, 4)
Train labels: (102,)
Validation input: (18, 4)
Validation labels: (18,)


# Creating Input Functions

**Input functions are responsible for feeding data to ML Models created with the TF Estimators API.**

The Estimator expects an *input_function()* to return a [tf.data.Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) or a tuple of *(features, labels)*.

Let's use the **tf.data.Dataset** as our input stream.
-  [tf.data.Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) is the recommended TensorFlow input pipeline.

Hint: [tf.data.Dataset template](https://www.tensorflow.org/get_started/datasets_quickstart)

In [10]:
def train_input_fn(features, labels, batch_size):
  """
  Provides input data for training.
  Returns: A 'tf.data.Dataset' that yields (features, labels) batches,
           shuffled and repeated indefinitely.
  """
  
  # Create the dataset object and return it
  # Make sure you shuffle and define the batch size as given by the 'batch_size' parameter
  dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
  dataset = dataset.shuffle(buffer_size=100)
  dataset = dataset.repeat()
  dataset = dataset.batch(batch_size)
  return dataset # return the dataset object

In [11]:
def eval_input_fn(features, labels, batch_size):
  
  features = dict(features)
  if labels is None:
    inputs = features
  else:
    inputs = (features, labels)
  
  # Create the tf.data.Dataset() object and return it
  # Make a single pass over the data (one epoch)
  dataset = tf.data.Dataset.from_tensor_slices(inputs)
  dataset = dataset.repeat(1)
  dataset = dataset.batch(batch_size)
  return dataset
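
The difference between the two input functions can be sketched without TensorFlow. The generators below are a simplified mental model, not the `tf.data` implementation: the names `train_batches` and `eval_batches` are hypothetical, and the training sketch shuffles the whole list on each pass, whereas `tf.data` shuffles through a fixed-size buffer.

```python
import itertools
import random


def train_batches(examples, batch_size, seed=0):
    """Yield shuffled batches forever, similar to shuffle -> repeat -> batch."""
    rng = random.Random(seed)

    def endless_stream():
        while True:                      # repeat(): the stream never runs out
            epoch = list(examples)
            rng.shuffle(epoch)           # new order on every pass
            yield from epoch

    stream = endless_stream()
    while True:
        # batch(): group consecutive elements, even across epoch boundaries
        yield list(itertools.islice(stream, batch_size))


def eval_batches(examples, batch_size):
    """Yield every example exactly once, in order, similar to repeat(1) -> batch."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


examples = list(range(10))
first_three = list(itertools.islice(train_batches(examples, 4), 3))
print(first_three)                       # three full batches; order depends on the seed
print(list(eval_batches(examples, 4)))   # the last batch is smaller
```

Because the training stream is endless, training must be bounded by the `steps` argument rather than by the data running out.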

# Defining the Feature Columns

Now, we have to convert our data into **Tensors** so that our Model can use it.

Feature Columns provide a representation of how our model should interpret the data it receives.
- Is this column Numerical, Categorical or what?

![alt text](https://www.tensorflow.org/images/feature_columns/some_constructors.jpg)

**Feature Columns** define the **type of features** we are going to feed into our Models.

The choice of Feature Column depends on the Variable and Model type.

Since the 4 features of the Iris dataset are continuous values, we need to specify that when creating the feature columns.

To do that, we use the [tf.feature_column.numeric_column()](https://www.tensorflow.org/api_docs/python/tf/feature_column/numeric_column) constructor.

  - **tf.feature_column.numeric_column()** represents real-valued numerical features.
  
Info++: [Feature Engineering](https://www.tensorflow.org/get_started/feature_columns)
  
 


In [12]:
# feature columns define how to use the feature data
my_feature_columns = []
for feature_column in X_train.keys():
  # use tf.feature_column.numeric_column() to create a feature column and add it to the 'my_feature_columns' list
  my_feature_columns.append(tf.feature_column.numeric_column(key=feature_column))
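
As a rough mental model only, a set of numeric columns tells the model which keys to read from the feature dictionary and to treat each value as a real number. The plain-Python sketch below imitates that lookup; `to_dense_rows` is a hypothetical helper, and real feature columns build tensors inside the TensorFlow graph rather than Python lists.

```python
FEATURE_KEYS = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth']


def to_dense_rows(features, keys):
    """Turn a dict of equal-length columns into a list of float rows."""
    n_rows = len(features[keys[0]])
    return [[float(features[key][i]) for key in keys] for i in range(n_rows)]


# First two training examples from the X_train.head() output.
batch = {
    'SepalLength': [6.4, 5.0],
    'SepalWidth': [2.8, 2.3],
    'PetalLength': [5.6, 3.3],
    'PetalWidth': [2.2, 1.0],
}
print(to_dense_rows(batch, FEATURE_KEYS))
```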

# Hyperparameters

1. Tune the hyperparameters below.

In [13]:
learning_rate = 0.1
number_of_classes = 3 # number of classes from the Iris dataset
batch_size = 16 # number of examples to feed to the network per step
max_step = 2000 # maximum number of training steps
regularization = 0.001 # regularization strength
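
When tuning these values, it helps to translate steps into passes over the data. The arithmetic below is a small sketch that assumes the 102 training examples left after the validation split and an input pipeline that repeats before batching, so that every step consumes exactly `batch_size` examples.

```python
n_train = 102      # training examples left after the validation split
batch_size = 16    # examples consumed per step
max_step = 2000    # training steps

examples_seen = batch_size * max_step
epochs = examples_seen / n_train
print(examples_seen, round(epochs, 1))
```

With these settings the model sees each training example a few hundred times, which is plenty for a dataset this small.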


# Building the Estimator

An Estimator encapsulates all the necessary parts of a model. 

Some of the available Estimators include:

1. **BoostedTrees** Classifier/Regressor
2. **DNN Classifier**/Regressor
3. **DNNLinearCombined** Classifier/Regressor
4. **Linear** Classifier/Regressor

Check out: [tf.estimator](https://www.tensorflow.org/api_docs/python/tf/estimator)

## Exercise

1. Use the [tf.estimator.LinearClassifier](https://www.tensorflow.org/api_docs/python/tf/estimator/LinearClassifier) to classify the Iris Dataset.

Things to keep in mind.
  - Linear models are very simple. For this case, pay special attention to the **number of classes** and to tuning the **learning rate**.
  - Play with different **batch size** settings; batch size can have a dramatic effect on how quickly the model converges.

In [11]:
classifier = tf.estimator.LinearClassifier(    
    feature_columns=my_feature_columns, # pass the feature columns to the Linear Classifier
    n_classes=number_of_classes, # Set the number of classes defined above
    optimizer=tf.train.FtrlOptimizer( # Configure the optimizer
      learning_rate=learning_rate, # setup the learning rate
      l1_regularization_strength=regularization # set the regularization strength 
    ))

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmp3k69rpbn', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fdeaf055b70>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
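
To see what a multi-class linear classifier computes at prediction time, here is a minimal pure-Python sketch: one weighted sum per class, followed by a softmax. The weights and biases are hypothetical numbers chosen for illustration, not the values learned by the estimator, and the function names are made up for this example.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]   # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def linear_predict(x, weights, biases):
    """One score per class: dot(x, weights[c]) + biases[c], then softmax."""
    logits = [sum(xi * wi for xi, wi in zip(x, w)) + b
              for w, b in zip(weights, biases)]
    return softmax(logits)


# Hypothetical parameters: 3 classes x 4 features (NOT trained values).
weights = [[0.1, 0.4, -0.6, -0.3],
           [0.2, -0.1, 0.1, -0.2],
           [-0.3, -0.3, 0.5, 0.5]]
biases = [0.2, 0.1, -0.3]

probs = linear_predict([6.4, 2.8, 5.6, 2.2], weights, biases)
print(probs, sum(probs))
```

Training adjusts only `weights` and `biases`, which is why the number of classes fixes the shape of the model.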


### Tip

You do NOT need to define nor control the TensorFlow Session. 
*Estimators take care of that for you.*

## Exercise

Use the *classifier.train()* method to train the built classifier.
- Pass the  *train_input_fn()* as a lambda so you can control its input parameters.
- Set the *steps* parameter to the maximum number of steps you want to train your model.

In [12]:
classifier.train(input_fn=lambda: train_input_fn(X_train, y_train, batch_size=batch_size), 
                 steps=max_step #  Number of steps for which to train model.
                )

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmp3k69rpbn/model.ckpt.
INFO:tensorflow:loss = 17.577797, step = 0
INFO:tensorflow:global_step/sec: 244.599
INFO:tensorflow:loss = 8.288761, step = 100 (0.414 sec)
INFO:tensorflow:global_step/sec: 255.833
INFO:tensorflow:loss = 4.807881, step = 200 (0.390 sec)
INFO:tensorflow:global_step/sec: 217.644
INFO:tensorflow:loss = 5.272978, step = 300 (0.460 sec)
INFO:tensorflow:global_step/sec: 212.425
INFO:tensorflow:loss = 6.930149, step = 400 (0.472 sec)
INFO:tensorflow:global_step/sec: 215.467
INFO:tensorflow:loss = 3.7425652, step = 500 (0.460 sec)
INFO:tensorflow:global_step/sec: 222.168
INFO:tensorflow:loss = 3.7090392, step = 600 (0.454 sec)
INFO:tensorflow:global_step/sec: 225.272
INFO:tensorflow:loss

<tensorflow.python.estimator.canned.linear.LinearClassifier at 0x7fdeaf055828>

In [15]:
eval_results = classifier.evaluate(input_fn=lambda: eval_input_fn(X_test, y_test, batch_size=512))

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-06-25-14:07:09
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmp3k69rpbn/model.ckpt-2000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2018-06-25-14:07:09
INFO:tensorflow:Saving dict for global step 2000: accuracy = 0.93333334, average_loss = 0.20646352, global_step = 2000, loss = 6.1939054
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 2000: /tmp/tmp3k69rpbn/model.ckpt-2000
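
The numbers in this log can be cross-checked with a little arithmetic. The sketch below assumes that `loss` is the sum of the per-example losses over the single evaluation batch (the batch size of 512 exceeds the 30 test examples) and that `average_loss` is the per-example mean.

```python
# Values copied from the evaluation log above.
average_loss = 0.20646352
loss = 6.1939054
accuracy = 0.93333334
n_test = 30                        # test set size printed earlier

print(average_loss * n_test)       # close to the logged `loss`
print(round(accuracy * n_test))    # number of correctly classified test examples
```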


In [19]:
eval_results['probabilities']

KeyError: ignored
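
The `KeyError` is expected: `evaluate()` returns a dictionary of aggregate metrics, not per-example outputs. In the TF 1.x Estimator API, per-example probabilities come from `classifier.predict()` instead. The sketch below rebuilds the metrics dictionary from the values in the evaluation log to show which keys are available.

```python
# Metric values copied from the evaluation log above.
eval_results = {
    'accuracy': 0.93333334,
    'average_loss': 0.20646352,
    'loss': 6.1939054,
    'global_step': 2000,
}

print(sorted(eval_results))                   # the available metric names
print('probabilities' in eval_results)        # False, hence the KeyError
print("Test accuracy: {:.3f}".format(eval_results['accuracy']))
```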

# Deep Neural Networks

### Definition:

"*...a computing system made up of a number of simple, highly interconnected processing elements, which process information by their dynamic state response to external inputs.*"

- Function approximator
- Very powerful for non-linear relationships
- Represents a function as the composition of many functions.
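
The idea of composing simple functions can be sketched in plain Python. The two-layer network below uses hypothetical hand-picked weights (nothing is trained) and a ReLU between the layers; together they compute `|x1 - x2|`, a function that no single linear layer can represent.

```python
def relu(values):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output unit."""
    return [sum(x * w for x, w in zip(inputs, row)) + b
            for row, b in zip(weights, biases)]


def tiny_network(x):
    """f(x) = layer2(relu(layer1(x))) with hypothetical weights."""
    hidden = relu(dense(x, [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]))
    return dense(hidden, [[1.0, 1.0]], [0.0])


print(tiny_network([3.0, 1.0]))
print(tiny_network([1.0, 3.0]))
```

Without the `relu` call the two layers would collapse into a single linear map, so the non-linearity is what gives depth its extra expressive power.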

## Exercise

1. Change the Linear Model Estimator to a Deep Neural Network.
- Head over to the TensorFlow documentation for [tf.estimator.DNNClassifier](https://www.tensorflow.org/api_docs/python/tf/estimator/DNNClassifier) and check it out.

Think about:

- How many layers you need
- The number of units in each layer
- The activation function used by default
- The Gradient Descent Optimizer. 

![alt text](https://www.tensorflow.org/images/custom_estimators/full_network.png)

## Architecting Your Network

![LeNet-5](http://cs231n.github.io/assets/nn1/layer_sizes.jpeg)

Neural Nets with more hidden layers are able to represent more complex functions. 

- With more capacity come more complicated decision boundaries.

Watch out for **Overfitting**!

- It occurs when a model with **high capacity** fits the noise in the data instead of the (assumed) underlying relationship.
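
One rough proxy for capacity is the number of trainable parameters. The sketch below counts weights and biases for fully connected networks; `count_parameters` is a hypothetical helper and the hidden layer sizes are arbitrary examples, not a recommendation.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


print(count_parameters([4, 3]))           # linear model: 4 inputs, 3 classes
print(count_parameters([4, 10, 10, 3]))   # two hidden layers of 10 units each
```

Even this small network has more parameters than the Iris training set has examples, which is why overfitting deserves attention here.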


# Regularization

The figure below shows the effect of regularization on the decision boundaries of the same DNN (20 hidden units) under different regularization penalties. 

Note that more regularization smooths the decision boundary.

- It fights **Overfitting**.

![alt text](http://cs231n.github.io/assets/nn1/reg_strengths.jpeg)
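
As a minimal sketch of how a regularization penalty enters the objective, the code below adds a weight penalty to a data loss. The weights and the data loss are hypothetical numbers, and the function names are made up for illustration; L1 penalizes absolute values, while L2 penalizes squared values.

```python
def l1_penalty(weights, strength):
    """strength * sum of absolute weight values."""
    return strength * sum(abs(w) for w in weights)


def l2_penalty(weights, strength):
    """strength * sum of squared weight values."""
    return strength * sum(w * w for w in weights)


weights = [0.5, -2.0, 0.0, 1.5]   # hypothetical model weights
data_loss = 0.25                  # hypothetical loss on the training data

for strength in (0.0, 0.001, 0.1):
    total = data_loss + l2_penalty(weights, strength)
    print(strength, total)
```

A larger strength makes large weights more expensive, which pushes the optimizer toward smaller weights and smoother decision boundaries.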