# TensorFlow with Estimators

Previously we saw how to build a full Multi-Layer Perceptron model with full Sessions in TensorFlow. Unfortunately, this was an extremely involved process. However, developers have created Estimators, which offer an easier-to-use flow!

Estimators are much easier to use, but you sacrifice some level of customization of your model. Let's go ahead and explore them!

## Get the Data

We will use the Iris data set.

Let's get the data:

In [1]:
import pandas as pd

In [2]:
df = pd.read_csv('iris.csv')

In [3]:
df.head()

Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),target
0,5.1,3.5,1.4,0.2,0.0
1,4.9,3.0,1.4,0.2,0.0
2,4.7,3.2,1.3,0.2,0.0
3,4.6,3.1,1.5,0.2,0.0
4,5.0,3.6,1.4,0.2,0.0


In [4]:
df.columns = ['sepal_length','sepal_width','petal_length','petal_width','target']

In [5]:
X = df.drop('target',axis=1)
y = df['target'].apply(int) # the target is stored as float here; it has to be an integer because this is a classification problem
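As a small aside, the same cast can be written with `astype(int)`, which converts the whole column at once. The sketch below uses a tiny hypothetical Series (`target`) in place of the real column, so the values are for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the float-valued `target` column loaded from the CSV.
target = pd.Series([0.0, 0.0, 1.0, 2.0], name="target")

# astype(int) converts the whole column in one call; the values match apply(int).
labels = target.astype(int)

print(labels.tolist())
```

Either form works for this data set; `astype` is simply the more common idiom for a column-wide type change.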

In [6]:
y # the labels are sorted by class, so the data needs to be shuffled before training

0      0
1      0
2      0
3      0
4      0
5      0
6      0
7      0
8      0
9      0
10     0
11     0
12     0
13     0
14     0
15     0
16     0
17     0
18     0
19     0
20     0
21     0
22     0
23     0
24     0
25     0
26     0
27     0
28     0
29     0
      ..
120    2
121    2
122    2
123    2
124    2
125    2
126    2
127    2
128    2
129    2
130    2
131    2
132    2
133    2
134    2
135    2
136    2
137    2
138    2
139    2
140    2
141    2
142    2
143    2
144    2
145    2
146    2
147    2
148    2
149    2
Name: target, Length: 150, dtype: int64

## Train Test Split

In [7]:
from sklearn.model_selection import train_test_split

In [8]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
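Since the labels are sorted by class, it helps to know that `train_test_split` shuffles the rows by default. The sketch below goes one step further and is only a suggestion: it fixes `random_state` (42 is an arbitrary choice) so the split is repeatable, and passes `stratify` so each class keeps the same share in both parts. `X_demo` and `y_demo` are hypothetical stand-ins with the same shape and label order as the iris data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the iris data: 150 rows, labels sorted 50 x 0, 50 x 1, 50 x 2.
X_demo = np.arange(150 * 4, dtype=float).reshape(150, 4)
y_demo = np.repeat([0, 1, 2], 50)

# random_state makes the split repeatable; stratify keeps class proportions equal.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo
)

print(len(X_tr), len(X_te))   # rows in the training and test parts
print(np.bincount(y_te))      # test rows per class
```

Without `stratify`, the class counts in the test set vary from run to run, which is why the support column in a classification report is not always balanced.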

# Estimators

Let's show you how to use the simpler Estimator interface!

In [11]:
import tensorflow as tf



## Feature Columns
Let's create feature columns for the TensorFlow Estimator.

In [12]:
X.columns

Index(['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], dtype='object')

In [13]:
# Type `tf.feature_column.` and press Tab to list the available functions.
# (Running the incomplete expression as a cell raises a SyntaxError.)

In [14]:
feat_cols = []

for col in X.columns:
    # numeric_column is the column type for numerical features;
    # tf.feature_column also has dedicated column types for categorical features.
    feat_cols.append(tf.feature_column.numeric_column(col))

In [16]:
feat_cols

[_NumericColumn(key='sepal_length', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 _NumericColumn(key='sepal_width', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 _NumericColumn(key='petal_length', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None),
 _NumericColumn(key='petal_width', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)]

## Input Function

In [17]:
# tf.estimator.inputs offers two main input functions, one for pandas and one for NumPy. We use the pandas one here.
# This is the training input function.
# batch_size: the number of rows fed to the model per step. If the loss comes back as NaN, try a smaller batch_size.
# num_epochs: one epoch is one full pass through the training data, so num_epochs=5 supplies the training data five times.
# shuffle: shuffles the rows, which matters when the target values are sorted, as they are here.
input_func = tf.estimator.inputs.pandas_input_fn(x=X_train,y=y_train,batch_size=10,num_epochs=5,shuffle=True)
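To make `batch_size`, `num_epochs` and `shuffle` more concrete, here is a simplified NumPy sketch of how rows could be grouped into batches. It is not how `pandas_input_fn` is implemented (the real function may, for example, let a batch span two epochs), and `batch_indices` is a hypothetical helper. The 105 training rows are an assumption based on a 70% split of 150 rows.

```python
import numpy as np

def batch_indices(n_rows, batch_size, num_epochs, shuffle, seed=0):
    """Yield one array of row indices per batch (simplified sketch)."""
    rng = np.random.default_rng(seed)
    for _ in range(num_epochs):
        # A fresh row order for every epoch when shuffle is on.
        order = rng.permutation(n_rows) if shuffle else np.arange(n_rows)
        for start in range(0, n_rows, batch_size):
            yield order[start:start + batch_size]

batches = list(batch_indices(n_rows=105, batch_size=10, num_epochs=5, shuffle=True))
rows_seen = sum(len(b) for b in batches)

print(len(batches), rows_seen)
```

In this sketch every epoch ends with one short batch of 5 rows, so five epochs give 55 batches covering 525 rows in total.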

In [18]:
# DNNClassifier stands for deep neural network classifier.
# hidden_units defines how many hidden layers there are and how many neurons each one has: the first layer has 10 neurons,
# the second has 20, and the third has 10. n_classes is the number of classes we expect.
classifier = tf.estimator.DNNClassifier(hidden_units=[10, 20, 10], n_classes=3,feature_columns=feat_cols)
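To picture what `hidden_units=[10, 20, 10]` and `n_classes=3` describe, the following NumPy sketch builds an untrained network with the same layer sizes and runs one row through it. It assumes ReLU hidden activations and a softmax output; the weights are random, so the probabilities are meaningless and only the shapes matter. All names here are illustrative and not part of the Estimator API.

```python
import numpy as np

rng = np.random.default_rng(0)
layer_sizes = [4, 10, 20, 10, 3]   # 4 features, hidden_units=[10, 20, 10], 3 classes

# One random weight matrix and one zero bias vector per layer (untrained).
weights = [rng.normal(scale=0.1, size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

def forward(x):
    """Return class probabilities for a batch of rows with 4 features each."""
    h = x
    for w, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ w + b, 0.0)            # ReLU hidden layer
    logits = h @ weights[-1] + biases[-1]          # one logit per class
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)    # softmax

probs = forward(np.array([[5.1, 3.5, 1.4, 0.2]]))
print([w.shape for w in weights])
print(probs.shape)
```

The four weight matrices, of shapes (4, 10), (10, 20), (20, 10) and (10, 3), are what the Estimator creates and trains for you behind the scenes.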

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/var/folders/1s/fc5ymx_j00vdg383_t0__vk80000gn/T/tmp4rk_q28h', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x11220f668>, '_task_type': 'worker', '_task_id': 0, '_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


In [19]:
# The last step is to train the model
classifier.train(input_fn=input_func,steps=50) # steps: the number of training steps (batches) to run
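It can help to check how `steps` relates to `batch_size` and `num_epochs`. Training ends after the requested number of steps or when the input function runs out of data, whichever comes first. The quick arithmetic below assumes 105 training rows (70% of 150).

```python
import math

n_train = 105      # assumed: 70% of the 150 iris rows
batch_size = 10
num_epochs = 5
steps = 50

rows_available = n_train * num_epochs                      # rows the input function can supply
steps_available = math.ceil(rows_available / batch_size)   # batches before the data runs out
rows_used = steps * batch_size                             # rows consumed by 50 steps
epochs_used = rows_used / n_train

print(rows_available, steps_available, rows_used, round(epochs_used, 2))
```

Under these assumptions the input function could feed about 53 batches, so 50 steps stop slightly before the fifth epoch finishes, which is consistent with the log saving the final checkpoint at step 50.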

INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Saving checkpoints for 1 into /var/folders/1s/fc5ymx_j00vdg383_t0__vk80000gn/T/tmp4rk_q28h/model.ckpt.
INFO:tensorflow:loss = 21.775734, step = 1
INFO:tensorflow:Saving checkpoints for 50 into /var/folders/1s/fc5ymx_j00vdg383_t0__vk80000gn/T/tmp4rk_q28h/model.ckpt.
INFO:tensorflow:Loss for final step: 5.146884.


<tensorflow.python.estimator.canned.dnn.DNNClassifier at 0x11a1635f8>

## Model Evaluation

**Use the predict method from the classifier model to create predictions from X_test**

In [20]:
# Now use the estimator to predict on our test dataset (no y is passed, and shuffle=False keeps the row order)
pred_fn = tf.estimator.inputs.pandas_input_fn(x=X_test,batch_size=len(X_test),shuffle=False)

In [21]:
note_predictions = list(classifier.predict(input_fn=pred_fn))

INFO:tensorflow:Restoring parameters from /var/folders/1s/fc5ymx_j00vdg383_t0__vk80000gn/T/tmp4rk_q28h/model.ckpt-50


In [22]:
note_predictions # 'probabilities' holds the predicted probability of each class

[{'class_ids': array([0]),
  'classes': array([b'0'], dtype=object),
  'logits': array([ 4.8443155, -0.7857063, -2.9842076], dtype=float32),
  'probabilities': array([9.9602926e-01, 3.5742472e-03, 3.9663195e-04], dtype=float32)},
 {'class_ids': array([0]),
  'classes': array([b'0'], dtype=object),
  'logits': array([ 5.0409656, -0.8294249, -3.0936263], dtype=float32),
  'probabilities': array([9.9689460e-01, 2.8130086e-03, 2.9230805e-04], dtype=float32)},
 {'class_ids': array([2]),
  'classes': array([b'2'], dtype=object),
  'logits': array([-3.3950083,  2.4074574,  2.7449396], dtype=float32),
  'probabilities': array([0.00125605, 0.41589814, 0.5828458 ], dtype=float32)},
 {'class_ids': array([2]),
  'classes': array([b'2'], dtype=object),
  'logits': array([-3.434986 ,  2.3788385,  2.5214381], dtype=float32),
  'probabilities': array([0.0013848 , 0.46376726, 0.5348479 ], dtype=float32)},
 {'class_ids': array([2]),
  'classes': array([b'2'], dtype=object),
  'logits': array([-3.6387484

In [23]:
note_predictions[0]

{'class_ids': array([0]),
 'classes': array([b'0'], dtype=object),
 'logits': array([ 4.8443155, -0.7857063, -2.9842076], dtype=float32),
 'probabilities': array([9.9602926e-01, 3.5742472e-03, 3.9663195e-04], dtype=float32)}
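The `logits` and `probabilities` entries are related through the softmax function, and `class_ids` is the index of the largest probability. A minimal NumPy check, using the numbers from the first prediction above:

```python
import numpy as np

# Values copied from the first prediction shown above.
logits = np.array([4.8443155, -0.7857063, -2.9842076])
reported = np.array([9.9602926e-01, 3.5742472e-03, 3.9663195e-04])

# Softmax: exponentiate (after subtracting the max for numerical stability), then normalise.
shifted = logits - logits.max()
probs = np.exp(shifted) / np.exp(shifted).sum()

print(probs)
print(int(np.argmax(probs)))   # index of the most likely class
```

The recomputed probabilities agree with the reported ones up to float32 rounding, and the argmax matches the reported `class_ids` value of 0.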

In [24]:
note_predictions[0]['class_ids'][0]

0

In [25]:
final_preds  = []
for pred in note_predictions:
    final_preds.append(pred['class_ids'][0])

**Now create a classification report and a confusion matrix. Does anything stand out to you?**

In [26]:
from sklearn.metrics import classification_report,confusion_matrix

In [27]:
print(confusion_matrix(y_test,final_preds))

[[13  0  0]
 [ 0 17  1]
 [ 0  1 13]]


In [28]:
print(classification_report(y_test,final_preds))

             precision    recall  f1-score   support

          0       1.00      1.00      1.00        13
          1       0.94      0.94      0.94        18
          2       0.93      0.93      0.93        14

avg / total       0.96      0.96      0.96        45
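The numbers in the classification report can be recovered from the confusion matrix by hand. In scikit-learn's convention, rows are the true classes and columns are the predicted classes. A short NumPy sketch, using the matrix printed above:

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true class, columns: predicted class).
cm = np.array([[13, 0, 0],
               [0, 17, 1],
               [0, 1, 13]])

true_pos = np.diag(cm)
precision = true_pos / cm.sum(axis=0)   # column sums: rows predicted as each class
recall = true_pos / cm.sum(axis=1)      # row sums: rows that truly belong to each class
f1 = 2 * precision * recall / (precision + recall)
accuracy = true_pos.sum() / cm.sum()

print(np.round(precision, 2), np.round(recall, 2), np.round(f1, 2))
print(round(float(accuracy), 2))
```

Only two of the 45 test rows are misclassified, and both errors are confusions between classes 1 and 2, while class 0 is separated perfectly.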



# Great Job!
