## Scikit-Learn-style interface of TensorFlow

TensorFlow's Estimator objects let us build models quickly without manually defining the graph and running it in a TensorFlow session, as we did with the MNIST dataset. The workflow mirrors scikit-learn's:

1. Read in the data (normalize if necessary)
2. Split the data into train and test sets
3. Create the estimator's feature columns
4. Create the estimator's input function
5. Train the estimator model
6. Predict with a new test input function

In [1]:
import pandas as pd

In [2]:
df = pd.read_csv('iris.csv')

In [3]:
df.head()

   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  target
0                5.1               3.5                1.4               0.2     0.0
1                4.9               3.0                1.4               0.2     0.0
2                4.7               3.2                1.3               0.2     0.0
3                4.6               3.1                1.5               0.2     0.0
4                5.0               3.6                1.4               0.2     0.0


In [4]:
# TensorFlow feature columns cannot use names with spaces or special characters,
# and the classifier expects integer class labels, so rename the columns and cast the target.
df.columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'target']
df['target'] = df['target'].astype(int)
df.head()

   sepal_length  sepal_width  petal_length  petal_width  target
0           5.1          3.5           1.4          0.2       0
1           4.9          3.0           1.4          0.2       0
2           4.7          3.2           1.3          0.2       0
3           4.6          3.1           1.5          0.2       0
4           5.0          3.6           1.4          0.2       0
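
The renaming above is done by hand. For wider tables, a small helper could derive safe names automatically. The following is a minimal sketch: `sanitize_column` is a hypothetical helper, not part of pandas or TensorFlow, and unlike the manual renaming it keeps the unit suffix (for example `sepal_length_cm`).

```python
import re

def sanitize_column(name):
    """Turn an arbitrary column label into a lowercase snake_case identifier."""
    # Collapse every run of characters other than letters, digits, or "_" into one "_".
    cleaned = re.sub(r'[^0-9A-Za-z_]+', '_', name.strip())
    # Drop underscores left over at either end, e.g. from a trailing ")".
    return cleaned.strip('_').lower()

raw_columns = ['sepal length (cm)', 'sepal width (cm)',
               'petal length (cm)', 'petal width (cm)', 'target']
safe_columns = [sanitize_column(c) for c in raw_columns]
print(safe_columns)
```

With a DataFrame, this could be applied as `df.columns = [sanitize_column(c) for c in df.columns]`.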


In [5]:
# Features are every column except the label
x = df.drop('target', axis=1)
y = df['target']

In [6]:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)

In [7]:
import tensorflow as tf

In [8]:
# One numeric feature column per input feature
feat_cols = [tf.feature_column.numeric_column(col) for col in x.columns]

In [9]:
# Input function that feeds small shuffled batches (10 rows) for 5 passes over the training data
input_func = tf.estimator.inputs.pandas_input_fn(x=x_train, y=y_train, batch_size=10, num_epochs=5, shuffle=True)

In [10]:
# DNN classifier with three hidden layers (10, 20 and 10 units) and 3 output classes
classifier = tf.estimator.DNNClassifier(hidden_units=[10, 20, 10], n_classes=3, feature_columns=feat_cols)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/var/folders/7c/lrkh85v52hgcjnt6dwmwk1fh0000gn/T/tmpe_wp2uq4', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x1109e2cf8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


In [11]:
# Train for at most 50 steps (one step per batch)
classifier.train(input_fn=input_func, steps=50)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /var/folders/7c/lrkh85v52hgcjnt6dwmwk1fh0000gn/T/tmpe_wp2uq4/model.ckpt.
INFO:tensorflow:loss = 10.687069, step = 1
INFO:tensorflow:Saving checkpoints for 50 into /var/folders/7c/lrkh85v52hgcjnt6dwmwk1fh0000gn/T/tmpe_wp2uq4/model.ckpt.
INFO:tensorflow:Loss for final step: 2.109293.


<tensorflow.python.estimator.canned.dnn.DNNClassifier at 0x123658828>
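
The `steps=50` limit interacts with the batch size and epoch count of the input function. As a rough check, the sketch below counts the batches the input function can supply. It assumes the dataset has 150 rows, so that `test_size=0.3` leaves 105 training rows (the 45 test rows match the support total in the final report), and that the last partial batch is also emitted.

```python
import math

n_train = 105      # assumed: 150 rows with test_size=0.3
batch_size = 10
num_epochs = 5
max_steps = 50     # the `steps` argument passed to train()

rows_served = n_train * num_epochs                        # 525 rows in total
batches_available = math.ceil(rows_served / batch_size)   # 53 batches
steps_run = min(max_steps, batches_available)

print(batches_available, steps_run)
```

Under these assumptions training stops at the step limit, shortly before the input function is exhausted, which is consistent with the final checkpoint being saved at step 50 in the log above.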

In [12]:
# Input function for prediction: no labels, a single batch, original row order
pred_func = tf.estimator.inputs.pandas_input_fn(x=x_test, batch_size=len(x_test), shuffle=False)

In [13]:
# predict() returns a generator; collect it into a list with one dict per test row
prediction = list(classifier.predict(input_fn=pred_func))

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /var/folders/7c/lrkh85v52hgcjnt6dwmwk1fh0000gn/T/tmpe_wp2uq4/model.ckpt-50
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.


In [14]:
# Extract the predicted class label from each prediction dict
final_pred = [pred['class_ids'][0] for pred in prediction]
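
Each element yielded by `predict` is a dictionary of per-row outputs. The sketch below uses hand-written stand-in dictionaries to show why `['class_ids'][0]` is needed: the class id comes wrapped in a length-one array. The numbers are made up for illustration rather than taken from this model, plain lists stand in for NumPy arrays, and only two of the keys are shown.

```python
# Stand-ins for two prediction dicts; real ones hold NumPy arrays and more keys.
mock_predictions = [
    {'probabilities': [0.90, 0.07, 0.03], 'class_ids': [0]},
    {'probabilities': [0.05, 0.15, 0.80], 'class_ids': [2]},
]

# Unwrap the single class id from each dict.
labels = [pred['class_ids'][0] for pred in mock_predictions]

# The class id is the index of the largest probability.
argmax_labels = [
    max(range(len(pred['probabilities'])), key=pred['probabilities'].__getitem__)
    for pred in mock_predictions
]

print(labels, argmax_labels)
```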

In [15]:
from sklearn.metrics import classification_report, confusion_matrix
print(confusion_matrix(y_test, final_pred))
print(classification_report(y_test, final_pred))

[[18  0  0]
 [ 0 12  1]
 [ 0  0 14]]
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        18
           1       1.00      0.92      0.96        13
           2       0.93      1.00      0.97        14

   micro avg       0.98      0.98      0.98        45
   macro avg       0.98      0.97      0.98        45
weighted avg       0.98      0.98      0.98        45
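
The per-class figures in the report follow directly from the confusion matrix, whose rows are the true classes and whose columns are the predicted classes. As a sanity check, this sketch recomputes them in plain Python from the matrix printed above.

```python
# Confusion matrix copied from the output above (rows: true class, columns: predicted class).
cm = [[18, 0, 0],
      [0, 12, 1],
      [0, 0, 14]]

n = len(cm)
total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(n)) / total

rows = []
for k in range(n):
    tp = cm[k][k]                                   # correctly predicted as class k
    predicted_k = sum(cm[i][k] for i in range(n))   # column sum: everything predicted as k
    actual_k = sum(cm[k])                           # row sum: the support of class k
    precision = tp / predicted_k
    recall = tp / actual_k
    f1 = 2 * precision * recall / (precision + recall)
    rows.append((k, round(precision, 2), round(recall, 2), round(f1, 2), actual_k))

for row in rows:
    print(row)
print('accuracy', round(accuracy, 2))
```

The single misclassification (a class 1 sample predicted as class 2) lowers the recall of class 1 and the precision of class 2, and nothing else.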

