#### A Deep Neural Network (DNN) model to discern Iris flowers
We'll build a DNN model to predict the species of an Iris flower given only the flower's measurable features (sepal and petal length and width).

We will use Google's TensorFlow: an open source machine learning tool for everyone.

The species we would like to distinguish are Iris Versicolor, Iris Virginica, and Iris Setosa.

The flowers look like this:

* Iris Virginica
![Iris Virginica](images/178px-Iris_virginica.jpg)

* Iris Versicolor
![Iris_versicolor](images/193px-Iris_versicolor_3.jpg)

* Iris Setosa
![Iris Setosa](images/109px-Kosaciec_szczecinkowaty_Iris_setosa.jpg)

The data columns we have are:
* sepal length in cm
* sepal width in cm
* petal length in cm
* petal width in cm

These data points will be used to train the model and test its accuracy.

Time to write some Neural Nets!


In [183]:
# Step 1: split the data into training and testing data
# Key points to note about the data frame that we wish to form:
#   * Setosa's value = 0, Versicolor = 1 and Virginica = 2
#   * The species column will be changed to contain these values before we split the data.

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

# read data from the csv file
actual_data = pd.read_csv("iris.csv")
data = pd.read_csv("iris.csv", skiprows=[0], header=None)
labels_data = pd.read_csv("iris.csv", usecols=[4], skiprows=[0], header=None)
labels = np.unique(np.array(labels_data, 'str'))
print (labels)

# iterate through each row, changing the last column (4) to have integers instead of flower names
for index, row in data.iterrows():
    if row[4] == labels[0]:
        data.loc[index, 4] = 0
    elif row[4] == labels[1]:
        data.loc[index, 4] = 1
    else:
        data.loc[index, 4] = 2

# shuffle the newly formatted data
data = data.sample(frac=1).reset_index(drop=True)
# take the labels from the shuffled frame so they stay aligned with the feature rows
y = data[4]

# split that data, 80-20 rule
X_train, X_test, y_train, y_test = train_test_split(data, y, test_size=0.2)
print (X_test.head())

# write the split data to csv
# but first create a custom header for the training and test csv files
# and omit the species column from the header since that's what we want to predict (hence the "- 1")
X_train_header = list((len(X_train.index), len(X_train.columns) - 1))  + list(labels)
X_test_header = list((len(X_test.index), len(X_test.columns) - 1)) + list(labels)
print (X_train_header)
print (X_test_header)

# write the split data to csv files
X_train.to_csv("iris_training.csv", index=False, index_label=False, header=X_train_header)
X_test.to_csv("iris_test.csv", index=False, index_label=False, header=X_test_header)

['setosa' 'versicolor' 'virginica']
       0    1    2    3  4
50   7.9  3.8  6.4  2.0  2
95   4.4  2.9  1.4  0.2  0
143  6.4  3.2  5.3  2.3  2
89   6.1  3.0  4.6  1.4  1
30   5.9  3.0  4.2  1.5  1
[120, 4, 'setosa', 'versicolor', 'virginica']
[30, 4, 'setosa', 'versicolor', 'virginica']
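
The encoding and 80-20 split performed in the cell above can also be sketched with the standard library alone. This is only an illustration under stated assumptions: the ten rows are a small hand-typed stand-in for `iris.csv`, and the names `rows`, `species_to_id`, `train_rows` and `test_rows` are hypothetical, not part of the notebook.

```python
import random

# Hypothetical stand-in for iris.csv:
# (sepal length, sepal width, petal length, petal width, species)
rows = [
    (5.1, 3.5, 1.4, 0.2, "setosa"),
    (7.0, 3.2, 4.7, 1.4, "versicolor"),
    (6.3, 3.3, 6.0, 2.5, "virginica"),
    (4.9, 3.0, 1.4, 0.2, "setosa"),
    (6.4, 3.2, 4.5, 1.5, "versicolor"),
    (5.8, 2.7, 5.1, 1.9, "virginica"),
    (4.7, 3.2, 1.3, 0.2, "setosa"),
    (6.9, 3.1, 4.9, 1.5, "versicolor"),
    (7.1, 3.0, 5.9, 2.1, "virginica"),
    (4.6, 3.1, 1.5, 0.2, "setosa"),
]

# Sorted unique names: the same ordering np.unique produces for the labels.
species_names = sorted({row[-1] for row in rows})
species_to_id = {name: i for i, name in enumerate(species_names)}

# Replace the species name in the last column with its integer id.
encoded = [row[:-1] + (species_to_id[row[-1]],) for row in rows]

# Shuffle with a fixed seed so the sketch is repeatable, then split 80-20.
rng = random.Random(0)
rng.shuffle(encoded)
split_at = len(encoded) * 80 // 100
train_rows, test_rows = encoded[:split_at], encoded[split_at:]

print(species_to_id)
print(len(train_rows), len(test_rows))
```

Because the label travels inside each row, shuffling cannot separate a flower's measurements from its species.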


We format the header of the training and test CSV files as
`(number of rows, number of feature columns, 'setosa', 'versicolor', 'virginica')` so that
the `load_csv_with_header` function can read our data. Data preparation matters. :-)
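
To see why the first two header fields matter, here is a rough stand-in for a reader that consumes this layout. It is a sketch only: `read_csv_with_header` is a hypothetical helper written for illustration, not the TensorFlow function, and the three-row file is made up to keep the example short.

```python
import csv
import io

# Hypothetical miniature file in the layout written above:
# header = (number of rows, number of feature columns, class names...)
csv_text = (
    "3,4,setosa,versicolor,virginica\n"
    "5.1,3.5,1.4,0.2,0\n"
    "7.0,3.2,4.7,1.4,1\n"
    "6.3,3.3,6.0,2.5,2\n"
)


def read_csv_with_header(handle):
    """Illustrative reader: parse the counts from the header, then the rows."""
    reader = csv.reader(handle)
    header = next(reader)
    n_samples, n_features = int(header[0]), int(header[1])

    features, targets = [], []
    for row in reader:
        targets.append(int(row[-1]))                   # last column is the class id
        features.append([float(v) for v in row[:-1]])  # the rest are measurements

    # The counts promised by the header must agree with the body.
    if len(features) != n_samples:
        raise ValueError("row count does not match header")
    if any(len(f) != n_features for f in features):
        raise ValueError("feature count does not match header")
    return features, targets


features, targets = read_csv_with_header(io.StringIO(csv_text))
print(targets)
```

A reader like this can allocate its arrays up front from the two counts, which is one reason to store them in the header.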

In [184]:
import tensorflow as tf
from tensorflow.contrib.learn.python.learn.datasets import base

# Data files containing our sepal and petal features
IRIS_TRAINING = "iris_training.csv"
IRIS_TEST = "iris_test.csv"

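# Note: the original cell used np.int, an alias for the built-in int
# that was removed in NumPy 1.24; int behaves the same here.
training_set = base.load_csv_with_header(
    filename=IRIS_TRAINING,
    features_dtype=np.float32,
    target_dtype=int)

test_set = base.load_csv_with_header(
    filename=IRIS_TEST,
    features_dtype=np.float32,
    target_dtype=int)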

print(test_set.target)




[2 0 2 1 1 1 0 1 2 0 2 0 2 2 0 1 0 1 1 0 0 2 0 2 1 0 1 1 2 0]


In [185]:
# Specify that all feature columns have real-value data
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=4)]

Now, we build the Deep Neural Network classifier with three hidden layers.

In [201]:
classifier = tf.contrib.learn.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[10, 20, 10],
    n_classes=3,
    model_dir="/tmp/iris_model"
)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_task_type': None, '_task_id': 0, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x1c21f97320>, '_master': '', '_num_ps_replicas': 0, '_num_worker_replicas': 0, '_environment': 'local', '_is_chief': True, '_evaluation_master': '', '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1
}
, '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_secs': 600, '_log_step_count_steps': 100, '_session_config': None, '_save_checkpoints_steps': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_model_dir': '/tmp/iris_model'}
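
To make `hidden_units=[10, 20, 10]` concrete, the sketch below counts the weights and biases such a network would hold. It assumes plain fully connected layers with one bias per unit; the estimator may keep additional variables (for example optimizer state), so treat the total as a lower bound rather than an exact figure for `DNNClassifier`.

```python
n_features = 4               # sepal length/width, petal length/width
hidden_units = [10, 20, 10]  # same values passed to the classifier above
n_classes = 3                # setosa, versicolor, virginica

# Sizes of every layer, from the input to the output logits.
layer_sizes = [n_features] + hidden_units + [n_classes]

total = 0
for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
    # One weight per (input, output) pair, plus one bias per output unit.
    n_params = n_in * n_out + n_out
    total += n_params
    print(f"{n_in:>2} -> {n_out:>2}: {n_params} parameters")

print("total:", total)  # total: 513
```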


Next, we define the training inputs.

In [202]:
def training_inputs():
    x = tf.constant(training_set.data)
    y = tf.constant(training_set.target)
    
    return x,y

The next step is to fit the model to the training data. Fitting is where the model is trained.

In [216]:
# fit the classifier with the training data
classifier.fit(input_fn=training_inputs, steps=2000)
    

INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Restoring parameters from /tmp/iris_model/model.ckpt-14000
INFO:tensorflow:Saving checkpoints for 14001 into /tmp/iris_model/model.ckpt.
INFO:tensorflow:loss = 0.04674709, step = 14001
INFO:tensorflow:global_step/sec: 783.448
INFO:tensorflow:loss = 0.046954535, step = 14101 (0.131 sec)
INFO:tensorflow:global_step/sec: 860.518
INFO:tensorflow:loss = 0.047011122, step = 14201 (0.115 sec)
INFO:tensorflow:global_step/sec: 890.17
INFO:tensorflow:loss = 0.046789907, step = 14301 (0.111 sec)
INFO:tensorflow:global_step/sec: 637.91
INFO:tensorflow:loss = 0.046695717, step = 14401 (0.159 sec)
INFO:tensorflow:global_step/sec: 778.683
INFO:tensorflow:loss = 0.04667021, step = 14501 (0.127 sec)
INFO:tensorflow:global_step/sec: 735.148
INFO:tensorflow:loss = 0.046662346, step = 14601 (0.141 sec)
INFO:tensorflow:global_step/sec: 890.48
INFO:tensorflow:loss = 0.046659503, step = 14701 (0.108 sec)
INFO:tensorflow:global_step/sec: 678.196
INFO

DNNClassifier(params={'head': <tensorflow.contrib.learn.python.learn.estimators.head._MultiClassHead object at 0x1c21f97358>, 'hidden_units': [10, 20, 10], 'feature_columns': (_RealValuedColumn(column_name='', dimension=4, default_value=None, dtype=tf.float32, normalizer=None),), 'optimizer': None, 'activation_fn': <function relu at 0x1160d4d90>, 'dropout': None, 'gradient_clip_norm': None, 'embedding_lr_multipliers': None, 'input_layer_min_slice_size': None})

In [204]:
# Define test inputs
def test_inputs():
    x = tf.constant(test_set.data)
    y = tf.constant(test_set.target)
    
    return x, y

After training, we evaluate the accuracy of our trained model using the `evaluate` method.
It takes an input function that supplies the test features and targets to build its input data pipeline. After measuring the model's accuracy, it returns a dictionary containing the results.

In [205]:
# evaluate the classifier's accuracy
accuracy_score = classifier.evaluate(input_fn=test_inputs, steps=1)['accuracy']
print ("Accuracy score", accuracy_score)

INFO:tensorflow:Starting evaluation at 2018-03-05-11:23:42
INFO:tensorflow:Restoring parameters from /tmp/iris_model/model.ckpt-10000
INFO:tensorflow:Evaluation [1/1]
INFO:tensorflow:Finished evaluation at 2018-03-05-11:23:43
INFO:tensorflow:Saving dict for global step 10000: accuracy = 1.0, global_step = 10000, loss = 0.015831133
Accuracy score 1.0
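
The accuracy figure itself is simply the fraction of test flowers whose predicted class matches the true class. A minimal sketch of that calculation follows; the `accuracy` helper and the `predicted` list are hypothetical and chosen only to show a case with one mistake.

```python
def accuracy(predicted, actual):
    """Fraction of positions where the predicted class equals the true class."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


actual = [2, 0, 2, 1, 1]     # true class ids for five flowers
predicted = [2, 0, 1, 1, 1]  # hypothetical predictions with one mistake

print(accuracy(predicted, actual))  # 0.8
```

With only 30 test flowers, each misclassification moves the score by about 3.3 percentage points, so a perfect 1.0 on such a small set should be read with some caution.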


Time to see if our model can predict the type of Iris flower given a new flower sample. 

In [206]:
# Classify two new flower samples.
def new_flower_samples():
    return np.array(
        [[6.4, 3.2, 4.5, 1.5],
        [5.8, 3.1, 5.0, 1.7]], dtype=np.float32)


In [207]:
# Predict the type of Iris flower
prediction = classifier.predict_classes(input_fn=new_flower_samples)
print (list(prediction))

INFO:tensorflow:Restoring parameters from /tmp/iris_model/model.ckpt-10000
[1, 2]
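
The classifier returns class ids, not names. Using the encoding chosen in Step 1 (setosa = 0, versicolor = 1, virginica = 2), the ids can be mapped back to species names. The variable names below are illustrative only.

```python
# Same ordering as the labels printed in Step 1.
labels = ["setosa", "versicolor", "virginica"]

predicted_ids = [1, 2]  # the output of the prediction cell above
predicted_names = [labels[i] for i in predicted_ids]

print(predicted_names)  # ['versicolor', 'virginica']
```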


We made it!!

Now, let's try creating a linear model and compare its predictions with those of the DNN we just created.

In [None]:
# Create a built in linear model classifier
