# TensorFlow / Keras Iris Data Set Problem Solution

In the following Jupyter notebook we will build a neural network that classifies the flowers in the Iris data set. I will walk you through this problem set as if you were a novice programmer.

##### What is the Iris data set, you ask?

If you didn't read the README, let me summarise it:

The Iris data set contains 50 samples of each of 3 species of iris flower:

1. Iris Setosa
2. Iris Virginica
3. Iris Versicolor

The data has 4 measurements from each sample:

1. Sepal length
2. Sepal width
3. Petal length
4. Petal width

If you'd like to know more, go [here](https://archive.ics.uci.edu/ml/datasets/iris).

Let's get started!

In [1]:
# Reference: https://github.com/emerging-technologies/keras-iris

import csv
import numpy as np
import keras as kr 

Using TensorFlow backend.


###### Wait, what's TensorFlow?

Well, in mathematics, tensors are geometric objects that describe linear relations between geometric vectors, scalars, and other tensors. TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. If you would like to know more about TensorFlow, I'd start [here](https://www.tensorflow.org/).

###### What does that all mean?

We're building a neural network. A neural network is a system of hardware and/or software patterned after the operations of neurons in the human brain. Neural networks, also called artificial neural networks (ANNs), are the building blocks of deep learning. You can read up on neural networks [here](https://en.wikipedia.org/wiki/Artificial_neural_network) if you so wish!

OK, first let's load up our data set!

### Load the Iris dataset into a list and then split the data into arrays

To do this we're going to create a list that will contain all of our data, then populate arrays using sub-elements of that list.

In [2]:
# Load the Iris dataset, skipping the first (header) row.
with open("./data/IRIS.csv") as f:
    iris = list(csv.reader(f))[1:]

# 4 inputs: sepal length & width, petal length & width.
inputs = np.array(iris)[:,:4].astype(float)

# Outputs are initially strings: setosa, versicolor, virginica.
outputs = np.array(iris)[:,4]

# Convert the output strings to ints.
outputs_vals, outputs_inds = np.unique(outputs, return_inverse=True)

print(inputs)

[[ 5.1  3.5  1.4  0.2]
 [ 4.9  3.   1.4  0.2]
 [ 4.7  3.2  1.3  0.2]
 [ 4.6  3.1  1.5  0.2]
 [ 5.   3.6  1.4  0.2]
 [ 5.4  3.9  1.7  0.4]
 [ 4.6  3.4  1.4  0.3]
 [ 5.   3.4  1.5  0.2]
 [ 4.4  2.9  1.4  0.2]
 [ 4.9  3.1  1.5  0.1]
 [ 5.4  3.7  1.5  0.2]
 [ 4.8  3.4  1.6  0.2]
 [ 4.8  3.   1.4  0.1]
 [ 4.3  3.   1.1  0.1]
 [ 5.8  4.   1.2  0.2]
 [ 5.7  4.4  1.5  0.4]
 [ 5.4  3.9  1.3  0.4]
 [ 5.1  3.5  1.4  0.3]
 [ 5.7  3.8  1.7  0.3]
 [ 5.1  3.8  1.5  0.3]
 [ 5.4  3.4  1.7  0.2]
 [ 5.1  3.7  1.5  0.4]
 [ 4.6  3.6  1.   0.2]
 [ 5.1  3.3  1.7  0.5]
 [ 4.8  3.4  1.9  0.2]
 [ 5.   3.   1.6  0.2]
 [ 5.   3.4  1.6  0.4]
 [ 5.2  3.5  1.5  0.2]
 [ 5.2  3.4  1.4  0.2]
 [ 4.7  3.2  1.6  0.2]
 [ 4.8  3.1  1.6  0.2]
 [ 5.4  3.4  1.5  0.4]
 [ 5.2  4.1  1.5  0.1]
 [ 5.5  4.2  1.4  0.2]
 [ 4.9  3.1  1.5  0.2]
 [ 5.   3.2  1.2  0.2]
 [ 5.5  3.5  1.3  0.2]
 [ 4.9  3.6  1.4  0.1]
 [ 4.4  3.   1.3  0.2]
 [ 5.1  3.4  1.5  0.2]
 [ 5.   3.5  1.3  0.3]
 [ 4.5  2.3  1.3  0.3]
 [ 4.4  3.2  1.3  0.2]
 [ 5.   3.5

###### What are those box things? 

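These: [1:], [:,:4], [:,4]?

That's called slicing! It allows us to take a set of sub-elements from a list, tuple, or array without writing any boring for loops. The first one, [1:], slices a plain Python list. The other two use NumPy's two-dimensional form, where the part before the comma selects rows and the part after it selects columns. It's explained in detail [here](http://www.pythoncentral.io/how-to-slice-listsarrays-and-tuples-in-python/), pretty handy!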
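
To make those three slices concrete, here is a small sketch on a made-up table. The header names and the three rows are my own assumptions for illustration, not the real contents of IRIS.csv.

```python
import numpy as np

# A small made-up table in the same layout as the CSV: a header row,
# then four measurements and a species name per row.
rows = [
    ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"],
    ["5.1", "3.5", "1.4", "0.2", "setosa"],
    ["7.0", "3.2", "4.7", "1.4", "versicolor"],
    ["6.3", "3.3", "6.0", "2.5", "virginica"],
]

# [1:] on a list: everything from index 1 onwards, i.e. skip the header.
data = rows[1:]

# [:, :4] on a 2-D array: every row, columns 0 to 3 (the measurements).
measurements = np.array(data)[:, :4].astype(float)

# [:, 4] on a 2-D array: every row, column 4 only (the species name).
species = np.array(data)[:, 4]

print(measurements.shape)  # (3, 4)
print(species)
```

Note that the values come out of the CSV reader as strings, which is why the measurements need the extra conversion to float.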


### Encode and split into test & train subsets

Now we're going to encode the outputs and split the data into 2 subsets: the training set and the testing set.

###### Why?

Well, we do this so that we can first train our model with the training set and then see how well it performs on the test set, which it has never seen during training.

In [3]:
# encode the category integers as binary categorical variables.
outputs_cats = kr.utils.to_categorical(outputs_inds)

# split the input & output data sets into training and test subsets
inds = np.random.permutation(len(inputs))

train_inds, test_inds = np.array_split(inds, 2)

inputs_train, outputs_train = inputs[train_inds], outputs_cats[train_inds]
inputs_test, outputs_test = inputs[test_inds], outputs_cats[test_inds]
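
If you're wondering what the encoding step produces, here is a rough NumPy-only sketch. It is not how Keras implements to_categorical; I'm using np.eye as a stand-in, and the four labels and the fixed random seed are made up for the example.

```python
import numpy as np

labels = np.array(["setosa", "virginica", "versicolor", "setosa"])

# np.unique sorts the distinct names; return_inverse gives, for every
# label, its position in that sorted list.
names, indices = np.unique(labels, return_inverse=True)
# names   -> ['setosa' 'versicolor' 'virginica']
# indices -> [0 2 1 0]

# One-hot encoding: row i of the identity matrix has a single 1 in column i,
# so indexing it with the class indices yields one row per sample.
one_hot = np.eye(len(names))[indices]

# Shuffle the sample positions and cut them into two halves.
rng = np.random.default_rng(0)  # fixed seed only to make the sketch repeatable
order = rng.permutation(len(labels))
train_idx, test_idx = np.array_split(order, 2)

print(one_hot)
print(train_idx, test_idx)
```

Each row of one_hot has exactly one 1, in the column of that sample's species, which is the shape of target a softmax output layer is trained against.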

###### Hold on, where's this model you speak of?

We're getting to it now; we needed to prepare the data for our model first.

Now we're going to create our neural network, or model as we're calling it. To build this model we'll be using Keras, a high-level API that sits on top of TensorFlow. Check out their [documentation](https://keras.io/) for more information on why we're going to use it. But for now, all you need to know is it'll make our lives a bit easier.

###### Ok! How do we do that?

First we create our model as a linear stack of layers by assigning keras.models.Sequential() to it. Next we add our first layer by calling the .add() method, specifying the number of nodes in our hidden layer and the input shape. The model needs to know what input shape it should expect. For this reason, the first layer in a Sequential model (and only the first, because following layers can do automatic shape inference) needs to receive information about its input shape.

Dense implements the operation: ```output = activation(dot(input, kernel) + bias)``` where ```activation``` is the element-wise activation function passed as the activation argument.
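
To see that operation with actual numbers, here is a minimal NumPy sketch. The function name dense, the weights, and the inputs are all hypothetical and chosen only for illustration; Keras initialises and learns its own kernel and bias.

```python
import numpy as np

def dense(inputs, kernel, bias, activation):
    """Compute activation(dot(inputs, kernel) + bias) for a batch of rows."""
    return activation(np.dot(inputs, kernel) + bias)

def identity(x):
    """A do-nothing activation, so the raw weighted sums are visible."""
    return x

# Hypothetical numbers: 2 samples with 4 features, mapped to 3 units.
x = np.array([[5.1, 3.5, 1.4, 0.2],
              [7.0, 3.2, 4.7, 1.4]])
kernel = np.full((4, 3), 0.1)   # one weight per (input, unit) pair
bias = np.zeros(3)              # one bias per unit

out = dense(x, kernel, bias, identity)
print(out.shape)  # (2, 3)
```

The kernel has one row per input feature and one column per node, which is why a layer with 4 inputs and 16 nodes would hold a 4 x 16 kernel plus 16 biases.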

A sigmoid function is a mathematical function with a characteristic "S"-shaped curve. The term usually refers to the logistic function, ```sigmoid(x) = 1 / (1 + e^(-x))```, which squashes any real number into the range (0, 1).
![Sigmoid](https://qph.ec.quoracdn.net/main-qimg-2f0e7ccc8fd54e238ae46a3d5fcc6908?convert_to_webp=true.png)
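
As a quick sketch of that curve in code (my own few lines, not the Keras implementation), the logistic function can be written with NumPy like this:

```python
import numpy as np

def sigmoid(x):
    """Logistic function 1 / (1 + e^(-x)), applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))

# Large negative inputs land near 0, zero lands on 0.5,
# and large positive inputs land near 1.
values = sigmoid(np.array([-5.0, 0.0, 5.0]))
print(values)
```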

Then we'll add another layer with 3 nodes, one per species, and apply the softmax activation to it. In mathematics, the softmax function, or normalized exponential function, is a generalization of the logistic function that "squashes" a K-dimensional vector of arbitrary real values to a K-dimensional vector of real values in the range [0, 1] that add up to 1.
![Softmax](https://cdn-images-1.medium.com/max/1600/1*l6GNTFihUu0EuUMUGHMwpA.png)
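
Here is a small sketch of softmax in NumPy. The scores are hypothetical, and subtracting the maximum first is a common numerical trick I've added; it doesn't change the result.

```python
import numpy as np

def softmax(z):
    """Turn a vector of real scores into probabilities that sum to 1."""
    shifted = z - np.max(z)      # subtracting the max avoids overflow in exp
    exps = np.exp(shifted)
    return exps / np.sum(exps)

scores = np.array([2.0, 1.0, 0.1])   # hypothetical raw outputs for 3 classes
probs = softmax(scores)
print(probs, probs.sum())
```

The largest score gets the largest probability, so the node with the highest output is the species the network is betting on.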

So by the time all that's said and done, we should have something that looks similar to this!
![NN](https://ethervision.net/wp-content/uploads/2014/01/neural-network.png)


### Create a neural network, add layers & nodes

In [4]:
# create a neural network
model = kr.models.Sequential()

# add a hidden layer with 16 nodes that takes 4 input values
model.add(kr.layers.Dense(16, input_shape=(4,)))

# apply the sigmoid activation function to that layer
model.add(kr.layers.Activation("sigmoid"))

# add an output layer with 3 nodes, connected to the 16-node hidden layer
model.add(kr.layers.Dense(3))

# use the softmax activation function there
model.add(kr.layers.Activation("softmax"))


### Configure the model for training, fit using training data, & evaluate using the test data

First we will compile the model using the Adam optimizer and the categorical cross-entropy loss, which is suited to multi-class classification where each example belongs to a single class.
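
To give a feel for what the loss measures, here is a rough NumPy sketch of categorical cross-entropy. The function below and its eps clipping are my own assumptions for the example, not the exact Keras code, and the predictions are made up.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(true * log(predicted))."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)   # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Two one-hot targets and two hypothetical sets of predictions.
y_true = np.array([[1, 0, 0], [0, 1, 0]])
confident = np.array([[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]])
unsure = np.array([[0.4, 0.3, 0.3], [0.3, 0.4, 0.3]])

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

Both sets of predictions pick the right class, but the confident one gets the lower loss, so training pushes the network towards putting more probability on the correct species.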

###### What is the Adam optimizer? 

Adam is an optimization algorithm that can be used instead of the classical stochastic gradient descent procedure to update network weights iteratively based on training data. You can find out more [here](https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/).
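
For the curious, here is a toy sketch of the Adam update rule on a single number instead of a whole network. The helper name adam_minimize, the learning rate, the step count, and the function being minimised are all assumptions for illustration; Keras does this for every weight at once.

```python
import math

def adam_minimize(grad, w, steps=2000, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """Run the Adam update on one scalar weight and return its final value."""
    m = 0.0  # running average of the gradient
    v = 0.0  # running average of the squared gradient
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)   # correct the bias from starting at zero
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Minimise (w - 3)^2, whose gradient is 2 * (w - 3), starting from w = 0.
w_final = adam_minimize(lambda w: 2 * (w - 3), w=0.0)
print(w_final)
```

Dividing by the square root of the squared-gradient average is what sets Adam apart from plain gradient descent: the size of each step stays close to the learning rate no matter how large or small the raw gradient is.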

We then want to go ahead and train our model. To do this we will fit it to the training data for 100 epochs, that is, 100 full passes over the training set.

Then we will evaluate the model on the test data and output the results.

In [6]:
# configure the model for training
# uses the adam optimizer and categorical cross-entropy as the loss function
# add in some extra metrics, accuracy being the only one.
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

#fit the model using our training data
model.fit(inputs_train, outputs_train, epochs=100, batch_size=1, verbose=1)
#evaluate the model using the test data set
loss, accuracy = model.evaluate(inputs_test, outputs_test, verbose=1)


print("\n\nloss: %6.4f\nAccuracy: %6.4f"% (loss, accuracy))

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78

Epoch 88/100
Epoch 89/100
Epoch 90/100
Epoch 91/100
Epoch 92/100
Epoch 93/100
Epoch 94/100
Epoch 95/100
Epoch 96/100
Epoch 97/100
Epoch 98/100
Epoch 99/100
Epoch 100/100


loss: 0.0565
Accuracy: 0.9733


In [9]:
# Predict the class of a single flower.
prediction = np.around(model.predict(np.expand_dims(inputs_test[0], axis=0))).astype(int)[0]
print("Actual: %s\tEstimated: %s" % (outputs_test[0].astype(int), prediction))
print("That means it's a %s" % outputs_vals[prediction.astype(bool)][0])

#print("Error {}% Accuracy {}%".format((1.0 - accuracy)*100,(accuracy*100)))

# Save the model to a file for later use.
model.save("./data/iris_nn.h5")
# Load the model again with: model = kr.models.load_model("./data/iris_nn.h5")

Actual: [1 0 0]	Estimated: [1 0 0]
That means it's a setosa


Now with all that done, we pick out a single sample from the test set and check whether the model predicts its species correctly, which it does. As shown above, the accuracy on the test set is roughly 97%, which is a great result.
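
One caveat about rounding the softmax output: if no class reaches 0.5, every entry rounds to 0 and there is nothing to look up. A sketch of an alternative, using made-up probabilities rather than real model output, is to take the position of the largest value instead. The accuracy helper here is my own and only approximates what the accuracy metric reports.

```python
import numpy as np

species = np.array(["setosa", "versicolor", "virginica"])

# Hypothetical softmax output where no class reaches 0.5.
probabilities = np.array([0.40, 0.35, 0.25])

rounded = np.around(probabilities).astype(int)   # rounding loses the answer
best = int(np.argmax(probabilities))             # index of the largest value
print(rounded, species[best])

def accuracy(y_true, y_pred):
    """Fraction of rows where the predicted class matches the true class."""
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1)))

# Four hypothetical samples; the last prediction is wrong.
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]])
y_pred = np.array([[0.8, 0.1, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7], [0.3, 0.6, 0.1]])
print(accuracy(y_true, y_pred))
```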

# End