## Keras Multiclass Classification

This notebook trains a Keras model on the standard iris flowers dataset.

It covers the usual preprocessing steps, such as one-hot encoding the target, using Keras utilities together with scikit-learn.

In [23]:
import os
# suppress TensorFlow log messages (must be set before TensorFlow is imported)
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import numpy as np
from pandas import read_csv
from keras.models import Sequential
from keras.layers import Dense
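# note: keras.wrappers.scikit_learn was removed from recent Keras releases;
# the separate SciKeras package provides a replacement KerasClassifier
from keras.wrappers.scikit_learn import KerasClassifier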
from keras.utils import np_utils
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline


In [38]:
# fix the random seed for reproducibility
seed = 7
np.random.seed(seed)

#data can now be found on kaggle
#http://www.kaggle.com/saurabh00007/iriscsv
dataset = read_csv('iris.csv')
dataset.head()
dataset = dataset.values

All four input variables are numeric and share the same scale, centimeters. Each instance
describes the measurements of one observed flower, and the output variable is the
iris species. The attributes of this dataset can be summarized as follows:
1. Sepal length in centimeters.
2. Sepal width in centimeters.
3. Petal length in centimeters.
4. Petal width in centimeters.
5. Class.

This is a well-studied problem: classifying three flower species. A model accuracy of 95% or more is usually expected.


In [40]:
X = dataset[:,0:4].astype(float)
Y = dataset[:,4]
Y[1:len(Y):10]

array(['setosa', 'setosa', 'setosa', 'setosa', 'setosa', 'versicolor',
       'versicolor', 'versicolor', 'versicolor', 'versicolor',
       'virginica', 'virginica', 'virginica', 'virginica', 'virginica'],
      dtype=object)
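The slice `Y[1:len(Y):10]` takes every tenth label starting at index 1, which is why the output shows 15 labels, five per species. A minimal sketch with a stand-in list, assuming the file holds 50 rows per species in order (consistent with the output above, but not stated in the notebook):

```python
# Stand-in for the label column: 50 rows per species, in file order (assumed).
labels = ['setosa'] * 50 + ['versicolor'] * 50 + ['virginica'] * 50

# Same slice as in the cell above: start at index 1, step 10.
sample = labels[1:len(labels):10]

# Count how many sampled labels fall in each species.
sample_counts = {name: sample.count(name) for name in sorted(set(sample))}
print(len(sample), sample_counts)
```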

### Process target

In [32]:
#target contains string values for each class - encode them as integers first
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
#use keras one hot encoding 
dummy_y = np_utils.to_categorical(encoded_Y)
dummy_y[1:len(dummy_y):10]

array([[1., 0., 0.],
       [1., 0., 0.],
       [1., 0., 0.],
       [1., 0., 0.],
       [1., 0., 0.],
       [0., 1., 0.],
       [0., 1., 0.],
       [0., 1., 0.],
       [0., 1., 0.],
       [0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 1.],
       [0., 0., 1.],
       [0., 0., 1.],
       [0., 0., 1.]])
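The two encoding steps can be sketched in plain Python to show what happens to the labels. This is an illustration only, not how scikit-learn or Keras implement it; `label_encode` and `one_hot` are hypothetical helper names. Like `LabelEncoder`, the sketch assigns integers to the classes in sorted order.

```python
def label_encode(labels):
    """Map each distinct label to an integer, with classes in sorted order."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return [index[name] for name in labels], classes


def one_hot(codes, num_classes):
    """Turn each integer code into a row holding a single 1.0."""
    return [[1.0 if j == code else 0.0 for j in range(num_classes)]
            for code in codes]


labels = ['setosa', 'versicolor', 'virginica', 'versicolor']
codes, classes = label_encode(labels)
rows = one_hot(codes, len(classes))
print(codes)
print(rows)
```

Each row has exactly one non-zero entry, and its position identifies the species.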

### Define the Model

Create a simple fully connected network with one hidden layer that contains 8
neurons. The hidden layer uses a rectifier activation function. Because
we used a one hot encoding for our iris dataset, the output layer must create 3 output values,
one for each class. The output value with the largest value will be taken as the class predicted
by the model. The network topology of this simple one-layer neural network can be summarized:

4 inputs --> 8 hidden nodes --> 3 outputs

To ensure the output values lie in the range 0 to 1 and sum to 1, use the softmax activation function on the output layer. Train with the Adam gradient descent optimizer and the log loss function, called **categorical_crossentropy** in Keras.
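A small worked example of softmax and the log loss for a single sample, written in plain Python. The raw scores are made-up values and the function names are hypothetical; Keras computes the same quantities internally on batches.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    top = max(scores)
    # Subtracting the maximum avoids overflow without changing the result.
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot_target, probs):
    """Log loss for one sample: minus the log-probability of the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs) if t > 0)


scores = [2.0, 1.0, 0.1]      # hypothetical raw outputs of the 3 output nodes
probs = softmax(scores)
target = [1.0, 0.0, 0.0]      # one-hot target: the first class is correct
loss = categorical_crossentropy(target, probs)
predicted_class = probs.index(max(probs))
print(probs, loss, predicted_class)
```

The loss shrinks toward 0 as the probability assigned to the true class approaches 1.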

In [41]:
# define baseline model
def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(8, input_dim=4, activation='relu'))
    model.add(Dense(3, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
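As a sanity check on the 4-8-3 topology, the number of trainable parameters can be worked out by hand. A `Dense` layer with a bias term (the Keras default) has one weight per input-unit pair plus one bias per unit; `dense_params` is a hypothetical helper for that arithmetic.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


hidden = dense_params(4, 8)   # 4 inputs -> 8 hidden nodes
output = dense_params(8, 3)   # 8 hidden nodes -> 3 outputs
total = hidden + output
print(hidden, output, total)  # 40 27 67
```

The total should match the figure reported by `model.summary()` for this model.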


In [46]:
# create the classifier; a batch size of 50 ran faster but gave an accuracy of 89%
# decreasing the batch size to 10 gave better results, as expected
estimator = KerasClassifier(build_fn=baseline_model, epochs=200, batch_size=10, verbose=0)
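One way to see why the smaller batch size helps is to count weight updates, since the weights are updated once per batch. The sketch below assumes 100 training rows per fit (two thirds of a 150-row dataset under 3-fold cross-validation; the notebook does not state the row count); `count_updates` is a hypothetical helper.

```python
import math


def count_updates(n_train, batch_size, epochs):
    """Return (updates per epoch, total updates) for one training run."""
    # A final partial batch still triggers an update, hence the ceiling.
    per_epoch = math.ceil(n_train / batch_size)
    return per_epoch, per_epoch * epochs


small_batch = count_updates(100, 10, 200)
large_batch = count_updates(100, 50, 200)
print(small_batch)  # (10, 2000)
print(large_batch)  # (2, 400)
```

With the same 200 epochs, a batch size of 10 gives five times as many updates as a batch size of 50, at the cost of a longer run time.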

### Evaluate the model with k-Fold CV

In [47]:
kfold = KFold(n_splits=3, shuffle=True, random_state=seed)
results = cross_val_score(estimator, X, dummy_y, cv=kfold)
print("Accuracy: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Accuracy: 94.67% (4.99%)
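The evaluation procedure can be sketched in plain Python: shuffle the row indices, cut them into 3 folds, and report the mean and standard deviation of the per-fold scores. This only illustrates the idea; it does not reproduce the exact split that `KFold` makes, the 150-row count is an assumption, and the scores below are made-up values rather than results from this notebook.

```python
import random
import statistics


def kfold_indices(n_samples, n_splits, seed):
    """Yield (train, test) index lists; each row is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for i in range(n_splits):
        size = base + (1 if i < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(kfold_indices(150, 3, seed=7))
fold_sizes = [(len(train), len(test)) for train, test in folds]
print(fold_sizes)

# Hypothetical per-fold accuracies, summarized like the print statement above.
scores = [0.92, 0.96, 1.00]
# pstdev is the population standard deviation, matching NumPy's default std().
summary = "Accuracy: %.2f%% (%.2f%%)" % (
    statistics.mean(scores) * 100, statistics.pstdev(scores) * 100)
print(summary)
```

The value in parentheses is the spread across folds, so a large value relative to the mean signals that the accuracy estimate depends heavily on the split.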
