# Multiclass Classification

In this notebook, we will use a neural network to solve the classic iris flower classification problem. Each record in the iris dataset has the following attributes:
1. Sepal length (cm)
2. Sepal width (cm)
3. Petal length (cm)
4. Petal width (cm)
5. Class (target)

## Imports

In [1]:
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import tensorflow.keras.utils as tfu
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import LabelEncoder

## Loading Data

In [12]:
df = pd.read_csv('iris.csv', header=None)
dataset = df.values
X = dataset[:, 0:4].astype(float)
y = dataset[:, 4]

In [13]:
X[:5]

array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],
       [5. , 3.6, 1.4, 0.2]])

In [14]:
y[:5]

array(['Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa'], dtype=object)

## Encoding the Output Variable

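The output variable contains three different string values, one per class. We can convert it into a matrix with one column per class, where each row holds a 1 in the column of that sample's class and a 0 everywhere else. This is known as one-hot encoding.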

In [23]:
dummy_y = pd.get_dummies(y).to_numpy()
dummy_y[:5]

array([[1, 0, 0],
       [1, 0, 0],
       [1, 0, 0],
       [1, 0, 0],
       [1, 0, 0]], dtype=uint8)
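
To make the encoding concrete, here is a minimal NumPy sketch of the same idea, including how to map a one-hot row back to its label. The `labels` array is a small hypothetical sample (only `Iris-setosa` appears in the output above; the other two class names are assumed), and this is an illustration rather than how `pd.get_dummies` is implemented.

```python
import numpy as np

# Hypothetical labels for illustration; the notebook's `y` holds the real ones.
labels = np.array(["Iris-setosa", "Iris-virginica", "Iris-versicolor", "Iris-setosa"])

# np.unique returns the sorted class names and, for each label, its class index.
classes, class_idx = np.unique(labels, return_inverse=True)

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(len(classes), dtype=int)[class_idx]

# argmax recovers the class index, which maps back to the original label.
decoded = classes[one_hot.argmax(axis=1)]

print(one_hot)
print(decoded)
```

The same `argmax` step is what turns a row of predicted class probabilities into a single predicted class.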

## Defining the Neural Network Model

As before, we can use `KerasClassifier` to wrap our neural network. Unlike the previous example, which had a single output, this network needs three outputs, one for each class.

4 inputs --> [8 hidden nodes] --> 3 outputs
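
As a quick check on the size of this architecture, the trainable parameters can be counted by hand. The sketch below assumes each fully connected layer keeps a bias term, which is the Keras `Dense` default; the variable names are illustrative.

```python
# Layer sizes from the diagram: 4 inputs, 8 hidden nodes, 3 outputs.
layer_sizes = [4, 8, 3]

# A dense layer has (inputs * units) weights plus one bias per unit.
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(params_per_layer, total_params)  # [40, 27] 67
```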

We will use a softmax activation function in the output layer. Softmax ensures that each output value lies between 0 and 1 and that the three outputs sum to 1, so they can be used as predicted class probabilities. In the compilation step, we need to change the loss function to one suited to more than two classes: we use `categorical_crossentropy` instead of the `binary_crossentropy` used in the binary classification problem.

In [24]:
def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(8, input_shape=(4,), activation='relu'))
    model.add(Dense(3, activation='softmax'))
    # compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
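
To show what the softmax output layer and the `categorical_crossentropy` loss compute, here is a minimal NumPy sketch. It is a simplified illustration, not the Keras implementation; the `logits` and `y_true` values are made up, and the `eps` clipping constant is an assumption added to avoid taking the log of zero.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability; the result is unchanged.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    # Mean over samples of -sum_k y_true[k] * log(y_prob[k]).
    y_prob = np.clip(y_prob, eps, 1.0)
    return float(-(y_true * np.log(y_prob)).sum(axis=1).mean())

# Made-up raw outputs for two samples and their one-hot targets.
logits = np.array([[2.0, 1.0, 0.1], [0.5, 0.5, 0.5]])
y_true = np.array([[1, 0, 0], [0, 1, 0]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)

print(probs.sum(axis=1))  # each row sums to 1
print(loss)
```

Because the targets are one-hot, only the predicted probability of the true class contributes to each sample's loss, so the loss falls as that probability approaches 1.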

In [25]:
# Creating the KerasClassifier

estimator = KerasClassifier(model=baseline_model, epochs=200, batch_size=5, verbose=0)

## Evaluating the Model with *k*-Fold Cross-Validation

We can evaluate the model with *k*-fold cross-validation, a standard technique for estimating performance on unseen data. First, we need to define the model evaluation procedure.

In [26]:
kfold = KFold(n_splits=10, shuffle=True)
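
To illustrate what a shuffled 10-fold split does, here is a minimal NumPy sketch that partitions row indices by hand. It assumes 150 samples (the size of the standard iris dataset) and a fixed seed; it mirrors the idea behind `KFold` rather than reproducing scikit-learn's implementation.

```python
import numpy as np

n_samples, n_splits = 150, 10   # 150 rows is assumed for the iris dataset
rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Shuffle the row indices once, then cut them into n_splits disjoint folds.
indices = rng.permutation(n_samples)
folds = np.array_split(indices, n_splits)

splits = []
for i, test_idx in enumerate(folds):
    # Every fold except the i-th is used for training.
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    splits.append((train_idx, test_idx))

for train_idx, test_idx in splits[:2]:
    print(len(train_idx), len(test_idx))
```

Each sample lands in the test set exactly once, so every row contributes to both training and evaluation across the ten rounds.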

Now we can evaluate our model on our dataset (`X` and `dummy_y`) using the 10-fold cross-validation procedure (`kfold`). 

In [27]:
results = cross_val_score(estimator, X, dummy_y, cv=kfold)
print("Accuracy: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Accuracy: 96.67% (4.47%)


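The results are summarized as the mean and standard deviation of the model's accuracy across the 10 folds. This is a reasonable estimate of the model's performance on unseen data. Because `KFold` shuffles without a fixed `random_state` and the network weights are initialized randomly, the exact figures will vary from run to run.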
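
The summary line can be reproduced from any array of per-fold scores. The sketch below uses hypothetical accuracies, not the scores from the run above, to show how the two numbers are computed. Note that NumPy's `std` defaults to the population standard deviation (`ddof=0`).

```python
import numpy as np

# Hypothetical per-fold accuracies, NOT the notebook's actual results.
fold_scores = np.array([1.0, 0.93, 1.0, 0.87, 1.0, 1.0, 0.93, 1.0, 0.93, 1.0])

mean_acc = fold_scores.mean()
std_acc = fold_scores.std()  # population standard deviation (ddof=0)

summary = "Accuracy: %.2f%% (%.2f%%)" % (mean_acc * 100, std_acc * 100)
print(summary)
```

A large standard deviation relative to the mean suggests that the score depends heavily on which samples fall into each fold.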