<a href="https://colab.research.google.com/github/Deep-Learning-Challenge/challenge-notebooks/blob/master/1.Multilayer%20Perceptrons/2.Guided%20Projects/1.Multiclass%20Classification%20Of%20Flower%20Species.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" /></a>

# Multiclass Classification Of Flower Species

This project tutorial will explore how to use Keras to develop and evaluate neural network models for multi-class classification problems. After completing this step-by-step tutorial, you will know:

* How to load data from CSV and make it available to Keras.
* How to prepare multi-class classification data for modeling with neural networks.
* How to evaluate Keras neural network models with scikit-learn.

Let's get started.

## Iris Flowers Classification Dataset

In this tutorial, we will use a standard machine learning problem: the iris flowers dataset. This dataset is well studied and is a good problem for practicing with neural networks because all four input variables are numeric and share the same unit (centimeters). Each instance records the measurements of one observed flower, and the output variable is that flower's iris species. The attributes of this dataset are:

1. Sepal length in centimeters.
2. Sepal width in centimeters.
3. Petal length in centimeters.
4. Petal width in centimeters.
5. Class.

This is a multi-class classification problem, meaning that there are more than two classes to be predicted; here there are three flower species. It is a useful problem for practicing with neural networks because the three class values require special handling. Below are the first five of the 150 instances:

```
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
```

The iris flower dataset is a well-studied problem, and as such, we can expect to achieve a model accuracy in the range of 95% to 97%. This provides an excellent target to aim for when developing our models. You can also download the iris flowers dataset from the UCI Machine Learning [repository](http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data) and place it in your current working directory with the filename `iris.csv`. You can learn more about the iris flower classification dataset on the [UCI Machine Learning Repository page](https://archive.ics.uci.edu/ml/datasets/Iris).

## Runtime Setup

In [57]:
import sys

dataset_name = "iris.csv"
if 'google.colab' in sys.modules:
    DATASET = f"https://github.com/Deep-Learning-Challenge/challenge-notebooks/raw/master/datasets/{dataset_name}"
else:
    DATASET = f"../../datasets/{dataset_name}"
    
DATASET

'../../datasets/iris.csv'

## Import Classes and Functions

We can begin by importing all of the classes and functions we will need in this tutorial. This includes the functionality we require from Keras, data loading from pandas, and data preparation and model evaluation from scikit-learn.

In [58]:
import tensorflow as tf

import logging
tf.get_logger().setLevel(logging.ERROR)

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
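# Note: newer TensorFlow releases no longer ship this wrapper module;
# the separate SciKeras package provides a KerasClassifier replacement.
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier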
from tensorflow.keras import utils

from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline

import numpy
from pandas import read_csv

## Initialize Random Number Generator

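Next, we need to initialize the random number generator to a constant value. Training a neural network is a stochastic process, so fixing the seed helps make the results from this model repeatable from run to run. Note that `numpy.random.seed()` covers only NumPy's generator; TensorFlow maintains its own, which can be seeded with `tf.random.set_seed()`, so seeding NumPy alone may not make training exactly reproducible.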

In [59]:
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
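
The effect of fixing a seed can be seen with a small sketch. It uses Python's standard `random` module as a stand-in for NumPy's generator, and the helper `draw` is a hypothetical name introduced only for this illustration:

```python
import random

def draw(seed, n=3):
    """Return n pseudo-random floats from a generator seeded with `seed`."""
    rng = random.Random(seed)  # independent generator; global state is untouched
    return [rng.random() for _ in range(n)]

first = draw(7)
second = draw(7)
other = draw(8)

print(first == second)  # same seed, same sequence
print(first == other)   # different seed, different sequence
```

The same principle applies to any seeded generator: identical seeds give identical sequences of "random" numbers, which is what makes a stochastic procedure repeatable.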

## Load The Dataset

The dataset can be loaded directly. Because the output variable contains strings, it is easiest to load the data using pandas. We can then split the attributes (columns) into the input variables (X) and the output variable (Y).

In [60]:
# load dataset
dataframe = read_csv(DATASET, header=None)
dataset = dataframe.values
X = dataset[:,:-1].astype(float)
Y = dataset[:,-1]

dataframe.head()

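|   | 0 | 1 | 2 | 3 | 4 |
|---|-----|-----|-----|-----|-------------|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |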
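
To make the split into inputs and output concrete, here is a minimal sketch that does the same slicing without pandas, using only the standard `csv` module and the five sample rows shown earlier. The names `SAMPLE`, `X_sample`, and `Y_sample` are introduced here for illustration and are not part of the notebook:

```python
import csv
import io

# The first five rows of the dataset, as listed earlier in this tutorial.
SAMPLE = """5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
"""

rows = list(csv.reader(io.StringIO(SAMPLE)))

# Every column but the last is a numeric input; the last column is the label.
X_sample = [[float(value) for value in row[:-1]] for row in rows]
Y_sample = [row[-1] for row in rows]

print(X_sample[0])  # [5.1, 3.5, 1.4, 0.2]
print(Y_sample[0])  # Iris-setosa
```

This mirrors the `dataset[:,:-1]` and `dataset[:,-1]` slices above: four floats per row on the input side, one string per row on the output side.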


## Encode The Output Variable

The output variable contains three different string values. When modeling multi-class classification problems using neural networks, it is good practice to reshape the output attribute from a vector holding one class value per instance into a matrix with one boolean column per class, where each entry indicates whether the instance belongs to that class. This is called one-hot encoding, or creating dummy variables from a categorical variable. For example, in this problem, the three class values are `Iris-setosa`, `Iris-versicolor`, and `Iris-virginica`. If we had the three observations:

```
Iris-setosa
Iris-versicolor
Iris-virginica
```

We can turn this into a one-hot encoded binary matrix for each data instance that would look as follows:

```
Iris-setosa, Iris-versicolor, Iris-virginica
1,           0,               0
0,           1,               0
0,           0,               1
```

We can do this by first encoding the strings consistently as integers using the scikit-learn class `LabelEncoder`, and then converting the vector of integers to a one-hot encoding using the Keras function `to_categorical()`.

In [61]:
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
dummy_y = tf.keras.utils.to_categorical(encoded_Y)
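
The two steps can be sketched by hand in plain Python. This is only an illustration of the idea, not how `LabelEncoder` or `to_categorical()` are implemented; the helper `one_hot` is a hypothetical name, and the sketch assumes class labels are numbered in sorted order:

```python
def one_hot(labels):
    """Map string labels to integer codes (sorted order), then to one-hot rows."""
    classes = sorted(set(labels))                        # unique labels, sorted
    index = {name: i for i, name in enumerate(classes)}  # label -> integer code
    encoded = [index[name] for name in labels]
    matrix = [[1.0 if i == code else 0.0 for i in range(len(classes))]
              for code in encoded]
    return classes, encoded, matrix

labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-setosa"]
classes, encoded, matrix = one_hot(labels)

print(encoded)    # [0, 1, 2, 0]
print(matrix[1])  # [0.0, 1.0, 0.0]
```

Each row of `matrix` contains a single 1.0 in the column of its class, which matches the binary matrix shown above.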

## Define The Neural Network Model

The Keras library provides wrapper classes that allow you to use neural network models developed with Keras in scikit-learn, as we saw in the previous lesson. The `KerasClassifier` class can be used as an `Estimator` in scikit-learn, the base type of model in that library. The `KerasClassifier` takes a function as an argument. This function must return the constructed neural network model, ready for training.

Below is a function that will create a baseline neural network for the iris classification problem. It creates a simple, fully connected network with one hidden layer that contains 8 neurons. The hidden layer uses a rectifier (ReLU) activation function, which is a common default choice. Because we used a one-hot encoding for our iris dataset, the output layer must produce three output values, one for each class. The class with the largest output value is taken as the model's prediction. The network topology of this simple network with one hidden layer can be summarized as:

`4 inputs -> [8 hidden nodes] -> 3 outputs`
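
As a quick check on the size of this topology, the number of trainable parameters can be counted by hand. The sketch assumes each fully connected layer has one weight per input-unit pair plus one bias per unit; `dense_params` is a hypothetical helper, not a Keras function:

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units

hidden = dense_params(4, 8)  # 4*8 weights + 8 biases
output = dense_params(8, 3)  # 8*3 weights + 3 biases

print(hidden, output, hidden + output)  # 40 27 67
```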

Note that we use a softmax activation function in the output layer. This ensures the output values lie between 0 and 1 and sum to 1, so they may be used as predicted probabilities. Finally, the network uses the efficient Adam gradient descent optimization algorithm with a logarithmic loss function, which is called `categorical_crossentropy` in Keras.

In [62]:
# define baseline model
def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(units=8, activation='relu', input_dim=4))
    model.add(Dense(units=3, activation='softmax'))

    # compile model
    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )

    return model
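
To see what the softmax output and the logarithmic loss compute, here is a minimal plain-Python sketch for a single flower. The raw scores are hypothetical values chosen for illustration, and the two functions are hand-written stand-ins, not the Keras implementations:

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot_target, probs):
    """Negative log of the probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs))

scores = [2.0, 1.0, 0.1]             # hypothetical raw outputs for one flower
probs = softmax(scores)
predicted = probs.index(max(probs))  # class with the largest probability

loss_if_class_0 = categorical_crossentropy([1, 0, 0], probs)
loss_if_class_2 = categorical_crossentropy([0, 0, 1], probs)

print([round(p, 3) for p in probs])       # [0.659, 0.242, 0.099]
print(predicted)                          # 0
print(loss_if_class_0 < loss_if_class_2)  # True
```

The loss is small when the model puts high probability on the true class and grows as that probability shrinks, which is what training with `categorical_crossentropy` pushes against.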

We can now create our `KerasClassifier` for use in scikit-learn. Arguments given when constructing the `KerasClassifier` are passed on to the `fit()` function used internally to train the neural network. Here, we set the number of `epochs` to 200 and the `batch_size` to 5. Progress output during training is also turned off by setting `verbose` to 0.

In [63]:
estimator = KerasClassifier(
    build_fn=baseline_model,
    epochs=200,
    batch_size=5,
    verbose=0
)

## Evaluate The Model with k-Fold Cross-Validation

We can now evaluate the neural network model on our training data. The scikit-learn library has excellent capability to evaluate models using a suite of techniques. The gold standard for evaluating machine learning models is k-fold cross-validation. First, we define the model evaluation procedure. Here, we set the number of folds to 10 (an excellent default) and shuffle the data before partitioning it.

In [64]:
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)
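
The partitioning that shuffled k-fold cross-validation performs can be sketched in plain Python. This is an illustration of the idea only: it uses the standard `random` module, so it will not reproduce the exact folds that scikit-learn's `KFold` generates, and `kfold_indices` is a hypothetical helper:

```python
import random

def kfold_indices(n_samples, n_splits, seed):
    """Yield (train, test) index lists for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once before partitioning
    fold_size = n_samples // n_splits     # assumes n_samples divides evenly
    for k in range(n_splits):
        start, stop = k * fold_size, (k + 1) * fold_size
        test = indices[start:stop]                # held-out fold
        train = indices[:start] + indices[stop:]  # everything else
        yield train, test

folds = list(kfold_indices(n_samples=150, n_splits=10, seed=7))
train, test = folds[0]
print(len(folds), len(train), len(test))  # 10 135 15
```

Each of the 150 flowers lands in exactly one test fold, so every instance is used for evaluation once and for training nine times.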

Now we can evaluate our model (`estimator`) on our dataset (`X` and `dummy_y`) using a 10-fold cross-validation procedure (`kfold`). Evaluating the model takes approximately 10 seconds and returns an array of ten accuracy scores, one for each model trained and evaluated on a split of the dataset.

In [65]:
results = cross_val_score(estimator=estimator, X=X, y=dummy_y, cv=kfold, n_jobs=-1)

The results are summarized as both the mean and standard deviation of the model accuracy across the ten folds. This is a reasonable estimate of the performance of the model on unseen data. It also falls within the 95% to 97% range expected for this problem.

In [66]:
print("Accuracy: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Accuracy: 96.67% (4.47%)
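
The summary line is just the mean and the population standard deviation of the ten fold scores. The sketch below uses hypothetical per-fold accuracies, made up for illustration; the notebook does not show its actual per-fold values. They are chosen so that each fold holds 15 flowers:

```python
import statistics

# Hypothetical per-fold accuracies (each fold holds 15 flowers).
fold_scores = [15 / 15] * 6 + [14 / 15] * 3 + [13 / 15]

mean = statistics.mean(fold_scores)
std = statistics.pstdev(fold_scores)  # population std, like NumPy's default ddof=0

summary = "Accuracy: %.2f%% (%.2f%%)" % (mean * 100, std * 100)
print(summary)  # Accuracy: 96.67% (4.47%)
```

With only 15 flowers per fold, a single misclassification moves a fold's accuracy by almost 7 percentage points, which is why the spread across folds can be several percent even when the mean is high.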


## Summary

In this lesson, you discovered how to develop and evaluate a neural network using the Keras Python library for deep learning. You learned:

* How to load data and make it available to Keras.
* How to prepare multi-class classification data for modeling using one-hot encoding.
* How to use Keras neural network models with scikit-learn.
* How to define a neural network using Keras for multi-class classification.
* How to evaluate a Keras neural network model using scikit-learn with k-fold cross-validation.