# Multi-Class Classification Problem Using the Iris Flowers Dataset

This is a multi-class classification problem: the target variable takes more than two possible values.

The Iris dataset contains four features (length and width of sepals and petals) for 50 samples of each of **three** species of Iris (Iris setosa, Iris virginica and Iris versicolor), 150 samples in total.

We will use a Keras deep learning model to predict the species.

#### Libraries needed to run this model

In [None]:
import pandas
from keras.models import Sequential
from keras.layers import Dense
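# Note: keras.wrappers.scikit_learn and keras.utils.np_utils are only available in older Keras releases.
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils import np_utils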
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline

### Load the Iris dataset into a DataFrame.
### Split the data into X (input variables) and Y (output variable).

In [None]:
df = pandas.read_csv("../input/iris-dataset/iris.data.csv", header=None)
dset = df.values
X = dset[:,0:4].astype(float)
Y = dset[:,4]

### View the shape and size of the data 

In [None]:
df.shape, df.size

### View the data 

In [None]:
df.tail()

### Unique values in the target variable 

In [None]:
df[4].unique()

### As you can see, the output variable contains string values (categorical). There are 3 classes.
### We will apply one-hot encoding.

### Encode class values as integers
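As a rough illustration of what integer label encoding produces, the sketch below uses plain NumPy rather than `LabelEncoder`. The label strings are a hypothetical sample in the style of the Iris class column; the point is only that each distinct class is mapped to an integer according to its sorted order.

```python
import numpy as np

# Hypothetical sample of class labels (an assumption, not read from the CSV).
labels = np.array(["Iris-setosa", "Iris-virginica", "Iris-versicolor", "Iris-setosa"])

# np.unique returns the sorted distinct labels and, with return_inverse=True,
# the position of each original label within that sorted array.
classes, encoded = np.unique(labels, return_inverse=True)

print(classes)   # sorted class names
print(encoded)   # one integer per sample
```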

In [None]:
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)

In [None]:
print(encoded_Y)

### Convert integers to dummy variables (i.e. one hot encoded)
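A minimal NumPy sketch of the idea behind one-hot encoding, assuming three classes and a small hypothetical vector of integer labels. It is not the Keras implementation, only an equivalent illustration: each integer selects one row of the identity matrix.

```python
import numpy as np

# Hypothetical integer class labels for four samples (an assumption).
encoded = np.array([0, 2, 1, 0])
num_classes = 3

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the labels yields one row per sample.
one_hot = np.eye(num_classes)[encoded]
print(one_hot)
```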

In [None]:
dummy_y = np_utils.to_categorical(encoded_Y)

In [None]:
print(dummy_y)

### The data is now ready, so we can define the neural network model

#### - KerasClassifier class in Keras can be used as an Estimator in scikit-learn. The KerasClassifier takes the name of a function as an argument. This function must return the constructed neural network model, ready for training.

#### - Below is a function that will create a baseline neural network for the iris classification problem. It creates a simple fully connected network with 1 hidden layer that contains 16 neurons.

#### - Hidden layer uses a rectifier activation function which is a good practice. Because we used a one-hot encoding for our iris dataset, the output layer must create 3 output values, one for each class. The output value with the largest value will be taken as the class predicted by the model.

#### - Network topology of this basic one-hidden-layer neural network is as follows:
4 inputs ------> 16 hidden nodes -------> 3 outputs

#### - The “softmax” activation "is often used as the last activation function of a neural network to normalize the output of a network to a probability distribution over predicted output classes" (Wikipedia). A very good article on how softmax works: https://towardsdatascience.com/softmax-activation-function-how-it-actually-works-d292d335bd78

#### - "Adam" is an adaptive learning rate optimization algorithm; more on this: https://towardsdatascience.com/adam-latest-trends-in-deep-learning-optimization-6be9a291375c. “categorical_crossentropy” is a logarithmic loss function.

#### - We can also pass arguments in the construction of the KerasClassifier class that will be passed on to the fit() function internally used to train the neural network. Here, we pass the number of epochs as 150 and batch size as 5 to use when training the model. Logging output is also turned off during training by setting verbose to 0.
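To make the points above concrete, here is a small NumPy sketch of a single forward pass through a 4 → 16 → 3 network with ReLU, softmax, categorical cross-entropy and argmax decoding. The weights are random and untrained, the input row and target are illustrative assumptions, and none of this is the Keras implementation; it only shows the arithmetic the layers perform. With these shapes the network has 4×16 + 16 + 16×3 + 3 = 131 trainable parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    # Rectifier: negative values become zero.
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row-wise max for numerical stability before exponentiating.
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Random, untrained weights purely for illustration (an assumption).
W1, b1 = rng.normal(size=(4, 16)), np.zeros(16)   # input -> hidden
W2, b2 = rng.normal(size=(16, 3)), np.zeros(3)    # hidden -> output

x = np.array([[5.1, 3.5, 1.4, 0.2]])   # one illustrative sample, four features
y_true = np.array([[1.0, 0.0, 0.0]])   # its one-hot target (assumed class 0)

hidden = relu(x @ W1 + b1)
probs = softmax(hidden @ W2 + b2)       # probabilities over the 3 classes

# Categorical cross-entropy: negative log-probability of the true class.
loss = -np.sum(y_true * np.log(probs + 1e-12), axis=1).mean()

# The class with the largest output value is the prediction.
predicted_class = probs.argmax(axis=1)

n_params = W1.size + b1.size + W2.size + b2.size
print(probs, loss, predicted_class, n_params)
```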



In [None]:
# Define the baseline model
def baseline_model():
    # Create the model
    model = Sequential()
    model.add(Dense(16, input_dim=4, activation='relu'))  # Hidden layer
    model.add(Dense(3, activation='softmax'))  # Output layer
    # Compile the model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

estimator = KerasClassifier(build_fn=baseline_model, epochs=150, batch_size=5, verbose=0)

#### We can now evaluate the neural network model on our data.

#### scikit-learn provides a suite of techniques for evaluating models. The gold standard for evaluating machine learning models is k-fold cross-validation.

#### First we define the model evaluation procedure. Here, we set the number of folds to 10 (an excellent default) and shuffle the data before partitioning it.

#### Finally, we evaluate our model (estimator) on our dataset (X and dummy_y) using a 10-fold cross-validation procedure (kfold).
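The following NumPy sketch illustrates the idea of shuffled 10-fold splitting, assuming 150 samples. It is not scikit-learn's implementation, only a simplified picture of how each fold is held out once while the remaining samples are used for training.

```python
import numpy as np

n_samples, n_splits = 150, 10   # assumed dataset size and number of folds
rng = np.random.default_rng(0)

# Shuffle the sample indices once, then cut them into 10 equal folds.
indices = rng.permutation(n_samples)
folds = np.array_split(indices, n_splits)

for i, test_idx in enumerate(folds):
    # Each fold serves as the test set once; the others form the training set.
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    print(f"fold {i}: train={len(train_idx)}, test={len(test_idx)}")
```

Each sample therefore appears in exactly one test fold, and the reported score is the average over the ten held-out folds.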

In [None]:
kfold = KFold(n_splits=10, shuffle=True)
results = cross_val_score(estimator, X, dummy_y, cv=kfold)
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

#### The mean accuracy across the folds estimates the model's performance on unseen data; it is in line with the known top results for this problem.

#### Big thanks to Jason Brownlee for inspiring all of us with very easy-to-understand, hands-on examples: https://machinelearningmastery.com/about/