# **Multi-classification problems occur when we have a class with more than two classes**

NOTE: 
* In chapter 2, we dealt with the development of a classifier for a two-class problem. There the values of the class were provided for us in numerical form. 
* In this chapter:
    - we shall deal with a problem where the class is more than 2 (this is called multi-class problem)
    - Furthermore, the data provided for this problem is such that the class are provided in "string" form rather than numerical form required by most neural network framework.
    - Based on the point above, we shall be using the Categorical Crossentropy as the loss function.
        - Categorical Crossentropy measures the difference between the predicted probabilities and the true label of the class we should have predicted. With this loss function, the loss is always big if the predicted probabilites is different from what we should have predicted. Deep dip on this. 

* For this problem, the output will be vectors containing the probabilities of each class. *However, the output in the dataset are in the form of strings, so they must first be converted into categorical variables*

* This less will show, how to first treat string output by converting them into categorical variable and then follow by one-hot encoding of this new variable


## Problem statement:
*We have data about the positions of dart throws by four individuals. Using the position coordinates of past throws by these competitors, we wish to build a classifier that can differentiates who throws the dart. This problem is a multi-class classification problem since each dart can only be thrown by one of 4 competitors. So classes/labels are mutually exclusive, and therefore we can build classifier that will have a final layer that will have as many neurons as the competitors, pass the output of this layer to the softmax activation function to achieve a total sum of probabilities of 1 over all competitors.*

In [15]:
import pandas as pd
from keras.api.models import Sequential
from keras.api.layers import Dense, InputLayer
from keras.api.utils import to_categorical, set_random_seed

from sklearn.model_selection import train_test_split

set_random_seed(123)


## **Data importation and assessment**


In [2]:
# Import the data
dart = pd.read_csv("/kaggle/input/dart-throws/darts_throws_competitors.csv")

# Check for data info - which will tell you the datatype and count
print(dart.info())

# Check for missing values
print(dart.isnull().sum())

# Print out a few rows of data to check
print(dart.head(4))

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 800 entries, 0 to 799
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   xCoord      800 non-null    float64
 1   yCoord      800 non-null    float64
 2   competitor  800 non-null    object 
dtypes: float64(2), object(1)
memory usage: 18.9+ KB
None
xCoord        0
yCoord        0
competitor    0
dtype: int64
     xCoord    yCoord competitor
0  0.196451 -0.520341      Steve
1  0.476027 -0.306763      Susan
2  0.003175 -0.980736    Michael
3  0.294078  0.267566       Kate


## **Data pre-processing**
    - Using the pandas Categorical() function here to convert string object to categorical object

In [14]:
# As you can see from above, the competitor column is a string

# Separate out the input and output data
input_dat = dart.drop(["competitor"], axis = 1).values
output = dart["competitor"]

# Convert output to categorical using pandas Categorical() function
output_cat = pd.Categorical(output)                     ## Converts ordinary string to categorical object
output_cat_numbers = output_cat.codes                ## This will encode the labels with number

# One hot encoding the above categorical numbers
output_cat_encoded = to_categorical(output_cat_numbers) ## This converts a vector into matrix


print(output_cat[:8])
# Now print the one-hot encoded labels
print('One-hot encoded competitors: \n',output_cat_encoded[:8])



['Steve', 'Susan', 'Michael', 'Kate', 'Steve', 'Kate', 'Kate', 'Steve']
Categories (4, object): ['Kate', 'Michael', 'Steve', 'Susan']
One-hot encoded competitors: 
 [[0. 0. 1. 0.]
 [0. 0. 0. 1.]
 [0. 1. 0. 0.]
 [1. 0. 0. 0.]
 [0. 0. 1. 0.]
 [1. 0. 0. 0.]
 [1. 0. 0. 0.]
 [0. 0. 1. 0.]]


### This is an alternative approach for the conversion and encoding

In [7]:
## An alternative method to convert the string object to numerical variable is via factors
output_factorized = output.copy().factorize()[0]
#print(output_factorized)

# One hot encoding the above categorical numbers
output_factorized_encoded = to_categorical(output_factorized) ## This converts a vector into matrix
print(output_factorized_encoded[:8])

[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]
 [1. 0. 0. 0.]
 [0. 0. 0. 1.]
 [0. 0. 0. 1.]
 [1. 0. 0. 0.]]


## **Split the data into training and testing components**

In [25]:
X_train, X_test, y_train, y_test = train_test_split(input_dat, output_cat_encoded,
                                                   test_size=0.3, stratify=output_cat_encoded)

## **Build the model architecture**
    - We wish to build a model with 3 dense layers of 128, 64 and 32 neurons each.

In [9]:
model = Sequential()
model.add(InputLayer(shape = (2,)))
model.add(Dense(128, activation = "relu"))
model.add(Dense(64, activation = "relu"))
model.add(Dense(32, activation = "relu"))
# For the output layer below,we add a dense layer with as many neurons as competitors: 4
model.add(Dense(4, activation = "sigmoid"))

model.summary()



## **Compliling the model**

In [10]:
# Compile your model using categorical_crossentropy loss
model.compile(loss="categorical_crossentropy",optimizer='adam',metrics=['accuracy'])

## **Fit the model on training and testing data**

In [23]:
model_fit = model.fit(X_train, y_train, epochs = 20, verbose = True)


Epoch 1/20
[1m18/18[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.8380 - loss: 0.3952 
Epoch 2/20
[1m18/18[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.8349 - loss: 0.3885 
Epoch 3/20
[1m18/18[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.8396 - loss: 0.3865 
Epoch 4/20
[1m18/18[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.8412 - loss: 0.3893 
Epoch 5/20
[1m18/18[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.8363 - loss: 0.3898 
Epoch 6/20
[1m18/18[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.8417 - loss: 0.3881 
Epoch 7/20
[1m18/18[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.8385 - loss: 0.3917 
Epoch 8/20
[1m18/18[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.8364 - loss: 0.3903 
Epoch 9/20
[1m18/18[0m [32m━━━━━━━━━━━━━━━━━━

In [None]:
print(model_fit)
print(model_fit.history["accuracy"])

## **Evaluate the model**

In [29]:
model_eval = model.evaluate(X_test, y_test)
print(model_eval[1])

[1m8/8[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 2ms/step - accuracy: 0.8466 - loss: 0.4502 
0.8583333492279053
