
30/05/24

# Artificial Neural Network

In [24]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import sklearn
from sklearn.datasets import load_iris

import keras
import tensorflow as tf

from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
from keras.utils import to_categorical

In [25]:
# Load the iris dataset
iris = load_iris()
data = pd.DataFrame(iris.data, columns=iris.feature_names)
data['target'] = iris.target  # 1-D array of integer class labels (0, 1, 2)

In [26]:
# Prepare the data
X = data.drop('target', axis=1)
y = data['target']
y_one_hot = to_categorical(y)  # Convert to one-hot encoded vectors

In [27]:
# Split the data
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y_one_hot, test_size=0.2)

In [28]:
# Deep feed-forward network: Sequential stacks the layers in the order they are added
model = Sequential()

In [29]:
# 3-layer neural network: two hidden layers and one output layer
model.add(Dense(10, input_shape=(4,), activation='relu', name='ly1'))  # first hidden layer: 10 neurons fed by the 4 input features
model.add(Dense(10, activation='relu', name='ly2'))  # second hidden layer: 10 neurons
model.add(Dense(3, activation='softmax', name='ly3'))  # output layer: one probability per class

# layer 1 : 10 neurons x 4 inputs + 10 bias = 50 params
# layer 2 : 10 neurons x 10 inputs + 10 bias = 110 params
# layer 3 : 3 neurons x 10 inputs + 3 bias = 33 params

In [30]:
# Adam optimizer with a learning rate of 0.001
optimizer = Adam(learning_rate=0.001)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy', 'AUC'])

In [31]:
print('Neural Network Model Summary:')
model.summary()  # summary() prints the table itself and returns None

Neural Network Model Summary:
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 ly1 (Dense)                 (None, 10)                50        
                                                                 
 ly2 (Dense)                 (None, 10)                110       
                                                                 
 ly3 (Dense)                 (None, 3)                 33        
                                                                 
Total params: 193 (772.00 Byte)
Trainable params: 193 (772.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
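
The `Param #` column in the summary can be reproduced by hand: a dense layer has one weight per (input, neuron) pair plus one bias per neuron. A minimal sketch, where `dense_param_count` is a hypothetical helper written for this note and not a Keras function:

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# (inputs, units) for ly1, ly2, ly3 as defined in the model above
layer_shapes = {'ly1': (4, 10), 'ly2': (10, 10), 'ly3': (10, 3)}
param_counts = {name: dense_param_count(*shape) for name, shape in layer_shapes.items()}
total_params = sum(param_counts.values())

print(param_counts)   # {'ly1': 50, 'ly2': 110, 'ly3': 33}
print(total_params)   # 193
```

At 4 bytes per float32 parameter, 193 parameters take 772 bytes, which matches the size reported in the summary.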


In [32]:
# Training the model
model.fit(X_train, y_train, verbose=2, batch_size=10, epochs=200)

Epoch 1/200
12/12 - 1s - loss: 1.2868 - accuracy: 0.3583 - auc: 0.5343 - 863ms/epoch - 72ms/step
Epoch 2/200
12/12 - 0s - loss: 1.1416 - accuracy: 0.3417 - auc: 0.5405 - 25ms/epoch - 2ms/step
Epoch 3/200
12/12 - 0s - loss: 1.0434 - accuracy: 0.2250 - auc: 0.5852 - 36ms/epoch - 3ms/step
Epoch 4/200
12/12 - 0s - loss: 0.9635 - accuracy: 0.2250 - auc: 0.6107 - 29ms/epoch - 2ms/step
Epoch 5/200
12/12 - 0s - loss: 0.9086 - accuracy: 0.4000 - auc: 0.7048 - 27ms/epoch - 2ms/step
Epoch 6/200
12/12 - 0s - loss: 0.8497 - accuracy: 0.6250 - auc: 0.7886 - 27ms/epoch - 2ms/step
Epoch 7/200
12/12 - 0s - loss: 0.8059 - accuracy: 0.6417 - auc: 0.8420 - 28ms/epoch - 2ms/step
Epoch 8/200
12/12 - 0s - loss: 0.7741 - accuracy: 0.6417 - auc: 0.8804 - 26ms/epoch - 2ms/step
Epoch 9/200
12/12 - 0s - loss: 0.7494 - accuracy: 0.6500 - auc: 0.8915 - 28ms/epoch - 2ms/step
Epoch 10/200
12/12 - 0s - loss: 0.7275 - accuracy: 0.7333 - auc: 0.9070 - 26ms/epoch - 2ms/step
Epoch 11/200
12/12 - 0s - loss: 0.6969 - accura

<keras.src.callbacks.History at 0x7a0e40738850>
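
The `12/12` at the start of each log line is the number of batches per epoch. A quick check of the arithmetic, assuming the standard 150-sample iris dataset and that `train_test_split` rounds the test set size up for a fractional `test_size`:

```python
import math

n_samples = 150      # rows in the iris dataset
test_size = 0.2      # fraction passed to train_test_split
batch_size = 10      # batch size passed to model.fit

n_test = math.ceil(test_size * n_samples)            # samples held out for testing
n_train = n_samples - n_test                         # samples seen by fit()
steps_per_epoch = math.ceil(n_train / batch_size)    # batches per epoch

print(n_train, n_test, steps_per_epoch)   # 120 30 12
```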

In [34]:
# Evaluate the model
loss, accuracy, auc = model.evaluate(X_test, y_test)
print(f'Test loss: {loss:.4f}')
print(f'Test accuracy: {accuracy:.4f}')
print(f'Test AUC: {auc:.4f}')

Test loss: 0.0775
Test accuracy: 0.9667
Test AUC: 0.9983
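
With a test set this small, the accuracy figure maps to a whole number of misclassified flowers. A small sketch of that arithmetic, assuming the 30-sample test set that follows from `test_size=0.2` on 150 rows:

```python
n_test = 30                  # 20% of the 150 iris samples (assumed test set size)
reported_accuracy = 0.9667   # value printed by the evaluation cell

n_correct = round(reported_accuracy * n_test)   # nearest whole number of correct predictions
n_wrong = n_test - n_correct

print(n_correct, n_wrong)             # 29 1
print(round(n_correct / n_test, 4))   # 0.9667
```

Because the split is not seeded with a `random_state`, these figures will vary from run to run.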




---

1. The error occurs when integer labels are passed to a model compiled with `categorical_crossentropy`. The output layer of your neural network uses a softmax activation, which is designed for multi-class classification, and the `categorical_crossentropy` loss function expects the target labels to be one-hot encoded vectors.

2. Even though your labels are already numerical (0, 1, 2), `categorical_crossentropy` requires them as one-hot encoded vectors: each label is represented as a binary vector whose length equals the number of classes, with a 1 at the index of the true class.
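
To make the format concrete, here is a plain-Python sketch of the conversion. The helper `one_hot` is hypothetical and only mimics, for small integer labels, what `to_categorical` produces; it is not the Keras implementation.

```python
def one_hot(labels, num_classes):
    """Return one row per label with 1.0 at the label's index and 0.0 elsewhere."""
    return [[1.0 if k == label else 0.0 for k in range(num_classes)]
            for label in labels]

labels = [0, 1, 2, 1]           # integer class labels, shape (4,)
encoded = one_hot(labels, 3)    # one-hot rows, shape (4, 3)

for label, row in zip(labels, encoded):
    print(label, '->', row)
# 0 -> [1.0, 0.0, 0.0]
# 1 -> [0.0, 1.0, 0.0]
# 2 -> [0.0, 0.0, 1.0]
# 1 -> [0.0, 1.0, 0.0]
```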


* **Model Output Shape:** The output layer of your model has 3 units with a softmax activation, so the model outputs a probability distribution over the 3 classes for each input. Predictions therefore have shape `(batch_size, 3)`.

* **Target Labels Shape:** Before encoding, your target labels are integers (0, 1, 2) with shape `(batch_size,)`. This is suitable for `sparse_categorical_crossentropy` but not for `categorical_crossentropy`.

* **Shape Mismatch:** The `categorical_crossentropy` loss expects the target labels to have the same shape as the model's output, i.e. `(batch_size, 3)`. This is why you need to convert the target labels to one-hot encoded vectors.
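
The shape requirement is easier to see with the loss written out. The sketch below is a plain-Python illustration, not the Keras code: `softmax`, `categorical_ce`, and `sparse_categorical_ce` are hypothetical helper names, and the scores are made-up numbers. It suggests that the two losses compute the same value and differ only in the label format they accept.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]   # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def categorical_ce(one_hot_row, probs):
    """Cross-entropy for a one-hot label of shape (num_classes,)."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_row, probs))

def sparse_categorical_ce(label, probs):
    """Cross-entropy for an integer label: use the probability of the true class."""
    return -math.log(probs[label])

probs = softmax([2.0, 0.5, -1.0])   # made-up scores for one sample, 3 classes
loss_one_hot = categorical_ce([1.0, 0.0, 0.0], probs)
loss_sparse = sparse_categorical_ce(0, probs)

print(len(probs))                                # 3
print(math.isclose(loss_one_hot, loss_sparse))   # True
```

In Keras terms, keeping the integer labels and compiling with `loss='sparse_categorical_crossentropy'` is the alternative to calling `to_categorical`.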

