# Week 1: Iris Dataset Classification with Keras

### 1. Introduction & Objectives

In this notebook, we will use the Iris dataset to classify the species of iris flowers. 

Our objective is to build a multiclass classifier with Keras that takes the four numerical features of each iris sample as input and outputs a predicted species. The classifier consists of an input layer, one hidden layer, and an output layer. The output layer uses the softmax activation function, and the model is trained with the categorical crossentropy loss function.
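
To make the roles of softmax and categorical crossentropy concrete, here is a minimal NumPy sketch for a single sample. The raw scores and the one-hot label are invented for illustration, and the two helper functions are hypothetical stand-ins, not the Keras implementations.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

def categorical_crossentropy(one_hot_target, probabilities):
    """Negative log-probability that the model assigns to the true class."""
    return -np.sum(one_hot_target * np.log(probabilities))

# Hypothetical raw outputs for one sample whose true class is index 0
logits = np.array([2.0, 1.0, 0.1])
target = np.array([1.0, 0.0, 0.0])

probabilities = softmax(logits)
loss = categorical_crossentropy(target, probabilities)
print(probabilities, loss)
```

The loss shrinks toward zero as the probability assigned to the true class approaches 1, which is what training pushes the network toward.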


In this project, we will use the Iris dataset, which is available as a raw data file from the [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data). This dataset consists of 150 samples of iris flowers, each with four numerical features: **sepal length**, **sepal width**, **petal length**, and **petal width**. The target label specifies the **species** of each iris flower.

For model training and evaluation, we will divide the dataset as follows:

- **Training set**: 80% of the total samples (120 samples)
- **Validation set**: 10% of the training samples (12 of the 120 training samples, leaving 108 for fitting)
- **Test set**: 20% of the total samples (30 samples)

The raw data file is ordered by species, so we will shuffle the data randomly before splitting; without shuffling, the last 30 rows used for testing would all belong to a single species. This setup provides sufficient data for training and allows us to evaluate the model's performance on separate validation and test sets.
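
The split sizes listed above can be checked with a few lines of integer arithmetic. This is only a sketch of the bookkeeping; the variable names are illustrative.

```python
total_samples = 150
train_percent = 80        # share of all samples used for training
validation_percent = 10   # share of the training samples held out for validation

train_samples = total_samples * train_percent // 100
test_samples = total_samples - train_samples
validation_samples = train_samples * validation_percent // 100
fit_samples = train_samples - validation_samples  # samples the weights are fitted on

print(train_samples, validation_samples, fit_samples, test_samples)
```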


#### 2.1 Importing Required Libraries and Loading the Dataset

To start, we’ll disable TensorFlow warnings to keep the output clean, and explicitly set the Keras backend to TensorFlow for consistency in model development. 

Next, we’ll import the necessary libraries: `pandas` for data manipulation and `numpy` for numerical operations. We’ll also import the Keras components used to build and train the neural network: `Input`, `Model`, the `Dense` layer, and the `ModelCheckpoint` callback.

To load the Iris dataset, we’ll read directly from the UCI Machine Learning Repository’s raw data link. Using `pandas`, we’ll load the dataset into a DataFrame and assign column names to clarify the features: **sepal length**, **sepal width**, **petal length**, **petal width**, and the **species** label.


In [None]:
# Disabling the TensorFlow warnings and setting the Keras backend to TensorFlow
import os

os.environ['KERAS_BACKEND'] = 'tensorflow'
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

In [None]:
# Importing the required libraries
import pandas as pd
import numpy as np
from keras import Input, Model
# Import from the public API; keras.src is an internal package path
from keras.layers import Dense
from keras.callbacks import ModelCheckpoint

In [None]:
# Loading the dataset
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
column_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
dataset = pd.read_csv(url, names=column_names)

The dataset has been successfully loaded into a DataFrame. Let's take a look at the first few rows to understand the structure of the data.

In [None]:
# Displaying the first few rows of the dataset
dataset.head()

The dataset contains five columns: **sepal_length**, **sepal_width**, **petal_length**, **petal_width**, and **species**. The **species** column is the target label for each sample. We can now move on to preprocessing the data for training.

#### 2.2 Preprocessing the Data

Before training the model, we need to prepare the data. We will perform the following preprocessing steps:

- Randomly shuffle the data so that every species appears in the training, validation, and test sets.
- Split the features and target labels into separate variables.
- One-hot encode the target labels to convert them into a binary matrix representation.
- Split the data into training, validation, and test sets.
    
To shuffle the data, we will use the DataFrame's `sample(frac=1)` method, which returns all rows in random order, and then reset the index. Seeding NumPy's random number generator beforehand makes the shuffle reproducible.

In [None]:
# Shuffling the data
np.random.seed(0)
dataset = dataset.sample(frac=1).reset_index(drop=True)

dataset.head()
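
As an alternative to seeding NumPy's global generator, `sample` also accepts a `random_state` argument that pins the permutation for that one call. The sketch below uses a small, made-up DataFrame in place of the Iris data to show that the same seed gives the same order and that no rows are lost.

```python
import pandas as pd

# Hypothetical toy frame standing in for the Iris DataFrame
toy = pd.DataFrame({'value': [10, 20, 30, 40, 50]})

# random_state fixes the permutation for this call only
shuffled_a = toy.sample(frac=1, random_state=0).reset_index(drop=True)
shuffled_b = toy.sample(frac=1, random_state=0).reset_index(drop=True)

# Same seed, same order; the rows are permuted but none are dropped
print(shuffled_a.equals(shuffled_b))
print(sorted(shuffled_a['value']) == sorted(toy['value']))
```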

The data has been successfully shuffled. Next, we will split the features and target labels into separate variables and one-hot encode the target labels.

In [None]:
# Split the features and target labels
features = dataset[['sepal_length', 'sepal_width', 'petal_length', 'petal_width']]
target = dataset['species']

# One-hot encode the target labels
target = pd.get_dummies(target).astype(int)
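
To see what the one-hot encoding produces, here is a minimal sketch on four hand-written labels. The label strings match those in the Iris data file; the tiny Series itself is invented for illustration. Note that `get_dummies` orders the columns alphabetically, which determines which output neuron corresponds to which species.

```python
import pandas as pd

# Hypothetical mini-sample of species labels
species = pd.Series(['Iris-setosa', 'Iris-virginica', 'Iris-versicolor', 'Iris-setosa'])

# One column per class, one 1 per row
encoded = pd.get_dummies(species).astype(int)

# Column order gives the class index used by the output layer
class_names = list(encoded.columns)
print(class_names)
print(encoded.values.tolist())
```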

The features have been separated from the target labels, and the target labels have been one-hot encoded. We can now split the data into training and testing sets.

In [None]:
# Define the split ratio (80% training, 20% testing)
split = 0.8

# Split the features into training and testing sets
split_index = int(split * len(dataset))
train_data, test_data = features[:split_index], features[split_index:]

# Split the target labels into training and testing sets
train_target, test_target = target[:split_index], target[split_index:]

The data has now been successfully split into training and testing sets. We can check the shape of the training and testing sets to ensure that the data has been split correctly.

In [None]:
# Display the shape of the training and testing sets
train_data.shape, train_target.shape, test_data.shape, test_target.shape
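
A random shuffle does not guarantee that each split contains exactly the same number of samples per species. The sketch below shows one way to inspect the class counts after an 80/20 split. It uses a synthetic label array with 50 samples per class and NumPy's `default_rng`, both assumptions made for illustration and not the notebook's actual shuffle.

```python
import numpy as np

# Synthetic labels: 50 samples for each of the three species
labels = np.repeat(['setosa', 'versicolor', 'virginica'], 50)

# Shuffle with a seeded generator, then split 80/20 by position
rng = np.random.default_rng(0)
shuffled = rng.permutation(labels)
split_index = int(0.8 * len(shuffled))
train_labels, test_labels = shuffled[:split_index], shuffled[split_index:]

# Count how many samples of each class landed in each split
train_classes, train_counts = np.unique(train_labels, return_counts=True)
test_classes, test_counts = np.unique(test_labels, return_counts=True)
print(dict(zip(train_classes, train_counts)))
print(dict(zip(test_classes, test_counts)))
```

If the counts turn out to be noticeably uneven, a stratified split is a common remedy.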

The training and testing sets contain 120 and 30 samples, respectively. We can now proceed to creating the model and fitting it to the training data.

### 3. Creating the Model

To build the neural network model, we will use the Keras functional API. The model will consist of the following layers:
    
- **Input layer**: Accepts the four numerical features of the iris samples.
- **Hidden layer**: Contains 8 neurons and uses the ReLU activation function.
- **Output layer**: Contains 3 neurons (one for each species) and uses the softmax activation function.

In [None]:
# Define the input layer
inputs = Input(shape=(4,), name='input')

# Define the hidden layer
hidden = Dense(8, activation='relu', name='hidden')(inputs)

# Define the output layer
outputs = Dense(3, activation='softmax', name='output')(hidden)

# Create the model
model = Model(inputs=inputs, outputs=outputs)

# Display the model summary
model.summary()

The model has been successfully created with the input, hidden, and output layers. The model summary provides information about the layers, including the number of parameters in each layer. We can now proceed to compiling and fitting the model to the training data.
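
The parameter counts in the summary can be verified by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The helper below is a hypothetical function written for this check, not part of Keras.

```python
def dense_parameter_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden_params = dense_parameter_count(4, 8)   # 4 features feeding 8 hidden neurons
output_params = dense_parameter_count(8, 3)   # 8 hidden neurons feeding 3 output neurons
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)
```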

### 4. Compiling and Fitting the Model

Before training the model, we need to compile it by specifying the optimizer, loss function, and evaluation metric. In this case, we will use the Adam optimizer, categorical crossentropy loss function, and accuracy as the evaluation metric. 

During training, we will use the `ModelCheckpoint` callback to save the model whenever the validation loss improves. The saved file therefore holds the weights from the epoch with the lowest validation loss, which protects us from keeping a later, overfitted version of the model.
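
The idea behind saving only the best model can be sketched in plain Python. This is a simplified illustration of the logic, not the actual Keras implementation, and the validation losses are invented numbers.

```python
# Hypothetical per-epoch validation losses, invented for illustration
val_losses = [0.92, 0.71, 0.64, 0.66, 0.61, 0.63]

best_loss = float('inf')
saved_epochs = []  # epochs at which a checkpoint would be written

for epoch, val_loss in enumerate(val_losses, start=1):
    if val_loss < best_loss:
        # Validation loss improved: overwrite the saved model
        best_loss = val_loss
        saved_epochs.append(epoch)

print(saved_epochs, best_loss)
```

Epochs where the validation loss gets worse leave the saved file untouched, so the checkpoint always reflects the best epoch seen so far.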

In [None]:
# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Define the ModelCheckpoint callback
callbacks = [ModelCheckpoint(filepath='../Models/iris_model.keras', monitor='val_loss', save_best_only=True)]

# Fit the model to the training data
history = model.fit(train_data, train_target, epochs=100, batch_size=16, validation_split=0.1, callbacks=callbacks)
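
Note that `validation_split` does not draw a random subset: Keras takes the validation samples from the end of the arrays passed to `fit`, before any per-epoch shuffling. The sketch below mimics that slicing with plain NumPy on row indices; it is an illustration of the behavior, not Keras code, and it assumes the 120 training rows from the split above.

```python
import numpy as np

# Row positions of the 120 training samples
train_indices = np.arange(120)
validation_split = 0.1

# The last 10% of the rows are held out for validation
split_at = int(len(train_indices) * (1 - validation_split))
fit_indices = train_indices[:split_at]
val_indices = train_indices[split_at:]

print(len(fit_indices), len(val_indices), val_indices[0])
```

Because the data was shuffled before splitting, taking the last rows still yields a random sample of the training set.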