# Classification of Iris Plants using MLP
### Data Scientist: Paolo G. Hilado

Project Objective: Train a multilayer perceptron (MLP) classifier using the TensorFlow package. Given four features (sepal length, sepal width, petal length, and petal width), the model should classify each observation as Iris setosa, Iris versicolor, or Iris virginica.

In [22]:
# Load the necessary libraries.
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical
from tensorflow.keras import Input
from tensorflow.keras.models import load_model

# Load the Iris Data Set for the classification problem.
iris = load_iris()
data = iris.data #retrieving the features
target = iris.target #retrieving the response variable

# Create the DataFrame
df = pd.DataFrame(data, columns=iris.feature_names)
df['species'] = iris.target_names[target] # adding the response variable into the dataframe
df.head() # checking the first 5 rows

| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | species |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |


In [15]:
# Separating the Features and target variable.
X = df[iris.feature_names].values
y = df['species'].values

# Encode the target labels (species) as integers.
label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(y)

# Convert target labels to one-hot encoding.
# The categorical_crossentropy loss function expects the target in one-hot
# format, where each of the 3 classes becomes a binary vector,
# e.g. setosa -> [1, 0, 0].
# (Integer labels would also work with sparse_categorical_crossentropy.)
y_onehot = to_categorical(y_encoded) 

# Do a data split with 70% train and 30% test set.
X_train, X_test, y_train, y_test = train_test_split(X, y_onehot, test_size=0.3, random_state=42)
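
To make the encoding step concrete, here is a minimal NumPy sketch of what one-hot encoding does to integer labels and how `argmax` reverses it. The names `y_int`, `y_onehot_demo`, and `class_names` are illustrative and not part of the notebook; the class order assumes the alphabetical ordering that `LabelEncoder` applies to the species strings.

```python
import numpy as np

# Integer class ids in the alphabetical order LabelEncoder would assign:
# 0 = setosa, 1 = versicolor, 2 = virginica.
class_names = np.array(["setosa", "versicolor", "virginica"])
y_int = np.array([0, 2, 1, 0])  # hypothetical labels for four flowers

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the identity matrix by the labels encodes all of them at once.
y_onehot_demo = np.eye(len(class_names), dtype="float32")[y_int]
print(y_onehot_demo)

# argmax along the class axis undoes the encoding.
recovered = y_onehot_demo.argmax(axis=1)
print(class_names[recovered])
```

The same `argmax` step is how a row of softmax probabilities from the model can be turned back into a species name.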

In [19]:
# Build the Multi-Layer Perceptron Model.
model = tf.keras.Sequential([
    Input(shape=(X_train.shape[1],)),  # Explicitly define the input shape.
    tf.keras.layers.Dense(64, activation='relu'),  # Hidden layer
    tf.keras.layers.Dense(32, activation='relu'),  # Hidden layer
    tf.keras.layers.Dense(3, activation='softmax')  # Output layer
])

# Indicate the approaches for optimization, loss function, and evaluation of model performance.
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

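# Train the MLP model.
# Note: the test set is also passed as validation data here, so the
# validation metrics are not independent of the final test evaluation.
model.fit(X_train, y_train, epochs=50, batch_size=8, validation_data=(X_test, y_test))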

# Evaluate the model on the test set. Accuracy is a reasonable metric here
# because the Iris data set is balanced (50 samples per class).
# The loss is reported as well.
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Test loss: {test_loss}")
print(f"Test accuracy: {test_acc * 100:.2f}%")

Epoch 1/50
14/14 ━━━━━━━━━━━━━━━━━━━━ 2s 24ms/step - accuracy: 0.3247 - loss: 1.3875 - val_accuracy: 0.3333 - val_loss: 1.0020
Epoch 2/50
14/14 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.4141 - loss: 0.9191 - val_accuracy: 0.7333 - val_loss: 0.8196
Epoch 3/50
14/14 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.6744 - loss: 0.8029 - val_accuracy: 0.7111 - val_loss: 0.7131
Epoch 4/50
14/14 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.7360 - loss: 0.7407 - val_accuracy: 0.8889 - val_loss: 0.6273
Epoch 5/50
14/14 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.7981 - loss: 0.6407 - val_accuracy: 0.7333 - val_loss: 0.5493
Epoch 6/50
14/14 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.6704 - loss: 0.6175 - val_accuracy: 0.8000 - val_loss: 0.5011
Epoch 7/50
14/14 ━━━━━━━━━
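
To help read the loss and accuracy columns in the training log, the following NumPy sketch reimplements the softmax output, the categorical cross-entropy loss, and accuracy by hand. It is an illustration of the formulas under simplifying assumptions, not Keras's internal implementation; the function names and the example logits are hypothetical. A model that assigns equal probability to all three classes scores a loss of ln 3 ≈ 1.0986, which is the scale of the first-epoch values in the log.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps exp() numerically stable."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_prob))."""
    y_prob = np.clip(y_prob, eps, 1.0)  # avoid log(0)
    return float(-(y_true * np.log(y_prob)).sum(axis=1).mean())

def accuracy(y_true, y_prob):
    """Fraction of rows where the most probable class matches the one-hot target."""
    return float((y_true.argmax(axis=1) == y_prob.argmax(axis=1)).mean())

# One one-hot target per class.
y_true = np.eye(3)[[0, 1, 2]]

# Equal logits give equal probabilities (1/3 each), so the loss is ln(3).
uniform = softmax(np.zeros((3, 3)))
baseline_loss = categorical_crossentropy(y_true, uniform)

# Hypothetical logits that favour the correct class in every row.
confident = softmax(np.array([[4.0, 0.0, 0.0],
                              [0.0, 4.0, 0.0],
                              [0.0, 0.0, 4.0]]))
confident_loss = categorical_crossentropy(y_true, confident)

print(f"baseline loss:  {baseline_loss:.4f}")
print(f"confident loss: {confident_loss:.4f}")
print(f"confident accuracy: {accuracy(y_true, confident):.2f}")
```

As training proceeds, the loss falls below the ln 3 baseline as the model puts more probability on the correct class, while accuracy only changes when the most probable class changes.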

In [21]:
# Save the model for future use.
model.save('Iris_class.keras')  # This saves the model as a single file named 'Iris_class.keras'
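
As a hedged sketch of how a saved `.keras` file could be reused later, the example below saves a small stand-in model to a temporary file, restores it with `load_model`, and checks that both copies give the same predictions. The stand-in model is untrained, and the file name `demo.keras` and the sample measurements are assumptions made for illustration; in the notebook the path would be `Iris_class.keras`.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Untrained stand-in with the same input/output sizes as the Iris classifier.
demo_model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.keras")  # hypothetical file name
    demo_model.save(path)                         # writes one .keras file
    restored = tf.keras.models.load_model(path)   # rebuilds layers and weights

# One flower: sepal length, sepal width, petal length, petal width (cm).
sample = np.array([[5.1, 3.5, 1.4, 0.2]], dtype="float32")

probs_before = demo_model.predict(sample, verbose=0)
probs_after = restored.predict(sample, verbose=0)

print(probs_after)                 # one row of 3 class probabilities
print(probs_after.argmax(axis=1))  # index of the most probable class
```

With the trained model, the predicted index could be mapped back to a species name with `label_encoder.inverse_transform`.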

### This project demonstrated training a multilayer perceptron classifier with the TensorFlow library. The model reached 97.78% accuracy on the 45-sample test set when classifying iris plants as setosa, versicolor, or virginica.