# <font face = 'Palatino Linotype' color = '#4297A0'> Classification of Iris Plants with MLP using TensorFlow </font>
## <font face = 'Palatino Linotype' color = '#2F5061'> Data Scientist: Paolo Hilado </font>

## <font face = 'Palatino Linotype' color = '#4297A0'> Project Objective: </font>
<font face = 'Palatino Linotype' color = '#2F5061'> In this mini-project, we train a Multilayer Perceptron (MLP) machine learning model using TensorFlow to classify iris plants by species. Given four features (sepal length, sepal width, petal length, and petal width), the model should classify each observation as iris setosa, iris versicolor, or iris virginica. </font>

In [1]:
# Load the necessary libraries.
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical
from tensorflow.keras import Input
from tensorflow.keras.models import load_model

# Load the Iris Data Set for the classification problem.
iris = load_iris()
data = iris.data #retrieving the features
target = iris.target #retrieving the response variable

# Create the DataFrame
df = pd.DataFrame(data, columns=iris.feature_names)
df['species'] = iris.target_names[target] # adding the response variable into the dataframe
df.head() # checking the first 5 rows

,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),species
0,5.1,3.5,1.4,0.2,setosa
1,4.9,3.0,1.4,0.2,setosa
2,4.7,3.2,1.3,0.2,setosa
3,4.6,3.1,1.5,0.2,setosa
4,5.0,3.6,1.4,0.2,setosa


In [2]:
# Separating the Features and target variable.
X = df[iris.feature_names].values
y = df['species'].values

# Encode the target labels (species) as integers.
label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(y)

# Convert the integer labels to one-hot encoding.
# The categorical_crossentropy loss function expects one-hot targets, where each
# of the 3 classes is represented as a binary vector, e.g. iris setosa becomes [1,0,0].
# (Integer labels would instead call for the sparse_categorical_crossentropy loss.)
y_onehot = to_categorical(y_encoded)

# Do a data split with 70% train and 30% test set.
X_train, X_test, y_train, y_test = train_test_split(X, y_onehot, test_size=0.3, random_state=42)
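
To make the one-hot step concrete, here is a minimal NumPy sketch of the kind of array `to_categorical` produces. The four sample labels are made up for illustration, and the class order assumes the alphabetical sorting that `LabelEncoder` applies (setosa, versicolor, virginica).

```python
import numpy as np

# Hypothetical integer labels: 0 = setosa, 1 = versicolor, 2 = virginica.
y_encoded = np.array([0, 2, 1, 0])

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the identity matrix by label yields the encoding.
num_classes = 3
y_onehot = np.eye(num_classes, dtype="float32")[y_encoded]
print(y_onehot)

# argmax along the class axis recovers the original integer labels.
recovered = y_onehot.argmax(axis=1)
print(recovered)
```

The round trip through `argmax` is the same operation used later to turn softmax outputs back into class indices.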

In [9]:
# Standardize all the continuous variables.
from sklearn.preprocessing import StandardScaler
# Fit the scaler on the training set only, then apply the same transform to the test set.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
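
The sketch below shows, in plain NumPy, what the `fit_transform` / `transform` pair amounts to: the mean and standard deviation are computed from the training rows only and then reused for the test rows. The small arrays are hypothetical values, not taken from the Iris data.

```python
import numpy as np

# Hypothetical feature values (two features), for illustration only.
X_train_demo = np.array([[5.0, 1.0], [6.0, 2.0], [7.0, 3.0]])
X_test_demo = np.array([[6.5, 1.5]])

# Statistics come from the training rows only.
mean = X_train_demo.mean(axis=0)
std = X_train_demo.std(axis=0)  # population std (ddof=0), as StandardScaler uses

X_train_scaled = (X_train_demo - mean) / std
X_test_scaled = (X_test_demo - mean) / std  # reuse the training statistics

print(X_train_scaled.mean(axis=0))  # approximately 0 for each feature
print(X_train_scaled.std(axis=0))   # approximately 1 for each feature
print(X_test_scaled)
```

Reusing the training statistics keeps information about the test set from leaking into preprocessing.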

In [14]:
# Build the Multi-Layer Perceptron Model.
# A small network is enough here because the iris classes are easy to separate.
model = tf.keras.Sequential([
    Input(shape=(X_train.shape[1],)),  # Explicitly define the input shape (4 features).
    tf.keras.layers.Dense(16, activation='relu'),  # First hidden layer
    tf.keras.layers.Dense(12, activation='relu'),  # Second hidden layer
    tf.keras.layers.Dense(3, activation='softmax') # Output layer: one probability per species
])

# Set the optimizer, loss function, and evaluation metric.
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

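# Train the MLP Model.
# Note: the test set is also passed as validation data, so the val_* metrics
# and the final test metrics are computed on the same samples.
model.fit(X_train, y_train, epochs=50, batch_size=8, validation_data=(X_test, y_test))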

# Evaluate model performance using accuracy as we have a balanced data set.
# Also check out the loss.
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Test loss: {test_loss}")
print(f"Test accuracy: {test_acc * 100:.2f}%")

Epoch 1/50
14/14 ━━━━━━━━━━━━━━━━━━━━ 2s 23ms/step - accuracy: 0.2972 - loss: 1.1816 - val_accuracy: 0.5333 - val_loss: 1.0195
Epoch 2/50
14/14 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.4939 - loss: 1.0430 - val_accuracy: 0.6667 - val_loss: 0.9028
Epoch 3/50
14/14 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.6053 - loss: 0.9556 - val_accuracy: 0.6667 - val_loss: 0.8121
Epoch 4/50
14/14 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.6442 - loss: 0.8851 - val_accuracy: 0.6889 - val_loss: 0.7517
Epoch 5/50
14/14 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.6906 - loss: 0.8193 - val_accuracy: 0.6889 - val_loss: 0.7047
Epoch 6/50
14/14 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.5899 - loss: 0.8277 - val_accuracy: 0.7111 - val_loss: 0.6623
Epoch 7/50
...
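
To help read the loss values in the log above, here is a small NumPy sketch of softmax followed by categorical cross-entropy. It is an illustration under simplifying assumptions (Keras adds numerical safeguards such as clipping), and the logits are hypothetical. A model that assigns equal probability to all three species has a loss of ln(3) ≈ 1.0986, which is close to the first-epoch loss of 1.1816.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob):
    # Mean over samples of -sum(y_true * log(y_prob)).
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

# One sample whose true class is the first one (one-hot encoded).
y_true = np.array([[1.0, 0.0, 0.0]])

# Equal logits give uniform probabilities, so the loss is ln(3).
uniform_loss = categorical_crossentropy(y_true, softmax(np.zeros((1, 3))))

# A confident, correct prediction gives a loss near zero.
confident_loss = categorical_crossentropy(y_true, softmax(np.array([[4.0, 0.0, 0.0]])))

print(f"uniform guess loss:   {uniform_loss:.4f}")
print(f"confident guess loss: {confident_loss:.4f}")
```

As training proceeds, the reported loss should move from the uniform-guess level toward the confident-guess level.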

In [15]:
# Save the model for future use.
model.save('MLP_TF_ClassIris.keras')  # This saves the model to a single file named 'MLP_TF_ClassIris.keras'
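
Once the saved model is reloaded (for example with the `load_model` function imported in the first cell), its predictions are softmax probabilities that still need to be mapped back to species names. The sketch below shows that mapping with NumPy only. The probability rows are hypothetical stand-ins for the output of `model.predict`, and the class order assumes `LabelEncoder`'s alphabetical sorting. New observations would also need to pass through the same fitted scaler first.

```python
import numpy as np

# Class order as LabelEncoder assigns it (alphabetical).
class_names = np.array(["setosa", "versicolor", "virginica"])

# Hypothetical softmax outputs standing in for model.predict(X_new).
probs = np.array([
    [0.97, 0.02, 0.01],
    [0.05, 0.15, 0.80],
])

# The predicted class is the index of the largest probability in each row.
predicted_idx = probs.argmax(axis=1)
predicted_species = class_names[predicted_idx]
print(predicted_species.tolist())
```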

## <font face = 'Palatino Linotype' color = '#4297A0'> Summary: </font>
<font face = 'Palatino Linotype' color = '#2F5061'> This mini-project demonstrated how to train a Multi-Layer Perceptron ML model using the TensorFlow library. The model reached an accuracy of 95.56% on the test set when classifying iris plants as setosa, versicolor, or virginica. A small neural network was chosen because the data set is simple, which also limits the risk of overfitting. </font>
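
To put the accuracy figure in context, the short calculation below shows how coarse accuracy is on a test set of this size. It assumes the reported figure refers to the 45-sample test set (30% of the 150 iris observations). Under that assumption, 95.56% corresponds to 43 correct predictions, and each misclassified sample shifts the score by about 2.2 percentage points.

```python
n_test = 45  # 30% of the 150 iris samples

# Accuracy (in percent) for a few possible counts of correct predictions.
accuracy_pct = {n_correct: round(n_correct / n_test * 100, 2)
                for n_correct in (42, 43, 44)}

for n_correct, pct in accuracy_pct.items():
    print(f"{n_correct}/{n_test} correct -> {pct:.2f}%")
```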