# Assignment 02 - Multiclass Classification using the Iris Dataset

## **00. Upload and Set Up the Dataset**

**00-1** Upload the `iris.csv` file into Google Drive (Colab folder).

## **01. Find and Load the File**

**01-a** Find the path of the `iris.csv` file and save it in a variable called `fileName`.

In [None]:

from google.colab import drive
drive.mount('/content/drive')

# Define file path variable
fileName = "/content/drive/My Drive/Assignment_02_iris.csv"  # Update to match the file's name and location in your Drive
print("File Path:", fileName)
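
Before reading the file, it can help to confirm that the path actually points to a file, since a typo or an unmounted Drive otherwise only surfaces later as a `FileNotFoundError`. The following is a minimal sketch; the helper name `path_exists` is hypothetical and not part of the assignment.

```python
import os


def path_exists(path):
    """Return True if `path` points to an existing regular file."""
    return os.path.isfile(path)


# Same path as in the cell above; substitute your own value of fileName.
example_path = "/content/drive/My Drive/Assignment_02_iris.csv"

if path_exists(example_path):
    print("Found:", example_path)
else:
    print("Not found, check the Drive mount and the file name:", example_path)
```

Outside Colab, or before the Drive is mounted, the second branch is the expected result.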
    

**01-b** Read the contents of the `iris.csv` file into a variable named `iris`.

In [None]:

import pandas as pd

# Load the dataset
iris = pd.read_csv(fileName)

# Display the first few rows
iris.head()
    

## **02. Convert Species to Numeric Values**

**02-a** Convert species names to integers (`setosa -> 0`, `versicolor -> 1`, `virginica -> 2`).

In [None]:

from sklearn.preprocessing import LabelEncoder

# Convert species names to numbers
label_encoder = LabelEncoder()
iris["species"] = label_encoder.fit_transform(iris["species"])

# Display updated dataset
iris.head()
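
`LabelEncoder` assigns integers to the sorted unique labels, which is why the alphabetical order of the species names yields the mapping requested in 02-a. The sketch below reproduces that behavior with NumPy on a few made-up labels. It assumes the CSV stores the names exactly as `setosa`, `versicolor`, and `virginica`; different spellings would still be numbered in sorted order.

```python
import numpy as np

# Illustrative labels, not rows from the actual file.
species = np.array(["virginica", "setosa", "versicolor", "setosa"])

# np.unique returns the sorted unique values and, for each element,
# the index of its value in that sorted array.
classes, codes = np.unique(species, return_inverse=True)

mapping = {str(name): index for index, name in enumerate(classes)}
print(mapping)
print(codes)
```

With the fitted encoder from the cell above, `label_encoder.classes_` holds the same sorted names, so it can be used to check the mapping on the real data.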
    

## **03. Split Data into Features (X) and Target (y)**

**03-a** Select the first four columns as the features and call them `X`.

In [None]:

# Select features (all columns except species)
X = iris.drop("species", axis=1)
X.head()
    

**03-b** Select the last column as the label and call it `y`. 

In [None]:

# Select target variable
y = iris["species"]
y.head()
    

## **04. Split Data into Training and Testing Sets**

**04-a** Split the data into `X_train, X_test, y_train, y_test` with 80% training and 20% testing.

In [None]:

from sklearn.model_selection import train_test_split

# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=123)

# Print dataset shapes
print("Training Set Shape:", X_train.shape)
print("Testing Set Shape:", X_test.shape)
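
To see what shapes to expect, the split can be imitated by hand: shuffle the row indices, then set aside 20% of them (rounded up) for testing. This is only a sketch under the assumption that the file is the standard 150-row Iris dataset; the shuffle order will not match the one `train_test_split` produces with `random_state=123`.

```python
import math

import numpy as np

n_samples = 150   # assumption: standard Iris file with 150 rows
test_size = 0.2

rng = np.random.default_rng(123)          # fixed seed for a repeatable shuffle
indices = rng.permutation(n_samples)

n_test = math.ceil(test_size * n_samples)  # test share, rounded up
test_idx, train_idx = indices[:n_test], indices[n_test:]

print("Training rows:", len(train_idx))
print("Testing rows:", len(test_idx))
```

Under that assumption, the shapes printed by the cell above should be `(120, 4)` and `(30, 4)`.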
    

## **05. Convert Labels to One-Hot Encoding**

**05-a** Convert `y_train` and `y_test` from categorical to one-hot encoded values.
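
One-hot encoding turns each integer label into a vector with a single 1 in the position of its class. As a rough illustration of the idea (not a replacement for `OneHotEncoder`), the same result can be produced with NumPy by indexing an identity matrix; the labels here are made up.

```python
import numpy as np

labels = np.array([0, 2, 1])   # illustrative integer class labels
n_classes = 3

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(n_classes)[labels]
print(one_hot)

# argmax recovers the original integer labels.
recovered = one_hot.argmax(axis=1)
```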

In [None]:

from sklearn.preprocessing import OneHotEncoder
import numpy as np

# Initialize OneHotEncoder
encoder = OneHotEncoder(sparse_output=False)

# Reshape y_train and y_test for encoding
y_train_reshaped = np.array(y_train).reshape(-1, 1)
y_test_reshaped = np.array(y_test).reshape(-1, 1)

# Apply OneHotEncoder
one_hot_train_labels = encoder.fit_transform(y_train_reshaped)
one_hot_test_labels = encoder.transform(y_test_reshaped)

print("One-hot encoded labels for training data:\n", one_hot_train_labels[:5])
    

## **06. Define and Train a Neural Network Model**

**06-a** Define a sequential model that contains 3 dense layers.

In [None]:

import tensorflow as tf
from tensorflow import keras

# Define the model
model = keras.Sequential([
    keras.layers.Dense(100, activation='relu', input_shape=(4,)),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dense(3, activation='softmax')
])

# Compile the model
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
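
To make the layer shapes concrete, here is a NumPy sketch of the forward pass through a 4 -> 100 -> 100 -> 3 network with ReLU hidden layers and a softmax output. The weights are random stand-ins rather than trained values, and the inputs are made up, so only the shapes and the fact that each output row sums to 1 are meaningful.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(z):
    return np.maximum(z, 0.0)


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


# Random weights stand in for trained ones.
W1, b1 = rng.normal(size=(4, 100)) * 0.1, np.zeros(100)
W2, b2 = rng.normal(size=(100, 100)) * 0.1, np.zeros(100)
W3, b3 = rng.normal(size=(100, 3)) * 0.1, np.zeros(3)

x = rng.normal(size=(5, 4))                # 5 illustrative samples, 4 features each
probs = softmax(relu(relu(x @ W1 + b1) @ W2 + b2) @ W3 + b3)

print(probs.shape)        # (5, 3)
print(probs.sum(axis=1))  # each row sums to 1, up to floating-point rounding
```

Because the output is a probability vector over three classes, it pairs naturally with one-hot labels and the `categorical_crossentropy` loss used in the cell above.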
    

**07-a** Train the model using `epochs=20, batch_size=10`.

In [None]:

# Train the model
model.fit(X_train, one_hot_train_labels, epochs=20, batch_size=10, validation_data=(X_test, one_hot_test_labels))
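
The two training parameters determine how many weight updates the model receives: each epoch passes over the training set once, in batches of `batch_size` rows. A small sketch of that arithmetic, assuming 120 training rows (80% of a standard 150-row Iris file):

```python
import math

n_train = 120     # assumption: 80% of a 150-row dataset
batch_size = 10
epochs = 20

steps_per_epoch = math.ceil(n_train / batch_size)  # batches per pass over the data
total_updates = steps_per_epoch * epochs

print("Batches per epoch:", steps_per_epoch)
print("Total weight updates:", total_updates)
```

Under that assumption the Keras progress bar should show 12 steps per epoch.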
    

## **08. Evaluate Model Performance**

In [None]:

# Evaluate the model
test_loss, test_acc = model.evaluate(X_test, one_hot_test_labels)
print("Test Accuracy:", test_acc)
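
`model.evaluate` returns the loss and the accuracy on the data it is given. The sketch below shows how categorical cross-entropy and accuracy can be computed by hand from one-hot labels and predicted probabilities. The numbers are illustrative only and are not output from the model above.

```python
import numpy as np

# Illustrative values only.
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1]], dtype=float)
y_prob = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.5, 0.3],
                   [0.3, 0.4, 0.3]])

eps = 1e-7  # guards against log(0)

# Cross-entropy: mean negative log-probability assigned to the true class.
loss = -np.mean(np.sum(y_true * np.log(np.clip(y_prob, eps, 1.0)), axis=1))

# Accuracy: fraction of rows where the most probable class is the true class.
accuracy = np.mean(np.argmax(y_prob, axis=1) == np.argmax(y_true, axis=1))

print("Loss:", loss)
print("Accuracy:", accuracy)
```

In this example the third sample is misclassified, so the accuracy is 2/3 even though every prediction contributes to the loss.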
    

## **09. Repeat Training with Different Parameters**

**09-a** Train a smaller model with 10 hidden units per layer and `epochs=4, batch_size=10`. 

In [None]:

# Define a smaller model
small_model = keras.Sequential([
    keras.layers.Dense(10, activation='relu', input_shape=(4,)),
    keras.layers.Dense(10, activation='relu'),
    keras.layers.Dense(3, activation='softmax')
])

# Compile the model
small_model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
small_model.fit(X_train, one_hot_train_labels, epochs=4, batch_size=10, validation_data=(X_test, one_hot_test_labels))
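
One way to quantify how much smaller this model is than the 100-unit version is to count trainable parameters: each dense layer has one weight per input-output pair plus one bias per output. The helper `dense_param_count` below is a hypothetical name used for illustration.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))


large = dense_param_count([4, 100, 100, 3])   # 4 inputs, two 100-unit layers, 3 outputs
small = dense_param_count([4, 10, 10, 3])     # same layout with 10-unit layers

print("100-unit model parameters:", large)
print("10-unit model parameters:", small)
```

The counts can be checked against the totals reported by `model.summary()` and `small_model.summary()`.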
    

## **10. Final Submission Instructions**


- **Save and download this notebook (`.ipynb` file).**
- **Record a 5-10 minute video explaining your work.**
- **Upload the video to YouTube** and submit both the notebook and the video link.
    