# Iris Flower Classification

## About the Project

This project classifies Iris flowers by species using several machine learning models. The goal is to predict a flower's species from its sepal and petal measurements.

## About the Dataset

The Iris dataset is a classic dataset in the machine learning community. It contains 150 samples of Iris flowers, with 50 samples each from three species: Iris-setosa, Iris-versicolor, and Iris-virginica. Each sample has four features, all measured in centimetres:
- Sepal length
- Sepal width
- Petal length
- Petal width
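
As a quick check of the figures above, the following sketch loads the copy of the dataset bundled with scikit-learn and prints its shape, class balance, and names. The variable names are illustrative; note that scikit-learn labels the classes `setosa`, `versicolor`, and `virginica`, without the `Iris-` prefix.

```python
import numpy as np
from sklearn.datasets import load_iris

# Load the bundled dataset (illustrative variable name)
iris_check = load_iris()

# Number of samples and features
n_samples, n_features = iris_check.data.shape

# Number of samples per class label (0, 1, 2)
class_counts = np.bincount(iris_check.target)

print(n_samples, n_features)
print(dict(zip(iris_check.target_names, class_counts)))
print(iris_check.feature_names)
```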

## Import Libraries

To start, we need to import the necessary libraries:

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

import sklearn
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from catboost import CatBoostClassifier
from xgboost import XGBClassifier

## Load Dataset

We load the Iris dataset from `sklearn`:

In [None]:
iris = load_iris(as_frame=True)
data = iris['data']
data['target'] = iris['target']
data.info()

## Splitting Features and Target

Separate the features from the target variable, then hold out 20% of the samples as a test set:


In [None]:
X = data.drop(columns=['target'])
y = data['target']
X

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train.shape, X_test.shape
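
The split above is purely random, so the 30 test samples need not contain exactly 10 flowers of each species. Below is a minimal sketch of one optional variation, assuming class balance in the test set matters: passing `stratify` to `train_test_split` preserves the class proportions. The `_s` suffixed names are illustrative and are not used elsewhere in this project.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load features and labels as NumPy arrays
X_all, y_all = load_iris(return_X_y=True)

# Same 80/20 split and seed as above, but stratified by class label
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X_all, y_all, test_size=0.2, random_state=42, stratify=y_all
)

print(X_train_s.shape, X_test_s.shape)
print(np.bincount(y_test_s))  # samples per class in the test set
```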

## Create Function to Evaluate Models

We define a helper that scores predictions by accuracy and by support-weighted precision, recall, and F1 score:

In [None]:
def evaluate_model(true, predicted):
    accuracy = accuracy_score(true, predicted)
    precision = precision_score(true, predicted, average='weighted')
    recall = recall_score(true, predicted, average='weighted')
    f1 = f1_score(true, predicted, average='weighted')
    return accuracy, precision, recall, f1
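
To make the `average` argument concrete, here is a small sketch on hand-made labels (hypothetical, not taken from the Iris data). `'weighted'` averages the per-class scores in proportion to how many true samples each class has, while `'macro'` gives every class equal weight, so the two diverge when classes are imbalanced.

```python
from sklearn.metrics import precision_score

# Hypothetical, deliberately imbalanced labels: 4 of class 0, 2 of class 1, 1 of class 2
y_true_demo = [0, 0, 0, 0, 1, 1, 2]
y_pred_demo = [0, 0, 0, 0, 1, 2, 2]

# Per-class precision, then the two averaging schemes
per_class = precision_score(y_true_demo, y_pred_demo, average=None)
macro = precision_score(y_true_demo, y_pred_demo, average='macro')
weighted = precision_score(y_true_demo, y_pred_demo, average='weighted')

print("per class:", per_class)
print("macro:    {:.4f}".format(macro))
print("weighted: {:.4f}".format(weighted))
```

Here class 2 is predicted twice but is correct only once, which pulls the macro average down more than the weighted one because class 2 has a single true sample.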

## Models Created, Ran, and Evaluated

We create, run, and evaluate the following models:

- **K-Nearest Neighbors**
- **Decision Tree**
- **Random Forest**
- **AdaBoost**
- **Support Vector Machine**
- **Logistic Regression**
- **CatBoost**
- **XGBoost**

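The self-contained cell below trains each model, prints its metrics on the training and test sets, and plots the fitted decision tree. Note that it redefines `evaluate_model` with macro averaging, so its precision, recall, and F1 values are not directly comparable to those of the weighted variant defined earlier: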

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from catboost import CatBoostClassifier
from xgboost import XGBClassifier

# Load dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Models
models = {
    "Logistic Regression": LogisticRegression(),
    "K-Neighbors Classifier": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(),
    "Random Forest Classifier": RandomForestClassifier(),
    "SVC": SVC(),
    "XGBClassifier": XGBClassifier(),
    "CatBoost Classifier": CatBoostClassifier(verbose=False),
    "AdaBoost Classifier": AdaBoostClassifier()
}

model_list = []
accuracy_list = []

def evaluate_model(y_true, y_pred):
    accuracy = accuracy_score(y_true, y_pred)
    precision = precision_score(y_true, y_pred, average='macro')
    recall = recall_score(y_true, y_pred, average='macro')
    f1 = f1_score(y_true, y_pred, average='macro')
    return accuracy, precision, recall, f1

# Train, evaluate, and visualize each model
for model_name, model in models.items():
    model.fit(X_train, y_train)  # Train model

    # Make predictions
    y_train_pred = model.predict(X_train)
    y_test_pred = model.predict(X_test)
    
    # Evaluate Train and Test dataset
    model_train_accuracy, model_train_precision, model_train_recall, model_train_f1 = evaluate_model(y_train, y_train_pred)
    model_test_accuracy, model_test_precision, model_test_recall, model_test_f1 = evaluate_model(y_test, y_test_pred)

    print(model_name)
    model_list.append(model_name)
    
    print('Model performance for Training set')
    print("- Accuracy: {:.4f}".format(model_train_accuracy))
    print("- Precision: {:.4f}".format(model_train_precision))
    print("- Recall: {:.4f}".format(model_train_recall))
    print("- F1 Score: {:.4f}".format(model_train_f1))

    print('----------------------------------')
    
    print('Model performance for Test set')
    print("- Accuracy: {:.4f}".format(model_test_accuracy))
    print("- Precision: {:.4f}".format(model_test_precision))
    print("- Recall: {:.4f}".format(model_test_recall))
    print("- F1 Score: {:.4f}".format(model_test_f1))
    accuracy_list.append(model_test_accuracy)
    
    print('='*35)
    print('\n')

    # Visualize Decision Tree
    if model_name == "Decision Tree":
        plt.figure(figsize=(20, 10))
        plot_tree(model, filled=True, feature_names=iris.feature_names, class_names=list(iris.target_names))
        plt.title("Decision Tree Visualization")
        plt.show()
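
The loop above fills `model_list` and `accuracy_list` but never displays them. One possible way to use such lists is to rank the models in a DataFrame. The sketch below is self-contained and, to stay dependency-light, covers only three of the scikit-learn models. The `max_iter=1000` setting and the `candidates`, `rows`, and `results` names are assumptions of this sketch, not part of the original code.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Same data and split as the cell above
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# A reduced, illustrative set of models
candidates = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Neighbors Classifier": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
}

# Fit each model and record its test accuracy
rows = []
for name, clf in candidates.items():
    clf.fit(X_train, y_train)
    rows.append({"Model": name, "Test accuracy": accuracy_score(y_test, clf.predict(X_test))})

# Rank models from highest to lowest test accuracy
results = (
    pd.DataFrame(rows)
    .sort_values("Test accuracy", ascending=False)
    .reset_index(drop=True)
)
print(results)
```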


## Prediction

In [None]:
# Function to get user input
def get_user_input():
    print("Please enter the following features of the Iris flower:")
    sepal_length = float(input("Sepal length (cm): "))
    sepal_width = float(input("Sepal width (cm): "))
    petal_length = float(input("Petal length (cm): "))
    petal_width = float(input("Petal width (cm): "))
    return np.array([[sepal_length, sepal_width, petal_length, petal_width]])

# Function to choose model
def choose_model():
    print("Choose a model to make the prediction:")
    for i, model_name in enumerate(models.keys(), 1):
        print(f"{i}. {model_name}")
    choice = int(input("Enter the number corresponding to the model: "))
    model_name = list(models.keys())[choice - 1]
    return models[model_name], model_name

# Make a prediction with the chosen model
X_new = get_user_input()
model, model_name = choose_model()
prediction = model.predict(X_new)
# Flatten first: some models (e.g. CatBoost) may return a 2-D array of labels
predicted_label = int(np.ravel(prediction)[0])
species = {0: 'Iris-setosa', 1: 'Iris-versicolor', 2: 'Iris-virginica'}
print(f"The predicted species using {model_name} is: {species[predicted_label]}")
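
The interactive cell above relies on `input()`, which is awkward to test or reuse. A non-interactive variant might look like the sketch below. `predict_species` is a hypothetical helper, and the k-nearest-neighbors model trained on the full dataset is only a stand-in for whichever fitted model you pick. It returns scikit-learn's own class names (`setosa`, `versicolor`, `virginica`).

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# Stand-in model for the sketch: KNN fitted on the full dataset
iris_data = load_iris()
knn = KNeighborsClassifier().fit(iris_data.data, iris_data.target)

def predict_species(model, features, target_names=iris_data.target_names):
    """Return the predicted species name for one flower (hypothetical helper)."""
    features = np.asarray(features, dtype=float)
    if features.shape != (4,):
        raise ValueError(
            "expected 4 values: sepal length, sepal width, petal length, petal width"
        )
    # ravel() copes with models that return a 2-D array of labels
    label = int(np.ravel(model.predict(features.reshape(1, -1)))[0])
    return str(target_names[label])

# Measurements in cm: sepal length, sepal width, petal length, petal width
print(predict_species(knn, [5.1, 3.5, 1.4, 0.2]))
```

Validating the input shape up front gives a clearer error than the one a model raises when it receives the wrong number of features.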


## Conclusion

This project trains eight machine learning models to classify Iris flower species and scores each one on accuracy, precision, recall, and F1 score for both the training and test sets. Comparing the printed metrics side by side shows how the algorithms differ on this small, balanced dataset.

## Acknowledgements

This project was completed as part of the Let's Grow More Virtual Internship Program. Special thanks to the Let's Grow More team for providing this opportunity.

---

