**IRIS DATASET**
The Iris dataset contains 150 samples of iris flowers, 50 from each of three species (setosa, versicolor, and virginica). Each sample has four features, all measured in centimetres: sepal length, sepal width, petal length, and petal width.

This notebook walks through applying a simple machine learning algorithm, k-nearest neighbors (KNN), to this dataset using Python and the scikit-learn library.

***Step-by-Step Guide***
 1. Load the dataset
 2. Preprocess the data (separate features and labels)
 3. Split the dataset
 4. Train a model
 5. Evaluate the model

In [1]:
# importing libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

# For KNN
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix


In [42]:
# Load the dataset and build a DataFrame from the features and target
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
# Integer class labels: 0 = setosa, 1 = versicolor, 2 = virginica
df['species'] = iris.target
print(df)

     sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  \
0                  5.1               3.5                1.4               0.2   
1                  4.9               3.0                1.4               0.2   
2                  4.7               3.2                1.3               0.2   
3                  4.6               3.1                1.5               0.2   
4                  5.0               3.6                1.4               0.2   
..                 ...               ...                ...               ...   
145                6.7               3.0                5.2               2.3   
146                6.3               2.5                5.0               1.9   
147                6.5               3.0                5.2               2.0   
148                6.2               3.4                5.4               2.3   
149                5.9               3.0                5.1               1.8   

     species  
0          0

In [43]:
df.head(10)

   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  species
0                5.1               3.5                1.4               0.2        0
1                4.9               3.0                1.4               0.2        0
2                4.7               3.2                1.3               0.2        0
3                4.6               3.1                1.5               0.2        0
4                5.0               3.6                1.4               0.2        0
5                5.4               3.9                1.7               0.4        0
6                4.6               3.4                1.4               0.3        0
7                5.0               3.4                1.5               0.2        0
8                4.4               2.9                1.4               0.2        0
9                4.9               3.1                1.5               0.1        0


In [46]:
# Features (X) and labels (y)
X = df.drop('species', axis=1)
y = df['species']
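
Iris needs very little preprocessing, but it is worth confirming that before moving on. The following is a minimal sketch of two quick checks, missing values and class balance; the names `n_missing` and `class_counts` are illustrative and not part of the original notebook.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Rebuild the same DataFrame as above so the sketch runs on its own
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['species'] = iris.target

# Total number of missing values across all columns
n_missing = int(df.isna().sum().sum())

# Samples per class, labelled with the species names instead of 0/1/2
label_names = {i: str(name) for i, name in enumerate(iris.target_names)}
class_counts = df['species'].map(label_names).value_counts()

print(f'Missing values: {n_missing}')
print(class_counts)
```

With no missing values and equal class sizes, no imputation or rebalancing is needed here.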

In [49]:

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
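
The split above is purely random, so the three species need not be equally represented in the test set (the classification report further down shows supports of 19, 13, and 13). If equal representation matters, one option is the `stratify` argument of `train_test_split`. This is a sketch of that variant, not the split used in the rest of this notebook; `test_counts` is an illustrative name.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# stratify=y keeps the class proportions of y in both the train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Number of test samples per class (index = class label)
test_counts = np.bincount(y_test)
print(test_counts)  # [15 15 15]
```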


In [52]:
# Initialize the model
knn = KNeighborsClassifier(n_neighbors=3)
# Train the model
knn.fit(X_train, y_train)
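
KNN classifies by distance, so features on larger scales can dominate the result. The model above is trained on the raw measurements, which works for Iris because all four features are in centimetres and of similar magnitude. For other datasets, a common option is to standardize the features first. The sketch below assumes scikit-learn's `StandardScaler` and `make_pipeline`; the name `scaled_knn` is illustrative and this step is not part of the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# The pipeline fits the scaler on the training rows only, then applies the
# same scaling before every prediction, which avoids leaking test statistics
scaled_knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))
scaled_knn.fit(X_train, y_train)

scaled_accuracy = scaled_knn.score(X_test, y_test)
print(f'Test accuracy with scaling: {scaled_accuracy:.2f}')
```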

In [56]:
# Make predictions
y_pred = knn.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')

# Detailed classification report
print(classification_report(y_test, y_pred, target_names=iris.target_names))


# print(confusion_matrix(y_test, y_pred))


Accuracy: 1.00
              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        19
  versicolor       1.00      1.00      1.00        13
   virginica       1.00      1.00      1.00        13

    accuracy                           1.00        45
   macro avg       1.00      1.00      1.00        45
weighted avg       1.00      1.00      1.00        45
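
A single 45-sample test split gives a fairly noisy estimate, so a perfect score here should be read with some caution. One way to cross-check it is k-fold cross-validation over a few values of `n_neighbors`. The sketch below assumes 5 folds and an arbitrary list of candidate values; neither choice comes from the original notebook, and `cv_means` is an illustrative name.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Mean cross-validated accuracy for each candidate number of neighbors
cv_means = {}
for k in (1, 3, 5, 7, 9):
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5)
    cv_means[k] = scores.mean()
    print(f'k={k}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})')
```

Comparing the mean and spread across folds gives a steadier picture of how the model generalizes than one split alone.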



In [58]:
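# Confusion matrix: rows are true classes, columns are predicted classes
conf_matrix = confusion_matrix(y_test, y_pred)

# Metrics for the setosa class (class 0)
# Precision = true positives / all samples predicted as setosa (column 0)
setosa_precision = conf_matrix[0, 0] / conf_matrix[:, 0].sum()
# Recall = true positives / all samples that are truly setosa (row 0)
setosa_recall = conf_matrix[0, 0] / conf_matrix[0, :].sum()
# F1 = harmonic mean of precision and recall
setosa_f1_score = 2 * (setosa_precision * setosa_recall) / (setosa_precision + setosa_recall)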

print(f'Setosa Precision: {setosa_precision:.2f}')
print(f'Setosa Recall: {setosa_recall:.2f}')
print(f'Setosa F1-Score: {setosa_f1_score:.2f}')

Setosa Precision: 1.00
Setosa Recall: 1.00
Setosa F1-Score: 1.00
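
The same confusion-matrix arithmetic extends to all three classes at once, and scikit-learn's `precision_recall_fscore_support` can be used to cross-check it. The following is a sketch under the same split and model settings as above; the `manual_*` names are illustrative, and the guarded division is an added precaution for classes that are never predicted.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
y_pred = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train).predict(X_test)

# Rows are true classes, columns are predicted classes
conf_matrix = confusion_matrix(y_test, y_pred)
true_positives = np.diag(conf_matrix).astype(float)
predicted_totals = conf_matrix.sum(axis=0)  # column sums
actual_totals = conf_matrix.sum(axis=1)     # row sums

# Guarded division: a class with a zero total gets a score of 0 instead of NaN
manual_precision = np.divide(true_positives, predicted_totals,
                             out=np.zeros(len(true_positives)),
                             where=predicted_totals > 0)
manual_recall = np.divide(true_positives, actual_totals,
                          out=np.zeros(len(true_positives)),
                          where=actual_totals > 0)

# Library equivalent, returning one value per class
precision, recall, f1, support = precision_recall_fscore_support(
    y_test, y_pred, zero_division=0
)

print('Manual precision: ', manual_precision)
print('Library precision:', precision)
print('Manual recall:    ', manual_recall)
print('Library recall:   ', recall)
```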
