In [71]:
import seaborn as sns
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

In [72]:
df = sns.load_dataset("iris")
df.head()
df.info()
df.describe()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   sepal_length  150 non-null    float64
 1   sepal_width   150 non-null    float64
 2   petal_length  150 non-null    float64
 3   petal_width   150 non-null    float64
 4   species       150 non-null    object 
dtypes: float64(4), object(1)
memory usage: 6.0+ KB


In [73]:
X = df.drop(columns=["species"])  
y = df["species"]  

In [74]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [75]:
clf = RandomForestClassifier(random_state=42)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
y_proba = clf.predict_proba(X_test)  

In [76]:
y_pred

array(['versicolor', 'setosa', 'virginica', 'versicolor', 'versicolor',
       'setosa', 'versicolor', 'virginica', 'versicolor', 'versicolor',
       'virginica', 'setosa', 'setosa', 'setosa', 'setosa', 'versicolor',
       'virginica', 'versicolor', 'versicolor', 'virginica', 'setosa',
       'virginica', 'setosa', 'virginica', 'virginica', 'virginica',
       'virginica', 'virginica', 'setosa', 'setosa', 'setosa', 'setosa',
       'versicolor', 'setosa', 'setosa', 'virginica', 'versicolor',
       'setosa', 'setosa', 'setosa', 'virginica', 'versicolor',
       'versicolor', 'setosa', 'setosa'], dtype=object)

In [77]:
y_proba

array([[0.  , 0.97, 0.03],
       [1.  , 0.  , 0.  ],
       [0.  , 0.02, 0.98],
       [0.  , 0.99, 0.01],
       [0.  , 0.92, 0.08],
       [1.  , 0.  , 0.  ],
       [0.  , 1.  , 0.  ],
       [0.  , 0.07, 0.93],
       [0.  , 0.85, 0.15],
       [0.  , 1.  , 0.  ],
       [0.  , 0.09, 0.91],
       [1.  , 0.  , 0.  ],
       [0.95, 0.05, 0.  ],
       [1.  , 0.  , 0.  ],
       [1.  , 0.  , 0.  ],
       [0.  , 0.87, 0.13],
       [0.  , 0.  , 1.  ],
       [0.  , 1.  , 0.  ],
       [0.  , 0.98, 0.02],
       [0.  , 0.  , 1.  ],
       [1.  , 0.  , 0.  ],
       [0.  , 0.07, 0.93],
       [1.  , 0.  , 0.  ],
       [0.  , 0.  , 1.  ],
       [0.  , 0.02, 0.98],
       [0.  , 0.02, 0.98],
       [0.  , 0.03, 0.97],
       [0.  , 0.  , 1.  ],
       [1.  , 0.  , 0.  ],
       [1.  , 0.  , 0.  ],
       [1.  , 0.  , 0.  ],
       [1.  , 0.  , 0.  ],
       [0.01, 0.99, 0.  ],
       [1.  , 0.  , 0.  ],
       [1.  , 0.  , 0.  ],
       [0.  , 0.06, 0.94],
       [0.  , 0.98, 0.02],
 

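### Confusion Matrix

A **confusion matrix** is a table that summarizes the performance of a classification model on data for which the true labels are known. For a binary problem, it holds the counts of:

- **True Positives (TP)**: Positive cases correctly predicted as positive.
- **True Negatives (TN)**: Negative cases correctly predicted as negative.
- **False Positives (FP)**: Negative cases incorrectly predicted as positive.
- **False Negatives (FN)**: Positive cases incorrectly predicted as negative.

With more than two classes, as in the iris data, each row corresponds to a true class and each column to a predicted class. The diagonal holds the correct predictions and every off-diagonal entry is a misclassification. TP, FP, FN, and TN are then defined per class by treating that class as positive and all others as negative.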
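
As a minimal sketch of how the four counts can be read off a multi-class matrix, the snippet below treats each class in turn as "positive". The matrix values and the helper name `one_vs_rest_counts` are assumptions made up for illustration; they are not the notebook's results.

```python
# Illustrative 3-class confusion matrix (rows = true class, columns = predicted class).
# The values are invented for this sketch, not taken from the notebook.
labels = ["setosa", "versicolor", "virginica"]
cm = [
    [19, 0, 0],
    [0, 11, 2],
    [0, 1, 12],
]

def one_vs_rest_counts(cm, k):
    """Return (TP, FP, FN, TN) for class index k, treating k as 'positive'."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                  # true class k, predicted as something else
    fp = sum(row[k] for row in cm) - tp   # predicted as k, true class is something else
    tn = total - tp - fn - fp
    return tp, fp, fn, tn

counts = {name: one_vs_rest_counts(cm, i) for i, name in enumerate(labels)}
for name, (tp, fp, fn, tn) in counts.items():
    print(f"{name}: TP={tp} FP={fp} FN={fn} TN={tn}")
```

Each row sum is the number of true instances of that class, and each column sum is the number of times that class was predicted.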

In [78]:
conf_matrix = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:\n", conf_matrix)

Confusion Matrix:
 [[19  0  0]
 [ 0 13  0]
 [ 0  0 13]]


### Accuracy Metric

**Accuracy** is a commonly used evaluation metric for classification models. It measures the proportion of correctly classified instances (both true positives and true negatives) out of the total number of instances in the dataset.

The formula for calculating accuracy is:

\[
\text{Accuracy} = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{TN} + \text{FP} + \text{FN}}
\]

Where:
- **TP (True Positives)**: Correctly predicted positive instances.
- **TN (True Negatives)**: Correctly predicted negative instances.
- **FP (False Positives)**: Incorrectly predicted as positive.
- **FN (False Negatives)**: Incorrectly predicted as negative.

Accuracy provides a simple way to assess how well the model is performing, but it can be misleading if the dataset is imbalanced (i.e., one class is much more frequent than the others), because a model that always predicts the majority class still scores high.
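
To make the imbalance caveat concrete, here is a small sketch on a hypothetical binary dataset with 95 negatives and 5 positives. The data and the variable names are assumptions for illustration only.

```python
# Hypothetical imbalanced binary problem: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5

# A "model" that always predicts the majority class.
y_pred_majority = [0] * 100

# Accuracy = correct predictions / all predictions.
correct = sum(t == p for t, p in zip(y_true, y_pred_majority))
accuracy_majority = correct / len(y_true)

# True positives: positive cases the model actually found.
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred_majority))

print(f"Accuracy: {accuracy_majority:.2f}, positives found: {tp} of 5")
```

The accuracy comes out at 0.95 even though not a single positive case is detected, which is why the metrics in the following sections are worth checking alongside it.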

In [79]:
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy", accuracy)

Accuracy 1.0


### Precision Metric

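**Precision** measures how reliable the model's positive predictions are. It is the ratio of correctly predicted positive instances to all instances predicted as positive. Precision answers the question: *Of all the instances the model predicted as positive, how many are actually positive?*

The formula for precision is:

\[
\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}
\]

Precision is especially important when the cost of false positives is high. For example, in email spam detection, we want to minimize false positives (legitimate emails marked as spam).

For a multi-class problem such as iris, `average='weighted'` computes precision for each class and averages the results, weighting each class by its support (its number of true instances).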
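
The sketch below works through per-class precision and a support-weighted average by hand. The confusion matrix values and the helper name `precision_for_class` are assumptions for illustration, not the notebook's results.

```python
# Illustrative confusion matrix (rows = true class, columns = predicted class).
labels = ["setosa", "versicolor", "virginica"]
cm = [
    [19, 0, 0],
    [0, 11, 2],
    [0, 1, 12],
]

def precision_for_class(cm, k):
    """Precision for class k: TP / (TP + FP), i.e. diagonal entry over column sum."""
    tp = cm[k][k]
    predicted_k = sum(row[k] for row in cm)   # TP + FP
    return tp / predicted_k if predicted_k else 0.0

per_class_precision = [precision_for_class(cm, k) for k in range(len(labels))]

# Support = number of true instances of each class (row sums).
support = [sum(row) for row in cm]

# Weighted average: each class contributes in proportion to its support.
weighted_precision = (
    sum(p * s for p, s in zip(per_class_precision, support)) / sum(support)
)

for name, p in zip(labels, per_class_precision):
    print(f"{name}: precision={p:.3f}")
print(f"Weighted precision: {weighted_precision:.3f}")
```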

In [80]:
precision = precision_score(y_test, y_pred, average='weighted') 
print(f"Precision (Weighted): {precision}")

Precision (Weighted): 1.0


### Recall Metric

**Recall** (also known as **sensitivity** or **true positive rate**) measures the proportion of actual positive instances that are correctly identified by the model. It answers the question: *Of all the instances that are actually positive, how many did the model correctly classify?*

The formula for recall is:

\[
\text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}
\]

Where:
- **TP (True Positives)**: Correctly predicted positive instances.
- **FN (False Negatives)**: Incorrectly predicted as negative.

Recall is crucial when the cost of false negatives is high, such as in medical diagnoses where missing a positive case (e.g., cancer detection) could be dangerous.
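
As a hedged illustration, per-class recall can be computed from a confusion matrix by dividing each diagonal entry by its row sum. The matrix values and the helper name `recall_for_class` are assumptions for this sketch.

```python
# Illustrative confusion matrix (rows = true class, columns = predicted class).
labels = ["setosa", "versicolor", "virginica"]
cm = [
    [19, 0, 0],
    [0, 11, 2],
    [0, 1, 12],
]

def recall_for_class(cm, k):
    """Recall for class k: TP / (TP + FN), i.e. diagonal entry over row sum."""
    tp = cm[k][k]
    actual_k = sum(cm[k])   # TP + FN
    return tp / actual_k if actual_k else 0.0

per_class_recall = [recall_for_class(cm, k) for k in range(len(labels))]
support = [sum(row) for row in cm]

# Support-weighted average of the per-class recalls.
weighted_recall = sum(r * s for r, s in zip(per_class_recall, support)) / sum(support)

# Overall accuracy: correct predictions (diagonal) over all predictions.
accuracy_from_cm = sum(cm[k][k] for k in range(len(labels))) / sum(support)

print(f"Weighted recall: {weighted_recall:.4f}")
print(f"Accuracy:        {accuracy_from_cm:.4f}")
```

Because each class's recall is weighted by its own support, the supports cancel and the support-weighted recall always equals the overall accuracy.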


In [81]:
recall = recall_score(y_test, y_pred, average='weighted')
print(f"Recall (Weighted): {recall}")

Recall (Weighted): 1.0


In [82]:
conf_matrix = confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='weighted') 
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')
roc_auc = roc_auc_score(y_test, y_proba, multi_class='ovr')

### F1 Score

The **F1 Score** is the harmonic mean of **precision** and **recall**. It provides a balance between the two metrics and is particularly useful when the class distribution is imbalanced. The formula for F1 score is:

\[
\text{F1 Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}
\]

The F1 score helps to balance the trade-off between precision and recall. It is especially useful when you need to account for both false positives and false negatives.
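
A short sketch of the formula, using invented precision and recall values, shows how the harmonic mean penalizes a large gap between the two. The function name `f1_from` is hypothetical.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Balanced case: F1 equals the shared value.
balanced = f1_from(0.8, 0.8)

# Lopsided case: perfect precision but very low recall.
lopsided = f1_from(1.0, 0.1)
arithmetic_mean = (1.0 + 0.1) / 2

print(f"Balanced F1:     {balanced:.3f}")
print(f"Lopsided F1:     {lopsided:.3f}")
print(f"Arithmetic mean: {arithmetic_mean:.3f}")
```

In the lopsided case the arithmetic mean is 0.55 while F1 is about 0.18, so a model cannot reach a high F1 by excelling at only one of the two metrics.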



In [83]:
f1 = f1_score(y_test, y_pred, average='weighted')
print(f"F1 Score (Weighted): {f1}")

F1 Score (Weighted): 1.0


### ROC AUC

**ROC AUC** is a metric that evaluates how well the classifier distinguishes between classes. The **Receiver Operating Characteristic (ROC)** curve plots the **True Positive Rate (Recall)** against the **False Positive Rate** as the decision threshold varies. The **Area Under the Curve (AUC)** is the area under that curve. The AUC value ranges from 0 to 1:
- **AUC = 1**: The model separates the classes perfectly.
- **AUC = 0.5**: The model performs no better than random guessing.
- **AUC < 0.5**: The model performs worse than random guessing.

For multi-class classification, `roc_auc_score` with `multi_class='ovr'` uses the "one-vs-rest" (OvR) method: it computes one AUC per class, treating that class as positive and all others as negative, and averages the results (macro average by default). It needs the class probabilities from `predict_proba`, not the hard predictions.
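
As a rough sketch of the one-vs-rest idea, the code below uses the fact that a binary AUC equals the probability that a randomly chosen positive receives a higher score than a randomly chosen negative (ties count as half). The toy labels, the probabilities, and the helper names `binary_auc` and `ovr_macro_auc` are assumptions for illustration; this is not how scikit-learn implements the metric internally.

```python
def binary_auc(scores, is_positive):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))

def ovr_macro_auc(y_true, proba, classes):
    """One-vs-rest: one binary AUC per class, then an unweighted (macro) mean."""
    aucs = []
    for k, cls in enumerate(classes):
        scores = [row[k] for row in proba]            # predicted probability of class k
        is_positive = [label == cls for label in y_true]
        aucs.append(binary_auc(scores, is_positive))
    return sum(aucs) / len(aucs)

# Toy data, invented for this sketch (columns follow the order of `classes`).
classes = ["setosa", "versicolor", "virginica"]
y_true_toy = ["setosa", "versicolor", "virginica", "versicolor", "virginica"]
proba_toy = [
    [0.90, 0.10, 0.00],
    [0.05, 0.60, 0.35],
    [0.00, 0.30, 0.70],
    [0.10, 0.40, 0.50],
    [0.00, 0.55, 0.45],
]

macro_auc = ovr_macro_auc(y_true_toy, proba_toy, classes)
print(f"OvR macro AUC: {macro_auc:.4f}")
```

The pairwise loop is quadratic in the number of samples, which is fine for a toy example but not how one would compute AUC on large data.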



In [84]:
roc_auc = roc_auc_score(y_test, y_proba, multi_class='ovr')
print(f"ROC-AUC: {roc_auc}")

ROC-AUC: 1.0
