**Lab: Evaluating Machine Learning Models with Metrics**

In this lab, you'll evaluate a machine learning model (K-Nearest Neighbors) using various classification and regression metrics. The goal is to understand how different metrics reflect model performance and how they can guide improvements.

**Lab Setup:**

1) Dataset: For this lab, we'll use the Iris dataset for classification.
2) Programming Language: Python
3) Libraries: pandas, numpy, scikit-learn, matplotlib

Make sure the libraries listed above are installed. From a notebook cell, you can install them with the %pip magic:

In [None]:
%pip install pandas numpy scikit-learn matplotlib

**Load the Iris Dataset**

1. Import necessary libraries and load the Iris dataset.

In [1]:
import numpy as np
import pandas as pd
from sklearn import datasets

# Load Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

2. Print the first few rows of the dataset to familiarize yourself with the features and target labels.

In [None]:
# Build a DataFrame for easier inspection; the target stays as integer class codes
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target
print(df.head())
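
To get a feel for the target labels beyond the first few rows, it can help to count the samples per class. The following is a minimal sketch; the names `species` and `class_counts` are illustrative and not part of the lab.

```python
import pandas as pd
from sklearn import datasets

iris = datasets.load_iris()

# Map each integer target code (0, 1, 2) to its species name
species = pd.Series(iris.target).map(dict(enumerate(iris.target_names)))

# Count how many samples belong to each class
class_counts = species.value_counts()
print(class_counts)
```

Balanced classes matter later in the lab, because accuracy is easier to interpret when no class dominates the data.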

**Split the Data into Training and Testing Sets**

1. Split the dataset into training (80%) and testing (20%) sets.

In [3]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
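
A quick sanity check on the split is to look at the resulting shapes and at how many test samples fall into each class. The sketch below repeats the split from this step and, as an optional variation that the lab itself does not use, shows the `stratify` argument of `train_test_split`, which keeps class proportions equal across the two sets. The `_s` suffixed names are illustrative.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, y = datasets.load_iris(return_X_y=True)

# Same split as in the lab
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print("Train shape:", X_train.shape)
print("Test shape:", X_test.shape)

# Number of test samples per class (index = class label)
test_counts = np.bincount(y_test, minlength=3)
print("Test samples per class:", test_counts)

# Optional variation (not used in the lab): a stratified split
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
stratified_counts = np.bincount(y_test_s, minlength=3)
print("Stratified test samples per class:", stratified_counts)
```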

**Train a K-Nearest Neighbors (KNN) Model**

1. Import and train a KNN classifier with **k=3**.

In [None]:
from sklearn.neighbors import KNeighborsClassifier

# Initialize KNN with k=3
knn = KNeighborsClassifier(n_neighbors=3)

# Train the model
knn.fit(X_train, y_train)

**Make Predictions**

1. Use the trained KNN model to make predictions on the test set.

In [5]:
# Predict on the test set
y_pred = knn.predict(X_test)

**Evaluate the Model Using Classification Metrics**

1. **Accuracy:** Compute the accuracy of the model.

In [None]:
from sklearn.metrics import accuracy_score

accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
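
Accuracy is simply the fraction of test samples whose predicted label matches the true label. As a minimal sketch, the block below rebuilds the model from the earlier steps so that it runs on its own, then computes the same number by hand; `manual_accuracy` is an illustrative name.

```python
import numpy as np
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Rebuild the lab's model so this cell is self-contained
X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
y_pred = knn.predict(X_test)

# Fraction of correct predictions, computed by hand
manual_accuracy = np.mean(y_pred == y_test)
sklearn_accuracy = accuracy_score(y_test, y_pred)

print(f"Manual accuracy:  {manual_accuracy:.4f}")
print(f"sklearn accuracy: {sklearn_accuracy:.4f}")
```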

2. **Confusion Matrix:** Compute and display the confusion matrix.

In [None]:
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:\n", cm)
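
In scikit-learn's confusion matrix, rows correspond to true classes and columns to predicted classes, so correct predictions sit on the diagonal and every off-diagonal entry is an error. The sketch below, which rebuilds the model so that it runs on its own, shows one way to count errors overall and per true class; `n_errors` and `errors_per_class` are illustrative names.

```python
import numpy as np
from sklearn import datasets
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Rebuild the lab's model so this cell is self-contained
X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
y_pred = knn.predict(X_test)

cm = confusion_matrix(y_test, y_pred)

# Total errors: everything that is not on the diagonal
n_errors = cm.sum() - np.trace(cm)

# Errors per true class: row total minus the diagonal entry
errors_per_class = cm.sum(axis=1) - np.diag(cm)

print("Misclassified samples:", n_errors)
print("Errors per true class:", errors_per_class)
```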

3. **Precision, Recall, and F1 Score:** Compute precision, recall, and F1 score.

In [None]:
from sklearn.metrics import precision_score, recall_score, f1_score

precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')

print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"F1 Score: {f1:.2f}")
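
The weighted averages above hide how each class performs individually. As a rough sketch, per-class precision and recall can be read directly off the confusion matrix: precision divides each diagonal entry by its column total, and recall divides it by its row total. The block assumes every class appears at least once in both the true and predicted labels (otherwise a division by zero occurs), and its variable names are illustrative.

```python
import numpy as np
from sklearn import datasets
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Rebuild the lab's model so this cell is self-contained
X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
y_pred = knn.predict(X_test)

cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])
tp = np.diag(cm)  # correct predictions per class

# Assumes no class has a zero row or column total
precision_per_class = tp / cm.sum(axis=0)  # TP / predicted as class
recall_per_class = tp / cm.sum(axis=1)     # TP / actually in class

# Weighting recall by class support reproduces the 'weighted' average
support = cm.sum(axis=1)
weighted_recall = np.average(recall_per_class, weights=support)

print("Precision per class:", precision_per_class)
print("Recall per class:   ", recall_per_class)
print(f"Weighted recall:     {weighted_recall:.4f}")
print(f"Accuracy:            {accuracy_score(y_test, y_pred):.4f}")
```

Note that support-weighted recall is always equal to accuracy, so reporting both adds no new information; per-class values or `average='macro'` are more revealing when classes behave differently.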

**Evaluate the Model Using Regression Metrics**

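Although this is a classification task, you can apply regression metrics to the predictions for demonstration purposes by treating the integer class labels (0, 1, 2) as continuous values. Keep in mind that the Iris class labels are nominal: the numeric codes carry no order or distance, so these scores depend on the arbitrary label encoding and are not meaningful measures of classifier quality.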

1. **Mean Absolute Error (MAE):** Compute the MAE.

In [None]:
from sklearn.metrics import mean_absolute_error

mae = mean_absolute_error(y_test, y_pred)
print(f"Mean Absolute Error: {mae:.2f}")

2. **Mean Squared Error (MSE):** Compute the MSE.

In [None]:
from sklearn.metrics import mean_squared_error

mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse:.2f}")

3. **R-Squared (R²):** Compute the R-squared score.

In [None]:
from sklearn.metrics import r2_score

r2 = r2_score(y_test, y_pred)
print(f"R-Squared: {r2:.2f}")
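
One way to see why regression metrics should be read with caution on class labels is to renumber the classes and recompute them. The sketch below uses a small hand-made example rather than the Iris predictions, so the labels and the `swap` mapping are purely hypothetical. The same single mistake gives a different MAE depending on which integer codes the two classes happen to have, while accuracy is unchanged.

```python
import numpy as np
from sklearn.metrics import accuracy_score, mean_absolute_error

# Hypothetical labels: one mistake, class 0 predicted as class 2
y_true = np.array([0, 0, 1, 1, 2, 2])
y_hat = np.array([0, 2, 1, 1, 2, 2])

# Hypothetical relabeling that swaps the codes of classes 1 and 2
swap = np.array([0, 2, 1])
y_true_swapped = swap[y_true]
y_hat_swapped = swap[y_hat]

mae_original = mean_absolute_error(y_true, y_hat)                # |0 - 2| / 6
mae_swapped = mean_absolute_error(y_true_swapped, y_hat_swapped)  # |0 - 1| / 6
acc_original = accuracy_score(y_true, y_hat)
acc_swapped = accuracy_score(y_true_swapped, y_hat_swapped)

print(f"MAE original encoding: {mae_original:.3f}")
print(f"MAE swapped encoding:  {mae_swapped:.3f}")
print(f"Accuracy (both):       {acc_original:.3f}, {acc_swapped:.3f}")
```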

**Lab Questions:**

1) **Accuracy:** Based on the accuracy score, would you say that the model performs well? What are the limitations of using accuracy as a metric?
2) **Confusion Matrix:** Analyze the confusion matrix. How many instances did the model classify incorrectly? Which class did the model struggle with the most?
3) **Precision, Recall, and F1 Score:** Between precision, recall, and F1 score, which one do you think is the most informative in this case? Why?
4) **Regression Metrics:** Even though this is a classification task, what insights can regression metrics like MAE and MSE provide?
5) **Improvements:** Based on the metrics you computed, what are some potential ways to improve the model’s performance (e.g., tuning hyperparameters, using a different model)?
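
As a starting point for the last question, the following sketch compares several values of k using 5-fold cross-validation on the training set only, so the test set stays untouched for the final evaluation. The candidate values and the names `candidate_ks`, `cv_means`, and `best_k` are assumptions chosen for illustration, not a recommended search grid.

```python
from sklearn import datasets
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Illustrative candidate values for the number of neighbors
candidate_ks = [1, 3, 5, 7, 9, 11]

cv_means = {}
for k in candidate_ks:
    model = KNeighborsClassifier(n_neighbors=k)
    # 5-fold cross-validated accuracy on the training data only
    scores = cross_val_score(model, X_train, y_train, cv=5)
    cv_means[k] = scores.mean()
    print(f"k={k:2d}: mean CV accuracy = {cv_means[k]:.3f}")

# Pick the k with the highest mean cross-validated accuracy
best_k = max(cv_means, key=cv_means.get)
print("Best k by cross-validation:", best_k)
```

On a dataset this small, differences between neighboring values of k are often within the fold-to-fold variation, so treat the selected value as a rough guide.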