# Lesson 8: Algorithm Evaluation Metrics

## Classification

### Use the Accuracy and Cohen Kappa Score on a classification problem

In [31]:
from sklearn.datasets import load_wine # Classification dataset
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score

In [2]:
# Load the wine dataset
wine_data = load_wine()

In [3]:
# Convert the dataset to a pandas DataFrame
df = pd.DataFrame(data=wine_data.data, columns=wine_data.feature_names)

In [4]:
# Display the first few rows of the DataFrame
df.head(10)

,alcohol,malic_acid,ash,alcalinity_of_ash,magnesium,total_phenols,flavanoids,nonflavanoid_phenols,proanthocyanins,color_intensity,hue,od280/od315_of_diluted_wines,proline
0,14.23,1.71,2.43,15.6,127.0,2.8,3.06,0.28,2.29,5.64,1.04,3.92,1065.0
1,13.2,1.78,2.14,11.2,100.0,2.65,2.76,0.26,1.28,4.38,1.05,3.4,1050.0
2,13.16,2.36,2.67,18.6,101.0,2.8,3.24,0.3,2.81,5.68,1.03,3.17,1185.0
3,14.37,1.95,2.5,16.8,113.0,3.85,3.49,0.24,2.18,7.8,0.86,3.45,1480.0
4,13.24,2.59,2.87,21.0,118.0,2.8,2.69,0.39,1.82,4.32,1.04,2.93,735.0
5,14.2,1.76,2.45,15.2,112.0,3.27,3.39,0.34,1.97,6.75,1.05,2.85,1450.0
6,14.39,1.87,2.45,14.6,96.0,2.5,2.52,0.3,1.98,5.25,1.02,3.58,1290.0
7,14.06,2.15,2.61,17.6,121.0,2.6,2.51,0.31,1.25,5.05,1.06,3.58,1295.0
8,14.83,1.64,2.17,14.0,97.0,2.8,2.98,0.29,1.98,5.2,1.08,2.85,1045.0
9,13.86,1.35,2.27,16.0,98.0,2.98,3.15,0.22,1.85,7.22,1.01,3.55,1045.0


In [26]:
# Select the features and target labels
X = wine_data.data
y = wine_data.target
class_names = wine_data.target_names

# Print the shape of X and Y
print("Shape of X:", X.shape)
print("Shape of Y:", y.shape)

Shape of X: (178, 13)
Shape of Y: (178,)


In [27]:
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and train classifier 
k = 3
clf = KNeighborsClassifier(n_neighbors=k)
clf.fit(X_train, y_train)

# Predict the labels for the test set
y_pred = clf.predict(X_test)

In [28]:
# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)

# Calculate Cohen's Kappa
kappa = cohen_kappa_score(y_test, y_pred)

print("Accuracy:", accuracy)
print("Cohen's Kappa:", kappa)

Accuracy: 0.8055555555555556
Cohen's Kappa: 0.704225352112676


* **Accuracy:** The ratio of correctly predicted instances to the total number of instances. Here, 29 of the 36 test samples are classified correctly, giving about 0.81. Accuracy is simple to interpret, but it can be misleading on imbalanced datasets where one class dominates, because a model that always predicts the majority class still scores highly. In such cases, precision, recall, and F1-score are often reported alongside it.
* **Cohen's Kappa Score:** A statistic that measures the level of agreement between two raters (or evaluators) on categorical items. For a classification model, the two "raters" are the actual and the predicted class labels, and the score corrects for agreement that would occur by chance: `kappa = (p_o - p_e) / (1 - p_e)`, where `p_o` is the observed agreement (the accuracy) and `p_e` is the agreement expected by chance given the class frequencies. Cohen's Kappa ranges from -1 to 1, where 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance.
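
To make the contrast between the two metrics concrete, here is a minimal pure-Python sketch. The helper names `simple_accuracy` and `simple_kappa` and the toy labels are hypothetical and are not part of the notebook. The sketch assumes a deliberately imbalanced problem (90 samples of class 0, 10 of class 1) and a "model" that always predicts the majority class.

```python
from collections import Counter


def simple_accuracy(y_true, y_pred):
    """Fraction of positions where the two label sequences agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def simple_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e). Assumes p_e < 1."""
    n = len(y_true)
    p_o = simple_accuracy(y_true, y_pred)   # observed agreement
    true_counts = Counter(y_true)           # class frequencies in the true labels
    pred_counts = Counter(y_pred)           # class frequencies in the predictions
    # Chance agreement: probability that both "raters" pick the same class
    # if each labels independently according to its own class frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n**2
    return (p_o - p_e) / (1 - p_e)


# Hypothetical imbalanced labels: 90 of class 0, 10 of class 1
y_true_toy = [0] * 90 + [1] * 10
# A trivial "model" that always predicts the majority class
y_pred_toy = [0] * 100

print("Accuracy:", simple_accuracy(y_true_toy, y_pred_toy))
print("Cohen's Kappa:", simple_kappa(y_true_toy, y_pred_toy))
```

In this toy case the accuracy is 0.9 while kappa is 0, because the chance agreement `p_e` is also 0.9: the predictions carry no information beyond the class frequencies. In practice `cohen_kappa_score` from scikit-learn, as used above, is the better choice; the sketch only illustrates the formula.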

### Generate a confusion matrix and a classification report

In [21]:
from sklearn.metrics import confusion_matrix, classification_report
import matplotlib.pyplot as plt
import seaborn as sns

In [22]:
# Generate confusion matrix
conf_matrix = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(conf_matrix)

Confusion Matrix:
[[12  0  2]
 [ 1 11  2]
 [ 1  1  6]]


In [20]:
# Generate classification report
class_report = classification_report(y_test, y_pred, target_names=wine_data.target_names)
print("\nClassification Report:")
print(class_report)


Classification Report:
              precision    recall  f1-score   support

     class_0       0.86      0.86      0.86        14
     class_1       0.92      0.79      0.85        14
     class_2       0.60      0.75      0.67         8

    accuracy                           0.81        36
   macro avg       0.79      0.80      0.79        36
weighted avg       0.82      0.81      0.81        36



* **Confusion Matrix:** A tabular summary of a classification model's predictions, where each row corresponds to the true class and each column corresponds to the predicted class. Diagonal entries count correct predictions, and off-diagonal entries show which classes are confused with each other. For example, the first row above shows that 12 of the 14 `class_0` samples were predicted correctly and 2 were misclassified as `class_2`. From these counts you can read off the true positives, false positives, and false negatives for each class.

* **Classification Report:** A per-class summary of a classification model's performance. It lists precision, recall, F1-score, and support for each class, followed by the overall accuracy and the macro and weighted averages.
	* **Precision** measures the proportion of true positive predictions among all positive predictions made by the model for that class.
	* **Recall (also known as sensitivity)** measures the proportion of true positive predictions among all actual instances of that class.
	* **F1-score** is the harmonic mean of precision and recall, providing a balanced measure of a model's performance.
	* **Support** refers to the number of true instances of each class in the evaluated labels (here, the test set).
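
The definitions above can be checked by hand against the confusion matrix printed earlier. The following is a minimal sketch in pure Python: the matrix values are copied from the output above, and the helper name `per_class_metrics` is hypothetical. It treats each class in turn as the "positive" class (one-vs-rest) and derives precision, recall, F1-score, and support from the counts.

```python
# Confusion matrix copied from the output above (rows: true class, columns: predicted class)
conf_matrix_values = [
    [12, 0, 2],
    [1, 11, 2],
    [1, 1, 6],
]
report_class_names = ["class_0", "class_1", "class_2"]


def per_class_metrics(matrix):
    """Return one dict of precision, recall, f1, and support per class."""
    n_classes = len(matrix)
    results = []
    for k in range(n_classes):
        tp = matrix[k][k]                                        # true class k, predicted k
        fn = sum(matrix[k]) - tp                                 # true class k, predicted otherwise
        fp = sum(matrix[i][k] for i in range(n_classes)) - tp    # predicted k, true class differs
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append({
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": tp + fn,                                  # number of true instances of class k
        })
    return results


metrics = per_class_metrics(conf_matrix_values)
for name, m in zip(report_class_names, metrics):
    print(f"{name}: precision={m['precision']:.2f} recall={m['recall']:.2f} "
          f"f1={m['f1']:.2f} support={m['support']}")
```

Rounded to two decimals, these values agree with the classification report above, for example precision 0.92 and recall 0.79 for `class_1`.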