In [1]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load dataset
X, y = load_iris(return_X_y=True)

# Split into train/test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Train a classifier
model = DecisionTreeClassifier()
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluation metrics
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred, average='macro'))
print("Recall:", recall_score(y_test, y_pred, average='macro'))
print("F1 Score:", f1_score(y_test, y_pred, average='macro'))

Accuracy: 0.9777777777777777
Precision: 0.9722222222222222
Recall: 0.9814814814814815
F1 Score: 0.975983436853002


## Topic 20 – Evaluation Metrics

Evaluation metrics quantify how well a classification model's predictions match the true labels. The most commonly used are:

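- **Accuracy**: The fraction of all predictions that are correct.
- **Precision**: Of the samples predicted as a given class, the fraction that truly belong to it: TP / (TP + FP).
- **Recall**: Of the samples that truly belong to a given class, the fraction the model identified: TP / (TP + FN).
- **F1 Score**: The harmonic mean of precision and recall, 2 · P · R / (P + R); it is high only when both are high.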
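
The four definitions above can be checked by hand on a small binary example. The sketch below uses toy labels invented for illustration (they are not taken from the Iris run) and treats class `1` as the positive class; the variable names are arbitrary.

```python
# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

pairs = list(zip(y_true, y_pred))

# Count the four outcomes, treating class 1 as "positive".
tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives
tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
print("F1 Score:", f1)
```

Here the counts are TP = 3, FP = 1, FN = 1, TN = 5, so accuracy is 0.8 while precision, recall, and F1 are all 0.75.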

### 🔹 Dataset:
- Iris dataset from `sklearn.datasets`

### 🔹 Implementation:
- Model: `DecisionTreeClassifier`
- Metrics: `accuracy_score`, `precision_score`, `recall_score`, `f1_score`

### 🔹 Average Type:
- `macro`: computes the metric independently for each class and takes the unweighted mean, so every class counts equally regardless of how many samples it has.
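
To see what macro averaging does, the sketch below computes recall per class on an imbalanced toy label set (invented for illustration) and compares the unweighted mean with a support-weighted mean, which is what `average='weighted'` corresponds to in scikit-learn. The helper `recall_for` is a hypothetical name used only in this example.

```python
# Imbalanced toy labels, invented for illustration: class 0 dominates.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 2, 0]
classes = sorted(set(y_true))

def recall_for(cls):
    # Fraction of the true members of `cls` that were predicted as `cls`.
    predictions = [p for t, p in zip(y_true, y_pred) if t == cls]
    return sum(p == cls for p in predictions) / len(predictions)

per_class = [recall_for(c) for c in classes]

# Macro: plain mean over classes, each class counts once.
macro_recall = sum(per_class) / len(per_class)

# Weighted: mean over classes weighted by the number of true samples.
support = [y_true.count(c) for c in classes]
weighted_recall = sum(r * s for r, s in zip(per_class, support)) / len(y_true)

print("Per-class recall:", per_class)
print("Macro recall:", macro_recall)
print("Weighted recall:", weighted_recall)
```

The per-class recalls are 1.0, 0.5, and 0.5. The macro average is about 0.67 because the two small, poorly predicted classes pull it down, while the weighted average is 0.8 because the large class dominates. On a balanced dataset such as Iris the two averages stay close to each other.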
