# F1-Macro Average for Multiclass Classification

In this notebook, we explore how to compute the F1-Macro average for a multiclass classification problem. The macro average is most useful when every class matters equally, including the rare ones: it gives equal weight to the F1 score of each class, regardless of how often that class appears in the dataset.

## Overview of F1-Score
The F1-score is the harmonic mean of precision and recall, so it is high only when both are high. It is particularly useful with imbalanced classes, where accuracy alone can be misleading. The formula is:

$$F1 = 2 \times \frac{Precision \times Recall}{Precision + Recall}$$
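
To make the formula concrete, here is a minimal sketch that computes F1 for a single class from raw counts. The helper name `f1_from_counts` and the example counts are hypothetical and chosen only for illustration; returning 0.0 when a denominator is zero is one common convention, not the only one.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Return F1 for one class from true positives, false positives, and false negatives."""
    # Precision: share of predicted positives that are correct
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that were found
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0  # assumed convention when F1 is undefined
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives
example_f1 = f1_from_counts(tp=8, fp=2, fn=4)
print(round(example_f1, 4))  # precision = 0.8, recall = 8/12, so F1 = 16/22 ≈ 0.7273
```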

### F1-Macro Average
- **Macro-average:** Calculate the F1-score for each class independently and then take the unweighted arithmetic mean of those scores. This treats all classes equally, however many samples each one has.
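
The macro-average can also be written as a formula. As a sketch, assuming $K$ classes and writing $F1_k$ for the F1-score of class $k$ (notation introduced here for illustration):

$$F1_{macro} = \frac{1}{K} \sum_{k=1}^{K} F1_k$$

For example, with hypothetical per-class scores of 0.9, 0.6, and 0.3, the macro average is $(0.9 + 0.6 + 0.3) / 3 = 0.6$, no matter how many samples each class has.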

## Steps in this Notebook:
1. Create a mock dataset for a multiclass classification problem.
2. Train a classifier on the dataset.
3. Compute precision, recall, and F1-scores for each class.
4. Calculate the F1-Macro average.

In [1]:
# Import required libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, f1_score
from sklearn.ensemble import RandomForestClassifier


## Step 1: Create a Mock Dataset
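We'll create a simple mock dataset with three classes and four features for demonstration. Note that the labels are drawn at random, independently of the features, so the classifier has no real signal to learn and the scores below should land near chance level (about 1/3 for three classes).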

In [2]:
# Mock dataset creation
np.random.seed(42)
X = np.random.randn(150, 4)
y = np.random.choice([0, 1, 2], size=150)  # 3 classes: 0, 1, 2

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Display dataset shapes
X_train.shape, X_test.shape, y_train.shape, y_test.shape

## Step 2: Train a Classifier
We'll use a simple Random Forest Classifier for this task.

In [3]:
# Train a Random Forest Classifier
clf = RandomForestClassifier(random_state=42)
clf.fit(X_train, y_train)

# Make predictions
y_pred = clf.predict(X_test)

## Step 3: Compute Precision, Recall, and F1-Score
Now we'll compute the classification report to get precision, recall, and F1 scores for each class.

In [4]:
# Classification report
report = classification_report(y_test, y_pred, output_dict=True)
pd.DataFrame(report).transpose()
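
To see where the per-class numbers in such a report come from, the following sketch derives precision, recall, and F1 from a confusion matrix and checks them against `precision_recall_fscore_support`. The label arrays `y_true_demo` and `y_pred_demo` are hypothetical toy data, not the notebook's test set, and are chosen so that every class is predicted at least once (which avoids division by zero).

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_recall_fscore_support

# Hypothetical toy labels for three classes
y_true_demo = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])
y_pred_demo = np.array([0, 0, 1, 1, 1, 2, 2, 2, 0, 2])

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=[0, 1, 2])

tp = np.diag(cm).astype(float)   # correct predictions per class
precision = tp / cm.sum(axis=0)  # column sums = number of predictions per class
recall = tp / cm.sum(axis=1)     # row sums = number of true samples per class
f1 = 2 * precision * recall / (precision + recall)

# Cross-check against scikit-learn's per-class values
sk_precision, sk_recall, sk_f1, support = precision_recall_fscore_support(
    y_true_demo, y_pred_demo, labels=[0, 1, 2]
)
print(np.allclose(f1, sk_f1))
```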

## Step 4: Calculate F1-Macro Average
In multiclass classification, the F1-macro average treats all classes equally. Passing `average='macro'` to `f1_score` computes it directly.

In [5]:
# Calculate F1-Macro Average
f1_macro = f1_score(y_test, y_pred, average='macro')
print('F1-Macro Average:', f1_macro)
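
As a sanity check, the macro value should equal the plain mean of the per-class F1 scores. The sketch below verifies this on hypothetical toy labels (`y_true_demo`, `y_pred_demo`), using `average=None` to get one F1 score per class and comparing against the `'macro avg'` row of the classification report.

```python
from sklearn.metrics import classification_report, f1_score

# Hypothetical toy labels for three classes
y_true_demo = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred_demo = [0, 0, 1, 1, 1, 2, 2, 2, 0, 2]

# average=None returns an array with one F1 score per class
per_class_f1 = f1_score(y_true_demo, y_pred_demo, average=None)
manual_macro = per_class_f1.mean()

# The same number from the built-in macro option and from the report
sklearn_macro = f1_score(y_true_demo, y_pred_demo, average='macro')
report_macro = classification_report(
    y_true_demo, y_pred_demo, output_dict=True
)['macro avg']['f1-score']

print('Per-class F1:', per_class_f1)
print('Manual macro:', manual_macro)
print('sklearn macro:', sklearn_macro)
```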

## Conclusion
The F1-Macro average provides an equal-weighted average of the F1 scores for all classes, making it a good metric when you care about all classes equally, regardless of their frequency.
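
To illustrate what "regardless of their frequency" means in practice, here is a small sketch comparing the macro average with the support-weighted average on deliberately imbalanced, hypothetical labels. Class 1 is rare and always misclassified; `zero_division=0` is passed so that its undefined precision is counted as 0 without a warning.

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical imbalanced labels: 80 samples of class 0, 10 of class 1, 10 of class 2
y_true_imb = np.array([0] * 80 + [1] * 10 + [2] * 10)
# The model gets classes 0 and 2 right but predicts class 0 for every class-1 sample
y_pred_imb = np.array([0] * 80 + [0] * 10 + [2] * 10)

macro_f1 = f1_score(y_true_imb, y_pred_imb, average='macro', zero_division=0)
weighted_f1 = f1_score(y_true_imb, y_pred_imb, average='weighted', zero_division=0)

print('Macro F1:   ', round(macro_f1, 4))
print('Weighted F1:', round(weighted_f1, 4))
```

In this toy case the macro average is noticeably lower than the weighted one, because the completely missed minority class counts for a full third of the macro score.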