# 🏁 01b – Baseline Model: Grid Position Rule

## 🎯 Objective
Establish a **simple rule-based baseline** for podium prediction:
> Predict podium = `True` if grid position ≤ 3; otherwise, `False`.

This model acts as a dummy classifier to compare future ML models against.

## 📊 Assumptions
- Total drivers per race: 20
- Only 3 drivers finish on the podium
- Podium rate: 3 of 20 drivers (15%), so the classes are imbalanced
- The goal is to see how accurate a naïve grid-based rule is.
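
The 15% podium rate has a direct consequence for accuracy. As a minimal sketch using only the counts assumed above (the variable names are illustrative), a classifier that never predicts a podium is already right for 17 of 20 drivers:

```python
# Counts taken from the assumptions above
n_drivers = 20
n_podium = 3

# Share of drivers who finish on the podium (the positive class)
podium_rate = n_podium / n_drivers

# Accuracy of a classifier that predicts "no podium" for every driver:
# it is correct for all non-podium drivers and wrong for the 3 podium drivers
always_negative_accuracy = (n_drivers - n_podium) / n_drivers

print(f"Podium rate: {podium_rate:.2f}")
print(f"Accuracy of always predicting no podium: {always_negative_accuracy:.2f}")
```

Any accuracy figure for the grid rule should be read against this 0.85 floor.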

In [1]:
# 🧪 Create mock data for a single race
import pandas as pd
import numpy as np

# Simulate grid positions and actual podium results
np.random.seed(42)
drivers = [f'Driver_{i+1}' for i in range(20)]
grid_positions = list(range(1, 21))
# Podium goes to 3 drivers chosen at random, independent of grid position
true_podium = [1 if i < 3 else 0 for i in np.random.permutation(20)]

df = pd.DataFrame({
    'Driver': drivers,
    'GridPosition': grid_positions,
    'TruePodium': true_podium
})
df.head()

| | Driver | GridPosition | TruePodium |
|---|---|---|---|
| 0 | Driver_1 | 1 | 1 |
| 1 | Driver_2 | 2 | 0 |
| 2 | Driver_3 | 3 | 0 |
| 3 | Driver_4 | 4 | 1 |
| 4 | Driver_5 | 5 | 0 |


In [2]:
# 🚦 Apply baseline rule: predict podium if grid position ≤ 3
df['BaselinePrediction'] = (df['GridPosition'] <= 3).astype(int)
df.head(10)  # show the first 10 of the 20 rows

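| | Driver | GridPosition | TruePodium | BaselinePrediction |
|---|---|---|---|---|
| 0 | Driver_1 | 1 | 1 | 1 |
| 1 | Driver_2 | 2 | 0 | 1 |
| 2 | Driver_3 | 3 | 0 | 1 |
| 3 | Driver_4 | 4 | 1 | 0 |
| 4 | Driver_5 | 5 | 0 | 0 |
| 5 | Driver_6 | 6 | 0 | 0 |
| 6 | Driver_7 | 7 | 0 | 0 |
| 7 | Driver_8 | 8 | 0 | 0 |
| 8 | Driver_9 | 9 | 0 | 0 |
| 9 | Driver_10 | 10 | 0 | 0 |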


In [3]:
# 📈 Evaluate performance
from sklearn.metrics import precision_score, f1_score, accuracy_score

precision = precision_score(df['TruePodium'], df['BaselinePrediction'])
f1 = f1_score(df['TruePodium'], df['BaselinePrediction'])
accuracy = accuracy_score(df['TruePodium'], df['BaselinePrediction'])

print(f"Precision: {precision:.2f}")
print(f"F1 Score: {f1:.2f}")
print(f"Accuracy: {accuracy:.2f}")

Precision: 0.33
F1 Score: 0.33
Accuracy: 0.80
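
To see where these scores come from, the sketch below recomputes them from confusion-matrix counts. The counts are inferred from the printed metrics rather than taken from a notebook cell: the rule flags 3 drivers, a precision of 0.33 implies 1 of them is a true podium finisher, and the other 2 podium finishers are therefore missed.

```python
# Confusion-matrix counts inferred from the printed metrics (an assumption,
# not an output of the cells above)
tp = 1   # predicted podium, finished on the podium
fp = 2   # predicted podium, did not finish on the podium
fn = 2   # podium finishers the rule missed
tn = 15  # correctly predicted as non-podium

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + fp + fn + tn)

print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"F1 Score: {f1:.2f}")
print(f"Accuracy: {accuracy:.2f}")
```

Recall equals precision here because the rule predicts exactly as many podiums (3) as actually occur, so false positives and false negatives are equal in number. The 15 true negatives are what lift accuracy to 0.80.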


## 🧠 Interpretation
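This simple baseline provides a reference that any later ML model should beat.

- If an ML model scores below this baseline on precision and F1, it is not learning anything beyond what grid position alone already provides.
- Accuracy (0.80) looks much better than precision and F1 (0.33 each) because only 3 of 20 drivers reach the podium. Predicting "no podium" for every driver would score 0.85 accuracy, so under this class imbalance precision and F1 are the more informative metrics.
- The rule relies only on grid position, and the podium labels in this mock race are random, so these scores illustrate the evaluation procedure rather than the real predictive value of grid position.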
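
Because the mock podium is drawn at random, a single race gives a noisy score. The following sketch, which uses only the standard library, repeats the random draw over many simulated races to estimate the rule's average precision on such data. The function name, the number of races, and the seed are illustrative choices, not part of the notebook.

```python
import random

def simulate_baseline_precision(n_races=10_000, n_drivers=20, n_podium=3, seed=0):
    """Mean precision of the 'grid position <= 3' rule when podiums are random."""
    rng = random.Random(seed)
    predicted = set(range(1, n_podium + 1))  # grid positions 1..3
    total_precision = 0.0
    for _ in range(n_races):
        # Draw 3 distinct grid positions at random as the podium finishers
        actual = set(rng.sample(range(1, n_drivers + 1), n_podium))
        hits = len(predicted & actual)
        total_precision += hits / len(predicted)
    return total_precision / n_races

mean_precision = simulate_baseline_precision()
print(f"Mean precision over simulated races: {mean_precision:.3f}")
```

The average should land near 3/20 = 0.15, the podium base rate. Under these assumptions, the 0.33 from the single mock race reflects one random draw, and a meaningful baseline score requires real race results.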