# 🏁 01b – Baseline Model: Grid Position Rule with Versioned Output

## 🎯 Objective
Establish a rule-based baseline for podium prediction:
> Predict podium = `True` if grid position ≤ 3; otherwise, `False`.

Results are saved automatically to versioned folders, here `v0_1_0`.
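
As a minimal sketch, the rule from the objective can be written as a standalone function. The name `predict_podium` and the `cutoff` parameter are illustrative assumptions; the notebook itself applies the same comparison directly to a DataFrame column.

```python
def predict_podium(grid_position: int, cutoff: int = 3) -> bool:
    """Return True when the starting grid position is within the cutoff.

    `predict_podium` and `cutoff` are hypothetical names used for illustration.
    """
    if grid_position < 1:
        raise ValueError("grid positions start at 1")
    return grid_position <= cutoff


# Positions 1-3 are predicted to reach the podium; everything else is not.
predictions = [predict_podium(p) for p in range(1, 6)]
print(predictions)  # [True, True, True, False, False]
```

Keeping the cutoff as a parameter makes it easy to try a looser rule (for example, top 5) without touching the comparison itself.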

In [1]:
# 📁 Setup paths for versioned output
import os
version = 'v0_1_0'
os.makedirs(f'models/{version}', exist_ok=True)
os.makedirs(f'reports/{version}', exist_ok=True)
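
The same folder setup could be wrapped in a small helper so that every notebook builds its output paths the same way. This is a sketch under assumptions: `versioned_dirs` is a hypothetical name, and a temporary directory stands in for the project root so the example does not touch real files.

```python
import tempfile
from pathlib import Path


def versioned_dirs(root: Path, version: str) -> dict[str, Path]:
    """Create and return models/<version> and reports/<version> under root.

    Hypothetical helper; the notebook calls os.makedirs directly.
    """
    dirs = {kind: root / kind / version for kind in ("models", "reports")}
    for path in dirs.values():
        # parents=True creates intermediate folders; exist_ok=True makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
    return dirs


# A temporary root keeps this example away from a real project tree.
with tempfile.TemporaryDirectory() as tmp:
    dirs = versioned_dirs(Path(tmp), "v0_1_0")
    created = sorted(p.relative_to(tmp).as_posix() for p in dirs.values())
    print(created)  # ['models/v0_1_0', 'reports/v0_1_0']
```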

In [2]:
# 🧪 Create mock data for a single race
import pandas as pd
import numpy as np

np.random.seed(42)
drivers = [f'Driver_{i+1}' for i in range(20)]
grid_positions = list(range(1, 21))
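# permutation(20) shuffles 0..19; the three rows that receive 0, 1, or 2 are labelled as podium finishers
true_podium = [1 if i < 3 else 0 for i in np.random.permutation(20)]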

df = pd.DataFrame({
    'Driver': drivers,
    'GridPosition': grid_positions,
    'TruePodium': true_podium
})
df.head()

|   | Driver | GridPosition | TruePodium |
|---|--------|--------------|------------|
| 0 | Driver_1 | 1 | 1 |
| 1 | Driver_2 | 2 | 0 |
| 2 | Driver_3 | 3 | 0 |
| 3 | Driver_4 | 4 | 1 |
| 4 | Driver_5 | 5 | 0 |
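
The labelling step marks a driver as a podium finisher when its shuffled value is 0, 1, or 2, so exactly three drivers get label 1 whatever the seed. The sketch below checks that property. It assumes the standard-library `random` module as a dependency-free stand-in for NumPy, so it does not reproduce the notebook's specific permutation, and `random_podium_labels` is a hypothetical name.

```python
import random


def random_podium_labels(n_drivers: int = 20, n_podium: int = 3, seed: int = 42) -> list[int]:
    """Label n_podium randomly chosen drivers with 1 and the rest with 0."""
    rng = random.Random(seed)
    order = list(range(n_drivers))
    rng.shuffle(order)  # stand-in for np.random.permutation
    # A driver is on the podium when its shuffled value is below n_podium.
    return [1 if value < n_podium else 0 for value in order]


labels = random_podium_labels()
print(len(labels), sum(labels))  # 20 3
```

Because the labels are drawn independently of grid position, any agreement between the rule and these mock labels is chance, which is worth keeping in mind when reading the metrics.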


In [3]:
# 🚦 Apply baseline rule
df['BaselinePrediction'] = (df['GridPosition'] <= 3).astype(int)
df.head(10)  # show the first 10 of 20 rows

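|   | Driver | GridPosition | TruePodium | BaselinePrediction |
|---|--------|--------------|------------|--------------------|
| 0 | Driver_1 | 1 | 1 | 1 |
| 1 | Driver_2 | 2 | 0 | 1 |
| 2 | Driver_3 | 3 | 0 | 1 |
| 3 | Driver_4 | 4 | 1 | 0 |
| 4 | Driver_5 | 5 | 0 | 0 |
| 5 | Driver_6 | 6 | 0 | 0 |
| 6 | Driver_7 | 7 | 0 | 0 |
| 7 | Driver_8 | 8 | 0 | 0 |
| 8 | Driver_9 | 9 | 0 | 0 |
| 9 | Driver_10 | 10 | 0 | 0 |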


In [4]:
# 📈 Evaluate performance
from sklearn.metrics import precision_score, f1_score, accuracy_score
import json

precision = precision_score(df['TruePodium'], df['BaselinePrediction'])
f1 = f1_score(df['TruePodium'], df['BaselinePrediction'])
accuracy = accuracy_score(df['TruePodium'], df['BaselinePrediction'])

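# Cast to built-in float so the dict prints and serializes as plain numbers,
# even if scikit-learn returns NumPy scalars
metrics = {
    'precision': round(float(precision), 4),
    'f1_score': round(float(f1), 4),
    'accuracy': round(float(accuracy), 4)
}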
print(metrics)

# Save metrics
with open(f'reports/{version}/baseline_metrics.json', 'w') as f:
    json.dump(metrics, f, indent=4)

{'precision': 0.3333, 'f1_score': 0.3333, 'accuracy': 0.8}
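
These numbers can be checked by hand. The rule predicts three podiums (grid positions 1 to 3) and the mock labels also contain exactly three podiums, but only Driver_1 appears in both sets. That gives 1 true positive, 2 false positives, 2 false negatives, and 15 true negatives. The sketch below recomputes the metrics from those counts; `metrics_from_counts` is a hypothetical helper, not part of the notebook or of scikit-learn.

```python
def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute precision, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": round(precision, 4),
        "f1_score": round(f1, 4),
        "accuracy": round(accuracy, 4),
    }


print(metrics_from_counts(tp=1, fp=2, fn=2, tn=15))
# {'precision': 0.3333, 'f1_score': 0.3333, 'accuracy': 0.8}
```

Accuracy looks high only because 15 of the 20 drivers are true negatives; precision and F1 are the more informative figures for the podium class.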


## ✅ Output Saved
- `reports/v0_1_0/baseline_metrics.json`
- No model object saved (rule-based, no training needed), so `models/v0_1_0/` stays empty
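
A later notebook could read the saved baseline back and compare a new model against it. The sketch below is one way to do that, under assumptions: `improves_on` is a hypothetical helper, and a temporary directory stands in for `reports/v0_1_0/` so the example runs without the real file.

```python
import json
import tempfile
from pathlib import Path

# Baseline values as printed by the evaluation cell.
baseline = {"precision": 0.3333, "f1_score": 0.3333, "accuracy": 0.8}


def improves_on(candidate: dict, reference: dict, key: str = "f1_score") -> bool:
    """Return True when the candidate strictly beats the reference on one metric."""
    return candidate[key] > reference[key]


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for reports/v0_1_0/baseline_metrics.json
    path = Path(tmp) / "reports" / "v0_1_0" / "baseline_metrics.json"
    path.parent.mkdir(parents=True)
    path.write_text(json.dumps(baseline, indent=4))

    # Round-trip: what a later notebook would load.
    loaded = json.loads(path.read_text())

print(loaded == baseline)             # True
print(improves_on(loaded, baseline))  # False: a model must beat, not match, the baseline
```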