# F1 Race Points Prediction

Predict race points from practice and qualifying session data using XGBoost.

**Pipeline:**
1. Load practice and qualifying session data via fastf1
2. Extract per-driver lap statistics (mean/max/min/std of 12 metrics)
3. Compute composite performance scores
4. Predict race points with a trained XGBoost model
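
Step 2 can be pictured as a pandas group-and-aggregate. The sketch below is illustrative only: it uses a small synthetic lap table, and the column names (`LapTime`, `SpeedST`) and the `metric_stat` naming scheme are assumptions, not necessarily what `get_lap_data` produces.

```python
import pandas as pd

# Synthetic stand-in for a lap table; column names and values are assumptions.
laps = pd.DataFrame({
    "Driver": ["VER", "VER", "VER", "HAM", "HAM", "HAM"],
    "LapTime": [92.1, 91.8, 92.4, 92.9, 92.5, 93.1],         # seconds
    "SpeedST": [318.0, 320.0, 319.0, 315.0, 316.0, 314.0],   # km/h
})

# One row per driver, one column per (metric, statistic) pair.
stats = laps.groupby("Driver").agg(["mean", "max", "min", "std"])

# Flatten the two-level column index into names like "LapTime_mean".
stats.columns = [f"{metric}_{stat}" for metric, stat in stats.columns]
```

With 12 metrics instead of 2, the same aggregation would yield 48 feature columns per driver.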

In [None]:
import fastf1
import pandas as pd
import xgboost as xgb
import matplotlib.pyplot as plt

from get_data import enable_cache, load_session, get_lap_data, get_driver_map
from get_scores import compute_session_scores

enable_cache()

## 1. Session Scores

Compute a composite performance score for each driver from a practice session (FP1 here).
Each score combines pace (lap and sector times), consistency (standard deviation of those times), and speed trap data.
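
One plausible way to combine metrics with different units is to standardize each one and take a weighted sum. The sketch below is not the implementation of `compute_session_scores`; the feature names, the sample values, and the weights are all hypothetical.

```python
import pandas as pd

# Hypothetical per-driver features; names and values are illustrative.
features = pd.DataFrame(
    {
        "LapTime_mean": [92.1, 92.8, 93.5],     # seconds, lower is better
        "LapTime_std": [0.30, 0.20, 0.60],      # seconds, lower is better
        "SpeedST_max": [320.0, 316.0, 318.0],   # km/h, higher is better
    },
    index=["VER", "HAM", "ALO"],
)


def zscore(col: pd.Series) -> pd.Series:
    """Standardize a column so metrics with different units are comparable."""
    return (col - col.mean()) / col.std(ddof=0)


# Assumed weights, chosen only for illustration.
WEIGHTS = {"pace": 0.5, "consistency": 0.3, "speed": 0.2}

# Negate the z-score where a lower raw value is better.
score = (
    WEIGHTS["pace"] * -zscore(features["LapTime_mean"])
    + WEIGHTS["consistency"] * -zscore(features["LapTime_std"])
    + WEIGHTS["speed"] * zscore(features["SpeedST_max"])
)

# Same shape as the table plotted in this notebook: "Driver" and "Score" columns.
scores = (
    score.rename("Score")
    .sort_values(ascending=False)
    .rename_axis("Driver")
    .reset_index()
)
```

Because every term is a z-score, the resulting scores are relative to the field in that session and are not comparable across sessions without further normalization.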

In [None]:
YEAR = 2024
GP = "Bahrain"

session = load_session(YEAR, GP, "FP1")
scores = compute_session_scores(session)

# Plot performance scores
fig, ax = plt.subplots(figsize=(10, 6))
ax.barh(scores["Driver"], scores["Score"], color="#E10600")
ax.set_xlabel("Composite Score")
ax.set_title(f"{GP} {YEAR} — FP1 Performance Scores")
ax.invert_yaxis()
plt.tight_layout()
plt.show()

## 2. Race Points Prediction

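Load the trained XGBoost model from `f1_model.json` and predict race points from the qualifying session's per-driver lap statistics. The feature columns passed to the model must match the ones it was trained on.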
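
A saved model expects the same feature columns it saw during training, so it can help to align the feature table before calling `predict`. The sketch below uses only pandas; the `expected` list and the feature names are hypothetical, and in practice the list might be read from `model.get_booster().feature_names` if the names were stored with the model.

```python
import pandas as pd

# Assumed training-time feature order (hypothetical names).
expected = ["LapTime_mean", "LapTime_std", "SpeedST_max"]

# Hypothetical session features: different order, one extra, one missing column.
lap_features = pd.DataFrame(
    {
        "SpeedST_max": [320.0, 316.0],
        "LapTime_mean": [92.1, 92.8],
        "Sector1_min": [29.1, 29.4],
    },
    index=["1", "44"],  # driver numbers as strings
)

# Report mismatches explicitly instead of letting them pass silently.
missing = [c for c in expected if c not in lap_features.columns]
extra = [c for c in lap_features.columns if c not in expected]

# reindex puts columns in the expected order, drops extras,
# and fills any missing column with NaN.
aligned = lap_features.reindex(columns=expected)
```

XGBoost treats NaN as a missing value, so the aligned table can still be scored, but a non-empty `missing` list is usually a sign that the feature extraction and the model are out of sync.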

In [None]:
# Load model and predict
model = xgb.XGBRegressor()
model.load_model("f1_model.json")

qualifying = load_session(YEAR, GP, "Q")
lap_data = get_lap_data(qualifying)
driver_map = get_driver_map(qualifying)

predictions = model.predict(lap_data)
pred_df = pd.DataFrame({
    "Driver": lap_data.index.astype(str).map(driver_map),
    "Predicted Points": predictions.round(1),
}).sort_values("Predicted Points", ascending=False).reset_index(drop=True)

pred_df.index += 1  # 1-based ranking
pred_df.index.name = "Rank"
pred_df

In [None]:
# Visualize predicted points
fig, ax = plt.subplots(figsize=(10, 6))
colors = ["#E10600" if pts > 0 else "#999999" for pts in pred_df["Predicted Points"]]
ax.barh(pred_df["Driver"], pred_df["Predicted Points"], color=colors)
ax.set_xlabel("Predicted Points")
ax.set_title(f"{GP} {YEAR} — Predicted Race Points")
ax.invert_yaxis()
plt.tight_layout()
plt.show()

## 3. Feature Importance

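Which telemetry metrics matter most for predicting race points? For tree models, `feature_importances_` reports gain-based importance by default, normalized to sum to 1.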
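
Since each metric contributes several statistics (mean, max, min, std), it can be useful to sum the importances per underlying metric. The sketch below assumes feature names of the form `<metric>_<stat>`, which is a guess about the column naming; the importance values are made up.

```python
import pandas as pd

# Hypothetical importances; in this notebook they would come from
# pd.Series(model.feature_importances_, index=lap_data.columns).
importance = pd.Series({
    "LapTime_mean": 0.30,
    "LapTime_min": 0.25,
    "LapTime_std": 0.10,
    "SpeedST_max": 0.20,
    "SpeedST_mean": 0.15,
})


def base_metric(feature_name: str) -> str:
    """Strip the trailing "_<stat>" suffix to recover the metric name."""
    return feature_name.rsplit("_", 1)[0]


# Passing a function to groupby applies it to each index label.
by_metric = importance.groupby(base_metric).sum().sort_values(ascending=False)
```

Here `by_metric` holds one total per metric (`LapTime`, then `SpeedST`), which can be plotted with the same `plot.barh` call used for the per-feature chart.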

In [None]:
importance = pd.Series(
    model.feature_importances_,
    index=lap_data.columns,
).sort_values(ascending=True)

fig, ax = plt.subplots(figsize=(10, 8))
importance.tail(15).plot.barh(ax=ax, color="#E10600")
ax.set_xlabel("Feature Importance (gain)")
ax.set_title("Top 15 Features for Race Points Prediction")
plt.tight_layout()
plt.show()