# F1 Tire Degradation Modeling & Pit Strategy Optimization

# 1. Problem Statement

Formula 1 is a highly competitive motorsport, and the margin of victory can be razor thin. In the 2025 season, for example, Lando Norris won the championship with 423 points to Max Verstappen's 421. The title was decided at the season's last race (Abu Dhabi), yet it could have been secured one race earlier were it not for two errors: a car specification disqualification in Las Vegas and a pit stop strategy blunder in Qatar.

This project focuses on the pit strategy blunder in Qatar by asking one question: "As an F1 Team Principal, when should we pit?" I propose to answer it by building a race simulator and pit strategy optimizer that decides, on each lap, whether to pit now or wait. For the 2025 Qatar race, when the Safety Car was deployed on lap 7, the optimizer should answer "yes, pit immediately".

The goal of the project is to study F1 pit strategy in the face of tire degradation, available tires, tire usage rules, traffic, and safety car deployment. There are at least two opportunities to apply regression or ML techniques: modeling tire degradation and improving pit decision outcomes.



This project models Formula 1 tire degradation using multiple approaches (linear regression, quadratic regression, and machine learning models) and uses these models to build a pit strategy optimizer that minimizes total race time.

The notebook includes:
- Data exploration
- Tire degradation modeling
- Model validation and comparison
- Pit strategy optimization
- Final insights and recommendations

# 2. Data Sources & Structure

Describe:
- Where the data came from
- What each dataset contains
- Key columns (lap number, compound, lap time, stint number, etc.)
- Any assumptions or limitations

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv("your_data.csv")
df.head()
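
The degradation models later in the notebook regress lap time on a lap counter, so it may help to derive a tire-age column that restarts at every stint. The following is a minimal sketch on a small made-up frame; the column names `driver`, `stint`, `lap_number`, and `lap_time` are assumptions and may not match the real dataset, and it also assumes every stint starts on a new set of tires.

```python
import pandas as pd

# Toy frame with hypothetical column names; the real dataset may differ.
laps = pd.DataFrame({
    "driver": ["NOR"] * 6,
    "stint": [1, 1, 1, 2, 2, 2],
    "lap_number": [1, 2, 3, 4, 5, 6],
    "lap_time": [92.1, 91.8, 92.0, 95.3, 91.2, 91.4],
})

# Tire age = laps completed on the current set, restarting at 1 each stint.
# Assumes each stint begins on a new (unused) set of tires.
laps = laps.sort_values(["driver", "lap_number"])
laps["tire_age"] = laps.groupby(["driver", "stint"]).cumcount() + 1

print(laps[["lap_number", "stint", "tire_age"]])
```

If the data already carries a tire-age column, that column is the better choice, since it can account for used sets.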

# 3. Exploratory Data Analysis (EDA)

Explore:
- Lap time distributions
- Tire compound differences
- Stint lengths
- Degradation patterns

In [None]:
sns.lineplot(data=df, x="lap_number", y="lap_time", hue="compound")
plt.title("Lap Time vs Lap Number by Compound")
plt.show()
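
Alongside the plot, a numeric summary per compound and per stint can make the differences easier to compare. This is a sketch on made-up values, assuming hypothetical `compound`, `stint`, and `lap_time` columns; it is not output from the real dataset.

```python
import pandas as pd

# Made-up laps for illustration only; column names are assumptions.
eda = pd.DataFrame({
    "compound": ["SOFT", "SOFT", "MEDIUM", "MEDIUM", "HARD", "HARD"],
    "stint": [1, 1, 2, 2, 3, 3],
    "lap_time": [90.0, 91.0, 91.0, 91.5, 92.0, 92.2],
})

# Lap count, mean, and median lap time for each compound.
summary = eda.groupby("compound")["lap_time"].agg(["count", "mean", "median"])

# Number of laps recorded in each stint.
stint_lengths = eda.groupby("stint").size()

print(summary)
print(stint_lengths)
```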

# 4. Tire Degradation Modeling

## 4.1 Linear Tire Model

Model form:
lap_time = a + b * lap_number

Here `a` is the baseline lap time in seconds and `b` is the degradation rate in seconds per lap.

In [None]:
from sklearn.linear_model import LinearRegression

X = df[["lap_number"]]
y = df["lap_time"]

lin_model = LinearRegression().fit(X, y)
lin_model.coef_, lin_model.intercept_

## 4.2 Quadratic Tire Model

Model form:
lap_time = a + b * lap_number + c * lap_number^2

In [None]:
df["lap_number_sq"] = df["lap_number"]**2
Xq = df[["lap_number", "lap_number_sq"]]

quad_model = LinearRegression().fit(Xq, y)
quad_model.coef_, quad_model.intercept_

## 4.3 Machine Learning Tire Model

Use a more flexible model (Random Forest, Gradient Boosting, etc.)
Include:
- Feature engineering
- Train/validation split

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor

features = ["lap_number", "lap_number_sq"]
X_ml = df[features]

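# Fix the seeds so the split and the fitted forest are reproducible.
X_train, X_test, y_train, y_test = train_test_split(X_ml, y, test_size=0.2, random_state=42)

rf = RandomForestRegressor(n_estimators=200, random_state=42)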
rf.fit(X_train, y_train)
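
A random row-level split can place laps from the same stint on both sides of the split, which may make the validation score look better than it is. One option is to hold out whole stints instead. The sketch below uses synthetic data and scikit-learn's `GroupShuffleSplit`; the `stint_id` and `tire_age` columns and the degradation numbers are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Synthetic laps: 5 stints of 10 laps each (made-up data for illustration).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "stint_id": np.repeat(np.arange(5), 10),
    "tire_age": np.tile(np.arange(1, 11), 5),
})
demo["lap_time"] = 90 + 0.08 * demo["tire_age"] + rng.normal(0, 0.1, len(demo))

# Hold out whole stints so no stint contributes laps to both sides.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(demo, groups=demo["stint_id"]))
train, test = demo.iloc[train_idx], demo.iloc[test_idx]

print(sorted(train["stint_id"].unique()), sorted(test["stint_id"].unique()))
```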

# 5. Model Validation & Comparison

## 5.1 Metrics

Compute RMSE, MAE, R² for each model.
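
For reference, the three metrics can be written as follows, where $y_i$ is the actual lap time, $\hat{y}_i$ the predicted lap time, $\bar{y}$ the mean actual lap time, and $n$ the number of laps evaluated (the symbols are introduced here for illustration):

```latex
\begin{aligned}
\mathrm{RMSE} &= \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \\
\mathrm{MAE}  &= \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \\
R^2           &= 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}
\end{aligned}
```

RMSE and MAE are in seconds per lap, so they can be compared directly against typical lap-to-lap degradation; $R^2$ is unitless.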

In [None]:
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

def evaluate(model, X, y):
    preds = model.predict(X)
    return {
        "RMSE": np.sqrt(mean_squared_error(y, preds)),
        "MAE": mean_absolute_error(y, preds),
        "R2": r2_score(y, preds)
    }

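# Score all three models on the same held-out laps. The linear and quadratic
# models are refit on the training split so the comparison with the random
# forest is like for like (scoring them on their own training data would not be).
lin_holdout = LinearRegression().fit(X_train[["lap_number"]], y_train)
quad_holdout = LinearRegression().fit(X_train[features], y_train)

lin_eval = evaluate(lin_holdout, X_test[["lap_number"]], y_test)
quad_eval = evaluate(quad_holdout, X_test[features], y_test)
rf_eval = evaluate(rf, X_test, y_test)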

lin_eval, quad_eval, rf_eval

## 5.2 Visual Validation

Plot predicted vs actual lap times.

In [None]:
plt.scatter(y_test, rf.predict(X_test), alpha=0.5)
plt.xlabel("Actual Lap Time")
plt.ylabel("Predicted Lap Time")
plt.title("Random Forest: Actual vs Predicted")
plt.show()

# 6. Pit Strategy Optimization

## 6.1 Problem Definition

Goal:
Minimize total race time by choosing optimal pit laps and tire compounds.

Constraints:
- Max stint length
- Tire compound rules
- Degradation model
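
The constraints above can be checked before a strategy is simulated. The sketch below is one possible validity check, not a complete rule set: the function name `is_valid_strategy`, its parameters, and the simplified rules (a maximum stint length and at least two different compounds in a dry race) are assumptions made for illustration. A pit lap is taken to be the first lap run on the new set.

```python
def is_valid_strategy(pit_laps, compounds, race_length, max_stint_laps):
    """Return True if the strategy satisfies the simplified rules assumed here."""
    # One compound per stint, so there must be len(pit_laps) + 1 of them.
    if len(compounds) != len(pit_laps) + 1:
        return False
    # Pit laps must be strictly increasing, with no duplicates.
    if list(pit_laps) != sorted(set(pit_laps)):
        return False
    # Pit laps must fall inside the race (lap 1 is the start, not a stop).
    if any(lap <= 1 or lap > race_length for lap in pit_laps):
        return False
    # Stint boundaries: each pit lap is the first lap on the new set.
    bounds = [1] + list(pit_laps) + [race_length + 1]
    stint_lengths = [b - a for a, b in zip(bounds, bounds[1:])]
    if max(stint_lengths) > max_stint_laps:
        return False
    # Assumed dry-race rule: at least two different compounds are used.
    return len(set(compounds)) >= 2


print(is_valid_strategy([20], ["MEDIUM", "HARD"], race_length=57, max_stint_laps=40))
```

Race-specific rules, such as a required number of stops, would need additional checks.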

## 6.2 Optimization Approach

Use simulation or brute-force search to compute total race time for:
- 1-stop strategies
- 2-stop strategies
- 3-stop strategies
- Possibly other strategies imposed by F1 for specific races, such as Monaco (two pit stops required) and Abu Dhabi (maximum distance imposed on tires)

In [None]:
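PIT_LOSS_S = 22.0  # ASSUMED time lost per pit stop (s); replace with a track-specific value

def simulate_stint(start_lap, end_lap, model):
    # Tire age restarts at 1 on each new set. It is passed in the "lap_number"
    # columns the models were trained on, which assumes lap_number approximates
    # tire age. Without this reset, every strategy sums to the same race time.
    tire_age = np.arange(1, end_lap - start_lap + 2)
    X_sim = pd.DataFrame({"lap_number": tire_age, "lap_number_sq": tire_age**2})
    return model.predict(X_sim).sum()

race_length = int(df["lap_number"].max())

def simulate_strategy(pit_laps, model):
    # Each entry in pit_laps is the first lap run on the new set of tires.
    stints = [1] + list(pit_laps) + [race_length + 1]
    total = len(pit_laps) * PIT_LOSS_S  # time lost in the pit lane
    for i in range(len(stints) - 1):
        total += simulate_stint(stints[i], stints[i + 1] - 1, model)
    return total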

## 6.3 Results

Compute race times for all strategies and identify the optimal one.

In [None]:
strategies = {
    "1-stop": [20],
    "2-stop": [15, 35],
    "3-stop": [12, 25, 38]
}

results = {name: simulate_strategy(stops, rf) for name, stops in strategies.items()}
results
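
The three hand-picked strategies above are only a starting point. A brute-force search over every combination of pit laps might look like the sketch below. It is self-contained and uses a toy linear degradation model rather than the fitted models; the constants `RACE_LAPS`, `PIT_LOSS_S`, `BASE_LAP_S`, and `DEG_S_PER_LAP` are all assumed values chosen for illustration.

```python
from itertools import combinations

RACE_LAPS = 30        # assumed race length for the toy example
PIT_LOSS_S = 22.0     # assumed time lost per stop, in seconds
BASE_LAP_S = 90.0     # assumed lap time on fresh tires, in seconds
DEG_S_PER_LAP = 0.2   # assumed linear degradation per lap of tire age


def stint_time(n_laps):
    # Sum of BASE_LAP_S + DEG_S_PER_LAP * age for age = 1..n_laps.
    return n_laps * BASE_LAP_S + DEG_S_PER_LAP * n_laps * (n_laps + 1) / 2


def race_time(pit_laps):
    # Each pit lap is the first lap run on the new set of tires.
    bounds = [1] + list(pit_laps) + [RACE_LAPS + 1]
    laps_per_stint = [b - a for a, b in zip(bounds, bounds[1:])]
    return sum(stint_time(n) for n in laps_per_stint) + PIT_LOSS_S * len(pit_laps)


def best_strategy(max_stops):
    # Enumerate every set of pit laps with 0..max_stops stops.
    candidates = (
        stops
        for k in range(max_stops + 1)
        for stops in combinations(range(2, RACE_LAPS + 1), k)
    )
    return min(candidates, key=race_time)


best = best_strategy(max_stops=2)
print(best, race_time(best))
```

The number of candidates grows quickly with race length and stop count, so a full-length race with three or more stops may call for pruning, for example by discarding strategies that violate the maximum stint length before simulating them.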

In [None]:
# Convert the dict views to lists so matplotlib receives plain sequences.
plt.bar(list(results.keys()), list(results.values()))
plt.ylabel("Total Race Time (s)")
plt.title("Strategy Comparison")
plt.show()

# 7. Final Results & Insights

Summarize:
- Best-performing tire model
- Optimal pit strategy
- Key findings from the data

# 8. Limitations & Future Work

Discuss:
- Data limitations
- Model assumptions
- Potential improvements

# 9. Appendix

Additional plots, helper functions, raw tables, etc.