# F1 Tire Degradation Modeling & Pit Strategy Optimization

## 1. Problem Statement

Formula 1 is a highly competitive motorsport in which the margin of victory can be razor thin. In the 2025 season, for example, Lando Norris won the championship with 423 points to Max Verstappen’s 421. The title was decided in the season’s last race (Abu Dhabi), yet it could have been secured a race earlier had it not been for two errors: a car specification disqualification in Las Vegas and a pit stop strategy blunder in Qatar.

For this project, I will focus on the pit strategy blunder in Qatar by answering one question: “As an F1 Team Principal, when should we pit?” I propose answering it by building a race simulator and pit strategy optimizer that decides, on each lap, whether to pit now or wait. In the 2025 Qatar race, when the Safety Car was deployed on lap 7, the pit strategy optimizer would have answered “yes, pit immediately.”

The goal of the project is to study F1 pit strategy in the face of tire degradation, available tires, tire usage rules, traffic, and safety car deployment. There are at least two potential opportunities to use regression or ML techniques: modeling tire degradation and improving pit decision outcomes.



This project models Formula 1 tire degradation using multiple approaches (linear regression, quadratic regression, and machine learning models) and uses these models to build a pit strategy optimizer that minimizes total race time.

The notebook includes:
- Data exploration
- Tire degradation modeling
- Model validation and comparison
- Pit strategy optimization
- Final insights and recommendations

## 2. Data Sources & Structure

Describe:
- Where the data came from
- What each dataset contains
- Key columns (lap number, compound, lap time, stint number, etc.)
- Any assumptions or limitations
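
Before modeling, it can help to check that a loaded dataset actually has the columns listed above. The following is a minimal sketch, not the project's loader: the column names (`lap_number`, `compound`, `lap_time`, `stint_number`), the compound codes `S`/`M`/`H`, and the helper `check_lap_data` are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical column names, taken from the "key columns" bullet above.
REQUIRED_COLUMNS = ["lap_number", "compound", "lap_time", "stint_number"]

def check_lap_data(frame):
    """Return a list of human-readable problems found in a lap-level DataFrame."""
    problems = []
    missing = [c for c in REQUIRED_COLUMNS if c not in frame.columns]
    if missing:
        # Without the required columns the remaining checks cannot run.
        problems.append(f"missing columns: {missing}")
        return problems
    if frame["lap_time"].isna().any():
        problems.append("lap_time contains missing values")
    if (frame["lap_time"].dropna() <= 0).any():
        problems.append("lap_time contains non-positive values")
    # Assumed dry-compound codes: S (soft), M (medium), H (hard).
    unknown = set(frame["compound"].dropna()) - {"S", "M", "H"}
    if unknown:
        problems.append(f"unexpected compounds: {sorted(unknown)}")
    return problems

# Made-up sample rows, for illustration only.
sample = pd.DataFrame({
    "lap_number": [1, 2, 3],
    "compound": ["S", "S", "M"],
    "lap_time": [94.1, 94.3, 95.0],
    "stint_number": [1, 1, 2],
})
print(check_lap_data(sample))
```

An empty list means none of these basic checks failed; it does not say anything about whether the lap times are representative.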

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv("your_data.csv")
df.head()

## 3. Exploratory Data Analysis (EDA)

Explore:
- Lap time distributions
- Tire compound differences
- Stint lengths
- Degradation patterns

In [None]:
sns.lineplot(data=df, x="lap_number", y="lap_time", hue="compound")
plt.title("Lap Time vs Lap Number by Compound")
plt.show()
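
Stint lengths and degradation patterns are easier to explore once each lap carries a tire age. The sketch below shows one possible way to derive it with pandas, assuming the rows are sorted by lap and that a `stint_number` column exists; the data is made up and the column names are assumptions.

```python
import pandas as pd

# Made-up laps for illustration only; not real timing data.
laps = pd.DataFrame({
    "lap_number":   [1, 2, 3, 4, 5, 6],
    "stint_number": [1, 1, 1, 2, 2, 2],
    "compound":     ["S", "S", "S", "M", "M", "M"],
    "lap_time":     [92.0, 92.1, 92.3, 93.0, 93.1, 93.1],
})

# Tire age: 0 on the first lap of each stint, then 1, 2, ...
laps["tire_age"] = laps.groupby("stint_number").cumcount()

# Stint length in laps, one value per stint.
stint_lengths = laps.groupby("stint_number").size()

# Mean lap time per compound, a first look at compound differences.
mean_by_compound = laps.groupby("compound")["lap_time"].mean()

print(laps[["lap_number", "stint_number", "tire_age"]])
print(stint_lengths)
print(mean_by_compound)
```

Plotting `lap_time` against `tire_age` instead of `lap_number` would then line up the stints so that degradation can be compared across compounds.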

# 4. Tire Degradation Modeling

## 4.1 Race Lap Simulator with Linear Degradation

In [25]:
import numpy as np
import pandas as pd

# ------------------------------------------------------------
# 1. Simple lap time model
# ------------------------------------------------------------

"""
Calling an existing race simulator through an API was explored, but a "build my own" approach was taken instead to
provide a basic understanding of the components of lap time.

This is the simplest model that lets us get started. Our pit strategy optimizer will provide this model with
a pit strategy, which is defined by the answers to these questions:

1. What tire do I start the race with?
2. When do I pit?
3. What compound (tire type) do I switch to?
4. When do I pit again?

An optimal pit strategy minimizes the total of all lap times, which results in the fastest total race time.
The simulator lets us try different compounds and pit laps, then compare and visualize the results.

A stint is the period during which a particular set of tires is in use.

lap time = base_lap_time + compound_offset + degradation_rate per lap * tire age in laps, all in seconds

Initially we only consider races run in the dry,
so we don't need to model intermediate or wet tires, only soft, medium, and hard compound tires.

Soft tires are the fastest because they are sticky, but they degrade quickly.
Hard tires degrade at the slowest rate and last a long time, but they are the slowest.
Medium tires sit between these two for both speed and degradation.

Below, Soft = S, Medium = M, Hard = H.

A strategy example for Austin, TX would be to start on Soft, pit at lap 18, then switch to Medium tires to complete the race.
The shorthand for this would be Soft -> Medium.
"""

# To get a handle on the results, we will use the Austin, TX venue
AUSTIN_BASE_LAP = 93    # base lap time for the Austin, TX venue, in seconds
AUSTIN_PIT_LOSS = 24    # seconds lost to pit lane entry, the stop itself, and pit exit
AUSTIN_TOTAL_LAPS = 56  # total laps for the Austin race
TOTAL_LAPS = AUSTIN_TOTAL_LAPS
BASE_LAP_TIME = AUSTIN_BASE_LAP  # seconds, US Grand Prix (Austin, TX) baseline


# Per-lap offset from the baseline lap time, in seconds
COMPOUND_OFFSETS = {
    "S": -1.2,   # Softs are fastest: they improve on the baseline by this much per lap
    "M": 0.0,    # Medium tire is the baseline
    "H": +1.5    # Hards are slowest: they add this much to the baseline per lap
}


# Lap time lost per lap of tire age, in seconds
DEGRADATION_RATES = {
    "S": 0.12,  # Softs degrade fastest
    "M": 0.08,  # Mediums degrade at a rate between Softs and Hards
    "H": 0.05   # Hards degrade slowest
}

PIT_LOSS = AUSTIN_PIT_LOSS


def lap_time(compound, tire_age):
    """Return the lap time in seconds for a compound ("S", "M", or "H") at a given tire age in laps."""
    return (
        BASE_LAP_TIME
        + COMPOUND_OFFSETS[compound]
        + DEGRADATION_RATES[compound] * tire_age
    )


# ------------------------------------------------------------
# 2. Simulator
# ------------------------------------------------------------

# Quick sanity check: run the whole race on a single compound with no pit stops
strategy = "H"

def total_race_time(compound, total_laps):
    """Print the cumulative race time after each lap and return the final total."""
    t = 0.0
    for lap in range(1, total_laps + 1):
        t += lap_time(compound, lap)  # with no stops, tire age is taken as the lap number
        print("lap", lap, "time", t)
    return t

total_race_time(strategy, TOTAL_LAPS)

def simulate_race(strategy, total_laps=AUSTIN_TOTAL_LAPS):
    """
    Simulate a race and return (total_time, lap_times), both in seconds.

    strategy = list of tuples: [(start_compound, pit_lap, new_compound), ...]
    Example:
        [("M", 18, "H"), (None, 40, "S")]
    The first tuple defines the starting compound.
    """
    # Extract starting compound
    current_compound = strategy[0][0]
    tyre_age = 0  # a fresh set starts at age 0
    total_time = 0.0

    # Convert strategy to dict for quick lookup. Every tuple, including the
    # first, carries a pit lap and the compound fitted at that stop.
    pit_dict = {pit_lap: new_comp for (_, pit_lap, new_comp) in strategy}

    lap_times = []

    for lap in range(1, total_laps + 1):

        # Check if this is a pit lap
        if lap in pit_dict:
            total_time += PIT_LOSS
            current_compound = pit_dict[lap]
            tyre_age = 0  # reset after pit

        # Compute lap time
        t = lap_time(current_compound, tyre_age)
        lap_times.append(t)
        total_time += t

        tyre_age += 1

    return total_time, lap_times


# ------------------------------------------------------------
# 3. Try a few strategies
# ------------------------------------------------------------
strategies = {
    "1-stop (M → H)": [
        ("M", 18, "H")
    ],
    "2-stop (M → H → S)": [
        ("M", 18, "H"),
        (None, 40, "S")
    ],
    "Aggressive (S → M → S)": [
        ("S", 12, "M"),
        (None, 38, "S")
    ],
    "1-stop alternative (S → M)": [
        ("S", 20, "M")
    ]
}

results = {}

for name, strat in strategies.items():
    total, laps = simulate_race(strat)
    results[name] = total

# Display results
pd.DataFrame.from_dict(results, orient="index", columns=["Total Race Time (s)"])

lap 1 time 94.55
lap 2 time 189.14999999999998
lap 3 time 283.79999999999995
lap 4 time 378.49999999999994
lap 5 time 473.24999999999994
lap 6 time 568.05
lap 7 time 662.9
lap 8 time 757.8
lap 9 time 852.75
lap 10 time 947.75
lap 11 time 1042.8
lap 12 time 1137.8999999999999
lap 13 time 1233.05
lap 14 time 1328.25
lap 15 time 1423.5
lap 16 time 1518.8
lap 17 time 1614.1499999999999
lap 18 time 1709.55
lap 19 time 1805.0
lap 20 time 1900.5
lap 21 time 1996.05
lap 22 time 2091.65
lap 23 time 2187.3
lap 24 time 2283.0
lap 25 time 2378.75
lap 26 time 2474.55
lap 27 time 2570.4
lap 28 time 2666.3
lap 29 time 2762.25
lap 30 time 2858.25
lap 31 time 2954.3
lap 32 time 3050.4
lap 33 time 3146.55
lap 34 time 3242.75
lap 35 time 3339.0
lap 36 time 3435.3
lap 37 time 3531.65
lap 38 time 3628.05
lap 39 time 3724.5
lap 40 time 3821.0
lap 41 time 3917.55
lap 42 time 4014.15
lap 43 time 4110.8
lap 44 time 4207.5
lap 45 time 4304.25
lap 46 time 4401.05
lap 47 time 4497.900000000001
lap 48 time 4594.8
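
One consequence of the linear lap time formula is that each pair of compounds has a crossover tire age, after which the initially faster compound becomes the slower one. The sketch below works this out from the offsets and degradation rates used in the simulator cell; the helper `crossover_age` is introduced here for illustration, and the comparison looks at lap pace only, ignoring pit loss.

```python
# Constants copied from the simulator cell above.
COMPOUND_OFFSETS = {"S": -1.2, "M": 0.0, "H": 1.5}
DEGRADATION_RATES = {"S": 0.12, "M": 0.08, "H": 0.05}

def crossover_age(fast, slow):
    """Tire age (in laps) at which the initially faster compound stops being faster.

    Solves offset_fast + rate_fast * age = offset_slow + rate_slow * age for age.
    """
    offset_gap = COMPOUND_OFFSETS[slow] - COMPOUND_OFFSETS[fast]
    rate_gap = DEGRADATION_RATES[fast] - DEGRADATION_RATES[slow]
    return offset_gap / rate_gap

for fast, slow in [("S", "M"), ("M", "H"), ("S", "H")]:
    age = crossover_age(fast, slow)
    print(f"{fast} vs {slow}: crossover at tire age {age:.1f} laps")
```

With these placeholder constants, a soft tire matches a medium of the same age at about 30 laps and a medium matches a hard at about 50 laps, so under this simple model the hard compound only pays off on very long stints.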



## 4.2 Linear Tire Model

Model form:
lap_time = a + b * lap_number
or simpler

In [None]:
from sklearn.linear_model import LinearRegression

X = df[["lap_number"]]
y = df["lap_time"]

lin_model = LinearRegression().fit(X, y)
lin_model.coef_, lin_model.intercept_
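
To make the model form concrete, the sketch below fits `lap_time = a + b * lap_number` to a synthetic stint and recovers the coefficients. It uses `np.polyfit` as a lightweight stand-in for `LinearRegression`; the data, the 93.0 s baseline, the 0.08 s/lap slope, and the noise level are made-up values chosen only for illustration.

```python
import numpy as np

# Synthetic stint (made-up numbers): 93.0 s baseline, 0.08 s/lap degradation, small noise.
rng = np.random.default_rng(0)
lap_number = np.arange(1, 31)
lap_time_s = 93.0 + 0.08 * lap_number + rng.normal(0.0, 0.05, size=lap_number.size)

# polyfit returns coefficients from highest degree to lowest, so [b, a] for degree 1.
b, a = np.polyfit(lap_number, lap_time_s, deg=1)

print(f"a (intercept) = {a:.2f} s")
print(f"b (degradation per lap) = {b:.3f} s/lap")
```

Here `b` plays the role of the per-lap degradation rate and `a` the extrapolated lap time at lap 0, which maps onto `DEGRADATION_RATES` and `BASE_LAP_TIME + COMPOUND_OFFSETS` in the simulator.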

## 4.3 Quadratic Tire Model

Model form:
lap_time = a + b * lap_number + c * lap_number^2

In [None]:
df["lap_number_sq"] = df["lap_number"]**2
Xq = df[["lap_number", "lap_number_sq"]]

quad_model = LinearRegression().fit(Xq, y)
quad_model.coef_, quad_model.intercept_
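
An alternative to adding the squared column by hand is to let scikit-learn generate it. The sketch below is one possible arrangement, assuming a pipeline of `PolynomialFeatures` followed by `LinearRegression`; the data is synthetic and noise-free, and the coefficients (93.0, 0.05, 0.002) are made up so the fit can be checked.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic, noise-free stint with made-up coefficients a=93.0, b=0.05, c=0.002.
lap_number = np.arange(1, 31).reshape(-1, 1)
laps_flat = lap_number.ravel()
lap_time_s = 93.0 + 0.05 * laps_flat + 0.002 * laps_flat ** 2

# PolynomialFeatures builds [lap_number, lap_number^2]; LinearRegression supplies the intercept a.
quad_pipeline = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
)
quad_pipeline.fit(lap_number, lap_time_s)

# make_pipeline names each step after its lowercased class name.
b, c = quad_pipeline.named_steps["linearregression"].coef_
a = quad_pipeline.named_steps["linearregression"].intercept_
print(f"a = {a:.3f}, b = {b:.4f}, c = {c:.5f}")
```

A positive `c` means lap times rise faster as the tires age, which is the behavior the quadratic form is meant to capture beyond the linear model.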

## 4.4 Machine Learning Tire Model

Use a more flexible model (Random Forest, Gradient Boosting, etc.)
Include:
- Feature engineering
- Train/validation split

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor

features = ["lap_number", "lap_number_sq"]
X_ml = df[features]

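# Fix the random seeds so the split and the fitted forest are reproducible
X_train, X_test, y_train, y_test = train_test_split(X_ml, y, test_size=0.2, random_state=42)

rf = RandomForestRegressor(n_estimators=200, random_state=42)
rf.fit(X_train, y_train)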
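
A random row-level split can place laps from the same stint in both the training and validation sets, which may make a flexible model look better than it is. One possible alternative, sketched below under the assumption that each lap carries a stint identifier, is to hold out whole stints with scikit-learn's `GroupShuffleSplit`. The data and the names `stint_id`, `X_demo`, and `y_demo` are made up for illustration.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Synthetic example: 6 stints of 10 laps each (made-up data).
stint_id = np.repeat(np.arange(6), 10)
tire_age = np.tile(np.arange(10), 6)
X_demo = tire_age.reshape(-1, 1)
y_demo = 93.0 + 0.08 * tire_age

# Hold out roughly a third of the stints; all laps of a stint stay on the same side.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.33, random_state=0)
train_idx, test_idx = next(splitter.split(X_demo, y_demo, groups=stint_id))

train_stints = set(stint_id[train_idx].tolist())
test_stints = set(stint_id[test_idx].tolist())
print("train stints:", sorted(train_stints))
print("validation stints:", sorted(test_stints))
```

The index arrays can then be used to slice the feature matrix and target before fitting, in place of the arrays returned by `train_test_split`.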

# 5. Model Validation & Comparison

## 5.1 Metrics

Compute RMSE, MAE, R² for each model.

In [None]:
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

def evaluate(model, X, y):
    preds = model.predict(X)
    return {
        "RMSE": np.sqrt(mean_squared_error(y, preds)),
        "MAE": mean_absolute_error(y, preds),
        "R2": r2_score(y, preds)
    }

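# Score every model on the same held-out rows so the comparison is like for like.
# Note: lin_model and quad_model were fit on the full dataset, which includes these
# rows, so their scores are optimistic compared with rf.
lin_eval = evaluate(lin_model, X_test[["lap_number"]], y_test)
quad_eval = evaluate(quad_model, X_test[["lap_number", "lap_number_sq"]], y_test)
rf_eval = evaluate(rf, X_test, y_test)

lin_eval, quad_eval, rf_eval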
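
For reference, the three metrics can be computed by hand, which makes their units clear: RMSE and MAE are in seconds per lap, while R² is unitless. The sketch below uses four made-up lap times and predictions purely as a worked example.

```python
import numpy as np

# Made-up actual and predicted lap times, in seconds.
actual = np.array([93.0, 93.5, 94.0, 94.5])
predicted = np.array([93.2, 93.4, 94.1, 94.9])

errors = actual - predicted

# RMSE: square root of the mean squared error; penalizes large misses more heavily.
rmse = np.sqrt(np.mean(errors ** 2))

# MAE: mean absolute error; the typical miss in seconds.
mae = np.mean(np.abs(errors))

# R^2: 1 minus (residual sum of squares / total sum of squares around the mean).
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((actual - actual.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print(f"RMSE = {rmse:.4f} s, MAE = {mae:.4f} s, R2 = {r2:.3f}")
```

For strategy work, MAE and RMSE are arguably the more useful numbers, because an error of a few tenths per lap compounds over a stint into seconds of race time.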

## 5.2 Visual Validation

Plot predicted vs actual lap times.

In [None]:
plt.scatter(y_test, rf.predict(X_test), alpha=0.5)
plt.xlabel("Actual Lap Time")
plt.ylabel("Predicted Lap Time")
plt.title("Random Forest: Actual vs Predicted")
plt.show()

# 6. Pit Strategy Optimization

## 6.1 Problem Definition

Goal:
Minimize total race time by choosing optimal pit laps and tire compounds.

Constraints:
- Max stint length
- Tire compound rules
- Degradation model
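
The constraints above can be expressed as a small validity check that the optimizer runs before simulating a candidate strategy. The sketch below is one possible encoding: the function names and the per-compound stint limits are hypothetical, a pit stop is assumed to happen at the start of the listed lap, and the compound rule encoded is that a dry race requires at least two different dry compounds.

```python
def stint_lengths(pit_laps, total_laps):
    """Laps run in each stint when pitting at the start of each lap in pit_laps."""
    bounds = [1] + sorted(pit_laps) + [total_laps + 1]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]

def is_valid_strategy(compounds, pit_laps, total_laps, max_stint):
    """Check a dry-race strategy against stint-length and compound rules.

    compounds: one compound per stint in race order, e.g. ["S", "M"]
    pit_laps:  lap numbers on which the car pits, e.g. [18]
    max_stint: hypothetical stint limit per compound, e.g. {"S": 20, "M": 40, "H": 50}
    """
    # One more stint than there are stops.
    if len(compounds) != len(pit_laps) + 1:
        return False
    # Pit laps must be distinct and fall after lap 1, within the race.
    if len(set(pit_laps)) != len(pit_laps):
        return False
    if any(not (1 < p <= total_laps) for p in pit_laps):
        return False
    # At least two different dry compounds must be used.
    if len(set(compounds)) < 2:
        return False
    # No stint may exceed the limit for its compound.
    lengths = stint_lengths(pit_laps, total_laps)
    return all(n <= max_stint[c] for c, n in zip(compounds, lengths))

limits = {"S": 20, "M": 40, "H": 50}  # made-up limits for illustration
print(is_valid_strategy(["S", "M"], [18], 56, limits))  # 17 laps on S, 39 on M
print(is_valid_strategy(["M", "M"], [28], 56, limits))  # only one compound used
print(is_valid_strategy(["S", "M"], [30], 56, limits))  # 29 laps on S exceeds the limit
```

Filtering candidates this way keeps the search space small and ensures the optimizer never reports a strategy that the rules would not allow.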

## 6.2 Optimization Approach

Use simulation or brute-force search to compute total race time for:
- 1-stop strategies
- 2-stop strategies
- 3-stop strategies
- Possibly other strategies imposed by F1 for specific races, such as Monaco (2 pit stops required) and Qatar (maximum stint length imposed on each set of tires)
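
As a rough illustration of the brute-force idea, the sketch below enumerates every 1-stop strategy under the linear lap time model from section 4.1 and keeps the fastest. It is a sketch under stated assumptions rather than the final optimizer: the constants are the Austin placeholders used earlier, the helpers `stint_time` and `best_one_stop` are introduced here, tire age is assumed to start at 0 on each new set, and traffic, fuel load, and safety cars are ignored.

```python
from itertools import permutations

# Placeholder constants copied from the section 4.1 simulator.
BASE_LAP_TIME, PIT_LOSS, TOTAL_LAPS = 93.0, 24.0, 56
COMPOUND_OFFSETS = {"S": -1.2, "M": 0.0, "H": 1.5}
DEGRADATION_RATES = {"S": 0.12, "M": 0.08, "H": 0.05}

def stint_time(compound, n_laps):
    """Total time for n_laps on one set, with tire age running 0, 1, ..., n_laps - 1."""
    per_lap = BASE_LAP_TIME + COMPOUND_OFFSETS[compound]
    # Sum of ages 0..n-1 is n * (n - 1) / 2.
    return n_laps * per_lap + DEGRADATION_RATES[compound] * n_laps * (n_laps - 1) / 2

def best_one_stop():
    """Try every ordered pair of different compounds and every pit lap."""
    best = None
    for first, second in permutations("SMH", 2):
        for pit_lap in range(2, TOTAL_LAPS + 1):
            first_laps = pit_lap - 1  # the stop happens at the start of pit_lap
            total = (stint_time(first, first_laps)
                     + PIT_LOSS
                     + stint_time(second, TOTAL_LAPS - first_laps))
            if best is None or total < best[0]:
                best = (total, first, pit_lap, second)
    return best

total, first, pit_lap, second = best_one_stop()
print(f"best 1-stop: {first} -> {second}, pit on lap {pit_lap}, total {total:.1f} s")
```

With these placeholder numbers the search settles on a soft and medium pairing with the stop near half distance, at roughly 5274 s. Because this model has no fuel or track evolution effect, the order of the two compounds does not change the total, so ties between orderings are expected. Extending the loops to 2-stop and 3-stop strategies follows the same pattern, at the cost of a larger search space.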

In [None]:
def simulate_stint(start_lap, end_lap, model):
    laps = np.arange(start_lap, end_lap+1)
    X_sim = pd.DataFrame({"lap_number": laps, "lap_number_sq": laps**2})
    return model.predict(X_sim).sum()

race_length = df["lap_number"].max()

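def simulate_strategy(pit_laps, model):
    # Stint boundaries: each stint runs from one pit lap up to the lap before the next
    stints = [1] + sorted(pit_laps) + [race_length + 1]
    # Each stop costs PIT_LOSS seconds (defined in section 4.1); without this
    # term, extra stops would be free
    total = PIT_LOSS * len(pit_laps)
    for i in range(len(stints)-1):
        total += simulate_stint(stints[i], stints[i+1]-1, model)
    return total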

## 6.3 Results

Compute race times for all strategies and identify the optimal one.

In [None]:
strategies = {
    "1-stop": [20],
    "2-stop": [15, 35],
    "3-stop": [12, 25, 38]
}

results = {name: simulate_strategy(stops, rf) for name, stops in strategies.items()}
results

In [None]:
plt.bar(results.keys(), results.values())
plt.ylabel("Total Race Time (s)")
plt.title("Strategy Comparison")
plt.show()

# 7. Final Results & Insights

Summarize:
- Best-performing tire model
- Optimal pit strategy
- Key findings from the data

# 8. Limitations & Future Work

Discuss:
- Data limitations
- Model assumptions
- Potential improvements

# 9. Appendix

Additional plots, helper functions, raw tables, etc.