Single Race Exploratory Data Analysis

The purpose of this notebook is to explore and wrangle data from the FastF1 API for a single race weekend session. It serves as a starting point for a clean workflow covering data acquisition, preparation, and early exploration before scaling to a multi-race analysis. I will focus on building a data dictionary, assessing data quality, generating descriptive statistics, and creating preliminary visualizations to understand the structure and sufficiency of the data. The outcome of this notebook will be a reproducible workflow for future feature engineering and multi-race EDA.

The code below adds the parent directory to Python’s module search path so that the `src` package imported in later cells can be resolved, and sets the FastF1 logger to the WARNING level so that informational messages are suppressed. This keeps the cell output clean and easy to read.

In [88]:
import sys
import os
import logging

# Add the root directory to sys.path
root = os.path.abspath("..")
sys.path.append(root)

# Suppress FastF1 info logs globally
logging.getLogger('fastf1').setLevel(logging.WARNING)
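
Because notebook cells are often re-run, each run of the cell above appends the same directory to `sys.path` again. The sketch below is one way to make that step idempotent; `add_to_sys_path` is a hypothetical helper name, not part of this project, and the child logger name is only an illustration of how the standard `logging` hierarchy passes the WARNING threshold down.

```python
import logging
import os
import sys


def add_to_sys_path(path: str) -> bool:
    """Append `path` to sys.path only if it is not already there.

    Returns True when the path was added, False when it was already present.
    """
    resolved = os.path.abspath(path)
    if resolved in sys.path:
        return False
    sys.path.append(resolved)
    return True


added_first = add_to_sys_path("..")
added_again = add_to_sys_path("..")  # second call is a no-op

# Loggers named "fastf1.<something>" inherit the level set on "fastf1"
# unless they set their own level explicitly.
logging.getLogger("fastf1").setLevel(logging.WARNING)
child_level = logging.getLogger("fastf1.example_child").getEffectiveLevel()
```

The relative path `".."` is resolved against the current working directory, so this assumes the notebook is started from its own folder.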

In this section, I import Python libraries for data visualization and numerical analysis, along with pandas, since FastF1 returns most of its data as pandas DataFrames. I also import custom modules for accessing preprocessed F1 data and constants. To inspect the datasets without truncation, I configure pandas to display all rows and columns.

In [89]:
from src.data import f1_data
from src.utils import f1_constants, f1_pandas_helpers

from matplotlib.collections import LineCollection
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Set pandas display options
pd.set_option('display.max_rows', None)  # use pd.reset_option('display.max_rows') to restore the compact view
pd.set_option('display.max_columns', None)
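
Setting the display options globally affects every later cell. As an alternative sketch, `pd.option_context` applies the same options only inside a `with` block and restores the previous values afterwards. The `wide` DataFrame here is made-up data used only to have something to display.

```python
import pandas as pd

# Made-up wide frame, only for demonstration.
wide = pd.DataFrame({f"col_{i}": range(3) for i in range(40)})

before = pd.get_option("display.max_columns")

# Options are changed only inside the block.
with pd.option_context("display.max_rows", None, "display.max_columns", None):
    inside = pd.get_option("display.max_columns")
    rendered = repr(wide)  # rendered without column truncation

# On exit the previous setting is back in place.
after = pd.get_option("display.max_columns")
```

This can be useful when only one or two cells need the untruncated view.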


The following code initializes a single F1 session by defining parameters such as year, location, and session type. These values are passed into the custom F1Session class (from f1_data.py), which creates a session object built on top of FastF1. This object provides access to key session data as well as custom functions I’ve implemented.

The session parameters were chosen to best match Tier 1 control qualities:

- Weather: Abu Dhabi (dry conditions)
- Max Speed: C5 tires, the softest compound (reported as SOFT in the lap data)
- Minimize Outliers: Top 3 valid laps from Q3
- Traffic: Remove tow laps to avoid slipstream bias

In [90]:
# Define session parameters
year = 2024
grand_prix = f1_constants.F1Constants.LOCATIONS["Abu Dhabi"]
session_type = f1_constants.F1Constants.SESSIONS["Q"]

# Call session object
session = f1_data.F1Session(year, grand_prix, session_type)
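
The "Top 3 valid laps" control listed above can be sketched with plain pandas. This is a minimal illustration, not the project's actual filter: it assumes that a "valid" lap means a timed lap with `IsAccurate` set and `Deleted` not set, and `top_valid_laps` is a hypothetical helper name. The sample rows are Norris's Q3 laps as they appear in the lap table in this notebook. Tow-lap detection is not covered here.

```python
import pandas as pd

# Norris's Q3 laps (subset of columns from the lap table in this notebook).
laps = pd.DataFrame(
    {
        "LapNumber": [11.0, 12.0, 13.0, 14.0, 15.0, 16.0],
        "LapTime": pd.to_timedelta(
            [pd.NaT, "00:01:22.949", "00:01:41.660",
             pd.NaT, "00:01:22.595", "00:01:58.346"]
        ),
        "IsAccurate": [False, True, False, False, True, False],
        "Deleted": [False] * 6,
    }
)


def top_valid_laps(laps: pd.DataFrame, n: int = 3) -> pd.DataFrame:
    """Return up to `n` fastest laps that are timed, accurate, and not deleted."""
    valid = laps[
        laps["LapTime"].notna()
        & laps["IsAccurate"].eq(True)
        & ~laps["Deleted"].eq(True)
    ]
    return valid.sort_values("LapTime").head(n)


top = top_valid_laps(laps)
```

With this definition only two of the six laps qualify, so asking for three returns fewer rows than requested; that case is worth handling explicitly when scaling to more drivers.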

Every driver who took part in Q2 or Q3 is analyzed and assigned a variable named after their three-letter driver code. When a driver has Q3 data, it takes priority over their Q2 data. Each driver's fastest lap from that qualifying segment is split into separate track-sector datasets, which are then merged with the other drivers' telemetry for the same sector.

In [91]:
# Three-letter driver codes for the drivers who took part in Q2 or Q3, grouped by team

norris_mcLaren = 'NOR'
piastri_mcLaren = 'PIA'

verstappen_redBull = 'VER'
perez_redBull = 'PER'

sainz_ferrari = 'SAI' 
leclerc_ferrari = 'LEC'

gasly_alpine = 'GAS' 
ocon_alpine = 'OCO'

# The alfaRomeo suffix is a legacy label: in 2024 Bottas drove for Kick Sauber and Tsunoda for RB
bottas_alfaRomeo = 'BOT' 
tsunoda_alfaRomeo = 'TSU'

alonso_astonMartin = 'ALO' 
stroll_astonMartin = 'STR'

russell_mercedes = 'RUS'

hulkenberg_haas = 'HUL'
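
The "Q3 over Q2" rule and the fastest-lap selection described above could be expressed as two small helpers. This is a sketch under assumptions: `pick_segment` and `fastest_lap` are hypothetical names, the inputs are assumed to be DataFrame-like objects with a `LapTime` column, and a segment is treated as unavailable when it is `None` or has no timed lap. The sample lap times are made up.

```python
import pandas as pd


def pick_segment(q2, q3):
    """Return (label, laps), preferring Q3 when it contains a timed lap."""
    for label, laps in (("Q3", q3), ("Q2", q2)):
        if laps is not None and laps["LapTime"].notna().any():
            return label, laps
    raise ValueError("no timed lap found in Q2 or Q3")


def fastest_lap(laps: pd.DataFrame) -> pd.Series:
    """Return the row with the smallest non-missing LapTime."""
    timed = laps.dropna(subset=["LapTime"])
    return timed.loc[timed["LapTime"].idxmin()]


# Made-up example: a driver eliminated in Q2 has no Q3 laps.
q2_laps = pd.DataFrame(
    {"LapNumber": [7, 8], "LapTime": pd.to_timedelta(["00:01:23.900", "00:01:23.400"])}
)
q3_laps = pd.DataFrame({"LapNumber": [], "LapTime": pd.to_timedelta([])})

label, chosen = pick_segment(q2_laps, q3_laps)
best = fastest_lap(chosen)
```

Here the helper falls back to Q2 because the Q3 frame is empty.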


In [100]:
# Q3 data for Lando Norris, McLaren
q1, q2, q3 = session.get_laps(norris_mcLaren).split_qualifying_sessions()

q3



Unnamed: 0,Time,Driver,DriverNumber,LapTime,LapNumber,Stint,PitOutTime,PitInTime,Sector1Time,Sector2Time,Sector3Time,Sector1SessionTime,Sector2SessionTime,Sector3SessionTime,SpeedI1,SpeedI2,SpeedFL,SpeedST,IsPersonalBest,Compound,TyreLife,FreshTyre,Team,LapStartTime,LapStartDate,TrackStatus,Position,Deleted,DeletedReason,FastF1Generated,IsAccurate
10,0 days 01:09:22.789000,NOR,4,NaT,11.0,5.0,0 days 01:07:16.829000,NaT,NaT,0 days 00:00:42.999000,0 days 00:00:39.401000,NaT,0 days 01:08:43.397000,0 days 01:09:22.834000,265.0,277.0,226.0,224.0,False,SOFT,3.0,False,McLaren,0 days 00:59:39.729000,2024-12-07 14:44:39.732,12,,False,,False,False
11,0 days 01:10:45.738000,NOR,4,0 days 00:01:22.949000,12.0,5.0,NaT,NaT,0 days 00:00:17.009000,0 days 00:00:36.014000,0 days 00:00:29.926000,0 days 01:09:39.798000,0 days 01:10:15.812000,0 days 01:10:45.738000,288.0,317.0,221.0,321.0,True,SOFT,4.0,False,McLaren,0 days 01:09:22.789000,2024-12-07 14:54:22.792,1,,False,,False,True
12,0 days 01:12:27.398000,NOR,4,0 days 00:01:41.660000,13.0,5.0,NaT,0 days 01:12:26.288000,0 days 00:00:20.327000,0 days 00:00:43.134000,0 days 00:00:38.199000,0 days 01:11:06.065000,0 days 01:11:49.199000,0 days 01:12:27.398000,238.0,228.0,,209.0,False,SOFT,5.0,False,McLaren,0 days 01:10:45.738000,2024-12-07 14:55:45.741,1,,False,,False,False
13,0 days 01:16:59.302000,NOR,4,NaT,14.0,6.0,0 days 01:14:44.446000,NaT,NaT,0 days 00:00:45.179000,0 days 00:00:42.104000,NaT,0 days 01:16:17.267000,0 days 01:16:59.496000,260.0,283.0,229.0,256.0,False,SOFT,1.0,True,McLaren,0 days 01:12:27.398000,2024-12-07 14:57:27.401,1,,False,,False,False
14,0 days 01:18:21.897000,NOR,4,0 days 00:01:22.595000,15.0,6.0,NaT,NaT,0 days 00:00:16.958000,0 days 00:00:35.776000,0 days 00:00:29.861000,0 days 01:17:16.260000,0 days 01:17:52.036000,0 days 01:18:21.897000,289.0,318.0,220.0,323.0,True,SOFT,2.0,True,McLaren,0 days 01:16:59.302000,2024-12-07 15:01:59.305,1,,False,,False,True
15,0 days 01:20:20.243000,NOR,4,0 days 00:01:58.346000,16.0,6.0,NaT,0 days 01:20:17.969000,0 days 00:00:21.461000,0 days 00:00:49.238000,0 days 00:00:47.647000,0 days 01:18:43.358000,0 days 01:19:32.596000,0 days 01:20:20.243000,196.0,201.0,,166.0,False,SOFT,3.0,True,McLaren,0 days 01:18:21.897000,2024-12-07 15:03:21.900,1,,False,,False,False


In [None]:
# Q3 telemetry data for Lando Norris, McLaren
q1, q2, q3 = session.get_laps(norris_mcLaren).split_qualifying_sessions()

norris_q3_telemetry = session.get_telemetry(q3)

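# Sector boundaries (session time) for Norris's fastest Q3 lap, LapNumber 15 in the table above
sector_1_start = "0 days 01:16:59.302000"  # LapStartTime
sector_1_end = "0 days 01:17:16.260000"    # Sector1SessionTime
sector_2_end = "0 days 01:17:52.036000"    # Sector2SessionTime
sector_3_end = "0 days 01:18:21.897000"    # Sector3SessionTime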

# norris_sector1_telemetry = f1_pandas_helpers.filter_timestamp_range(norris_q3_telemetry, start=sector_1_start, end=sector_1_end)

# norris_sector1_telemetry

# norris_sector2_telemetry = f1_pandas_helpers.filter_timestamp_range(norris_q3_telemetry, start=sector_1_end, end=sector_2_end)

# norris_sector2_telemetry

norris_sector3_telemetry = f1_pandas_helpers.filter_timestamp_range(norris_q3_telemetry, start=sector_2_end, end=sector_3_end)

norris_sector3_telemetry
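
The sector boundaries can also be read from the lap row instead of being typed by hand. The sketch below assumes the telemetry has a `SessionTime` timedelta column, as FastF1 telemetry does, and uses a start-inclusive, end-exclusive window; the project's `filter_timestamp_range` may treat the boundaries differently. `sector_windows` and `slice_by_session_time` are hypothetical names, and the telemetry is synthetic (one sample per second).

```python
import pandas as pd


def sector_windows(lap: pd.Series) -> dict:
    """Map sector number to its (start, end) session-time window."""
    start = lap["LapStartTime"]
    s1_end = lap["Sector1SessionTime"]
    s2_end = lap["Sector2SessionTime"]
    s3_end = lap["Sector3SessionTime"]
    return {1: (start, s1_end), 2: (s1_end, s2_end), 3: (s2_end, s3_end)}


def slice_by_session_time(telemetry, start, end, column="SessionTime"):
    """Keep rows with start <= time < end."""
    times = telemetry[column]
    return telemetry.loc[(times >= start) & (times < end)]


# Norris's fastest Q3 lap, values taken from the lap table in this notebook.
fastest = pd.Series(
    {
        "LapStartTime": pd.Timedelta("0 days 01:16:59.302000"),
        "Sector1SessionTime": pd.Timedelta("0 days 01:17:16.260000"),
        "Sector2SessionTime": pd.Timedelta("0 days 01:17:52.036000"),
        "Sector3SessionTime": pd.Timedelta("0 days 01:18:21.897000"),
    }
)

# Synthetic telemetry: 90 samples, one per second, starting at 01:16:59.
session_time = pd.Timedelta("0 days 01:16:59") + pd.to_timedelta(list(range(90)), unit="s")
telemetry = pd.DataFrame({"SessionTime": session_time, "Sample": range(90)})

windows = sector_windows(fastest)
sector_frames = {
    number: slice_by_session_time(telemetry, start, end)
    for number, (start, end) in windows.items()
}
sector_lengths = {number: len(frame) for number, frame in sector_frames.items()}
```

With half-open windows a sample that falls exactly on a boundary lands in only one sector, which avoids duplicated rows when the sector frames are later concatenated.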

In [None]:
# Q3 telemetry data for Oscar Piastri, McLaren
q1, q2, q3 = session.get_laps(piastri_mcLaren).split_qualifying_sessions()

piastri_q3_telemetry = session.get_telemetry(q3)

piastri_q3_telemetry

In [None]:
# Q3 telemetry data for Max Verstappen, Red Bull
q1, q2, q3 = session.get_laps(verstappen_redBull).split_qualifying_sessions()

verstappen_q3_telemetry = session.get_telemetry(q3)

verstappen_q3_telemetry

In [None]:
# Q3 telemetry data for Sergio Perez, Red Bull
q1, q2, q3 = session.get_laps(perez_redBull).split_qualifying_sessions()

perez_q3_telemetry = session.get_telemetry(q3)

perez_q3_telemetry

In [None]:
# Q3 telemetry data for Carlos Sainz, Ferrari
q1, q2, q3 = session.get_laps(sainz_ferrari).split_qualifying_sessions()

sainz_q3_telemetry = session.get_telemetry(q3)

sainz_q3_telemetry

In [None]:
# Q2 telemetry data for Charles Leclerc, Ferrari
q1, q2, q3 = session.get_laps(leclerc_ferrari).split_qualifying_sessions()

leclerc_q2_telemetry = session.get_telemetry(q2)

leclerc_q2_telemetry

In [None]:
# Q3 telemetry data for Pierre Gasly, Alpine
q1, q2, q3 = session.get_laps(gasly_alpine).split_qualifying_sessions()

gasly_q3_telemetry = session.get_telemetry(q3)

gasly_q3_telemetry

In [None]:
# Q2 telemetry data for Esteban Ocon, Alpine
q1, q2, q3 = session.get_laps(ocon_alpine).split_qualifying_sessions()

ocon_q2_telemetry = session.get_telemetry(q2)

ocon_q2_telemetry

In [None]:
# Q3 telemetry data for Valtteri Bottas, Kick Sauber (formerly Alfa Romeo)
q1, q2, q3 = session.get_laps(bottas_alfaRomeo).split_qualifying_sessions()

bottas_q3_telemetry = session.get_telemetry(q3)

bottas_q3_telemetry

In [None]:
# Q2 telemetry data for Yuki Tsunoda, RB
q1, q2, q3 = session.get_laps(tsunoda_alfaRomeo).split_qualifying_sessions()

tsunoda_q2_telemetry = session.get_telemetry(q2)

tsunoda_q2_telemetry

In [None]:
# Q3 telemetry data for Fernando Alonso, Aston Martin
q1, q2, q3 = session.get_laps(alonso_astonMartin).split_qualifying_sessions()

alonso_q3_telemetry = session.get_telemetry(q3)

alonso_q3_telemetry

In [None]:
# Q2 telemetry data for Lance Stroll, Aston Martin
q1, q2, q3 = session.get_laps(stroll_astonMartin).split_qualifying_sessions()

stroll_q2_telemetry = session.get_telemetry(q2)

stroll_q2_telemetry

In [None]:
# Q3 telemetry data for George Russell, Mercedes
q1, q2, q3 = session.get_laps(russell_mercedes).split_qualifying_sessions()

russell_q3_telemetry = session.get_telemetry(q3)

russell_q3_telemetry

In [None]:
# Q3 telemetry data for Nico Hulkenberg, Haas
q1, q2, q3 = session.get_laps(hulkenberg_haas).split_qualifying_sessions()

hulkenberg_q3_telemetry = session.get_telemetry(q3)

hulkenberg_q3_telemetry
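
The per-driver cells above repeat the same three steps, so they could be collapsed into one loop. The sketch below mirrors the calls used in those cells (`get_laps`, `split_qualifying_sessions`, `get_telemetry`) and the Q2/Q3 choice made for each driver. `collect_telemetry` is a hypothetical helper, and the two stub classes exist only so the example runs without the real session object.

```python
# Driver code -> qualifying segment used in the cells above.
DRIVER_SEGMENTS = {
    "NOR": "Q3", "PIA": "Q3", "VER": "Q3", "PER": "Q3",
    "SAI": "Q3", "LEC": "Q2", "GAS": "Q3", "OCO": "Q2",
    "BOT": "Q3", "TSU": "Q2", "ALO": "Q3", "STR": "Q2",
    "RUS": "Q3", "HUL": "Q3",
}


def collect_telemetry(session, driver_segments):
    """Return {driver code: telemetry} for each driver's chosen segment."""
    telemetry = {}
    for code, segment in driver_segments.items():
        q1, q2, q3 = session.get_laps(code).split_qualifying_sessions()
        laps = {"Q1": q1, "Q2": q2, "Q3": q3}[segment]
        telemetry[code] = session.get_telemetry(laps)
    return telemetry


# Stand-ins for the real session object, used only for this dry run.
class _StubLaps:
    def __init__(self, code):
        self.code = code

    def split_qualifying_sessions(self):
        return [f"{self.code}-Q1", f"{self.code}-Q2", f"{self.code}-Q3"]


class _StubSession:
    def get_laps(self, code):
        return _StubLaps(code)

    def get_telemetry(self, laps):
        return f"telemetry({laps})"


all_telemetry = collect_telemetry(_StubSession(), DRIVER_SEGMENTS)
```

In the notebook, passing the real `session` object instead of the stub would give a dictionary keyed by driver code, which replaces the fourteen separate variables.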