Single Race Exploratory Data Analysis

The purpose of this notebook is to explore and wrangle data from the FastF1 API for a single race. It serves as a starting point for establishing a clean workflow for data acquisition, preparation, and early exploration before scaling to a multi-race analysis. I will focus on building a data dictionary, assessing data quality, generating descriptive statistics, and creating preliminary visualizations to understand the structure and sufficiency of the data. The outcome will be a reproducible workflow for future feature engineering and multi-race EDA.

The code below adds the parent directory to Python’s module search path and sets the FastF1 logger to the WARNING level, which suppresses its debug and info messages. This lets later cells import the project's custom modules and keeps the cell output clean and easy to read.

In [37]:
import sys
import os
import logging

# Add the root directory to sys.path
root = os.path.abspath("..")
sys.path.append(root)

# Suppress FastF1 info logs globally
logging.getLogger('fastf1').setLevel(logging.WARNING)
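
The same setup can be made safe to re-run. The following is a minimal sketch, not the notebook's own code: `add_to_sys_path` is a hypothetical helper that resolves the directory with `pathlib` and appends it only if it is not already on the search path, so executing the cell repeatedly does not pile up duplicate entries.

```python
import logging
import sys
from pathlib import Path


def add_to_sys_path(directory):
    """Append the resolved directory to sys.path once and return it as a string.

    Hypothetical helper for illustration; it is not part of the project.
    """
    resolved = str(Path(directory).resolve())
    if resolved not in sys.path:
        sys.path.append(resolved)
    return resolved


project_root = add_to_sys_path("..")

# Python loggers are hierarchical: a logger named "fastf1.<something>" that has
# no level of its own inherits WARNING from the "fastf1" logger configured here.
logging.getLogger("fastf1").setLevel(logging.WARNING)
```

Appending (rather than inserting at position 0) mirrors the original cell and keeps the standard library ahead of project modules in the lookup order.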

In this section, I import Python libraries for data visualization and numerical analysis, along with pandas, since FastF1 returns most of its data as pandas DataFrame objects. I also import custom modules for accessing preprocessed F1 data and constants. To inspect the datasets without truncation, I configure the pandas display options to show all rows and columns.

In [38]:
from src.data import f1data
from src.utils import f1constants

from matplotlib.collections import LineCollection
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Set pandas display options
pd.set_option('display.max_rows', None)  # show all rows
pd.set_option('display.max_columns', None)  # show all columns
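
Setting the options globally affects every later cell. As a hedged alternative, `pd.option_context` applies the same settings only inside a `with` block and restores the previous values on exit. The frame below is made up purely to have something wider and longer than the default limits.

```python
import pandas as pd

# Illustrative frame only: 100 rows and 30 columns of dummy values
wide = pd.DataFrame({f"col_{i}": range(100) for i in range(30)})

default_rows = pd.get_option("display.max_rows")

# Inside the block nothing is truncated; the earlier settings return on exit
with pd.option_context("display.max_rows", None, "display.max_columns", None):
    rows_inside = pd.get_option("display.max_rows")
    full_repr = repr(wide)

rows_after = pd.get_option("display.max_rows")
```

This is useful when only one or two large tables need full display and the rest of the notebook should stay compact.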


The following code initializes a single F1 race session by defining parameters such as year, location, and session type. These values are passed into the custom F1Session class (from f1data.py), which creates a session object built on top of FastF1. This object provides access to key race data as well as custom functions I’ve implemented.

The session parameters were chosen to best match Tier 1 control qualities:

- Weather: Abu Dhabi (dry conditions)
- Max Speed: C5 tires (the softest compound in the range)
- Minimize Outliers: Top 3 valid laps from Q3
- Teammate Normalization: Cancels out car performance
- Traffic: Remove tow laps to avoid slipstream bias
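
Two of these controls, the top three valid laps and teammate normalization, can be sketched with plain pandas. This is an illustration under stated assumptions, not the notebook's implementation: the lap times, driver codes, and team name are made up, the column names `Driver`, `Team`, `LapTime`, and `Deleted` follow the FastF1 laps table, and `mean_of_best_laps` is a hypothetical helper. Tow-lap removal is not covered here.

```python
import pandas as pd

# Made-up lap times in seconds; codes and team are placeholders, not real results
laps = pd.DataFrame({
    "Driver": ["AAA"] * 4 + ["BBB"] * 4,
    "Team": ["Team X"] * 8,
    "LapTime": pd.to_timedelta(
        [83.1, 82.9, 82.7, 82.5, 83.0, 82.8, 83.4, 82.6], unit="s"
    ),
    "Deleted": [False, False, False, True, False, False, False, False],
})


def mean_of_best_laps(laps, n=3):
    """Mean of each driver's n fastest laps in seconds, ignoring deleted laps."""
    valid = laps.loc[~laps["Deleted"]].dropna(subset=["LapTime"])
    valid = valid.assign(Seconds=valid["LapTime"].dt.total_seconds())
    best = valid.sort_values("Seconds").groupby("Driver").head(n)
    return best.groupby("Driver")["Seconds"].mean()


pace = mean_of_best_laps(laps)

# Both drivers share a car, so the difference reflects the drivers, not the team
teammate_delta = pace["AAA"] - pace["BBB"]
```

Averaging the best few laps instead of taking the single fastest one reduces the influence of one unusually good or bad lap.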

In [None]:
# Define session parameters
year = 2024
grand_prix = f1constants.F1Constants.LOCATIONS["Abu Dhabi"]
session_type = f1constants.F1Constants.SESSIONS["Q"]

# Call session object
session = f1data.F1Session(year, grand_prix, session_type)
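
The source of `F1Session` is not shown in this notebook. To make the idea of a wrapper built on FastF1 concrete, here is a rough sketch under assumptions: `SessionSketch` and its `loader` parameter are hypothetical, and only `fastf1.get_session`, `Session.load`, and `Laps.pick_drivers` are real FastF1 calls. The actual class may differ in every detail.

```python
class SessionSketch:
    """Hypothetical stand-in for the notebook's F1Session; not the real class."""

    def __init__(self, year, grand_prix, session_type, loader=None):
        if loader is None:
            # Imported lazily so the class can be exercised without FastF1
            import fastf1
            loader = fastf1.get_session
        self.session = loader(year, grand_prix, session_type)
        # FastF1 downloads (or reads from cache) timing and telemetry data here
        self.session.load()

    def get_laps(self, driver):
        """Return the laps of one driver, identified by three-letter code."""
        return self.session.laps.pick_drivers(driver)
```

Accepting the loader as an argument is a design choice for this sketch only: it allows a fake session to be supplied in tests so nothing is downloaded.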

I will analyze the top three qualifying teams in which both drivers finished the session. Each driver is assigned to a variable holding their three-letter code, which identifies them in the analysis that follows.

In [None]:
# Drivers from the top 3 qualifying teams (both finished)
# NOTE: all six variables below currently hold 'LEC'; each should hold a distinct driver code

driverOne_teamOne = 'LEC'
driverTwo_teamOne = 'LEC'

driverOne_teamTwo = 'LEC'
driverTwo_teamTwo = 'LEC'

driverOne_teamThree = 'LEC'
driverTwo_teamThree = 'LEC'
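
The driver pairs could also be derived from a results table instead of being typed by hand. The sketch below rests on several assumptions: the table is made up, its column names (`Abbreviation`, `TeamName`, `Position`, `Q3`) mirror FastF1's session results, "finished" is read as "set a Q3 time", and teams are ranked by their best-placed driver. `top_team_pairs` is a hypothetical helper.

```python
import pandas as pd

# Made-up results; codes, teams, and times are placeholders, not real data
results = pd.DataFrame({
    "Abbreviation": ["AAA", "BBB", "CCC", "DDD", "EEE", "FFF", "GGG", "HHH"],
    "TeamName": ["T1", "T2", "T1", "T3", "T2", "T4", "T4", "T3"],
    "Position": [1, 2, 3, 4, 5, 6, 7, 8],
    "Q3": pd.to_timedelta(
        [82.5, 82.6, 82.7, 82.8, float("nan"), 83.0, 83.1, 83.2], unit="s"
    ),
})


def top_team_pairs(results, n_teams=3):
    """Map the best n_teams teams with two Q3 finishers to their driver codes."""
    finished = results.dropna(subset=["Q3"])
    pairs = {}
    for team, group in finished.groupby("TeamName", sort=False):
        if len(group) == 2:  # keep only teams where both drivers set a Q3 time
            ordered = group.sort_values("Position")
            pairs[team] = (ordered["Position"].iloc[0], list(ordered["Abbreviation"]))
    ranked = sorted(pairs.items(), key=lambda item: item[1][0])[:n_teams]
    return {team: drivers for team, (_, drivers) in ranked}


team_pairs = top_team_pairs(results)
```

In this made-up table, team T2 is skipped because one of its drivers has no Q3 time, so the next team with two finishers takes its place.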

In [None]:
# Return third qualifying session (Q3) lap data for driverOne_teamOne
q1, q2, q3 = session.get_laps(driverOne_teamOne).split_qualifying_sessions()

q3

In [None]:
# Return third qualifying session (Q3) lap data for driverTwo_teamOne
q1, q2, q3 = session.get_laps(driverTwo_teamOne).split_qualifying_sessions()

q3

In [None]:
# Return third qualifying session (Q3) lap data for driverOne_teamTwo
q1, q2, q3 = session.get_laps(driverOne_teamTwo).split_qualifying_sessions()

q3

In [None]:
# Return third qualifying session (Q3) lap data for driverTwo_teamTwo
q1, q2, q3 = session.get_laps(driverTwo_teamTwo).split_qualifying_sessions()

q3

In [None]:
# Return third qualifying session (Q3) lap data for driverOne_teamThree
q1, q2, q3 = session.get_laps(driverOne_teamThree).split_qualifying_sessions()

q3

In [None]:
# Return third qualifying session (Q3) lap data for driverTwo_teamThree
q1, q2, q3 = session.get_laps(driverTwo_teamThree).split_qualifying_sessions()

q3
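
The six cells above repeat one pattern, so they could be folded into a loop. This is a sketch only: `q3_laps_by_driver` is a hypothetical helper, and it assumes the session object exposes `get_laps(code)` as used above, returning laps that support FastF1's `split_qualifying_sessions()`.

```python
def q3_laps_by_driver(session, driver_codes):
    """Collect each driver's Q3 laps in a dict keyed by three-letter code."""
    q3_by_driver = {}
    for code in driver_codes:
        # split_qualifying_sessions() returns the Q1, Q2, and Q3 laps in order;
        # an entry can be None when the driver did not take part in that segment
        _, _, q3 = session.get_laps(code).split_qualifying_sessions()
        q3_by_driver[code] = q3
    return q3_by_driver
```

With the variables defined earlier, a call might look like `q3_laps_by_driver(session, [driverOne_teamOne, driverTwo_teamOne])`, after which each driver's Q3 laps are available by code.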

In [None]:
# Get the fastest lap (no driver is passed, so this relies on the method's default)
fastest_lap = session.get_fastest_lap()

# default scrollable display of dataframe
# fastest_lap

In [9]:
# Retrieve telemetry data for the fastest lap
telemetry_of_fastest = session.get_telemetry(fastest_lap)

# telemetry_of_fastest

In [10]:
# Retrieve car data for the specified driver's fastest lap
car_data = fastest_lap.get_car_data()

# car_data
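
As a first step toward descriptive statistics, the car data can be reduced to a few summary values. The sketch below uses a made-up frame with the FastF1 car data column names `Speed`, `Throttle`, `Brake`, and `nGear`; `summarize_car_data` is a hypothetical helper, and the 99 percent threshold for "full throttle" is an arbitrary choice for illustration.

```python
import pandas as pd

# Made-up samples; real car data has many more rows and columns
car_data_example = pd.DataFrame({
    "Speed": [280, 300, 310, 150, 90, 200],         # km/h
    "Throttle": [100, 100, 100, 0, 20, 100],        # percent
    "Brake": [False, False, False, True, True, False],
    "nGear": [7, 8, 8, 4, 2, 6],
})


def summarize_car_data(car_data):
    """Return simple per-lap summary values computed over the samples."""
    return {
        "max_speed": float(car_data["Speed"].max()),
        "mean_speed": float(car_data["Speed"].mean()),
        "full_throttle_share": float((car_data["Throttle"] >= 99).mean()),
        "braking_share": float(car_data["Brake"].mean()),
        "top_gear": int(car_data["nGear"].max()),
    }


summary = summarize_car_data(car_data_example)
```

The shares are fractions of samples, not of lap time; if samples are unevenly spaced, weighting by the time between samples would be more accurate.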