<a href="https://colab.research.google.com/github/GianUOM/F1TelemetryAnalysis/blob/main/F1TelemetryAnalysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# F1 Telemetry Analysis on 2025 Races

In this notebook, we will build visualisations and a Formula 1 data analysis toolkit using Python and the OpenF1 API. I used this Medium article as a guide to get started and to better understand what the API can do: https://python.plainenglish.io/openf1-api-in-action-building-a-google-colab-notebook-for-f1-race-analysis-fee86c301e5b

We'll start by installing and importing the libraries below, which we use for data handling and visualisation.

In [None]:
!pip install pandas matplotlib seaborn plotly requests -q

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objects as go
import plotly.express as px
from plotly.subplots import make_subplots
import requests
import json
from datetime import datetime, timedelta
import warnings

warnings.filterwarnings('ignore')

In [None]:
# Use Matplotlib and seaborn to customise the plot styling
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

The next step is to create an OpenF1 API client so we can start interacting with the API.

In [None]:
class OpenF1API:

  def __init__(self):
    self.base_url = 'https://api.openf1.org/v1' # Base URL of the OpenF1 API

  def get_data(self, endpoint, params=None): # Getting the data from the OpenF1 API
    url = f"{self.base_url}/{endpoint}"
    try:
      response = requests.get(url, params=params, timeout=30) # Time out rather than hang if the API does not respond
      response.raise_for_status()
      return response.json()
    except requests.exceptions.RequestException as e:
      print(f"Error fetching data: {e}")
      return None

  def get_dataframe(self, endpoint, params=None): # Fetch the data from the get_data method and convert it into a DataFrame
    data = self.get_data(endpoint, params)
    if data:
      return pd.DataFrame(data)
    else:
      return pd.DataFrame()


api = OpenF1API() # Create the API client used by the rest of the notebook
print("API client initialised!")

API client initialised!
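
To make it concrete what `get_data` asks the API for, the sketch below rebuilds a request URL by hand with the standard library. `build_url` is a hypothetical helper that is not part of the client above; it only approximates how `requests` appends `params` as a query string for simple values like these.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openf1.org/v1"  # same base URL as the OpenF1API class


def build_url(endpoint, params=None):
    """Hypothetical helper: show the URL an endpoint/params pair maps to."""
    url = f"{BASE_URL}/{endpoint}"
    if params:
        # urlencode turns the dict into key=value pairs joined by '&'
        url = f"{url}?{urlencode(params)}"
    return url


print(build_url("sessions", {"year": 2025, "session_type": "Race"}))
```

Printing the URL like this can be a handy way to test a query in the browser before wiring it into the notebook.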


Next, we start the analysis by fetching the five most recent 2025 race sessions and selecting the latest one that OpenF1 provides.

In [None]:
sessions_df = api.get_dataframe('sessions', {
    'year': 2025,
    'session_type': 'Race'
})

if not sessions_df.empty:
    sessions_df = sessions_df.sort_values('date_start')
    recent_sessions = sessions_df.tail(5)
    print("The 5 most recent F1 Races (2025):")
    print("=" * 60)
    for _, session in recent_sessions.iterrows():
        print(f"    {session['country_name']} GP - {session['location']}")
        print(f"    Session Key: {session['session_key']}")
        print(f"    Date: {session['date_start'][:10]}")
        print()

    selected_session = recent_sessions.iloc[-1]
    SESSION_KEY = selected_session['session_key']
    print(f"    Selected for analysis: {selected_session['country_name']} GP")
    print(f"    Session Key: {SESSION_KEY}")


The 5 most recent F1 Races (2025):
    Azerbaijan GP - Baku
    Session Key: 9904
    Date: 2025-09-21

    Singapore GP - Marina Bay
    Session Key: 9896
    Date: 2025-10-05

    United States GP - Austin
    Session Key: 9883
    Date: 2025-10-18

    United States GP - Austin
    Session Key: 9888
    Date: 2025-10-19

    Mexico GP - Mexico City
    Session Key: 9877
    Date: 2025-10-26

    Selected for analysis: Mexico GP
    Session Key: 9877
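
The output above lists Austin twice, one day apart, which suggests that the `session_type: 'Race'` filter also matches a Sprint session. Below is a minimal sketch of narrowing the list to Grands Prix only. It assumes each session record carries a `session_name` field that distinguishes `'Race'` from `'Sprint'`; the sample records and the `grand_prix_only` helper are made up for illustration and are not real API output.

```python
# Made-up sample records shaped like the 'sessions' response; the
# 'session_name' field and its values are assumptions for illustration.
sample_sessions = [
    {"session_key": 1003, "session_name": "Race", "date_start": "2025-10-26T20:00:00+00:00"},
    {"session_key": 1001, "session_name": "Sprint", "date_start": "2025-10-18T17:00:00+00:00"},
    {"session_key": 1002, "session_name": "Race", "date_start": "2025-10-19T19:00:00+00:00"},
]


def grand_prix_only(sessions):
    """Keep sessions named 'Race' and sort them by start date."""
    races = [s for s in sessions if s.get("session_name") == "Race"]
    # ISO timestamps with the same UTC offset sort correctly as plain strings
    return sorted(races, key=lambda s: s["date_start"])


latest = grand_prix_only(sample_sessions)[-1]
print(latest["session_key"])
```

With a DataFrame, the same idea would be a boolean filter on the `session_name` column before calling `tail(5)`.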


So far, we have connected to the OpenF1 API and retrieved the details and session keys of the five most recent race sessions. Next, we fetch the driver information for the selected session so we can pick two drivers to compare.

In [None]:
drivers_df = api.get_dataframe('drivers', {'session_key': SESSION_KEY})

if not drivers_df.empty:
  driver_info = {}
  for _, driver in drivers_df.iterrows():
    driver_info[driver['driver_number']] = {
        'name':driver['name_acronym'],
        'full_name':driver['full_name'],
        'team':driver['team_name'],
        'color': f"#{driver['team_colour']}"
    }


  teams = drivers_df.groupby('team_name')['name_acronym'].apply(list).to_dict()
  print("Drivers by Team:")
  print("=" * 60)
  for team, drivers in teams.items():
    print(f"{team}: {', '.join(drivers)}")

  DRIVER_1 = 1
  DRIVER_2 = 4
  print(f"\nDrivers Selected for Comparison:")
  print(f"Driver 1: {driver_info[DRIVER_1]['full_name']}")
  print(f"Driver 2: {driver_info[DRIVER_2]['full_name']}")


Drivers by Team:
Alpine: GAS, COL
Aston Martin: ALO, STR
Ferrari: LEC, HAM
Haas F1 Team: OCO, BEA
Kick Sauber: BOR, HUL
McLaren: NOR, PIA
Mercedes: ANT, RUS
Racing Bulls: HAD, LAW
Red Bull Racing: VER, TSU
Williams: ALB, SAI

Drivers Selected for Comparison:
Driver 1: Max VERSTAPPEN
Driver 2: Lando NORRIS
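
The driver numbers above are hard-coded, which means you have to know that 1 is Verstappen and 4 is Norris. As a small sketch of an alternative, the hypothetical helper `number_for` below looks a driver up by acronym in a mapping shaped like `driver_info`. The sample mapping is a trimmed copy containing only the two drivers selected above.

```python
# Trimmed copy of the driver_info mapping built above (number -> details)
sample_driver_info = {
    1: {"name": "VER", "full_name": "Max VERSTAPPEN"},
    4: {"name": "NOR", "full_name": "Lando NORRIS"},
}


def number_for(acronym, info):
    """Hypothetical helper: return the driver number for a three-letter acronym."""
    for number, details in info.items():
        if details["name"] == acronym.upper():
            return number
    raise KeyError(f"No driver with acronym {acronym!r}")


print(number_for("nor", sample_driver_info))
```

In the notebook this could be used as `DRIVER_2 = number_for('NOR', driver_info)`, so the selection stays readable if numbers change between seasons.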


Now for the fun part! We will work through several ways of analysing driver performance, similar in spirit to what actual Formula 1 teams probably do.

In [None]:
def format_lap_time(seconds):
    # Round once to whole milliseconds so a value such as 59.9996 carries over to 1:00.000
    total_millis = int(round(seconds * 1000))
    minutes, remainder = divmod(total_millis, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{minutes}:{secs:02d}.{millis:03d}"

# I wrote this helper to convert lap times from seconds to m:ss.SSS, the format used on F1 broadcasts, so they are easier to read

# A. Fetching Lap Data from Selected Drivers

In [None]:
# Fetch both drivers' lap times from the API using the session key and driver numbers defined in the previous cells
laps_driver1 = api.get_dataframe('laps', {
    'session_key': SESSION_KEY,
    'driver_number': DRIVER_1
})

laps_driver2 = api.get_dataframe('laps', {
    'session_key': SESSION_KEY,
    'driver_number': DRIVER_2
})

if not laps_driver1.empty and not laps_driver2.empty:
    fig = go.Figure()
    # Here we will be adding a scatter/line plot for both drivers to compare their lap times
    fig.add_trace(go.Scatter(
        x=laps_driver1['lap_number'],# Defining the x-axis as lap number
        y=laps_driver1['lap_duration'],# Defining the y-axis as lap timing
        mode='lines+markers',# Basically, displaying both lines and markers on the plot
        name=driver_info[DRIVER_1]['name'],# Legend name for the 1st driver
        line=dict(color=driver_info[DRIVER_1]['color'], width=3),# Line colour of the 1st driver based on the API's colour value
        marker=dict(size=6)# Size of the data points
    ))

    fig.add_trace(go.Scatter(
        x=laps_driver2['lap_number'],# Defining the x-axis as lap number
        y=laps_driver2['lap_duration'],# Defining the y-axis as lap timing
        mode='lines+markers',# Basically, displaying both lines and markers on the plot
        name=driver_info[DRIVER_2]['name'],# Legend name for the 2nd driver
        line=dict(color=driver_info[DRIVER_2]['color'], width=3),# Line colour of the 2nd driver based on the API's colour value
        marker=dict(size=6)# Size of the data points
    ))

    # Editing the titles of the plot and other tweaks for the plot
    fig.update_layout(
        title='Lap Time Comparison',
        xaxis_title='Lap Number',
        yaxis_title='Lap Time (seconds)',
        hovermode='x unified',
        template='plotly_dark',
        height=500
    )

    fig.show()# Displaying the plot

    # Displaying the stats for the average and fastest lap time for both drivers in a race to further compare them
    print("\nLap Time Statistics:")
    print("=" * 50)
    print(f"{driver_info[DRIVER_1]['name']}:")
    print(f"  Fastest: {format_lap_time(laps_driver1['lap_duration'].min())}")
    print(f"  Average: {format_lap_time(laps_driver1['lap_duration'].mean())}")
    print(f"\n{driver_info[DRIVER_2]['name']}:")
    print(f"  Fastest: {format_lap_time(laps_driver2['lap_duration'].min())}")
    print(f"  Average: {format_lap_time(laps_driver2['lap_duration'].mean())}")



Lap Time Statistics:
VER:
  Fastest: 1:21.108
  Average: 1:23.169

NOR:
  Fastest: 1:20.764
  Average: 1:22.769
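
The averages above are taken over every lap with a recorded time, so slow laps such as pit-out laps pull them upwards. Below is a minimal sketch of a "clean lap" average, assuming the lap records include an `is_pit_out_lap` flag and may contain missing durations. The sample laps and the `clean_lap_average` helper are made up for illustration.

```python
from statistics import mean

# Made-up lap records; 'is_pit_out_lap' is assumed to be part of the
# 'laps' response, and None stands for a missing lap time.
sample_laps = [
    {"lap_number": 1, "lap_duration": None, "is_pit_out_lap": False},
    {"lap_number": 2, "lap_duration": 82.0, "is_pit_out_lap": False},
    {"lap_number": 3, "lap_duration": 84.0, "is_pit_out_lap": False},
    {"lap_number": 4, "lap_duration": 101.5, "is_pit_out_lap": True},
]


def clean_lap_average(laps):
    """Average lap time, skipping missing durations and pit-out laps."""
    durations = [
        lap["lap_duration"]
        for lap in laps
        if lap["lap_duration"] is not None and not lap.get("is_pit_out_lap")
    ]
    return mean(durations) if durations else None


print(clean_lap_average(sample_laps))
```

This only removes pit-out laps; laps slowed by a pit entry or a safety car would need their own filter, for example a cut-off relative to the median lap time.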


# B. Getting Telemetry Data for a Specific Lap

In [None]:
def get_telemetry_for_lap(session_key, driver_number, lap_number):
    # Look up the requested lap first. Note that the 'laps' endpoint returns
    # lap timing records, not car telemetry samples.
    laps_df = api.get_dataframe('laps', {
        'session_key': session_key,
        'driver_number': driver_number,
        'lap_number': lap_number
    })

    if laps_df.empty:
        print(f"No lap data found for driver {driver_number} in lap {lap_number}.")
        return None, None

    lap_info = laps_df.iloc[0]  # The single matching lap record, as a Series
    print()  # Blank line to separate the output

    return laps_df, lap_info
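
The function above queries the `laps` endpoint, so what it returns is lap timing rather than telemetry samples. One way to get from a lap record to telemetry is to work out the lap's time window and then request only the samples inside it. The sketch below covers the first half of that idea. It assumes the lap record exposes `date_start` as an ISO 8601 string and `lap_duration` in seconds; the `lap_time_window` helper and the sample values are hypothetical.

```python
from datetime import datetime, timedelta


def lap_time_window(date_start, lap_duration):
    """Hypothetical helper: return (start, end) ISO timestamps for one lap."""
    start = datetime.fromisoformat(date_start)
    end = start + timedelta(seconds=lap_duration)
    return start.isoformat(), end.isoformat()


# Made-up sample values, not taken from the API
start, end = lap_time_window("2025-10-26T20:05:00+00:00", 81.5)
print(start)
print(end)
```

The two timestamps could then serve as lower and upper bounds on the `date` field of a telemetry endpoint such as `car_data`. How those comparison filters need to be encoded when sent through `requests` is worth checking against the OpenF1 documentation before relying on it.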