# F1 Driver DNA Analysis

This project analyzes Formula 1 drivers' telemetry data to identify distinctive driving styles and patterns. Using machine learning, we classify drivers into different style categories and examine how these styles adapt across track types and weather conditions.

## Key Questions Explored
- What distinct driving styles exist among F1 drivers?
- How do drivers adapt their styles to different track types?
- How does weather affect driving approaches?
- Which drivers show the most versatility vs. consistency?

In [None]:
import fastf1
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import os
import time
from fastf1.plotting import setup_mpl
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import AgglomerativeClustering
import warnings
import pickle

from utils.feature_extraction import extract_driver_dna



## Setup and Configuration

We enable FastF1's on-disk cache so that session data is downloaded once and reused on later runs. This avoids repeated API requests and makes it practical to work with the large telemetry dataset.


In [2]:
warnings.filterwarnings('ignore')

# Setup cache
cache_dir = 'cache'
os.makedirs(cache_dir, exist_ok=True)
fastf1.Cache.enable_cache(cache_dir)
setup_mpl(mpl_timedelta_support=True, misc_mpl_mods=False)

print(f"FastF1 version: {fastf1.__version__}")

FastF1 version: 3.5.3


## Data Collection Approach

For reliable data access, we avoid looking events up by name through the FastF1 schedule (which can be unstable) and instead load each session directly by round number, using hard-coded round-to-track maps for reference. This gives us consistent access to qualifying sessions across the 2022 and 2023 seasons.

### Data Structure:
- **Session objects**: Raw qualifying data for each race
- **Driver appearances**: Track which drivers participated in which sessions
- **Track classification**: Categorize tracks by type (high-speed, technical, street)
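
The bookkeeping behind these three structures can be sketched in plain Python, independent of FastF1. This is a minimal illustration under stated assumptions: the `TRACK_TYPES` labels, the helper names (`classify_weather`, `record_appearances`, `filter_qualified`), and the toy session data are all hypothetical and are not taken from the notebook's actual classification or results.

```python
from collections import defaultdict

# Hypothetical track-type labels (illustrative assumptions only,
# not the notebook's actual classification).
TRACK_TYPES = {
    'Monaco': 'street',
    'Singapore': 'street',
    'Italy': 'high-speed',
    'Belgium': 'high-speed',
    'Hungary': 'technical',
}

WET_COMPOUNDS = {'WET', 'INTERMEDIATE'}


def classify_weather(compounds):
    """Return 'wet' if any wet-weather compound was used, else 'dry'."""
    return 'wet' if WET_COMPOUNDS & set(compounds) else 'dry'


def record_appearances(sessions):
    """Map each driver to the (year, track) sessions they appeared in."""
    appearances = defaultdict(list)
    for key, info in sessions.items():
        for driver in info['drivers']:
            appearances[driver].append(key)
    return dict(appearances)


def filter_qualified(appearances, min_sessions):
    """Keep only drivers with at least `min_sessions` appearances."""
    return {d: a for d, a in appearances.items() if len(a) >= min_sessions}


# Toy data standing in for loaded sessions (made up, not real results).
sessions = {
    (2022, 'Monaco'): {'drivers': ['VER', 'LEC'], 'compounds': ['SOFT']},
    (2022, 'Italy'): {'drivers': ['VER'], 'compounds': ['SOFT', 'INTERMEDIATE']},
}

# Annotate each session with its weather and (hypothetical) track type.
for (year, track), info in sessions.items():
    info['weather'] = classify_weather(info['compounds'])
    info['track_type'] = TRACK_TYPES.get(track, 'unknown')

appearances = record_appearances(sessions)
qualified = filter_qualified(appearances, min_sessions=2)
print(qualified)
```

Keeping these steps as small pure functions makes the filtering thresholds easy to check on toy data before running the full, slower session download.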

In [3]:
# BYPASS THE SCHEDULE API: Use direct round numbers instead of track names
print("\nDirectly loading sessions by round number...")

# Map of season rounds to track names for reference
# These are the round numbers for each race in the calendar
rounds_2022 = {
    1: 'Bahrain', 2: 'Saudi Arabia', 3: 'Australia', 4: 'Emilia Romagna', 5: 'Miami',
    6: 'Spain', 7: 'Monaco', 8: 'Azerbaijan', 9: 'Canada', 10: 'Great Britain',
    11: 'Austria', 12: 'France', 13: 'Hungary', 14: 'Belgium', 15: 'Netherlands',
    16: 'Italy', 17: 'Singapore', 18: 'Japan', 19: 'United States', 20: 'Mexico',
    21: 'Brazil', 22: 'Abu Dhabi'
}

rounds_2023 = {
    1: 'Bahrain', 2: 'Saudi Arabia', 3: 'Australia', 4: 'Azerbaijan', 5: 'Miami',
    6: 'Monaco', 7: 'Spain', 8: 'Canada', 9: 'Austria', 10: 'Great Britain',
    11: 'Hungary', 12: 'Belgium', 13: 'Netherlands', 14: 'Italy', 15: 'Singapore',
    16: 'Japan', 17: 'Qatar', 18: 'United States', 19: 'Mexico', 20: 'Brazil',
    21: 'Las Vegas', 22: 'Abu Dhabi'
}

# Define seasons and minimum required sessions
seasons_to_consider = [2022, 2023]
min_driver_sessions = 4

# Store discovered sessions and drivers
discovered_sessions = {}
driver_appearances = {}

# Try to load sessions directly by round number
for year in seasons_to_consider:
    rounds_map = rounds_2022 if year == 2022 else rounds_2023
    
    for round_num, track_name in rounds_map.items():
        try:
            print(f"Loading {year} Round {round_num} ({track_name}) qualifying...")
            
            # Request the session by round number rather than by event name
            session = fastf1.get_session(year, round_num, 'Q', force_ergast=False)
            
            # Add delay to prevent rate limiting
            time.sleep(0.5)
            
            # Load the session data
            session.load()
            
            # Get drivers with at least one recorded lap in the session
            session_drivers = session.laps['Driver'].unique().tolist()
            
            # Only include sessions with sufficient drivers
            if len(session_drivers) >= 10:
                # Extract weather condition
                weather_condition = 'dry'  # Default
                
                # Try to determine if it's wet
                try:
                    # Check if wet tires were used
                    tires_used = session.laps['Compound'].unique()
                    if any(x in tires_used for x in ['WET', 'INTERMEDIATE']):
                        weather_condition = 'wet'
                except Exception:
                    # Compound data unavailable; keep the 'dry' default
                    pass
                
                # Store session information
                discovered_sessions[(year, track_name)] = {
                    'session': session,
                    'drivers': session_drivers,
                    'weather': weather_condition
                }
                
                # Track driver appearances
                for driver in session_drivers:
                    driver_appearances.setdefault(driver, []).append((year, track_name))
                
                print(f"✓ Successfully loaded: {year} {track_name} qualifying with {len(session_drivers)} drivers")
            else:
                print(f"× Skipped: {year} {track_name} - Not enough drivers ({len(session_drivers)})")
        except Exception as e:
            print(f"× Failed: {year} {track_name} - {e}")


# Filter qualified drivers
qualified_drivers = {
    driver: appearances 
    for driver, appearances in driver_appearances.items() 
    if len(appearances) >= min_driver_sessions
}

print(f"\nFound {len(qualified_drivers)} qualified drivers:")
for driver, appearances in qualified_drivers.items():
    print(f"  {driver}: {len(appearances)} sessions")


Directly loading sessions by round number...
Loading 2022 Round 1 (Bahrain) qualifying...


core           INFO 	Loading data for Bahrain Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['16', '1', '55', '11', '44', '77', '20', '14', '63', '10', '31', '47', '4', '23', '24', '22', '27', '3', '18', '6']


✓ Successfully loaded: 2022 Bahrain qualifying with 20 drivers
Loading 2022 Round 2 (Saudi Arabia) qualifying...


core           INFO 	Loading data for Saudi Arabian Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['11', '16', '55', '1', '31', '63', '14', '77', '10', '20', '4', '3', '24', '47', '18', '44', '23', '27', '6', '22']


✓ Successfully loaded: 2022 Saudi Arabia qualifying with 20 drivers
Loading 2022 Round 3 (Australia) qualifying...


core           INFO 	Loading data for Australian Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['16', '1', '11', '4', '44', '63', '3', '31', '55', '14', '10', '77', '22', '24', '47', '23', '20', '5', '6', '18']


✓ Successfully loaded: 2022 Australia qualifying with 20 drivers
Loading 2022 Round 4 (Emilia Romagna) qualifying...


core           INFO 	Loading data for Emilia Romagna Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '16', '4', '20', '14', '3', '11', '77', '5', '55', '63', '47', '44', '24', '18', '22', '10', '6', '31', '23']


✓ Successfully loaded: 2022 Emilia Romagna qualifying with 20 drivers
Loading 2022 Round 5 (Miami) qualifying...


core           INFO 	Loading data for Miami Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['16', '55', '1', '11', '77', '44', '10', '4', '22', '18', '14', '63', '5', '3', '47', '20', '24', '23', '6', '31']


✓ Successfully loaded: 2022 Miami qualifying with 19 drivers
Loading 2022 Round 6 (Spain) qualifying...


core           INFO 	Loading data for Spanish Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['16', '1', '55', '63', '11', '44', '77', '20', '3', '47', '4', '31', '22', '10', '24', '5', '14', '18', '23', '6']


✓ Successfully loaded: 2022 Spain qualifying with 20 drivers
Loading 2022 Round 7 (Monaco) qualifying...


core           INFO 	Loading data for Monaco Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['16', '55', '11', '1', '4', '63', '14', '44', '5', '31', '22', '77', '20', '3', '47', '23', '10', '18', '6', '24']


✓ Successfully loaded: 2022 Monaco qualifying with 20 drivers
Loading 2022 Round 8 (Azerbaijan) qualifying...


core           INFO 	Loading data for Azerbaijan Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['16', '11', '1', '55', '63', '10', '44', '22', '5', '14', '4', '3', '31', '24', '77', '20', '23', '6', '18', '47']


✓ Successfully loaded: 2022 Azerbaijan qualifying with 20 drivers
Loading 2022 Round 9 (Canada) qualifying...


core           INFO 	Loading data for Canadian Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '14', '55', '44', '20', '47', '31', '63', '3', '24', '77', '23', '11', '4', '16', '10', '5', '18', '6', '22']


✓ Successfully loaded: 2022 Canada qualifying with 20 drivers
Loading 2022 Round 10 (Great Britain) qualifying...


core           INFO 	Loading data for British Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['55', '1', '16', '11', '44', '4', '14', '63', '24', '6', '10', '77', '22', '3', '31', '23', '20', '5', '47', '18']


✓ Successfully loaded: 2022 Great Britain qualifying with 20 drivers
Loading 2022 Round 11 (Austria) qualifying...


core           INFO 	Loading data for Austrian Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '16', '55', '63', '31', '20', '47', '14', '44', '10', '23', '77', '11', '22', '4', '3', '18', '24', '6', '5']


✓ Successfully loaded: 2022 Austria qualifying with 20 drivers
Loading 2022 Round 12 (France) qualifying...


core           INFO 	Loading data for French Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['16', '1', '11', '44', '4', '63', '14', '22', '55', '20', '3', '31', '77', '5', '23', '10', '18', '24', '47', '6']


✓ Successfully loaded: 2022 France qualifying with 20 drivers
Loading 2022 Round 13 (Hungary) qualifying...


core           INFO 	Loading data for Hungarian Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['63', '55', '16', '4', '31', '14', '44', '77', '3', '1', '11', '24', '20', '18', '47', '22', '23', '5', '10', '6']


✓ Successfully loaded: 2022 Hungary qualifying with 20 drivers
Loading 2022 Round 14 (Belgium) qualifying...


core           INFO 	Loading data for Belgian Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '55', '11', '16', '31', '14', '44', '63', '23', '4', '3', '10', '24', '18', '47', '5', '6', '20', '22', '77']


✓ Successfully loaded: 2022 Belgium qualifying with 20 drivers
Loading 2022 Round 15 (Netherlands) qualifying...


core           INFO 	Loading data for Dutch Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '16', '55', '44', '11', '63', '4', '47', '22', '18', '10', '31', '14', '24', '23', '77', '3', '20', '5', '6']


✓ Successfully loaded: 2022 Netherlands qualifying with 20 drivers
Loading 2022 Round 16 (Italy) qualifying...


core           INFO 	Loading data for Italian Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['16', '1', '55', '11', '44', '63', '4', '3', '10', '14', '31', '77', '45', '24', '22', '6', '5', '18', '20', '47']


✓ Successfully loaded: 2022 Italy qualifying with 20 drivers
Loading 2022 Round 17 (Singapore) qualifying...


core           INFO 	Loading data for Singapore Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['16', '11', '44', '55', '14', '4', '10', '1', '20', '22', '63', '18', '47', '5', '24', '77', '3', '31', '23', '6']


✓ Successfully loaded: 2022 Singapore qualifying with 20 drivers
Loading 2022 Round 18 (Japan) qualifying...


core           INFO 	Loading data for Japanese Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '16', '55', '11', '31', '44', '14', '63', '5', '4', '3', '77', '22', '24', '47', '23', '10', '20', '18', '6']


✓ Successfully loaded: 2022 Japan qualifying with 20 drivers
Loading 2022 Round 19 (United States) qualifying...


core           INFO 	Loading data for United States Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['55', '16', '1', '11', '44', '63', '18', '4', '14', '77', '23', '5', '10', '24', '22', '20', '3', '31', '47', '6']


✓ Successfully loaded: 2022 United States qualifying with 20 drivers
Loading 2022 Round 20 (Mexico) qualifying...


core           INFO 	Loading data for Mexico City Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '63', '44', '11', '55', '77', '16', '4', '14', '31', '3', '24', '22', '10', '20', '47', '5', '18', '23', '6']


✓ Successfully loaded: 2022 Mexico qualifying with 20 drivers
Loading 2022 Round 21 (Brazil) qualifying...


core           INFO 	Loading data for São Paulo Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['20', '1', '63', '4', '55', '31', '14', '44', '11', '16', '23', '10', '5', '3', '18', '6', '24', '77', '22', '47']


✓ Successfully loaded: 2022 Brazil qualifying with 20 drivers
Loading 2022 Round 22 (Abu Dhabi) qualifying...


core           INFO 	Loading data for Abu Dhabi Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '11', '16', '55', '44', '63', '4', '31', '5', '3', '14', '22', '47', '18', '24', '20', '10', '77', '23', '6']


✓ Successfully loaded: 2022 Abu Dhabi qualifying with 20 drivers
Loading 2023 Round 1 (Bahrain) qualifying...


core           INFO 	Loading data for Bahrain Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '11', '16', '55', '14', '63', '44', '18', '31', '27', '4', '77', '24', '22', '23', '2', '20', '81', '21', '10']


✓ Successfully loaded: 2023 Bahrain qualifying with 20 drivers
Loading 2023 Round 2 (Saudi Arabia) qualifying...


core           INFO 	Loading data for Saudi Arabian Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['11', '16', '14', '63', '55', '18', '31', '44', '81', '10', '27', '24', '20', '77', '1', '22', '23', '21', '4', '2']


✓ Successfully loaded: 2023 Saudi Arabia qualifying with 20 drivers
Loading 2023 Round 3 (Australia) qualifying...


core           INFO 	Loading data for Australian Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '63', '44', '14', '55', '18', '16', '23', '10', '27', '31', '22', '4', '20', '21', '81', '24', '2', '77', '11']


✓ Successfully loaded: 2023 Australia qualifying with 20 drivers
Loading 2023 Round 4 (Azerbaijan) qualifying...


core           INFO 	Loading data for Azerbaijan Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['16', '1', '11', '55', '44', '14', '4', '22', '18', '81', '63', '31', '23', '77', '2', '24', '27', '20', '10', '21']


✓ Successfully loaded: 2023 Azerbaijan qualifying with 20 drivers
Loading 2023 Round 5 (Miami) qualifying...


core           INFO 	Loading data for Miami Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['11', '14', '55', '20', '10', '63', '16', '31', '1', '77', '23', '27', '44', '24', '21', '4', '22', '18', '81', '2']


✓ Successfully loaded: 2023 Miami qualifying with 20 drivers
Loading 2023 Round 6 (Monaco) qualifying...


core           INFO 	Loading data for Monaco Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '14', '16', '31', '55', '44', '10', '63', '22', '4', '81', '21', '23', '18', '77', '2', '20', '27', '24', '11']


✓ Successfully loaded: 2023 Monaco qualifying with 20 drivers
Loading 2023 Round 7 (Spain) qualifying...


core           INFO 	Loading data for Spanish Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '55', '4', '10', '44', '18', '31', '27', '14', '81', '11', '63', '24', '21', '22', '77', '20', '23', '16', '2']


✓ Successfully loaded: 2023 Spain qualifying with 20 drivers
Loading 2023 Round 8 (Canada) qualifying...


core           INFO 	Loading data for Canadian Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '27', '14', '44', '63', '31', '4', '55', '81', '23', '16', '11', '18', '20', '77', '22', '10', '21', '2', '24']


✓ Successfully loaded: 2023 Canada qualifying with 20 drivers
Loading 2023 Round 9 (Austria) qualifying...


core           INFO 	Loading data for Austrian Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '16', '55', '4', '44', '18', '14', '27', '10', '23', '63', '31', '81', '77', '11', '22', '24', '2', '20', '21']


✓ Successfully loaded: 2023 Austria qualifying with 20 drivers
Loading 2023 Round 10 (Great Britain) qualifying...


core           INFO 	Loading data for British Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '4', '81', '16', '55', '63', '44', '23', '14', '10', '27', '18', '31', '2', '77', '11', '22', '24', '21', '20']


✓ Successfully loaded: 2023 Great Britain qualifying with 20 drivers
Loading 2023 Round 11 (Hungary) qualifying...


core           INFO 	Loading data for Hungarian Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['44', '1', '4', '81', '24', '16', '77', '14', '11', '27', '55', '31', '3', '18', '10', '23', '22', '63', '20', '2']


✓ Successfully loaded: 2023 Hungary qualifying with 20 drivers
Loading 2023 Round 12 (Belgium) qualifying...


core           INFO 	Loading data for Belgian Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '16', '11', '44', '55', '81', '4', '63', '14', '18', '22', '10', '20', '77', '31', '23', '24', '2', '3', '27']


✓ Successfully loaded: 2023 Belgium qualifying with 20 drivers
Loading 2023 Round 13 (Netherlands) qualifying...


core           INFO 	Loading data for Dutch Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '4', '63', '23', '14', '55', '11', '81', '16', '2', '18', '10', '44', '22', '27', '24', '31', '20', '77', '40']


✓ Successfully loaded: 2023 Netherlands qualifying with 20 drivers
Loading 2023 Round 14 (Italy) qualifying...


core           INFO 	Loading data for Italian Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['55', '1', '16', '63', '11', '23', '81', '44', '4', '14', '22', '40', '27', '77', '2', '24', '10', '31', '20', '18']


✓ Successfully loaded: 2023 Italy qualifying with 20 drivers
Loading 2023 Round 15 (Singapore) qualifying...


core           INFO 	Loading data for Singapore Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['55', '63', '16', '4', '44', '20', '14', '31', '27', '40', '1', '10', '11', '23', '22', '77', '81', '2', '24', '18']


✓ Successfully loaded: 2023 Singapore qualifying with 20 drivers
Loading 2023 Round 16 (Japan) qualifying...


core           INFO 	Loading data for Japanese Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '81', '4', '16', '11', '55', '44', '63', '22', '14', '40', '10', '23', '31', '20', '77', '18', '27', '24', '2']


✓ Successfully loaded: 2023 Japan qualifying with 20 drivers
Loading 2023 Round 17 (Qatar) qualifying...


core           INFO 	Loading data for Qatar Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '63', '44', '14', '16', '81', '10', '31', '77', '4', '22', '55', '11', '23', '27', '2', '18', '40', '20', '24']


✓ Successfully loaded: 2023 Qatar qualifying with 20 drivers
Loading 2023 Round 18 (United States) qualifying...


core           INFO 	Loading data for United States Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['16', '4', '44', '55', '63', '1', '10', '31', '11', '81', '22', '24', '77', '20', '3', '27', '14', '23', '18', '2']


✓ Successfully loaded: 2023 United States qualifying with 20 drivers
Loading 2023 Round 19 (Mexico) qualifying...


core           INFO 	Loading data for Mexico City Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['16', '55', '1', '3', '11', '44', '81', '63', '77', '24', '10', '27', '14', '23', '22', '31', '20', '18', '4', '2']


✓ Successfully loaded: 2023 Mexico qualifying with 20 drivers
Loading 2023 Round 20 (Brazil) qualifying...


core           INFO 	Loading data for São Paulo Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '16', '18', '14', '44', '63', '4', '55', '11', '81', '27', '31', '10', '20', '23', '22', '3', '77', '2', '24']


✓ Successfully loaded: 2023 Brazil qualifying with 20 drivers
Loading 2023 Round 21 (Las Vegas) qualifying...


core           INFO 	Loading data for Las Vegas Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['16', '55', '1', '63', '10', '23', '2', '77', '20', '14', '44', '11', '27', '18', '3', '4', '31', '24', '81', '22']


✓ Successfully loaded: 2023 Las Vegas qualifying with 20 drivers
Loading 2023 Round 22 (Abu Dhabi) qualifying...


core           INFO 	Loading data for Abu Dhabi Grand Prix - Qualifying [v3.5.3]
req            INFO 	Using cached data for session_info
req            INFO 	Using cached data for driver_info
req            INFO 	Using cached data for session_status_data
req            INFO 	Using cached data for track_status_data
req            INFO 	Using cached data for _extended_timing_data
req            INFO 	Using cached data for timing_app_data
core           INFO 	Processing timing data...
req            INFO 	Using cached data for car_data
req            INFO 	Using cached data for position_data
req            INFO 	Using cached data for weather_data
req            INFO 	Using cached data for race_control_messages
core           INFO 	Finished loading data for 20 drivers: ['1', '16', '81', '63', '4', '22', '14', '27', '11', '10', '44', '31', '18', '23', '3', '55', '20', '77', '24', '2']


✓ Successfully loaded: 2023 Abu Dhabi qualifying with 20 drivers

Found 25 qualified drivers:
  LEC: 44 sessions
  VER: 44 sessions
  SAI: 44 sessions
  PER: 44 sessions
  HAM: 44 sessions
  BOT: 44 sessions
  MAG: 44 sessions
  ALO: 44 sessions
  RUS: 44 sessions
  GAS: 44 sessions
  OCO: 43 sessions
  MSC: 22 sessions
  NOR: 44 sessions
  ALB: 43 sessions
  ZHO: 44 sessions
  TSU: 44 sessions
  HUL: 24 sessions
  RIC: 29 sessions
  STR: 44 sessions
  LAT: 22 sessions
  VET: 20 sessions
  DEV: 11 sessions
  SAR: 22 sessions
  PIA: 22 sessions
  LAW: 5 sessions


## Track Type Classification

F1 circuits vary significantly in their characteristics, which influences driving approaches. We classify tracks into three categories:

1. **High-Speed**: Tracks with long straights and fast corners (e.g., Monza, Spa)
2. **Technical**: Tracks with complex corner sequences requiring precision (e.g., Hungaroring)
3. **Street**: City circuits with walls close to the track and unique surfaces (e.g., Monaco)

In [4]:
# 2. TRACK TYPE AND WEATHER CLASSIFICATION
# Classify each track as one of three types: high_speed, technical, or street
track_types = {
    # High-speed tracks
    'Monza': 'high_speed',
    'Spa': 'high_speed',
    'Silverstone': 'high_speed',
    'Jeddah': 'high_speed',
    'Baku': 'high_speed',
    'Las Vegas': 'high_speed',
    
    # Technical tracks
    'Hungaroring': 'technical',
    'Barcelona': 'technical',
    'Zandvoort': 'technical',
    'Suzuka': 'technical',
    'Imola': 'technical',
    'Lusail': 'technical',
    'Mexico City': 'technical',
    'Austin': 'technical',
    
    # Street circuits
    'Monaco': 'street',
    'Singapore': 'street',
    'Melbourne': 'street',
    'Miami': 'street',
}

# For tracks not in our classification, classify based on name
for year_track in discovered_sessions:
    track_name = year_track[1]
    if track_name not in track_types:
        if any(x in track_name.lower() for x in ['street', 'monte', 'marina', 'vegas', 'miami']):
            track_types[track_name] = 'street'
        elif any(x in track_name.lower() for x in ['monza', 'spa', 'silverstone', 'baku']):
            track_types[track_name] = 'high_speed'
        else:
            track_types[track_name] = 'technical'
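
One thing worth checking: the processing log further down labels sessions with country-style names (for example `Saudi Arabia`, reported as `technical`), while the dictionary above is keyed by circuit names (`Jeddah`, listed as `high_speed`). When the two naming schemes differ, the lookup misses and the name-based fallback decides the type. A minimal sketch of an alias step follows; `TRACK_ALIASES` and `classify_track` are hypothetical names that are not part of this notebook, and only a few aliases are shown.

```python
# Hypothetical alias map: session label -> key used in track_types.
# Only a handful of entries are shown; extend it to match your own session labels.
TRACK_ALIASES = {
    "Saudi Arabia": "Jeddah",
    "Azerbaijan": "Baku",
    "Great Britain": "Silverstone",
    "Italy": "Monza",
    "Belgium": "Spa",
}


def classify_track(label, track_types, default="technical"):
    """Resolve a session label to a track type, going through the alias map first."""
    key = TRACK_ALIASES.get(label, label)
    return track_types.get(key, default)


# Small stand-in for the notebook's track_types dictionary
example_types = {"Jeddah": "high_speed", "Monaco": "street"}

print(classify_track("Saudi Arabia", example_types))  # resolved via alias
print(classify_track("Monaco", example_types))        # direct hit
print(classify_track("Bahrain", example_types))       # falls back to the default
```

Resolving aliases before the lookup keeps the hand-written classification authoritative and leaves the substring heuristics as a last resort.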

## Driver DNA Feature Extraction

The heart of our analysis is the `extract_driver_dna` function, which transforms raw telemetry data into meaningful features that characterize driving style. 

### Features Extracted:
- **Braking patterns**: How aggressively drivers apply brakes
- **Throttle application**: How drivers modulate the throttle
- **Speed variability**: How much speed changes throughout a lap
- **Gear selection**: Shifting patterns and preferences
- **Corner approach**: Entry/exit speed ratios and minimum speeds
- **Racing line**: Smoothness of the driven line

These features collectively create a "fingerprint" of each driver's approach.
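
The implementation of `extract_driver_dna` lives in `utils/feature_extraction.py` and is not shown here. As a rough illustration only, the sketch below computes four of the feature names used later in this notebook from a synthetic lap. The column names (`Speed`, `Throttle`, `Brake`, `nGear`) follow FastF1's car telemetry, but the feature definitions themselves are assumptions and may differ from the project's actual code; `sketch_basic_features` is a hypothetical helper.

```python
import numpy as np
import pandas as pd


def sketch_basic_features(telemetry: pd.DataFrame) -> dict:
    """Illustrative features from one lap of telemetry (assumed definitions)."""
    speed = telemetry["Speed"].to_numpy(dtype=float)
    throttle = telemetry["Throttle"].to_numpy(dtype=float)
    brake = telemetry["Brake"].to_numpy(dtype=float)
    gear = telemetry["nGear"].to_numpy()
    return {
        # Share of samples with the brake applied
        "avg_brake_intensity": float(np.mean(brake > 0)),
        # Mean throttle position, rescaled from percent to the 0-1 range
        "avg_throttle_intensity": float(np.mean(throttle) / 100.0),
        # Coefficient of variation of speed over the lap
        "speed_variability": float(np.std(speed) / np.mean(speed)),
        # Number of samples where the gear differs from the previous sample
        "gear_changes": int(np.count_nonzero(np.diff(gear))),
    }


# Synthetic lap fragment: straight, braking zone, corner exit (made-up values)
lap = pd.DataFrame({
    "Speed":    [300, 310, 200, 120, 150, 220],
    "Throttle": [100, 100, 0, 0, 60, 100],
    "Brake":    [False, False, True, True, False, False],
    "nGear":    [8, 8, 5, 3, 4, 6],
})

features = sketch_basic_features(lap)
print(features)
```

Returning a flat dictionary of scalars per lap is what makes the later per-track averaging straightforward.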


We'll process each session to extract driver DNA features and categorize them by:
- Track type (high-speed, technical, street)
- Weather condition (dry, wet)
- Season (2022, 2023)

This allows us to analyze how driving styles vary across different conditions.


In [None]:
# 3. COLLECT AND CATEGORIZE DATA
print("\nCollecting categorized driver DNA data...")
driver_data = {}
driver_data_by_track_type = {'high_speed': {}, 'technical': {}, 'street': {}}
driver_data_by_weather = {'dry': {}, 'wet': {}}

# Process each session
for (year, track), session_info in discovered_sessions.items():
    session = session_info['session']
    weather = session_info['weather']
    
    # Get track type
    track_type = track_types.get(track, 'technical')  # Default to technical if unknown
    
    print(f"\nProcessing {year} {track} ({track_type}, {weather})")
    
    # Try to get circuit info
    circuit_info = None
    try:
        circuit_info = session.get_circuit_info()
    except Exception:
        # Circuit info is optional; continue without it if the lookup fails
        pass
    
    # Process qualified drivers in this session
    for driver in qualified_drivers:
        if driver in session_info['drivers']:
            try:
                # pick_driver() is deprecated in FastF1 3.x in favor of pick_drivers()
                driver_laps = session.laps.pick_drivers(driver)
                
                if len(driver_laps) > 0:
                    # Get fastest lap
                    fastest_lap = driver_laps.pick_fastest()
                    
                    # Extract features
                    features = extract_driver_dna(driver, fastest_lap, circuit_info)
                    
                    # Store with track and year info
                    driver_key = f"{driver}_{year}"
                    
                    # Overall storage
                    if driver_key not in driver_data:
                        driver_data[driver_key] = {}
                    driver_data[driver_key][track] = features
                    
                    # Track type storage
                    if driver_key not in driver_data_by_track_type[track_type]:
                        driver_data_by_track_type[track_type][driver_key] = {}
                    driver_data_by_track_type[track_type][driver_key][track] = features
                    
                    # Weather storage
                    if driver_key not in driver_data_by_weather[weather]:
                        driver_data_by_weather[weather][driver_key] = {}
                    driver_data_by_weather[weather][driver_key][track] = features
                    
                    print(f"  {driver}: Features extracted")
                else:
                    print(f"  {driver}: No laps found")
            
            except Exception as e:
                print(f"  {driver}: Error - {e}")


Collecting categorized driver DNA data...

Processing 2022 Bahrain (technical, dry)
  LEC: Features extracted
  VER: Features extracted
  SAI: Features extracted
  PER: Features extracted
  HAM: Features extracted
  BOT: Features extracted
  MAG: Features extracted
  ALO: Features extracted
  RUS: Features extracted
  GAS: Features extracted
  OCO: Features extracted
  MSC: Features extracted
  NOR: Features extracted
  ALB: Features extracted
  ZHO: Features extracted
  TSU: Features extracted
  HUL: Features extracted
  RIC: Features extracted
  STR: Features extracted
  LAT: Features extracted

Processing 2022 Saudi Arabia (technical, dry)
  LEC: Features extracted
  VER: Features extracted
  SAI: Features extracted
  PER: Features extracted
  HAM: Features extracted
  BOT: Features extracted
  MAG: Features extracted
  ALO: Features extracted
  RUS: Features extracted
  GAS: Features extracted
  OCO: Features extracted
  MSC: Features extracted
  NOR: Features extracted
  ALB: Fea

## Feature Aggregation

To identify consistent patterns, we aggregate features across multiple tracks within each category. This creates a more robust representation of a driver's style by reducing the impact of one-off performances or outliers.

In [None]:
# 4. AGGREGATE FEATURES FOR EACH CATEGORY
print("\nAggregating features across categories...")

# Function to aggregate features from track data
def aggregate_driver_features(driver_track_data, min_tracks=2):
    """Average each feature across tracks; return None if the data is too sparse."""
    # Only include if we have data from multiple tracks
    if len(driver_track_data) >= min_tracks:
        feature_values = {}
        
        for track, features in driver_track_data.items():
            for feature, value in features.items():
                if feature != 'track_type' and pd.notna(value):
                    if feature not in feature_values:
                        feature_values[feature] = []
                    feature_values[feature].append(value)
        
        # Calculate averages for each feature
        driver_features = {
            feature: np.mean(values) 
            for feature, values in feature_values.items() 
            if len(values) >= min_tracks-1  # Allow some missing
        }
        
        required_features = ['avg_brake_intensity', 'avg_throttle_intensity', 'speed_variability']
        if all(feature in driver_features for feature in required_features):
            return driver_features
    
    return None

# Create aggregated features for all categories
aggregated_overall = {}
aggregated_by_track_type = {track_type: {} for track_type in driver_data_by_track_type}
aggregated_by_weather = {weather: {} for weather in driver_data_by_weather}

# Overall aggregation
for driver_key, track_data in driver_data.items():
    features = aggregate_driver_features(track_data, min_tracks=3)
    if features:
        aggregated_overall[driver_key] = features

# Track type aggregation
for track_type, drivers in driver_data_by_track_type.items():
    for driver_key, track_data in drivers.items():
        features = aggregate_driver_features(track_data, min_tracks=2)  # Lower threshold for specific types
        if features:
            aggregated_by_track_type[track_type][driver_key] = features

# Weather aggregation
for weather, drivers in driver_data_by_weather.items():
    for driver_key, track_data in drivers.items():
        features = aggregate_driver_features(track_data, min_tracks=2)
        if features:
            aggregated_by_weather[weather][driver_key] = features

# Print summary of data availability
print("\nData availability summary:")
print(f"Overall driver-year combinations: {len(aggregated_overall)}")
for track_type in aggregated_by_track_type:
    print(f"Driver-year combinations on {track_type} tracks: {len(aggregated_by_track_type[track_type])}")
for weather in aggregated_by_weather:
    print(f"Driver-year combinations in {weather} conditions: {len(aggregated_by_weather[weather])}")


Aggregating features across categories...

Data availability summary:
Overall driver-year combinations: 42
Driver-year combinations on high_speed tracks: 19
Driver-year combinations on technical tracks: 43
Driver-year combinations on street tracks: 40
Driver-year combinations in dry conditions: 43
Driver-year combinations in wet conditions: 40
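
To make the averaging step concrete, here is a small sketch with made-up numbers. It uses a pandas DataFrame rather than the explicit loop above, and `min_count` is a hypothetical stand-in for the `min_tracks - 1` tolerance; the values are purely illustrative.

```python
import numpy as np
import pandas as pd

# Made-up per-track features for one driver-year (illustrative values only)
track_data = {
    "Monza":       {"avg_brake_intensity": 0.20, "speed_variability": 0.30},
    "Spa":         {"avg_brake_intensity": 0.24, "speed_variability": np.nan},
    "Silverstone": {"avg_brake_intensity": 0.22, "speed_variability": 0.34},
}

# Rows are tracks, columns are features
frame = pd.DataFrame.from_dict(track_data, orient="index")

min_count = 2  # hypothetical: minimum number of non-missing values per feature
counts = frame.notna().sum()      # non-missing observations per feature
means = frame.mean(skipna=True)   # missing values are ignored in the average
aggregated = means[counts >= min_count].to_dict()
print(aggregated)
```

A feature with a missing value on one track is still averaged over the remaining tracks, which matches the "allow some missing" intent of the function above.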


## Style Classification Methodology

To identify distinct driving styles, we'll use:
1. **Standardization**: Scale features to comparable ranges
2. **Hierarchical Clustering**: Group drivers with similar approaches
3. **Style Naming**: Create descriptive names based on distinctive characteristics

Our improved naming system aims to give each style a unique, descriptive name that highlights its defining traits. When two clusters would share the same base name, a distinguishing trait is appended.
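
The first two steps can be sketched on toy data as follows. The feature values are made up, `n_clusters=2` and Ward linkage are arbitrary choices for the illustration, and the notebook's own clustering settings may differ. The per-cluster means of the standardized features are presumably what a naming step would inspect, which is why thresholds such as 0.3 or 1.5 read naturally as z-scores.

```python
import pandas as pd
from sklearn.cluster import AgglomerativeClustering
from sklearn.preprocessing import StandardScaler

# Made-up aggregated features for six driver-years (illustrative values only)
features = pd.DataFrame(
    {
        "avg_brake_intensity":    [0.20, 0.21, 0.19, 0.30, 0.31, 0.29],
        "avg_throttle_intensity": [0.70, 0.71, 0.69, 0.60, 0.61, 0.59],
        "speed_variability":      [0.30, 0.31, 0.29, 0.40, 0.41, 0.39],
    },
    index=["A_2022", "B_2022", "C_2022", "D_2022", "E_2022", "F_2022"],
)

# 1. Standardize: each feature gets zero mean and unit variance
scaled = pd.DataFrame(
    StandardScaler().fit_transform(features),
    index=features.index,
    columns=features.columns,
)

# 2. Hierarchical (agglomerative) clustering on the standardized features
labels = AgglomerativeClustering(n_clusters=2, linkage="ward").fit_predict(scaled)

# 3. Per-cluster means of the standardized features
cluster_means = scaled.groupby(labels).mean()
print(cluster_means.round(2))
```

Standardizing first matters because features on larger numeric scales would otherwise dominate the distance computation.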

In [None]:
# 5. CLUSTERING FUNCTION FOR EACH CATEGORY
# Improved naming function with more distinctive style names
def get_improved_style_name(cluster_id, features, all_cluster_means):
    """Generate more specific style names to avoid duplicates"""
    # Get significant features
    significant_features = {}
    for feature, value in features.items():
        if abs(value) > 0.3:
            significant_features[feature] = value
    
    # Sort features by importance
    sorted_features = {k: v for k, v in sorted(
        significant_features.items(), key=lambda item: abs(item[1]), reverse=True)}
    
    # Check for extreme values (helps distinguish similar styles)
    extreme_features = {k: v for k, v in sorted_features.items() if abs(v) > 1.5}
    
    # Base style type - similar to before
    if 'avg_throttle_intensity' in sorted_features and sorted_features['avg_throttle_intensity'] > 0:
        if 'short_shift_ratio' in sorted_features and sorted_features['short_shift_ratio'] > 0:
            base_type = "Technical Aggressor"
        else:
            base_type = "Raw Power Specialist"
    elif 'entry_exit_bias' in sorted_features:
        if sorted_features['entry_exit_bias'] > 0:
            base_type = "Entry Speed Specialist"
        else:
            base_type = "Exit Acceleration Expert"
    elif 'speed_variability' in sorted_features and sorted_features['speed_variability'] > 0:
        base_type = "Adaptive Speed Tactician"
    elif 'avg_brake_intensity' in sorted_features:
        if sorted_features['avg_brake_intensity'] > 0:
            base_type = "Late Braking Master"
        else:
            base_type = "Progressive Braking Technician"
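    elif sorted_features:
        top_feature = list(sorted_features.keys())[0]
        base_type = f"{top_feature.replace('_', ' ').title()} Specialist"
    else:
        # No feature exceeds the significance threshold; avoid an IndexError
        # on the empty list. With no modifiers this yields "Balanced Driver".
        base_type = "Driver"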
    
    # Add modifiers based on secondary traits
    modifiers = []
    
    # Add intensity modifiers for extreme values
    if extreme_features:
        extreme_trait = list(extreme_features.keys())[0]
        extreme_val = extreme_features[extreme_trait]
        if abs(extreme_val) > 2:
            intensity = "Extreme"
        else:
            intensity = "Strong"
        modifiers.append(intensity)
    
    # Add specific trait modifiers
    if 'path_smoothness' in sorted_features:
        if sorted_features['path_smoothness'] > 0.7:
            modifiers.append("Smooth Line")
        elif sorted_features['path_smoothness'] < -0.7:
            modifiers.append("Aggressive Line")
    
    if 'gear_changes' in sorted_features and abs(sorted_features['gear_changes']) > 0.8:
        if sorted_features['gear_changes'] > 0:
            modifiers.append("High Gear Activity")
        else:
            modifiers.append("Minimal Shifting")
    
    # Put it all together
    if modifiers:
        style_name = f"{' '.join(modifiers)} {base_type}"
    else:
        style_name = f"Balanced {base_type}"
    
    # Add unique identifier if needed to prevent duplicates
    seen_names = []
    for other_id, other_features in all_cluster_means.iterrows():
        # Compare by index label rather than position; don't compare with self
        if other_id != cluster_id:
            other_significant = {k: v for k, v in other_features.items() if abs(v) > 0.3}
            other_sorted = {k: v for k, v in sorted(
                other_significant.items(), key=lambda item: abs(item[1]), reverse=True)}
            
            # Generate other name using same base logic
            if 'avg_throttle_intensity' in other_sorted and other_sorted['avg_throttle_intensity'] > 0:
                if 'short_shift_ratio' in other_sorted and other_sorted['short_shift_ratio'] > 0:
                    other_base = "Technical Aggressor"
                else:
                    other_base = "Raw Power Specialist"
            elif 'entry_exit_bias' in other_sorted:
                if other_sorted['entry_exit_bias'] > 0:
                    other_base = "Entry Speed Specialist"
                else:
                    other_base = "Exit Acceleration Expert"
            elif 'speed_variability' in other_sorted and other_sorted['speed_variability'] > 0:
                other_base = "Adaptive Speed Tactician"
            elif 'avg_brake_intensity' in other_sorted:
                if other_sorted['avg_brake_intensity'] > 0:
                    other_base = "Late Braking Master"
                else:
                    other_base = "Progressive Braking Technician"
            else:
                if other_sorted:
                    other_top = list(other_sorted.keys())[0]
                    other_base = f"{other_top.replace('_', ' ').title()} Specialist"
                else:
                    other_base = "Balanced Driver"
            
            # Record base type to check for duplicates
            seen_names.append(other_base)
    
    # If our base type would be a duplicate, qualify it with the first entry
    # of sorted_features (this cluster's top-ranked feature)
    if base_type in seen_names and sorted_features:
        feature, value = next(iter(sorted_features.items()))
        feature_name = feature.replace('_', ' ')
        if value > 0:
            qualifier = "Exceptional" if value > 1 else "Strong"
        else:
            qualifier = "Minimal" if abs(value) > 1 else "Reduced"
        style_name = f"{base_type} with {qualifier} {feature_name}"
    
    return style_name

## Running the Analysis

With our methodology defined, we'll now run the analysis across different categories:
- Overall driving style (across all tracks)
- Track-specific styles (high-speed, technical, street)
- Weather-specific styles (dry, wet)

Each analysis will identify distinct clusters of driving approaches and assign descriptive style names.
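
As a rough illustration of how a cluster's standardized feature means become a one-line description, the sketch below mirrors the thresholding used in this notebook: features whose mean z-score exceeds 0.3 in absolute value count as significant, and the two strongest are reported. The helper name `describe_cluster` and the sample values are hypothetical, chosen only for illustration.

```python
def describe_cluster(means, threshold=0.3):
    """Build a one-line description from standardized cluster means.

    `means` maps feature name -> mean z-score for one cluster.
    """
    # Keep only features that deviate noticeably from the field average
    significant = {k: v for k, v in means.items() if abs(v) > threshold}
    # Strongest deviations first
    ranked = sorted(significant.items(), key=lambda item: abs(item[1]), reverse=True)

    if len(ranked) < 2:
        return "Balanced approach across multiple factors"

    parts = []
    for feature, value in ranked[:2]:
        direction = "high" if value > 0 else "low"
        parts.append(f"{direction} {feature.replace('_', ' ')}")
    return f"Characterized by {parts[0]} and {parts[1]}"


# Illustrative z-scores (hypothetical, not read from the real dataset)
example_means = {
    "path_smoothness": -0.72,
    "avg_throttle_intensity": 0.59,
    "speed_variability": -0.67,
    "gear_changes": 0.10,
}
print(describe_cluster(example_means))
```

Because the features are standardized before clustering, a mean of 0 corresponds to the average of all driver-years in the category, so "high" and "low" are always relative to that group rather than absolute.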

In [9]:
# Update the analyze_driver_styles function to use the improved naming
def analyze_driver_styles_updated(feature_dict, category_name):
    """Analyze driver styles with improved naming"""
    if len(feature_dict) < 5:
        print(f"\nInsufficient data for {category_name} (found {len(feature_dict)} driver-years)")
        return None
    
    print(f"\n=== DRIVER STYLES FOR {category_name.upper()} ===")
    
    # Create DataFrame and preprocessing (same as before)
    feature_df = pd.DataFrame.from_dict(feature_dict, orient='index')
    feature_df = feature_df.fillna(feature_df.mean())
    feature_df['driver'] = feature_df.index.str.split('_').str[0]
    feature_df['year'] = feature_df.index.str.split('_').str[1]
    
    feature_cols = [col for col in feature_df.columns if col not in ['driver', 'year']]
    scaler = StandardScaler()
    feature_df[feature_cols] = scaler.fit_transform(feature_df[feature_cols])
    
    # Apply clustering (same as before)
    # At least 2 clusters (a single cluster carries no style information), at most 4
    n_clusters = max(2, min(4, len(feature_df) // 3))
    hierarchical = AgglomerativeClustering(n_clusters=n_clusters, linkage='ward')
    feature_df['cluster'] = hierarchical.fit_predict(feature_df[feature_cols].values)
    
    # Get cluster characteristics
    cluster_analysis = feature_df.groupby('cluster')[feature_cols].mean()
    
    # Use improved naming logic
    style_names = {}
    style_descriptions = {}
    
    for cluster_id, means in cluster_analysis.iterrows():
        style_name = get_improved_style_name(cluster_id, means, cluster_analysis)
        
        # Generate description based on key features
        significant_features = {k: v for k, v in means.items() if abs(v) > 0.3}
        sorted_sig = {k: v for k, v in sorted(
            significant_features.items(), key=lambda item: abs(item[1]), reverse=True)}
        
        if len(sorted_sig) >= 2:
            top_features = list(sorted_sig.keys())[:2]
            feature1 = top_features[0].replace('_', ' ')
            feature2 = top_features[1].replace('_', ' ')
            dir1 = "high" if sorted_sig[top_features[0]] > 0 else "low"
            dir2 = "high" if sorted_sig[top_features[1]] > 0 else "low"
            desc = f"Characterized by {dir1} {feature1} and {dir2} {feature2}"
        else:
            desc = "Balanced approach across multiple factors"
        
        style_names[cluster_id] = style_name
        style_descriptions[cluster_id] = desc
    
    # Print cluster analysis (same as before)
    for cluster_id, name in style_names.items():
        drivers_in_cluster = feature_df[feature_df['cluster'] == cluster_id]
        
        print(f"\nStyle {cluster_id}: {name}")
        print(f"Description: {style_descriptions[cluster_id]}")
        print(f"Drivers in this cluster: {len(drivers_in_cluster)}")
        
        # Print top features
        print("Key characteristics:")
        for feature, value in cluster_analysis.loc[cluster_id].items():
            if abs(value) > 0.3:
                direction = "High" if value > 0 else "Low"
                print(f"- {direction} {feature.replace('_', ' ')}: {value:.2f}")
        
        # List drivers, grouped by year
        print("\nDrivers by year:")
        for year in sorted(drivers_in_cluster['year'].unique()):
            year_drivers = drivers_in_cluster[drivers_in_cluster['year'] == year]['driver'].tolist()
            print(f"  {year}: {', '.join(year_drivers)}")
    
    return {
        'feature_df': feature_df,
        'style_names': style_names,
        'style_descriptions': style_descriptions,
        'cluster_analysis': cluster_analysis
    }

In [None]:
# 6. RUN ANALYSIS FOR EACH CATEGORY
overall_results = analyze_driver_styles_updated(aggregated_overall, "overall")

track_type_results = {}
for track_type in aggregated_by_track_type:
    track_type_results[track_type] = analyze_driver_styles_updated(
        aggregated_by_track_type[track_type], f"{track_type} tracks")

weather_results = {}
for weather in aggregated_by_weather:
    weather_results[weather] = analyze_driver_styles_updated(
        aggregated_by_weather[weather], f"{weather} conditions")


=== DRIVER STYLES FOR OVERALL ===

Style 0: Entry Speed Specialist with Exceptional path smoothness
Description: Characterized by high path smoothness and low avg corner speed reduction
Drivers in this cluster: 8
Key characteristics:
- High path smoothness: 1.29
- High avg brake intensity: 0.47
- High speed variability: 0.87
- High gear changes: 0.36
- High entry exit bias: 1.01
- Low avg corner speed reduction: -1.10

Drivers by year:
  2022: PER, GAS, MSC, ALB, ZHO, RIC
  2023: DEV, RIC

Style 1: Aggressive Line Raw Power Specialist
Description: Characterized by low path smoothness and low speed variability
Drivers in this cluster: 17
Key characteristics:
- Low path smoothness: -0.72
- High avg throttle intensity: 0.59
- Low speed variability: -0.67
- Low entry exit bias: -0.30

Drivers by year:
  2022: VER, HAM, ALO, RUS, OCO, NOR
  2023: VER, SAI, PER, BOT, ALO, RUS, OCO, NOR, TSU, PIA, LAW

Style 2: Balanced Exit Acceleration Expert
Description: Characterized by high avg corner s

## Style Versatility Analysis

Some drivers maintain a consistent approach across different tracks, while others adapt significantly. Here we analyze how much drivers change their style across track types, quantifying their adaptability.
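
The adaptability score used in this section can be sketched as a small standalone function. The name `adaptability_score` and the style labels are hypothetical; the formula is the one applied in the analysis cell, `(unique_styles - 1) / (n_track_types - 1)`, expressed as a percentage.

```python
def adaptability_score(styles_by_track_type):
    """Return a 0-100 score from a {track_type: style_name} mapping.

    0 means the same style on every track type; 100 means every
    track type has its own style.
    """
    n_types = len(styles_by_track_type)
    if n_types <= 1:
        return 0.0  # nothing to compare against
    unique_styles = len(set(styles_by_track_type.values()))
    return (unique_styles - 1) / (n_types - 1) * 100


# Hypothetical style labels for one driver-season
example = {"high_speed": "Style A", "technical": "Style B", "street": "Style A"}
print(adaptability_score(example))  # 50.0
```

With three track types the score can only take the values 0, 50, or 100, so it is a coarse indicator. Since each track type is clustered and named independently, differing names may also reflect differences in the naming rather than in the driving.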


In [None]:
# 7. STYLE VERSATILITY ANALYSIS
print("\n=== DRIVER STYLE VERSATILITY ANALYSIS ===")

# Get drivers with data across all track types
common_drivers = set()
for driver_key in aggregated_overall:
    driver = driver_key.split('_')[0]
    year = driver_key.split('_')[1]
    
    # Check if driver has data for all track types in this year
    has_all_types = True
    for track_type in aggregated_by_track_type:
        if f"{driver}_{year}" not in aggregated_by_track_type[track_type]:
            has_all_types = False
            break
    
    if has_all_types:
        common_drivers.add(driver_key)

print(f"Found {len(common_drivers)} driver-year combinations with data across all track types")

# For each driver, check if style changes across track types
if common_drivers and all(result is not None for track_type, result in track_type_results.items()):
    style_changes = {}
    
    for driver_key in common_drivers:
        driver = driver_key.split('_')[0]
        year = driver_key.split('_')[1]
        
        styles = {}
        for track_type, result in track_type_results.items():
            cluster = result['feature_df'].loc[driver_key, 'cluster']
            style = result['style_names'][cluster]
            styles[track_type] = style
        
        # Count unique styles
        unique_styles = len(set(styles.values()))
        style_changes[driver_key] = {
            'unique_styles': unique_styles,
            'styles': styles
        }
    
    # Calculate adaptability score: 0% if one style is used on every track type,
    # 100% if each track type has its own style
    for driver_key, data in style_changes.items():
        adaptability = (data['unique_styles'] - 1) / (len(track_type_results) - 1) if len(track_type_results) > 1 else 0
        style_changes[driver_key]['adaptability'] = adaptability * 100
    
    # Print adaptability analysis
    print("\nDriver style adaptability across track types:")
    for driver_key, data in sorted(style_changes.items(), key=lambda x: x[1]['adaptability'], reverse=True):
        driver = driver_key.split('_')[0]
        year = driver_key.split('_')[1]
        
        print(f"\n{driver} ({year}) - Adaptability Score: {data['adaptability']:.1f}%")
        print("Styles by track type:")
        for track_type, style in data['styles'].items():
            print(f"  {track_type}: {style}")


=== DRIVER STYLE VERSATILITY ANALYSIS ===
Found 19 driver-year combinations with data across all track types

Driver style adaptability across track types:

VER (2023) - Adaptability Score: 100.0%
Styles by track type:
  high_speed: Exit Acceleration Expert with Minimal avg throttle intensity
  technical: Raw Power Specialist with Strong avg throttle intensity
  street: Aggressive Line Raw Power Specialist

HUL (2023) - Adaptability Score: 100.0%
Styles by track type:
  high_speed: Exit Acceleration Expert with Minimal avg throttle intensity
  technical: Raw Power Specialist with Exceptional gear changes
  street: Aggressive Line Raw Power Specialist

ALB (2023) - Adaptability Score: 100.0%
Styles by track type:
  high_speed: Balanced Technical Aggressor
  technical: Balanced Exit Acceleration Expert
  street: Aggressive Line Raw Power Specialist

ALO (2023) - Adaptability Score: 100.0%
Styles by track type:
  high_speed: Exit Acceleration Expert with Minimal avg throttle intensity
  

## Year-to-Year Consistency Analysis

For drivers with data across multiple seasons, we analyze how consistent their driving style remains from year to year. This helps identify if drivers evolve their approach or maintain a signature style.
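
A minimal sketch of the consistency calculation, assuming the style assignments have already been flattened into a `{'DRIVER_YEAR': style}` mapping. The function name, driver codes, and style labels below are hypothetical.

```python
from collections import defaultdict


def consistency_by_driver(style_by_key):
    """Map driver -> consistency in [0, 1] from {'DRIVER_YEAR': style} entries."""
    # Group each driver's styles by season
    styles_per_driver = defaultdict(dict)
    for key, style in style_by_key.items():
        driver, year = key.split("_")
        styles_per_driver[driver][year] = style

    scores = {}
    for driver, by_year in styles_per_driver.items():
        if len(by_year) < 2:
            continue  # a single season says nothing about consistency
        unique = len(set(by_year.values()))
        # 1.0 if every season has the same style, 0.0 if every season differs
        scores[driver] = 1 - (unique - 1) / (len(by_year) - 1)
    return scores


# Hypothetical assignments for three drivers
toy = {
    "AAA_2022": "Style A", "AAA_2023": "Style A",
    "BBB_2022": "Style A", "BBB_2023": "Style B",
    "CCC_2023": "Style B",
}
print(consistency_by_driver(toy))  # {'AAA': 1.0, 'BBB': 0.0}
```

With only two seasons of data the score is binary: a driver is either 100% or 0% consistent, which is why no intermediate values appear in the output for 2022-2023.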

In [None]:
# 8. STYLE CONSISTENCY ANALYSIS
print("\n=== DRIVER STYLE CONSISTENCY ACROSS YEARS ===")

# Group by driver
driver_years = {}
for driver_key in aggregated_overall:
    driver = driver_key.split('_')[0]
    if driver not in driver_years:
        driver_years[driver] = []
    driver_years[driver].append(driver_key)

# Only analyze drivers with multiple years of data
multi_year_drivers = {driver: years for driver, years in driver_years.items() if len(years) > 1}

# For each driver, check consistency across years
if overall_results is not None:
    for driver, driver_keys in multi_year_drivers.items():
        driver_styles = {}
        
        for driver_key in driver_keys:
            year = driver_key.split('_')[1]
            cluster = overall_results['feature_df'].loc[driver_key, 'cluster']
            style = overall_results['style_names'][cluster]
            driver_styles[year] = style
        
        # Calculate consistency (1 if all styles are the same, 0 if all different)
        unique_styles = len(set(driver_styles.values()))
        consistency = 1 - ((unique_styles - 1) / (len(driver_styles) - 1)) if len(driver_styles) > 1 else 1
        
        print(f"\n{driver} - Style Consistency: {consistency*100:.1f}%")
        print("Styles by year:")
        for year, style in sorted(driver_styles.items()):
            print(f"  {year}: {style}")


=== DRIVER STYLE CONSISTENCY ACROSS YEARS ===

LEC - Style Consistency: 100.0%
Styles by year:
  2022: Entry Speed Specialist with Minimal avg brake intensity
  2023: Entry Speed Specialist with Minimal avg brake intensity

VER - Style Consistency: 100.0%
Styles by year:
  2022: Aggressive Line Raw Power Specialist
  2023: Aggressive Line Raw Power Specialist

SAI - Style Consistency: 0.0%
Styles by year:
  2022: Entry Speed Specialist with Minimal avg brake intensity
  2023: Aggressive Line Raw Power Specialist

PER - Style Consistency: 0.0%
Styles by year:
  2022: Entry Speed Specialist with Exceptional path smoothness
  2023: Aggressive Line Raw Power Specialist

HAM - Style Consistency: 0.0%
Styles by year:
  2022: Aggressive Line Raw Power Specialist
  2023: Entry Speed Specialist with Minimal avg brake intensity

BOT - Style Consistency: 0.0%
Styles by year:
  2022: Balanced Exit Acceleration Expert
  2023: Aggressive Line Raw Power Specialist

MAG - Style Consistency: 100.0%
St

## Track Type Specialists

Some drivers show distinctive approaches on specific track types that differ from their overall style. This analysis identifies these specialists and their unique adaptations.
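
The core comparison can be sketched without pandas, assuming the style names are available as plain dictionaries. The helper `find_style_shifts` and the labels are hypothetical, and the sketch leaves out the notebook's extra filter that only considers clusters with at least one feature mean above 0.7.

```python
def find_style_shifts(overall_styles, track_styles):
    """Return (driver_key, track_type, track_style, overall_style) tuples
    for every case where the track-specific style differs from the overall one.

    overall_styles: {'DRIVER_YEAR': style}
    track_styles:   {track_type: {'DRIVER_YEAR': style}}
    """
    shifts = []
    for track_type, styles in track_styles.items():
        for key, style in styles.items():
            overall = overall_styles.get(key)
            # Skip drivers without an overall style; keep only real differences
            if overall is not None and style != overall:
                shifts.append((key, track_type, style, overall))
    return shifts


# Hypothetical style assignments
overall = {"AAA_2023": "Style A", "BBB_2023": "Style B"}
by_track = {
    "street": {"AAA_2023": "Style A", "BBB_2023": "Style C"},
    "technical": {"AAA_2023": "Style B", "CCC_2023": "Style A"},
}
for shift in find_style_shifts(overall, by_track):
    print(shift)
```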

In [None]:
# 9. TRACK SPECIALISTS ANALYSIS
print("\n=== TRACK TYPE SPECIALISTS ===")

# For each track type, identify drivers whose style on that type differs from their overall style
specialists = {}

for track_type, result in track_type_results.items():
    if result is None or overall_results is None:
        continue
        
    # For each style cluster in this track type
    for cluster, features in result['cluster_analysis'].iterrows():
        # Get the defining positive features
        positive_features = {f: v for f, v in features.items() if v > 0.7}
        
        # If we have strong positive features
        if positive_features:
            style = result['style_names'][cluster]
            
            # Find drivers in this cluster
            drivers_in_cluster = result['feature_df'][result['feature_df']['cluster'] == cluster]
            
            for idx, row in drivers_in_cluster.iterrows():
                driver = row['driver']
                year = row['year']
                
                # Check if this driver has the same style overall
                overall_key = f"{driver}_{year}"
                if overall_key in overall_results['feature_df'].index:
                    overall_cluster = overall_results['feature_df'].loc[overall_key, 'cluster']
                    overall_style = overall_results['style_names'][overall_cluster]
                    
                    # If the track-specific style is different from overall
                    if style != overall_style:
                        if driver not in specialists:
                            specialists[driver] = []
                        specialists[driver].append({
                            'year': year,
                            'track_type': track_type,
                            'specialized_style': style,
                            'overall_style': overall_style,
                            'key_features': positive_features
                        })

# Print track specialists
if specialists:
    for driver, specialties in specialists.items():
        if specialties:
            print(f"\n{driver} shows specialized styles on certain track types:")
            for specialty in specialties:
                print(f"  {specialty['year']} - {specialty['track_type']}: {specialty['specialized_style']}")
                print(f"    (Overall style: {specialty['overall_style']})")
                print(f"    Key specialized features: {', '.join(specialty['key_features'].keys())}")
else:
    print("No clear track type specialists identified.")


=== TRACK TYPE SPECIALISTS ===

BOT shows specialized styles on certain track types:
  2023 - high_speed: Balanced Technical Aggressor
    (Overall style: Aggressive Line Raw Power Specialist)
    Key specialized features: speed_variability, entry_exit_bias

MAG shows specialized styles on certain track types:
  2023 - high_speed: Balanced Technical Aggressor
    (Overall style: Balanced Exit Acceleration Expert)
    Key specialized features: speed_variability, entry_exit_bias
  2022 - street: Entry Speed Specialist with Exceptional path smoothness
    (Overall style: Balanced Exit Acceleration Expert)
    Key specialized features: path_smoothness, speed_variability, entry_exit_bias

OCO shows specialized styles on certain track types:
  2023 - high_speed: Balanced Technical Aggressor
    (Overall style: Aggressive Line Raw Power Specialist)
    Key specialized features: speed_variability, entry_exit_bias
  2022 - street: Balanced Exit Acceleration Expert
    (Overall style: Aggressiv

## Saving Analysis Results

The complete analysis results are saved for use in our interactive dashboard, allowing for visualization and further exploration.
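
For reference, a saved results file can be read back with `pickle.load`. The sketch below round-trips a small stand-in dictionary through a temporary file; the helper names and the toy payload are hypothetical, and the dashboard's actual loading code is not shown in this notebook.

```python
import os
import pickle
import tempfile


def save_results(results, path):
    """Write a results dictionary to `path` with pickle."""
    with open(path, "wb") as f:
        pickle.dump(results, f)


def load_results(path):
    """Read a results dictionary back from `path`."""
    # Only unpickle files you created yourself: pickle can execute arbitrary code
    with open(path, "rb") as f:
        return pickle.load(f)


# Toy stand-in for the real analysis output (hypothetical content)
toy_results = {
    "overall_results": {"style_names": {0: "Style A"}},
    "weather_results": {},
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_analysis.pkl")
    save_results(toy_results, path)
    restored = load_results(path)

print(restored == toy_results)  # True
```

The real file stores pandas DataFrames, so pandas must be importable in whatever environment loads it.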

In [14]:
# Save analysis results
with open('driver_dna_analysis.pkl', 'wb') as f:
    pickle.dump({
        'overall_results': overall_results,
        'track_type_results': track_type_results,
        'weather_results': weather_results,
        'aggregated_overall': aggregated_overall
    }, f)