## All Qualifying Session Analysis

Using the data gathered in the previous file, I will analyze every qualifying session and driver in the 2021 season.

The data contains the telemetry from each driver's fastest lap in each qualifying session of the 2021 season.

In [2]:
#Setting up - import packages
import pandas as pd
import numpy as np
import fastf1 as ff1
#Enable the cache
ff1.Cache.enable_cache('C:/Users/jackh/OneDrive/Documents/Python Scripts/f1_cache') 

In [3]:
#Load existing data - saved as .csv file
raw = pd.read_csv('C:/Users/jackh/OneDrive/Documents/Python Scripts/2021_f1_qualifying_laps.csv')

In [4]:
raw.head()

| | Unnamed: 0 | Date | SessionTime | DriverAhead | DistanceToDriverAhead | Time | RPM | Speed | nGear | Throttle | ... | Source | RelativeDistance | Status | X | Y | Z | Distance | Driver | GP | Location |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 2021-03-27 15:59:07.466 | 0 days 01:14:07.081000 | | 715.495556 | 0 days 00:00:00 | 10531 | 295 | 8 | 100 | ... | interpolation | 1.6e-05 | OnTrack | -384 | 1200 | -159 | 0.0 | VER | 1 | Bahrain |
| 1 | 1 | 2021-03-27 15:59:07.603 | 0 days 01:14:07.218000 | | 715.495556 | 0 days 00:00:00.137000 | 10537 | 296 | 8 | 100 | ... | pos | 0.002104 | OnTrack | -378 | 1312 | -159 | 11.264444 | VER | 1 | Bahrain |
| 2 | 2 | 2021-03-27 15:59:07.635 | 0 days 01:14:07.250000 | | 715.495556 | 0 days 00:00:00.169000 | 10549 | 297 | 8 | 100 | ... | car | 0.002592 | OnTrack | -376 | 1338 | -159 | 13.904444 | VER | 1 | Bahrain |
| 3 | 3 | 2021-03-27 15:59:07.823 | 0 days 01:14:07.438000 | | 715.495556 | 0 days 00:00:00.357000 | 10635 | 297 | 8 | 100 | ... | pos | 0.005465 | OnTrack | -369 | 1493 | -159 | 29.414444 | VER | 1 | Bahrain |
| 4 | 4 | 2021-03-27 15:59:07.875 | 0 days 01:14:07.490000 | 44.0 | 715.495556 | 0 days 00:00:00.409000 | 10721 | 298 | 8 | 100 | ... | car | 0.006262 | OnTrack | -367 | 1536 | -158 | 33.718889 | VER | 1 | Bahrain |


### I: Breaking Down Tracks into Minisectors

I will build an intra-team comparison across the season. Before comparing drivers, however, each track must be broken into minisectors, because telemetry samples are not recorded at exactly the same track positions for each driver. Grouping each lap into minisectors lets us compare teammates at the same granularity.
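
To make the binning idea concrete, here is a minimal sketch on made-up numbers. It assumes a hypothetical 1,000 m lap split into 4 minisectors; the `toy` frame, the driver codes, and the distances are illustrative only and are not taken from the 2021 data.

```python
import pandas as pd

# Hypothetical telemetry: two drivers sampled at different distances on a 1,000 m lap
toy = pd.DataFrame({
    "Driver": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "Distance": [10.0, 260.0, 1000.0, 35.0, 240.0, 990.0],
})

n_bins = 4                          # illustrative value for this sketch
lap_length = toy["Distance"].max()  # longest distance seen on this lap

# Floor-divide distance by the bin width; the small offset keeps the
# final sample (Distance == lap_length) inside the last bin
bin_width = lap_length / n_bins + 0.01
toy["Minisector"] = (toy["Distance"] // bin_width).astype(int) + 1

print(toy)
```

Although the two drivers are never sampled at the same distance, every sample now carries a minisector label that can be compared directly.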

In [6]:
#Define number of minisectors
num_minisectors = 25.0
#Find the max distance for each gp
distance_gp = raw.groupby('GP')['Distance'].max().reset_index().rename(columns = {"Distance":"max_distance"})
#And attach to main df
df = raw.merge(distance_gp, how = 'inner', on = 'GP')

In [7]:
#Now calculate the minisector for each point in the data
df['Minisector'] = (df['Distance'] // (df['max_distance']/num_minisectors + .01) ) + 1

#And convert minisector to integer type - must do this in a second step
df['Minisector'] = df['Minisector'].astype(int)

### II: Assigning each driver an average speed per track & minisector

Because the telemetry isn't captured at exactly the same spot for each driver, a good way to break down performance is to capture average speed per minisector.
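
As a rough illustration of why averaging per minisector helps, the sketch below uses made-up samples for two hypothetical drivers (`AAA` and `BBB`). The `gap` column is an assumption added for this example and is not part of the notebook's tables.

```python
import pandas as pd

# Hypothetical samples: each driver is recorded at different points,
# but every sample already carries a minisector label
toy = pd.DataFrame({
    "Driver":     ["AAA", "AAA", "AAA", "BBB", "BBB"],
    "Minisector": [1, 1, 2, 1, 2],
    "Speed":      [300.0, 310.0, 200.0, 298.0, 206.0],
})

# One mean speed per driver per minisector, with drivers side by side
per_sector = (
    toy.groupby(["Minisector", "Driver"])["Speed"]
    .mean()
    .unstack("Driver")
)

# Positive values mean AAA was faster through that minisector
per_sector["gap"] = per_sector["AAA"] - per_sector["BBB"]

print(per_sector)
```

The number of samples per driver differs in each minisector, yet the averages can still be compared one to one.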

In [9]:
#First - rank each driver's overall average speed within each session.
#This identifies each team's faster driver, so that the speed gap
#to his teammate is always positive for the faster driver.
r = df[["Location", "Driver", "Speed"]]
r = r.groupby(['Location', 'Driver'])['Speed'].mean().reset_index()
r['rank'] = r.groupby('Location')['Speed'].rank(ascending = False)
r = r.drop('Speed', axis = 1)

In [10]:
#And merge into the dataframe
df = df.merge(r, how = 'left', on = ['Location', 'Driver'])

In [13]:
#Add team information to dataset
driver_team = pd.DataFrame({'Driver':['VER', 'PER', 'HAM', 'BOT', 'LEC', 'SAI', 'RIC', 'NOR', 'ALO', 'OCO'
                                      , 'VET', 'STR', 'GAS', 'TSU', 'RUS', 'LAT', 'RAI', 'GIO', 'MSC', 'MAZ'],
                            'Team': ['Red Bull', 'Red Bull', 'Mercedes', 'Mercedes', 'Ferrari', 'Ferrari', 'McLaren', 'McLaren', 'Alpine', 'Alpine'
                                     , 'Aston Martin', 'Aston Martin', 'Alpha Tauri', 'Alpha Tauri', 'Williams', 'Williams', 'Alfa Romeo' , 'Alfa Romeo', 'Haas', 'Haas']})

df = df.merge(driver_team, how = 'left', on = 'Driver')
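
One thing worth checking at this step: a left merge leaves `Team` empty for any driver code missing from the lookup table, and `groupby` drops rows with missing keys by default. Substitute drivers would be affected if they appear in the data (Robert Kubica, for example, stood in at Alfa Romeo for two rounds in 2021). The sketch below uses made-up rows and pandas' `indicator` option to list unmatched codes; the `laps` and `teams` frames are hypothetical stand-ins for `df` and `driver_team`.

```python
import pandas as pd

# Hypothetical stand-ins for the telemetry frame and the team lookup
laps = pd.DataFrame({"Driver": ["RAI", "GIO", "KUB"], "Speed": [210.0, 212.0, 209.0]})
teams = pd.DataFrame({"Driver": ["RAI", "GIO"], "Team": ["Alfa Romeo", "Alfa Romeo"]})

# indicator=True adds a '_merge' column describing where each row matched
merged = laps.merge(teams, how="left", on="Driver", indicator=True)

# Rows flagged 'left_only' found no match in the lookup table
unmatched = sorted(merged.loc[merged["_merge"] == "left_only", "Driver"].unique())

print(unmatched)
```

If the list is non-empty, those codes can be added to the lookup table before any grouping on `Team`.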

In [16]:
#Calculate each driver's minisector speed for each session, and arrange the results based on overall rank in the session
average_speed = (
    df.groupby(['GP', 'Location', 'Minisector', 'Team', 'Driver', 'rank'])['Speed']
    .mean()
    .reset_index()
    .sort_values(by=['GP', 'Team', 'Minisector', 'rank'], ascending=True)
)

### Assign each minisector a speed category
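Each minisector is labeled Low, Medium, or High by comparing the field-wide average speed through it against that track's 33rd and 66th percentile minisector speeds.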
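
The thresholding idea can be sketched on made-up numbers. The example below assumes a single hypothetical track with six minisectors and uses `np.select` for readability; the `sectors` frame and its speeds are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical field-average speeds for six minisectors of one track
sectors = pd.DataFrame({
    "Minisector": [1, 2, 3, 4, 5, 6],
    "Speed": [100.0, 150.0, 200.0, 250.0, 300.0, 350.0],
})

# Cutoffs at the 33rd and 66th percentiles of the minisector speeds
low_cut = sectors["Speed"].quantile(0.33)
high_cut = sectors["Speed"].quantile(0.66)

# Conditions are checked in order; anything at or above high_cut is 'High'
sectors["Sector_speed"] = np.select(
    [sectors["Speed"] < low_cut, sectors["Speed"] < high_cut],
    ["Low", "Medium"],
    default="High",
)

print(sectors)
```

Because the cutoffs are quantiles of each track's own minisector speeds, roughly a third of the minisectors land in each category regardless of how fast the circuit is overall.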

In [32]:
#Determine the cutoffs for a 'low speed' and a 'high speed' minisector for each track
sector_speed = df.groupby(['Location', 'Minisector'])['Speed'].mean().reset_index()
low_speeds = sector_speed.groupby('Location')['Speed'].quantile(q=.33).rename('low_speed_threshold').reset_index()
high_speeds = sector_speed.groupby('Location')['Speed'].quantile(q=.66).rename('high_speed_threshold').reset_index()

speed_key = low_speeds.merge(high_speeds, how = 'inner', on = 'Location')


In [35]:
#Now we can compare the thresholds to the average speeds each driver maintained in each minisector
#To do this, we'll come up with a grid-wide average speed indicator, similar to the average_speed table above
field_average_speed = (
    df.groupby(['Location', 'Minisector'])['Speed']
    .mean()
    .reset_index()
    .sort_values(by=['Location', 'Minisector'], ascending=True)
)

#And apply the speed_key object to this table
field_average_speed = field_average_speed.merge(speed_key, how = 'left', on = 'Location')


In [43]:
#Now we can assign each track's minisector a high, medium, or low value
field_average_speed['Sector_speed'] = np.where(
    field_average_speed['Speed'].lt(field_average_speed['low_speed_threshold']),
    'Low',
    np.where(
        field_average_speed['Speed'].lt(field_average_speed['high_speed_threshold']),
        'Medium',
        'High',
    ),
)
#And drop the extra columns
field_average_speed = field_average_speed.drop(columns=['Speed', 'low_speed_threshold', 'high_speed_threshold'])

In [44]:
#And join it to the main data frame
average_speed = average_speed.merge(field_average_speed, how = 'left', on = ['Location', 'Minisector'])

In [45]:
average_speed.head()

| | GP | Location | Minisector | Team | Driver | rank | Speed | Sector_speed |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Bahrain | 1 | Alfa Romeo | GIO | 13.0 | 303.173913 | High |
| 1 | 1 | Bahrain | 1 | Alfa Romeo | RAI | 14.0 | 302.916667 | High |
| 2 | 1 | Bahrain | 2 | Alfa Romeo | GIO | 13.0 | 315.045455 | High |
| 3 | 1 | Bahrain | 2 | Alfa Romeo | RAI | 14.0 | 314.714286 | High |
| 4 | 1 | Bahrain | 3 | Alfa Romeo | GIO | 13.0 | 311.227273 | High |


### III: Analysis