In [1]:
import requests
import pandas as pd

APIs:
http://ergast.com/mrd/
https://api.openf1.org/

Main routes:
1. A driver's (and/or team's) performance in the last x races determines performance in this/next race
2. Same as above, but with performance in pre-race sessions as the determinant
Note: Major regulation changes took effect in 2022, and new regulations arrive in 2026
Other ideas:
1. Car data: DRS Utilization: Analyze the effectiveness of DRS zones by correlating DRS activation with speed increments.
2. Driver Performance Metrics: Compare performance statistics across different drivers / Team Dynamics: Analyze how driver pairings within teams influence overall team performance.
3. Pace Assessment: Determine the consistency of a driver's lap times relative to others.
6. Historical Comparisons: Compare data across different meetings to identify performance trends.
7. Pit Strategy Evaluation: Assess the effectiveness of pit stop strategies and their impact on race outcomes.
9. (maybe?) Safety Assessments: Analyze the frequency and causes of race incidents
10. Preparation Analysis: Evaluate how practice session performances translate to race results.
11. Tire Management: Assess tire performance and degradation over stints. / Fuel Strategy: Analyze fuel load management and its effect on lap times.
12. Driver Feedback: Understand driver concerns and feedback during the race. <- wordcloud could be funny except it's mp3s not transcripts...
13. Performance Correlation: Analyze how weather conditions affect car and driver performance. / Strategic Planning: Assess how teams adapt strategies based on weather forecasts.
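
Main route 1 could start from a rolling "recent form" feature. A minimal sketch, assuming a hypothetical results table with columns `driver`, `round` and `position`; the data is made up for illustration, `add_recent_form` is a hypothetical helper, and the window `x=3` is arbitrary:

```python
import pandas as pd

# Made-up finishing positions, for illustration only.
results = pd.DataFrame({
    "driver": ["AAA"] * 5 + ["BBB"] * 5,
    "round": [1, 2, 3, 4, 5] * 2,
    "position": [1, 2, 1, 3, 2, 5, 4, 6, 4, 3],
})

def add_recent_form(df, x=3):
    """Add each driver's mean finishing position over the previous x races."""
    df = df.sort_values(["driver", "round"]).copy()
    df["recent_form"] = (
        df.groupby("driver")["position"]
          # shift(1) drops the current race so the feature only looks backwards
          .transform(lambda s: s.shift(1).rolling(x, min_periods=1).mean())
    )
    return df

form = add_recent_form(results, x=3)
print(form)
```

The first race of each driver has no history, so `recent_form` is NaN there; those rows would need to be dropped or filled before modelling.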

List of races for 2023 and 2024:

In [7]:
URL = 'https://api.openf1.org/v1/meetings?year=2024&csv=true'
races_2024 = pd.read_csv(URL)

In [57]:
races_2024.to_csv('../data/races_2024.csv')

In [9]:
URL = 'https://api.openf1.org/v1/meetings?year=2023&csv=true'
races_2023 = pd.read_csv(URL)

In [11]:
races = [races_2023, races_2024]
races_df = pd.concat(races).reset_index(drop=True)

In [59]:
races_2023.to_csv('../data/races_2023.csv')

In [13]:
races_df

Unnamed: 0,circuit_key,circuit_short_name,country_code,country_key,country_name,date_start,gmt_offset,location,meeting_code,meeting_key,meeting_name,meeting_official_name,year
0,63,Sakhir,BRN,36,Bahrain,2023-02-23 07:00:00+00:00,03:00:00,Sakhir,,1140,Pre-Season Testing,FORMULA 1 ARAMCO PRE-SEASON TESTING 2023,2023
1,63,Sakhir,BRN,36,Bahrain,2023-03-03 11:30:00+00:00,03:00:00,Sakhir,,1141,Bahrain Grand Prix,FORMULA 1 GULF AIR BAHRAIN GRAND PRIX 2023,2023
2,149,Jeddah,KSA,153,Saudi Arabia,2023-03-17 13:30:00+00:00,03:00:00,Jeddah,,1142,Saudi Arabian Grand Prix,FORMULA 1 STC SAUDI ARABIAN GRAND PRIX 2023,2023
3,10,Melbourne,AUS,5,Australia,2023-03-31 01:30:00+00:00,11:00:00,Melbourne,,1143,Australian Grand Prix,FORMULA 1 ROLEX AUSTRALIAN GRAND PRIX 2023,2023
4,144,Baku,AZE,30,Azerbaijan,2023-04-28 09:30:00+00:00,04:00:00,Baku,,1207,Azerbaijan Grand Prix,FORMULA 1 AZERBAIJAN GRAND PRIX 2023,2023
5,151,Miami,USA,19,United States,2023-05-05 18:00:00+00:00,-04:00:00,Miami,,1208,Miami Grand Prix,FORMULA 1 CRYPTO.COM MIAMI GRAND PRIX 2023,2023
6,22,Monte Carlo,MON,114,Monaco,2023-05-26 11:30:00+00:00,02:00:00,Monaco,,1210,Monaco Grand Prix,FORMULA 1 GRAND PRIX DE MONACO 2023,2023
7,15,Catalunya,ESP,1,Spain,2023-06-02 11:30:00+00:00,02:00:00,Barcelona,,1211,Spanish Grand Prix,FORMULA 1 AWS GRAN PREMIO DE ESPAÑA 2023,2023
8,23,Montreal,CAN,46,Canada,2023-06-16 17:30:00+00:00,-04:00:00,Montréal,,1212,Canadian Grand Prix,FORMULA 1 PIRELLI GRAND PRIX DU CANADA 2023,2023
9,19,Spielberg,AUT,17,Austria,2023-06-30 15:00:00+00:00,02:00:00,Spielberg,AUT,1213,Austrian Grand Prix,FORMULA 1 ROLEX GROSSER PREIS VON ÖSTERREICH 2023,2023


In [61]:
races_df.to_csv('../data/races_df.csv')
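
The per-year cells repeat the same fetch-and-concat pattern, so they could be folded into one helper. A sketch, assuming only the endpoints and the `year` / `csv` query parameters already used in this notebook; `openf1_url` and `load_years` are hypothetical names, and `reader` is injectable so it can be replaced by a stub when working offline:

```python
from urllib.parse import urlencode

import pandas as pd

BASE = "https://api.openf1.org/v1"

def openf1_url(endpoint, **params):
    """Build a CSV request URL for an OpenF1 endpoint such as 'meetings'."""
    params["csv"] = "true"
    return f"{BASE}/{endpoint}?{urlencode(params)}"

def load_years(endpoint, years, reader=pd.read_csv):
    """Fetch one frame per year and stack them with a fresh index."""
    frames = [reader(openf1_url(endpoint, year=y)) for y in years]
    return pd.concat(frames, ignore_index=True)

print(openf1_url("meetings", year=2024))
# Network call, left commented out here:
# races_df = load_years("meetings", [2023, 2024])
```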

List of sessions for 2023 and 2024:
(A meeting, i.e. a race weekend, is made up of sessions: 1 or 3 non-competitive 'Free Practice' sessions, qualifying, sprint qualifying and a sprint on sprint weekends, and the race)

In [33]:
URL = 'https://api.openf1.org/v1/sessions?year=2024&csv=true'
sessions_2024 = pd.read_csv(URL)

In [63]:
sessions_2024.to_csv('../data/sessions_2024.csv')

In [35]:
URL = 'https://api.openf1.org/v1/sessions?year=2023&csv=true'
sessions_2023 = pd.read_csv(URL)

In [65]:
sessions_2023.to_csv('../data/sessions_2023.csv')

In [37]:
sessions = [sessions_2023, sessions_2024]
sessions_df = pd.concat(sessions).reset_index(drop=True)

In [67]:
sessions_df.to_csv('../data/sessions_df.csv')

Extracting keys for just race sessions to use in a bit:

In [39]:
# Keep only the Grand Prix itself; sprints have a different session_name
race_sessions = sessions_df[sessions_df['session_name'] == 'Race']
race_keys = pd.array(race_sessions['session_key'])

In [41]:
race_keys

<NumpyExtensionArray>
[7953, 7779, 7787, 9070, 9078, 9094, 9102, 9110, 9118, 9126, 9133, 9141, 9149,
 9157, 9165, 9173, 9221, 9213, 9181, 9205, 9189, 9197, 9472, 9480, 9488, 9496,
 9673, 9507, 9515, 9523, 9531, 9539, 9550, 9558, 9566, 9574, 9582, 9590, 9598,
 9606, 9617, 9625, 9636, 9644, 9655, 9662]
Length: 46, dtype: int64

Compiling list of drivers for 2023 and 2024 seasons:

In [45]:
drivers = []
for x in race_keys:
    data = pd.read_csv(f'https://api.openf1.org/v1/drivers?session_key={x}&csv=true')
    drivers.append(data)
drivers_df = pd.concat(drivers, ignore_index=True)

In [69]:
drivers_df.to_csv('../data/drivers_df.csv')

In [71]:
drivers_df#['full_name'].unique()

Unnamed: 0,broadcast_name,country_code,driver_number,first_name,full_name,headshot_url,last_name,meeting_key,name_acronym,session_key,team_colour,team_name
0,M VERSTAPPEN,,1,,Max VERSTAPPEN,,,1141,VER,7953,,
1,L SARGEANT,,2,,Logan SARGEANT,,,1141,SAR,7953,,
2,L NORRIS,,4,,Lando NORRIS,,,1141,NOR,7953,,
3,P GASLY,,10,,Pierre GASLY,,,1141,GAS,7953,,
4,S PEREZ,,11,,Sergio PEREZ,,,1141,PER,7953,,
...,...,...,...,...,...,...,...,...,...,...,...,...
913,C SAINZ,ESP,55,Carlos,Carlos SAINZ,https://media.formula1.com/d_driver_fallback_i...,Sainz,1252,SAI,9662,E80020,Ferrari
914,J DOOHAN,,61,,Jack DOOHAN,,,1252,DOO,9662,,
915,G RUSSELL,GBR,63,George,George RUSSELL,https://media.formula1.com/d_driver_fallback_i...,Russell,1252,RUS,9662,27F4D2,Mercedes
916,V BOTTAS,FIN,77,Valtteri,Valtteri BOTTAS,https://media.formula1.com/d_driver_fallback_i...,Bottas,1252,BOT,9662,52E252,Kick Sauber
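
The per-session pull repeats each driver once per race, and many rows have blank metadata (see the empty `team_name` cells above). One way to reduce this to one row per driver is sketched below on a made-up frame with a few of the same column names. `unique_drivers` is a hypothetical helper, and it assumes `full_name` is spelled consistently across sessions, which should be checked on the real data:

```python
import pandas as pd

# Made-up sample with a subset of the drivers_df columns.
sample = pd.DataFrame({
    "driver_number": [1, 1, 55, 55],
    "full_name": ["Max VERSTAPPEN", "Max VERSTAPPEN", "Carlos SAINZ", "Carlos SAINZ"],
    "session_key": [7953, 9662, 7953, 9662],
    "team_name": [None, "Red Bull Racing", None, "Ferrari"],
})

def unique_drivers(df, key="full_name"):
    """Keep one row per driver, preferring the most complete, latest row."""
    filled = df.notna().sum(axis=1)  # number of non-missing fields per row
    ordered = df.assign(_filled=filled).sort_values([key, "_filled", "session_key"])
    return (
        ordered.drop_duplicates(key, keep="last")
               .drop(columns="_filled")
               .reset_index(drop=True)
    )

drivers_unique = unique_drivers(sample)
print(drivers_unique)
```

Keeping the latest row means a driver who changed teams is listed under the most recent one; a per-season split would need `key` plus a year column.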


Credit to weiranyu (GitHub) for the original code to pull from the Ergast F1 API (modified for my needs)

List of drivers for 2023 and 2024 (using Ergast)

In [101]:
year = 2024
URL = 'http://ergast.com/api/f1/{}/drivers.json?limit=1000'.format(year)
drivers_2024_json = requests.get(URL).json()
drivers_2024_er = pd.DataFrame(drivers_2024_json["MRData"]["DriverTable"]['Drivers'])

In [105]:
drivers_2024_er.to_csv('../data/drivers_2024_er.csv')

In [119]:
year = 2023
URL = 'http://ergast.com/api/f1/{}/drivers.json?limit=1000'.format(year)
drivers_2023_json = requests.get(URL).json()
drivers_2023_er = pd.DataFrame(drivers_2023_json["MRData"]["DriverTable"]['Drivers'])

List of constructors for 2024

In [113]:
year = 2024
URL = 'http://ergast.com/api/f1/{}/constructors.json?limit=1000'.format(year)
teams_2024_json = requests.get(URL).json()
teams_2024_er = pd.DataFrame(teams_2024_json["MRData"]["ConstructorTable"]['Constructors'])

In [117]:
teams_2024_er.to_csv('../data/teams_2024_er.csv')

Race results for 2024

In [None]:
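year = 2024
results_24 = []  # collects one JSON payload per round
# range(1, 2) fetches round 1 only; widen the range to pull more rounds
for x in range(1, 2):
    URL = 'http://ergast.com/api/f1/{}/{}/results.json?limit=1000'.format(year, x)
    data = requests.get(URL).json()
    results_24.append(data)
results_24
#drivers_df = pd.concat(drivers, ignore_index=True)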
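
Each payload appended to `results_24` is nested JSON rather than a table. A sketch of one way to flatten it with `pd.json_normalize`, run here on a hand-written stub that mirrors the `MRData` / `RaceTable` / `Races` / `Results` nesting; the race name and the driver and constructor ids are placeholders, not real data, and `results_to_frame` is a hypothetical helper:

```python
import pandas as pd

# Hand-written stub with placeholder values, shaped like a results payload.
sample_payload = {
    "MRData": {
        "RaceTable": {
            "season": "2024",
            "Races": [{
                "round": "1",
                "raceName": "Example Grand Prix",
                "Results": [
                    {"position": "1", "points": "25",
                     "Driver": {"driverId": "driver_a"},
                     "Constructor": {"constructorId": "team_a"}},
                    {"position": "2", "points": "18",
                     "Driver": {"driverId": "driver_b"},
                     "Constructor": {"constructorId": "team_b"}},
                ],
            }],
        }
    }
}

def results_to_frame(payload):
    """Return one row per classified driver, with race-level columns attached."""
    races = payload["MRData"]["RaceTable"]["Races"]
    return pd.json_normalize(races, record_path="Results", meta=["round", "raceName"])

results_df = results_to_frame(sample_payload)
print(results_df)
```

Nested objects become dotted columns such as `Driver.driverId`, and the values stay strings, so `position` and `points` would need casting before any arithmetic. Several rounds could then be combined with `pd.concat([results_to_frame(p) for p in results_24], ignore_index=True)`.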