## F1 Consistency Dashboard
This analysis evaluates and visualizes how consistent F1 drivers were during the 2024 season, based on their lap times and race finishing positions.
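
The notebook does not define "consistency" at this point, so the following is only a rough sketch of one possible measure: the standard deviation of each driver's lap times, scaled by that driver's mean lap time (the coefficient of variation). The toy values and the column names `driver` and `milliseconds` are assumptions made for illustration, not taken from the dataset.

```python
import pandas as pd

# Toy lap data (made-up values); column names are illustrative assumptions.
laps = pd.DataFrame({
    "driver": ["A", "A", "A", "B", "B", "B"],
    "milliseconds": [90000, 90200, 89800, 91000, 88000, 93000],
})

# Per-driver mean and sample standard deviation of lap time.
lap_consistency = laps.groupby("driver")["milliseconds"].agg(["mean", "std"])

# Coefficient of variation: spread relative to the driver's own average pace.
# A lower value means steadier lap times.
lap_consistency["cv"] = lap_consistency["std"] / lap_consistency["mean"]

print(lap_consistency.sort_values("cv"))
```

The same `groupby` pattern could be applied to finishing positions to compare race-to-race spread between drivers.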

### Steps
1. Find a valid dataset
    - For the time being, we will use one from Kaggle. In the future, I would like to use an open source API from RapidAPI called [Hyprace API](https://rapidapi.com/hyprace-hyprace-default/api/hyprace-api) to get real-time race data for the 2025 season.
2. Import packages and load the datasets from CSV files
3. Clean data and merge datasets
    - Mainly using: `lap_times.csv`, `results.csv`, `races.csv`, `drivers.csv`
4. Rename and rearrange columns
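
Step 4 might look something like the sketch below. It uses a small made-up frame whose column names (`raceId`, `name`, `positionOrder`, `forename`, `surname`) match those in the merged data later in the notebook. The new names `race_name`, `finish_position`, and `driver_name` are hypothetical choices, not the ones the analysis necessarily ends up using.

```python
import pandas as pd

# Made-up rows; only the column names mirror the real merged frame.
df_demo = pd.DataFrame({
    "raceId": [1154, 1154],
    "name": ["Canadian Grand Prix", "Canadian Grand Prix"],
    "positionOrder": [16, 17],
    "forename": ["Ann", "Bob"],
    "surname": ["Example", "Sample"],
})

# Rename ambiguous columns to something self-describing (hypothetical names).
df_demo = df_demo.rename(columns={"name": "race_name",
                                  "positionOrder": "finish_position"})

# Build a single display name from the two name columns.
df_demo["driver_name"] = df_demo["forename"] + " " + df_demo["surname"]

# Rearrange by selecting columns in the desired order.
df_demo = df_demo[["raceId", "race_name", "driver_name", "finish_position"]]

print(df_demo)
```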

In [4]:
# import packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sb

# %matplotlib inline

In [9]:
# folder holding the raw Kaggle CSV files
DATA_DIR = '/Users/Kat/Documents/repos/F1-Race-Result-Analysis/data/raw'

lap_times = pd.read_csv(f'{DATA_DIR}/lap_times.csv')
results = pd.read_csv(f'{DATA_DIR}/results.csv')
races = pd.read_csv(f'{DATA_DIR}/races.csv')
drivers = pd.read_csv(f'{DATA_DIR}/drivers.csv')
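
The printed output further down shows that this dataset marks missing values with the literal text `\N`. If left as-is, columns such as `milliseconds` are read as text rather than numbers. As a possible alternative, `read_csv` accepts an `na_values` argument that converts such markers to `NaN` at load time. The sketch below uses a made-up in-memory CSV rather than the real files.

```python
import io
import pandas as pd

# Made-up two-row CSV that mimics the `\N` placeholder seen in the data.
csv_text = "resultId,milliseconds,fastestLapSpeed\n1,5521425,\\N\n2,\\N,\\N\n"

# Default load: `\N` is kept as text, so the column is not numeric.
raw = pd.read_csv(io.StringIO(csv_text))

# Treat `\N` as missing: the column becomes numeric with NaN for the gaps.
clean = pd.read_csv(io.StringIO(csv_text), na_values=["\\N"])

print(raw.dtypes)
print(clean.dtypes)
```

With the markers converted, numeric operations such as `mean()` or `std()` work on these columns without a separate cleaning pass.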

In [29]:
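# attach race info (year, name, round) to each result row
df = pd.merge(results, races[['raceId','year','name','round']], on='raceId', how='left')
# attach driver info; both frames have a `number` column, so pandas adds
# the suffixes number_x (from results) and number_y (from drivers)
df = pd.merge(df, drivers[['driverId','driverRef','number','code','forename','surname']], on='driverId', how='left')
print(df.tail(5))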

       resultId  raceId  driverId  constructorId number_x  grid position  \
26953     26959    1154       865            215        6    12       16   
26954     26960    1154       840            117       18    17       17   
26955     26961    1154       846              1        4     7       18   
26956     26962    1154       859            215       30    19       19   
26957     26963    1154       848              3       23     9       20   

      positionText  positionOrder  points  ...  fastestLapSpeed statusId  \
26953           16             16     0.0  ...               \N       11   
26954           17             17     0.0  ...               \N       11   
26955           18             18     0.0  ...               \N        4   
26956            R             19     0.0  ...               \N       25   
26957            R             20     0.0  ...               \N        5   

       year                 name round driverRef number_y  code   forename  \
26953  2
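
The `number_x` and `number_y` columns above appear because both `results` and `drivers` contain a `number` column. As a sketch on toy frames (all values and the `_demo` names are made up), the `suffixes` argument gives these columns clearer names, `validate` checks that each result matches at most one race and one driver, and a boolean mask keeps only the 2024 season that this analysis targets.

```python
import pandas as pd

# Toy stand-ins for results, races and drivers (made-up values).
results_demo = pd.DataFrame({"resultId": [1, 2], "raceId": [10, 11],
                             "driverId": [100, 100], "number": ["6", "6"]})
races_demo = pd.DataFrame({"raceId": [10, 11], "year": [2024, 2025],
                           "name": ["Race A", "Race B"]})
drivers_demo = pd.DataFrame({"driverId": [100], "number": ["6"],
                             "surname": ["Example"]})

# validate="many_to_one" raises if a key is duplicated on the right side.
merged = results_demo.merge(races_demo, on="raceId", how="left",
                            validate="many_to_one")

# Explicit suffixes replace the default _x / _y on the clashing `number` column.
merged = merged.merge(drivers_demo, on="driverId", how="left",
                      suffixes=("_result", "_driver"), validate="many_to_one")

# Keep only rows from the 2024 season.
season_2024 = merged[merged["year"] == 2024]

print(season_2024)
```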

In [30]:
# drop columns that are not needed for the analysis
df = df.drop(columns=['positionText','points','number_x','statusId'])
print(df.tail(5))

       resultId  raceId  driverId  constructorId  grid position  \
26953     26959    1154       865            215    12       16   
26954     26960    1154       840            117    17       17   
26955     26961    1154       846              1     7       18   
26956     26962    1154       859            215    19       19   
26957     26963    1154       848              3     9       20   

       positionOrder  laps           time milliseconds  ... fastestLapTime  \
26953             16    69         +8.737      5521425  ...       1:16.292   
26954             17    69         +9.063      5521751  ...       1:14.902   
26955             18    66  +-1:52:04.782      5037470  ...       1:14.229   
26956             19    53             \N           \N  ...       1:16.320   
26957             20    46             \N           \N  ...       1:16.197   

      fastestLapSpeed  year                 name  round driverRef  number_y  \
26953              \N  2025  Canadian Grand Prix 
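
The `fastestLapTime` values above are strings such as `1:16.292`, so they need converting before any statistics can be computed on them. The helper below is a hypothetical sketch (the name `lap_time_to_seconds` is not from the notebook) that turns `M:SS.mmm` strings into seconds and returns `NaN` for anything it cannot parse, including the `\N` placeholder.

```python
import pandas as pd

def lap_time_to_seconds(value):
    """Convert an 'M:SS.mmm' string to seconds; return NaN if it cannot be parsed."""
    if not isinstance(value, str) or ":" not in value:
        return float("nan")
    minutes, seconds = value.split(":", 1)
    try:
        return int(minutes) * 60 + float(seconds)
    except ValueError:
        # Covers formats this sketch does not handle, e.g. 'H:MM:SS.mmm'.
        return float("nan")

# Sample values copied from the printed output, plus the missing-value marker.
fastest = pd.Series(["1:16.292", "1:14.902", "\\N"])
fastest_seconds = fastest.map(lap_time_to_seconds)

print(fastest_seconds)
```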