# 🏎️ F1 Sprint Race Wins Analysis
**Author:** Faisal Nazir  
**Objective:** Download the Kaggle Formula 1 dataset and count sprint race wins per driver.

> This notebook uses the `kagglehub` library to fetch the dataset and pandas to merge and summarise the data.

In [3]:
!pip install kagglehub 

import kagglehub
import pandas as pd
import os

Collecting kagglehub
  Downloading kagglehub-0.3.12-py3-none-any.whl.metadata (38 kB)
Downloading kagglehub-0.3.12-py3-none-any.whl (67 kB)
Installing collected packages: kagglehub
Successfully installed kagglehub-0.3.12


In [4]:
# Download the dataset
path = kagglehub.dataset_download("rohanrao/formula-1-world-championship-1950-2020")
print("Path to dataset files:", path)

Downloading from https://www.kaggle.com/api/v1/datasets/download/rohanrao/formula-1-world-championship-1950-2020?dataset_version_number=24...
100%|██████████| 6.28M/6.28M [00:00<00:00, 10.6MB/s]
Extracting files...
Path to dataset files: /Users/marianbolous/.cache/kagglehub/datasets/rohanrao/formula-1-world-championship-1950-2020/versions/24
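Before loading anything, it can help to confirm that the download directory actually contains the CSV files the later cells expect. The sketch below is a minimal illustration: `missing_files` is a hypothetical helper, not part of `kagglehub`, and it is exercised on a temporary directory rather than the real dataset path.

```python
import os
import tempfile


def missing_files(dataset_dir, required):
    """Return the names in `required` that are not present in `dataset_dir`."""
    present = set(os.listdir(dataset_dir))
    return [name for name in required if name not in present]


# Demo on a throwaway directory that contains only races.csv
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "races.csv"), "w").close()
    result = missing_files(demo_dir, ["races.csv", "sprint_results.csv"])

print(result)
```

In the notebook itself, the same helper could be called with `path` and the list of file names read in the next cell.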


In [5]:
# Load the dataframes (circuits_df and results_df are loaded but not used in the cells below)
circuits_df = pd.read_csv(os.path.join(path, 'circuits.csv'))
races_df = pd.read_csv(os.path.join(path, 'races.csv'))
results_df = pd.read_csv(os.path.join(path, 'results.csv'))
drivers_df = pd.read_csv(os.path.join(path, 'drivers.csv'))
sprint_results_df = pd.read_csv(os.path.join(path, 'sprint_results.csv'))
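One thing worth checking when loading these files: this dataset is commonly reported to encode missing values as the literal string `\N`, which would make numeric columns load as text. That is an assumption about the files, not something this notebook verifies. The sketch below uses a made-up two-row excerpt to show how `na_values` in `pd.read_csv` handles that marker.

```python
import io

import pandas as pd

# Made-up excerpt shaped like sprint_results.csv; the values are illustrative only.
sample_csv = io.StringIO(
    "resultId,raceId,driverId,positionOrder,milliseconds\n"
    "1,100,10,1,1538426\n"
    "2,100,11,2,\\N\n"
)

# Treat the literal \N as missing so numeric columns stay numeric
sample_df = pd.read_csv(sample_csv, na_values=["\\N"])

print(sample_df.dtypes)
print(sample_df["milliseconds"].isna().sum())
```

If the real files do use this marker, the same `na_values` argument could be passed to each `pd.read_csv` call above.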

In [6]:
# Merge sprint results with races to get race information
sprint_races_df = pd.merge(sprint_results_df, races_df[['raceId', 'year', 'name']], on='raceId')

# Keep only the sprint winners (positionOrder == 1), then merge with drivers to get driver names
sprint_race_winners = pd.merge(
    sprint_races_df[sprint_races_df['positionOrder'] == 1],
    drivers_df[['driverId', 'forename', 'surname']],
    on='driverId',
)

# Combine forename and surname for full driver name
sprint_race_winners['driverName'] = sprint_race_winners['forename'] + ' ' + sprint_race_winners['surname']
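The filter-then-merge step above can be sketched on toy data to make the row flow easier to follow. The frames below are made up and only reuse the notebook's column names. The `validate="many_to_one"` argument is an optional safeguard, not something the original cell uses: it makes pandas raise an error if `raceId` or `driverId` is duplicated in the lookup table, which would otherwise silently inflate the win counts.

```python
import pandas as pd

# Toy data (made up) with the same column names as the notebook
toy_sprints = pd.DataFrame({
    "raceId": [1, 1, 2, 2],
    "driverId": [10, 11, 10, 11],
    "positionOrder": [1, 2, 2, 1],
})
toy_races = pd.DataFrame({
    "raceId": [1, 2],
    "year": [2021, 2022],
    "name": ["Race A", "Race B"],
})
toy_drivers = pd.DataFrame({
    "driverId": [10, 11],
    "forename": ["Ann", "Bob"],
    "surname": ["Alpha", "Beta"],
})

# Keep only the winner of each sprint, then attach race and driver details
toy_winners = toy_sprints[toy_sprints["positionOrder"] == 1]
toy_winners = toy_winners.merge(toy_races, on="raceId", validate="many_to_one")
toy_winners = toy_winners.merge(toy_drivers, on="driverId", validate="many_to_one")
toy_winners["driverName"] = toy_winners["forename"] + " " + toy_winners["surname"]

print(toy_winners[["year", "name", "driverName"]])
```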

In [7]:
# Count the number of sprint wins for each driver
sprint_wins_count = sprint_race_winners['driverName'].value_counts().reset_index()
sprint_wins_count.columns = ['driverName', 'sprintWins']

In [8]:
# Display the results
print("Number of F1 Sprint race wins per driver:")
print(sprint_wins_count)

Number of F1 Sprint race wins per driver:
        driverName  sprintWins
0   Max Verstappen          11
1  Valtteri Bottas           2
2    Oscar Piastri           2
3   George Russell           1
4     Sergio Pérez           1
5     Lando Norris           1
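
The `year` column merged in earlier is not used in the totals above. A possible extension, sketched here on made-up data rather than the real results, is to break the wins down by season with `pd.crosstab`.

```python
import pandas as pd

# Made-up winners table with the same column names as sprint_race_winners
toy_winners = pd.DataFrame({
    "driverName": ["Ann Alpha", "Ann Alpha", "Bob Beta", "Ann Alpha"],
    "year": [2021, 2022, 2022, 2022],
})

# Rows are drivers, columns are seasons, cells are win counts (0 where a driver has none)
wins_by_year = pd.crosstab(toy_winners["driverName"], toy_winners["year"])

print(wins_by_year)
```

Applied to the notebook's data, the equivalent call would take `sprint_race_winners['driverName']` and `sprint_race_winners['year']` as its two arguments.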
