# Formula 1 "True Performance" Index: A Data-Driven Approach

## Introduction

Formula 1 is not always fair. In this sport, the car is often more important than the driver. An average driver in the best car can win the championship, while a genius driver in a slow car might finish last. Because of this, the official standings do not always show who the best driver really is.

**Main Objective:**
The goal of this project is to create a new, fair ranking system. I want to separate the driver's skill from the car's performance. My analysis tries to answer one fundamental question: Who is the best driver on the grid, rather than who has the fastest car? To find the answer, I focus on two key metrics:
- Teammate Comparison: Comparing a driver against their teammate (the only person with the exact same car).
- Race Improvement: Checking if a driver can gain positions during the race and recover from a bad start.

**Tools Used:** I used the *Formula 1 World Championship (1950 - 2024)* [link](https://www.kaggle.com/datasets/rohanrao/formula-1-world-championship-1950-2020) dataset and DuckDB (SQL) to analyze the data. The project involves:
- Cleaning and filtering raw data
- Using Window Functions for advanced calculations
- Creating a final score to rank drivers from different teams


## Chapter 1: Setup and Data Preparation

To begin the analysis, I will initialize a DuckDB database instance in memory. This approach allows for high-performance query execution without the need for a complex server setup.

Next, I will load the raw data (CSV files) into the database. This step converts the static files into queryable SQL tables.

In [13]:
%load_ext sql
%sql duckdb:///:memory:
%sql ROLLBACK



In [14]:
%%sql
CREATE OR REPLACE TABLE circuits AS SELECT * FROM 'seeds/circuits.csv';
CREATE OR REPLACE TABLE constructors AS SELECT * FROM 'seeds/constructors.csv';
CREATE OR REPLACE TABLE drivers AS SELECT * FROM 'seeds/drivers.csv';
CREATE OR REPLACE TABLE races AS SELECT * FROM 'seeds/races.csv';
CREATE OR REPLACE TABLE results AS SELECT * FROM 'seeds/results.csv';
CREATE OR REPLACE TABLE status AS SELECT * FROM 'seeds/status.csv';
CREATE OR REPLACE TABLE qualifying AS SELECT * FROM 'seeds/qualifying.csv';
CREATE OR REPLACE TABLE pit_stops AS SELECT * FROM 'seeds/pit_stops.csv';
CREATE OR REPLACE TABLE lap_times AS SELECT * FROM 'seeds/lap_times.csv';



### Data Overview

Before diving into the main analysis, let's briefly inspect the dataset structure.

The full dataset consists of 14 relational tables. However, to keep the analysis focused, I will use the 5 key tables that are relevant to driver performance:
- constructors: Team names and details.
- races: Information about each Grand Prix (year, date, circuit).
- results: The core table containing positions, points, and fastest-lap times for every race entry.
- drivers: Driver names and personal information.
- status: Explains the race outcome (e.g., "Finished", "Collision", "Engine Failure").

#### Table: Races

This dimension table acts as the event calendar. It contains essential metadata for every Grand Prix, including the race date, official name, circuit location, and start time.

In [15]:
%%sql

SELECT * FROM races

raceId,year,round,circuitId,name,date,time,url,fp1_date,fp1_time,fp2_date,fp2_time,fp3_date,fp3_time,quali_date,quali_time,sprint_date,sprint_time
1,2009,1,1,Australian Grand Prix,2009-03-29,06:00:00,http://en.wikipedia.org/wiki/2009_Australian_Grand_Prix,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N
2,2009,2,2,Malaysian Grand Prix,2009-04-05,09:00:00,http://en.wikipedia.org/wiki/2009_Malaysian_Grand_Prix,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N
3,2009,3,17,Chinese Grand Prix,2009-04-19,07:00:00,http://en.wikipedia.org/wiki/2009_Chinese_Grand_Prix,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N
4,2009,4,3,Bahrain Grand Prix,2009-04-26,12:00:00,http://en.wikipedia.org/wiki/2009_Bahrain_Grand_Prix,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N
5,2009,5,4,Spanish Grand Prix,2009-05-10,12:00:00,http://en.wikipedia.org/wiki/2009_Spanish_Grand_Prix,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N
6,2009,6,6,Monaco Grand Prix,2009-05-24,12:00:00,http://en.wikipedia.org/wiki/2009_Monaco_Grand_Prix,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N
7,2009,7,5,Turkish Grand Prix,2009-06-07,12:00:00,http://en.wikipedia.org/wiki/2009_Turkish_Grand_Prix,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N
8,2009,8,9,British Grand Prix,2009-06-21,12:00:00,http://en.wikipedia.org/wiki/2009_British_Grand_Prix,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N
9,2009,9,20,German Grand Prix,2009-07-12,12:00:00,http://en.wikipedia.org/wiki/2009_German_Grand_Prix,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N
10,2009,10,11,Hungarian Grand Prix,2009-07-26,12:00:00,http://en.wikipedia.org/wiki/2009_Hungarian_Grand_Prix,\N,\N,\N,\N,\N,\N,\N,\N,\N,\N


#### Table: Results

This is the central Fact Table of the dataset. It connects all other dimensions (Drivers, Races, Constructors) via Foreign Keys.
It contains the granular performance metrics for every entry, including grid position, finishing rank, points scored, and fastest lap times. The grain of this table is one row per driver per race.

In [16]:
%%sql
SELECT * FROM results

resultId,raceId,driverId,constructorId,number,grid,position,positionText,positionOrder,points,laps,time,milliseconds,fastestLap,rank,fastestLapTime,fastestLapSpeed,statusId
1,18,1,1,22,1,1,1,1,10.0,58,1:34:50.616,5690616,39,2,1:27.452,218.3,1
2,18,2,2,3,5,2,2,2,8.0,58,+5.478,5696094,41,3,1:27.739,217.586,1
3,18,3,3,7,7,3,3,3,6.0,58,+8.163,5698779,41,5,1:28.090,216.719,1
4,18,4,4,5,11,4,4,4,5.0,58,+17.181,5707797,58,7,1:28.603,215.464,1
5,18,5,1,23,3,5,5,5,4.0,58,+18.014,5708630,43,1,1:27.418,218.385,1
6,18,6,3,8,13,6,6,6,3.0,57,\N,\N,50,14,1:29.639,212.974,11
7,18,7,5,14,17,7,7,7,2.0,55,\N,\N,54,8,1:29.534,213.224,5
8,18,8,6,1,15,8,8,8,1.0,53,\N,\N,20,4,1:27.903,217.18,5
9,18,9,2,4,2,\N,R,9,0.0,47,\N,\N,15,9,1:28.753,215.1,4
10,18,10,7,12,18,\N,R,10,0.0,43,\N,\N,23,13,1:29.558,213.166,3


#### Table: Drivers

This dimension table stores personal information for every driver in the dataset. It includes details such as their forename, surname, date of birth, and nationality.

In [17]:
%%sql
SELECT * FROM drivers

driverId,driverRef,number,code,forename,surname,dob,nationality,url
1,hamilton,44,HAM,Lewis,Hamilton,1985-01-07,British,http://en.wikipedia.org/wiki/Lewis_Hamilton
2,heidfeld,\N,HEI,Nick,Heidfeld,1977-05-10,German,http://en.wikipedia.org/wiki/Nick_Heidfeld
3,rosberg,6,ROS,Nico,Rosberg,1985-06-27,German,http://en.wikipedia.org/wiki/Nico_Rosberg
4,alonso,14,ALO,Fernando,Alonso,1981-07-29,Spanish,http://en.wikipedia.org/wiki/Fernando_Alonso
5,kovalainen,\N,KOV,Heikki,Kovalainen,1981-10-19,Finnish,http://en.wikipedia.org/wiki/Heikki_Kovalainen
6,nakajima,\N,NAK,Kazuki,Nakajima,1985-01-11,Japanese,http://en.wikipedia.org/wiki/Kazuki_Nakajima
7,bourdais,\N,BOU,Sébastien,Bourdais,1979-02-28,French,http://en.wikipedia.org/wiki/S%C3%A9bastien_Bourdais
8,raikkonen,7,RAI,Kimi,Räikkönen,1979-10-17,Finnish,http://en.wikipedia.org/wiki/Kimi_R%C3%A4ikk%C3%B6nen
9,kubica,88,KUB,Robert,Kubica,1984-12-07,Polish,http://en.wikipedia.org/wiki/Robert_Kubica
10,glock,\N,GLO,Timo,Glock,1982-03-18,German,http://en.wikipedia.org/wiki/Timo_Glock


#### Table: Constructors

This dimension table stores information about the teams (Constructors). It includes key details such as the team name, nationality, and the unique ID used to link teams to the race results.

In [18]:
%%sql
SELECT * FROM constructors

constructorId,constructorRef,name,nationality,url
1,mclaren,McLaren,British,http://en.wikipedia.org/wiki/McLaren
2,bmw_sauber,BMW Sauber,German,http://en.wikipedia.org/wiki/BMW_Sauber
3,williams,Williams,British,http://en.wikipedia.org/wiki/Williams_Grand_Prix_Engineering
4,renault,Renault,French,http://en.wikipedia.org/wiki/Renault_in_Formula_One
5,toro_rosso,Toro Rosso,Italian,http://en.wikipedia.org/wiki/Scuderia_Toro_Rosso
6,ferrari,Ferrari,Italian,http://en.wikipedia.org/wiki/Scuderia_Ferrari
7,toyota,Toyota,Japanese,http://en.wikipedia.org/wiki/Toyota_Racing
8,super_aguri,Super Aguri,Japanese,http://en.wikipedia.org/wiki/Super_Aguri_F1
9,red_bull,Red Bull,Austrian,http://en.wikipedia.org/wiki/Red_Bull_Racing
10,force_india,Force India,Indian,http://en.wikipedia.org/wiki/Racing_Point_Force_India


#### Table: Status

This dimension table describes the final outcome of a driver's race. It provides crucial context by categorizing the result—whether the driver finished normally, or retired due to a specific issue like a collision, engine failure, or other mechanical problems.

In [19]:
%%sql
SELECT * FROM status

statusId,status
1,Finished
2,Disqualified
3,Accident
4,Collision
5,Engine
6,Gearbox
7,Transmission
8,Clutch
9,Hydraulics
10,Electrical


### Data Preparation: Creating the One Big Table (OBT)

The first step of the analysis is to denormalize the dataset. I will create a single, consolidated table called team_stats by joining data from all the dimension tables.

This "One Big Table" (OBT) approach is efficient because it eliminates the need to perform repetitive, complex joins in every subsequent query. Additionally, I am filtering the dataset to include only races from the year 2000 onwards, focusing the study on the modern era of Formula 1.
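To make the denormalization step concrete, here is a minimal Python sketch of the same idea on a tiny hand-made sample. The dictionaries stand in for the dimension tables, `build_obt` is a hypothetical helper name, and the 1999 race (raceId 9999) is an invented row included only to show the year filter; the notebook itself does this work in SQL.

```python
# Tiny stand-ins for the dimension tables (values for race 18 come from the dataset preview).
drivers = {1: "Lewis Hamilton", 5: "Heikki Kovalainen"}
constructors = {1: "McLaren"}
status = {1: "Finished"}
races = {
    18: {"name": "Australian Grand Prix", "year": 2008},
    9999: {"name": "Hypothetical Grand Prix", "year": 1999},  # invented, pre-2000
}

# Fact rows: one per driver per race.
results = [
    {"raceId": 18, "driverId": 1, "constructorId": 1, "points": 10.0, "statusId": 1},
    {"raceId": 18, "driverId": 5, "constructorId": 1, "points": 4.0, "statusId": 1},
    {"raceId": 9999, "driverId": 1, "constructorId": 1, "points": 0.0, "statusId": 1},
]


def build_obt(results, min_year=2000):
    """Join each result with its dimensions and keep races from min_year onwards."""
    rows = []
    for r in results:
        race = races.get(r["raceId"])
        driver = drivers.get(r["driverId"])
        team = constructors.get(r["constructorId"])
        outcome = status.get(r["statusId"])
        # INNER JOIN semantics: drop the row if any lookup has no match.
        if None in (race, driver, team, outcome):
            continue
        if race["year"] < min_year:
            continue
        rows.append({
            "driver_name": driver,
            "team": team,
            "race_name": race["name"],
            "race_year": race["year"],
            "points": r["points"],
            "status": outcome,
        })
    return rows


team_stats_sample = build_obt(results)
```

After this step every row carries its own driver, team, and race labels, so later calculations can work on a single flat list.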

In [20]:
%%sql
CREATE OR REPLACE TABLE team_stats AS
WITH main_table AS (
    SELECT 
        r.driverId,
        r.raceId,
        r.constructorId,
        r.points,
        r.positionOrder as position, 
        r.grid,
        r.statusId,
        s.status,
        CONCAT(d.forename, ' ', d.surname) as driver_name,
        c.name as team,
        ra.name as race_name,
        ra.year as race_year,
        ra.date as race_date
    FROM results as r
    INNER JOIN constructors as c 
        ON r.constructorId = c.constructorId
    INNER JOIN drivers as d 
        ON r.driverId = d.driverId
    INNER JOIN races as ra 
        ON r.raceId = ra.raceId
    INNER JOIN status as s 
        ON r.statusId = s.statusId
    WHERE ra.year >= 2000
)

SELECT * FROM main_table;



Here is a preview of the newly created consolidated dataset. By inspecting the first few rows, we can verify that the joins were successful and the data structure is ready for analysis.

In [21]:
%%sql
SELECT * FROM team_stats

driverId,raceId,constructorId,points,position,grid,statusId,status,driver_name,team,race_name,race_year,race_date
1,18,1,10.0,1,1,1,Finished,Lewis Hamilton,McLaren,Australian Grand Prix,2008,2008-03-16
2,18,2,8.0,2,5,1,Finished,Nick Heidfeld,BMW Sauber,Australian Grand Prix,2008,2008-03-16
3,18,3,6.0,3,7,1,Finished,Nico Rosberg,Williams,Australian Grand Prix,2008,2008-03-16
4,18,4,5.0,4,11,1,Finished,Fernando Alonso,Renault,Australian Grand Prix,2008,2008-03-16
5,18,1,4.0,5,3,1,Finished,Heikki Kovalainen,McLaren,Australian Grand Prix,2008,2008-03-16
6,18,3,3.0,6,13,11,+1 Lap,Kazuki Nakajima,Williams,Australian Grand Prix,2008,2008-03-16
7,18,5,2.0,7,17,5,Engine,Sébastien Bourdais,Toro Rosso,Australian Grand Prix,2008,2008-03-16
8,18,6,1.0,8,15,5,Engine,Kimi Räikkönen,Ferrari,Australian Grand Prix,2008,2008-03-16
9,18,2,0.0,9,2,4,Collision,Robert Kubica,BMW Sauber,Australian Grand Prix,2008,2008-03-16
10,18,7,0.0,10,18,3,Accident,Timo Glock,Toyota,Australian Grand Prix,2008,2008-03-16


## Chapter 2: Team Wars

In Formula 1, the car's performance is the biggest variable. Therefore, a driver's only true benchmark is their teammate, who drives an identical car.

In this module, I will analyze Driver Dominance. I aim to determine if both drivers contributed equally to the team's success or if one driver "carried" the team. I calculate this by measuring the percentage of the team's total points earned by each driver in a specific race.

To achieve this in SQL without complex self-joins, I will utilize Window Functions (specifically SUM() OVER (PARTITION BY...)) to calculate the team's total points dynamically for every row.
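As a rough illustration of what the window function computes, the Python sketch below reproduces the per-race team total and each driver's share for a few rows from the 2008 Australian Grand Prix preview. The helper name `add_team_share` is hypothetical, and the single Toyota row is included only to show the zero-points case.

```python
from collections import defaultdict

rows = [
    {"raceId": 18, "team": "McLaren", "driver_name": "Lewis Hamilton", "points": 10.0},
    {"raceId": 18, "team": "McLaren", "driver_name": "Heikki Kovalainen", "points": 4.0},
    {"raceId": 18, "team": "Toyota", "driver_name": "Timo Glock", "points": 0.0},
]


def add_team_share(rows):
    """Attach the team's race total and the driver's share of it to every row."""
    # Equivalent of SUM(points) OVER (PARTITION BY raceId, team).
    totals = defaultdict(float)
    for row in rows:
        totals[(row["raceId"], row["team"])] += row["points"]

    out = []
    for row in rows:
        team_points = totals[(row["raceId"], row["team"])]
        # Same guard as the CASE expression: a team with no points gives a share of 0.
        share = 0.0 if team_points == 0 else row["points"] / team_points
        out.append({**row, "team_points": team_points, "percentage_team_points": share})
    return out


shares = add_team_share(rows)
```

With these rows Hamilton's share is 10/14 and Kovalainen's is 4/14. Note that races in which the team scores nothing count as a share of 0 for both drivers, which pulls the season average down for teams that rarely score.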

In [23]:
%%sql
CREATE OR REPLACE TABLE driver_dominance AS

WITH team_points AS(
    SELECT 
        *,
        SUM(points) OVER(PARTITION BY raceid,team) as team_points
    FROM team_stats
),


team_percentage AS (
    SELECT 
        *,
        CASE
            WHEN team_points = 0 THEN 0
            ELSE points / team_points 
        END as percentage_team_points
    FROM team_points
)

SELECT
    driver_name,
    race_year,
    team,
    COUNT(raceId) as race_count,
    SUM(points) as points_sum,

    ROUND(AVG(percentage_team_points),2) as avg_dominance 
FROM team_percentage
GROUP BY driver_name, race_year, team
HAVING race_count > 5 
ORDER BY avg_dominance DESC;



With the metrics calculated, we can now generate the leaderboard. The following query lists the driver seasons with the highest intra-team dominance scores (the first 10 rows are shown). These are the instances where a single driver was responsible for the vast majority of their constructor's points.

In [24]:
%%sql
SELECT * FROM driver_dominance

driver_name,race_year,team,race_count,points_sum,avg_dominance
Max Verstappen,2024,Red Bull,24,399.0,0.77
Kimi Räikkönen,2012,Lotus F1,20,207.0,0.75
Robert Kubica,2010,Renault,19,136.0,0.73
Michael Schumacher,2002,Ferrari,17,144.0,0.71
Fernando Alonso,2012,Ferrari,20,278.0,0.71
Fernando Alonso,2013,Ferrari,19,242.0,0.7
Fernando Alonso,2014,Ferrari,19,161.0,0.7
Max Verstappen,2023,Red Bull,22,530.0,0.7
Fernando Alonso,2023,Aston Martin,22,198.0,0.69
Sebastian Vettel,2013,Red Bull,19,397.0,0.68


*TODO: add an in-depth analysis of the top three places here.*

After analyzing the data, we can observe clear outliers where a single driver carried the bulk of the team's performance:
- Max Verstappen (2024): He is the most dominant driver in the dataset, scoring on average 77% of Red Bull's points in each race.
- Kimi Räikkönen (2012): Upon his return to F1 with Lotus, Kimi delivered a masterclass in consistency (finishing every race), whereas his teammate (Romain Grosjean) struggled with volatility and incidents. This gap in stability lifted Kimi's dominance score to 0.75.
- Robert Kubica (2010): A remarkable anomaly appears in the 3rd spot. Driving for Renault, Kubica posted a dominance score of 0.73, significantly outperforming his rookie teammate (Vitaly Petrov) and demonstrating his ability to extract maximum value from the car.

## Chapter 3: Racecraft & The "Sunday Driver" Effect

In Formula 1, there is a distinct difference between "Qualifying Pace" (Saturday) and "Race Pace" (Sunday). Some drivers are incredibly fast over a single lap but struggle with tire degradation or strategy during the race. Others, known as "Sunday Drivers," may qualify poorly but possess excellent racecraft, allowing them to carve through the field and finish significantly higher than they started.

In this module, I will calculate the Position Delta for every race. The formula is:

$$ PositionDelta = GridPosition - FinishingPosition $$

- Positive Score (+): The driver improved their position (e.g., Start 10th → Finish 6th = +4).

- Negative Score (-): The driver lost positions (e.g., Start 2nd → Finish 5th = -3).

This metric helps to identify the most opportunistic and resilient drivers on the grid—those who turn bad situations into good points.
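The delta is a single subtraction; the short sketch below checks the two examples given above, plus one row from the dataset preview (Fernando Alonso, grid 11, finish 4). The function name `position_delta` is only illustrative.

```python
def position_delta(grid: int, finish: int) -> int:
    """Positions gained (positive) or lost (negative) between start and finish."""
    return grid - finish


gained = position_delta(10, 6)   # started 10th, finished 6th
lost = position_delta(2, 5)      # started 2nd, finished 5th
alonso = position_delta(11, 4)   # 2008 Australian Grand Prix row from the preview
```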

### Filtering: Mechanical Failures vs. Driver Performance

To accurately measure a driver's ability to gain positions, we must distinguish between human performance and machine reliability.

If a driver retires due to an engine failure, gearbox issue, or hydraulic leak, it is not a reflection of their skill—it is an engineering failure. Therefore, I will filter the dataset to include only "Valid Race Outcomes." A valid outcome is defined as:
- Classified Finish: The driver completed the race or was classified one or more laps down (Status IDs 1 and 11-19).
- Driver-Related Incidents: The driver retired due to an accident or collision, or was disqualified (Status IDs 2, 3, and 4; the query also includes Status ID 85). These are counted because avoiding trouble is part of "Racecraft."
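A minimal sketch of this filter in Python, assuming the same list of status IDs that the notebook's queries use. The sample entries are taken from the 2008 Australian Grand Prix preview, and `is_valid_outcome` is a hypothetical helper name.

```python
# Status IDs treated as valid outcomes (same list as in the SQL queries).
VALID_STATUS_IDS = {1, 2, 3, 4, 11, 12, 13, 14, 15, 16, 17, 18, 19, 85}

entries = [
    ("Lewis Hamilton", 1),    # Finished
    ("Kazuki Nakajima", 11),  # +1 Lap
    ("Kimi Räikkönen", 5),    # Engine (mechanical, excluded)
    ("Robert Kubica", 4),     # Collision (driver-related, kept)
]


def is_valid_outcome(status_id: int) -> bool:
    """True when the result should count towards the racecraft metric."""
    return status_id in VALID_STATUS_IDS


valid_drivers = [name for name, status_id in entries if is_valid_outcome(status_id)]
```

Only the engine failure is dropped here; the collision stays in because it is treated as part of the driver's own race.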

Below I show the statuses from table 'status' that we will analyze:

In [178]:
%%sql
select * from status
where statusId in [1,2,3,4,11,12,13,14,15,16,17,18,19,85]
order by statusId 

statusId,status
1,Finished
2,Disqualified
3,Accident
4,Collision
11,+1 Lap
12,+2 Laps
13,+3 Laps
14,+4 Laps
15,+5 Laps
16,+6 Laps


### Computing the position gain delta

In this step, I perform the complete analysis pipeline in a single query to generate the final racecraft_stats table. Using a Common Table Expression (CTE), I first calculate the Position Delta (Grid−Position) for every individual race entry, strictly filtering for valid race outcomes (excluding mechanical failures).

Then, I immediately aggregate these results to calculate the Average Positions Gained per driver for each season. To reduce small-sample noise and remove "one-off" drivers (e.g., substitutes), I apply a threshold: only drivers with more than 15 valid race outcomes in a season are included.
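The aggregation and threshold can be sketched in a few lines of Python. The entries below are invented purely for illustration (a regular driver with 16 valid outcomes and a substitute with 3), and `average_gain` is a hypothetical helper, not part of the notebook.

```python
from collections import defaultdict


def average_gain(entries, min_races=15):
    """Average (grid - finish) per (driver, year, team).

    Only groups with more than `min_races` entries are kept,
    mirroring HAVING races_count > 15.
    """
    grouped = defaultdict(list)
    for driver, year, team, grid, finish in entries:
        grouped[(driver, year, team)].append(grid - finish)
    return {
        key: (round(sum(deltas) / len(deltas), 2), len(deltas))
        for key, deltas in grouped.items()
        if len(deltas) > min_races
    }


# Invented sample: Driver A gains 2 places in each of 16 races, Driver B races 3 times.
entries = [("Driver A", 2020, "Team X", 12, 10)] * 16
entries += [("Driver B", 2020, "Team X", 20, 12)] * 3

season_stats = average_gain(entries)
```

Driver B's large average gain never reaches the leaderboard because three races are below the threshold.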

In [25]:
%%sql
CREATE OR REPLACE TABLE racecraft_stats AS
WITH position_calc AS (
    -- One row per race entry with a valid outcome
    SELECT
        t.driver_name,
        t.driverId,
        t.race_year,
        t.team,
        r.grid,
        r.positionOrder,
        r.statusId,
        (r.grid - r.positionOrder) AS position_gained
    FROM team_stats AS t
    INNER JOIN results AS r
        ON t.raceId = r.raceId AND t.driverId = r.driverId
    WHERE r.statusId IN (1, 2, 3, 4, 11, 12, 13, 14, 15, 16, 17, 18, 19, 85)
)

-- Average positions gained per driver, season, and team
SELECT
    driverId,
    driver_name,
    race_year,
    team,
    ROUND(AVG(position_gained), 2) AS avg_position_gained,
    COUNT(*) AS races_count
FROM position_calc
GROUP BY driverId, driver_name, race_year, team
HAVING races_count > 15
ORDER BY avg_position_gained DESC



In [33]:
%%sql
select * from racecraft_stats

driverId,driver_name,race_year,team,avg_position_gained,races_count
33,Tiago Monteiro,2005,Jordan,4.89,18
818,Jean-Éric Vergne,2012,Toro Rosso,3.89,18
838,Stoffel Vandoorne,2018,McLaren,3.89,18
816,Jérôme d'Ambrosio,2011,Virgin,3.71,17
21,Giancarlo Fisichella,2004,Sauber,3.56,18
825,Kevin Magnussen,2016,Renault,3.41,17
840,Lance Stroll,2019,Racing Point,3.32,19
8,Kimi Räikkönen,2005,McLaren,3.31,16
831,Felipe Nasr,2016,Sauber,3.29,17
1,Lewis Hamilton,2014,Mercedes,3.25,16


The resulting leaderboard reveals a fascinating insight - the Positions Gained metric does not necessarily highlight the fastest drivers, but rather the most resilient ones. The top of the table is dominated by drivers who maximized opportunities from lower grid slots.

Top Findings:

- Tiago Monteiro (2005) - Ideally, a midfield driver would lead this metric. However, Monteiro's #1 spot is a classic example of survivorship bias. Driving for Jordan (a backmarker team), he almost always started near the back (P18-P20). His high score comes from extreme reliability (finishing races while others retired) and the infamous 2005 US Grand Prix anomaly (where he started 17th and finished 3rd because only 6 cars raced).

- Jean-Éric Vergne (2012) - Vergne fits the model well. In 2012, he was notoriously weak in qualifying compared to his teammate (Daniel Ricciardo), but his racecraft and tire management were superior. He repeatedly turned starts around P16 into points finishes, earning him the 2nd spot on this list.

- Lewis Hamilton (2014) - It is rare to see a World Champion on this list (as they usually start P1). However, Hamilton appears here due to specific "recovery drives" in 2014 (e.g., Germany and Hungary), where mechanical issues forced him to start from the back. His ability to slice through the entire field boosted his average significantly, proving he has elite racecraft even when out of position.

*TODO: add a proper in-depth analysis here as well.*

### Data Normalization

To combine the "Racecraft" metric with Teammate Dominance, both need to be on a common scale. Currently, avg_position_gained is a raw number (e.g., +4.89 or -2.1). I will apply Min-Max Normalization within each season to rescale it to a 0 to 1 range, which the query then multiplies by 100:

$$ Score = \frac{x - \min(x)}{\max(x) - \min(x)} $$

- Score = 100: The best performance of that season (the maximum average positions gained in that year).
- Score = 0: The lowest performance of that season.

If every driver in a season has the same value, the denominator is zero and the score is NULL. The dominance metric is already a share between 0 and 1, so it is only multiplied by 100.
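As a worked check of the formula, the sketch below normalizes four avg_position_gained values from the 2009 season and scales them to 0-100. The function name `min_max_scores` is hypothetical, and returning `None` when all values are equal is meant to mirror the `NULLIF` guard in the SQL.

```python
def min_max_scores(values):
    """Rescale one season's values to 0-100; None when max equals min."""
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo
    return {
        name: None if span == 0 else (value - lo) / span * 100
        for name, value in values.items()
    }


# avg_position_gained for selected 2009 drivers (min and max of that season included).
gains_2009 = {
    "Nico Rosberg": 0.38,
    "Jenson Button": 1.35,
    "Nick Heidfeld": 1.56,   # season maximum
    "Jarno Trulli": -1.25,   # season minimum
}

scores_2009 = min_max_scores(gains_2009)
```

For Rosberg this gives (0.38 + 1.25) / (1.56 + 1.25) × 100, roughly 58.0, which agrees with the score_racecraft value in the notebook output.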

In [38]:
%%sql
CREATE OR REPLACE TABLE normalized_stats AS
WITH stats_per_year AS (
    SELECT 
        d.driver_name,
        d.race_year,
        d.team,
        d.avg_dominance, 
        r.avg_position_gained,      
        MIN(r.avg_position_gained) OVER (PARTITION BY d.race_year) as min_gain,
        MAX(r.avg_position_gained) OVER (PARTITION BY d.race_year) as max_gain
        
    FROM driver_dominance as d
    INNER JOIN racecraft_stats as r 
        ON d.driver_name = r.driver_name 
        AND d.race_year = r.race_year
        AND d.team = r.team
),

normalized_scores AS (
    SELECT 
        *,
 
        avg_dominance * 100 as score_dominance,
        
        ((avg_position_gained - min_gain) / NULLIF(max_gain - min_gain, 0)) * 100 as score_racecraft
        
    FROM stats_per_year
)

SELECT 
    *
FROM normalized_scores



In [39]:
%%sql
SELECT * FROM normalized_stats

driver_name,race_year,team,avg_dominance,avg_position_gained,min_gain,max_gain,score_dominance,score_racecraft
Nico Rosberg,2009,Williams,0.65,0.38,-1.25,1.56,65.0,58.007117437722414
Jenson Button,2009,Brawn,0.55,1.35,-1.25,1.56,55.00000000000001,92.52669039145908
Rubens Barrichello,2009,Brawn,0.45,0.06,-1.25,1.56,45.0,46.619217081850536
Lewis Hamilton,2009,McLaren,0.39,0.56,-1.25,1.56,39.0,64.41281138790036
Mark Webber,2009,Red Bull,0.39,0.31,-1.25,1.56,39.0,55.51601423487544
Nick Heidfeld,2009,BMW Sauber,0.32,1.56,-1.25,1.56,32.0,100.0
Jarno Trulli,2009,Toyota,0.31,-1.25,-1.25,1.56,31.0,0.0
Kimi Räikkönen,2012,Lotus F1,0.75,2.0,-4.5,3.89,75.0,77.47318235995232
Fernando Alonso,2012,Ferrari,0.71,0.9,-4.5,3.89,71.0,64.36233611442192
Sebastian Vettel,2012,Red Bull,0.59,1.83,-4.5,3.89,59.0,75.44696066746125


## Chapter 4: Calculating the "True Driver" Score

In this final step, I will merge the two key metrics into a single ranking. The goal is to find the Complete Driver: someone who not only dominates their teammate in equal machinery but also excels at fighting through the pack on Sundays.

The Formula: To calculate the final score, I will take a weighted average of the two 0-100 scores, with dominance weighted at 70% and racecraft at 30%:

$$ FinalScore = 0.7 \cdot Score_{dominance} + 0.3 \cdot Score_{racecraft} $$

- Dominance Score: Raw speed and consistency against the teammate.
- Racecraft Score: The ability to recover positions and manage the race.
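To see how the weighting plays out, here is a small Python check using Kimi Räikkönen's 2012 scores from the normalized table (dominance 75.0, racecraft about 77.47). The function name `total_rating` and its parameter names are illustrative only.

```python
def total_rating(score_dominance, score_racecraft, w_dominance=0.7, w_racecraft=0.3):
    """Weighted average of the two 0-100 scores, rounded to two decimals."""
    return round(w_dominance * score_dominance + w_racecraft * score_racecraft, 2)


# Kimi Räikkönen, 2012, Lotus F1 (values from normalized_stats).
raikkonen_2012 = total_rating(75.0, 77.47318235995232)
```

The result is 0.7 × 75.0 + 0.3 × 77.47 ≈ 75.74. Because dominance carries 70% of the weight, a driver with a perfect racecraft score but an even points split with their teammate would still rate below this.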

In [40]:
%%sql

SELECT 
    race_year,
    driver_name,
    team,
    ROUND(score_dominance, 1) as points_dominance_in_team,
    ROUND(score_racecraft, 1) as points_overtake,
    
    ROUND(
        (score_dominance * 0.7) + (COALESCE(score_racecraft, 0) * 0.3)
    , 2) as TOTAL_RATING
    
FROM normalized_stats
WHERE score_racecraft IS NOT NULL 
ORDER BY TOTAL_RATING DESC
LIMIT 20;

race_year,driver_name,team,points_dominance_in_team,points_overtake,TOTAL_RATING
2012,Kimi Räikkönen,Lotus F1,75.0,77.5,75.74
2024,Max Verstappen,Red Bull,77.0,72.4,75.61
2023,Max Verstappen,Red Bull,70.0,83.1,73.94
2013,Kimi Räikkönen,Lotus F1,61.0,100.0,72.7
2010,Robert Kubica,Renault,73.0,69.4,71.92
2013,Fernando Alonso,Ferrari,70.0,68.8,69.63
2006,Michael Schumacher,Ferrari,58.0,96.7,69.62
2012,Fernando Alonso,Ferrari,71.0,64.4,69.01
2023,Fernando Alonso,Aston Martin,69.0,63.4,67.32
2014,Fernando Alonso,Ferrari,70.0,60.9,67.28


## Conclusions

The final ranking provides a unique perspective on modern Formula 1 history. By combining Teammate Dominance (raw speed) and Racecraft (resilience), I have identified the drivers who maximized their machinery's potential.

**The Winner: Kimi Räikkönen (2012)**

The "Iceman" takes the top spot with a Total Rating of 75.74. His 2012 season with Lotus is legendary. Returning from a two-year break, he finished every single race (remarkable reliability and racecraft) and completely outperformed his teammate, Romain Grosjean. He wasn't always the fastest on Saturday, but on Sunday, he was unstoppable.

**Max Verstappen (2024 & 2023)**

Max occupies both the 2nd and 3rd spots, which reflects how thoroughly he outscored Sergio Perez at Red Bull. While his car was superior, his dominance score shows that he was operating on a completely different level than the other driver in the same garage.

**Fernando Alonso**

Perhaps the most validating aspect of this model is the presence of Fernando Alonso. He appears in the Top 10 four times (2012, 2013, 2014, 2023). This is consistent with the widely held belief that Alonso is the master of "outperforming the car." Whether in a Ferrari or an Aston Martin, he scored roughly 70% of his team's points in each of those seasons.