# 🏎 Formula 1 'Hybrid Era' Analysis

<img src="notebook_source/JPG-RGB-F1_70_Number_HotRed_Standard_RGB-1024x302.jpg">

### About the Dataset

#### Context

Formula 1 (a.k.a. F1 or Formula One) is the highest class of single-seater auto racing sanctioned by the Fédération Internationale de l'Automobile (FIA) and owned by the Formula One Group. The FIA Formula One World Championship has been one of the premier forms of racing around the world since its inaugural season in 1950. The word "formula" in the name refers to the set of rules to which all participants' cars must conform. A Formula One season consists of a series of races, known as Grands Prix, which take place worldwide on purpose-built circuits and on public roads.

#### Content

The dataset covers Formula 1 races, drivers, constructors, qualifying, circuits, lap times, pit stops and championships from 1950 through the 2021 season.

## What is the Hybrid Era?

### 2014-2021

The year 2014 ushered in the most significant rule changes in F1 history, with normally aspirated, 2.4-liter V8 engines replaced by new, 1.6-liter turbocharged V6 “power units” (no longer officially called engines) integrated with complex hybrid energy recovery systems (ERS) that the FIA claimed “gave the sport a much cleaner and greener image more relevant to developing road car technologies.”

The new formula allows turbocharged engines, which last appeared in 1988. These have their efficiency improved through turbo-compounding by recovering energy from exhaust gases. The original proposal for four-cylinder turbocharged engines was not welcomed by the racing teams, in particular Ferrari. A compromise was reached, allowing V6 forced induction engines instead. The engines rarely exceed 12,000 rpm during qualifying and race, due to the new fuel flow restrictions.

Energy recovery systems such as KERS had a boost of 160 hp (120 kW) and 2 megajoules per lap. KERS was renamed Motor Generator Unit–Kinetic (MGU-K). Heat energy recovery systems were also allowed, under the name Motor Generator Unit–Heat (MGU-H).
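
As a rough sanity check on these figures, the snippet below converts the quoted 160 hp to kilowatts and estimates how long a 2 MJ per-lap allowance would last at full MGU-K output. It assumes, purely for illustration, that the whole 2 MJ is deployed at a constant 120 kW, which is a simplification and not how teams actually deploy energy. The helper names `hp_to_kw` and `full_power_seconds` are hypothetical.

```python
HP_TO_KW = 0.7457  # 1 mechanical horsepower is roughly 0.7457 kW


def hp_to_kw(hp):
    """Convert mechanical horsepower to kilowatts."""
    return hp * HP_TO_KW


def full_power_seconds(energy_mj, power_kw):
    """Seconds of deployment if all the energy is spent at constant power."""
    # Assumption: constant power draw, no losses.
    return (energy_mj * 1_000_000) / (power_kw * 1_000)


mguk_kw = hp_to_kw(160)
print(f"160 hp is about {mguk_kw:.0f} kW")
print(f"2 MJ at 120 kW lasts about {full_power_seconds(2, 120):.1f} s")
```

Under these assumptions the allowance corresponds to well under half a minute of full electric boost per lap, which is why deployment strategy matters.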

The 2015 season was an improvement on 2014, adding about 30–50 hp (20–40 kW) to most engines, the Mercedes engine being the most powerful with 870 hp (649 kW). In 2019, Renault's engine was claimed to have hit 1,000 hp in qualifying trim.

### 2022

In 2017, the FIA began negotiations with existing constructors and potential new manufacturers over the next generation of engines, with a projected introduction date of 2021 that was later delayed to 2022 due to the effects of the COVID-19 pandemic. The initial proposal was designed to simplify engine designs, cut costs, promote new entries and address criticisms directed at the 2014 generation of engines. It called for the 1.6 L V6 configuration to be retained, but abandoned the complex Motor Generator Unit–Heat (MGU-H) system. The Motor Generator Unit–Kinetic (MGU-K) would be more powerful, with a greater emphasis on driver deployment and a more flexible introduction to allow for tactical use.

The proposal also called for the introduction of standardised components and design parameters to make components produced by all manufacturers compatible with one another in a system dubbed "plug in and play". A further proposal to allow four-wheel drive cars was also made, with the front axle driven by an MGU-K unit—as opposed to the traditional driveshaft—that functioned independently of the MGU-K providing power to the rear axle, mirroring the system developed by Porsche for the 919 Hybrid race car.

However, mostly due to no engine supplier applying for F1 entry in 2021 and 2022, abolishment of the MGU-H, a more powerful MGU-K and a four-wheel drive system were all shelved with the possibility of their re-introduction for 2026. Instead, the teams and FIA agreed to a radical change in body/chassis aerodynamics to promote more battles on the course at closer distances to each other. They further agreed to an increase in alcohol content from 5.75% to 10% of fuel, and to implement a freeze on power unit design for 2022-2025, with the ICE, turbocharger and MGU-H being frozen on March 1st and the energy store, MGU-K and control electronics being frozen on September 1st during the 2022 season. Honda, the outgoing engine supplier in 2021, was keen to keep the MGU-H, and Red Bull, who took over the engine production project, backed that opinion. The 4WD system was planned to be based on Porsche 919 Hybrid system, but Porsche ended up not becoming an F1 engine supplier for 2021-2022.

### How to run the code

This is an executable [*Jupyter notebook*](https://jupyter.org) hosted on [Jovian.ml](https://www.jovian.ml), a platform for sharing data science projects. You can run and experiment with the code in a couple of ways: *using free online resources* (recommended) or *on your own computer*.

#### Option 1: Running using free online resources (1-click, recommended)

The easiest way to start executing this notebook is to click the "Run" button at the top of this page, and select "Run on Binder". This will run the notebook on [mybinder.org](https://mybinder.org), a free online service for running Jupyter notebooks. You can also select "Run on Colab" or "Run on Kaggle".


#### Option 2: Running on your computer locally

1. Install Conda by [following these instructions](https://conda.io/projects/conda/en/latest/user-guide/install/index.html). Add Conda binaries to your system `PATH`, so you can use the `conda` command on your terminal.

2. Create a Conda environment and install the required libraries by running these commands on the terminal:

```
conda create -n zerotopandas -y python=3.8 
conda activate zerotopandas
pip install jovian jupyter numpy pandas matplotlib seaborn opendatasets --upgrade
```

3. Press the "Clone" button above to copy the command for downloading the notebook, and run it on the terminal. This will create a new directory and download the notebook. The command will look something like this:

```
jovian clone notebook-owner/notebook-id
```



4. Enter the newly created directory using `cd directory-name` and start the Jupyter notebook.

```
jupyter notebook
```

You can now access Jupyter's web interface by clicking the link that shows up on the terminal or by visiting http://localhost:8888 on your browser. Click on the notebook file (it has a `.ipynb` extension) to open it.


## Downloading the Dataset

We will use pyergast to access the Formula 1 database provided by the Ergast API. pyergast gives us direct access to the database, with no need to download a CSV or any other files.

The "explore_pyergast.py" file dives into pyergast's features and does some minor exploration of the data to gain familiarity with the pyergast commands.
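
For orientation, here is a minimal sketch of what sits underneath a wrapper like pyergast: the Ergast API serves race results as JSON at URLs of the form `http://ergast.com/api/f1/{season}/{round}/results.json`. The snippet does not use pyergast's own functions and makes no network request. The helpers `results_url` and `finishing_order` are hypothetical names, and the sample payload is hand-written and heavily trimmed to the shape of an Ergast response.

```python
import json

ERGAST_BASE = "http://ergast.com/api/f1"


def results_url(season, round_number):
    """Build the Ergast URL for one race's results in JSON form."""
    return f"{ERGAST_BASE}/{season}/{round_number}/results.json"


def finishing_order(payload):
    """Return (position, driver family name, constructor) tuples from a results payload."""
    races = payload["MRData"]["RaceTable"]["Races"]
    if not races:
        return []
    return [
        (int(r["position"]), r["Driver"]["familyName"], r["Constructor"]["name"])
        for r in races[0]["Results"]
    ]


# Hand-written, heavily trimmed sample in the shape of an Ergast response.
sample = json.loads("""
{"MRData": {"RaceTable": {"season": "2015", "round": "1", "Races": [
  {"raceName": "Australian Grand Prix", "Results": [
    {"position": "1", "Driver": {"familyName": "Hamilton"}, "Constructor": {"name": "Mercedes"}},
    {"position": "2", "Driver": {"familyName": "Rosberg"}, "Constructor": {"name": "Mercedes"}}
  ]}
]}}}
""")

print(results_url(2015, 1))
print(finishing_order(sample))
```

A real response carries many more fields per result (grid slot, laps, status and so on); only the ones read above are included here.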

In [8]:
# Long options take two dashes: --upgrade and --quiet
%pip install --upgrade pip --quiet
# fastf1 is imported below, so it is installed alongside pyergast
%pip install pyergast fastf1 --upgrade --quiet

In [9]:
# Libraries used throughout the notebook
import pprint as pp
import sys

import fastf1
import fastf1.plotting as plotting
import matplotlib.pyplot as plt
import pandas as pd
from fastf1.core import Laps
from IPython.utils import io
from timple.timedelta import strftimedelta

Let us save and upload our work to Jovian before continuing.

In [10]:
project_name = "hybrid-era-analysis"
project_file = "Hybrid_Era_Analysis.ipynb"

In [11]:
%pip install jovian --upgrade -q

Note: you may need to restart the kernel to use updated packages.


In [12]:
import jovian

In [13]:
jovian.commit(project=project_name, filename=project_file, files=["explore_ergast.py", "notebook_source"])

<IPython.core.display.Javascript object>

[jovian] Updating notebook "zoibderg/hybrid-era-analysis" on https://jovian.ai/
[jovian] Uploading additional files...
[jovian] Committed successfully! https://jovian.ai/zoibderg/hybrid-era-analysis


'https://jovian.ai/zoibderg/hybrid-era-analysis'

## Data Preparation and Cleaning

**TODO** - Write some explanation here.




In [30]:
# DATA PREPARATION
import os


class FormulaOneDataPreper:
    """Collect fastf1 schedules, event names and session results for a range of seasons."""

    def __init__(self, start, end, cache=True):
        # Seasons covered are start .. end - 1, because range() excludes `end`
        self.start = start
        self.end = end
        self.hybrid_schs = {}   # year -> event schedule
        self.events = {}        # year -> list of event names
        self.event_data = {}    # year -> {event name: session results}

        if cache:
            # Create the cache directory first so enable_cache does not fail on a missing path
            os.makedirs('./cache', exist_ok=True)
            fastf1.Cache.enable_cache('./cache')   # optional but recommended

    def get_schs(self, year=None):
        """Load the schedule for one season, or for every season in range."""
        if year is None:
            # NO YEAR REQUESTED, FALL BACK TO START AND END VALUES
            years = range(self.start, self.end)
        elif year in range(self.start, self.end):
            years = [year]
        else:
            print(f"Year {year} is outside of the data range {self.start} - {self.end - 1}")
            return
        for y in years:
            self.hybrid_schs[y] = fastf1.get_event_schedule(y, include_testing=False)

    def get_events(self, year=None, event=None):
        """Record event names for the loaded schedules, optionally a single event."""
        if year is None and event is not None:
            print(f"Year not specified for event {event}")
            return
        years = list(self.hybrid_schs) if year is None else [year]
        for y in years:
            if y not in self.hybrid_schs:
                print(f"Schedule for year {y} not found, please run get_schs() first.")
                continue
            names = list(self.hybrid_schs[y]['EventName'])
            if event is None:
                self.events[y] = names
            elif event in names:
                # Store a list (not a bare string) so callers can iterate over event names
                self.events[y] = [event]
            else:
                print(f"Event {event} not found in year {y}")

    def get_event_data(self, year=None, event=None, session='R'):
        """Load session results (the race by default) into self.event_data."""
        if year is not None and event is not None:
            self._load_results(year, event, session)
            return
        years = range(self.start, self.end) if year is None else [year]
        for y in years:
            if y not in self.events:
                print(f"Events for year {y} not found, please run get_schs() and get_events() first.")
                continue
            if event is None:
                for name in self.events[y]:
                    self._load_results(y, name, session)
            elif event in self.events[y]:
                self._load_results(y, event, session)
            else:
                print(f"Event {event} not found in year {y}.")

    def _load_results(self, year, event, session):
        """Fetch one session and store its results under event_data[year][event]."""
        data = fastf1.get_session(year, event, session)
        data.load()
        # setdefault keeps earlier events for the same year instead of overwriting them
        self.event_data.setdefault(year, {})[event] = data.results
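
Once results are collected, it is often convenient to work with a single table. The following is a minimal sketch, assuming `event_data` has the shape `{year: {event name: results DataFrame}}`. The helper `flatten_event_data` is a hypothetical name, and the demo frames are placeholder stand-ins for fastf1 results, not real race data.

```python
import pandas as pd


def flatten_event_data(event_data):
    """Stack {year: {event: DataFrame}} into one DataFrame with Year and Event columns."""
    frames = []
    for year, events in event_data.items():
        for event, results in events.items():
            # assign() returns a copy, so the stored frames are left untouched
            frames.append(results.assign(Year=year, Event=event))
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


# Hypothetical stand-ins for fastf1 results frames (placeholder values only).
demo = {
    2015: {
        "Event A": pd.DataFrame({"Abbreviation": ["AAA", "BBB"], "Points": [25.0, 18.0]}),
        "Event B": pd.DataFrame({"Abbreviation": ["BBB", "AAA"], "Points": [25.0, 18.0]}),
    }
}
combined = flatten_event_data(demo)
print(combined)
```

Keeping `Year` and `Event` as ordinary columns makes later grouping and filtering straightforward.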

In [31]:
# LOAD DATA FOR THE HYBRID ERA (HERE: A SINGLE 2015 RACE)

"""
WARNING: FASTF1 WILL DOWNLOAD DATA FROM THE INTERNET.
FASTF1 WILL ALSO OUTPUT A LOT OF INFORMATION TO THE CONSOLE; ENSURE OUTPUT IS COLLAPSED.

It is recommended to run the notebook in Google Colab.
The notebook is also available on GitHub.
"""

# Seasons 2014-2022: the end year is exclusive because the class iterates with range()
hybrid_data = FormulaOneDataPreper(2014, 2023)
hybrid_data.get_schs(year=2015)
hybrid_data.get_events(year=2015, event='Australian Grand Prix')
# Pass year and event explicitly so that only the requested race is loaded
hybrid_data.get_event_data(year=2015, event='Australian Grand Prix')

# print("schedule")
# print(hybrid_data.hybrid_schs)
# print("-----------------")
# print("events")
print(hybrid_data.events)
# print("-----------------")
# print("race data")
print(hybrid_data.event_data)

In [None]:
# LET'S MAKE ANOTHER JOVIAN COMMIT
jovian.commit(project=project_name, filename=project_file, files=["explore_ergast.py","explore_ergastpy.py", "notebook_source", "cache"])

In [None]:
import jovian

In [None]:
jovian.commit()

## Exploratory Analysis and Visualization

**TODO** - write some explanation here.




Let's begin by importing `matplotlib.pyplot` and `seaborn`.

In [None]:
import seaborn as sns
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline

sns.set_style('darkgrid')
matplotlib.rcParams['font.size'] = 14
matplotlib.rcParams['figure.figsize'] = (9, 5)
matplotlib.rcParams['figure.facecolor'] = '#00000000'
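
Before plotting, it usually helps to compute a few summary numbers. Below is a minimal sketch of that step using pandas on a small results-shaped frame. The column names `Abbreviation`, `TeamName` and `Points` mirror columns found in fastf1 results, but the values are hypothetical placeholders, not real race data.

```python
import pandas as pd

# Hypothetical results-shaped frame; values are placeholders, not real race data.
results = pd.DataFrame({
    "Abbreviation": ["AAA", "BBB", "CCC", "DDD"],
    "TeamName": ["Team X", "Team X", "Team Y", "Team Y"],
    "Points": [25.0, 18.0, 15.0, 12.0],
})

# Basic descriptive statistics (count, mean, min, max, quartiles) for a numeric column.
print(results["Points"].describe())

# Total and mean points per team, highest total first.
team_points = (
    results.groupby("TeamName")["Points"]
    .agg(["sum", "mean"])
    .sort_values("sum", ascending=False)
)
print(team_points)
```

An aggregate like `team_points` is a natural input for a bar chart, for example via `sns.barplot` after calling `reset_index()`.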

**TODO** - Explore one or more columns by plotting a graph below, and add some explanation about it

**TODO** - Explore one or more columns by plotting a graph below, and add some explanation about it

**TODO** - Explore one or more columns by plotting a graph below, and add some explanation about it

**TODO** - Explore one or more columns by plotting a graph below, and add some explanation about it

**TODO** - Explore one or more columns by plotting a graph below, and add some explanation about it

Let us save and upload our work to Jovian before continuing.

In [None]:
import jovian

In [None]:
jovian.commit()

## Asking and Answering Questions

**TODO** - Write some explanation here.






#### Q1: TODO - ask a question here and answer it below

#### Q2: TODO - ask a question here and answer it below

#### Q3: TODO - ask a question here and answer it below

#### Q4: TODO - ask a question here and answer it below

#### Q5: TODO - ask a question here and answer it below

Let us save and upload our work to Jovian before continuing.

In [None]:
import jovian

In [None]:
jovian.commit()

## Inferences and Conclusion

**TODO** - Write some explanation here: a summary of all the inferences drawn from the analysis, and any conclusions you may have drawn by answering various questions.

In [None]:
import jovian

In [None]:
jovian.commit()

## References and Future Work

**TODO** - Write some explanation here: ideas for future projects using this dataset, and links to resources you found useful.



 


## Resources:

* [**pyErgast**](https://github.com/weiranyu/pyErgast)**:** Python pandas wrapper for the [Ergast F1 API](http://ergast.com/mrd/). This package allows easy access to the Ergast API for anyone wishing to conduct analysis on Formula 1 data.
* [**Ergast API**](http://ergast.com/mrd/)**:** The Ergast Developer API is an experimental [web service](http://en.wikipedia.org/wiki/Web_service) which provides a historical record of motor racing data for non-commercial purposes. Please read the [terms and conditions of use](http://ergast.com/mrd/terms). The API provides data for the [Formula One](http://en.wikipedia.org/wiki/Formula_One) series, from the beginning of the world championships in 1950.

#### Usage

Use of the Ergast API is completely free, but you are welcome to [contribute to the annual running costs](https://liberapay.com/ergast). Any contributions above the actual costs will be donated to the [Grand Prix Trust](https://www.grandprixtrust.com/).


In [None]:
import jovian

In [None]:
jovian.commit(project=project_name, filename=project_file, files=["README.md", "SUMMARY.md", "explore_ergast.py"])