# 1. Import Libraries

---

For this task, we use the `FastF1` module. FastF1 provides access to a large amount of F1 data, including lap timing, car telemetry and position, tyre information, and many other data points.

Because it is built on pandas, NumPy, and Matplotlib, the module is user-friendly and well suited to data analysis and visualization.

Please refer to the following website for additional information:
> [FastF1 documentation](https://theoehrly.github.io/Fast-F1)

> [FastF1 github page](https://github.com/theOehrly/Fast-F1)

In [None]:
import csv
import fastf1
import pandas as pd

print(f"fastf1 version is {fastf1.__version__}")

At the time of writing, this notebook uses FastF1 version 2.3.1.

# 2. Set up:

---

Enabling the cache is not mandatory, but it is strongly recommended by developers of the library. You can provide the path to an empty folder on your system. Utilizing the cache will significantly increase the speed of data loading. Once you have enabled the cache, you can define the races and sessions from which you want to extract data.

In [None]:
# Enable the cache (the folder must exist before the cache is enabled)
import os

path = './cache'
os.makedirs(path, exist_ok=True)
fastf1.Cache.enable_cache(path)

In the `races` list, you can specify the races and sessions from which you wish to extract data.
- `year` : represents the year in which the race is taking place.
- `name` : represents the name of the race event.
- `sessions` : a list of strings, which represents the different sessions of the race event for which data is to be extracted. The available sessions are `FP1` (Free Practice 1), `FP2` (Free Practice 2), `FP3` (Free Practice 3), `Q` (Qualifying) and `R` (Race).


In [None]:
# Define the races and sessions to extract data for
races = [
    {'year': 2023, 'name': 'Bahrain', 'sessions': ['FP1', 'FP2', 'FP3', 'Q', 'R']},
    {'year': 2023, 'name': 'Saudi Arabia', 'sessions': ['FP1', 'FP2', 'FP3', 'Q', 'R']},
    {'year': 2023, 'name': 'Australia', 'sessions': ['FP1', 'FP2', 'FP3', 'Q', 'R']}
]

# Create an empty list to store the data
data = []
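
Because a mistyped key or session identifier only surfaces later, while the data is loading, it can help to check the `races` list up front. The following is a minimal sketch, not part of FastF1: `VALID_SESSIONS` and `validate_races` are hypothetical names, and the accepted identifiers are simply the five listed above.

```python
# Session identifiers described in this notebook (assumption: no other sessions are needed)
VALID_SESSIONS = {'FP1', 'FP2', 'FP3', 'Q', 'R'}

def validate_races(races):
    """Raise ValueError if an entry lacks a key or names an unknown session."""
    for index, race in enumerate(races):
        missing = {'year', 'name', 'sessions'} - set(race)
        if missing:
            raise ValueError(f"races[{index}] is missing keys: {sorted(missing)}")
        unknown = set(race['sessions']) - VALID_SESSIONS
        if unknown:
            raise ValueError(f"races[{index}] has unknown sessions: {sorted(unknown)}")

# Example entry in the same shape as the `races` list
example = [{'year': 2023, 'name': 'Bahrain', 'sessions': ['FP1', 'Q', 'R']}]
validate_races(example)  # passes silently when every entry is well formed
```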

# 3. Load the lap data:

---

For each race, we iterate over its sessions and use the `fastf1` library to load the data for that race and session. The next step is to filter out inaccurate laps; for what constitutes an accurate lap, please refer to the [documentation](https://theoehrly.github.io/Fast-F1/core.html#laps) for `FastF1`. The code then iterates over the accurate laps and appends the relevant values for each lap to the corresponding list. Finally, the lists are appended to `data` as a single dictionary.


In summary, the code extracts lap timing data for each session and arranges it into a list of dictionaries. Each dictionary represents one session of one race and holds parallel lists of lap times, sector times, teams, drivers, and other per-lap values.

In [None]:
# Load the lap data for all races and sessions
for race in races:
    for session in race['sessions']:
        # `session` is the identifier (e.g. 'Q'); the loaded session object gets its own name
        session_data = fastf1.get_session(race['year'], race['name'], session)
        session_data.load(weather=False, messages=False)

        # Keep only accurate laps (see the FastF1 documentation for the definition)
        accurate_laps = session_data.laps.loc[session_data.laps.IsAccurate == True]


        # Collect the timing and tyre data for each accurate lap
        lap_number = []
        lap_times = []
        sector1_times = []
        sector2_times = []
        sector3_times = []
        team = []
        driver = []
        driver_number = []
        compound = []
        tyre_life = []

        for lap_num in range(len(accurate_laps)):
            lap = accurate_laps.iloc[lap_num]
            lap_number.append(lap.loc['LapNumber'])
            lap_times.append(lap.loc['LapTime'])
            sector1_times.append(lap.loc['Sector1Time'])
            sector2_times.append(lap.loc['Sector2Time'])
            sector3_times.append(lap.loc['Sector3Time'])
            team.append(lap.loc['Team'])
            driver.append(lap.loc['Driver'])
            driver_number.append(lap.loc['DriverNumber'])
            compound.append(lap.loc['Compound'])
            tyre_life.append(lap.loc['TyreLife'])

        data.append({
                        'year' : race['year'],
                        'race' : race['name'],
                        'session' : session,
                        'team' : team,
                        'driver' : driver,
                        'driver_number' : driver_number,
                        'compound' : compound,
                        'tyre_life' : tyre_life,
                        'lap_number' : lap_number,
                        'lap_times' : lap_times,
                        'sector1_times' : sector1_times,
                        'sector2_times' : sector2_times,
                        'sector3_times' : sector3_times,
                    })
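
Since the laps table is a pandas DataFrame, the same selection can also be expressed with column indexing instead of a per-lap loop. The sketch below illustrates the idea on a stand-in table: `fake_laps` contains invented values rather than real session data, and `extract_accurate` is a hypothetical helper, not a FastF1 function.

```python
import pandas as pd

# Stand-in for the laps table of a loaded session; all values are invented
fake_laps = pd.DataFrame({
    'LapNumber': [1, 2, 3],
    'LapTime': pd.to_timedelta([97.5, 96.2, 100.0], unit='s'),
    'Driver': ['AAA', 'AAA', 'BBB'],
    'Compound': ['SOFT', 'SOFT', 'HARD'],
    'IsAccurate': [True, False, True],
})

LAP_COLUMNS = ['LapNumber', 'LapTime', 'Driver', 'Compound']

def extract_accurate(laps, year, race, session):
    """Return the accurate laps with year, race, and session as leading columns."""
    accurate = laps.loc[laps['IsAccurate'], LAP_COLUMNS].copy()
    accurate.insert(0, 'Session', session)
    accurate.insert(0, 'Race', race)
    accurate.insert(0, 'Year', year)
    return accurate.reset_index(drop=True)

frame = extract_accurate(fake_laps, 2023, 'Bahrain', 'R')
print(frame)
```

One frame per session could then be combined with `pd.concat` and written with `DataFrame.to_csv`, which would replace the manual row writing in the next section.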

# 4. Save the data into a CSV file: 

---


In this section, the code below writes the previously extracted data to a CSV file. It begins by opening a new file called `telemetry_data.csv` in write mode and wrapping it in a writer object created with `csv.writer()` from the Python `csv` module. Next, it writes a header row containing the column names.

Then, we iterate through each session's entry in the `data` list. For every lap in that entry, the code writes one row to the CSV file containing the year, race, and session together with that lap's values.

In [None]:
# Open the CSV file in write mode
with open('telemetry_data.csv', mode='w', newline='') as file:

    # Create a writer object
    writer = csv.writer(file)

    # Write the header row
    writer.writerow([
                        'Year', 'Race', 'Session', 'Team', 'Driver', 
                        'Driver Number', 'Compound', 'Tyre Life', 'Lap Number', 
                        'Lap Times', 'Sector 1 Times', 'Sector 2 Times', 
                        'Sector 3 Times'
                    ])

    # Write the data rows
    for race_data in data:
        year = race_data['year']
        race = race_data['race']
        session = race_data['session']

        for i in range(len(race_data['team'])):
            team = race_data['team'][i]
            driver = race_data['driver'][i]
            driver_number = race_data['driver_number'][i]
            compound = race_data['compound'][i]
            tyre_life = race_data['tyre_life'][i]
            lap_number = race_data['lap_number'][i]
            lap_times = race_data['lap_times'][i]
            sector1_times = race_data['sector1_times'][i]
            sector2_times = race_data['sector2_times'][i]
            sector3_times = race_data['sector3_times'][i]

            writer.writerow([
                                year, race, session, team, driver, 
                                driver_number, compound, tyre_life, lap_number,
                                lap_times, sector1_times, sector2_times, 
                                sector3_times
                            ])
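
The index-based loop above can also be written by zipping the parallel lists together, which avoids one lookup per column per lap. The following is a minimal sketch under stated assumptions: `sample` is an invented entry with the same shape as one element of `data` (with fewer columns), `iter_rows` is a hypothetical helper, and the output goes to an in-memory buffer rather than a file.

```python
import csv
import io

# Invented entry in the same shape as one element of `data`
sample = [{
    'year': 2023, 'race': 'Bahrain', 'session': 'R',
    'driver': ['AAA', 'BBB'],
    'lap_number': [1, 1],
    'lap_times': ['0 days 00:01:37.500000', '0 days 00:01:38.250000'],
}]

def iter_rows(entries, lap_fields):
    """Yield one dictionary per lap, repeating the session-level values."""
    for entry in entries:
        columns = [entry[field] for field in lap_fields]
        for values in zip(*columns):
            row = {'year': entry['year'], 'race': entry['race'], 'session': entry['session']}
            row.update(zip(lap_fields, values))
            yield row

lap_fields = ['driver', 'lap_number', 'lap_times']
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=['year', 'race', 'session'] + lap_fields)
writer.writeheader()
writer.writerows(iter_rows(sample, lap_fields))
csv_text = buffer.getvalue()
print(csv_text)
```

`csv.DictWriter` matches values to columns by name, so reordering the header cannot silently misalign a row.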


Now, let's format the time columns as `minute:second.decimal`.

In [None]:
# Load CSV file into a DataFrame
df = pd.read_csv('telemetry_data.csv')

# Define a function to convert a timedelta string such as
# "0 days 00:01:37.123000" to minute:second.decimal format
def convert_to_time(time_str):
    if pd.isna(time_str):
        return time_str  # leave missing values untouched
    parts = time_str.split(' ')
    days = int(parts[0])
    time = parts[2]
    hours, minutes, seconds = map(float, time.split(':'))
    total_seconds = days * 24 * 60 * 60 + hours * 60 * 60 + minutes * 60 + seconds
    minutes = int(total_seconds // 60)
    seconds = total_seconds % 60
    # Zero-pad the seconds so that 65.5 s is written as "01:05.500"
    return f'{minutes:02d}:{seconds:06.3f}'

# Apply the function to 'Lap Times', 'Sector 1 Times', 'Sector 2 Times', 'Sector 3 Times' columns
df['Lap Times'] = df['Lap Times'].apply(convert_to_time)
df['Sector 1 Times'] = df['Sector 1 Times'].apply(convert_to_time)
df['Sector 2 Times'] = df['Sector 2 Times'].apply(convert_to_time)
df['Sector 3 Times'] = df['Sector 3 Times'].apply(convert_to_time)

# Save the updated data, overwriting the original CSV file
df.to_csv('telemetry_data.csv', index=False)
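
The string parsing above is only needed because the durations were written to the CSV file as text. An alternative is to format each duration before it is written. The sketch below assumes the value offers `total_seconds()`, as `datetime.timedelta` does (and `pandas.Timedelta`, which subclasses it); `format_lap_time` is a hypothetical helper and handles non-negative durations only.

```python
from datetime import timedelta

def format_lap_time(duration):
    """Format a non-negative duration as MM:SS.mmm."""
    total_ms = round(duration.total_seconds() * 1000)
    minutes, remainder_ms = divmod(total_ms, 60_000)
    seconds, milliseconds = divmod(remainder_ms, 1000)
    return f'{minutes:02d}:{seconds:02d}.{milliseconds:03d}'

print(format_lap_time(timedelta(minutes=1, seconds=37, milliseconds=123)))  # 01:37.123
```

Working in whole milliseconds avoids the case where rounding the seconds on their own yields `60.000`.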

Finally, we clear the cache.

In [None]:
# Delete the cached data
fastf1.Cache.clear_cache(path, deep=True)