##<font color= "green"> The Project Logic </font>

This project is divided into code cells, each with a specific role in the execution logic.
To analyze the data in the file "formula1_data.csv", the elements needed to implement the required functions were extracted from it first.
The first extraction covers the names of the drivers featured in the standings and the teams they race for.

Next, two code cells were written: one computes the score awarded for each finishing position in a race, and the other associates each driver with their team, so that driver scores can later be credited to the right constructor.
Both rely on dictionaries, and the details are explained below.
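
As a rough illustration of this dictionary logic, the sketch below maps finishing positions to points and accumulates totals per driver and per team. The rows in `sample_results` and the names `driver_points` and `team_points` are made up for the example and do not come from the dataset.

```python
# Points awarded for finishing positions 1-8 (the scheme used in this project)
position_points = {1: 10, 2: 8, 3: 6, 4: 5, 5: 4, 6: 3, 7: 2, 8: 1}

# Hypothetical race results: (driver, team, finishing position)
sample_results = [
    ("Alpha", "Red Team", 1),
    ("Bravo", "Red Team", 3),
    ("Charlie", "Blue Team", 2),
    ("Alpha", "Red Team", 9),  # outside the points
]

driver_points = {}
team_points = {}
for driver, team, position in sample_results:
    # positions without an entry in the dictionary score zero
    points = position_points.get(position, 0)
    driver_points[driver] = driver_points.get(driver, 0) + points
    team_points[team] = team_points.get(team, 0) + points

print(driver_points)
print(team_points)
```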

Once this preliminary phase was completed, the required functions were implemented:

* A function that takes one parameter, the name of a driver, and returns the total points they scored by the end of the championship, the number of races they won, and the number of races in which they finished on the podium (in the top 3 positions);
* A function that takes no parameters and returns a driver ranking, structured as a key-value dictionary where the keys are strings containing the drivers' names and the values are integers representing the total points each driver scored by the end of the races;
* A function that takes no parameters and returns a constructor ranking, using the same logic as the second function.

The technical explanation of the project is included in the code cells dedicated to each part of the exercise.

In [1]:
#import the "csv" library, necessary for working with the file under analysis

import csv

In [2]:
#import the "os" library; its "os.path" module is used to verify that the file path is valid.

import os

In [3]:
#check the current working directory

%pwd

'/content'

In [4]:
#set the working directory where the file is located

%cd sample_data

/content/sample_data


In [5]:
#prompt the user to input the name of the file to be analyzed

file_path= input("Enter the name of the file: ")

Enter the name of the file: formula1_data.csv


In [6]:
#define a check to ensure that the file exists

assert os.path.isfile(file_path), "The file does not exist"
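
Because `assert` statements are skipped when Python runs with the `-O` flag, an explicit exception may be a more robust way to perform the same check. The following is a minimal sketch using `pathlib`. The helper name `require_file` and the temporary files are assumptions made for the demonstration.

```python
import tempfile
from pathlib import Path


def require_file(file_name):
    """Return file_name as a Path, or raise FileNotFoundError if it is not a file."""
    candidate = Path(file_name)
    if not candidate.is_file():
        raise FileNotFoundError(f"The file does not exist: {candidate}")
    return candidate


# Demonstration with a temporary file standing in for the real dataset
with tempfile.TemporaryDirectory() as tmp_dir:
    existing = Path(tmp_dir) / "example.csv"
    existing.write_text("Driver,Team,Position\n", encoding="utf-8")
    found = require_file(existing)

    try:
        require_file(Path(tmp_dir) / "missing.csv")
        missing_detected = False
    except FileNotFoundError:
        missing_detected = True
```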

<font color="red"> Creation of the drivers list </font>

In [7]:
#initialize an empty list

pilots=[]

#open the file in a "with" block so that it is closed automatically

with open(file_path, newline="") as file_csv:

  csv_reader= csv.DictReader(file_csv)

  #populate the list with the names of the drivers, skipping duplicates

  for row in csv_reader:

    if row["Driver"] not in pilots:

      pilots.append(row["Driver"])

pilots

['Hamilton',
 'Massa',
 'Raikkonen',
 'Kubica',
 'Alonso',
 'Heidfeld',
 'Kovalainen',
 'Vettel',
 'Trulli',
 'Glock']
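
The same order-preserving de-duplication can be written more compactly with `dict.fromkeys`, which keeps only the first occurrence of each key. This is a sketch on made-up rows that mimic the columns used here ("Driver", "Team", "Position"). The in-memory `sample_csv` stands in for the real file.

```python
import csv
import io

# Made-up rows that mimic the columns used in this project
sample_csv = io.StringIO(
    "Driver,Team,Position\n"
    "Alpha,Red Team,1\n"
    "Bravo,Blue Team,2\n"
    "Alpha,Red Team,3\n"
)

with sample_csv as file_csv:
    csv_reader = csv.DictReader(file_csv)
    # dict.fromkeys keeps the first occurrence of each key, in insertion order
    unique_drivers = list(dict.fromkeys(row["Driver"] for row in csv_reader))

print(unique_drivers)  # ['Alpha', 'Bravo']
```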

<font color="red"> Creation of the teams list </font>

In [8]:
#initialize an empty list

auto_racing_teams=[]

#open the file in a "with" block so that it is closed automatically

with open(file_path, newline="") as file_csv:

  csv_reader= csv.DictReader(file_csv)

  #populate the list with the names of the teams, skipping duplicates

  for row in csv_reader:

    if row["Team"] not in auto_racing_teams:

      auto_racing_teams.append(row["Team"])

auto_racing_teams

['McLaren', 'Ferrari', 'BMW', 'Renault', 'Toro Rosso', 'Toyota']

<font color="red"> Calculation of race scores </font>

In [9]:
#initialize an empty dictionary

position_points={}

#define a variable containing the maximum score that drivers can earn by winning a race

max_points=10

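#populate the dictionary with the score for each of the eight point-scoring positions

for i in range(1, 9):

  #podium positions (1-3): first place gets the maximum score, then the score drops by 2 per place;
  #positions 4-8 are handled by the outer "else", where the score drops by 1 per place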

  if i in range(1, 4):

    if i ==1:

      position_points[i]= max_points

    else:

      max_points-=2

      position_points[i]= max_points

  else:

    max_points-=1

    position_points[i]= max_points

position_points

{1: 10, 2: 8, 3: 6, 4: 5, 5: 4, 6: 3, 7: 2, 8: 1}
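
Since the scoring table is fixed, the same dictionary could also be built directly from a list of scores. This is an alternative sketch, not the construction used above. The names `points_by_rank` and `position_points_alt` are introduced only for the example.

```python
# Scores listed in finishing order: index 0 is first place
points_by_rank = [10, 8, 6, 5, 4, 3, 2, 1]

# enumerate(..., start=1) pairs each score with its 1-based finishing position
position_points_alt = dict(enumerate(points_by_rank, start=1))

print(position_points_alt)
```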

<font color="red"> Association of drivers with their respective constructors </font>

In [10]:
#initialize an empty dictionary

team_pilot={}

for n in auto_racing_teams:

  #initialize the list that will contain the drivers for each team

  drivers_teams= []

  #reopen the file for each team, since the reader is exhausted after one full pass

  with open(file_path, newline="") as file_csv:

    csv_reader= csv.DictReader(file_csv)

    #populate the dictionary with the names of the teams and the drivers associated with each

    for row in csv_reader:

      if row["Team"]== n and row["Driver"] not in drivers_teams:

        drivers_teams.append(row["Driver"])

  team_pilot[n]= drivers_teams

team_pilot

{'McLaren': ['Hamilton', 'Kovalainen'],
 'Ferrari': ['Massa', 'Raikkonen'],
 'BMW': ['Kubica', 'Heidfeld'],
 'Renault': ['Alonso'],
 'Toro Rosso': ['Vettel'],
 'Toyota': ['Trulli', 'Glock']}
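
The team-to-drivers mapping can also be built in a single pass over the rows, without reopening the file once per team. The sketch below uses `dict.setdefault` on made-up rows. `sample_csv` and `team_drivers` are illustrative names, not part of the project.

```python
import csv
import io

# Made-up rows that mimic the columns used in this project
sample_csv = io.StringIO(
    "Driver,Team,Position\n"
    "Alpha,Red Team,1\n"
    "Charlie,Blue Team,2\n"
    "Bravo,Red Team,5\n"
    "Alpha,Red Team,3\n"
)

team_drivers = {}
for row in csv.DictReader(sample_csv):
    # setdefault returns the team's list, creating it on first sight of the team
    drivers = team_drivers.setdefault(row["Team"], [])
    if row["Driver"] not in drivers:
        drivers.append(row["Driver"])

print(team_drivers)
```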

In [11]:
def is_dict(dictionary):

  """
  Function to verify that the output of a function is of type "dict" (i.e., a dictionary)

  Parameters:

  - dictionary: the object returned by the function under test, expected to be a dictionary

  If no errors are detected, the function returns nothing

  """

  assert isinstance(dictionary, dict), "the function's output is not a dictionary"

In [12]:
def check_key_values(dictionary):

  """

  The function verifies that the keys and values of the dictionary passed as an argument are,

  respectively, strings and integers

  Parameters:

  - dictionary: the dictionary returned by the function under test

  If no errors are detected, the function returns nothing

  """

  for key, values in dictionary.items():

    assert isinstance(key, str), "the keys are not strings"
    assert isinstance(values, int), "the values are not integers"
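
One caveat of this check is that `bool` is a subclass of `int` in Python, so a value such as `True` would pass `isinstance(values, int)`. A stricter variant that returns a boolean instead of raising might look like the sketch below. The function name `is_str_int_mapping` is hypothetical.

```python
def is_str_int_mapping(candidate):
    """Return True if candidate is a dict with str keys and non-bool int values."""
    if not isinstance(candidate, dict):
        return False
    return all(
        isinstance(key, str)
        and isinstance(value, int)
        and not isinstance(value, bool)  # bool is a subclass of int
        for key, value in candidate.items()
    )


print(is_str_int_mapping({"Alpha": 15}))
print(is_str_int_mapping({"Alpha": True}))
```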

<font color="red"> Input </font>

In [13]:
#prompt the user to input the name of the driver whose information they want to retrieve

pilot= input("enter the name of a driver: ").capitalize()

enter the name of a driver: trulli


<font color="red"> Function 1: </br> Calculate driver information </font>

Example output:

[Total Score: num, Total Wins: num, Total Podiums: num]

In [14]:
def pilot_info(pilot):

  """
  The function calculates information about the driver (total points,

  total wins, and total podium finishes).

  Parameters:

  - pilot: a string containing the name of the driver

  """

  #define a check to ensure the input contains only alphabetic characters

  assert pilot.isalpha(), "You must enter a text string"

  #define a check to ensure that the input string provided to the function
  #is included in the list of drivers

  if pilot not in pilots:

    raise ValueError("Name not in list")

  pilot_stats=[[],[],[]]

  with open(file_path, newline="") as file_csv:

    csv_reader= csv.DictReader(file_csv)

    for row in csv_reader:

      if row["Driver"]== pilot:

        position= int(row["Position"])

        #if the position achieved by the driver in the given race matches one of the keys
        #of "position_points", the first sublist of "pilot_stats" is extended with the
        #value associated with that key, i.e. the score assigned for that position
        #according to the race rules

        if position in position_points:

          pilot_stats[0].append(position_points[position])

        #the second sublist records wins, the third records podium finishes

        if position== 1:

          pilot_stats[1].append(position)

        if position in range(1,4):

          pilot_stats[2].append(position)


  #the driver's total score is the sum of the values in the first sublist.
  #For the other two sublists only the length matters, since it indicates
  #how many times the driver won or finished on the podium

  pilot_stats= [f"Total Score: {sum(pilot_stats[0])}, Total Wins: {len(pilot_stats[1])}, Total Podiums: {len(pilot_stats[2])}"]

  return pilot_stats

pilot_info_result=pilot_info(pilot)

pilot_info_result

['Total Score: 31, Total Wins: 0, Total Podiums: 1']

In [15]:
#add a check to verify that the output is indeed a list
#if no errors are detected, no output will be generated

assert isinstance(pilot_info_result, list), "the output is not a list"
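
For comparison, the same three statistics can be accumulated with plain counters and returned as a tuple of integers, which is easier to reuse than a formatted string. This is an alternative sketch, not the function required by the exercise. `driver_summary` and the rows in `sample_csv_text` are made up for the example.

```python
import csv
import io

position_points = {1: 10, 2: 8, 3: 6, 4: 5, 5: 4, 6: 3, 7: 2, 8: 1}

# Made-up rows that mimic the columns used in this project
sample_csv_text = (
    "Driver,Team,Position\n"
    "Alpha,Red Team,3\n"
    "Bravo,Red Team,1\n"
    "Alpha,Red Team,7\n"
    "Alpha,Red Team,12\n"
)


def driver_summary(driver, csv_text):
    """Return (total_points, wins, podiums) for driver, read from CSV text."""
    total_points = wins = podiums = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Driver"] != driver:
            continue
        position = int(row["Position"])
        # positions outside the top eight score zero
        total_points += position_points.get(position, 0)
        if position == 1:
            wins += 1
        if position <= 3:
            podiums += 1
    return total_points, wins, podiums


print(driver_summary("Alpha", sample_csv_text))  # (8, 0, 1)
```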

<font color="red"> Function 2: </br> Calculate driver ranking </font>

Example output:

Driver Name1: total score, </br>
Driver Name2: total score, </br>
...


In [16]:
def pilots_ranking():

  """
  The function calculates the driver ranking

  It does not take any arguments

  """

  ranking={}

  for i in pilots:

    total_points=0

    #the file must be reopened at each iteration, because the reader
    #is exhausted after the first full pass

    with open(file_path, newline="") as file_csv:

      csv_reader= csv.DictReader(file_csv)

      for row in csv_reader:

        if row["Driver"]== i:

          #if the position achieved by the driver in the given race is one of the
          #keys of "position_points" (the dictionary of position-score pairs),
          #"total_points" is incremented by the value associated with that key,
          #until the total score of the driver is obtained

          position= int(row["Position"])

          if position in position_points:

            total_points+= position_points[position]

            ranking[i]= total_points

  return ranking

pilots_ranking_result=pilots_ranking()

pilots_ranking_result

{'Hamilton': 98,
 'Massa': 97,
 'Raikkonen': 75,
 'Kubica': 75,
 'Alonso': 61,
 'Heidfeld': 60,
 'Kovalainen': 53,
 'Vettel': 35,
 'Trulli': 31,
 'Glock': 25}

In [17]:
#verify that the output is a dictionary

is_dict(pilots_ranking_result)

In [18]:
#verify that the keys are strings and the values are integers

check_key_values(pilots_ranking_result)
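
The ranking can also be computed in a single pass over the rows, instead of one pass per driver, by accumulating into a `collections.defaultdict` and sorting at the end. The sketch below works on made-up rows. Unlike `pilots_ranking`, it also lists drivers who never scored, with a total of 0. The name `driver_ranking` is introduced only for the example.

```python
import csv
import io
from collections import defaultdict

position_points = {1: 10, 2: 8, 3: 6, 4: 5, 5: 4, 6: 3, 7: 2, 8: 1}

# Made-up rows that mimic the columns used in this project
sample_csv_text = (
    "Driver,Team,Position\n"
    "Alpha,Red Team,1\n"
    "Bravo,Red Team,2\n"
    "Alpha,Red Team,4\n"
    "Charlie,Blue Team,12\n"
    "Bravo,Red Team,1\n"
)


def driver_ranking(csv_text):
    """Return {driver: total points}, ordered from highest to lowest total."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["Driver"]] += position_points.get(int(row["Position"]), 0)
    # sorted() is stable, so drivers with equal totals keep their file order
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


print(driver_ranking(sample_csv_text))
```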

<font color="red"> Save the ranking to a text file </font>

In [19]:
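#open the text file in "w+" mode (write and read); it is kept open because
#Function 3 reads the ranking back from it, and it is closed at the end of the notebook

txt_file=open("formula_1.txt", "w+", encoding="utf-8")

txt_file.write("Drivers Standings 2008 Formula 1\n\n")

for key, values in pilots_ranking_result.items():

  txt_file.write(key+ ": "+str(values)+"\n")

#flush the buffer so that the ranking is actually written to disk

txt_file.flush()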
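
When the file only needs to be written once and read back later, a `with` block closes it automatically and avoids keeping a handle open across cells. This is a minimal sketch with a hypothetical ranking and a temporary output path. It is not the approach used in this notebook, where the open handle is reused by Function 3.

```python
import tempfile
from pathlib import Path

# Hypothetical ranking used only for this demonstration
sample_ranking = {"Alpha": 15, "Bravo": 18}

with tempfile.TemporaryDirectory() as tmp_dir:
    output_path = Path(tmp_dir) / "standings.txt"

    # the file is closed as soon as the "with" block ends
    with open(output_path, "w", encoding="utf-8") as txt_out:
        txt_out.write("Drivers Standings\n\n")
        for name, points in sample_ranking.items():
            txt_out.write(f"{name}: {points}\n")

    # reopen the file in read mode to check what was saved
    saved_text = output_path.read_text(encoding="utf-8")

print(saved_text)
```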

<font color="red"> Function 3: </br> Calculate team ranking </font>

Example output:

Constructor Name1: total score, </br>
Constructor Name2: total score, </br>
...

In [20]:
def auto_racing_teams_ranking():

  """
  The function calculates the constructor ranking

  It does not take any arguments

  """

  ranking={}

  for key, values in team_pilot.items():

    txt_file.seek(0)

    total_points=0

    for line in txt_file:

      #each line, represented as a text string, is split based on ":",
      #resulting in a list with two elements: the driver's name and their total score.
      #Therefore, "line.split(":")[0]" will correspond to the driver's name,
      #while "line.split(":")[1]" will correspond to the driver's total points

      if line.split(":")[0] in values:


          total_points+=int((line.split(":")[1]))

          ranking[key]= total_points

  return ranking

#store the result under a different name, so that the function itself is not overwritten

auto_racing_teams_ranking_result=auto_racing_teams_ranking()

auto_racing_teams_ranking_result

{'McLaren': 151,
 'Ferrari': 172,
 'BMW': 135,
 'Renault': 61,
 'Toro Rosso': 35,
 'Toyota': 56}

In [21]:
#verify that the output is a dictionary

is_dict(auto_racing_teams_ranking_result)

In [22]:
#verify that the keys are strings and the values are integers

check_key_values(auto_racing_teams_ranking_result)

In [23]:
#close the text file

txt_file.close()
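
Since each constructor's total is the sum of its drivers' totals, the same ranking could also be derived directly from the two dictionaries, without reading the text file back. The sketch below assumes a team-to-drivers mapping and a driver ranking shaped like `team_pilot` and `pilots_ranking_result`. The sample values and the name `constructor_ranking` are made up.

```python
# Hypothetical inputs shaped like team_pilot and pilots_ranking_result
team_drivers = {"Red Team": ["Alpha", "Bravo"], "Blue Team": ["Charlie"]}
driver_totals = {"Alpha": 15, "Bravo": 18, "Charlie": 7}


def constructor_ranking(team_to_drivers, driver_points):
    """Return {team: sum of the points scored by the team's drivers}."""
    return {
        # drivers missing from driver_points are counted as zero
        team: sum(driver_points.get(driver, 0) for driver in drivers)
        for team, drivers in team_to_drivers.items()
    }


print(constructor_ranking(team_drivers, driver_totals))
```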

##<font color= "green"> Conclusions </font>

In this project, I followed a step-by-step approach to data analysis.

* The first step involved loading the dataset into the program and verifying that the file exists within the working directory;
* The second step consisted of "data cleaning": the elements relevant to the analysis (drivers, teams, and finishing positions) were extracted and separated from those that were not needed (the city and country where each race took place);
* The third and final step was the implementation of the three functions, which were used to obtain the statistics required by the task: the selected driver's information, the driver ranking, and the constructor ranking.