![imagen](./img/ejercicios.png)

# PROJECT INFORMATION

### TITLE

"Comparative analysis of Formula 1: Team and driver dominance (2015-2024)"

### TOPIC

"This project compares Formula 1 teams between 2015 and 2024. The main goal is to analyze how team performance evolved over these years and, through that analysis, to confirm or refute the hypotheses stated below. In particular, it compares the periods of dominance of Mercedes and Red Bull, identifying the shifts in Formula 1's competitive landscape over the last decade. The project also analyzes the individual progression of drivers Max Verstappen and Lewis Hamilton over these 10 years, with a detailed comparison of the first 5 years of each driver's career, to determine who had the more promising start."

### HYPOTHESES

Main hypothesis:
- "Mercedes achieved better results than Red Bull in terms of points and constructors' championships in Formula 1 between 2015 and 2021. From 2021 onward, Red Bull has dominated the championship by the same criteria."

Secondary hypotheses:
- "Max Verstappen had the best season of any driver in Formula 1 in 2023, in terms of points and wins."
- "Lewis Hamilton had a better start to his career (first 5 years) than Max Verstappen, considering points scored and wins achieved."

## DATA COLLECTION

### DATASETS AND ALTERNATIVE DATA SOURCES

## 1. Importing necessary libraries

In [None]:
import pandas as pd
import plotly.graph_objs as go

# Show up to 500 rows when a DataFrame is displayed
pd.set_option("display.max_rows", 500)

# Placeholders left over from the project template (not used below)
df_1 = pd.DataFrame()
fuente_1 = ""


## 2. Loading datasets

We load the CSV files containing information about races, race results, drivers, and constructors.

In [3]:
from pathlib import Path
# Folder holding the CSV files (machine-specific path; adjust it to your environment)
data_dir = Path(r'C:\Users\Renan Muniz\Online_Env\REPO_TEST\ONLINE_DS_THEBRIDGE_RENAN_MUNIZ\bootcamp_renan\EDA PROJECT\archive (2)')
df_races = pd.read_csv(data_dir / 'races.csv')
df_results = pd.read_csv(data_dir / 'results.csv')
df_drivers = pd.read_csv(data_dir / 'drivers.csv')
df_constructors = pd.read_csv(data_dir / 'constructors.csv')
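
Because the four `read_csv` calls above differ only in the file name, the loading step can be sketched as a small helper. This is a minimal sketch rather than the notebook's own code: `load_tables`, `TABLES`, and the `data_dir` argument are hypothetical names, and it assumes all four CSV files sit in the same folder.

```python
from pathlib import Path

import pandas as pd

# Hypothetical constant: the names match the CSV files loaded above
TABLES = ("races", "results", "drivers", "constructors")


def load_tables(data_dir):
    """Read each <name>.csv found in data_dir into a dict of DataFrames."""
    data_dir = Path(data_dir)
    missing = [name for name in TABLES if not (data_dir / f"{name}.csv").exists()]
    if missing:
        # Fail early with a readable message instead of a traceback later on
        raise FileNotFoundError(f"Missing CSV files in {data_dir}: {missing}")
    return {name: pd.read_csv(data_dir / f"{name}.csv") for name in TABLES}
```

With such a helper, `tables = load_tables(some_folder)` would give access to `tables["races"]`, `tables["results"]`, and so on, and moving the data would only require changing one path.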

## 3. Merging datasets and keeping relevant columns

In [4]:
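# Attach race details and constructor details to every race result
df_rrc = df_results.merge(df_races, on='raceId', how='left') \
                      .merge(df_constructors, on='constructorId', how='left')
# Overlapping column names receive pandas' default suffixes:
# time_y is the `time` column from races.csv, name_y is the constructor name
df_rrc = df_rrc[['raceId', 'number', 'grid',
       'position', 'positionText', 'positionOrder', 'points', 'year', 'round', 'circuitId', 'name_x',
       'date', 'time_y', 'constructorRef', 'name_y', 'nationality']]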
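
The `_x` and `_y` column names selected above come from pandas' default merge suffixes, which are added whenever both sides of a merge share a non-key column name. The sketch below reproduces that behaviour on made-up, one-row stand-ins for the three tables (the rows and the variable names `results`, `races`, `constructors`, `merged`, and `named` are illustrative assumptions), and shows how the `suffixes` argument could yield more readable names.

```python
import pandas as pd

# Toy stand-ins for results.csv, races.csv and constructors.csv (made-up rows)
results = pd.DataFrame({"raceId": [1], "constructorId": [10],
                        "points": [25.0], "time": ["1:30:00"]})
races = pd.DataFrame({"raceId": [1], "year": [2015],
                      "name": ["Australian Grand Prix"], "time": ["05:00:00"]})
constructors = pd.DataFrame({"constructorId": [10], "name": ["Mercedes"]})

# Default suffixes: 'time' clashes in the first merge, 'name' in the second
merged = (
    results.merge(races, on="raceId", how="left")
    .merge(constructors, on="constructorId", how="left")
)
print(merged.columns.tolist())

# Explicit suffixes make the origin of each column visible
named = (
    results.merge(races, on="raceId", how="left", suffixes=("_result", "_race"))
    .merge(constructors, on="constructorId", how="left",
           suffixes=("_race", "_constructor"))
)
print(named.columns.tolist())
```

In `merged`, `name_x` holds the race name and `name_y` the constructor name, which is why the later aggregation groups by `name_y`.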

## 4. Filtering data for recent years (2015 onwards)

In [5]:

df_15_24 = df_rrc[df_rrc.year >= 2015]
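
The filter above has no upper bound, so it keeps every season from 2015 onward that the CSV files happen to contain. If the files were ever refreshed with later seasons, an explicit range would keep the frame aligned with the 2015-2024 scope of the project. A minimal sketch on made-up rows (the `toy` frame is an assumption for illustration):

```python
import pandas as pd

# Made-up rows spanning seasons inside and outside the project's range
toy = pd.DataFrame({"year": [2014, 2015, 2024, 2025],
                    "points": [1.0, 2.0, 3.0, 4.0]})

# Series.between is inclusive on both ends by default;
# .copy() gives an independent frame that is safe to modify later
df_15_24 = toy[toy["year"].between(2015, 2024)].copy()
print(df_15_24["year"].tolist())
```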

## 5. Aggregating points by year and constructor

In [6]:
df_y_n = df_15_24.groupby(['year', 'name_y'])
df_points_per_year = df_y_n[['points']].sum()
df_points_per_year = df_points_per_year.reset_index()
df_points_per_year

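|   | year | name_y | points |
|---|------|--------|--------|
| 0 | 2015 | Ferrari | 428.0 |
| 1 | 2015 | Force India | 136.0 |
| 2 | 2015 | Lotus F1 | 78.0 |
| 3 | 2015 | Manor Marussia | 0.0 |
| 4 | 2015 | McLaren | 27.0 |
| 5 | 2015 | Mercedes | 703.0 |
| 6 | 2015 | Red Bull | 187.0 |
| 7 | 2015 | Sauber | 36.0 |
| 8 | 2015 | Toro Rosso | 67.0 |
| 9 | 2015 | Williams | 257.0 |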
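
To check the main hypothesis, the long table above can be pivoted so that each constructor becomes a column and the season-by-season gap between Mercedes and Red Bull can be read directly. The sketch below uses only the two 2015 rows shown in the output; `wide` and `gap` are hypothetical names, and with the full `df_points_per_year` the same pivot would be expected to produce one row per season.

```python
import pandas as pd

# The 2015 Mercedes and Red Bull rows from the output above
points = pd.DataFrame({
    "year": [2015, 2015],
    "name_y": ["Mercedes", "Red Bull"],
    "points": [703.0, 187.0],
})

# One row per season, one column per constructor
wide = points.pivot(index="year", columns="name_y", values="points")

# Positive gap: Mercedes outscored Red Bull that season
wide["gap"] = wide["Mercedes"] - wide["Red Bull"]
print(wide)
```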


## 6. Merging race results with driver information

In [7]:
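# Attach race details and driver details to every race result
df_rrd = df_results.merge(df_races, on='raceId', how='left') \
                      .merge(df_drivers, on='driverId', how='left')
# Keep only the columns needed for the driver analysis
df_rrd = df_rrd[['grid', 'positionOrder', 'points', 'rank',
       'year', 'round', 'name',
       'forename', 'surname', 'nationality']]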

## 7. Filtering data for recent years and specific drivers

In [8]:
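df_year = df_rrd[df_rrd['year'] >= 2015].copy()
# Match on surname as well, so other drivers named Max or Lewis are never included
names = df_year['forename'] + ' ' + df_year['surname']
df_max_ham_per_race = df_year[names.isin(['Max Verstappen', 'Lewis Hamilton'])].copy()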

## 8. Creating a wins indicator column

In [9]:
df_max_ham_per_race = df_max_ham_per_race.assign(wins=df_max_ham_per_race['positionOrder'] == 1)
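
Storing wins as a boolean column works with the later `sum` aggregation because pandas counts `True` as 1 and `False` as 0. A minimal sketch on made-up finishing positions (the `toy` frame and `total_wins` are illustrative names):

```python
import pandas as pd

# Made-up finishing positions for three races
toy = pd.DataFrame({"positionOrder": [1, 3, 1]})
toy = toy.assign(wins=toy["positionOrder"] == 1)

# True counts as 1 and False as 0, so the sum is the number of wins
total_wins = int(toy["wins"].sum())
print(total_wins)  # 2
```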


## 9. Aggregating points and wins by driver and year

In [10]:
df_max_ham_total = df_max_ham_per_race.groupby(['year', 'forename', 'surname']).agg({
    'points': 'sum',
    'wins': 'sum'
}).reset_index()

df_max_ham_total

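|   | year | forename | surname | points | wins |
|---|------|----------|---------|--------|------|
| 0 | 2015 | Lewis | Hamilton | 381.0 | 10 |
| 1 | 2015 | Max | Verstappen | 49.0 | 0 |
| 2 | 2016 | Lewis | Hamilton | 380.0 | 10 |
| 3 | 2016 | Max | Verstappen | 204.0 | 1 |
| 4 | 2017 | Lewis | Hamilton | 363.0 | 9 |
| 5 | 2017 | Max | Verstappen | 168.0 | 2 |
| 6 | 2018 | Lewis | Hamilton | 408.0 | 11 |
| 7 | 2018 | Max | Verstappen | 249.0 | 2 |
| 8 | 2019 | Lewis | Hamilton | 413.0 | 11 |
| 9 | 2019 | Max | Verstappen | 278.0 | 3 |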


## 10. Creating wins column for the full dataframe (all drivers)

In [11]:
df_rrd['wins'] = df_rrd['positionOrder'] == 1
df_rrd['wins'].sum()

np.int64(1128)

## 11. Filtering Max Verstappen’s data from 2015 to 2019 for early career analysis

In [12]:
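# Verstappen's first five seasons (2015-2019); between() is inclusive on both ends
df_max = df_rrd[(df_rrd['forename'] == 'Max') & (df_rrd['surname'] == 'Verstappen') & df_rrd['year'].between(2015, 2019)]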

vers = df_max.groupby(['year', 'forename', 'surname']).agg({
    'points': 'sum',
    'wins': 'sum'
}).reset_index() 

vers

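|   | year | forename | surname | points | wins |
|---|------|----------|---------|--------|------|
| 0 | 2015 | Max | Verstappen | 49.0 | 0 |
| 1 | 2016 | Max | Verstappen | 204.0 | 1 |
| 2 | 2017 | Max | Verstappen | 168.0 | 2 |
| 3 | 2018 | Max | Verstappen | 249.0 | 2 |
| 4 | 2019 | Max | Verstappen | 278.0 | 3 |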


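## 12. Filtering Lewis Hamilton’s data from 2007 to 2011 and grouping points and wins by year

In [13]:
# Hamilton's first five seasons (2007-2011); between() is inclusive on both ends
df_ham = df_rrd[(df_rrd['forename'] == 'Lewis') & (df_rrd['surname'] == 'Hamilton') & df_rrd['year'].between(2007, 2011)]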

ham = df_ham.groupby(['year', 'forename', 'surname']).agg({
    'points': 'sum',
    'wins': 'sum'
}).reset_index() 

ham

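|   | year | forename | surname | points | wins |
|---|------|----------|---------|--------|------|
| 0 | 2007 | Lewis | Hamilton | 109.0 | 4 |
| 1 | 2008 | Lewis | Hamilton | 98.0 | 5 |
| 2 | 2009 | Lewis | Hamilton | 49.0 | 2 |
| 3 | 2010 | Lewis | Hamilton | 240.0 | 3 |
| 4 | 2011 | Lewis | Hamilton | 227.0 | 3 |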
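
For the early-career hypothesis, the two five-season tables can be reduced to one total per driver. The sketch below rebuilds `vers` and `ham` from the values shown in the outputs above so that it runs on its own; `summary` and its column labels are hypothetical names.

```python
import pandas as pd

# Season totals copied from the `vers` and `ham` outputs above
vers = pd.DataFrame({
    "year": [2015, 2016, 2017, 2018, 2019],
    "points": [49.0, 204.0, 168.0, 249.0, 278.0],
    "wins": [0, 1, 2, 2, 3],
})
ham = pd.DataFrame({
    "year": [2007, 2008, 2009, 2010, 2011],
    "points": [109.0, 98.0, 49.0, 240.0, 227.0],
    "wins": [4, 5, 2, 3, 3],
})

# Totals over each driver's first five seasons
summary = pd.DataFrame({
    "Hamilton (2007-2011)": ham[["points", "wins"]].sum(),
    "Verstappen (2015-2019)": vers[["points", "wins"]].sum(),
})
print(summary)
```

On these figures Hamilton has more wins (17 against 8) while Verstappen has more points (948 against 723). Raw points are not directly comparable across the two periods, because a win was worth 10 points in 2007-2009 and 25 points from 2010 onward, so the wins row may be the fairer yardstick.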


## DATA FRAMES

In [14]:

df_points_per_year

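|   | year | name_y | points |
|---|------|--------|--------|
| 0 | 2015 | Ferrari | 428.0 |
| 1 | 2015 | Force India | 136.0 |
| 2 | 2015 | Lotus F1 | 78.0 |
| 3 | 2015 | Manor Marussia | 0.0 |
| 4 | 2015 | McLaren | 27.0 |
| 5 | 2015 | Mercedes | 703.0 |
| 6 | 2015 | Red Bull | 187.0 |
| 7 | 2015 | Sauber | 36.0 |
| 8 | 2015 | Toro Rosso | 67.0 |
| 9 | 2015 | Williams | 257.0 |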


In [15]:
vers

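|   | year | forename | surname | points | wins |
|---|------|----------|---------|--------|------|
| 0 | 2015 | Max | Verstappen | 49.0 | 0 |
| 1 | 2016 | Max | Verstappen | 204.0 | 1 |
| 2 | 2017 | Max | Verstappen | 168.0 | 2 |
| 3 | 2018 | Max | Verstappen | 249.0 | 2 |
| 4 | 2019 | Max | Verstappen | 278.0 | 3 |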


In [16]:
ham

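|   | year | forename | surname | points | wins |
|---|------|----------|---------|--------|------|
| 0 | 2007 | Lewis | Hamilton | 109.0 | 4 |
| 1 | 2008 | Lewis | Hamilton | 98.0 | 5 |
| 2 | 2009 | Lewis | Hamilton | 49.0 | 2 |
| 3 | 2010 | Lewis | Hamilton | 240.0 | 3 |
| 4 | 2011 | Lewis | Hamilton | 227.0 | 3 |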


## GRAPHS

In [None]:
# One line trace per constructor
traces = []
unico = df_points_per_year['name_y'].unique()

# Colour assigned to each constructor name
cores = {
    'Mercedes': '#00D2BE',
    'Red Bull': '#1E41FF',
    'Ferrari': '#DC0000',
    'McLaren': '#FF8700',
    'Alpine F1 Team': '#0090FF', 'Renault': '#0090FF',
    'Williams': '#005AFF',
    'Aston Martin': '#006F62', 'Racing Point': '#006F62', 'Force India': '#006F62',
    'AlphaTauri': '#6699FF', 'RB F1 Team': '#6699FF', 'Toro Rosso': '#6699FF',
    'Haas F1 Team': '#B6BABD',
    'Sauber': '#900000', 'Alfa Romeo': '#900000',
    'Lotus F1': '#454545',
    'Manor Marussia': '#454545',
}

for team in unico:
    team_points = df_points_per_year[df_points_per_year['name_y'] == team]
    trace = go.Scatter(
        x=team_points['year'],
        y=team_points['points'],
        name=team,
        # Constructors missing from the colour map fall back to gray
        line=dict(color=cores.get(team, 'gray')),
    )
    traces.append(trace)

fig = go.Figure(data=traces)
fig.update_layout(
    height=800,
    title='Constructors points per year',
    xaxis=dict(title='Year', ticklen=5),
    yaxis=dict(title='Points')
)

# Start the y-axis at zero so gaps between teams are not exaggerated
fig.update_yaxes(rangemode="tozero")
fig.show()
# Static export; write_image needs an image-export engine such as kaleido installed
fig.write_image("constructos.png")

In [23]:
# Split the yearly totals by driver
df_verstappen = df_max_ham_total[df_max_ham_total['forename'] == 'Max']
df_hamilton = df_max_ham_total[df_max_ham_total['forename'] == 'Lewis']

# Bar height is the season's points; the text on each bar is the number of wins
trace1 = go.Bar(x=df_verstappen['year'],
                y=df_verstappen['points'],
                name='MAX',
                marker=dict(color='#1E41FF',
                            line=dict(color='rgb(0,0,0)', width=1.5)),
                text=df_verstappen['wins'])

trace2 = go.Bar(x=df_hamilton['year'],
                y=df_hamilton['points'],
                name='LEWIS',
                marker=dict(color='#00D2BE',
                            line=dict(color='rgb(0,0,0)', width=1.5)),
                text=df_hamilton['wins'])

data = [trace1, trace2]

# Draw the two drivers' bars side by side for each year
layout = go.Layout(barmode="group")

fig = go.Figure(data=data, layout=layout)
fig.update_layout(
    height=800,
    title='Max vs Lewis (2015 - 2024)',
    xaxis=dict(title='Year', ticklen=5),
    yaxis=dict(title='Points')
)

fig.show()
fig.write_image("maxvsham.png")

In [24]:
trace_ham = go.Bar(
    x = ham['year'],
    y = ham['points'],
    text= ham['wins'],
    name = 'HAM',
    marker = dict(color = '#00D2BE',
                  line = dict(color='rgb(0,0,0)', width = 1.5)),
    xaxis= 'x1',
    yaxis= 'y1'

)
trace_vers = go.Bar(
    x = vers['year'],
    y = vers['points'],
    text= vers['wins'],
    name = 'MAX',
    marker = dict(color = '#1E41FF',
                line = dict(color='rgb(0,0,0)', width = 1.5)),
    xaxis= 'x2',
    yaxis= 'y2'
)



layout = go.Layout(
    xaxis=dict(
        domain=[0, 0.45],
        anchor='y1'
    ),
    yaxis=dict(
        domain=[0, 1],
        anchor='x1'
    ),
    xaxis2=dict(
        domain=[0.55, 1],
        anchor='y2'
    ),
     yaxis2=dict(
        domain=[0, 1],
        anchor='x2'
))

data = [trace_ham, trace_vers]

fig = go.Figure(data = data, layout = layout)
fig.update_layout(yaxis2=dict(matches='y1'))
fig.update_layout(
    height=800,  # increase the overall height of the figure
    title='Lewis (2007 - 2011) vs Max (2015 - 2019). First 5 seasons.',
    xaxis=dict(title='Year', ticklen=5),
    yaxis=dict(title='Points')
)


fig.show()
fig.write_image("maxvsham5.png")