# Looking for 1-2 Finishes

Let's look at the F1 race data and try to find the 1-2 finishes in the sport, that is, races where one team's cars took both first and second place. I have two ideas for doing this:

1. Look for races where the first- and second-place finishers drove for the same team.
1. Look at the average finishing position for each team at each race. A 1-2 finish gives an average of 1.5. In other words, the smaller the average, the better the team did in that race.
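
To make the second idea concrete, here is a minimal sketch on a made-up single-race table. The toy rows, the team names, and the `avgFinish` column are illustrative assumptions; only the column names `raceId2`, `constructorName`, and `positionOrder` come from the dataset.

```python
import pandas as pd

# Toy results for a single race (made-up data, for illustration only).
toy = pd.DataFrame({
    "raceId2": [1, 1, 1, 1],
    "constructorName": ["Team A", "Team A", "Team B", "Team B"],
    "positionOrder": [1, 2, 3, 4],
})

# Average finishing position per team per race.
avg_finish = (
    toy.groupby(["raceId2", "constructorName"])["positionOrder"]
    .mean()
    .reset_index(name="avgFinish")
)

# For a two-car team, an average of 1.5 can only come from a 1-2 finish.
print(avg_finish[avg_finish["avgFinish"] == 1.5])
```

One caveat: the 1950 British Grand Prix rows shown further down list three Alfa Romeo entries, so with real data the average would probably need to be limited to each team's two best finishers before comparing it to 1.5.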


## 1. Looking for all the first and second place finishes

In [1]:
import pandas as pd
import numpy as np

In [2]:
races = pd.read_csv("../data/f1db_results.csv")

In [3]:
races.head()

Unnamed: 0,raceId2,prixName,year,round,prixDate,constructorName,driverName,grid,positionText,positionOrder,points,status
0,1,British Grand Prix,1950,1,1950-05-13,Alfa Romeo,Nino Farina,1,1,1,9.0,Finished
1,1,British Grand Prix,1950,1,1950-05-13,Alfa Romeo,Luigi Fagioli,2,2,2,6.0,Finished
2,1,British Grand Prix,1950,1,1950-05-13,Alfa Romeo,Reg Parnell,4,3,3,4.0,Finished
3,1,British Grand Prix,1950,1,1950-05-13,Talbot-Lago,Yves Cabantous,6,4,4,3.0,+2 Laps
4,1,British Grand Prix,1950,1,1950-05-13,Talbot-Lago,Louis Rosier,9,5,5,2.0,+2 Laps


In [4]:
races.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 24277 entries, 0 to 24276
Data columns (total 12 columns):
raceId2            24277 non-null int64
prixName           24277 non-null object
year               24277 non-null int64
round              24277 non-null int64
prixDate           24277 non-null object
constructorName    24277 non-null object
driverName         24277 non-null object
grid               24277 non-null int64
positionText       24277 non-null object
positionOrder      24277 non-null int64
points             24277 non-null float64
status             24277 non-null object
dtypes: float64(1), int64(5), object(6)
memory usage: 2.2+ MB


In [5]:
one_two = races[races.positionOrder <= 2]

In [6]:
one_two.head(10)

Unnamed: 0,raceId2,prixName,year,round,prixDate,constructorName,driverName,grid,positionText,positionOrder,points,status
0,1,British Grand Prix,1950,1,1950-05-13,Alfa Romeo,Nino Farina,1,1,1,9.0,Finished
1,1,British Grand Prix,1950,1,1950-05-13,Alfa Romeo,Luigi Fagioli,2,2,2,6.0,Finished
23,2,Monaco Grand Prix,1950,2,1950-05-21,Alfa Romeo,Juan Fangio,1,1,1,9.0,Finished
24,2,Monaco Grand Prix,1950,2,1950-05-21,Ferrari,Alberto Ascari,7,2,2,6.0,+1 Lap
44,3,Indianapolis 500,1950,3,1950-05-30,Kurtis Kraft,Johnnie Parsons,5,1,1,9.0,Finished
45,3,Indianapolis 500,1950,3,1950-05-30,Deidt,Bill Holland,10,2,2,6.0,+1 Lap
79,4,Swiss Grand Prix,1950,4,1950-06-04,Alfa Romeo,Nino Farina,2,1,1,9.0,Finished
80,4,Swiss Grand Prix,1950,4,1950-06-04,Alfa Romeo,Luigi Fagioli,3,2,2,6.0,Finished
97,5,Belgian Grand Prix,1950,5,1950-06-18,Alfa Romeo,Juan Fangio,2,1,1,8.0,Finished
98,5,Belgian Grand Prix,1950,5,1950-06-18,Alfa Romeo,Luigi Fagioli,3,2,2,6.0,Finished
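
The steps that follow read row 0 and row 1 of each race, so they rely on every race contributing exactly two rows to `one_two`. A quick check along these lines could confirm that before going further. It is a sketch on a made-up frame: `toy_one_two` is a hypothetical stand-in for `one_two`, and race 3 deliberately lacks a runner-up.

```python
import pandas as pd

# Toy stand-in for `one_two` (made-up rows; race 3 is missing its runner-up).
toy_one_two = pd.DataFrame({
    "raceId2": [1, 1, 2, 2, 3],
    "positionOrder": [1, 2, 1, 2, 1],
})

# Count how many top-two rows each race has.
rows_per_race = toy_one_two.groupby("raceId2").size()

# Races that do not have exactly two rows need a closer look.
odd_races = rows_per_race[rows_per_race != 2]
print(odd_races)
```

If `odd_races` comes back empty on the real data, the row-0 and row-1 lookups are safe for every race.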


In [7]:
groups = one_two.groupby("raceId2")

In [8]:
# Check that range(1002) ends at 1001, the last race ID used in the loop later on.
range(1002)[-1]

1001

In [9]:
groups

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x10fcaad30>

In [10]:
# Pull out the two rows for race 1 and renumber them 0 and 1.
group = groups.get_group(1).reset_index()
group

Unnamed: 0,index,raceId2,prixName,year,round,prixDate,constructorName,driverName,grid,positionText,positionOrder,points,status
0,0,1,British Grand Prix,1950,1,1950-05-13,Alfa Romeo,Nino Farina,1,1,1,9.0,Finished
1,1,1,British Grand Prix,1950,1,1950-05-13,Alfa Romeo,Luigi Fagioli,2,2,2,6.0,Finished


In [11]:
group.columns

Index(['index', 'raceId2', 'prixName', 'year', 'round', 'prixDate',
       'constructorName', 'driverName', 'grid', 'positionText',
       'positionOrder', 'points', 'status'],
      dtype='object')

In [12]:
len(group)

2

In [13]:
group.loc[0].constructorName

'Alfa Romeo'

In [14]:
first = group.loc[0]   # winner
second = group.loc[1]  # runner-up

# A 1-2 finish means both drivers share the same constructor.
first.constructorName == second.constructorName

True

Let's turn all of this into a function, or at least a more cohesive script.

In [15]:
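# Collect the winner's row from every race where one team finished first and second.
# Note: DataFrame.append was removed in pandas 2.0, so this cell needs an older pandas.
temp_df = pd.DataFrame()
groups = one_two.groupby("raceId2")
# Assumes race IDs run from 1 to 1001 with no gaps.
for i in range(1, 1002):
    group = groups.get_group(i).reset_index()
    first = group.loc[0]   # winner
    second = group.loc[1]  # runner-up
    if first.constructorName == second.constructorName:
        print(i, True)
        temp_df = temp_df.append(first)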

1 True
4 True
5 True
6 True
7 True
9 True
11 True
14 True
16 True
18 True
19 True
20 True
21 True
22 True
24 True
25 True
26 True
27 True
31 True
36 True
37 True
44 True
45 True
46 True
47 True
48 True
49 True
52 True
53 True
54 True
57 True
59 True
61 True
68 True
71 True
77 True
79 True
81 True
82 True
84 True
87 True
89 True
90 True
92 True
93 True
94 True
96 True
97 True
99 True
109 True
112 True
119 True
122 True
139 True
145 True
148 True
155 True
157 True
158 True
160 True
162 True
171 True
178 True
182 True
193 True
195 True
197 True
202 True
204 True
216 True
220 True
225 True
230 True
231 True
233 True
239 True
242 True
243 True
250 True
267 True
269 True
271 True
303 True
304 True
306 True
310 True
311 True
315 True
316 True
317 True
323 True
326 True
341 True
342 True
343 True
344 True
361 True
365 True
368 True
375 True
385 True
390 True
399 True
401 True
404 True
409 True
415 True
419 True
424 True
429 True
433 True
439 True
442 True
443 True
446 True
450 True
452 True
45
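
The loop above can also be written as a function that does not depend on `DataFrame.append` or on a hard-coded range of race IDs. The following is a sketch under stated assumptions, not the notebook's own method: the name `find_one_twos` and the toy rows are made up for illustration, and it assumes the frame has a unique index.

```python
import pandas as pd

def find_one_twos(results):
    """Return the winner's row for each race where one team finished 1-2."""
    top_two = results[results["positionOrder"] <= 2]
    winner_labels = []
    for race_id, group in top_two.groupby("raceId2"):
        group = group.sort_values("positionOrder")
        # Skip races that do not have exactly a winner and a runner-up.
        if len(group) != 2:
            continue
        first, second = group.iloc[0], group.iloc[1]
        if first["constructorName"] == second["constructorName"]:
            winner_labels.append(group.index[0])
    # Select the winners by label so column order and dtypes are untouched.
    return results.loc[winner_labels]

# Toy data (made up): races 1 and 3 are 1-2 finishes, race 2 is not.
toy = pd.DataFrame({
    "raceId2": [1, 1, 1, 2, 2, 3, 3],
    "constructorName": ["Team A", "Team A", "Team B", "Team A", "Team B", "Team B", "Team B"],
    "driverName": ["Driver 1", "Driver 2", "Driver 3", "Driver 1", "Driver 3", "Driver 4", "Driver 3"],
    "positionOrder": [1, 2, 3, 1, 2, 2, 1],
})

result = find_one_twos(toy)
print(result)
```

Because the winners are selected from the original frame rather than appended row by row, the integer columns stay integers and the columns keep their original order.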

In [16]:
temp_df

Unnamed: 0,constructorName,driverName,grid,index,points,positionOrder,positionText,prixDate,prixName,raceId2,round,status,year
0,Alfa Romeo,Nino Farina,1.0,0.0,9.00,1.0,1,1950-05-13,British Grand Prix,1.0,1.0,Finished,1950.0
0,Alfa Romeo,Nino Farina,2.0,79.0,9.00,1.0,1,1950-06-04,Swiss Grand Prix,4.0,4.0,Finished,1950.0
0,Alfa Romeo,Juan Fangio,2.0,97.0,8.00,1.0,1,1950-06-18,Belgian Grand Prix,5.0,5.0,Finished,1950.0
0,Alfa Romeo,Juan Fangio,1.0,111.0,9.00,1.0,1,1950-07-02,French Grand Prix,6.0,6.0,Finished,1950.0
0,Alfa Romeo,Nino Farina,3.0,131.0,8.00,1.0,1,1950-09-03,Italian Grand Prix,7.0,7.0,Finished,1950.0
0,Kurtis Kraft,Lee Wallard,2.0,181.0,9.00,1.0,1,1951-05-30,Indianapolis 500,9.0,2.0,Finished,1951.0
0,Alfa Romeo,Luigi Fagioli,7.0,228.0,4.00,1.0,1,1951-07-01,French Grand Prix,11.0,4.0,Finished,1951.0
0,Ferrari,Alberto Ascari,3.0,296.0,8.00,1.0,1,1951-09-16,Italian Grand Prix,14.0,7.0,Finished,1951.0
0,Ferrari,Piero Taruffi,2.0,339.0,9.00,1.0,1,1952-05-18,Swiss Grand Prix,16.0,1.0,Finished,1952.0
0,Ferrari,Alberto Ascari,1.0,394.0,9.00,1.0,1,1952-06-22,Belgian Grand Prix,18.0,3.0,Finished,1952.0


In [17]:
temp_df.head()

Unnamed: 0,constructorName,driverName,grid,index,points,positionOrder,positionText,prixDate,prixName,raceId2,round,status,year
0,Alfa Romeo,Nino Farina,1.0,0.0,9.0,1.0,1,1950-05-13,British Grand Prix,1.0,1.0,Finished,1950.0
0,Alfa Romeo,Nino Farina,2.0,79.0,9.0,1.0,1,1950-06-04,Swiss Grand Prix,4.0,4.0,Finished,1950.0
0,Alfa Romeo,Juan Fangio,2.0,97.0,8.0,1.0,1,1950-06-18,Belgian Grand Prix,5.0,5.0,Finished,1950.0
0,Alfa Romeo,Juan Fangio,1.0,111.0,9.0,1.0,1,1950-07-02,French Grand Prix,6.0,6.0,Finished,1950.0
0,Alfa Romeo,Nino Farina,3.0,131.0,8.0,1.0,1,1950-09-03,Italian Grand Prix,7.0,7.0,Finished,1950.0


In [19]:
onetwos = temp_df[["index", "raceId2", "year", "prixName", "prixDate", "round", "constructorName", "driverName", "grid", "positionOrder", "positionText", "points", "status"]]
onetwos.head()

Unnamed: 0,index,raceId2,year,prixName,prixDate,round,constructorName,driverName,grid,positionOrder,positionText,points,status
0,0.0,1.0,1950.0,British Grand Prix,1950-05-13,1.0,Alfa Romeo,Nino Farina,1.0,1.0,1,9.0,Finished
0,79.0,4.0,1950.0,Swiss Grand Prix,1950-06-04,4.0,Alfa Romeo,Nino Farina,2.0,1.0,1,9.0,Finished
0,97.0,5.0,1950.0,Belgian Grand Prix,1950-06-18,5.0,Alfa Romeo,Juan Fangio,2.0,1.0,1,8.0,Finished
0,111.0,6.0,1950.0,French Grand Prix,1950-07-02,6.0,Alfa Romeo,Juan Fangio,1.0,1.0,1,9.0,Finished
0,131.0,7.0,1950.0,Italian Grand Prix,1950-09-03,7.0,Alfa Romeo,Nino Farina,3.0,1.0,1,8.0,Finished


In [20]:
# Note: this is an alias, not a copy, so changes to `a` also change `onetwos`.
a = onetwos

In [21]:
a

Unnamed: 0,index,raceId2,year,prixName,prixDate,round,constructorName,driverName,grid,positionOrder,positionText,points,status
0,0.0,1.0,1950.0,British Grand Prix,1950-05-13,1.0,Alfa Romeo,Nino Farina,1.0,1.0,1,9.00,Finished
0,79.0,4.0,1950.0,Swiss Grand Prix,1950-06-04,4.0,Alfa Romeo,Nino Farina,2.0,1.0,1,9.00,Finished
0,97.0,5.0,1950.0,Belgian Grand Prix,1950-06-18,5.0,Alfa Romeo,Juan Fangio,2.0,1.0,1,8.00,Finished
0,111.0,6.0,1950.0,French Grand Prix,1950-07-02,6.0,Alfa Romeo,Juan Fangio,1.0,1.0,1,9.00,Finished
0,131.0,7.0,1950.0,Italian Grand Prix,1950-09-03,7.0,Alfa Romeo,Nino Farina,3.0,1.0,1,8.00,Finished
0,181.0,9.0,1951.0,Indianapolis 500,1951-05-30,2.0,Kurtis Kraft,Lee Wallard,2.0,1.0,1,9.00,Finished
0,228.0,11.0,1951.0,French Grand Prix,1951-07-01,4.0,Alfa Romeo,Luigi Fagioli,7.0,1.0,1,4.00,Finished
0,296.0,14.0,1951.0,Italian Grand Prix,1951-09-16,7.0,Ferrari,Alberto Ascari,3.0,1.0,1,8.00,Finished
0,339.0,16.0,1952.0,Swiss Grand Prix,1952-05-18,1.0,Ferrari,Piero Taruffi,2.0,1.0,1,9.00,Finished
0,394.0,18.0,1952.0,Belgian Grand Prix,1952-06-22,3.0,Ferrari,Alberto Ascari,1.0,1.0,1,9.00,Finished


In [22]:
# Cast the ID columns back to integers (the append loop turned them into floats).
a[["index", "raceId2"]] = a[["index", "raceId2"]].astype(int)

In [23]:
a

Unnamed: 0,index,raceId2,year,prixName,prixDate,round,constructorName,driverName,grid,positionOrder,positionText,points,status
0,0,1,1950.0,British Grand Prix,1950-05-13,1.0,Alfa Romeo,Nino Farina,1.0,1.0,1,9.00,Finished
0,79,4,1950.0,Swiss Grand Prix,1950-06-04,4.0,Alfa Romeo,Nino Farina,2.0,1.0,1,9.00,Finished
0,97,5,1950.0,Belgian Grand Prix,1950-06-18,5.0,Alfa Romeo,Juan Fangio,2.0,1.0,1,8.00,Finished
0,111,6,1950.0,French Grand Prix,1950-07-02,6.0,Alfa Romeo,Juan Fangio,1.0,1.0,1,9.00,Finished
0,131,7,1950.0,Italian Grand Prix,1950-09-03,7.0,Alfa Romeo,Nino Farina,3.0,1.0,1,8.00,Finished
0,181,9,1951.0,Indianapolis 500,1951-05-30,2.0,Kurtis Kraft,Lee Wallard,2.0,1.0,1,9.00,Finished
0,228,11,1951.0,French Grand Prix,1951-07-01,4.0,Alfa Romeo,Luigi Fagioli,7.0,1.0,1,4.00,Finished
0,296,14,1951.0,Italian Grand Prix,1951-09-16,7.0,Ferrari,Alberto Ascari,3.0,1.0,1,8.00,Finished
0,339,16,1952.0,Swiss Grand Prix,1952-05-18,1.0,Ferrari,Piero Taruffi,2.0,1.0,1,9.00,Finished
0,394,18,1952.0,Belgian Grand Prix,1952-06-22,3.0,Ferrari,Alberto Ascari,1.0,1.0,1,9.00,Finished


In [24]:
# Cast the remaining numeric columns to integers as well.
# Caution: astype(int) drops the fractional part of any non-integer points value.
int_cols = ["index", "raceId2", "year", "round", "grid", "positionOrder", "points"]
onetwos[int_cols] = onetwos[int_cols].astype(int)

In [25]:
onetwos

Unnamed: 0,index,raceId2,year,prixName,prixDate,round,constructorName,driverName,grid,positionOrder,positionText,points,status
0,0,1,1950,British Grand Prix,1950-05-13,1,Alfa Romeo,Nino Farina,1,1,1,9,Finished
0,79,4,1950,Swiss Grand Prix,1950-06-04,4,Alfa Romeo,Nino Farina,2,1,1,9,Finished
0,97,5,1950,Belgian Grand Prix,1950-06-18,5,Alfa Romeo,Juan Fangio,2,1,1,8,Finished
0,111,6,1950,French Grand Prix,1950-07-02,6,Alfa Romeo,Juan Fangio,1,1,1,9,Finished
0,131,7,1950,Italian Grand Prix,1950-09-03,7,Alfa Romeo,Nino Farina,3,1,1,8,Finished
0,181,9,1951,Indianapolis 500,1951-05-30,2,Kurtis Kraft,Lee Wallard,2,1,1,9,Finished
0,228,11,1951,French Grand Prix,1951-07-01,4,Alfa Romeo,Luigi Fagioli,7,1,1,4,Finished
0,296,14,1951,Italian Grand Prix,1951-09-16,7,Ferrari,Alberto Ascari,3,1,1,8,Finished
0,339,16,1952,Swiss Grand Prix,1952-05-18,1,Ferrari,Piero Taruffi,2,1,1,9,Finished
0,394,18,1952,Belgian Grand Prix,1952-06-22,3,Ferrari,Alberto Ascari,1,1,1,9,Finished
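
The casts above convert every listed column to integers, including `points`. The two-decimal display of `points` in the earlier outputs suggests that some entries are not whole numbers, in which case `astype(int)` would silently truncate them. A more cautious version might work on a copy and convert only the columns whose values are all whole. This is a sketch on made-up values; `toy`, `cleaned`, and `whole` are hypothetical names.

```python
import pandas as pd

# Toy stand-in for `onetwos` (made-up values stored as floats).
toy = pd.DataFrame({
    "raceId2": [1.0, 4.0, 11.0],
    "year": [1950.0, 1950.0, 1951.0],
    "points": [9.0, 9.0, 4.5],
})

# Work on an explicit copy so the source frame is left untouched.
cleaned = toy.copy()

# Keep only the candidate columns whose values are all whole numbers.
candidates = ["raceId2", "year", "points"]
whole = [c for c in candidates if (cleaned[c] % 1 == 0).all()]

# Cast those columns to integers; the others stay as floats.
cleaned = cleaned.astype({c: int for c in whole})

print(cleaned.dtypes)
```

Here `points` stays a float column because one of its toy values is 4.5, while `raceId2` and `year` become integers.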
