In [1]:
%run prepareStats.py

Last GP in the database: the 2022 Belgian Grand Prix


In [2]:
# Number of distinct venues each driver has raced at, attached to every result row
totalLocations = pd.Series(results.groupby(["name"])["location"].nunique(), name="totalLocations")
results = results.merge(totalLocations, on="name", how="outer")

In [3]:
# Drivers whose only venue was Indianapolis, i.e. who never raced anywhere but the Indy 500
driversIndyOnly = results[(results["totalLocations"] == 1) & (results["location"] == "Indianapolis")].name.drop_duplicates().tolist()

## The shiniest debuts

(Excluding drivers who took part in the very first F1 race and drivers who only ever raced in the Indianapolis 500.)
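
The exclusion lists `firstGPdrivers` and `lastGPdrivers` used in this notebook come from `prepareStats.py`, which is not shown here. As a guess at how such lists might be built, the sketch below picks everyone who appears in the earliest and the latest race of a made-up frame. The `toy` data and the names `first_gp_drivers` and `last_gp_drivers` are hypothetical, not the script's actual code.

```python
import pandas as pd

# Toy stand-in for the `results` frame (assumption: one row per driver per race).
toy = pd.DataFrame({
    "name": ["A", "B", "A", "C", "B", "C"],
    "date": pd.to_datetime([
        "1950-05-13", "1950-05-13", "1950-05-21",
        "1950-05-21", "2022-08-28", "2022-08-28",
    ]),
})

# Hypothetical reconstruction: everyone listed in the earliest race ...
first_gp_drivers = toy.loc[toy["date"] == toy["date"].min(), "name"].unique().tolist()
# ... and everyone listed in the most recent race in the data.
last_gp_drivers = toy.loc[toy["date"] == toy["date"].max(), "name"].unique().tolist()

print(first_gp_drivers)
print(last_gp_drivers)
```

Filtering with `~frame.name.isin(...)`, as the cells below do, then removes those drivers from the ranking.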

In [4]:
results = results.sort_values(by="date").reset_index(drop=True)

In [5]:
# idxmin returns index labels, so select with .loc (the index was reset after sorting)
firstRaces = results.loc[results.groupby(["name"])["date"].agg(pd.Series.idxmin)]

firstRaces = firstRaces[~firstRaces.name.isin(firstGPdrivers)]

firstRaces = firstRaces[~firstRaces.name.isin(driversIndyOnly)]

firstRaces = firstRaces.sort_values(by=["year", "position"])

In [6]:
firstRaces[["name", "year", "location", "position"]].sort_values(by=["position", "year"]).head(50)

|  | name | year | location | position |
|---|---|---|---|---|
| 2300 | Giancarlo Baghetti | 1961 | Reims | 1.0 |
| 28 | Alberto Ascari | 1950 | Monte-Carlo | 2.0 |
| 144 | Dorino Serafini | 1950 | Monza | 2.0 |
| 894 | Karl Kling | 1954 | Reims | 2.0 |
| 15282 | Jacques Villeneuve | 1996 | Melbourne | 2.0 |
| 22143 | Kevin Magnussen | 2014 | Melbourne | 2.0 |
| 357 | Jean Behra | 1952 | Bern | 3.0 |
| 1431 | Masten Gregory | 1957 | Monte-Carlo | 3.0 |
| 4377 | Reine Wisell | 1970 | New York State | 3.0 |
| 4636 | Mark Donohue | 1971 | Ontario | 3.0 |


## The most successful farewells

(Excluding drivers who took part in the latest race in the database, since their careers may not be over.)

In [7]:
results = results.sort_values(by="date", ascending=True).reset_index(drop=True)

In [8]:
lastRaces = results.loc[results.groupby(["name"])["date"].agg(pd.Series.idxmax)]

In [9]:
lastRaces = lastRaces[~lastRaces.name.isin(lastGPdrivers)]
lastRaces = lastRaces[~lastRaces.name.isin(driversIndyOnly)]

In [10]:
lastRaces[["name", "year", "location", "position"]].sort_values(by=["position", "year"]).head(20)

|  | name | year | location | position |
|---|---|---|---|---|
| 234 | Luigi Fagioli | 1951 | Reims | 1.0 |
| 3719 | Jim Clark | 1968 | Midrand | 1.0 |
| 148 | Dorino Serafini | 1950 | Monza | 2.0 |
| 1296 | Paul Frère | 1956 | Spa | 2.0 |
| 1787 | Mike Hawthorn | 1958 | Casablanca | 2.0 |
| 4673 | Jo Siffert | 1971 | New York State | 2.0 |
| 14399 | Alain Prost | 1993 | Adelaide | 2.0 |
| 22113 | Mark Webber | 2013 | São Paulo | 2.0 |
| 23375 | Nico Rosberg | 2016 | Abu Dhabi | 2.0 |
| 2431 | Tony Brooks | 1961 | New York State | 3.0 |


It's safe to say not a single driver has ever knowingly ended his F1 career with a satisfying victory. 

Luigi Fagioli's farewell win was a shared drive: [Fagioli himself crossed the line 11th](https://en.wikipedia.org/wiki/Luigi_Fagioli), in the car he had taken over from Fangio:

> His only Grand Prix of 1951 was his last, but he nevertheless won the French Grand Prix with Juan-Manuel Fangio, earning the distinction of being the oldest person to ever win a Formula One race. During the race, the Alfa Romeo team manager ordered him to hand over his healthy car to Fangio while Fagioli would drive Fangio's car, which was plagued with engine problems. Ferrari had done the same, ordering José Froilán González to hand over to the quicker and more experienced Alberto Ascari; this was common practice in Grand Prix racing before 1957. Fangio battled hard with Ascari and took victory while Fagioli finished 11th and last in Fangio's original car, 22 laps down. Fagioli was so incensed by this that he retired from Grand Prix racing after this race.

Jim Clark was killed in an F2 race before the 1968 Spanish Grand Prix; he could not have known that his win at the 1968 South African Grand Prix would be his last.

## Careers crowned with the best result in the very last race

In [11]:
bestRaces = results.sort_values(by="date", ascending=True)

bestRaces = bestRaces.loc[bestRaces.groupby(["name"])["position"].agg(pd.Series.idxmin).dropna()]

bestRaces = bestRaces[["name", "year", "location", "position"]].sort_values(by=["position", "year"])

bestLastRaces = lastRaces.merge(bestRaces, on="name", how="right")

In [12]:
bestLastRaces = bestLastRaces[bestLastRaces["position_x"] == bestLastRaces["position_y"]]

bestLastRaces = bestLastRaces[bestLastRaces["location_x"] == bestLastRaces["location_y"]]

bestLastRaces = bestLastRaces[bestLastRaces["entries"] > 1]

In [13]:
bestLastRaces[["name", "year_x", "location_x", "position_x", "entries"]].dropna().sort_values(by="entries", ascending=False).head(10)

|  | name | year_x | location_x | position_x | entries |
|---|---|---|---|---|---|
| 336 | Jan Magnussen | 1998.0 | Montreal | 6.0 | 25.0 |
| 587 | Jérôme d'Ambrosio | 2012.0 | Monza | 13.0 | 20.0 |
| 384 | Corrado Fabi | 1984.0 | Dallas | 7.0 | 18.0 |
| 204 | Michael Andretti | 1993.0 | Monza | 3.0 | 13.0 |
| 124 | Paul Frère | 1956.0 | Spa | 2.0 | 11.0 |
| 416 | Bruce Halford | 1960.0 | Reims | 8.0 | 9.0 |
| 6 | Luigi Fagioli | 1951.0 | Reims | 1.0 | 8.0 |
| 374 | Ingo Hoffmann | 1977.0 | São Paulo | 7.0 | 6.0 |
| 626 | Skip Barber | 1972.0 | New York State | 16.0 | 6.0 |
| 423 | Sam Tingle | 1969.0 | Midrand | 8.0 | 6.0 |
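
The filters in this section boil down to one question per driver: was the last race also the career-best finish? As a rough cross-check, here is a minimal sketch of that idea on made-up data. The `toy` frame and the `best` and `entries` columns computed here are assumptions for illustration; in the notebook, `entries` comes from `prepareStats.py`.

```python
import pandas as pd

# Toy results: driver A peaks in the last race, B peaks in the first, C has one entry.
toy = pd.DataFrame({
    "name": ["A", "A", "A", "B", "B", "C"],
    "date": pd.to_datetime([
        "2000-03-12", "2000-03-26", "2000-04-09",
        "2000-03-12", "2000-03-26", "2000-03-12",
    ]),
    "position": [8.0, 10.0, 6.0, 3.0, 5.0, 4.0],
})

toy = toy.sort_values("date").reset_index(drop=True)

# Career-best (lowest) position and number of entries, broadcast to every row.
toy["best"] = toy.groupby("name")["position"].transform("min")
toy["entries"] = toy.groupby("name")["date"].transform("count")

# Last race per driver: the final row for each name in the date-sorted frame.
last = toy.drop_duplicates("name", keep="last")

# Keep drivers whose final race matched their career best, with more than one entry.
crowned = last[(last["position"] == last["best"]) & (last["entries"] > 1)]
print(crowned[["name", "position", "entries"]])
```

One difference from the cells above: this sketch also keeps a driver who had already equalled the same best result in an earlier race, whereas the merge on the first best race plus the location check filters most of those cases out.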


## Careers launched with the best result, never to be achieved again 

This is kind of funny: Jan Magnussen ended his career with his best result (6th place). Sixteen years later, his son Kevin started his own F1 career with his own best result (2nd place). Each has the longest career among the drivers who share his fate.

In [14]:
bestRaces = results.sort_values(by="date", ascending=False)

bestRaces = bestRaces.loc[bestRaces.groupby(["name"])["position"].agg(pd.Series.idxmin).dropna()]

bestRaces = bestRaces[["name", "year", "location", "position", "entries"]].sort_values(by=["position", "year"])

bestFirstRaces = firstRaces.merge(bestRaces, on="name", how="right")

In [15]:
bestFirstRaces = bestFirstRaces[bestFirstRaces["position_x"] == bestFirstRaces["position_y"]]

bestFirstRaces = bestFirstRaces[bestFirstRaces["location_x"] == bestFirstRaces["location_y"]]

bestFirstRaces = bestFirstRaces[bestFirstRaces["entries_y"] > 1]

In [16]:
bestFirstRaces[["name", "location_x", "year_x", "position_x", "entries_y"]].dropna().sort_values(by="entries_y", ascending=False).head(10)

|  | name | location_x | year_x | position_x | entries_y |
|---|---|---|---|---|---|
| 155 | Kevin Magnussen | Melbourne | 2014.0 | 2.0 | 134 |
| 301 | Felipe Nasr | Melbourne | 2015.0 | 5.0 | 40 |
| 22 | Giancarlo Baghetti | Reims | 1961.0 | 1.0 | 26 |
| 188 | Reine Wisell | New York State | 1970.0 | 3.0 | 23 |
| 221 | Ken Wharton | Bern | 1952.0 | 4.0 | 16 |
| 190 | Mark Donohue | Ontario | 1971.0 | 3.0 | 15 |
| 238 | Vic Elford | Rouen | 1968.0 | 4.0 | 13 |
| 118 | Karl Kling | Reims | 1954.0 | 2.0 | 12 |
| 602 | Ian Ashley | Nürburg | 1974.0 | 14.0 | 11 |
| 263 | Alan Brown | Bern | 1952.0 | 5.0 | 9 |


## Longest time in F1 after the last win or podium

(Excluding drivers who took part in the latest race in the database, since their careers may not be over.)

In [17]:
# Sort and re-index `results` itself; assigning the sorted frame to lastRaces had no effect
results = results.sort_values(by="date", ascending=True).reset_index(drop=True)
lastRaces = results.loc[results.groupby(["name"])["date"].agg(pd.Series.idxmax)]

In [18]:
lastRaces = lastRaces[~lastRaces.name.isin(lastGPdrivers)]
lastRaces = lastRaces[~lastRaces.name.isin(driversIndyOnly)]

In [19]:
lastWins = wins.sort_values(by="date", ascending=False).reset_index(drop=True)
lastWins = lastWins.loc[lastWins.groupby(["name"])["date"].agg(pd.Series.idxmax)]

In [20]:
afterLastWin = lastWins.merge(lastRaces, on="name", how="right")

In [21]:
afterLastWin["era"] = (afterLastWin["date_y"] - afterLastWin["date_x"])

In [22]:
afterLastWin.groupby(["name"]).agg({"era": "max"}).sort_values(by="era", ascending=False).head(10)

| name | era |
|---|---|
| Robert Kubica | 4844 days |
| Jo Bonnier | 4508 days |
| Michele Alboreto | 3388 days |
| Felipe Massa | 3311 days |
| Jacques Villeneuve | 3227 days |
| Olivier Panis | 3066 days |
| Troy Ruttman | 2922 days |
| Jarno Trulli | 2744 days |
| Jochen Mass | 2646 days |
| Jacky Ickx | 2625 days |
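
The `era` column is simply the date of a driver's last race minus the date of his last win. A minimal sketch of the same subtraction without a merge, relying on index alignment by driver name, could look like this. The `toy` frame and its values are made up, and deriving wins from `position == 1` is an assumption; the notebook takes `wins` from `prepareStats.py`.

```python
import pandas as pd

# Toy results: A wins twice and then races on, B wins his final race.
toy = pd.DataFrame({
    "name": ["A", "A", "A", "B", "B"],
    "date": pd.to_datetime([
        "2000-03-12", "2001-03-04", "2003-03-09",
        "2000-03-12", "2002-03-03",
    ]),
    "position": [1.0, 1.0, 7.0, 2.0, 1.0],
})

# Latest race and latest win per driver, both indexed by name.
last_race = toy.groupby("name")["date"].max()
last_win = toy[toy["position"] == 1].groupby("name")["date"].max()

# Subtraction aligns on the name index; a driver without a win would get NaT.
gap = (last_race - last_win).sort_values(ascending=False)
print(gap)
```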


In [23]:
lastPodiums = podiums.sort_values(by="date", ascending=False).reset_index(drop=True)
lastPodiums = lastPodiums.loc[lastPodiums.groupby(["name"])["date"].agg(pd.Series.idxmax)]

In [24]:
afterLastPodium = lastPodiums.merge(lastRaces, on="name", how="right")

In [25]:
afterLastPodium["era"] = (afterLastPodium["date_y"] - afterLastPodium["date_x"])

In [26]:
afterLastPodium.groupby(["name"]).agg({"era": "max"}).sort_values(by="era", ascending=False).head(10)

| name | era |
|---|---|
| Jo Bonnier | 4508 days |
| Robert Kubica | 4032 days |
| Derek Warwick | 3381 days |
| Jos Verstappen | 3332 days |
| Bruno Giacomelli | 3270 days |
| Rolf Stommelen | 2975 days |
| Troy Ruttman | 2922 days |
| Louis Chiron | 2919 days |
| Olivier Panis | 2695 days |
| Hans Herrmann | 2541 days |


## "Entry" teams

In [27]:
firstRaces.groupby("constructor")["resultId"].nunique().nlargest(5)

constructor
Maserati         46
Ferrari          32
Cooper-Climax    30
Team Lotus       29
Lotus-Climax     23
Name: resultId, dtype: int64

In [28]:
firstRaces[firstRaces["year"] > 2006].groupby("constructor")["resultId"].nunique().nlargest(5)

constructor
Toro Rosso        10
Williams           9
Manor Marussia     5
Renault            5
Sauber             5
Name: resultId, dtype: int64

In [29]:
# Most recent driver debut for each constructor currently on the grid
lastDebutCurrentTeams = firstRaces.loc[firstRaces[firstRaces.constructor.isin(currentConstructors)].groupby(["constructor"])["date"].agg(pd.Series.idxmax)]

In [30]:
lastDebutCurrentTeams[["constructor", "date", "name"]].set_index("constructor").sort_values(by="date",ascending=False).head(60)

| constructor | date | name |
|---|---|---|
| Alfa Romeo | 2022-03-20 | Guanyu Zhou |
| AlphaTauri | 2021-03-28 | Yuki Tsunoda |
| Haas F1 Team | 2021-03-28 | Mick Schumacher |
| Williams | 2020-12-06 | Jack Aitken |
| McLaren | 2019-03-17 | Lando Norris |
| Red Bull | 2005-04-24 | Vitantonio Liuzzi |
| Ferrari | 1972-07-15 | Arturo Merzario |
| Mercedes | 1954-07-04 | Karl Kling |


## Old men yelling on the team radio

Best results achieved after 15 and 20 years in Formula One.

In [31]:
dateOfDebut = pd.Series(results.groupby(["driverId"])["date"].min(), name="dateOfDebut")
results = results.merge(dateOfDebut, on=["driverId"], how="right")

In [32]:
results["siceDebut"] = results["date"] - results["dateOfDebut"]

In [33]:
from datetime import timedelta

In [34]:
fifteenYears = results[results["siceDebut"] > timedelta(days=5475)]
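
A note on the thresholds: `timedelta(days=5475)` is 15 × 365 days (and 7300 is 20 × 365), so leap days are ignored and the cut-off falls a few days short of the calendar anniversary. The sketch below compares the fixed-day threshold with a calendar-based one built with `pd.DateOffset`. The debut date is made up for illustration, and whether those few days change any of the rankings here has not been checked.

```python
import pandas as pd
from datetime import timedelta

# Hypothetical debut date, chosen only for illustration.
debut = pd.Timestamp("2001-03-04")

# Fixed-length threshold, as in this notebook: 15 * 365 days, leap days ignored.
fixed = debut + timedelta(days=5475)

# Calendar threshold: same day and month, fifteen years later.
calendar = debut + pd.DateOffset(years=15)

print(fixed.date(), calendar.date())
print((calendar - fixed).days, "days apart")
```

A `pd.DateOffset` can also be added to a whole datetime column, so a calendar-based filter would compare `date` against `dateOfDebut + pd.DateOffset(years=15)` instead of comparing a difference against a fixed number of days.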

In [35]:
fifteenYears.groupby("name")["position"].min().nsmallest(20)

name
Kimi Räikkönen         1.0
Michael Schumacher     1.0
Riccardo Patrese       1.0
Rubens Barrichello     1.0
Jack Brabham           2.0
Lewis Hamilton         2.0
Fernando Alonso        3.0
Felipe Massa           6.0
Graham Hill            6.0
Jenson Button          6.0
Sebastian Vettel       8.0
Jo Bonnier            10.0
Luca Badoer           14.0
Robert Kubica         14.0
Name: position, dtype: float64

In [36]:
fifteenYears.groupby("name")["points"].sum().nlargest(20)

name
Kimi Räikkönen        699.0
Fernando Alonso       253.0
Michael Schumacher    228.0
Lewis Hamilton        145.0
Rubens Barrichello    139.0
Riccardo Patrese       52.0
Felipe Massa           43.0
Jenson Button          37.0
Sebastian Vettel        7.0
Jack Brabham            6.0
Graham Hill             1.0
Jo Bonnier              0.0
Luca Badoer             0.0
Robert Kubica           0.0
Name: points, dtype: float64

In [37]:
fifteenYears.groupby("name")["raceId"].nunique().nlargest(20)

name
Kimi Räikkönen        120
Fernando Alonso        96
Rubens Barrichello     73
Michael Schumacher     63
Jenson Button          41
Graham Hill            30
Riccardo Patrese       27
Felipe Massa           19
Lewis Hamilton         14
Jack Brabham            7
Sebastian Vettel        6
Jo Bonnier              2
Luca Badoer             2
Robert Kubica           2
Name: raceId, dtype: int64

In [38]:
twentyYears = results[results["siceDebut"] > timedelta(days=7300)]

In [39]:
twentyYears.groupby("name")["position"].min().nsmallest(20)

name
Fernando Alonso       3.0
Michael Schumacher    3.0
Kimi Räikkönen        8.0
Name: position, dtype: float64

In [40]:
twentyYears.groupby("name")["points"].sum().nlargest(20)

name
Fernando Alonso       132.0
Michael Schumacher     93.0
Kimi Räikkönen         10.0
Name: points, dtype: float64

In [41]:
twentyYears.groupby("name")["raceId"].nunique().nlargest(20)

name
Fernando Alonso       36
Michael Schumacher    28
Kimi Räikkönen        20
Name: raceId, dtype: int64