# World Marathon Majors - Winners

The World Marathon Majors consist of six major city marathons (https://en.wikipedia.org/wiki/World_Marathon_Majors).

Lists of all historic winners can be found on each race's Wikipedia page:

    - Tokyo (https://en.wikipedia.org/wiki/Tokyo_Marathon)
    - Boston (https://en.wikipedia.org/wiki/List_of_winners_of_the_Boston_Marathon)
    - London (https://en.wikipedia.org/wiki/List_of_winners_of_the_London_Marathon)
    - Berlin (https://en.wikipedia.org/wiki/Berlin_Marathon)
    - Chicago (https://en.wikipedia.org/wiki/List_of_winners_of_the_Chicago_Marathon)
    - New York (https://en.wikipedia.org/wiki/List_of_winners_of_the_New_York_City_Marathon)
    
Using the Wikipedia API, let's see if we can collect a list of winners for each race, covering the men's and women's elite and wheelchair races, and process it into an easy-to-read form.

In [426]:
import pandas as pd
import wikipedia as wp
import re
import numpy as np

### Find each page and make sure each exists

In [35]:
Tokyo_page=wp.page('Tokyo Marathon')
Boston_page=wp.page('List of winners of the Boston Marathon')
London_page=wp.page('List of winners of the London Marathon')
Berlin_page=wp.page('Berlin Marathon')
Chicago_page=wp.page('List of winners of the Chicago Marathon')
New_York_page=wp.page('List of winners of the New York City Marathon')

pages=[Tokyo_page,Boston_page,London_page,Berlin_page,Chicago_page,New_York_page]

for page in pages:
    print(page.title)

Tokyo Marathon
List of winners of the Boston Marathon
List of winners of the London Marathon
Berlin Marathon
List of winners of the Chicago Marathon
List of winners of the New York City Marathon


For each page, we'll use the pandas read_html() function to get all relevant tables.

We first need to identify which tables we need by manually inspecting the Wikipedia page for each race.
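As a complement to inspecting each page by eye, the header cells of every table in a page's HTML can be listed together with their positions. The sketch below uses only the standard library and runs on a small hand-written HTML string. The names `TableHeaderLister` and `list_table_headers` are hypothetical, and it assumes that header cells are marked up as `<th>` and that tables are not nested, which may not hold for every page.

```python
from html.parser import HTMLParser


class TableHeaderLister(HTMLParser):
    """Collect the text of the <th> cells of every <table> in an HTML string."""

    def __init__(self):
        super().__init__()
        self.tables = []       # one list of header strings per table
        self._in_th = False    # True while the parser is inside a <th> cell

    def handle_starttag(self, tag, attrs):
        if tag == 'table':
            self.tables.append([])
        elif tag == 'th' and self.tables:
            self._in_th = True
            self.tables[-1].append('')

    def handle_endtag(self, tag):
        if tag == 'th':
            self._in_th = False

    def handle_data(self, data):
        # only text found inside a header cell is recorded
        if self._in_th:
            self.tables[-1][-1] += data.strip()


def list_table_headers(html):
    """Return one list of header strings per table, in page order."""
    parser = TableHeaderLister()
    parser.feed(html)
    return parser.tables


# a made-up two-table page, standing in for the output of page.html()
sample = (
    '<table><tr><th>Year</th><th>Athlete</th></tr>'
    '<tr><td>1981</td><td>Dick Beardsley</td></tr></table>'
    '<table><tr><th>Date</th><th>Male athlete</th></tr></table>'
)
for position, headers in enumerate(list_table_headers(sample)):
    print(position, headers)
```

The printed positions are list indices, so they can be compared directly with the positions used when indexing the result of read_html().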

## The London Marathon

https://en.wikipedia.org/wiki/List_of_winners_of_the_London_Marathon

In [36]:
London_page.summary

"The London Marathon, one of the six World Marathon Majors, has been contested by men and women annually since 29 March 1981. Set over a largely flat course around the River Thames, the marathon is 26.2 miles (42.2 km) in length and generally regarded as a competitive and unpredictable event, and conducive to fast times.The inaugural marathon had 7,741 entrants, 6,255 of whom completed the race. The first Men's Elite Race was tied between American Dick Beardsley and Norwegian Inge Simonsen, who crossed the finish line holding hands in 2 hours, 11 minutes, 48 seconds. The first Women's Elite Race was won by Briton Joyce Smith in 2:29:57. In 1983, the first wheelchair races took place. Organized by the British Sports Association for the Disabled (BASD), 19 people competed and 17 finished. Gordon Perry of the United Kingdom won the Men's Wheelchair Race, coming in at 3:20:07, and Denise Smith, also of the UK, won the Women's Wheelchair Race in 4:29:03.Twenty athletes representing the Unit

By manually inspecting this page, we see that the results we're looking for are in the first four tables on the page.

In [148]:
html = wp.page("List of winners of the London Marathon").html()
# parse the page once and pick out the four winners tables by position
tables = pd.read_html(html)
male_elite_London = tables[0]
female_elite_London = tables[1]
male_wheel_London = tables[2]
female_wheel_London = tables[3]

In [149]:
male_elite_London.head()

Unnamed: 0,0,1,2,3,4
0,Year,Athlete,Nationality,Time(h:m:s),Notes
1,1981,Dick Beardsley (Tie),United States,2:11:48,Course record
2,Inge Simonsen (Tie),Norway,,,
3,1982,Hugh Jones,United Kingdom,2:09:24,Course record
4,1983,Mike Gratton,United Kingdom,2:09:43,


### Cleaning up

London is famous for having a tie in the men's race at its first edition in 1981. This shows up in the first table taken from the wiki page as an extra row for the second winner, so let's deal with that first. It's only one instance, so we'll fix it manually.

In [150]:
# merge the two tied winners into the 1981 row, then drop the leftover second row
male_elite_London.iloc[1, 1] = 'Dick Beardsley and Inge Simonsen (Tie)'
male_elite_London.iloc[1, 2] = 'United States and Norway'
male_elite_London = male_elite_London.drop(2).reset_index(drop=True)
male_elite_London.head()

Unnamed: 0,0,1,2,3,4
0,Year,Athlete,Nationality,Time(h:m:s),Notes
1,1981,Dick Beardsley and Inge Simonsen (Tie),United States and Norway,2:11:48,Course record
2,1982,Hugh Jones,United Kingdom,2:09:24,Course record
3,1983,Mike Gratton,United Kingdom,2:09:43,
4,1984,Charlie Spedding,United Kingdom,2:09:57,


Now that the initial cleaning is done, let's standardise these tables by promoting the first row to column headings and making the year the index.

Let's also parse the time strings into proper time values.

In [152]:
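def standardise_table(df):
    # use the first row as the column headings, then drop it from the data
    df.columns = df.iloc[0]
    df = df.reindex(df.index.drop(0))
    df = df.set_index('Year')
    # parse 'h:mm:ss' strings into datetime.time values
    df['Time(h:m:s)'] = pd.to_datetime(df['Time(h:m:s)'], format='%H:%M:%S').dt.time
    return df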

In [153]:
male_elite_London=standardise_table(male_elite_London)
female_elite_London=standardise_table(female_elite_London)
male_wheel_London=standardise_table(male_wheel_London)
female_wheel_London=standardise_table(female_wheel_London)

male_elite_London.head()

Unnamed: 0_level_0,Athlete,Nationality,Time(h:m:s),Notes
Year,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
1981,Dick Beardsley and Inge Simonsen (Tie),United States and Norway,02:11:48,Course record
1982,Hugh Jones,United Kingdom,02:09:24,Course record
1983,Mike Gratton,United Kingdom,02:09:43,
1984,Charlie Spedding,United Kingdom,02:09:57,
1985,Steve Jones,United Kingdom,02:08:16,Course record
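The header-promotion step inside standardise_table can also be written so that the input frame is left untouched. The sketch below illustrates this on a small made-up frame and is not the notebook's own function: `promote_header` and the sample rows are hypothetical. Passing the header row as a plain list also means the old row label 0 is not carried over as the name of the column index.

```python
import pandas as pd


def promote_header(df, index_column='Year'):
    """Return a copy of df whose first row has become the column headings."""
    out = df.iloc[1:].copy()            # every row except the header row
    out.columns = df.iloc[0].tolist()   # a plain list carries no index name
    return out.set_index(index_column)


# a made-up raw table in the same shape as the read_html() output above
raw = pd.DataFrame([
    ['Year', 'Athlete', 'Time'],
    ['1982', 'Hugh Jones', '2:09:24'],
    ['1983', 'Mike Gratton', '2:09:43'],
])
clean = promote_header(raw)
print(clean)
```

Because the function works on a copy, `raw` keeps its original integer column labels and can be reprocessed if a later step goes wrong.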


## Boston Marathon

https://en.wikipedia.org/wiki/List_of_winners_of_the_Boston_Marathon

In [154]:
Boston_page.summary

'The Boston Marathon is an annual marathon held in the Greater Boston area in Massachusetts. The event is held on Patriots Day, the third Monday of April. The Boston Marathon has been held annually since 1897 and is the oldest annual marathon in the world.'

In [427]:
html = wp.page("List of winners of the Boston Marathon").html()
# parse the page once; the winners tables start at position 1 on this page
tables = pd.read_html(html)
male_elite_Boston = tables[1]
female_elite_Boston = tables[2]
male_wheel_Boston = tables[3]
female_wheel_Boston = tables[4]

In [428]:
male_wheel_Boston.head()

Unnamed: 0,0,1,2,3,4
0,Year,Athlete,Country/State,Time,Notes
1,1975,"Hall, RobertRobert Hall",United States United States (MA),2:58:00,
2,1976,zzzNone,,,
3,1977,"Hall, RobertRobert Hall",United States United States (MA),2:40:10,2nd victory
4,1978,"Murray, GeorgeGeorge Murray",United States United States (FL),2:26:57,


There was no men's wheelchair race in 1976, so we'll remove that row from this table.

In [429]:
male_wheel_Boston=male_wheel_Boston.reindex(male_wheel_Boston.index.drop(2)).reset_index(drop=True)
male_wheel_Boston.head()

Unnamed: 0,0,1,2,3,4
0,Year,Athlete,Country/State,Time,Notes
1,1975,"Hall, RobertRobert Hall",United States United States (MA),2:58:00,
2,1977,"Hall, RobertRobert Hall",United States United States (MA),2:40:10,2nd victory
3,1978,"Murray, GeorgeGeorge Murray",United States United States (FL),2:26:57,
4,1979,"Archer, KenKen Archer",United States United States (OH),2:38:59,


In both the men's and women's wheelchair tables, some of the times also carry citation markers such as [6]. We'll need to remove these so that the times can be parsed. We can do that with a simple regex.

In [430]:
male_wheel_Boston.tail()

Unnamed: 0,0,1,2,3,4
39,2014,Ernst van Dyk,South Africa,1:20:36,10th victory
40,2015,Marcel Hug,Switzerland,1:29:53,
41,2016,Marcel Hug,Switzerland,1:24:01,2nd victory
42,2017,Marcel Hug,Switzerland,1:18:03,3rd victory
43,2018,Marcel Hug,Switzerland,1:46:26[6],4th victory


In [431]:
def remove_citation(time):
    # strip bracketed citation markers such as '[6]'
    return re.sub(r'\[.*?\]', '', str(time))

In [432]:
male_wheel_Boston[3]=male_wheel_Boston[3].apply(remove_citation)
female_wheel_Boston[3]=female_wheel_Boston[3].apply(remove_citation)
male_wheel_Boston.tail()

Unnamed: 0,0,1,2,3,4
39,2014,Ernst van Dyk,South Africa,1:20:36,10th victory
40,2015,Marcel Hug,Switzerland,1:29:53,
41,2016,Marcel Hug,Switzerland,1:24:01,2nd victory
42,2017,Marcel Hug,Switzerland,1:18:03,3rd victory
43,2018,Marcel Hug,Switzerland,1:46:26,4th victory
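One detail of remove_citation is that it calls str() on every value, so a missing cell (a float NaN) comes back as the string 'nan'. The variant below is a sketch of one way to leave non-string values untouched; `strip_citation` and `CITATION` are hypothetical names, not part of the notebook.

```python
import re

# a bracketed marker such as '[6]' or '[11]'
CITATION = re.compile(r'\[[^\]]*\]')


def strip_citation(value):
    """Remove bracketed citation markers, leaving non-string values untouched."""
    if not isinstance(value, str):
        return value
    return CITATION.sub('', value).strip()


print(strip_citation('1:46:26[6]'))
print(strip_citation('1987[11]'))
print(strip_citation(None))
print(str(float('nan')))   # what str() does to a missing value
```

It could be applied to a column in the same way as remove_citation, with `.apply(strip_citation)`.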


### Standardise tables

As with the London Marathon tables, we can now make the year the index and change the column headings out of the first row. We'll also convert the time to datetime format.

In [433]:
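def standardise_table(df):
    # use the first row as the column headings, then drop it from the data
    df.columns = df.iloc[0]
    df = df.reindex(df.index.drop(0))
    df = df.set_index('Year')
    # errors='coerce' turns entries that do not parse into missing values
    df['Time'] = pd.to_datetime(df['Time'], format='%H:%M:%S', errors='coerce').dt.time
    return df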

In [434]:
male_elite_Boston=standardise_table(male_elite_Boston)
female_elite_Boston=standardise_table(female_elite_Boston)
male_wheel_Boston=standardise_table(male_wheel_Boston)
female_wheel_Boston=standardise_table(female_wheel_Boston)

male_elite_Boston.head()

Unnamed: 0_level_0,Athlete,Country/State or Province,Time,Notes
Year,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
1897,"McDermott, John J.John J. McDermott",United States United States(NY),02:55:10,
1898,"MacDonald, Ronald J.Ronald J. MacDonald",Canada Canada (NS),02:42:00,
1899,"Brignolia, LawrenceLawrence Brignolia",United States United States (MA),02:54:38,
1900,"Caffery, JohnJohn ""Jack"" Caffery",Canada Canada (ON),02:39:44,
1901,"Caffery, JohnJohn ""Jack"" Caffery",Canada Canada (ON),02:29:23,2nd victory
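The Boston version of standardise_table passes errors='coerce', so entries that are not valid times are turned into missing values instead of raising an error. A plain-Python analogue of that idea, using only the standard library, might look like the sketch below. `parse_finish_time` is a hypothetical helper and is not what pandas does internally.

```python
from datetime import datetime


def parse_finish_time(text):
    """Parse an 'h:mm:ss' string into a datetime.time, or return None."""
    try:
        return datetime.strptime(text.strip(), '%H:%M:%S').time()
    except (AttributeError, ValueError):
        # AttributeError: the value is not a string (for example a missing cell)
        # ValueError: the string is not in h:mm:ss form
        return None


print(parse_finish_time('2:55:10'))
print(parse_finish_time('zzzNone'))
print(parse_finish_time(None))
```

Returning None rather than raising keeps one malformed cell from stopping the whole conversion, at the cost of hiding which cells failed, so it is worth counting the missing values afterwards.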


### Cleaning up

For the Boston results, we have an odd situation: many of the names in the tables are hyperlinked, so each cell's HTML contains the name twice, once in a span and once in the anchor text of the link. The extracted text therefore repeats the name, and we'll use a regex to clean it up.

In [435]:
male_elite_Boston['Athlete'].iloc[1]

'MacDonald, Ronald J.Ronald J. MacDonald'

In [436]:
def clean_names(athlete):
    # find all words in the name starting with a capital letter followed by lower case
    names = re.findall(r'[A-Z][a-z]+', athlete)
    # join 'Mc' and 'Mac' to the second part of the name and take care of double barrels
    exceptions = ['Mc', 'Mac', 'De', 'Van', 'Cable']
    names_amended = []
    i = 0
    while i < len(names):
        if names[i] in exceptions and i + 1 < len(names):
            # glue the prefix to the word that follows it, then skip that word
            names_amended.append(names[i] + names[i + 1])
            i += 2
        else:
            names_amended.append(names[i])
            i += 1
    # the full name should now be the last two words in the list
    return ' '.join(names_amended[-2:])

In [437]:
male_elite_Boston['Athlete']=male_elite_Boston['Athlete'].apply(clean_names)
female_elite_Boston['Athlete']=female_elite_Boston['Athlete'].apply(clean_names)
male_wheel_Boston['Athlete']=male_wheel_Boston['Athlete'].apply(clean_names)
female_wheel_Boston['Athlete']=female_wheel_Boston['Athlete'].apply(clean_names)
male_elite_Boston.head()

Unnamed: 0_level_0,Athlete,Country/State or Province,Time,Notes
Year,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
1897,John McDermott,United States United States(NY),02:55:10,
1898,Ronald MacDonald,Canada Canada (NS),02:42:00,
1899,Lawrence Brignolia,United States United States (MA),02:54:38,
1900,Jack Caffery,Canada Canada (ON),02:39:44,
1901,Jack Caffery,Canada Canada (ON),02:29:23,2nd victory
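The duplicated cells follow a visible pattern in the raw output: a "Last, First" part followed directly by "First Last". An alternative to the word-based regex is to use that pattern directly, which keeps middle initials. The sketch below is only an illustration under that assumption. `strip_sort_key` is a hypothetical helper, and it returns the cell unchanged whenever the pattern does not match exactly (as with the nickname in the Caffery rows).

```python
def strip_sort_key(cell):
    """Turn 'Last, FirstFirst Last' into 'First Last'; otherwise return cell as-is."""
    if ', ' not in cell:
        return cell
    last, rest = cell.split(', ', 1)
    suffix = ' ' + last
    if rest.endswith(suffix):
        doubled = rest[:-len(suffix)]      # e.g. 'Ronald J.Ronald J.'
        half = len(doubled) // 2
        if half and doubled[:half] == doubled[half:]:
            return doubled[:half] + suffix
    return cell


print(strip_sort_key('MacDonald, Ronald J.Ronald J. MacDonald'))
print(strip_sort_key('Hall, RobertRobert Hall'))
print(strip_sort_key('Marcel Hug'))
```

Cells that come back unchanged would still need a fallback such as clean_names.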


The final cleaning step for the Boston results is to separate the country from the state or province for US and Canadian athletes.

In [438]:
male_elite_Boston['Country/State or Province'].iloc[0]

'United States United States(NY)'

In [439]:
test=male_elite_Boston['Country/State or Province'].iloc[0]

In [440]:
def find_state(entry):
    # capture the text inside the first pair of parentheses, e.g. 'NY'
    match = re.search(r'\((.*?)\)', entry)
    if match:
        state = match.group(1)
    else:
        state = np.nan
    return state

In [441]:
male_elite_Boston['State']=male_elite_Boston['Country/State or Province'].apply(find_state)
female_elite_Boston['State']=female_elite_Boston['Country/State'].apply(find_state)
male_wheel_Boston['State']=male_wheel_Boston['Country/State'].apply(find_state)
female_wheel_Boston['State']=female_wheel_Boston['Country/State'].apply(find_state)
male_elite_Boston.head()

Unnamed: 0_level_0,Athlete,Country/State or Province,Time,Notes,State
Year,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
1897,John McDermott,United States United States(NY),02:55:10,,NY
1898,Ronald MacDonald,Canada Canada (NS),02:42:00,,NS
1899,Lawrence Brignolia,United States United States (MA),02:54:38,,MA
1900,Jack Caffery,Canada Canada (ON),02:39:44,,ON
1901,Jack Caffery,Canada Canada (ON),02:29:23,2nd victory,ON


In [442]:
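def clean_country(country):
    # drop the '(NY)'-style state or province suffix
    country = re.sub(r'\(.*?\)', '', country)
    # remove repeated words while keeping their original order
    # (a set does not guarantee 'United States' rather than 'States United')
    return ' '.join(dict.fromkeys(country.split()))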

In [443]:
male_elite_Boston['Country/State or Province']=male_elite_Boston['Country/State or Province'].apply(clean_country)
female_elite_Boston['Country/State']=female_elite_Boston['Country/State'].apply(clean_country)
male_wheel_Boston['Country/State']=male_wheel_Boston['Country/State'].apply(clean_country)
female_wheel_Boston['Country/State']=female_wheel_Boston['Country/State'].apply(clean_country)
male_elite_Boston.head()

Unnamed: 0_level_0,Athlete,Country/State or Province,Time,Notes,State
Year,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
1897,John McDermott,United States,02:55:10,,NY
1898,Ronald MacDonald,Canada,02:42:00,,NS
1899,Lawrence Brignolia,United States,02:54:38,,MA
1900,Jack Caffery,Canada,02:39:44,,ON
1901,Jack Caffery,Canada,02:29:23,2nd victory,ON
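The two steps above (pulling out the state, then tidying the country) can also be pictured as one function that returns both pieces at once. This is a sketch using only the standard library. `split_country_state` is a hypothetical name, and it only removes the duplication when the remaining words are an exact repetition, so a name such as 'South Africa' is left alone.

```python
import re


def split_country_state(entry):
    """Split 'Canada Canada (NS)' into ('Canada', 'NS')."""
    match = re.search(r'\(([^)]*)\)', entry)
    state = match.group(1) if match else None
    # remove the parenthesised part, then split what is left into words
    words = re.sub(r'\([^)]*\)', '', entry).split()
    half = len(words) // 2
    # keep a single copy when the words are an exact repetition
    if len(words) % 2 == 0 and words[:half] == words[half:]:
        words = words[:half]
    return ' '.join(words), state


print(split_country_state('United States United States(NY)'))
print(split_country_state('Canada Canada (NS)'))
print(split_country_state('South Africa'))
```

With pandas, the returned pairs could be unpacked into two columns, for example by applying the function and building a new frame from the resulting tuples.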


## Chicago Marathon

https://en.wikipedia.org/wiki/List_of_winners_of_the_Chicago_Marathon

In [515]:
Chicago_page.summary

"The Chicago Marathon, one of the six World Marathon Majors, has been contested by men and women annually since 1977.  Since 1983, it has been held annually in October.  The United States had been represented by the most Chicago Marathon winners (nine men and twelve women).  After a seventh consecutive win by a Kenyan man in 2009, Kenyan men have won more times (ten) than men representing any other country. The United Kingdom is in third place in total victories (eight), victories by men (five) and victories by women (three).  All four of Brazil's victors have been men, and all three of Portugal's winners have been women."

In [606]:
html = wp.page("List of winners of the Chicago Marathon").html()
# parse the page once; positions 1 and 2 hold the elite and wheelchair tables
tables = pd.read_html(html)
all_elite_Chicago = tables[1]
all_wheel_Chicago = tables[2]

In [607]:
all_elite_Chicago.head()

Unnamed: 0,0,1,2,3,4,5,6
0,Date,Male athlete,Country,Time,Female athlete,Country,Time
1,"September 25, 1977",Dan Cloeter,United States,2:17:52,Dorothy Doolittle,United States,2:50:47
2,"September 24, 1978",Mark Stanforth,United States,2:19:20,Lynae Larson,United States,2:59:25
3,"October 21, 1979",Dan Cloeter,United States,2:23:20,Laura Michalek,United States,3:15:45
4,"September 28, 1980",Frank Richardson,United States,2:14:04,Sue Peterson,United States,2:45:03


In [608]:
all_wheel_Chicago

Unnamed: 0,0,1,2,3,4,5,6
0,Date,Male athlete,Country,Time,Female athlete,Country,Time
1,1984,Robert Fitch,United States,2:35:06,Jonnie Baylark,United States,3:29:10
2,1985,Robert Fitch,United States,2:23:41,Jayne Fortson,United States,2:52:22
3,1986,Bart Bardwell,United States,2:10:19,Jonnie Baylark,United States,3:23:32
4,1987[11],—,—,—,—,—,—
5,1988,Ken Luckenbaugh,United States,2:12:17,—,—,—


The Chicago Marathon results are formatted so that the male and female athletes appear in the same table. The elite and wheelchair tables also need different cleaning and standardising steps, so we'll treat each one separately.

### Cleaning up - Elites table

Let's start by standardising the table: making the top row the column headings and parsing the times. We'll also need to rename the 'Time' and 'Country' columns, since each of those names appears twice.

In [609]:
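def standardise_table(df):
    # rename the duplicated header cells so every column name is unique
    df.iloc[0, 2] = 'Male Country'
    df.iloc[0, 3] = 'Male Time'
    df.iloc[0, 5] = 'Female Country'
    df.iloc[0, 6] = 'Female Time'
    # promote the first row to column headings and drop it from the data
    df.columns = df.iloc[0]
    df = df.reindex(df.index.drop(0))
    # entries that do not parse as h:mm:ss are coerced to missing values
    df['Male Time'] = pd.to_datetime(df['Male Time'], format='%H:%M:%S', errors='coerce').dt.time
    df['Female Time'] = pd.to_datetime(df['Female Time'], format='%H:%M:%S', errors='coerce').dt.time
    return df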

In [610]:
all_elite_Chicago=standardise_table(all_elite_Chicago)
all_elite_Chicago.head()

Unnamed: 0,Date,Male athlete,Male Country,Male Time,Female athlete,Female Country,Female Time
1,"September 25, 1977",Dan Cloeter,United States,02:17:52,Dorothy Doolittle,United States,02:50:47
2,"September 24, 1978",Mark Stanforth,United States,02:19:20,Lynae Larson,United States,02:59:25
3,"October 21, 1979",Dan Cloeter,United States,02:23:20,Laura Michalek,United States,03:15:45
4,"September 28, 1980",Frank Richardson,United States,02:14:04,Sue Peterson,United States,02:45:03
5,"September 27, 1981",Phil Coppess,United States,02:16:13,Tina Gandy,United States,02:49:39


The date format is slightly different in the elite athletes table, as it gives the full date of the race. We only want the year, so we'll extract that first.

In [611]:
def get_year(date):
    # the year is the last four characters of a date such as 'September 25, 1977'
    return date[-4:]

In [612]:
all_elite_Chicago['Year']=all_elite_Chicago['Date'].apply(get_year)
all_elite_Chicago.head()

Unnamed: 0,Date,Male athlete,Male Country,Male Time,Female athlete,Female Country,Female Time,Year
1,"September 25, 1977",Dan Cloeter,United States,02:17:52,Dorothy Doolittle,United States,02:50:47,1977
2,"September 24, 1978",Mark Stanforth,United States,02:19:20,Lynae Larson,United States,02:59:25,1978
3,"October 21, 1979",Dan Cloeter,United States,02:23:20,Laura Michalek,United States,03:15:45,1979
4,"September 28, 1980",Frank Richardson,United States,02:14:04,Sue Peterson,United States,02:45:03,1980
5,"September 27, 1981",Phil Coppess,United States,02:16:13,Tina Gandy,United States,02:49:39,1981
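Taking the last four characters works for the elite table because every date ends with the year. A slightly more defensive sketch is to search for a four-digit number, which would also cope with a trailing citation marker like the '1987[11]' entry seen in the wheelchair table. `extract_year` is a hypothetical helper, shown here with the standard library only.

```python
import re


def extract_year(text):
    """Return the first standalone four-digit number in text as an int, else None."""
    match = re.search(r'\b(\d{4})\b', str(text))
    return int(match.group(1)) if match else None


print(extract_year('September 25, 1977'))
print(extract_year('1987[11]'))
print(extract_year('—'))
```

Unlike get_year, this returns an integer, so the result would need converting with str() if a string index is wanted.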


We can then set the index of the elites table as the year and remove the date column.

In [613]:
def set_index(df):
    df=df.set_index('Year')
    df=df.drop(['Date'],axis=1)
    return(df)

In [614]:
all_elite_Chicago=set_index(all_elite_Chicago)
all_elite_Chicago.head()

Unnamed: 0_level_0,Male athlete,Male Country,Male Time,Female athlete,Female Country,Female Time
Year,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
1977,Dan Cloeter,United States,02:17:52,Dorothy Doolittle,United States,02:50:47
1978,Mark Stanforth,United States,02:19:20,Lynae Larson,United States,02:59:25
1979,Dan Cloeter,United States,02:23:20,Laura Michalek,United States,03:15:45
1980,Frank Richardson,United States,02:14:04,Sue Peterson,United States,02:45:03
1981,Phil Coppess,United States,02:16:13,Tina Gandy,United States,02:49:39


We can now finally split the table into male and female.

In [615]:
male_elite_Chicago=all_elite_Chicago[['Male athlete','Male Country','Male Time']]
male_elite_Chicago=male_elite_Chicago.rename(index=str, columns={"Male athlete": "Athlete", "Male Country": "Country", "Male Time":"Time"})
male_elite_Chicago.head()

Unnamed: 0_level_0,Athlete,Country,Time
Year,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
1977,Dan Cloeter,United States,02:17:52
1978,Mark Stanforth,United States,02:19:20
1979,Dan Cloeter,United States,02:23:20
1980,Frank Richardson,United States,02:14:04
1981,Phil Coppess,United States,02:16:13


In [616]:
female_elite_Chicago=all_elite_Chicago[['Female athlete','Female Country','Female Time']]
female_elite_Chicago=female_elite_Chicago.rename(index=str, columns={"Female athlete": "Athlete", "Female Country": "Country", "Female Time":"Time"})
female_elite_Chicago.head()

Unnamed: 0_level_0,Athlete,Country,Time
Year,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
1977,Dorothy Doolittle,United States,02:50:47
1978,Lynae Larson,United States,02:59:25
1979,Laura Michalek,United States,03:15:45
1980,Sue Peterson,United States,02:45:03
1981,Tina Gandy,United States,02:49:39


### Cleaning up - Wheelchair table

In [617]:
all_wheel_Chicago

Unnamed: 0,0,1,2,3,4,5,6
0,Date,Male athlete,Country,Time,Female athlete,Country,Time
1,1984,Robert Fitch,United States,2:35:06,Jonnie Baylark,United States,3:29:10
2,1985,Robert Fitch,United States,2:23:41,Jayne Fortson,United States,2:52:22
3,1986,Bart Bardwell,United States,2:10:19,Jonnie Baylark,United States,3:23:32
4,1987[11],—,—,—,—,—,—
5,1988,Ken Luckenbaugh,United States,2:12:17,—,—,—


The wheelchair race results for Chicago are admittedly incomplete on Wikipedia, but we'll tidy up what we have here. We'll first remove the blank 1987 row, as that year's race was contested as a half marathon. We'll also replace the dash placeholders with NaN, which blanks out the missing result for the women's race in 1988.

In [618]:
all_wheel_Chicago=all_wheel_Chicago.drop([4]).reset_index(drop=True)
all_wheel_Chicago=all_wheel_Chicago.replace('—', np.nan)
all_wheel_Chicago

Unnamed: 0,0,1,2,3,4,5,6
0,Date,Male athlete,Country,Time,Female athlete,Country,Time
1,1984,Robert Fitch,United States,2:35:06,Jonnie Baylark,United States,3:29:10
2,1985,Robert Fitch,United States,2:23:41,Jayne Fortson,United States,2:52:22
3,1986,Bart Bardwell,United States,2:10:19,Jonnie Baylark,United States,3:23:32
4,1988,Ken Luckenbaugh,United States,2:12:17,,,


We can now standardise the table in the same way as with the elites table and set the date column as the index.

In [619]:
all_wheel_Chicago=standardise_table(all_wheel_Chicago)
all_wheel_Chicago=all_wheel_Chicago.set_index('Date')
all_wheel_Chicago

Unnamed: 0_level_0,Male athlete,Male Country,Male Time,Female athlete,Female Country,Female Time
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
1984,Robert Fitch,United States,02:35:06,Jonnie Baylark,United States,03:29:10
1985,Robert Fitch,United States,02:23:41,Jayne Fortson,United States,02:52:22
1986,Bart Bardwell,United States,02:10:19,Jonnie Baylark,United States,03:23:32
1988,Ken Luckenbaugh,United States,02:12:17,,,


And now split between male and female.

In [621]:
male_wheel_Chicago=all_wheel_Chicago[['Male athlete','Male Country','Male Time']]
male_wheel_Chicago=male_wheel_Chicago.rename(index=str, columns={"Male athlete": "Athlete", "Male Country": "Country", "Male Time":"Time"})
male_wheel_Chicago

Unnamed: 0_level_0,Athlete,Country,Time
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
1984,Robert Fitch,United States,02:35:06
1985,Robert Fitch,United States,02:23:41
1986,Bart Bardwell,United States,02:10:19
1988,Ken Luckenbaugh,United States,02:12:17


In [626]:
female_wheel_Chicago=all_wheel_Chicago[['Female athlete','Female Country','Female Time']]
female_wheel_Chicago=female_wheel_Chicago.drop(['1988'])
female_wheel_Chicago=female_wheel_Chicago.rename(index=str, columns={"Female athlete": "Athlete", "Female Country": "Country", "Female Time":"Time"})
female_wheel_Chicago

Unnamed: 0_level_0,Athlete,Country,Time
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
1984,Jonnie Baylark,United States,03:29:10
1985,Jayne Fortson,United States,02:52:22
1986,Jonnie Baylark,United States,03:23:32


## New York City Marathon