# History of the Olympics  
_(credits to Prof. Rafael Moreira)_

After a one-year delay caused by the Covid-19 pandemic, the whole world turned its attention to Tokyo, Japan, to follow another edition of the Olympics.

Brazil was no different: many people came together to cheer for our athletes in a range of competitions, both in sports where Brazil already has a tradition and in newer ones.

Let's take advantage of the mood to study the Olympics a little! We will use a _dataset_ with 120 years of historical Olympic data, covering the games from Athens 1896 through Rio 2016.

Download the _dataset_ from https://www.kaggle.com/heesoo37/120-years-of-olympic-history-athletes-and-results and load the file ```athlete_events.csv``` into a DataFrame using Pandas. Take the opportunity to explore your DataFrame and get familiar with its structure.

NOTE: Feel free to add more Python cells as needed at any stage of the exercise.

## 1. Brazil at the Olympics

Let's start by studying the performance of our own country. Create a new DataFrame containing only the information about Brazilian athletes.

In [14]:
!pip install pandas



In [15]:
import pandas as pd

In [16]:
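# Load the full dataset (path is relative to the notebook).
df = pd.read_csv('../data/athlete_events.csv')

# Keep rows whose team is exactly 'Brazil'. Team-name variants (the dataset has names such as 'China-3')
# are not matched; filtering on df['NOC'] == 'BRA' would include them.
brazil = df.loc[df['Team'] == 'Brazil']

brazil.head()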

Unnamed: 0,ID,Name,Sex,Age,Height,Weight,Team,NOC,Games,Year,Season,City,Sport,Event,Medal
665,386,Alexandre Abeid,M,22.0,194.0,92.0,Brazil,BRA,1972 Summer,1972,Summer,Munich,Volleyball,Volleyball Men's Volleyball,
666,386,Alexandre Abeid,M,26.0,194.0,92.0,Brazil,BRA,1976 Summer,1976,Summer,Montreal,Volleyball,Volleyball Men's Volleyball,
668,388,Abel Carlos da Silva Braga,M,19.0,190.0,73.0,Brazil,BRA,1972 Summer,1972,Summer,Munich,Football,Football Men's Football,
781,451,Diana Monteiro Abla,F,21.0,175.0,75.0,Brazil,BRA,2016 Summer,2016,Summer,Rio de Janeiro,Water Polo,Water Polo Women's Water Polo,
1005,565,Glauclio Serro Abreu,M,26.0,185.0,75.0,Brazil,BRA,2004 Summer,2004,Summer,Athina,Boxing,Boxing Men's Middleweight,


### Medalists

Let's focus on Brazil's success stories. Use your previous DataFrame to filter only the information about Brazilian **medalists**.

**HINT:** look at how the ```Medal``` column is represented when the athlete did not win a medal.

In [17]:
brazil_with_medals = brazil.dropna(subset = ['Medal'])

brazil_with_medals.head()

Unnamed: 0,ID,Name,Sex,Age,Height,Weight,Team,NOC,Games,Year,Season,City,Sport,Event,Medal
1651,918,Ademir Roque Kaefer,M,24.0,179.0,74.0,Brazil,BRA,1984 Summer,1984,Summer,Los Angeles,Football,Football Men's Football,Silver
1652,918,Ademir Roque Kaefer,M,28.0,179.0,74.0,Brazil,BRA,1988 Summer,1988,Summer,Seoul,Football,Football Men's Football,Silver
1668,925,Adenzia Aparecida Ferreira da Silva,F,25.0,187.0,65.0,Brazil,BRA,2012 Summer,2012,Summer,London,Volleyball,Volleyball Women's Volleyball,Gold
1733,966,Daniel Adler,M,26.0,180.0,72.0,Brazil,BRA,1984 Summer,1984,Summer,Los Angeles,Sailing,Sailing Mixed Three Person Keelboat,Silver
1856,1020,Adriana Aparecida dos Santos,F,25.0,180.0,61.0,Brazil,BRA,1996 Summer,1996,Summer,Atlanta,Basketball,Basketball Women's Basketball,Silver


### Summer vs Winter

You may have noticed that there are two distinct categories of Olympic Games, identified by season: the Summer Games and the Winter Games, which alternate with each other.

Now that we know the Brazilian medalists, answer: how many Brazilian athletes won medals at the Summer Games and how many at the Winter Games?

In [18]:
brazil_with_medals.groupby(['Season']).size()

Season
Summer    449
dtype: int64
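
Note that `size()` counts medal rows, so an athlete with several medals is counted more than once. If the question is read as asking for distinct athletes, one possible approach is to count unique `ID` values instead. A minimal sketch on invented toy rows (not taken from the dataset):

```python
import pandas as pd

# Toy rows invented for illustration: athlete 1 won two medals at Summer Games.
toy = pd.DataFrame({
    "ID": [1, 1, 2, 3],
    "Season": ["Summer", "Summer", "Summer", "Winter"],
    "Medal": ["Gold", "Silver", "Bronze", "Bronze"],
})

# size() counts medal rows per season.
rows_per_season = toy.groupby("Season").size()

# nunique() counts distinct athletes per season.
athletes_per_season = toy.groupby("Season")["ID"].nunique()

print(rows_per_season)
print(athletes_per_season)
```

On the real data the same two lines would be applied to `brazil_with_medals`; the two counts differ whenever an athlete has more than one medal.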

The Summer Games are far more popular in Brazil than the Winter Games. From this point on we will therefore focus only on the Summer Games. Drop the Winter Games data from your DataFrame.



In [19]:
brazil_with_medals_summer = brazil_with_medals.loc[brazil_with_medals['Season'] == 'Summer']

brazil_with_medals_summer.head()

Unnamed: 0,ID,Name,Sex,Age,Height,Weight,Team,NOC,Games,Year,Season,City,Sport,Event,Medal
1651,918,Ademir Roque Kaefer,M,24.0,179.0,74.0,Brazil,BRA,1984 Summer,1984,Summer,Los Angeles,Football,Football Men's Football,Silver
1652,918,Ademir Roque Kaefer,M,28.0,179.0,74.0,Brazil,BRA,1988 Summer,1988,Summer,Seoul,Football,Football Men's Football,Silver
1668,925,Adenzia Aparecida Ferreira da Silva,F,25.0,187.0,65.0,Brazil,BRA,2012 Summer,2012,Summer,London,Volleyball,Volleyball Women's Volleyball,Gold
1733,966,Daniel Adler,M,26.0,180.0,72.0,Brazil,BRA,1984 Summer,1984,Summer,Los Angeles,Sailing,Sailing Mixed Three Person Keelboat,Silver
1856,1020,Adriana Aparecida dos Santos,F,25.0,180.0,61.0,Brazil,BRA,1996 Summer,1996,Summer,Atlanta,Basketball,Basketball Women's Basketball,Silver


### Brazil's Athletes

Let's get to know our athletes a little better. Find the average height and weight of our medalists.

We would expect different sports to favor different body types, right? So redo the previous analysis, this time computing the averages **per sport**.

In [20]:
print(f'Average height: {round(brazil_with_medals_summer["Height"].mean(), 2)} cm')
print(f'Average weight: {round(brazil_with_medals_summer["Weight"].mean(), 2)} kg')

Average height: 182.36 cm
Average weight: 76.62 kg


In [21]:
brazil_with_medals_summer_groupby_sport_height = round(brazil_with_medals_summer.groupby(["Sport"])["Height"].mean(), 2)

print(f'Average height by sport:\n {brazil_with_medals_summer_groupby_sport_height}')

Average height by sport:
 Sport
Athletics            181.00
Basketball           185.61
Boxing               170.00
Canoeing             175.00
Equestrianism        179.67
Football             175.80
Gymnastics           162.75
Judo                 176.67
Modern Pentathlon    166.00
Sailing              181.59
Shooting             175.00
Swimming             189.11
Taekwondo            184.00
Volleyball           190.59
Name: Height, dtype: float64


In [22]:
brazil_with_medals_summer_groupby_sport_weight = round(brazil_with_medals_summer.groupby(["Sport"])["Weight"].mean(), 2)

print(f'Average weight by sport:\n {brazil_with_medals_summer_groupby_sport_weight}')

Average weight by sport:
 Sport
Athletics            74.58
Basketball           78.48
Boxing               64.00
Canoeing             83.25
Equestrianism        75.00
Football             69.96
Gymnastics           63.75
Judo                 86.29
Modern Pentathlon    55.00
Sailing              80.41
Shooting             69.00
Swimming             81.56
Taekwondo            79.50
Volleyball           81.17
Name: Weight, dtype: float64


Did the figures above merely shape athletes' general interest in each sport, or did they actually affect performance? We can try to find out whether there is any correlation.

Do you still have the original DataFrame with all Brazilian athletes, including those without medals? Compute the average weight and height per sport from that DataFrame and compare them with the medalists' averages. Is there a significant difference in any sport?

In [23]:
brazil_summer = brazil.loc[brazil['Season'] == 'Summer']

brazil_summer_groupby_sport_weight = round(brazil_summer.groupby(["Sport"])["Weight"].mean(), 2)

# brazil_summer holds every Brazilian summer athlete (medalists included), so it is labelled accordingly.
series = {
    'With Medals': brazil_with_medals_summer_groupby_sport_weight,
    'All Athletes': brazil_summer_groupby_sport_weight
}

pd.DataFrame(series)

Sport,With Medals,All Athletes
Archery,,71.7
Art Competitions,,
Athletics,74.58,67.8
Badminton,,74.0
Basketball,78.48,85.9
Boxing,64.0,64.11
Canoeing,83.25,77.4
Cycling,,66.87
Diving,,64.97
Equestrianism,75.0,72.31


In [24]:
brazil_summer = brazil.loc[brazil['Season'] == 'Summer']

brazil_summer_groupby_sport_height = round(brazil_summer.groupby(["Sport"])["Height"].mean(), 2)

# brazil_summer holds every Brazilian summer athlete (medalists included), so it is labelled accordingly.
series = {
    'With Medals': brazil_with_medals_summer_groupby_sport_height,
    'All Athletes': brazil_summer_groupby_sport_height
}

pd.DataFrame(series)

Sport,With Medals,All Athletes
Archery,,172.6
Art Competitions,,
Athletics,181.0,176.2
Badminton,,175.5
Basketball,185.61,190.91
Boxing,170.0,171.99
Canoeing,175.0,177.79
Cycling,,174.24
Diving,,167.44
Equestrianism,179.67,177.43
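
The `brazil_summer` averages above include the medalists themselves. To compare medalists directly with non-medalists, one option is to flag each row and group on that flag. A sketch with made-up numbers; the `Medalist` column is a helper introduced here for illustration only:

```python
import pandas as pd

# Invented rows for illustration; None stands for "no medal" (NaN in the real CSV).
toy = pd.DataFrame({
    "Sport": ["Judo", "Judo", "Judo", "Sailing", "Sailing"],
    "Height": [170.0, 180.0, 175.0, 182.0, 178.0],
    "Medal": ["Gold", None, None, "Silver", None],
})

# Hypothetical helper column: True when the row carries a medal.
toy["Medalist"] = toy["Medal"].notna()

# Mean height per sport, with one column per group.
comparison = (
    toy.groupby(["Sport", "Medalist"])["Height"]
    .mean()
    .unstack("Medalist")
    .rename(columns={True: "With Medals", False: "Without Medals"})
)

# Positive values mean medalists are taller on average in that sport.
comparison["Difference"] = comparison["With Medals"] - comparison["Without Medals"]
print(comparison)
```

The same pattern works for `Weight`. Sports with very few medalists will produce noisy differences, so small gaps should not be over-interpreted.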


So far our analysis has overlooked an important detail: sporting categories are usually split by gender precisely because of physical differences between men and women that could affect performance. Compare the average height and weight of Brazilian athletes per sport, segmented by sex.

In [25]:
brazil_summer_groupby_sport_and_sex_height = round(brazil_summer.groupby(["Sport", "Sex"])["Height"].mean(), 2)
brazil_summer_groupby_sport_and_sex_height

Sport             Sex
Archery           F      162.86
                  M      177.85
Art Competitions  M         NaN
Athletics         F      167.49
                  M      180.07
                          ...  
Water Polo        M      181.42
Weightlifting     F      161.50
                  M      171.78
Wrestling         F      169.33
                  M      182.14
Name: Height, Length: 63, dtype: float64

In [26]:
brazil_summer_groupby_sport_and_sex_weight = round(brazil_summer.groupby(["Sport", "Sex"])["Weight"].mean(), 2)
brazil_summer_groupby_sport_and_sex_weight

Sport             Sex
Archery           F      59.71
                  M      78.15
Art Competitions  M        NaN
Athletics         F      59.81
                  M      71.55
                         ...  
Water Polo        M      85.72
Weightlifting     F      62.75
                  M      90.00
Wrestling         F      66.67
                  M      97.29
Name: Weight, Length: 63, dtype: float64
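
Height and weight can also be computed in a single pass, which gives one table per sport and sex instead of two separate Series. A small sketch with invented rows:

```python
import pandas as pd

# Invented rows for illustration only.
toy = pd.DataFrame({
    "Sport": ["Judo", "Judo", "Judo", "Judo"],
    "Sex": ["F", "F", "M", "M"],
    "Height": [160.0, 164.0, 178.0, 182.0],
    "Weight": [57.0, 63.0, 81.0, 90.0],
})

# Select both numeric columns before aggregating.
by_sport_and_sex = toy.groupby(["Sport", "Sex"])[["Height", "Weight"]].mean().round(2)

# Move Sex into the columns so F and M sit side by side for each sport.
wide = by_sport_and_sex.unstack("Sex")
print(wide)
```

The wide layout makes it easier to scan one sport at a time when comparing the two groups.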

Who was the top Brazilian medalist (or who were the top medalists) by total number of medals?

In [27]:
# Compute the per-athlete counts once, then keep everyone tied at the maximum.
medal_counts = brazil_with_medals['Name'].value_counts()
medal_counts[medal_counts == medal_counts.max()]

Torben Schmidt Grael    5
Robert Scheidt          5
Name: Name, dtype: int64

And the top one(s) by number of gold medals?

In [28]:
brazil_with_medals_only_golds = brazil_with_medals[brazil_with_medals['Medal'] == 'Gold']

# Same pattern: count golds per athlete and keep everyone tied at the maximum.
gold_counts = brazil_with_medals_only_golds['Name'].value_counts()
gold_counts[gold_counts == gold_counts.max()]

Jaqueline Maria "Jaque" Pereira de Carvalho Endres    2
Paula Renata Marques Pequeno                          2
Giovane Farinazzo Gvio                                2
Adhemar Ferreira da Silva                             2
Torben Schmidt Grael                                  2
Marcelo Bastos Ferreira                               2
Fabiana Marcelino Claudino                            2
Maurcio Camargo Lima                                  2
Robert Scheidt                                        2
Sheilla Tavares de Castro Blassioli                   2
Thasa Daher de Menezes                                2
Srgio "Escadinha" Dutra dos Santos                    2
Fabiana "Fabi" Alvim de Oliveira                      2
Name: Name, dtype: int64

Which sport has yielded the most gold medals for Brazil? And which has yielded the most medals overall?

**HINT:** be very careful with this analysis: each **sporting event** yields 1 medal. For example, when the football team wins, that counts as 1 medal, even though there are around 20 medal-winning athletes on the team.

In [30]:
brazil_with_medals_groupby_sport = brazil_with_medals.groupby(by=['Sport', 'Event', 'Year']).size()

brazil_with_medals_groupby_sport_cumcount = brazil_with_medals_groupby_sport.groupby('Sport').cumcount() + 1

brazil_with_medals_groupby_sport_cumcount.max()


22
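
The cell above returns only the highest count, not the sport it belongs to. One way to apply the hint is to keep a single row per (year, event, medal) before counting. The sketch below uses invented rows and assumes a country wins a given medal colour at most once per event and edition, which may not hold in every case:

```python
import pandas as pd

# Invented rows: three footballers share one team gold; two separate judo medals.
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Year": [2016, 2016, 2016, 2012, 2016],
    "Sport": ["Football", "Football", "Football", "Judo", "Judo"],
    "Event": ["Football Men's Football"] * 3
             + ["Judo Men's Heavyweight", "Judo Women's Lightweight"],
    "Medal": ["Gold", "Gold", "Gold", "Bronze", "Gold"],
})

# Collapse team members into one medal per (year, event, medal colour).
event_medals = toy.drop_duplicates(subset=["Year", "Event", "Medal"])

# Medals per sport, all colours and gold only.
medals_by_sport = event_medals["Sport"].value_counts()
golds_by_sport = event_medals.loc[event_medals["Medal"] == "Gold", "Sport"].value_counts()

print(medals_by_sport.idxmax(), medals_by_sport.max())
print(golds_by_sport)
```

Because `idxmax()` returns a single label, ties are better inspected by filtering on the maximum, as in the top-medalist cells earlier.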

Each "category" within a sport is considered an event. For example, within "athletics" there is a men's 100 m race, a women's 100 m race, a men's 4 x 100 m relay, a women's 4 x 100 m relay, a men's 400 m race, a women's 400 m race, a men's marathon, a women's marathon, and so on.

With that in mind, which sporting event has yielded the most gold medals for Brazil? And the most medals overall?

In [439]:
brazil_with_medals_groupby_event_and_year = brazil_with_medals.groupby(by=['Event', 'Year']).size()

brazil_with_medals_groupby_event_cumcount = brazil_with_medals_groupby_event_and_year.groupby('Event').cumcount() + 1

max_event = brazil_with_medals_groupby_event_cumcount.max()

max_events = brazil_with_medals_groupby_event_cumcount[brazil_with_medals_groupby_event_cumcount == max_event]

max_events_names = [event_name[0] for event_name in max_events.index]

max_events_names

["Athletics Men's Triple Jump",
 "Football Men's Football",
 "Volleyball Men's Volleyball"]

To wrap up the Brazil section: compute the number of gold, silver, and bronze medals, and the overall total, per year.

In [440]:
brazil_with_medals_by_gold = brazil_with_medals[(brazil_with_medals.Medal == 'Gold')]

brazil_with_medals_by_gold = brazil_with_medals_by_gold.groupby(['Sport', 'Event', 'Year'])['Medal'].size()

brazil_with_medals_by_gold = brazil_with_medals_by_gold.groupby('Sport').cumcount() + 1

print(f'Number of gold medals for Brazil: {len(brazil_with_medals_by_gold)}')

Number of gold medals for Brazil: 27


In [441]:
brazil_with_medals_by_silver = df.loc[(df['Team'] == 'Brazil') & (df['Medal'] == 'Silver')]

brazil_with_medals_by_silver = brazil_with_medals_by_silver.groupby(['Sport', 'Event', 'Year'])['Medal'].size()

brazil_with_medals_by_silver = brazil_with_medals_by_silver.groupby('Sport').cumcount() + 1

print(f'Number of silver medals for Brazil: {len(brazil_with_medals_by_silver)}')

Number of silver medals for Brazil: 29


In [442]:
brazil_with_medals_by_bronze = df.loc[(df['Team'] == 'Brazil') & (df['Medal'] == 'Bronze')]

brazil_with_medals_by_bronze = brazil_with_medals_by_bronze.groupby(['Sport', 'Event', 'Year'])['Medal'].size()

brazil_with_medals_by_bronze = brazil_with_medals_by_bronze.groupby('Sport').cumcount() + 1

print(f'Number of bronze medals for Brazil: {len(brazil_with_medals_by_bronze)}')

Number of bronze medals for Brazil: 59
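
The three cells above give all-time totals. For the per-year breakdown the exercise asks for, a cross-tabulation is one option. A sketch on invented rows, assuming one medal per (year, event, medal colour):

```python
import pandas as pd

# Invented rows: "Team A" in 2012 has two athletes sharing one gold.
toy = pd.DataFrame({
    "Year":  [2012, 2012, 2012, 2016, 2016, 2016],
    "Event": ["Team A", "Team A", "Solo B", "Team A", "Solo C", "Solo D"],
    "Medal": ["Gold", "Gold", "Bronze", "Silver", "Gold", "Gold"],
})

# One row per medal actually awarded.
event_medals = toy.drop_duplicates(subset=["Year", "Event", "Medal"])

# Rows are years, columns are medal colours.
medals_by_year = pd.crosstab(event_medals["Year"], event_medals["Medal"])

# Fix the column order and make sure every colour is present.
medals_by_year = medals_by_year.reindex(columns=["Gold", "Silver", "Bronze"], fill_value=0)
medals_by_year["Total"] = medals_by_year.sum(axis=1)
print(medals_by_year)
```

The `reindex` step guards against years in which a colour is missing entirely, so the table always has the same four columns.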


## 2. The World at the Summer Games

Let's now look at a little of what happened at the Summer Olympics around the world.

Go back to the original DataFrame and drop the information about the Winter Games.

In [443]:
summer = df.loc[df['Season'] == 'Summer']

summer.head()

Unnamed: 0,ID,Name,Sex,Age,Height,Weight,Team,NOC,Games,Year,Season,City,Sport,Event,Medal
0,1,A Dijiang,M,24.0,180.0,80.0,China,CHN,1992 Summer,1992,Summer,Barcelona,Basketball,Basketball Men's Basketball,
1,2,A Lamusi,M,23.0,170.0,60.0,China,CHN,2012 Summer,2012,Summer,London,Judo,Judo Men's Extra-Lightweight,
2,3,Gunnar Nielsen Aaby,M,24.0,,,Denmark,DEN,1920 Summer,1920,Summer,Antwerpen,Football,Football Men's Football,
3,4,Edgar Lindenau Aabye,M,34.0,,,Denmark/Sweden,DEN,1900 Summer,1900,Summer,Paris,Tug-Of-War,Tug-Of-War Men's Tug-Of-War,Gold
26,8,"Cornelia ""Cor"" Aalten (-Strannood)",F,18.0,168.0,,Netherlands,NED,1932 Summer,1932,Summer,Los Angeles,Athletics,Athletics Women's 100 metres,


Get the list of all sports ever contested at the Summer Olympics.

In [444]:
summer['Sport'].unique()

array(['Basketball', 'Judo', 'Football', 'Tug-Of-War', 'Athletics',
       'Swimming', 'Badminton', 'Sailing', 'Gymnastics',
       'Art Competitions', 'Handball', 'Weightlifting', 'Wrestling',
       'Water Polo', 'Hockey', 'Rowing', 'Fencing', 'Equestrianism',
       'Shooting', 'Boxing', 'Taekwondo', 'Cycling', 'Diving', 'Canoeing',
       'Tennis', 'Modern Pentathlon', 'Golf', 'Softball', 'Archery',
       'Volleyball', 'Synchronized Swimming', 'Table Tennis', 'Baseball',
       'Rhythmic Gymnastics', 'Rugby Sevens', 'Trampolining',
       'Beach Volleyball', 'Triathlon', 'Rugby', 'Lacrosse', 'Polo',
       'Cricket', 'Ice Hockey', 'Racquets', 'Motorboating', 'Croquet',
       'Figure Skating', 'Jeu De Paume', 'Roque', 'Basque Pelota',
       'Alpinism', 'Aeronautics'], dtype=object)

Get the list of all events ever contested at the Summer Olympics.

In [445]:
summer['Event'].unique()

array(["Basketball Men's Basketball", "Judo Men's Extra-Lightweight",
       "Football Men's Football", "Tug-Of-War Men's Tug-Of-War",
       "Athletics Women's 100 metres",
       "Athletics Women's 4 x 100 metres Relay",
       "Swimming Men's 400 metres Freestyle", "Badminton Men's Singles",
       "Sailing Women's Windsurfer",
       "Swimming Men's 200 metres Breaststroke",
       "Swimming Men's 400 metres Breaststroke",
       "Gymnastics Men's Individual All-Around",
       "Gymnastics Men's Team All-Around",
       "Gymnastics Men's Floor Exercise", "Gymnastics Men's Horse Vault",
       "Gymnastics Men's Parallel Bars",
       "Gymnastics Men's Horizontal Bar", "Gymnastics Men's Rings",
       "Gymnastics Men's Pommelled Horse", "Athletics Men's Shot Put",
       'Art Competitions Mixed Sculpturing, Unknown Event',
       "Handball Women's Handball",
       "Weightlifting Women's Super-Heavyweight",
       "Wrestling Men's Light-Heavyweight, Greco-Roman",
       "Gymnastics M

Get the list of all countries that have ever competed at the Olympics.

In [446]:
summer['Team'].unique()

array(['China', 'Denmark', 'Denmark/Sweden', ..., 'Dow Jones', 'China-3',
       'Digby'], dtype=object)

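Which athlete is the top medalist (by total medals) in the history of the Summer Olympics?

In [447]:
# Count only rows that carry a medal; counting every row would rank athletes by number of entries instead.
summer.dropna(subset=['Medal'])['Name'].value_counts().idxmax()

'Michael Fred Phelps, II'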

Which athlete is the top gold medalist in the history of the Summer Olympics?

In [448]:
summer_golds = summer.loc[summer['Medal'] == 'Gold']

summer_golds['Name'].value_counts().idxmax()

'Michael Fred Phelps, II'

Which country is the top gold medalist in the history of the Summer Olympics? Remember the point about sporting events, so that a single event is not counted as multiple medals (e.g. a football team making it look as though more than 20 medals were awarded).

In [449]:
summer_golds_by_event = summer_golds.groupby(['Sport', 'Event', 'Year', 'Team'])['Medal'].count()

summer_golds_by_event = summer_golds_by_event.groupby('Team').cumcount() + 1

print(f'Country: {summer_golds_by_event.idxmax()[3]} | Gold medals: {summer_golds_by_event.max()}')

Country: United States | Gold medals: 997
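
The one-medal-per-event idea extends to countries by adding the country code to the deduplication key. A sketch with invented rows; using `NOC` rather than `Team` is a choice made here so that team-name variants collapse into one country, and it is not what the cell above does:

```python
import pandas as pd

# Invented rows: two relay runners share one gold; two footballers share another.
toy = pd.DataFrame({
    "NOC":   ["USA", "USA", "USA", "BRA", "BRA"],
    "Year":  [2016, 2016, 2016, 2016, 2016],
    "Event": ["Relay", "Relay", "Sprint", "Football", "Football"],
    "Medal": ["Gold", "Gold", "Gold", "Gold", "Gold"],
})

# Keep golds only, then one row per (country, year, event).
golds = toy[toy["Medal"] == "Gold"].drop_duplicates(subset=["NOC", "Year", "Event"])

# Tally golds per country and pick the leader.
golds_by_country = golds["NOC"].value_counts()
top_country = golds_by_country.idxmax()
print(top_country, golds_by_country.max())
```

Dropping the `Medal` filter gives the all-medal tally, provided `Medal` is then added to the deduplication key.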


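Which country has won the most medals overall in the history of the Summer Olympics?

In [450]:
# Note: this starts from df, so Winter Games rows are included;
# summer.dropna(subset=['Medal']) would restrict the tally to the Summer Games.
summer_medals = df.dropna(subset = ['Medal'])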

summer_medals_by_event = summer_medals.groupby(['Sport', 'Event', 'Year', 'Team'])['Medal'].count()

summer_medals_by_event_by_team = summer_medals_by_event.groupby('Team').cumcount() + 1

print(f'Country: {summer_medals_by_event_by_team.idxmax()[3]} | Total medals: {summer_medals_by_event_by_team.max()}')

Country: United States | Total medals: 2026


Compute the number of gold and silver medals and the overall total for each edition of the Summer Olympics. Remember the point about sporting events.

In [451]:
summer_medals_by_event = summer_medals.groupby(['Sport', 'Event', 'Year', 'Medal']).count()

summer_medals_by_event_by_medal = summer_medals_by_event.groupby(['Medal']).cumcount() + 1

summer_medals_by_event_by_medal.groupby(level=3).apply(max)

Medal
Bronze    6046
Gold      6146
Silver    6101
dtype: int64

## 3. Brazil vs the World

To wrap up, let's make some comparisons between Brazil and the rest of the world. What was Brazil's ranking in each edition of the Olympics? Remember that the ranking is ordered by gold medals.

In [452]:
summer_medals_golds = summer_medals.loc[summer_medals['Medal'] == 'Gold']

summer_medals_golds_by_event = summer_medals_golds.groupby(by = ['Sport', 'Event', 'Year', 'Team'])['Medal'].count()

summer_medals_golds_by_event_by_team = summer_medals_golds_by_event.groupby(['Team']).cumcount() + 1

summer_medals_golds_by_event_by_team = summer_medals_golds_by_event_by_team.groupby(level=3).apply(max)

summer_medals_golds_by_event_by_team_sorted = summer_medals_golds_by_event_by_team.sort_values(ascending=False)

# Series.items() replaces iteritems(), which was removed in pandas 2.0.
brazil_position = [(index + 1) for (index, (team, medals)) in enumerate(summer_medals_golds_by_event_by_team_sorted.items()) if team == 'Brazil'][0]

print(f"Brazil's position by number of gold medals: {brazil_position}")

Brazil's position by number of gold medals: 36
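
The cell above yields a single all-time position. For a ranking in each edition, gold counts can be ranked within each year. A sketch with invented counts; the `Golds` column stands in for an event-level tally, and ties share the best rank here (`method="min"`), which is an assumption about how ties should be handled:

```python
import pandas as pd

# Invented gold counts per country and edition (not real results).
golds = pd.DataFrame({
    "Year":  [2012, 2012, 2012, 2016, 2016, 2016],
    "NOC":   ["USA", "CHN", "BRA", "USA", "GBR", "BRA"],
    "Golds": [10, 8, 2, 9, 9, 4],
})

# Rank within each edition; more golds means a better (smaller) rank.
golds["Rank"] = (
    golds.groupby("Year")["Golds"]
    .rank(method="min", ascending=False)
    .astype(int)
)

# Brazil's position in each edition.
brazil_rank = golds.loc[golds["NOC"] == "BRA"].set_index("Year")["Rank"]
print(brazil_rank)
```

Countries with no gold in a given edition do not appear in such a tally at all, so they would need to be added with a count of zero before ranking.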


Compare Brazil's top gold medalist with the world's top gold medalist.

In [453]:
summer_golds_brazil = summer.loc[(summer['Medal'] == 'Gold') & (summer['Team'] == 'Brazil')]

biggest_gold_medalist_in_the_world = (summer_golds['Name'].value_counts().idxmax(), summer_golds['Name'].value_counts().max())

biggest_gold_medalist_in_brazil = (summer_golds_brazil['Name'].value_counts().idxmax(), summer_golds_brazil['Name'].value_counts().max())

print(biggest_gold_medalist_in_the_world)
print(biggest_gold_medalist_in_brazil)

('Michael Fred Phelps, II', 23)
('Marcelo Bastos Ferreira', 2)


Compare Brazil's top medalist by total medals with the world's top medalist by total medals.

In [456]:
summer_with_medals = summer.dropna(subset = ['Medal'])

# All Brazilian medal rows, not only golds, since this question is about total medals.
summer_brazil = summer_with_medals.loc[summer_with_medals['Team'] == 'Brazil']

biggest_medalist_in_the_world = (summer_with_medals['Name'].value_counts().idxmax(), summer_with_medals['Name'].value_counts().max())

biggest_medalist_in_brazil = (summer_brazil['Name'].value_counts().idxmax(), summer_brazil['Name'].value_counts().max())

print(biggest_medalist_in_the_world)
print(biggest_medalist_in_brazil)

('Michael Fred Phelps, II', 28)
('Torben Schmidt Grael', 5)


Compare Brazil's top gold medalist with the world's top gold medalist in the same sport.

In [610]:
summer_golds_brazil_by_name = summer_golds_brazil.groupby(['Sport'])['Name'].value_counts()

biggest_medalist_by_gold_in_brazil_by_sport_name = summer_golds_brazil_by_name.groupby(['Sport']).idxmax()

biggest_medalist_by_gold_in_brazil_by_sport_name = biggest_medalist_by_gold_in_brazil_by_sport_name.apply(lambda x: x[1])

biggest_medalist_by_gold_in_brazil_by_sport_name = biggest_medalist_by_gold_in_brazil_by_sport_name.reset_index()

biggest_medalist_by_gold_in_brazil_by_sport_medal = summer_golds_brazil_by_name.groupby(['Sport']).max()

biggest_medalist_by_gold_in_brazil_by_sport_medal = biggest_medalist_by_gold_in_brazil_by_sport_medal.reset_index()

biggest_medalist_by_gold_in_brazil_by_sport_medal.columns = ['Sport', 'Medal']

biggest_medalist_by_gold_in_brazil_by_sport = pd.merge(biggest_medalist_by_gold_in_brazil_by_sport_name, biggest_medalist_by_gold_in_brazil_by_sport_medal)

biggest_medalist_by_gold_in_brazil_by_sport

Unnamed: 0,Sport,Name,Medal
0,Athletics,Adhemar Ferreira da Silva,2
1,Boxing,Robson Donato Conceio,1
2,Equestrianism,Rodrigo de Paula Pessoa,1
3,Football,Douglas dos Santos Justino de Melo,1
4,Gymnastics,Arthur Nabarrete Zanetti,1
5,Judo,Aurlio Fernndez Miguel,1
6,Sailing,Marcelo Bastos Ferreira,2
7,Shooting,Guilherme Paraense,1
8,Swimming,Csar Augusto Cielo Filho,1
9,Volleyball,"Fabiana ""Fabi"" Alvim de Oliveira",2
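
The per-sport lookup above can be written more compactly by counting rows per (sport, athlete) and keeping the row with the highest count in each sport. A sketch on invented rows; on ties, `idxmax` keeps the first athlete it encounters:

```python
import pandas as pd

# Invented medal rows: one row per medal won.
toy = pd.DataFrame({
    "Sport": ["Sailing", "Sailing", "Sailing", "Judo", "Judo"],
    "Name":  ["Ana", "Ana", "Bia", "Caio", "Caio"],
})

# Medals per athlete within each sport, as a flat table.
counts = toy.groupby(["Sport", "Name"]).size().reset_index(name="Medal")

# For each sport, pick the row label with the largest count.
top_by_sport = counts.loc[counts.groupby("Sport")["Medal"].idxmax()].reset_index(drop=True)
print(top_by_sport)
```

This produces the same `Sport`, `Name`, `Medal` layout in one step, so no separate merge of names and counts is needed.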


In [617]:
summer_golds_world_by_name = summer_medals_golds.groupby(['Sport'])['Name'].value_counts()

biggest_medalist_by_gold_in_world_by_sport_name = summer_golds_world_by_name.groupby(['Sport']).idxmax()

biggest_medalist_by_gold_in_world_by_sport_name = biggest_medalist_by_gold_in_world_by_sport_name.apply(lambda x: x[1])

biggest_medalist_by_gold_in_world_by_sport_name = biggest_medalist_by_gold_in_world_by_sport_name.reset_index()

biggest_medalist_by_gold_in_world_by_sport_medal = summer_golds_world_by_name.groupby(['Sport']).max()

biggest_medalist_by_gold_in_world_by_sport_medal = biggest_medalist_by_gold_in_world_by_sport_medal.reset_index()

biggest_medalist_by_gold_in_world_by_sport_medal.columns = ['Sport', 'Medal']

biggest_medalist_by_gold_in_world_by_sport = pd.merge(biggest_medalist_by_gold_in_world_by_sport_name, biggest_medalist_by_gold_in_world_by_sport_medal)

biggest_medalist_by_gold_in_world_by_sport

Unnamed: 0,Sport,Name,Medal
0,Aeronautics,Hermann Schreiber,1
1,Alpine Skiing,Janica Kosteli,4
2,Alpinism,Antarge Sherpa,1
3,Archery,Gerard Theodor Hubert Van Innis,6
4,Art Competitions,Jean Lucien Nicolas Jacoby,2
...,...,...,...
61,Tug-Of-War,Edwin Archer Mills,2
62,Volleyball,Alejandrina Mireya Luis Hernndez,3
63,Water Polo,Charles Sydney Smith,3
64,Weightlifting,Akakios Kakiasvili,3


In [622]:
aux_biggest_medalist_by_gold_in_world_by_sport = biggest_medalist_by_gold_in_world_by_sport.copy()

aux_biggest_medalist_by_gold_in_world_by_sport.columns = ['Sport', 'Name (World)', 'Medal (World)']

aux_biggest_medalist_by_gold_in_brazil_by_sport = biggest_medalist_by_gold_in_brazil_by_sport.copy()

aux_biggest_medalist_by_gold_in_brazil_by_sport.columns = ['Sport', 'Name (Brazil)', 'Medal (Brazil)']

biggest_medalist_by_gold_comparation = pd.merge(aux_biggest_medalist_by_gold_in_world_by_sport, aux_biggest_medalist_by_gold_in_brazil_by_sport, how='left')

biggest_medalist_by_gold_comparation

Unnamed: 0,Sport,Name (World),Medal (World),Name (Brazil),Medal (Brazil)
0,Aeronautics,Hermann Schreiber,1,,
1,Alpine Skiing,Janica Kosteli,4,,
2,Alpinism,Antarge Sherpa,1,,
3,Archery,Gerard Theodor Hubert Van Innis,6,,
4,Art Competitions,Jean Lucien Nicolas Jacoby,2,,
...,...,...,...,...,...
61,Tug-Of-War,Edwin Archer Mills,2,,
62,Volleyball,Alejandrina Mireya Luis Hernndez,3,"Fabiana ""Fabi"" Alvim de Oliveira",2.0
63,Water Polo,Charles Sydney Smith,3,,
64,Weightlifting,Akakios Kakiasvili,3,,


Compare Brazil's top medalist by total medals with the world's top medalist in the same sport.

In [616]:
summer_medals_brazil = summer_medals.loc[summer_medals['Team'] == 'Brazil']

summer_medals_brazil_by_name = summer_medals_brazil.groupby(['Sport'])['Name'].value_counts()

biggest_medalist_in_brazil_by_sport_name = summer_medals_brazil_by_name.groupby(['Sport']).idxmax()

biggest_medalist_in_brazil_by_sport_name = biggest_medalist_in_brazil_by_sport_name.apply(lambda x: x[1])

biggest_medalist_in_brazil_by_sport_name = biggest_medalist_in_brazil_by_sport_name.reset_index()

biggest_medalist_in_brazil_by_sport_medal = summer_medals_brazil_by_name.groupby(['Sport']).max()

biggest_medalist_in_brazil_by_sport_medal = biggest_medalist_in_brazil_by_sport_medal.reset_index()

biggest_medalist_in_brazil_by_sport_medal.columns = ['Sport', 'Medal']

biggest_medalist_in_brazil_by_sport = pd.merge(biggest_medalist_in_brazil_by_sport_name, biggest_medalist_in_brazil_by_sport_medal)

biggest_medalist_in_brazil_by_sport

Unnamed: 0,Sport,Name,Medal
0,Athletics,Adhemar Ferreira da Silva,2
1,Basketball,Adriana Aparecida dos Santos,2
2,Boxing,Adriana dos Santos Arajo,1
3,Canoeing,Isaquias Queiroz dos Santos,3
4,Equestrianism,Rodrigo de Paula Pessoa,3
5,Football,Ademir Roque Kaefer,2
6,Gymnastics,Arthur Nabarrete Zanetti,2
7,Judo,Aurlio Fernndez Miguel,2
8,Modern Pentathlon,Yane Mrcia Campos da Fonseca Marques,1
9,Sailing,Robert Scheidt,5


In [623]:
summer_golds_world_by_name = summer_medals.groupby(['Sport'])['Name'].value_counts()

biggest_medalist_in_world_by_sport_name = summer_golds_world_by_name.groupby(['Sport']).idxmax()

biggest_medalist_in_world_by_sport_name = biggest_medalist_in_world_by_sport_name.apply(lambda x: x[1])

biggest_medalist_in_world_by_sport_name = biggest_medalist_in_world_by_sport_name.reset_index()

biggest_medalist_in_world_by_sport_medal = summer_golds_world_by_name.groupby(['Sport']).max()

biggest_medalist_in_world_by_sport_medal = biggest_medalist_in_world_by_sport_medal.reset_index()

biggest_medalist_in_world_by_sport_medal.columns = ['Sport', 'Medal']

biggest_medalist_in_world_by_sport = pd.merge(biggest_medalist_in_world_by_sport_name, biggest_medalist_in_world_by_sport_medal)

biggest_medalist_in_world_by_sport

Unnamed: 0,Sport,Name,Medal
0,Aeronautics,Hermann Schreiber,1
1,Alpine Skiing,Kjetil Andr Aamodt,8
2,Alpinism,Antarge Sherpa,1
3,Archery,Gerard Theodor Hubert Van Innis,10
4,Art Competitions,Alex Walter Diggelmann,3
...,...,...,...
61,Tug-Of-War,Edwin Archer Mills,3
62,Volleyball,Inna Valeryevna Ryskal,4
63,Water Polo,Dezs Gyarmati,5
64,Weightlifting,Nikolaj Pealov,4


In [624]:
aux_biggest_medalist_in_world_by_sport = biggest_medalist_in_world_by_sport.copy()

aux_biggest_medalist_in_world_by_sport.columns = ['Sport', 'Name (World)', 'Medal (World)']

aux_biggest_medalist_in_brazil_by_sport = biggest_medalist_in_brazil_by_sport.copy()

aux_biggest_medalist_in_brazil_by_sport.columns = ['Sport', 'Name (Brazil)', 'Medal (Brazil)']

biggest_medalist_comparation = pd.merge(aux_biggest_medalist_in_world_by_sport, aux_biggest_medalist_in_brazil_by_sport, how='left')

biggest_medalist_comparation

Unnamed: 0,Sport,Name (World),Medal (World),Name (Brazil),Medal (Brazil)
0,Aeronautics,Hermann Schreiber,1,,
1,Alpine Skiing,Kjetil Andr Aamodt,8,,
2,Alpinism,Antarge Sherpa,1,,
3,Archery,Gerard Theodor Hubert Van Innis,10,,
4,Art Competitions,Alex Walter Diggelmann,3,,
...,...,...,...,...,...
61,Tug-Of-War,Edwin Archer Mills,3,,
62,Volleyball,Inna Valeryevna Ryskal,4,"Srgio ""Escadinha"" Dutra dos Santos",4.0
63,Water Polo,Dezs Gyarmati,5,,
64,Weightlifting,Nikolaj Pealov,4,,


Compute the percentage of gold, silver, and bronze medals that Brazil won at each Olympics.

In [633]:
summer_medals_brazil_percentual = summer_medals_brazil.groupby(['Sport', 'Event', 'Year', 'Medal']).count()

summer_medals_brazil_percentual = summer_medals_brazil_percentual.groupby(['Medal']).cumcount() + 1

summer_medals_brazil_percentual = summer_medals_brazil_percentual.groupby(level=3).apply(max)

# Sum all medal colours; avoids positional [0], [1], [2] lookups on a label-indexed Series.
total_brazil = summer_medals_brazil_percentual.sum()

summer_medals_brazil_percentual.apply(lambda x: f'{round(x / total_brazil * 100, 2)} %')

Medal
Bronze     51.3 %
Gold      23.48 %
Silver    25.22 %
dtype: object
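
The result above is the split of Brazil's own medals by colour across all editions. If the question is instead read as Brazil's share of the medals awarded at each Olympics, one option is to divide Brazil's per-year counts by the world's. A sketch with invented event-level rows (one row per medal awarded); `brazil_counts`, `world_counts`, and `share` are names introduced here for illustration:

```python
import pandas as pd

# Invented rows: each row is one medal awarded to a country in a given year.
medals = pd.DataFrame({
    "Year":  [2012, 2012, 2012, 2012, 2016, 2016, 2016, 2016, 2016],
    "NOC":   ["BRA", "USA", "USA", "CHN", "BRA", "BRA", "USA", "USA", "CHN"],
    "Medal": ["Gold", "Gold", "Silver", "Bronze",
              "Gold", "Bronze", "Gold", "Silver", "Bronze"],
})

# Medals awarded per year and colour, for the world and for Brazil only.
world_counts = pd.crosstab(medals["Year"], medals["Medal"])
bra = medals[medals["NOC"] == "BRA"]
brazil_counts = pd.crosstab(bra["Year"], bra["Medal"])

# Align Brazil's table to the world's shape, treating missing cells as zero medals.
brazil_counts = brazil_counts.reindex_like(world_counts).fillna(0)

# Percentage of each colour won by Brazil in each edition.
share = (brazil_counts / world_counts * 100).round(2)
print(share)
```

Aligning the two tables before dividing matters because Brazil may have no medals of a given colour, or none at all, in some editions.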