# How have the Olympic games athletes changed over time?

## Introduction

**Business Context.** You work for a company that specializes in analyzing data for a variety of clients in the sports industry. Some questions that you frequently encounter include determining if a new player is promising enough to invest money in their development, which teams are the most likely to win certain matches, what events will be the most attractive to advertisers, etc.

**Business Problem.** As part of one of your projects, you have been asked to perform an exploratory data analysis of historical data to **detect patterns in the provenance, physical profile, and other characteristics of the athletes who compete in the Olympic games**. The conclusions of your analysis will help the rest of the team prepare a report for a new client who helps sports gear manufacturers find advertising opportunities.

**Analytical Context.** You have scraped a dataset from the Internet, which contains data for all the Olympic games from Norway 1994 to Rio 2016. It comprises data for 46,533 individual athletes and has 13 columns for each one of them. There are 68,848 rows instead of 46,533 rows in the `olympics_data` worksheet because some athletes have won multiple medals:

* **ID**: A unique number assigned to each athlete
* **Name**: The athlete's name
* **Sex**: The athlete's sex
* **Age**: The athlete's age at the moment of the games
* **Height**: The athlete's height in centimeters
* **Weight**: The athlete's weight in kilograms
* **Team**: The athlete's team (country)
* **Year**: The year
* **Season**: The season
* **City**: The host city
* **Sport**: The sport the athlete competed in
* **Medal**: The medal that the athlete won, if any (can be Gold, Silver, Bronze, or NA)
* **Won medal?**: 1 if the athlete won a medal, 0 otherwise

The dataset can be downloaded from [this link](data/olympics_fellow.xlsx).

**Note:** Please write all your formulas in the `calculations` worksheet unless explicitly asked to do it in another sheet, clearly indicating the exercise they belong to. You will need to submit the Excel file along with this notebook.

## Height, weight, and age

### Exercise 1

#### 1.1

Calculate the average height, weight, and age of athletes in Rio 2016 across all sports.

**Answer.**

With no duplicate names in the table: Height: 176.65 cm, Weight: 71.81 kg, Age: 26.39 years

With duplicate names in the table: Height: 176.65 cm, Weight: 71.81 kg, Age: 26.39 years
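As a cross-check outside Excel, the same averages can be sketched in plain Python. This is a minimal sketch, not the method used in the workbook: the sample rows are hypothetical, the helper name `averages` is made up for illustration, and only the column names (`ID`, `Year`, `Height`, `Weight`, `Age`) come from the dataset description. Missing values are skipped, much as Excel's `AVERAGE()` ignores empty cells.

```python
# Hypothetical sample rows, NOT taken from the real dataset. Each dict mimics
# one row of the `olympics_data` worksheet, using its column names.
rows = [
    {"ID": 1, "Year": 2016, "Height": 180, "Weight": 75, "Age": 25},
    {"ID": 1, "Year": 2016, "Height": 180, "Weight": 75, "Age": 25},  # same athlete, second row
    {"ID": 2, "Year": 2016, "Height": 170, "Weight": 60, "Age": 30},
    {"ID": 3, "Year": 2000, "Height": 190, "Weight": 90, "Age": 22},
]


def averages(rows, year, dedupe=False):
    """Average Height, Weight, and Age for one year, skipping missing values."""
    selected = [r for r in rows if r["Year"] == year]
    if dedupe:
        # Keep only the first row seen for each athlete ID.
        first_by_id = {}
        for r in selected:
            first_by_id.setdefault(r["ID"], r)
        selected = list(first_by_id.values())
    result = {}
    for col in ("Height", "Weight", "Age"):
        values = [r[col] for r in selected if r.get(col) is not None]
        result[col] = round(sum(values) / len(values), 2) if values else None
    return result


with_duplicates = averages(rows, 2016)
without_duplicates = averages(rows, 2016, dedupe=True)
print(with_duplicates)
print(without_duplicates)
```

In this toy sample the two results differ because the repeated athlete is counted twice in the first call. Whether deduplication matters in the real data depends on how many athletes have repeated rows.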

#### 1.2

Repeat Exercise 1.1 but for Sydney 2000. Have the averages changed noticeably?

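Average Height: 177.06 cm, Weight: 72.43 kg, Age: 25.84 years
The averages have changed, but not noticeably: compared with Sydney 2000, athletes in Rio 2016 were on average about 0.4 cm shorter, 0.6 kg lighter, and roughly half a year older.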

-------

## Geographic representation

### Exercise 2

This is a chart of the number of countries that participated in the games from 1998 to 2016. What can you conclude from it?

![Teams per year](data/images/teams_per_year.png)

**Hint:** Keep in mind that Summer and Winter games are not held in the same year. In the Winter games, the number of teams is typically lower than in the Summer games.

The most teams competed in the Summer Olympics in 2008. Fewer than half of the teams that competed in the Summer Olympics also competed in the Winter Olympics. The lowest number of teams to compete in the Winter Olympics was in 1998. The numbers of teams that competed in 2000 and 2012 are very close, as are the numbers for 2002, 2006, 2010, and 2014.

-------

### Exercise 3

These are the top 10 countries by number of athletes sent for all the games between 1998 and 2016. What patterns can you spot?

![1998](data/images/top_1998.png)
![2000](data/images/top_2000.png)
![2002](data/images/top_2002.png)
![2004](data/images/top_2004.png)
![2006](data/images/top_2006.png)
![2008](data/images/top_2008.png)
![2010](data/images/top_2010.png)
![2012](data/images/top_2012.png)
![2014](data/images/top_2014.png)
![2016](data/images/top_2016.png)

The U.S. sent either the most or the second-most athletes in 9 of the 10 games. No country sent more than 250 athletes to the Winter Olympics, and no country sent more than 700 athletes to the Summer Olympics. The host country sends many more athletes than it does when it is not hosting. Canada (colder climate) sends a large number of athletes to the Winter games but, compared to the other countries, not that many to the Summer games. Germany has been in the top 10 in all 10 Olympics. In the 2016 Summer Olympics, most of the countries shown sent fewer athletes.

-------

## Athletes by gender

These pie charts show the number of athletes by gender in Sydney 2000 and Rio 2016:

![Male and female Sydney](data/images/male_female_sydney.png)
![Male and female Rio](data/images/male_female_rio.png)

### Exercise 4

#### 4.1

We need to put labels on these pie charts. How many male and female athletes were there in Rio 2016 and Sydney 2000?

**Hint:** You can use the **`COUNTIF()`** function to solve this exercise. This function works very similarly to the `COUNTA()` function, with the difference that it only counts those cells that meet a certain condition. Feel free to look this function up on the Internet!

      	     
Sydney had 2,550 female and 4,231 male athletes. Rio had 5,031 female and 6,143 male athletes.
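For readers who want to see the counting logic spelled out, here is a rough Python analogue of `COUNTIF()` and `COUNTA()`. It is only a sketch: the helper names `countif` and `counta` are made up to mirror the Excel functions, the sample column is hypothetical, and coding the `Sex` column as `"M"`/`"F"` is an assumption about the worksheet.

```python
def counta(values):
    """Count the non-empty cells, like Excel's COUNTA()."""
    return sum(1 for v in values if v not in (None, ""))


def countif(values, criterion):
    """Count the cells equal to `criterion`, like a simple COUNTIF()."""
    return sum(1 for v in values if v == criterion)


# Hypothetical `Sex` column; the "M"/"F" coding is an assumption.
sex_column = ["M", "F", "M", "M", "F", ""]

total = counta(sex_column)
males = countif(sex_column, "M")
females = countif(sex_column, "F")
print(total, males, females)  # 5 3 2
```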

-------

#### 4.2

Use Excel to calculate the ratio of $\frac{male}{female}$ athletes in both Rio and Sydney. Has it changed?

The ratio of males to females was about 1.66 to 1 in Sydney and 1.22 to 1 in Rio. It has changed: the ratio is closer to equal at the Rio games.
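The arithmetic behind these ratios can be checked in a few lines of Python, using the athlete counts reported in Exercise 4.1. The variable names are illustrative only.

```python
# Athlete counts as reported in the answer to Exercise 4.1.
sydney_male, sydney_female = 4231, 2550
rio_male, rio_female = 6143, 5031

sydney_ratio = sydney_male / sydney_female
rio_ratio = rio_male / rio_female

print(round(sydney_ratio, 2))  # 1.66
print(round(rio_ratio, 2))     # 1.22
```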

-------

#### 4.3

Complete the table in the `gender_sport` worksheet. 

**Hint:** Use the **`COUNTIFS()`** function. It works like `COUNTIF()` but allows you to have more than one condition over more than one column.
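To illustrate how counting under several conditions works, here is a small Python sketch that mimics `COUNTIFS()`. The helper `countifs` and the sample rows are hypothetical, and the `"M"`/`"F"` coding of `Sex` is an assumption. Only the column names come from the dataset description.

```python
def countifs(rows, **criteria):
    """Count the rows where every named column equals the given value."""
    return sum(
        1
        for row in rows
        if all(row.get(col) == val for col, val in criteria.items())
    )


# Hypothetical rows, NOT taken from the real dataset.
rows = [
    {"Sport": "Judo", "Sex": "F", "Year": 2000},
    {"Sport": "Judo", "Sex": "M", "Year": 2000},
    {"Sport": "Judo", "Sex": "F", "Year": 2016},
    {"Sport": "Diving", "Sex": "F", "Year": 2016},
    {"Sport": "Judo", "Sex": "F", "Year": 2016},
]

judo_women_2016 = countifs(rows, Sport="Judo", Sex="F", Year=2016)
judo_all_2016 = countifs(rows, Sport="Judo", Year=2016)
print(judo_women_2016, judo_all_2016)  # 2 2
```

Dividing the first count by the second gives the share of female athletes for one sport and year, which is the kind of quantity the `gender_sport` table asks for.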

#### 4.4

Interpret the results of your completed table from Exercise 4.3. What can you say?

There are some sports in which females were not allowed to compete in 2000 but were in 2016 (boxing and wrestling). Rhythmic gymnastics and synchronized swimming remained all-female sports. The only sports in which the percentage of female athletes was lower in 2016 were diving, judo, and table tennis. Softball was evidently eliminated from the Olympics, because it appears in the table for 2000 but not for 2016. Only one more sport was female-dominated (greater than 50% female) in 2016 than in 2000.

-------

## Medals

### Exercise 5

#### 5.1

How many medals were awarded between 1998 and 2016?

**Hint:** Use the `Won a medal?` column.

**Answer.**

12,023 medals, or 0.259 medals per athlete.

#### 5.2

How many medals per athlete were awarded in Rio 2016?

0.174 medals per athlete.

-------

#### 5.3

How about in Sydney? Is it lower or higher?

0.283 medals per athlete, which is higher than in Rio 2016. This suggests that either fewer medals were given out in Rio or more athletes competed there. There were many more athletes in 2016.
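The medals-per-athlete figure can be sketched in Python as the sum of the medal indicator divided by the number of distinct athletes. This is one plausible definition, not necessarily the exact formula used in the workbook. The sample rows are hypothetical, the helper name `medals_per_athlete` is made up, and the column label `Won medal?` follows the column list in the introduction.

```python
# Hypothetical rows, NOT taken from the real dataset.
rows = [
    {"ID": 1, "Year": 2016, "Won medal?": 1},
    {"ID": 1, "Year": 2016, "Won medal?": 1},  # same athlete, second medal
    {"ID": 2, "Year": 2016, "Won medal?": 0},
    {"ID": 3, "Year": 2016, "Won medal?": 0},
    {"ID": 4, "Year": 2016, "Won medal?": 0},
    {"ID": 5, "Year": 2000, "Won medal?": 1},
    {"ID": 6, "Year": 2000, "Won medal?": 1},
]


def medals_per_athlete(rows, year):
    """Total medals in `year` divided by the number of distinct athlete IDs."""
    selected = [r for r in rows if r["Year"] == year]
    medals = sum(r["Won medal?"] for r in selected)
    athletes = len({r["ID"] for r in selected})
    return medals / athletes if athletes else 0.0


rate_2016 = medals_per_athlete(rows, 2016)
rate_2000 = medals_per_athlete(rows, 2000)
print(rate_2016, rate_2000)  # 0.5 1.0
```

Counting distinct IDs rather than rows is a design choice: dividing by the row count instead would give medals per entry, which is lower whenever athletes appear more than once.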

-------

### Exercise 6

#### 6.1

Complete the table in the `medals_sport` worksheet (you can use `COUNTIF()` and `SUMIF()`). What are the top 10 sports with the most medals per athlete in Rio 2016? How about in Sydney 2000?

**Hint:** To find the top tens, you will need to sort the table by multiple columns, namely year and medals per athlete. Again, use the Internet to help you with how to do this!

Rio: 1. Water Polo, 2. Rowing, 3. Basketball, 4. Synchronized Swimming, 5. Volleyball, 6. Taekwondo, 7. Hockey, 8. Handball, 9. Football, 10. Wrestling

Sydney: 1. Baseball, 2. Handball, 3. Softball, 4. Water Polo, 5. Football, 6. Rhythmic Gymnastics, 7. Hockey, 8. Basketball, 9. Rowing, 10. Synchronized Swimming
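The multi-column sort described in the hint can be illustrated in Python: sort by year ascending, then by medals per athlete descending, and take the first entries for each year. This is a sketch only. The sports, values, and the helper `top_n` are hypothetical placeholders, not the contents of the `medals_sport` worksheet.

```python
# Hypothetical table, NOT the real `medals_sport` worksheet.
table = [
    {"Sport": "Sport A", "Year": 2000, "Medals per athlete": 0.40},
    {"Sport": "Sport B", "Year": 2016, "Medals per athlete": 0.10},
    {"Sport": "Sport C", "Year": 2000, "Medals per athlete": 0.75},
    {"Sport": "Sport D", "Year": 2016, "Medals per athlete": 0.60},
    {"Sport": "Sport E", "Year": 2016, "Medals per athlete": 0.35},
]

# Sort by Year (ascending), then by medals per athlete (descending).
ranked = sorted(table, key=lambda r: (r["Year"], -r["Medals per athlete"]))


def top_n(ranked, year, n=10):
    """Return the first `n` sports for `year` from an already sorted table."""
    return [r["Sport"] for r in ranked if r["Year"] == year][:n]


top_2016 = top_n(ranked, 2016, n=2)
top_2000 = top_n(ranked, 2000, n=2)
print(top_2016)  # ['Sport D', 'Sport E']
print(top_2000)  # ['Sport C', 'Sport A']
```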

-------

#### 6.2

Which sports are included in both rankings? What could be the reason that these sports show up in both tables?

Water Polo, Rowing, Basketball, Synchronized Swimming, Hockey, Handball, and Football. These are team sports, so each person on a medal-winning team gets a medal. They have a larger number of athletes, but a larger number of medals is also given out. In contrast, archery had a substantial number of athletes in both games but very few medals, since it is not a team sport.
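The overlap between the two rankings can be verified with a short Python check. The lists below are the top tens reported in Exercise 6.1. The variable names are illustrative.

```python
# Top-10 rankings as reported in the answer to Exercise 6.1.
rio_top10 = [
    "Water Polo", "Rowing", "Basketball", "Synchronized Swimming",
    "Volleyball", "Taekwondo", "Hockey", "Handball", "Football", "Wrestling",
]
sydney_top10 = [
    "Baseball", "Handball", "Softball", "Water Polo", "Football",
    "Rhythmic Gymnastics", "Hockey", "Basketball", "Rowing",
    "Synchronized Swimming",
]

# Keep the Rio order while testing membership against a set of Sydney sports.
sydney_set = set(sydney_top10)
in_both = [sport for sport in rio_top10 if sport in sydney_set]
print(in_both)
print(len(in_both))  # 7
```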

-------

## Attribution

"120 years of Olympic history: athletes and results", June 15, 2018, Kaggle (user rgriffin, with data from www.sports-reference.com), Sports Reference [Terms of Use](https://www.sports-reference.com/termsofuse.html), https://www.kaggle.com/heesoo37/120-years-of-olympic-history-athletes-and-results