# Data Manipulation and Analysis with Pandas

**_Author: Favio Vázquez_**

**_Reviewer: Jessica Cervi_**

**Expected time = 3 hrs**

**Total points = 110 points**


## Assignment Overview

In this assignment you will manipulate and analyze data with Pandas. You will begin by reviewing how to read data into a series or dataframe, then you will use additional functionalities of Pandas objects. After that, you will index, select and edit data inside dataframes. In the final parts of the assignment you will be combining, grouping and aggregating dataframes.

This assignment is designed to build your familiarity and comfort coding in Python while also helping you review key topics from each module. As you progress through the assignment, answers will get increasingly complex. It is important that you adopt a data scientist's mindset when completing this assignment. **Remember to run your code from each cell before submitting your assignment.** Running your code beforehand will notify you of errors and give you a chance to fix your errors before submitting. You should view your Vocareum submission as if you are delivering a final project to your manager or client. 

***Vocareum Tips***
- Do not add arguments or options to functions unless you are specifically asked to. This will cause an error in Vocareum.
- Do not use a library unless you are explicitly asked to in the question. 
- You can download the Grading Report after submitting the assignment. This will include feedback and hints on incorrect questions.


### Learning Objectives

- Use Pandas to build, extract, filter, and transform DataFrames.
- Describe Pandas data structures: DataFrames and Series.  
- Use Pandas objects for analyses. 

## Index:

#### Data Manipulation and Analysis with Pandas

- [Question 1](#Question-1)
- [Question 2](#Question-2)
- [Question 3](#Question-3)
- [Question 4](#Question-4)
- [Question 5](#Question-5)
- [Question 6](#Question-6)
- [Question 7](#Question-7)
- [Question 8](#Question-8)
- [Question 9](#Question-9)
- [Question 10](#Question-10)
- [Question 11](#Question-11)
- [Question 12](#Question-12)
- [Question 13](#Question-13)
- [Question 14](#Question-14)

## Data Manipulation and Analysis with Pandas

In [6]:
# Let's start by importing Pandas
import pandas as pd

# Avoid warnings
import warnings
warnings.filterwarnings("ignore")

### Importing data

We will begin this assignment with a review of how to import data with Pandas. For several parts of this assignment we will use two datasets covering the past 120 years of Olympic history: athletes and results. The datasets are saved in the `data/` folder; more information about them can be found on Kaggle at this link:

https://www.kaggle.com/heesoo37/120-years-of-olympic-history-athletes-and-results

This is a historical dataset on the modern Olympic Games, including all the Games from Athens 1896 to Rio 2016. 

The file `athlete_events.csv` contains 271116 rows and 15 columns. Each row corresponds to an individual athlete competing in an individual Olympic event (athlete-events). The columns are:

- ID - Unique number for each athlete
- Name - Athlete's name
- Sex - M or F
- Age - Athlete's age
- Height - In centimeters
- Weight - In kilograms
- Team - Team name
- NOC - National Olympic Committee 3-letter code
- Games - Year and season
- Year - Year of game
- Season - Summer or Winter
- City - Host city
- Sport - Sport
- Event - Event
- Medal - Gold, Silver, Bronze, or NA

The file `noc_regions.csv` contains 230 rows and 3 columns. Each row contains information about one National Olympic Committee (NOC). The columns are:

- NOC - National Olympic Committee abbreviation
- region - Name of country in NOC
- notes - Notes about the region and NOC
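As a minimal sketch of how `pd.read_csv` turns CSV text into a DataFrame, the snippet below reads a three-row sample from an in-memory buffer instead of the real file. The buffer (`io.StringIO`), the reduced column set, and the names `sample_csv` and `sample_df` are assumptions made for illustration; they are not part of the assignment.

```python
import io

import pandas as pd

# A tiny stand-in for athlete_events.csv (only a few columns and rows).
sample_csv = io.StringIO(
    "ID,Name,Sex,Age,Height\n"
    "1,A Dijiang,M,24,180\n"
    "2,A Lamusi,M,23,170\n"
    "3,Gunnar Nielsen Aaby,M,24,\n"
)

# read_csv accepts a file path or any file-like object.
sample_df = pd.read_csv(sample_csv)

print(sample_df.shape)   # (rows, columns)
print(sample_df.dtypes)  # Height is float64 because of the missing value
```

Missing fields are read as `NaN`, which is why a numeric column with gaps ends up with a floating-point dtype.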


[Back to top](#Index:) 

### Question 1
*5 points*

Read the CSV file named `"athlete_events.csv"` contained in the `data/` folder and assign it to a dataframe called `df`.

In [87]:
### GRADED
### YOUR SOLUTION HERE
# Pandas was already imported as pd at the top of the notebook.
# Relative path to the athletes dataset inside the data/ folder
file_path = "data/athlete_events.csv"

# Use the pandas read_csv function to read the CSV file into a DataFrame
df = pd.read_csv(file_path)
###
### YOUR CODE HERE
###


In [8]:
###
### AUTOGRADER TEST - DO NOT REMOVE
###


In [9]:
# Let's take a look at our dataframe df
df.head()

Unnamed: 0,ID,Name,Sex,Age,Height,Weight,Team,NOC,Games,Year,Season,City,Sport,Event,Medal
0,1,A Dijiang,M,24.0,180.0,80.0,China,CHN,1992 Summer,1992,Summer,Barcelona,Basketball,Basketball Men's Basketball,
1,2,A Lamusi,M,23.0,170.0,60.0,China,CHN,2012 Summer,2012,Summer,London,Judo,Judo Men's Extra-Lightweight,
2,3,Gunnar Nielsen Aaby,M,24.0,,,Denmark,DEN,1920 Summer,1920,Summer,Antwerpen,Football,Football Men's Football,
3,4,Edgar Lindenau Aabye,M,34.0,,,Denmark/Sweden,DEN,1900 Summer,1900,Summer,Paris,Tug-Of-War,Tug-Of-War Men's Tug-Of-War,Gold
4,5,Christine Jacoba Aaftink,F,21.0,185.0,82.0,Netherlands,NED,1988 Winter,1988,Winter,Calgary,Speed Skating,Speed Skating Women's 500 metres,


In [10]:
# Let's see the shape of our dataframe df
print("Number of rows: {}, number of columns: {}".format(df.shape[0],df.shape[1]))

Number of rows: 271116, number of columns: 15


[Back to top](#Index:) 

### Question 2
*5 points*

Read the CSV file named `"noc_regions.csv"` in the `data/` folder and assign it to a dataframe called `regions`.

In [88]:
### GRADED

### YOUR SOLUTION HERE
regions_path = "data/noc_regions.csv"

# Read the NOC regions CSV file into a DataFrame
regions = pd.read_csv(regions_path)

###
### YOUR CODE HERE
###


In [89]:
###
### AUTOGRADER TEST - DO NOT REMOVE
###


In [13]:
# Let's take a look at our dataframe regions
regions.head()

Unnamed: 0,NOC,region,notes
0,AFG,Afghanistan,
1,AHO,Curacao,Netherlands Antilles
2,ALB,Albania,
3,ALG,Algeria,
4,AND,Andorra,


In [14]:
# Let's see the shape of our dataframe regions
print("Number of rows: {}, number of columns: {}".format(regions.shape[0],regions.shape[1]))

Number of rows: 230, number of columns: 3


### Pandas Objects

In this part of the assignment we will begin studying the two most important objects exposed by Pandas: Series and DataFrames. As you remember:
- **Series** is a 1-dimensional data structure in Pandas.
- **DataFrame** is a 2-dimensional data structure in Pandas, made up of columns and rows.
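The difference between the two objects can be sketched with a few hand-typed values. The names `heights` and `athletes` are illustrative only.

```python
import pandas as pd

# A Series holds one labelled column of values.
heights = pd.Series([180.0, 170.0, 185.0], name="Height")

# A DataFrame holds several columns that share the same row labels.
athletes = pd.DataFrame(
    {
        "Name": ["A Dijiang", "A Lamusi", "Christine Jacoba Aaftink"],
        "Height": [180.0, 170.0, 185.0],
    }
)

print(heights.ndim, athletes.ndim)  # 1 2
print(type(athletes["Height"]))     # selecting one column gives a Series
```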

[Back to top](#Index:) 

### Question 3
*5 points*

Select a series from the dataframe `df` with the contents of the column `Height` and store it in a variable called `height`. 

In [81]:
### GRADED

### YOUR SOLUTION HERE
height = df["Height"]

###
### YOUR CODE HERE
###


In [82]:
height

0         180.0
1         170.0
2           NaN
3           NaN
4         185.0
          ...  
271111    179.0
271112    176.0
271113    176.0
271114    185.0
271115    185.0
Name: Height, Length: 271116, dtype: float64

In [17]:
###
### AUTOGRADER TEST - DO NOT REMOVE
###


[Back to top](#Index:) 

### Question 4
*10 points*

In the videos, you have seen how you can use the function `map` to transform the entries of a `pandas` `series`.
In a similar way, you can use the function `rename` to change the index labels of a series. Like `map`, `rename` accepts a lambda function that performs the desired transformation; `rename` applies it to each label rather than to each value.

The syntax is as follows:
```Python
new_series = series.rename(lambda x: your function)
```

Use the function `rename` to rename the index (or labels) of the series `height` by raising each old label to the power of two, like so:

$$
0 \rightarrow 0 \\
1 \rightarrow 1 \\
2 \rightarrow 4 \\
3 \rightarrow 9 \\
\vdots
$$

Save this new series in a variable called `height_new`.
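To make the contrast between `rename` and `map` concrete, here is a small sketch on a three-element series. The series `s` and the two transformations (adding 10 to each label, converting centimeters to meters) are invented for illustration and are not the transformation asked for in this question.

```python
import pandas as pd

s = pd.Series([180.0, 170.0, 185.0], name="Height")

# rename with a function transforms the index labels; values are untouched.
relabelled = s.rename(lambda label: label + 10)

# map transforms the values; index labels are untouched.
in_metres = s.map(lambda cm: cm / 100)

print(relabelled.index.tolist())  # [10, 11, 12]
print(in_metres.tolist())         # [1.8, 1.7, 1.85]
```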

In [18]:
### GRADED

### YOUR SOLUTION HERE
height_new = height.rename(lambda x: x**2)
###
### YOUR CODE HERE
###
print(height_new)

0              180.0
1              170.0
4                NaN
9                NaN
16             185.0
               ...  
73501174321    179.0
73501716544    176.0
73502258769    176.0
73502800996    185.0
73503343225    185.0
Name: Height, Length: 271116, dtype: float64


In [19]:
###
### AUTOGRADER TEST - DO NOT REMOVE
###


[Back to top](#Index:) 

### Question 5
*5 points*

Select a series from the dataframe `regions` with the contents of the column `region` and store it in a variable called `reg`.

In [83]:
### GRADED

### YOUR SOLUTION HERE
reg = regions["region"]

###
### YOUR CODE HERE
###


In [84]:
reg

0      Afghanistan
1          Curacao
2          Albania
3          Algeria
4          Andorra
          ...     
225          Yemen
226          Yemen
227         Serbia
228         Zambia
229       Zimbabwe
Name: region, Length: 230, dtype: object

In [22]:
###
### AUTOGRADER TEST - DO NOT REMOVE
###


[Back to top](#Index:) 

### Question 6
*5 points*
    
You can also select multiple columns at once from a dataframe to create a new dataframe.


Create a new dataframe from the dataframe `df` that contains only the columns `ID`, `Age`, `Height`, `Weight` and `Sex`, in this specific order. Name this new dataframe `df_subset`.
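Selecting several columns can be sketched on a toy frame as follows. The frame `toy` is hypothetical, and the `.copy()` call is an optional extra that the question does not ask for; it is shown only because a subset is often modified later.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "ID": [1, 2],
        "Name": ["A Dijiang", "A Lamusi"],
        "Age": [24.0, 23.0],
        "Sex": ["M", "M"],
    }
)

# A list of column names inside [] returns a DataFrame with those columns,
# in the order given by the list.
toy_subset = toy[["Sex", "ID", "Age"]]

# Optional: .copy() makes the subset independent of the original frame,
# which avoids a SettingWithCopyWarning if columns are added to it later.
toy_subset = toy_subset.copy()

print(toy_subset.columns.tolist())  # ['Sex', 'ID', 'Age']
```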

In [85]:
### GRADED

### YOUR SOLUTION HERE
df_subset = df[["ID", "Age", "Height", "Weight", "Sex"]]

###
### YOUR CODE HERE
###


In [86]:
df_subset.head()

Unnamed: 0,ID,Age,Height,Weight,Sex
0,1,24.0,180.0,80.0,M
1,2,23.0,170.0,60.0,M
2,3,24.0,,,M
3,4,34.0,,,M
4,5,21.0,185.0,82.0,F


In [25]:
###
### AUTOGRADER TEST - DO NOT REMOVE
###


Let's have a look at the dataframe `df_subset` by using the command `.head()`

In [26]:
df_subset.head()

Unnamed: 0,ID,Age,Height,Weight,Sex
0,1,24.0,180.0,80.0,M
1,2,23.0,170.0,60.0,M
2,3,24.0,,,M
3,4,34.0,,,M
4,5,21.0,185.0,82.0,F


[Back to top](#Index:) 

### Question 7
*10 points*
    
Observe the dataframe `df_subset`, above. You see that the column `Sex` contains entries `M` and `F` based on whether an athlete was a male or a female, respectively.

Create a new column, `New sex`, in our dataframe. Fill this column by using the function `map` to change the entries of the column `Sex` from `M` to `male` and from `F` to `female`.

The syntax is as follows:
```Python
dataframe['column'] = dataframe.column.map(lambda x: your function)
```

**HINT: Notice that we are adding a new column, not replacing an existing one!**
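One detail worth knowing when mapping values: a lambda with a bare `else` branch and a dictionary behave differently for values that are not listed. The sketch below uses a hypothetical three-element series `codes` with one missing value to show the difference; it is an illustration, not a required part of the solution.

```python
import numpy as np
import pandas as pd

codes = pd.Series(["M", "F", np.nan], name="Sex")

# A lambda with a bare else labels every non-"M" value, including NaN, as "female".
with_lambda = codes.map(lambda x: "male" if x == "M" else "female")

# A dict only translates the keys it lists; anything else becomes NaN.
with_dict = codes.map({"M": "male", "F": "female"})

print(with_lambda.tolist())
print(with_dict.tolist())
```

Both approaches give the same result when the column contains only `M` and `F`, as the dataset description states for `Sex`.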


In [27]:
### GRADED

### YOUR SOLUTION HERE
df_subset['New sex'] = df_subset['Sex'].map(lambda x: 'male' if x == 'M' else 'female')
###
### YOUR CODE HERE
###


In [28]:
df_subset.head()

Unnamed: 0,ID,Age,Height,Weight,Sex,New sex
0,1,24.0,180.0,80.0,M,male
1,2,23.0,170.0,60.0,M,male
2,3,24.0,,,M,male
3,4,34.0,,,M,male
4,5,21.0,185.0,82.0,F,female


In [29]:
###
### AUTOGRADER TEST - DO NOT REMOVE
###


### Indexing and selecting data from Dataframes

In this part of the assignment we will work with the dataframes from above and select specific data using different Pandas methods and attributes. You have learned to use `loc[]` and `iloc[]` to do this.
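As a quick reminder of how the two indexers differ, here is a sketch on a hypothetical frame `toy` whose row labels (10 to 13) deliberately differ from the row positions (0 to 3).

```python
import pandas as pd

toy = pd.DataFrame(
    {"ID": [1, 2, 3, 4], "Age": [24, 23, 24, 34], "Height": [180.0, 170.0, 175.0, 168.0]},
    index=[10, 11, 12, 13],
)

# loc selects by label; both ends of a label slice are included.
by_label = toy.loc[11:12, "ID":"Age"]

# iloc selects by integer position; the end of a slice is excluded.
by_position = toy.iloc[1:2, 0:2]

# iloc also accepts lists of positions for rows and columns.
picked = toy.iloc[[0, 3], [0, 2]]

print(by_label.shape)           # (2, 2)
print(by_position.shape)        # (1, 2)
print(picked.columns.tolist())  # ['ID', 'Height']
```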

[Back to top](#Index:) 

### Question 8
*5 points*

Create a new dataframe called `df_1` by selecting the following from the dataframe `df`:
- the rows with labels from 3 through 11.
- the columns from `ID` to `Height`.

In [30]:
### GRADED

### YOUR SOLUTION HERE
df_1 = df.loc[3:11, 'ID':'Height']

###
### YOUR CODE HERE
###


In [31]:
df_1

Unnamed: 0,ID,Name,Sex,Age,Height
3,4,Edgar Lindenau Aabye,M,34.0,
4,5,Christine Jacoba Aaftink,F,21.0,185.0
5,5,Christine Jacoba Aaftink,F,21.0,185.0
6,5,Christine Jacoba Aaftink,F,25.0,185.0
7,5,Christine Jacoba Aaftink,F,25.0,185.0
8,5,Christine Jacoba Aaftink,F,27.0,185.0
9,5,Christine Jacoba Aaftink,F,27.0,185.0
10,6,Per Knut Aaland,M,31.0,188.0
11,6,Per Knut Aaland,M,31.0,188.0


In [32]:
###
### AUTOGRADER TEST - DO NOT REMOVE
###


[Back to top](#Index:) 

### Question 9
*10 points*

Select all the rows from the dataframe `df` where the `Year` is greater than 1980. Assign this dataframe to `df_year`.

Next, select all the rows in `df_year` where `Team` is equal to "China", "United States", "Italy" or "Spain". Save your results in a dataframe called `df_country`.

**HINT**: To select only the desired countries, create a list containing the countries and use the function `isin` like so:

```Python
df.Team.isin([list with countries])
```
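Filtering with a comparison and with `isin` both rely on boolean masks. The sketch below shows this on a hypothetical four-row frame; the values are made up for illustration.

```python
import pandas as pd

toy = pd.DataFrame(
    {"Team": ["China", "Denmark", "Spain", "Italy"], "Year": [1992, 1920, 1988, 1976]}
)
teams = ["China", "Spain", "Italy"]

# A comparison on a column gives a boolean Series (a mask).
recent = toy["Year"] > 1980

# isin gives a mask that is True where the value appears in the list.
wanted = toy["Team"].isin(teams)

# The masks can be applied one after the other...
toy_recent = toy[recent]
step_by_step = toy_recent[toy_recent["Team"].isin(teams)]

# ...or combined with & and applied in one step.
combined = toy[recent & wanted]

print(combined["Team"].tolist())  # ['China', 'Spain']
```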

In [42]:
### GRADED

### YOUR SOLUTION HERE
df_year = df[df['Year'] > 1980]
countries = ["China", "United States", "Italy", "Spain"]
df_country = df_year[df_year['Team'].isin(countries)]

###
### YOUR CODE HERE
###


In [43]:
df_country

Unnamed: 0,ID,Name,Sex,Age,Height,Weight,Team,NOC,Games,Year,Season,City,Sport,Event,Medal
0,1,A Dijiang,M,24.0,180.0,80.0,China,CHN,1992 Summer,1992,Summer,Barcelona,Basketball,Basketball Men's Basketball,
1,2,A Lamusi,M,23.0,170.0,60.0,China,CHN,2012 Summer,2012,Summer,London,Judo,Judo Men's Extra-Lightweight,
10,6,Per Knut Aaland,M,31.0,188.0,75.0,United States,USA,1992 Winter,1992,Winter,Albertville,Cross Country Skiing,Cross Country Skiing Men's 10 kilometres,
11,6,Per Knut Aaland,M,31.0,188.0,75.0,United States,USA,1992 Winter,1992,Winter,Albertville,Cross Country Skiing,Cross Country Skiing Men's 50 kilometres,
12,6,Per Knut Aaland,M,31.0,188.0,75.0,United States,USA,1992 Winter,1992,Winter,Albertville,Cross Country Skiing,Cross Country Skiing Men's 10/15 kilometres Pu...,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
270850,135458,Rami Zur,M,27.0,175.0,77.0,United States,USA,2004 Summer,2004,Summer,Athina,Canoeing,"Canoeing Men's Kayak Doubles, 500 metres",
270851,135458,Rami Zur,M,31.0,175.0,77.0,United States,USA,2008 Summer,2008,Summer,Beijing,Canoeing,"Canoeing Men's Kayak Singles, 500 metres",
270852,135458,Rami Zur,M,31.0,175.0,77.0,United States,USA,2008 Summer,2008,Summer,Beijing,Canoeing,"Canoeing Men's Kayak Singles, 1,000 metres",
270891,135471,Jos Zurera Alberca,M,22.0,162.0,52.0,Spain,ESP,1988 Summer,1988,Summer,Seoul,Weightlifting,Weightlifting Men's Bantamweight,


In [46]:
df.shape

(271116, 15)

In [None]:
###
### AUTOGRADER TEST - DO NOT REMOVE
###


[Back to top](#Index:) 

### Question 10
*5 points*

Using the indexer `iloc[]`, select the rows at positions 0, 10, 20, 40, 43, 66 and the columns at positions 0, 3, 5 from the dataframe `df`. Store your results in a dataframe called `df_3`.

In [49]:
### GRADED

### YOUR SOLUTION HERE
row_indices = [0, 10, 20, 40, 43, 66]
column_indices = [0, 3, 5]
df_3 = df.iloc[row_indices, column_indices]

###
### YOUR CODE HERE
###


In [50]:
df_3

Unnamed: 0,ID,Age,Weight
0,1,24.0,80.0
10,6,31.0,75.0
20,7,31.0,72.0
40,16,28.0,85.0
43,17,28.0,64.0
66,20,22.0,85.0


In [None]:
###
### AUTOGRADER TEST - DO NOT REMOVE
###


### Editing data in DataFrames and Combining DataFrames

In this section we will modify the internal structure and data of dataframes, deleting some of their columns and transforming others. We will also combine our dataframes `df` and `regions` and learn different ways of working with them.

[Back to top](#Index:) 

### Question 11
*10 points*
    
Use a `left` join to combine the dataframes `df` and `regions`, in this particular order, into a new dataframe called `merged`. Set the column `NOC` as the key column.

**HINT**: Use the `pandas` function `merge`.
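What a left join keeps and what it fills in can be sketched with two toy frames. The frames `events` and `nocs`, the athlete names `A`, `B`, `C`, and the code `XXX` are hypothetical and chosen only to show an unmatched key.

```python
import pandas as pd

events = pd.DataFrame({"Name": ["A", "B", "C"], "NOC": ["CHN", "DEN", "XXX"]})
nocs = pd.DataFrame(
    {"NOC": ["CHN", "DEN", "NED"], "region": ["China", "Denmark", "Netherlands"]}
)

# how="left" keeps every row of the left frame (events).
# Keys missing from the right frame get NaN in the right-hand columns.
joined = pd.merge(events, nocs, on="NOC", how="left")

print(joined)
print(len(joined) == len(events))  # True here because NOC is unique in nocs
```

Rows of the right frame with no match in the left frame (`NED` here) do not appear in the result.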

In [53]:
### GRADED

# Let's read our data again to have the original datasets
df = pd.read_csv("data/athlete_events.csv")
regions = pd.read_csv("data/noc_regions.csv")

### YOUR SOLUTION HERE
merged = pd.merge(df, regions, on='NOC', how='left')

###
### YOUR CODE HERE
###


In [55]:
merged.head()

Unnamed: 0,ID,Name,Sex,Age,Height,Weight,Team,NOC,Games,Year,Season,City,Sport,Event,Medal,region,notes
0,1,A Dijiang,M,24.0,180.0,80.0,China,CHN,1992 Summer,1992,Summer,Barcelona,Basketball,Basketball Men's Basketball,,China,
1,2,A Lamusi,M,23.0,170.0,60.0,China,CHN,2012 Summer,2012,Summer,London,Judo,Judo Men's Extra-Lightweight,,China,
2,3,Gunnar Nielsen Aaby,M,24.0,,,Denmark,DEN,1920 Summer,1920,Summer,Antwerpen,Football,Football Men's Football,,Denmark,
3,4,Edgar Lindenau Aabye,M,34.0,,,Denmark/Sweden,DEN,1900 Summer,1900,Summer,Paris,Tug-Of-War,Tug-Of-War Men's Tug-Of-War,Gold,Denmark,
4,5,Christine Jacoba Aaftink,F,21.0,185.0,82.0,Netherlands,NED,1988 Winter,1988,Winter,Calgary,Speed Skating,Speed Skating Women's 500 metres,,Netherlands,


In [None]:
###
### AUTOGRADER TEST - DO NOT REMOVE
###


### Grouping and Aggregating DataFrames

In this final section we will group and perform aggregations on our dataframes.

In [56]:
# Let's read our data again to have the original datasets
df = pd.read_csv("data/athlete_events.csv")
regions = pd.read_csv("data/noc_regions.csv")
regions.head()

Unnamed: 0,NOC,region,notes
0,AFG,Afghanistan,
1,AHO,Curacao,Netherlands Antilles
2,ALB,Albania,
3,ALG,Algeria,
4,AND,Andorra,


[Back to top](#Index:) 

### Question 12
*10 points*

Reshape the `regions` dataframe by setting:
- index = `region`
- columns = `NOC`
- values = `notes`

Assign the new dataframe to `regions_stacked`.
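The reshaping can be sketched on three rows taken from the `regions` output shown earlier in this notebook. The names `long_form` and `wide_form` are illustrative.

```python
import pandas as pd

long_form = pd.DataFrame(
    {
        "region": ["Curacao", "Yemen", "Yemen"],
        "NOC": ["AHO", "YAR", "YMD"],
        "notes": ["Netherlands Antilles", "North Yemen", "South Yemen"],
    }
)

# Each distinct region becomes a row label, each distinct NOC a column,
# and the cells are filled with the matching notes (NaN where there is none).
wide_form = long_form.pivot(index="region", columns="NOC", values="notes")

print(wide_form)
```

Because most region and NOC combinations do not occur, the reshaped frame is mostly `NaN`, which matches the sparse output of the full dataset.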

In [57]:
### GRADED

### YOUR SOLUTION HERE

regions_stacked = regions.pivot(index='region', columns='NOC', values='notes')
###
### YOUR CODE HERE
###


In [58]:
regions_stacked

NOC,AFG,AHO,ALB,ALG,AND,ANG,ANT,ANZ,ARG,ARM,...,VIE,VIN,VNM,WIF,YAR,YEM,YMD,YUG,ZAM,ZIM
region,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
,,,,,,,,,,,...,,,,,,,,,,
Afghanistan,,,,,,,,,,,...,,,,,,,,,,
Albania,,,,,,,,,,,...,,,,,,,,,,
Algeria,,,,,,,,,,,...,,,,,,,,,,
American Samoa,,,,,,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"Virgin Islands, British",,,,,,,,,,,...,,,,,,,,,,
"Virgin Islands, US",,,,,,,,,,,...,,,,,,,,,,
Yemen,,,,,,,,,,,...,,,,,North Yemen,,South Yemen,,,
Zambia,,,,,,,,,,,...,,,,,,,,,,


In [None]:
###
### AUTOGRADER TEST - DO NOT REMOVE
###


[Back to top](#Index:) 

### Question 13
*10 points*

Group the entries of the dataframe `df` by the columns `"Season"` and `"Medal"`. Assign the resulting grouped object to `season_medal`.

Also, answer the following question: In which Olympic games did the first group of athletes participate?
- a) 2014 Winter
- b) 1920 Summer
- c) 1994 Summer
- d) 1920 Winter

Assign the letter of the correct answer, as a string, to `ans13`.

For instance, if you believe the correct answer is *a) 2014 Winter*, your answer would be `ans13 = 'a'`.
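How grouping by two columns works can be sketched on a hypothetical four-row frame. The rows below are invented toy data and say nothing about the answer to this question; `grouped` and `first_rows` are illustrative names.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "Season": ["Summer", "Winter", "Summer", "Summer"],
        "Medal": ["Gold", "Gold", "Bronze", "Gold"],
        "Games": ["1900 Summer", "1988 Winter", "1920 Summer", "1896 Summer"],
    }
)

# groupby returns a DataFrameGroupBy object; nothing is aggregated yet.
grouped = toy.groupby(["Season", "Medal"])

# first() returns the first row of each group, following the original row order.
# Group keys are sorted by default, so ('Summer', 'Bronze') comes first.
first_rows = grouped.first()
print(first_rows)

# get_group returns all rows that belong to one group.
print(grouped.get_group(("Summer", "Gold")))
```

Rows whose grouping key is missing (for example, a `NaN` medal) are left out of the groups by default.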

In [60]:
### GRADED

### YOUR SOLUTION HERE
# Group the rows of df by the Season and Medal columns
season_medal = df.groupby(['Season', 'Medal'])
ans13 = 'b'

###
### YOUR CODE HERE
###


In [70]:
df_sorted = df.sort_values(by='Games', ascending=True)

In [71]:
df_sorted

Unnamed: 0,ID,Name,Sex,Age,Height,Weight,Team,NOC,Games,Year,Season,City,Sport,Event,Medal
132219,66542,Leonidas Lanngakis,M,,,,Greece,GRE,1896 Summer,1896,Summer,Athina,Shooting,"Shooting Men's Free Rifle, Three Positions, 30...",
214353,107613,Carl Schuhmann,M,26.0,159.0,70.0,Germany,GER,1896 Summer,1896,Summer,Athina,Gymnastics,Gymnastics Men's Rings,
214352,107613,Carl Schuhmann,M,26.0,159.0,70.0,Germany,GER,1896 Summer,1896,Summer,Athina,Gymnastics,"Gymnastics Men's Horizontal Bar, Teams",Gold
24682,12929,John Mary Pius Boland,M,25.0,,,Great Britain,GBR,1896 Summer,1896,Summer,Athina,Tennis,Tennis Men's Singles,Gold
214351,107613,Carl Schuhmann,M,26.0,159.0,70.0,Germany,GER,1896 Summer,1896,Summer,Athina,Wrestling,"Wrestling Men's Unlimited Class, Greco-Roman",Gold
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
59978,30717,Duo Bujie,M,22.0,175.0,55.0,China,CHN,2016 Summer,2016,Summer,Rio de Janeiro,Athletics,Athletics Men's Marathon,
59970,30713,Thomas Dunstan,M,18.0,194.0,96.0,United States,USA,2016 Summer,2016,Summer,Rio de Janeiro,Water Polo,Water Polo Men's Water Polo,
185525,93274,Anthony Joseph Prez Cortesia,M,22.0,205.0,93.0,Venezuela,VEN,2016 Summer,2016,Summer,Rio de Janeiro,Basketball,Basketball Men's Basketball,
17623,9392,Kieran Philip Behan,M,27.0,163.0,65.0,Ireland,IRL,2016 Summer,2016,Summer,Rio de Janeiro,Gymnastics,Gymnastics Men's Rings,


In [None]:
###
### AUTOGRADER TEST - DO NOT REMOVE
###


[Back to top](#Index:) 

### Question 14
*15 points*

Perform the following aggregation operations on `df`:

- compute the `max` and the `min` on the column `Age`.
- compute the `mean` on the column `Weight`.
- compute the `max`, `min` and `mean` on the column `Height`.

Assign the new dataframe to `df_aggr`.
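One way to request different aggregations per column in a single call is to pass a dictionary to `agg`. The sketch below uses a hypothetical three-row frame `toy`; it illustrates the technique and is not necessarily the exact form the grader expects.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "Age": [24.0, 23.0, 34.0],
        "Weight": [80.0, 60.0, 82.0],
        "Height": [180.0, 170.0, 185.0],
    }
)

# Keys are column names; values list the aggregations wanted for that column.
toy_aggr = toy.agg(
    {"Age": ["max", "min"], "Weight": ["mean"], "Height": ["max", "min", "mean"]}
)

# The result has one row per aggregation and one column per input column,
# with NaN where an aggregation was not requested for a column.
print(toy_aggr)
```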


In [76]:
### GRADED

### YOUR SOLUTION HERE

max_age = df['Age'].max()
min_age = df['Age'].min()

# Compute mean on the column 'Weight'
mean_weight = df['Weight'].mean()

# Compute max, min, and mean on the column 'Height'
max_height = df['Height'].max()
min_height = df['Height'].min()
mean_height = df['Height'].mean()

data = {'Max_Age': [max_age],
        'Min_Age': [min_age],
        'Mean_Weight': [mean_weight],
        'Max_Height': [max_height],
        'Min_Height': [min_height],
        'Mean_Height': [mean_height]}

df_aggr = pd.DataFrame(data)
###
### YOUR CODE HERE
###


In [78]:
df_aggr

Unnamed: 0,Max_Age,Min_Age,Mean_Weight,Max_Height,Min_Height,Mean_Height
0,97.0,10.0,70.702393,226.0,127.0,175.33897


In [None]:
###
### AUTOGRADER TEST - DO NOT REMOVE
###
