![](../docs/banner.png)

# Pandas

**Tomas Beuzen, September 2020**

These exercises complement [Chapter 7](../chapters/chapter7-pandas.ipynb).

## Exercises

In this set of practice exercises we'll investigate the carbon footprint of different foods, using a dataset compiled by [Kasia Kulma](https://r-tastic.co.uk/post/from-messy-to-tidy/) and contributed to [R's Tidy Tuesday project](https://github.com/rfordatascience/tidytuesday).

### 1.

Start by importing pandas with the alias `pd`.

In [35]:
import pandas as pd

### 2.

The dataset we'll be working with has the following columns:

|column      |description |
|:-------------|:-----------|
|country       | Country Name |
|food_category | Food Category |
|consumption   | Consumption (kg/person/year) |
|co2_emmission | CO2 Emission (kg CO2/person/year) |


Import the dataset as a dataframe named `df` from this url: <https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-02-18/food_consumption.csv>

In [36]:
df = pd.read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-02-18/food_consumption.csv")
df.head()

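|   | country   | food_category | consumption | co2_emmission |
|--:|:----------|:--------------|------------:|--------------:|
| 0 | Argentina | Pork          | 10.51       | 37.2          |
| 1 | Argentina | Poultry       | 38.66       | 41.53         |
| 2 | Argentina | Beef          | 55.48       | 1712.0        |
| 3 | Argentina | Lamb & Goat   | 1.56        | 54.63         |
| 4 | Argentina | Fish          | 4.36        | 6.96          |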


### 3.

How many rows and columns are there in the dataframe?

In [37]:
df.shape

(1430, 4)

### 4.

What is the type of data in each column of `df`?

In [38]:
df.dtypes

country           object
food_category     object
consumption      float64
co2_emmission    float64
dtype: object

### 5.

What is the mean `co2_emmission` of the whole dataset?

In [39]:
df['co2_emmission'].mean()

74.38399300699302

### 6.

How many different kinds of foods are there in the dataset? How many countries are in the dataset?

In [40]:
df['food_category'].nunique()

11

In [41]:
df['country'].nunique()

130
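The two counts above multiply to the row count reported earlier (11 × 130 = 1430), which suggests the data holds one row per country and food category. A minimal sketch of how that could be checked, using a small made-up frame named `toy` rather than the real dataset:

```python
import pandas as pd

# Hypothetical toy frame standing in for df (not the real dataset).
toy = pd.DataFrame(
    {
        "country": ["Argentina", "Argentina", "Liberia", "Liberia"],
        "food_category": ["Pork", "Beef", "Pork", "Beef"],
    }
)

n_foods = toy["food_category"].nunique()
n_countries = toy["country"].nunique()

# True when no (country, food_category) pair repeats and every pair is present.
one_row_per_pair = (
    not toy.duplicated(subset=["country", "food_category"]).any()
    and len(toy) == n_foods * n_countries
)
print(n_foods, n_countries, one_row_per_pair)
```

The same expressions should work on `df` unchanged, since they only rely on the `country` and `food_category` columns.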

### 7.

What is the maximum `co2_emmission` in the dataset and which food type and country does it belong to?

In [42]:
df.loc[df['co2_emmission'].idxmax()]  # idxmax returns an index label, so look it up with .loc

country          Argentina
food_category         Beef
consumption          55.48
co2_emmission         1712
Name: 2, dtype: object
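`idxmax` returns an index label rather than a position. On a frame with the default index the two coincide, but after filtering they can differ. The sketch below illustrates this on a small frame named `toy` (a hypothetical subset built from rows shown in this notebook's outputs):

```python
import pandas as pd

# Toy frame; values copied from rows shown in the outputs above.
toy = pd.DataFrame(
    {
        "country": ["Argentina", "Argentina", "Australia", "USA"],
        "food_category": ["Pork", "Beef", "Beef", "Beef"],
        "co2_emmission": [37.2, 1712.0, 1044.85, 1118.29],
    }
)

# Filtering keeps the original labels (1, 2, 3), so labels != positions.
beef = toy[toy["food_category"] == "Beef"]

label = beef["co2_emmission"].idxmax()  # label of the row with the maximum
top_row = beef.loc[label]               # label-based lookup finds the right row
by_position = beef.iloc[label]          # position-based lookup picks a different row

print(top_row["country"], by_position["country"])
```

Pairing `idxmax` or `idxmin` with `.loc` avoids the need to reset the index first.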

### 8.

How many countries produce more than 1000 kg CO2/person/year for at least one food type?

In [44]:
df[(df['co2_emmission'] > 1000)]

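|     | country   | food_category | consumption | co2_emmission |
|----:|:----------|:--------------|------------:|--------------:|
| 2   | Argentina | Beef          | 55.48       | 1712.0        |
| 13  | Australia | Beef          | 33.86       | 1044.85       |
| 57  | USA       | Beef          | 36.24       | 1118.29       |
| 90  | Brazil    | Beef          | 39.25       | 1211.17       |
| 123 | Bermuda   | Beef          | 33.15       | 1022.94       |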
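The filter above lists the matching rows; the question asks for a count of countries. One way to get that number directly is to count distinct countries among the filtered rows, which also guards against a country appearing more than once. A minimal sketch on a toy frame named `toy` (a hypothetical subset using values from this notebook's outputs):

```python
import pandas as pd

# Toy frame; values copied from rows shown in this notebook's outputs.
toy = pd.DataFrame(
    {
        "country": ["Argentina", "Argentina", "Australia", "USA", "Liberia"],
        "food_category": ["Pork", "Beef", "Beef", "Beef", "Beef"],
        "co2_emmission": [37.2, 1712.0, 1044.85, 1118.29, 24.07],
    }
)

# Rows above the threshold, then the number of distinct countries among them.
high = toy[toy["co2_emmission"] > 1000]
n_countries = high["country"].nunique()
print(n_countries)
```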


### 9.

Which country consumes the least beef per person per year?

In [50]:
df_beef = df[(df['food_category'] == 'Beef')]
df_beef = df_beef.reset_index(drop=True)

df_beef.iloc[df_beef['consumption'].idxmin()]

country          Liberia
food_category       Beef
consumption         0.78
co2_emmission      24.07
Name: 128, dtype: object
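As an alternative sketch, `nsmallest` returns the lowest rows directly, without resetting the index. The frame `toy` below is a hypothetical subset built from values shown in this notebook's outputs:

```python
import pandas as pd

# Toy frame; values copied from rows shown in this notebook's outputs.
toy = pd.DataFrame(
    {
        "country": ["Argentina", "Argentina", "Australia", "Liberia"],
        "food_category": ["Pork", "Beef", "Beef", "Beef"],
        "consumption": [10.51, 55.48, 33.86, 0.78],
    }
)

beef = toy[toy["food_category"] == "Beef"]

# One-row DataFrames holding the lowest and highest beef consumption.
lowest = beef.nsmallest(1, "consumption")
highest = beef.nlargest(1, "consumption")

lowest_country = lowest["country"].iloc[0]
highest_country = highest["country"].iloc[0]
print(lowest_country, highest_country)
```

`nlargest` is the counterpart for "which country consumes the most" questions.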

### 10.

Which country consumes the most soybeans per person per year?

In [53]:
df_soybeans = df[(df['food_category'] == 'Soybeans')]
df_soybeans = df_soybeans.reset_index(drop=True)
df_soybeans.iloc[df_soybeans['consumption'].idxmax()]

country          Taiwan. ROC
food_category       Soybeans
consumption            16.95
co2_emmission           7.63
Name: 91, dtype: object

### 11.

What are the total emissions of all the meat products (Pork, Poultry, Fish, Lamb & Goat, Beef) in the dataset combined?

In [70]:
meat = ['Pork', 'Poultry', 'Beef', 'Lamb & Goat', 'Fish']
df_meat = df[df['food_category'].isin(meat)]  # keep only rows whose category is a meat product
total = df_meat['co2_emmission'].sum()
print(f'Total emission of all meat products is : {total}')

Total emission of all meat products is : 74441.13


### 12.

What are the total emissions of all other (non-meat) products in the dataset combined?

In [75]:
df['food_category'].unique()  # inspect the categories (only displayed if it is the last line of the cell)
non_meat = ['Eggs', 'Milk - inc. cheese', 'Wheat and Wheat Products',
            'Rice', 'Soybeans', 'Nuts inc. Peanut Butter']
df_non_meat = df[df['food_category'].isin(non_meat)]  # keep only non-meat rows

total = df_non_meat['co2_emmission'].sum()
print(f'Total emission of all non_meat products is : {total}')

Total emission of all non_meat products is : 31927.98
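The meat and non-meat totals can also be computed in one pass by labelling each row and grouping on that label. The following is a minimal sketch on a toy frame named `toy` (a hypothetical subset using values from this notebook's outputs); the label strings `"meat"` and `"non-meat"` are assumptions chosen for illustration:

```python
import pandas as pd

# Toy frame; values copied from rows shown in this notebook's outputs.
toy = pd.DataFrame(
    {
        "country": ["Argentina", "Argentina", "Argentina", "Taiwan. ROC"],
        "food_category": ["Pork", "Beef", "Fish", "Soybeans"],
        "co2_emmission": [37.2, 1712.0, 6.96, 7.63],
    }
)

meat = ["Pork", "Poultry", "Beef", "Lamb & Goat", "Fish"]

# Label each row as "meat" or "non-meat", then sum emissions within each label.
group = toy["food_category"].isin(meat).map({True: "meat", False: "non-meat"})
totals = toy.groupby(group)["co2_emmission"].sum()
print(totals)
```

Because every row receives exactly one label, the two group totals add up to the overall sum of the `co2_emmission` column, which makes for a quick consistency check.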


<hr>