![](../docs/banner.png)

# Pandas


## Exercises

In this set of practice exercises, we'll investigate the carbon footprint of different foods using a dataset compiled by [Kasia Kulma](https://r-tastic.co.uk/post/from-messy-to-tidy/) and contributed to [R's Tidy Tuesday project](https://github.com/rfordatascience/tidytuesday).

### 1.

Start by importing pandas with the alias `pd`.

In [2]:
# Your answer here.
import pandas as pd

### 2.

The dataset we'll be working with has the following columns:

|column      |description |
|:-------------|:-----------|
|country       | Country Name |
|food_category | Food Category |
|consumption   | Consumption (kg/person/year) |
|co2_emmission | CO2 emissions (kg CO2/person/year) |


Import the dataset as a dataframe named `df` from this url: <https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-02-18/food_consumption.csv>

In [5]:
# Your answer here.
url = 'https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-02-18/food_consumption.csv'
df = pd.read_csv(url)
df.head(5)

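|   | country | food_category | consumption | co2_emmission |
|--:|:--------|:--------------|------------:|--------------:|
| 0 | Argentina | Pork | 10.51 | 37.2 |
| 1 | Argentina | Poultry | 38.66 | 41.53 |
| 2 | Argentina | Beef | 55.48 | 1712.0 |
| 3 | Argentina | Lamb & Goat | 1.56 | 54.63 |
| 4 | Argentina | Fish | 4.36 | 6.96 |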


### 3.

How many rows and columns are there in the dataframe?

In [30]:
# Your answer here.
# df.shape is a (rows, columns) tuple, so it can be unpacked directly.
df_rows, df_columns = df.shape

print('rows:', df_rows)
print('columns:', df_columns)

rows: 1430
columns: 4


### 4.

What is the type of data in each column of `df`?

In [4]:
# Your answer here.
df.dtypes

country           object
food_category     object
consumption      float64
co2_emmission    float64
dtype: object

### 5.

What is the mean `co2_emmission` of the whole dataset?

In [5]:
# Your answer here.
mean_co2_emmission = df['co2_emmission'].mean()
print('The mean co2_emission of the whole dataset is: ', mean_co2_emmission)

The mean co2_emission of the whole dataset is:  74.383993006993


### 6.

How many different kinds of foods are there in the dataset? How many countries are in the dataset?

In [6]:
# Your answer here.
unique_kinds = df['food_category'].unique()
print(unique_kinds)
print()
num_unique_kinds = len(unique_kinds)
print('We have', num_unique_kinds, 'kinds of foods in the dataset')

['Pork' 'Poultry' 'Beef' 'Lamb & Goat' 'Fish' 'Eggs' 'Milk - inc. cheese'
 'Wheat and Wheat Products' 'Rice' 'Soybeans' 'Nuts inc. Peanut Butter']

We have 11 kinds of foods in the dataset
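
When only the count is needed, `Series.nunique()` returns it directly, without first building the array of unique values. A minimal sketch on a small made-up frame (the rows in `toy` are illustrative and not taken from the dataset):

```python
import pandas as pd

# Hypothetical toy data standing in for the real dataset.
toy = pd.DataFrame({
    'country': ['A', 'A', 'B', 'B', 'C'],
    'food_category': ['Beef', 'Rice', 'Beef', 'Rice', 'Eggs'],
})

# nunique() counts the distinct values in a column.
n_foods = toy['food_category'].nunique()
n_countries = toy['country'].nunique()
print(n_foods, 'food categories,', n_countries, 'countries')
```

The same two calls should work unchanged on `df`.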


In [7]:
# Your answer here.
unique_country = df['country'].unique()
print(unique_country)
print()
num_unique_country = len(unique_country)
print('We have', num_unique_country, 'different countries in the dataset')

['Argentina' 'Australia' 'Albania' 'Iceland' 'New Zealand' 'USA' 'Uruguay'
 'Luxembourg' 'Brazil' 'Kazakhstan' 'Sweden' 'Bermuda' 'Denmark' 'Finland'
 'Ireland' 'Greece' 'France' 'Canada' 'Norway' 'Hong Kong SAR. China'
 'French Polynesia' 'Israel' 'Switzerland' 'Netherlands' 'Kuwait'
 'United Kingdom' 'Austria' 'Oman' 'Italy' 'Bahamas' 'Portugal' 'Malta'
 'Armenia' 'Slovenia' 'Chile' 'Venezuela' 'Belgium' 'Germany' 'Russia'
 'Croatia' 'Belarus' 'Spain' 'Paraguay' 'New Caledonia' 'South Africa'
 'Barbados' 'Lithuania' 'Turkey' 'Estonia' 'Mexico' 'Costa Rica' 'Bolivia'
 'Ecuador' 'Panama' 'Czech Republic' 'Romania' 'Colombia' 'Maldives'
 'Cyprus' 'Serbia' 'United Arab Emirates' 'Algeria' 'Ukraine' 'Pakistan'
 'Swaziland' 'Latvia' 'Bosnia and Herzegovina' 'Fiji' 'South Korea'
 'Poland' 'Saudi Arabia' 'Botswana' 'Macedonia' 'Hungary'
 'Trinidad and Tobago' 'Tunisia' 'Egypt' 'Mauritius' 'Bulgaria' 'Morocco'
 'Slovakia' 'Niger' 'Kenya' 'Jordan' 'Japan' 'Georgia' 'Grenada'
 'El Salvador' 'Cu

### 7.

What is the maximum `co2_emmission` in the dataset and which food type and country does it belong to?

In [8]:
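# Your answer here.
# Locate the row with the largest co2_emmission, then read its country and food type.
# (Calling .max() on several columns returns each column's own maximum,
# and those values need not come from the same row.)
max_row = df.loc[df['co2_emmission'].idxmax(), ['co2_emmission', 'country', 'food_category']]
print(max_row)

co2_emmission       1712.0
country          Argentina
food_category         Beef
Name: 2, dtype: object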
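
`DataFrame.max()` works column by column, so the values it returns can come from different rows: for text columns it simply picks the alphabetically last entry. One way to get the single row holding the largest value is `idxmax`. The sketch below contrasts the two on made-up data (the `toy` frame is hypothetical):

```python
import pandas as pd

# Hypothetical toy data; the largest emission belongs to country 'B' / 'Beef'.
toy = pd.DataFrame({
    'country': ['A', 'B', 'C'],
    'food_category': ['Rice', 'Beef', 'Eggs'],
    'co2_emmission': [5.0, 90.0, 12.0],
})

# Column-wise maximum: each column is reduced independently.
col_max = toy.max()

# Row lookup: idxmax() gives the index label of the largest value,
# and .loc[] returns that whole row.
row_max = toy.loc[toy['co2_emmission'].idxmax()]

print('column-wise:', col_max['country'], col_max['food_category'])
print('row lookup: ', row_max['country'], row_max['food_category'])
```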


### 8.

How many countries produce more than 1000 kg CO2/person/year for at least one food type?

In [17]:
# Your answer here.
# Keep only the rows above 1000 kg CO2/person/year, then count the distinct countries.
df_high = df[df['co2_emmission'] > 1000]
n_countries = df_high['country'].nunique()

print(n_countries, 'countries produced more than 1000 kg CO2/person/year for at least one food type')

5 countries produced more than 1000 kg CO2/person/year for at least one food type


### 9.

Which country consumes the least amount of beef per person per year?

In [21]:
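# Your answer here.
# Sort the beef rows by consumption; the first row is the lowest consumer.
df.query("food_category == 'Beef'").sort_values(by='consumption').head(2)

|      | country | food_category | consumption | co2_emmission |
|-----:|:--------|:--------------|------------:|--------------:|
| 1410 | Liberia | Beef | 0.78 | 24.07 |
| 1333 | India | Beef | 0.81 | 24.99 |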
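
As an alternative to sorting the whole frame, `nsmallest` and `nlargest` return just the extreme rows. A minimal sketch, assuming a made-up `toy` frame rather than the real data:

```python
import pandas as pd

# Hypothetical toy data: three countries, one food category.
toy = pd.DataFrame({
    'country': ['A', 'B', 'C'],
    'food_category': ['Beef', 'Beef', 'Beef'],
    'consumption': [3.2, 0.5, 7.1],
})

# One row each: the lowest and the highest consumer.
lowest = toy.nsmallest(1, 'consumption')
highest = toy.nlargest(1, 'consumption')

print('lowest: ', lowest['country'].iloc[0])
print('highest:', highest['country'].iloc[0])
```

On `df`, the same pattern would be applied after filtering to a single `food_category`.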


### 10.

Which country consumes the most soybeans per person per year?

In [31]:
# Your answer here.
# Filter to soybeans, sort ascending by consumption, and take the last row.
df_soy = df[df['food_category'] == 'Soybeans']
df_soy_sorted = df_soy.sort_values(by='consumption')
top_country = df_soy_sorted.iloc[-1]['country']
print(top_country, 'consumes the most soybeans')

Taiwan. ROC consumes the most soybeans


### 11.

What are the total emissions of all the meat products (Pork, Poultry, Fish, Lamb & Goat, Beef) in the dataset combined?

In [23]:
# Your answer here.
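# Sum emissions per food category, then select the meat categories by name
# rather than by position, which would break if the category order changed.
meat = ['Beef', 'Lamb & Goat', 'Pork', 'Poultry', 'Fish']
emissions_by_food = df.groupby('food_category')['co2_emmission'].sum()
meat_total = emissions_by_food.loc[meat].sum()
print('The total emissions of all the meat products is', meat_total)

The total emissions of all the meat products is 74441.13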


### 12.

What are the total emissions of all other (non-meat) products in the dataset combined?

In [29]:
# Your answer here.
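# Sum emissions per food category, then select the non-meat categories by name.
non_meat = ['Eggs', 'Milk - inc. cheese', 'Nuts inc. Peanut Butter',
            'Rice', 'Soybeans', 'Wheat and Wheat Products']
emissions_by_food = df.groupby('food_category')['co2_emmission'].sum()
non_meat_total = emissions_by_food.loc[non_meat].sum()
print('Total emissions of all non-meat products in the dataset is', non_meat_total)

Total emissions of all non-meat products in the dataset is 31927.98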
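
The meat and non-meat totals can also be computed from a single boolean mask built with `isin`, which avoids listing the non-meat categories by hand. A sketch on made-up numbers (the `toy` frame and its values are hypothetical; the `meat` list is the one given in exercise 11):

```python
import pandas as pd

# Hypothetical toy data with two meat and two non-meat rows.
toy = pd.DataFrame({
    'food_category': ['Beef', 'Rice', 'Fish', 'Eggs'],
    'co2_emmission': [100.0, 10.0, 20.0, 5.0],
})

meat = ['Pork', 'Poultry', 'Fish', 'Lamb & Goat', 'Beef']

# True where the row's category is a meat product.
is_meat = toy['food_category'].isin(meat)

meat_total = toy.loc[is_meat, 'co2_emmission'].sum()
non_meat_total = toy.loc[~is_meat, 'co2_emmission'].sum()  # ~ negates the mask

print('meat:', meat_total)
print('non-meat:', non_meat_total)
```

Because this sums the rows in a different order than the grouped approach, the last few decimal places on the real data may differ slightly due to floating-point rounding.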
