### This project features interactive maps and HTML markup. Since GitHub does not render dynamic notebook output, download the project to render these elements and to interact with the data visualizations.

# Data Analysis

This is the <b>second in a series of three Jupyter notebooks</b> on the 2018 Food Consumption and CO<sub>2</sub> Emissions. This activity is in partial fulfillment of the Tidy Tuesdays deliverables for probationary Lyrids of the <b>Center for Complexity and Emerging Technologies, College of Computer Studies, De La Salle University</b>.

<b>Climate change and global warming are pressing environmental issues, and among their foremost drivers are human-induced emissions of greenhouse gases, such as carbon dioxide (CO<sub>2</sub>)</b>. While this project does not seek to present a professional or rigorous statistical analysis, the author of this series of Jupyter notebooks would like to raise awareness of the importance of data-driven policy directions and, hopefully, to contribute to the present discourse on how food consumption can greatly impact our carbon footprint.

<hr/>

The required dataset for this Tidy Tuesdays activity is the 2018 Food Consumption and CO<sub>2</sub> Emissions dataset from the R Community's Tidy Tuesdays (GitHub): https://github.com/rfordatascience/tidytuesday/tree/master/data/2020/2020-02-18.

To enrich the analysis and visualization, the following datasets were integrated:

- Country Codes and Names - http://country.io/names.json
- World Bank Country and Lending Groups (Classification for the 2018 Fiscal Year) - https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups

These datasets are stored in the folder <code>data</code> of the repository.

# PRELIMINARIES *(Lifted from Data Preparation)*

<b>Due to restrictions on the size of files in GitHub repositories, the project had to be divided into separate notebooks. This section therefore repeats the pertinent code excerpts from the data preparation phase.</b>

For the complete documentation, please refer to this notebook: <code>1. Data Preparation.ipynb</code>.

In [1]:
import re
import json

import pandas as pd
import numpy as np
import scipy.stats
import matplotlib.pyplot as plt
from pywaffle import Waffle

import plotly.express as px
import plotly.graph_objs as go
from plotly.subplots import make_subplots
from plotly.offline import iplot, init_notebook_mode
init_notebook_mode(connected = True)

pd.options.mode.chained_assignment = None  

NUM_ROWS = 10

pd.set_option('display.max_rows', NUM_ROWS)
pd.set_option('display.min_rows', NUM_ROWS)

In [2]:
standardized_names = [("Taiwan. ROC", "Taiwan"),
                      ("USA", "United States"),
                      ("Hong Kong SAR. China", "Hong Kong"),
                      ("Congo", "Republic of the Congo")]

standardized_categories = [("Milk - inc. cheese", "Milk & Cheese"),
                           ("Wheat and Wheat Products", "Wheat & Wheat Products"),
                           ("Nuts inc. Peanut Butter", "Nuts & Peanut Butter")]

In [3]:
data_raw = pd.read_csv('data/food_consumption.csv')
data = data_raw.copy(deep = True)

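for name in standardized_names:
    # re.escape makes the "." in names such as "Taiwan. ROC" match literally;
    # regex = True is passed explicitly because pandas 2.0 changed the default to False
    data['country'] = data['country'].str.replace(re.escape(name[0]), name[1], regex = True)

for category in standardized_categories:
    data['food_category'] = data['food_category'].str.replace(re.escape(category[0]), category[1], regex = True)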
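
Because every standardized name replaces a whole cell value, the same clean-up can also be sketched with a dictionary lookup, which avoids regular expressions altogether. The sample rows and the name `name_map` below are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Hypothetical sample rows; the spellings mirror entries in `standardized_names`.
sample = pd.DataFrame({"country": ["USA", "Taiwan. ROC", "Congo", "Finland"]})

# Map each original spelling to its standardized form.
name_map = {
    "Taiwan. ROC": "Taiwan",
    "USA": "United States",
    "Hong Kong SAR. China": "Hong Kong",
    "Congo": "Republic of the Congo",
}

# Series.replace with a dict swaps whole cell values only, so no escaping is needed.
sample["country"] = sample["country"].replace(name_map)
print(sample["country"].tolist())
```

Values that are not keys of the dictionary (here, Finland) are left unchanged.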

<hr/>

# DATA ANALYSIS

Essentially, the dataset presents country-level data on the consumption (in kg/person/year) of categories of food products and the associated carbon dioxide emission (in kg CO<sub>2</sub>/person/year). This nature of the dataset, thus, lends itself to three themes of analysis:
- Consumption and emission by country
- Consumption and emission by food product
- Correlations between selected variables

The goal of this step is twofold: (1) detecting anomalies that may have to be addressed before proceeding to visualization and further analysis (or, possibly, forecasting) and (2) recognizing notable trends and patterns that may hopefully enrich our understanding of the data at hand.

### *Animal-Based vs. Plant-Based Products*

An interesting angle to explore, as suggested by the dataset's creator (nu3, 2018), is splitting the food categories into animal-based and plant-based products.

In [4]:
animal_products = ['country', 'Beef', 'Eggs', 'Fish', 'Lamb & Goat', 'Milk & Cheese', 'Pork', 'Poultry']
plant_products = ['country', 'Nuts & Peanut Butter', 'Rice', 'Soybeans', 'Wheat & Wheat Products']

*Note: <code>country</code> is the first entry of each list so that the lists can be used directly to select columns from the pivoted dataset (which will be constructed in the next section).*

## A. By Country

We first perform an independent (separate) analysis of the country-level data related to consumption and carbon dioxide emission.

### 1. Consumption

We pivot the dataset so that the columns are now the food categories and each row corresponds to a country. We also add a column showing the total consumption per country.

In [5]:
country_data = data.pivot(index = 'country', columns = 'food_category', values = 'consumption').reset_index()

country_data['TOTAL'] = country_data['Beef'] + country_data['Eggs'] + country_data['Fish'] \
                        + country_data['Lamb & Goat'] + country_data['Milk & Cheese'] \
                        + country_data['Nuts & Peanut Butter'] + country_data['Pork'] + country_data['Poultry'] \
                        + country_data['Rice'] + country_data['Soybeans'] + country_data['Wheat & Wheat Products']

print("-- Food Product Consumption per Country --")
country_data

-- Food Product Consumption per Country --


food_category,country,Beef,Eggs,Fish,Lamb & Goat,Milk & Cheese,Nuts & Peanut Butter,Pork,Poultry,Rice,Soybeans,Wheat & Wheat Products,TOTAL
0,Albania,22.50,12.45,3.85,15.32,303.72,4.36,10.88,13.23,7.78,0.00,138.64,532.73
1,Algeria,5.60,8.06,3.74,7.69,141.53,2.08,0.00,7.42,2.97,0.00,185.42,364.51
2,Angola,8.42,1.11,15.24,1.08,12.30,2.26,8.89,17.33,8.12,0.52,40.72,115.99
3,Argentina,55.48,11.39,4.36,1.56,195.08,0.49,10.51,38.66,8.77,0.00,103.11,429.41
4,Armenia,19.66,11.69,4.36,3.02,209.03,2.55,9.67,13.35,3.18,0.00,130.60,407.11
...,...,...,...,...,...,...,...,...,...,...,...,...,...
125,Uruguay,29.10,13.14,6.53,8.23,210.54,0.95,16.84,27.45,11.50,0.01,109.31,433.60
126,Venezuela,25.89,5.63,8.34,0.32,117.79,0.35,7.23,39.28,23.39,0.00,49.17,277.39
127,Vietnam,7.44,3.84,26.52,0.14,16.36,6.28,35.00,12.36,144.56,5.75,10.49,268.74
128,Zambia,4.76,3.32,6.20,0.68,9.71,5.04,1.66,3.29,3.05,7.30,12.10,57.11
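
As a minimal sketch of the pivot-then-total step, the toy example below reshapes a small long-format table and sums all food columns in one call. The frame `long_df` and the helper list `food_cols` are illustrative assumptions; only two categories are included, so the totals differ from those in the table above.

```python
import pandas as pd

# Toy long-format data shaped like food_consumption.csv
# (values copied from the Albania and Zambia rows above; two categories only).
long_df = pd.DataFrame({
    "country": ["Albania", "Albania", "Zambia", "Zambia"],
    "food_category": ["Beef", "Rice", "Beef", "Rice"],
    "consumption": [22.50, 7.78, 4.76, 3.05],
})

# One row per country, one column per food category.
wide = long_df.pivot(index="country", columns="food_category",
                     values="consumption").reset_index()

# Sum every food column at once instead of adding the columns one by one.
food_cols = [c for c in wide.columns if c != "country"]
wide["TOTAL"] = wide[food_cols].sum(axis=1)
print(wide)
```

One design difference to keep in mind: `sum(axis=1)` skips missing values by default, whereas chaining `+` propagates `NaN` to the total.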


We arrange the entries based on the total consumption. As a preliminary visualization, we employ blue gradients to correspond to the amount of food products consumed.

In [6]:
sorted_country_data = country_data.sort_values(by = 'TOTAL', ascending = False).reset_index(drop = True)

print("-- Food Product Consumption per Country (Sorted) --")
sorted_country_data.style.background_gradient(cmap = 'Blues')

-- Food Product Consumption per Country (Sorted) --


food_category,country,Beef,Eggs,Fish,Lamb & Goat,Milk & Cheese,Nuts & Peanut Butter,Pork,Poultry,Rice,Soybeans,Wheat & Wheat Products,TOTAL
0,Finland,19.22,9.55,33.8,0.53,430.76,3.43,36.14,19.87,4.42,0.08,81.99,639.79
1,Lithuania,4.49,13.11,42.39,0.24,295.46,2.13,45.67,26.84,3.07,0.02,121.59,555.01
2,Sweden,24.58,13.37,23.86,1.41,341.23,6.23,37.0,16.64,5.96,0.13,79.59,550.0
3,Netherlands,17.67,14.03,18.64,0.94,341.47,7.94,36.36,23.9,2.93,0.12,70.17,534.17
4,Albania,22.5,12.45,3.85,15.32,303.72,4.36,10.88,13.23,7.78,0.0,138.64,532.73
5,Ireland,22.35,8.96,17.39,4.1,291.86,4.1,32.4,26.26,3.0,0.25,107.98,518.65
6,Switzerland,21.26,10.53,13.48,1.42,318.69,9.27,31.49,16.38,2.43,0.44,89.51,514.9
7,Italy,18.6,13.34,15.6,0.92,246.88,7.63,40.28,18.61,5.74,0.01,146.37,513.98
8,Denmark,28.46,15.35,16.49,0.92,277.3,5.94,24.87,26.75,4.96,0.03,98.0,499.07
9,Luxembourg,29.88,14.64,23.09,1.67,255.3,0.93,43.58,21.37,4.2,0.04,103.2,497.9


Below are some observations that can be noted from the table above:
- The ten countries with the highest total food product consumption are all in Europe, with Finland as the top consumer (639.79 kg/person/year). In particular, consumption of milk and cheese, eggs, and pork is noticeably high. The European Union is the biggest exporter of cheese and skimmed milk powder globally, as well as the second-largest producer and the foremost exporter of pork in the world (European Union, 2021).
- In these countries, the consumption of milk and cheese products far outweighs that of any other category, accounting for more than half of total consumption in every country except Italy (about 48%).
- On the other hand, the ten countries with the lowest total food product consumption are all in Africa, with Rwanda as the lowest consumer (40 kg/person/year).
- This trend may indicate a relationship between a country's economy and its consumption, since consumption is tied to the resources available to a country. In particular, countries with adequate or more-than-adequate resources to produce goods, and even to participate in trade, also tend to have a more ample supply of food products (either locally produced or imported) for their citizens.
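
The weight of milk and cheese in total consumption can be checked directly from the sorted table. The sketch below recomputes the share for the ten rows shown above; the column name `milk_share_pct` is an assumption introduced for illustration.

```python
import pandas as pd

# Values copied from the sorted table above (top ten by total consumption).
top10 = pd.DataFrame({
    "country": ["Finland", "Lithuania", "Sweden", "Netherlands", "Albania",
                "Ireland", "Switzerland", "Italy", "Denmark", "Luxembourg"],
    "Milk & Cheese": [430.76, 295.46, 341.23, 341.47, 303.72,
                      291.86, 318.69, 246.88, 277.30, 255.30],
    "TOTAL": [639.79, 555.01, 550.00, 534.17, 532.73,
              518.65, 514.90, 513.98, 499.07, 497.90],
})

# Share of milk and cheese in each country's total, in percent.
top10["milk_share_pct"] = (100 * top10["Milk & Cheese"] / top10["TOTAL"]).round(1)
print(top10[["country", "milk_share_pct"]])
```

For these ten rows the share ranges from about 48% (Italy) to about 67% (Finland).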

### *Animal-Based Food Products*

We now zero in on the food products that are derived from animals.

In [7]:
animal_country_data = country_data[animal_products].copy()  # explicit copy before adding the TOTAL column

animal_country_data['TOTAL'] = animal_country_data['Beef'] + animal_country_data['Eggs'] + animal_country_data['Fish'] \
                               + animal_country_data['Lamb & Goat'] + animal_country_data['Milk & Cheese'] \
                               + animal_country_data['Pork'] + animal_country_data['Poultry']

print("-- Animal-Based Food Product Consumption per Country --")
animal_country_data

-- Animal-Based Food Product Consumption per Country --


food_category,country,Beef,Eggs,Fish,Lamb & Goat,Milk & Cheese,Pork,Poultry,TOTAL
0,Albania,22.50,12.45,3.85,15.32,303.72,10.88,13.23,381.95
1,Algeria,5.60,8.06,3.74,7.69,141.53,0.00,7.42,174.04
2,Angola,8.42,1.11,15.24,1.08,12.30,8.89,17.33,64.37
3,Argentina,55.48,11.39,4.36,1.56,195.08,10.51,38.66,317.04
4,Armenia,19.66,11.69,4.36,3.02,209.03,9.67,13.35,270.78
...,...,...,...,...,...,...,...,...,...
125,Uruguay,29.10,13.14,6.53,8.23,210.54,16.84,27.45,311.83
126,Venezuela,25.89,5.63,8.34,0.32,117.79,7.23,39.28,204.48
127,Vietnam,7.44,3.84,26.52,0.14,16.36,35.00,12.36,101.66
128,Zambia,4.76,3.32,6.20,0.68,9.71,1.66,3.29,29.62


We arrange the entries based on the total consumption. As a preliminary visualization, we employ red gradients to correspond to the amount of food products consumed.

In [8]:
sorted_animal_country_data = animal_country_data.sort_values(by = 'TOTAL', ascending = False).reset_index(drop = True)

print("-- Animal-Based Food Product Consumption per Country (Sorted) --")
sorted_animal_country_data.style.background_gradient(cmap = 'Reds')

-- Animal-Based Food Product Consumption per Country (Sorted) --


food_category,country,Beef,Eggs,Fish,Lamb & Goat,Milk & Cheese,Pork,Poultry,TOTAL
0,Finland,19.22,9.55,33.8,0.53,430.76,36.14,19.87,549.87
1,Sweden,24.58,13.37,23.86,1.41,341.23,37.0,16.64,458.09
2,Netherlands,17.67,14.03,18.64,0.94,341.47,36.36,23.9,453.01
3,Lithuania,4.49,13.11,42.39,0.24,295.46,45.67,26.84,428.2
4,Switzerland,21.26,10.53,13.48,1.42,318.69,31.49,16.38,413.25
5,Ireland,22.35,8.96,17.39,4.1,291.86,32.4,26.26,403.32
6,United States,36.24,14.58,12.35,0.43,254.69,27.64,50.01,395.94
7,Iceland,13.36,8.24,74.41,21.12,225.82,21.69,26.87,391.51
8,Denmark,28.46,15.35,16.49,0.92,277.3,24.87,26.75,390.14
9,Luxembourg,29.88,14.64,23.09,1.67,255.3,43.58,21.37,389.53


Below are some observations that can be noted from the table above:

- Once again, the ten countries with the highest consumption are mostly in Europe, with the sole exception of the United States. Finland, the top consumer of animal-based products at 549.87 kg/person/year, is also the top consumer of milk and cheese products in the world, at 430.76 kg/person/year. Meanwhile, Iceland (ranked 8<sup>th</sup>, consuming 391.51 kg/person/year) is the top consumer of lamb and goat products globally (21.12 kg/person/year).
- In these countries, the consumption of milk and cheese products also far outweighs that of the other categories. Considering only animal-based food consumption, milk and cheese products alone account for roughly 58% (Iceland) to 78% (Finland) of the total.
- Similar to the trend observed for overall food consumption, the ten countries with the lowest animal-based food product consumption are all in Africa, with Rwanda still being the lowest consumer at 17.83 kg/person/year.
- This trend may indicate a relationship between a country's economy and its consumption, since consumption is tied to the resources available to a country. In particular, countries with adequate or more-than-adequate resources to produce goods, and even to participate in trade, also tend to have a more ample supply of food products (either locally produced or imported) for their citizens.
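
Per-category leaders, such as Finland for milk and cheese or Iceland for lamb and goat, can be read off with `idxmax`. The sketch below uses only four rows copied from the table above, so it identifies the leaders among those rows rather than across all countries.

```python
import pandas as pd

# A few rows copied from the sorted animal-based table above (a sample only).
rows = pd.DataFrame({
    "country": ["Finland", "Lithuania", "United States", "Iceland"],
    "Fish": [33.80, 42.39, 12.35, 74.41],
    "Lamb & Goat": [0.53, 0.24, 0.43, 21.12],
    "Milk & Cheese": [430.76, 295.46, 254.69, 225.82],
    "Poultry": [19.87, 26.84, 50.01, 26.87],
}).set_index("country")

# For each column, idxmax returns the index label (country) of the largest value.
leaders = rows.idxmax()
print(leaders)
```

Applied to the full pivoted table (indexed by country), the same call would give the top consumer of every food category at once.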

### *Plant-Based Food Products*

We now zero in on the food products that are derived from plants.

In [9]:
plant_country_data = country_data[plant_products].copy()  # explicit copy before adding the TOTAL column

plant_country_data['TOTAL'] = plant_country_data['Nuts & Peanut Butter'] \
                              + plant_country_data['Rice'] + plant_country_data['Soybeans'] \
                              + plant_country_data['Wheat & Wheat Products']

print("-- Plant-Based Food Product Consumption per Country --")
plant_country_data

-- Plant-Based Food Product Consumption per Country --


food_category,country,Nuts & Peanut Butter,Rice,Soybeans,Wheat & Wheat Products,TOTAL
0,Albania,4.36,7.78,0.00,138.64,150.78
1,Algeria,2.08,2.97,0.00,185.42,190.47
2,Angola,2.26,8.12,0.52,40.72,51.62
3,Argentina,0.49,8.77,0.00,103.11,112.37
4,Armenia,2.55,3.18,0.00,130.60,136.33
...,...,...,...,...,...,...
125,Uruguay,0.95,11.50,0.01,109.31,121.77
126,Venezuela,0.35,23.39,0.00,49.17,72.91
127,Vietnam,6.28,144.56,5.75,10.49,167.08
128,Zambia,5.04,3.05,7.30,12.10,27.49


We arrange the entries based on the total consumption. As a preliminary visualization, we employ green gradients to correspond to the amount of food products consumed.

In [10]:
sorted_plant_country_data = plant_country_data.sort_values(by = 'TOTAL', ascending = False).reset_index(drop = True)

print("-- Plant-Based Food Product Consumption per Country (Sorted) --")
sorted_plant_country_data.style.background_gradient(cmap = 'Greens')

-- Plant-Based Food Product Consumption per Country (Sorted) --


food_category,country,Nuts & Peanut Butter,Rice,Soybeans,Wheat & Wheat Products,TOTAL
0,Tunisia,6.34,1.07,1.88,197.5,206.79
1,Iran,12.16,29.95,0.0,153.31,195.42
2,Bangladesh,0.72,171.73,0.61,17.47,190.53
3,Algeria,2.08,2.97,0.0,185.42,190.47
4,Turkey,7.97,10.74,1.71,169.96,190.38
5,Egypt,1.82,39.77,0.64,146.83,189.06
6,Morocco,3.23,1.1,0.0,179.7,184.03
7,United Arab Emirates,23.03,56.25,0.08,101.29,180.65
8,Georgia,2.47,2.64,0.0,163.43,168.54
9,Cambodia,1.25,159.1,4.33,2.74,167.42


Below are some observations that can be noted from the table above:

- For plant-based food product consumption, the top ten consumers shift from Europe to Asia (predominantly West Asia) and North Africa (Tunisia, Algeria, Egypt, and Morocco). The top consumer is Tunisia at 206.79 kg/person/year. Meanwhile, the top consumer of rice is Bangladesh, at 171.73 kg/person/year; this country is also the world's fourth-largest producer of rice (Mottaleb, Rahut, Kruseman & Erenstein, 2017). Moreover, the top consumer of nuts and peanut butter is the United Arab Emirates, at 23.03 kg/person/year.
- In most of these countries (with the notable exceptions of Bangladesh in South Asia and Cambodia in Southeast Asia, where rice dominates), the primary plant-based products consumed are wheat and wheat-derived products, which account for roughly 56% (United Arab Emirates) to 98% (Morocco) of total plant-based consumption.
- Similar to the previously observed trends, the ten countries with the lowest plant-based food product consumption are mostly in Africa, with the exception of Paraguay and El Salvador in South America and Central America, respectively. Uganda has the lowest plant-based food consumption at 22.55 kg/person/year, followed by Rwanda (which has the lowest overall and animal-based consumption) at 18.56 kg/person/year.
- This trend may indicate a relationship between a country's economy and its consumption, since consumption is tied to the resources available to a country. In particular, countries with adequate or more-than-adequate resources to produce goods, and even to participate in trade, also tend to have a more ample supply of food products (either locally produced or imported) for their citizens.
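
The sorted tables display only the highest totals, so the lowest consumers have to be retrieved separately, for example with `nsmallest`. The sketch below runs on the nine rows visible in the unsorted plant-based table, so its result describes that sample only and not the full dataset.

```python
import pandas as pd

# Rows copied from the unsorted plant-based table above (a sample, not all countries).
sample = pd.DataFrame({
    "country": ["Albania", "Algeria", "Angola", "Argentina", "Armenia",
                "Uruguay", "Venezuela", "Vietnam", "Zambia"],
    "TOTAL": [150.78, 190.47, 51.62, 112.37, 136.33,
              121.77, 72.91, 167.08, 27.49],
})

# The three smallest totals, in ascending order.
lowest = sample.nsmallest(3, "TOTAL").reset_index(drop=True)
print(lowest)
```

On the full table, `nsmallest(10, 'TOTAL')` would list the ten lowest consumers discussed above.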

### *Bar Graphs*

In order to give a more visual presentation of the data analyzed above, we construct stacked bar graphs to show the animal- and plant-based food product consumption of the respective top 20 countries.

In [11]:
sorted_animal_country_data = sorted_animal_country_data.head(20)
sorted_plant_country_data = sorted_plant_country_data.head(20)

*Note that this graph showing the animal-based consumption of the top 20 countries is interactive. Hovering over the bars or clicking on the legend allows for a more granular look at the data.*

In [12]:
fig = go.Figure(data = [
    go.Bar(orientation = 'h',
          y = sorted_animal_country_data['country'],
          x = sorted_animal_country_data['Beef'],
          name = 'Beef'),
    go.Bar(orientation = 'h',
          y = sorted_animal_country_data['country'],
          x = sorted_animal_country_data['Eggs'],
          name = 'Eggs'),
    go.Bar(orientation = 'h',
          y = sorted_animal_country_data['country'],
          x = sorted_animal_country_data['Fish'],
          name = 'Fish'),
    go.Bar(orientation = 'h',
          y = sorted_animal_country_data['country'],
          x = sorted_animal_country_data['Lamb & Goat'],
          name = 'Lamb & Goat'),
    go.Bar(orientation = 'h',
          y = sorted_animal_country_data['country'],
          x = sorted_animal_country_data['Milk & Cheese'],
          name = 'Milk & Cheese'),
    go.Bar(orientation = 'h',
          y = sorted_animal_country_data['country'],
          x = sorted_animal_country_data['Pork'],
          name = 'Pork'),
    go.Bar(orientation = 'h',
          y = sorted_animal_country_data['country'],
          x = sorted_animal_country_data['Poultry'],
          name = 'Poultry'),
])

fig.update_layout(barmode = 'stack',
                 yaxis = dict(autorange = 'reversed'),
                 title = "Animal-Based Food Product Consumption of Top 20 Consumers",
                 xaxis_title = "Consumption (kg/person/year)",
                 yaxis_title = "Country")

fig.show()

This graph corroborates our earlier observation that, especially for most European states, milk and cheese products account for the bulk (well over half) of these countries' animal-based food product consumption.

*Note that this graph showing the plant-based consumption of the top 20 countries is interactive. Hovering over the bars or clicking on the legend allows for a more granular look at the data.*

In [13]:
fig = go.Figure(data = [
    go.Bar(orientation = 'h',
          y = sorted_plant_country_data['country'],
          x = sorted_plant_country_data['Nuts & Peanut Butter'],
          name = 'Nuts & Peanut Butter'),
    go.Bar(orientation = 'h',
          y = sorted_plant_country_data['country'],
          x = sorted_plant_country_data['Rice'],
          name = 'Rice'),
    go.Bar(orientation = 'h',
          y = sorted_plant_country_data['country'],
          x = sorted_plant_country_data['Soybeans'],
          name = 'Soybeans'),
    go.Bar(orientation = 'h',
          y = sorted_plant_country_data['country'],
          x = sorted_plant_country_data['Wheat & Wheat Products'],
          name = 'Wheat & Wheat Products',
          marker_color = 'sandybrown')
])

fig.update_layout(barmode = 'stack',
                 yaxis = dict(autorange = 'reversed'),
                 title = "Plant-Based Food Product Consumption of Top 20 Consumers",
                 xaxis_title = "Consumption (kg/person/year)",
                 yaxis_title = "Country")

fig.show()

This graph corroborates our earlier observation that, for most of the top consumers in West Asia and North Africa, wheat and wheat-derived products account for the large majority of total plant-based food product consumption. In Southeast Asian and selected South Asian countries, however, rice predominates.
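
Both stacked bar graphs repeat the same trace definition once per food category. As a sketch of a more compact alternative, the traces can be generated in a loop; the helper name `stacked_bar_traces` is an assumption introduced here, and the traces are written as plain dictionaries, a form that `go.Figure` also accepts.

```python
import pandas as pd

def stacked_bar_traces(df, categories, label_col="country"):
    """Return one horizontal-bar trace spec (a plain dict) per food category."""
    return [
        dict(type="bar", orientation="h", name=cat,
             y=df[label_col].tolist(), x=df[cat].tolist())
        for cat in categories
    ]

# Two rows copied from the sorted plant-based table above.
demo = pd.DataFrame({
    "country": ["Tunisia", "Bangladesh"],
    "Rice": [1.07, 171.73],
    "Wheat & Wheat Products": [197.50, 17.47],
})

traces = stacked_bar_traces(demo, ["Rice", "Wheat & Wheat Products"])
print([t["name"] for t in traces])

# With plotly imported as in the preliminaries, the specs can be used unchanged:
#   fig = go.Figure(data = traces)
#   fig.update_layout(barmode = 'stack', yaxis = dict(autorange = 'reversed'))
```

Passing `plant_products[1:]` or `animal_products[1:]` as `categories` would skip the leading `country` entry of those lists.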

### 2. Carbon Dioxide Emission

We pivot the dataset so that the columns are now the food categories and each row corresponds to a country. We also add a column showing the total emission per country.

In [14]:
co2_data = data.pivot(index = 'country', columns = 'food_category', values = 'co2_emmission').reset_index()

co2_data['TOTAL'] = co2_data['Beef'] + co2_data['Eggs'] + co2_data['Fish'] \
                    + co2_data['Lamb & Goat'] + co2_data['Milk & Cheese'] \
                    + co2_data['Nuts & Peanut Butter'] + co2_data['Pork'] + co2_data['Poultry'] \
                    + co2_data['Rice'] + co2_data['Soybeans'] + co2_data['Wheat & Wheat Products']

print("-- Carbon Dioxide Emission per Country --")
co2_data

-- Carbon Dioxide Emission per Country --


food_category,country,Beef,Eggs,Fish,Lamb & Goat,Milk & Cheese,Nuts & Peanut Butter,Pork,Poultry,Rice,Soybeans,Wheat & Wheat Products,TOTAL
0,Albania,694.30,11.44,6.15,536.50,432.62,7.72,38.51,14.21,9.96,0.00,26.44,1777.85
1,Algeria,172.80,7.40,5.97,269.30,201.60,3.68,0.00,7.97,3.80,0.00,35.36,707.88
2,Angola,259.82,1.02,24.33,37.82,17.52,4.00,31.47,18.62,10.39,0.23,7.77,412.99
3,Argentina,1712.00,10.46,6.96,54.63,277.87,0.87,37.20,41.53,11.22,0.00,19.66,2172.40
4,Armenia,606.67,10.74,6.96,105.76,297.74,4.51,34.23,14.34,4.07,0.00,24.91,1109.93
...,...,...,...,...,...,...,...,...,...,...,...,...,...
125,Uruguay,897.96,12.07,10.43,288.21,299.89,1.68,59.61,29.49,14.72,0.00,20.85,1634.91
126,Venezuela,798.91,5.17,13.32,11.21,167.78,0.62,25.59,42.19,29.93,0.00,9.38,1104.10
127,Vietnam,229.58,3.53,42.34,4.90,23.30,11.12,123.88,13.28,184.99,2.59,2.00,641.51
128,Zambia,146.88,3.05,9.90,23.81,13.83,8.92,5.88,3.53,3.90,3.29,2.31,225.30


We arrange the entries based on the total emission. As a preliminary visualization, we employ blue gradients to correspond to the amount of carbon dioxide emitted.

In [15]:
sorted_co2_data = co2_data.sort_values(by = 'TOTAL', ascending = False).reset_index(drop = True)

print("-- Carbon Dioxide Emission per Country (Sorted) --")
sorted_co2_data.style.background_gradient(cmap = 'Blues')

-- Carbon Dioxide Emission per Country (Sorted) --


food_category,country,Beef,Eggs,Fish,Lamb & Goat,Milk & Cheese,Nuts & Peanut Butter,Pork,Poultry,Rice,Soybeans,Wheat & Wheat Products,TOTAL
0,Argentina,1712.0,10.46,6.96,54.63,277.87,0.87,37.2,41.53,11.22,0.0,19.66,2172.4
1,Australia,1044.85,7.82,28.25,345.65,334.01,15.45,85.44,49.54,14.12,0.09,13.44,1938.66
2,Albania,694.3,11.44,6.15,536.5,432.62,7.72,38.51,14.21,9.96,0.0,26.44,1777.85
3,New Zealand,693.99,9.1,32.51,662.23,195.5,14.55,78.9,37.58,11.72,0.2,14.67,1750.95
4,Iceland,412.26,7.57,118.81,739.62,321.66,6.87,76.77,28.86,4.98,0.05,13.91,1731.36
5,United States,1118.29,13.39,19.72,15.06,362.78,13.91,97.83,53.72,8.8,0.02,15.34,1718.86
6,Uruguay,897.96,12.07,10.43,288.21,299.89,1.68,59.61,29.49,14.72,0.0,20.85,1634.91
7,Brazil,1211.17,8.25,15.98,21.71,212.63,1.19,44.6,48.34,41.12,1.63,10.11,1616.73
8,Luxembourg,922.03,13.45,36.87,58.48,363.65,1.65,154.25,22.96,5.37,0.02,19.68,1598.41
9,Kazakhstan,721.46,7.62,8.32,334.79,410.4,9.1,36.67,19.74,9.37,0.01,17.6,1575.08


Below are some observations that can be noted from the table above:
- Unlike consumption, emission shows no clear regional trend among the ten countries with the highest figures. They come from different regions: South America (Argentina, Uruguay, and Brazil), Oceania (Australia and New Zealand), Europe (Albania, Iceland, and Luxembourg), Central Asia (Kazakhstan), and North America (the United States). The highest emission is recorded by Argentina at 2172.4 kg CO<sub>2</sub>/person/year.
- Beef and milk and cheese are the primary contributors to emission, although the share of emission due to milk and cheese is less pronounced than their share of consumption. The overall top emitter, Argentina, is also the top emitter of beef-related CO<sub>2</sub> at 1712 kg CO<sub>2</sub>/person/year, accounting for about 78.81% of its total emission. Meanwhile, Iceland, which is also the top consumer of lamb and goat food products, is the top emitter for the same food category, at 739.62 kg CO<sub>2</sub>/person/year, which corresponds to around 42.72% of its total emission.
- On the other hand, the ten countries with the lowest food product-related emission are all in Africa, with Mozambique recording the lowest emission (141.4 kg CO<sub>2</sub>/person/year), followed by Rwanda at 181.63 kg CO<sub>2</sub>/person/year. Interestingly, Rwanda is also the lowest food product consumer. In fact, the ten lowest emitters overlap noticeably with the ten lowest consumers, although their placements are shuffled. The Republic of the Congo is the only country among the ten lowest emitters that is not also among the ten lowest consumers (Cameroon is included instead).
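
The percentage shares quoted for Argentina and Iceland follow from dividing a category's emission by the country's total. A minimal sketch of that arithmetic, using the values in the sorted table above (the dictionary layout is an assumption for illustration):

```python
# Values copied from the sorted emission table above (kg CO2/person/year).
emissions = {
    "Argentina": {"category": "Beef", "value": 1712.00, "total": 2172.40},
    "Iceland": {"category": "Lamb & Goat", "value": 739.62, "total": 1731.36},
}

# Share of the named category in each country's total emission, in percent.
shares = {
    country: round(100 * d["value"] / d["total"], 2)
    for country, d in emissions.items()
}

for country, pct in shares.items():
    print(f"{country}: {emissions[country]['category']} = {pct}% of total emission")
```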

### *Animal-Based Food Products*

We now zero in on the food products that are derived from animals.

In [16]:
animal_co2_data = co2_data[animal_products].copy()  # explicit copy before adding the TOTAL column

animal_co2_data['TOTAL'] = animal_co2_data['Beef'] + animal_co2_data['Eggs'] + animal_co2_data['Fish'] \
                           + animal_co2_data['Lamb & Goat'] + animal_co2_data['Milk & Cheese'] \
                           + animal_co2_data['Pork'] + animal_co2_data['Poultry']

print("-- Animal-Based Food Product-Related Emission per Country --")
animal_co2_data

-- Animal-Based Food Product-Related Emission per Country --


food_category,country,Beef,Eggs,Fish,Lamb & Goat,Milk & Cheese,Pork,Poultry,TOTAL
0,Albania,694.30,11.44,6.15,536.50,432.62,38.51,14.21,1733.73
1,Algeria,172.80,7.40,5.97,269.30,201.60,0.00,7.97,665.04
2,Angola,259.82,1.02,24.33,37.82,17.52,31.47,18.62,390.60
3,Argentina,1712.00,10.46,6.96,54.63,277.87,37.20,41.53,2140.65
4,Armenia,606.67,10.74,6.96,105.76,297.74,34.23,14.34,1076.44
...,...,...,...,...,...,...,...,...,...
125,Uruguay,897.96,12.07,10.43,288.21,299.89,59.61,29.49,1597.66
126,Venezuela,798.91,5.17,13.32,11.21,167.78,25.59,42.19,1064.17
127,Vietnam,229.58,3.53,42.34,4.90,23.30,123.88,13.28,440.81
128,Zambia,146.88,3.05,9.90,23.81,13.83,5.88,3.53,206.88


We arrange the entries based on the total emission. As a preliminary visualization, we employ red gradients to correspond to the amount of carbon dioxide emitted.

In [17]:
sorted_animal_co2_data = animal_co2_data.sort_values(by = 'TOTAL', ascending = False).reset_index(drop = True)

print("-- Animal-Based Food Product-Related Emission per Country (Sorted) --")
sorted_animal_co2_data.style.background_gradient(cmap = 'Reds')

-- Animal-Based Food Product-Related Emission per Country (Sorted) --


food_category,country,Beef,Eggs,Fish,Lamb & Goat,Milk & Cheese,Pork,Poultry,TOTAL
0,Argentina,1712.0,10.46,6.96,54.63,277.87,37.2,41.53,2140.65
1,Australia,1044.85,7.82,28.25,345.65,334.01,85.44,49.54,1895.56
2,Albania,694.3,11.44,6.15,536.5,432.62,38.51,14.21,1733.73
3,New Zealand,693.99,9.1,32.51,662.23,195.5,78.9,37.58,1709.81
4,Iceland,412.26,7.57,118.81,739.62,321.66,76.77,28.86,1705.55
5,United States,1118.29,13.39,19.72,15.06,362.78,97.83,53.72,1680.79
6,Uruguay,897.96,12.07,10.43,288.21,299.89,59.61,29.49,1597.66
7,Luxembourg,922.03,13.45,36.87,58.48,363.65,154.25,22.96,1571.69
8,Brazil,1211.17,8.25,15.98,21.71,212.63,44.6,48.34,1562.68
9,Kazakhstan,721.46,7.62,8.32,334.79,410.4,36.67,19.74,1539.0


Below are some observations that can be noted from the table above:

- The ten countries that registered the highest figures as regards animal-based food product-related emission are the same as those for food product-related emission in general. In fact, their placements also remained constant, with the sole exception of Brazil and Luxembourg exchanging positions (#8 and #9). 
- As noted for overall emission, these countries come from different regions: South America (Argentina, Uruguay, and Brazil), Oceania (Australia and New Zealand), Europe (Albania, Iceland, and Luxembourg), Central Asia (Kazakhstan), and North America (the United States). The highest animal-based food product-related emission is recorded by Argentina at 2140.65 kg CO<sub>2</sub>/person/year.
- Beef and milk and cheese are the primary contributors to emission, although the share of emission due to milk and cheese is less pronounced than their share of consumption. For instance, the overall top emitter, Argentina, is also the top emitter of beef-related CO<sub>2</sub> at 1712 kg CO<sub>2</sub>/person/year, accounting for about 80% of its total animal-based food product-related emission. Iceland, which is also the top consumer of lamb and goat food products, is the top emitter for the same food category, at 739.62 kg CO<sub>2</sub>/person/year, which corresponds to around 43.37% of its total animal-based food product-related emission.
- On the other hand, most of the ten countries with the lowest animal-based food product-related emission are in Africa, with Liberia recording the lowest at 77.44 kg CO<sub>2</sub>/person/year. The non-African countries are the South Asian countries Bangladesh and India and the Southeast Asian country Indonesia. It may be fitting to note that India is a predominantly Hindu country in which the slaughter of cattle (including cows) is banned in most regions; cattle-derived products, like beef and dairy, are among the largest contributors to the carbon footprint.
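
The comparison between the two top-ten rankings (overall emission versus animal-based emission) can be sketched with plain lists copied from the two sorted tables; the variable names are illustrative.

```python
# Country order copied from the two sorted emission tables above.
overall = ["Argentina", "Australia", "Albania", "New Zealand", "Iceland",
           "United States", "Uruguay", "Brazil", "Luxembourg", "Kazakhstan"]
animal = ["Argentina", "Australia", "Albania", "New Zealand", "Iceland",
          "United States", "Uruguay", "Luxembourg", "Brazil", "Kazakhstan"]

# Do both rankings contain the same ten countries?
same_members = set(overall) == set(animal)

# Countries whose 1-based rank differs: {country: (overall rank, animal-based rank)}.
moved = {
    c: (overall.index(c) + 1, animal.index(c) + 1)
    for c in overall
    if overall.index(c) != animal.index(c)
}

print(same_members, moved)
```

Only Brazil and Luxembourg change places, between ranks 8 and 9.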

### *Plant-Based Food Products*

We now zero in on the food products that are derived from plants.

In [18]:
plant_co2_data = co2_data[plant_products].copy()  # explicit copy before adding the TOTAL column

plant_co2_data['TOTAL'] = plant_co2_data['Nuts & Peanut Butter'] \
                          + plant_co2_data['Rice'] + plant_co2_data['Soybeans'] \
                          + plant_co2_data['Wheat & Wheat Products']

print("-- Plant-Based Food Product-Related Emission per Country --")
plant_co2_data

-- Plant-Based Food Product-Related Emission per Country --


food_category,country,Nuts & Peanut Butter,Rice,Soybeans,Wheat & Wheat Products,TOTAL
0,Albania,7.72,9.96,0.00,26.44,44.12
1,Algeria,3.68,3.80,0.00,35.36,42.84
2,Angola,4.00,10.39,0.23,7.77,22.39
3,Argentina,0.87,11.22,0.00,19.66,31.75
4,Armenia,4.51,4.07,0.00,24.91,33.49
...,...,...,...,...,...,...
125,Uruguay,1.68,14.72,0.00,20.85,37.25
126,Venezuela,0.62,29.93,0.00,9.38,39.93
127,Vietnam,11.12,184.99,2.59,2.00,200.70
128,Zambia,8.92,3.90,3.29,2.31,18.42


We arrange the entries based on the total emission. As a preliminary visualization, we employ green gradients to correspond to the amount of carbon dioxide emitted.

In [19]:
sorted_plant_co2_data = plant_co2_data.sort_values(by = 'TOTAL', ascending = False).reset_index(drop = True)

print("-- Plant-Based Food Product-Related Emission per Country (Sorted) --")
sorted_plant_co2_data.style.background_gradient(cmap = 'Greens')

-- Plant-Based Food Product-Related Emission per Country (Sorted) --


food_category,country,Nuts & Peanut Butter,Rice,Soybeans,Wheat & Wheat Products,TOTAL
0,Bangladesh,1.27,219.76,0.27,3.33,224.63
1,Cambodia,2.21,203.6,1.95,0.52,208.28
2,Vietnam,11.12,184.99,2.59,2.0,200.7
3,Indonesia,8.71,172.27,0.5,4.85,186.33
4,Myanmar,7.97,169.94,0.17,1.19,179.27
5,Philippines,3.68,152.85,0.01,4.41,160.95
6,Thailand,2.66,146.62,0.96,2.08,152.32
7,Sri Lanka,1.93,140.41,0.0,7.06,149.4
8,Sierra Leone,9.86,132.19,0.0,1.66,143.71
9,Guinea,7.33,124.28,0.0,3.54,135.15


Below are some observations that can be noted from the table above:

- For plant-based food product-related emission, the ten countries that recorded the highest figures are mostly in Southeast Asia, the predominant rice-producing belt in the world. The exceptions are the South Asian countries Sri Lanka and Bangladesh, the latter being the top consumer of rice and its fourth-largest producer (Mottaleb, Rahut, Kruseman & Erenstein, 2017), and the African states Sierra Leone and Guinea. The highest emission is from Bangladesh at 224.63 kg CO<sub>2</sub>/person/year.
- In these countries, emission due to rice far outweighs the others: rice accounts for roughly 92% to 98% of each country's total plant-based food product-related emission (close to 98% in Bangladesh and Cambodia).
- The Philippines is the 6<sup>th</sup> highest emitter of plant-based food product-related carbon dioxide at 160.95 kg CO<sub>2</sub>/person/year.
- Similar to the previously observed trends, the ten countries with the lowest plant-based food product-related emission are mostly in Africa, with the exception of Paraguay, Guatemala, and Mexico in South America, Central America, and North America, respectively. Ethiopia has the lowest plant-based food product-related emission at 11.18 kg CO<sub>2</sub>/person/year, followed by Uganda (which also happens to be the lowest consumer of plant-based food products) at 14.54 kg CO<sub>2</sub>/person/year.
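
The share of rice in the plant-based emission of the ten highest emitters can be recomputed from the sorted table. The snippet below is a minimal sketch that hard-codes the ten displayed rows instead of reading the notebook's DataFrame; the names `top_ten` and `rice_share` are illustrative and do not appear elsewhere in the notebook.

```python
# (country, rice emission, total plant-based emission) in kg CO2/person/year,
# copied by hand from the ten rows of the sorted table above
top_ten = [
    ("Bangladesh", 219.76, 224.63),
    ("Cambodia", 203.60, 208.28),
    ("Vietnam", 184.99, 200.70),
    ("Indonesia", 172.27, 186.33),
    ("Myanmar", 169.94, 179.27),
    ("Philippines", 152.85, 160.95),
    ("Thailand", 146.62, 152.32),
    ("Sri Lanka", 140.41, 149.40),
    ("Sierra Leone", 132.19, 143.71),
    ("Guinea", 124.28, 135.15),
]


def rice_share(rice, total):
    """Return rice's percentage share of the total plant-based emission."""
    return 100 * rice / total


# Per-country share of rice
for country, rice, total in top_ten:
    print(f"{country:<13}{rice_share(rice, total):6.1f}%")

# Share of rice when the ten countries are pooled together
overall = rice_share(
    sum(rice for _, rice, _ in top_ten),
    sum(total for _, _, total in top_ten),
)
print(f"{'Top ten':<13}{overall:6.1f}%")
```

Computing the share per country, rather than only for the pooled total, makes it easier to see how much the figure varies from one country to the next.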

### *Bar Graphs*

In order to give a more visual presentation of the data analyzed above, we construct stacked bar graphs to show the animal- and plant-based food product-related emission of the respective top 20 countries.

In [20]:
sorted_animal_co2_data = sorted_animal_co2_data.head(20)
sorted_plant_co2_data = sorted_plant_co2_data.head(20)

*Note that this graph showing the animal-based food product-related emission of the top 20 countries is interactive. Hovering on the sectors or clicking on the legend allows for a more granular look at the data.*

In [21]:
fig = go.Figure(data = [
    go.Bar(orientation = 'h',
          y = sorted_animal_co2_data['country'],
          x = sorted_animal_co2_data['Beef'],
          name = 'Beef'),
    go.Bar(orientation = 'h',
          y = sorted_animal_co2_data['country'],
          x = sorted_animal_co2_data['Eggs'],
          name = 'Eggs'),
    go.Bar(orientation = 'h',
          y = sorted_animal_co2_data['country'],
          x = sorted_animal_co2_data['Fish'],
          name = 'Fish'),
    go.Bar(orientation = 'h',
          y = sorted_animal_co2_data['country'],
          x = sorted_animal_co2_data['Lamb & Goat'],
          name = 'Lamb & Goat'),
    go.Bar(orientation = 'h',
          y = sorted_animal_co2_data['country'],
          x = sorted_animal_co2_data['Milk & Cheese'],
          name = 'Milk & Cheese'),
    go.Bar(orientation = 'h',
          y = sorted_animal_co2_data['country'],
          x = sorted_animal_co2_data['Pork'],
          name = 'Pork'),
    go.Bar(orientation = 'h',
          y = sorted_animal_co2_data['country'],
          x = sorted_animal_co2_data['Poultry'],
          name = 'Poultry'),
])

fig.update_layout(barmode = 'stack',
                 yaxis = dict(autorange = 'reversed'),
                 title = "Animal-Based Food Product-Related CO<sub>2</sub> Emission of Top 20 Emitters",
                 xaxis_title = "Emission (kg CO<sub>2</sub>/person/year)",
                 yaxis_title = "Country")

fig.show()

This graph corroborates our earlier observation that, for these states, beef accounts for a significant majority of the food product-related emission; it is followed by milk and cheese derivatives. Interestingly, beef is not among the most consumed products in these countries, which points towards the strong carbon footprint left by the production of this food product.

*Note that this graph showing the plant-based food product-related emission of the top 20 countries is interactive. Hovering on the sectors or clicking on the legend allows for a more granular look at the data.*

In [22]:
fig = go.Figure(data = [
    go.Bar(orientation = 'h',
          y = sorted_plant_co2_data['country'],
          x = sorted_plant_co2_data['Nuts & Peanut Butter'],
          name = 'Nuts & Peanut Butter'),
    go.Bar(orientation = 'h',
          y = sorted_plant_co2_data['country'],
          x = sorted_plant_co2_data['Rice'],
          name = 'Rice'),
    go.Bar(orientation = 'h',
          y = sorted_plant_co2_data['country'],
          x = sorted_plant_co2_data['Soybeans'],
          name = 'Soybeans'),
    go.Bar(orientation = 'h',
          y = sorted_plant_co2_data['country'],
          x = sorted_plant_co2_data['Wheat & Wheat Products'],
          name = 'Wheat & Wheat Products',
          marker_color = 'sandybrown')
])

fig.update_layout(barmode = 'stack',
                 yaxis = dict(autorange = 'reversed'),
                 title = "Plant-Based Food Product-Related CO<sub>2</sub> Emission of Top 20 Emitters",
                 xaxis_title = "Emission (kg CO<sub>2</sub>/person/year)",
                 yaxis_title = "Country")
fig.show()

This graph corroborates our earlier observation that, for these states (which are mostly located in Asia), emission due to rice far outweighs that of the other plant-based products; among the ten highest emitters, rice accounts for roughly 92% to 98% of the total plant-based food product-related emission.

<hr/>

## B. By Food Product

We now proceed to an identification of the trends related to both consumption and carbon dioxide emission based on the food product.

In [23]:
food_data = data.groupby(['food_category'], as_index = False).sum()

print("-- Consumption and Carbon Dioxide Emission By Food Product --")
food_data

-- Consumption and Carbon Dioxide Emission By Food Product --


Unnamed: 0,food_category,consumption,co2_emmission
0,Beef,1576.04,48633.26
1,Eggs,1061.29,974.95
2,Fish,2247.32,3588.22
3,Lamb & Goat,338.02,11837.38
4,Milk & Cheese,16350.71,23290.00
...,...,...,...
6,Pork,2096.08,7419.11
7,Poultry,2758.50,2963.16
8,Rice,3818.77,4886.91
9,Soybeans,111.87,50.35


### 1. Consumption

We arrange the food categories based on the total consumption. As a preliminary visualization, we employ blue gradients to correspond to the amount of food products consumed.

In [24]:
sorted_consumption_food_data = food_data.sort_values(by = 'consumption', ascending = False).reset_index(drop = True)
sorted_consumption_food_data.drop(columns = ['co2_emmission'], inplace = True)

sorted_consumption_food_data.style.background_gradient(cmap = 'Blues')

Unnamed: 0,food_category,consumption
0,Milk & Cheese,16350.71
1,Wheat & Wheat Products,9301.44
2,Rice,3818.77
3,Poultry,2758.5
4,Fish,2247.32
5,Pork,2096.08
6,Beef,1576.04
7,Eggs,1061.29
8,Nuts & Peanut Butter,537.84
9,Lamb & Goat,338.02


Below are some observations that can be noted from the table above:
- The top three food products consumed are milk and cheese (40.67%), wheat and wheat products (23.13%), and rice (9.50%). Taken collectively, these three categories alone already total to close to three-fourths of the gross food consumption worldwide.
- The three least consumed food products are soybeans, lamb and goat, and nuts and peanut butter. Taken collectively, they account for only 2.46% of the overall food consumption worldwide.
- The consumption of milk and cheese is around three-fourths more than that of wheat and wheat products (which are second on the list). In turn, the consumption of wheat and wheat products is roughly two and a half times that of rice (which is third on the list).
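
The percentages quoted above can be recomputed from the category totals. The following is a minimal sketch that copies the eleven consumption totals by hand from the tables in this section; the helper `percentage_shares` is a hypothetical name introduced only for this illustration.

```python
# Total consumption per food category (kg/person/year, summed over countries),
# copied by hand from the tables above
consumption = {
    "Milk & Cheese": 16350.71,
    "Wheat & Wheat Products": 9301.44,
    "Rice": 3818.77,
    "Poultry": 2758.50,
    "Fish": 2247.32,
    "Pork": 2096.08,
    "Beef": 1576.04,
    "Eggs": 1061.29,
    "Nuts & Peanut Butter": 537.84,
    "Lamb & Goat": 338.02,
    "Soybeans": 111.87,
}


def percentage_shares(totals):
    """Map each category to its percentage of the grand total."""
    grand_total = sum(totals.values())
    return {name: 100 * value / grand_total for name, value in totals.items()}


shares = percentage_shares(consumption)

# Rank the categories from most to least consumed
ranked = sorted(shares, key=shares.get, reverse=True)

top_three_share = sum(shares[name] for name in ranked[:3])
bottom_three_share = sum(shares[name] for name in ranked[-3:])

print("Top three:", ranked[:3], f"{top_three_share:.2f}%")
print("Bottom three:", ranked[-3:], f"{bottom_three_share:.2f}%")
```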

### 2. Carbon Dioxide Emission

We arrange the food categories based on the total carbon dioxide emission. As a preliminary visualization, we employ red gradients to correspond to the amount of CO<sub>2</sub> emitted.

In [25]:
sorted_co2_food_data = food_data.sort_values(by = 'co2_emmission', ascending = False).reset_index(drop = True)
sorted_co2_food_data.drop(columns = ['consumption'], inplace = True)

sorted_co2_food_data.style.background_gradient(cmap = 'Reds')

Unnamed: 0,food_category,co2_emmission
0,Beef,48633.26
1,Milk & Cheese,23290.0
2,Lamb & Goat,11837.38
3,Pork,7419.11
4,Rice,4886.91
5,Fish,3588.22
6,Poultry,2963.16
7,Wheat & Wheat Products,1773.78
8,Eggs,974.95
9,Nuts & Peanut Butter,951.99


Below are some observations that can be noted from the table above:
- The top three sources of CO<sub>2</sub> emissions are beef (45.72%), milk and cheese (21.90%), and lamb and goat products (11.13%). Taken collectively, these three categories alone already total to close to four-fifths of the gross food-related emission worldwide.
- The three sources that are associated with the lowest CO<sub>2</sub> emissions are soybeans, nuts and peanut butter, and eggs. Taken collectively, they account for only around 1.86% of the overall food-related emission worldwide.
- Soybeans and nuts and peanut butter are also among the three least consumed products worldwide.
- The emission due to beef products is over twice the emission due to milk and cheese products (which are second on the list). In turn, the emission due to milk and cheese products is nearly twice the emission due to lamb and goat (which are third on the list).
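
The comparisons between consecutively ranked categories can be checked with a short calculation. The sketch below copies the eleven emission totals by hand from the tables in this section and divides each total by the next one in the ranking; `consecutive_ratios` is a hypothetical helper, not part of the original notebook.

```python
# Total emission per food category (kg CO2/person/year, summed over countries),
# copied by hand from the tables above
emission = {
    "Beef": 48633.26,
    "Milk & Cheese": 23290.00,
    "Lamb & Goat": 11837.38,
    "Pork": 7419.11,
    "Rice": 4886.91,
    "Fish": 3588.22,
    "Poultry": 2963.16,
    "Wheat & Wheat Products": 1773.78,
    "Eggs": 974.95,
    "Nuts & Peanut Butter": 951.99,
    "Soybeans": 50.35,
}


def consecutive_ratios(totals):
    """Pair each category with the next-ranked one and return their ratio."""
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [
        (higher, lower, higher_value / lower_value)
        for (higher, higher_value), (lower, lower_value) in zip(ranked, ranked[1:])
    ]


ratios = consecutive_ratios(emission)

# Show how the first-ranked compares with the second, and the second with the third
for higher, lower, ratio in ratios[:2]:
    print(f"{higher} vs. {lower}: {ratio:.2f}x")
```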

### *Overall Bar Graph*

In order to give a more visual presentation of the data analyzed above, we construct a bar graph to show the total consumption vis-a-vis emission for each of the 11 food categories. 

In [26]:
sorted_food_data = food_data.sort_values(by = 'co2_emmission', ascending = False).reset_index(drop = True)

In [27]:
fig = go.Figure(data = [
    go.Bar(x = sorted_food_data['food_category'],
          y = sorted_food_data['consumption'],
          name = 'Consumption'),
    
     go.Bar(x = sorted_food_data['food_category'],
          y = sorted_food_data['co2_emmission'],
          name = 'Carbon Dioxide Emission')
])

fig.update_layout(title = "Consumption and CO<sub>2</sub> Emission by Food Product",
                 xaxis_title = "Food Product",
                 yaxis_title = "Consumption (kg/person/year) and Emission (kg CO<sub>2</sub>/person/year)")

fig.show()

The graph above evinces the intense carbon dioxide emission that is associated with beef products. Although beef consumption is low (in fact, it is in the bottom half of the consumption hierarchy), its emission is massive, accounting for 45.72% of the total emission.

On the other hand, although wheat and wheat products are widely consumed (they are the second most-consumed food product), their emission is remarkably low, placing 8<sup>th</sup> in the emission hierarchy.
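
One way to make this contrast concrete is to divide the total emission of each category by its total consumption, which gives a rough emission per kilogram consumed. This ratio is not computed in the original notebook; the sketch below is only an illustration that reuses the totals from the tables above, and the name `emission_intensity` is an assumption introduced here.

```python
# category: (total consumption in kg/person/year,
#            total emission in kg CO2/person/year), copied from the tables above
totals = {
    "Beef": (1576.04, 48633.26),
    "Milk & Cheese": (16350.71, 23290.00),
    "Lamb & Goat": (338.02, 11837.38),
    "Pork": (2096.08, 7419.11),
    "Rice": (3818.77, 4886.91),
    "Fish": (2247.32, 3588.22),
    "Poultry": (2758.50, 2963.16),
    "Wheat & Wheat Products": (9301.44, 1773.78),
    "Eggs": (1061.29, 974.95),
    "Nuts & Peanut Butter": (537.84, 951.99),
    "Soybeans": (111.87, 50.35),
}

# Emission per unit of consumption (kg CO2 per kg consumed)
emission_intensity = {
    name: co2 / consumed for name, (consumed, co2) in totals.items()
}

# List the categories from the most to the least emission-intensive
for name in sorted(emission_intensity, key=emission_intensity.get, reverse=True):
    print(f"{name:<24}{emission_intensity[name]:8.2f}")
```

Because the ratio is taken over totals summed across countries, it should be read as an aggregate indicator rather than as a property of any single country.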

This observation sparks interest as to whether there is a relationship (or, perhaps, a correlation) between the nature of the food product &mdash; beef is animal-derived whereas wheat is plant-based &mdash; and carbon dioxide emission, thus connecting this section to the next part of the data analysis, which focuses on statistical correlations.

<hr/>

## C. Correlations

In the previous phases of the data analysis, our focus was on the independent treatment of the variables. Although certain patterns as to the interplay of these variables have been observed and hypothesized, they have not been subjected to any statistical verification. In this section, we are going to formally explore certain correlations of interest.

### 1. Between Consumption & Carbon Dioxide Emission

First, we check if there is any correlation between consumption and carbon dioxide emission. 

In [28]:
consumption_co2 = country_data[['country','TOTAL']]
consumption_co2 = consumption_co2.rename(columns = {"TOTAL": "Total Consumption"})

consumption_co2['Total CO2 Emission'] = co2_data['TOTAL']

print("-- Total Consumption and Carbon Dioxide Emission per Country --")
consumption_co2

-- Total Consumption and Carbon Dioxide Emission per Country --


food_category,country,Total Consumption,Total CO2 Emission
0,Albania,532.73,1777.85
1,Algeria,364.51,707.88
2,Angola,115.99,412.99
3,Argentina,429.41,2172.40
4,Armenia,407.11,1109.93
...,...,...,...
125,Uruguay,433.60,1634.91
126,Venezuela,277.39,1104.10
127,Vietnam,268.74,641.51
128,Zambia,57.11,225.30


The scatterplot generated below seems to be indicative of a degree of correlation between consumption and carbon dioxide emission. 

*Note that this is an interactive graph. Hovering allows for a more granular look at the data.*

In [29]:
fig = go.Figure(data = [
    go.Scatter(x = consumption_co2['Total Consumption'],
          y = consumption_co2['Total CO2 Emission'],
          mode = 'markers')
])

fig.update_layout(title = "Total Country-Level Consumption versus Total CO<sub>2</sub> Emission",
                 xaxis_title = "Total Consumption (kg/person/year)",
                 yaxis_title = "Total CO<sub>2</sub> Emission (kg CO<sub>2</sub>/person/year)")

fig.show()

In order to give a numerical score to the strength of this correlation, the **Pearson correlation coefficient** (also referred to as **Pearson's *r***) is employed.

In [30]:
x = np.array(consumption_co2['Total Consumption'])
y = np.array(consumption_co2['Total CO2 Emission'])

scipy.stats.pearsonr(x, y)

(0.8003280560404025, 3.200073281545265e-30)

Since the correlation coefficient (the first value in the tuple returned by the <code>scipy</code> function) is 0.80, there exists a **high positive correlation between consumption and carbon dioxide emission**.
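
For readers who want to see what the coefficient measures, Pearson's *r* is the covariance of the two variables divided by the product of their standard deviations. The sketch below implements that definition with the standard library only. It uses just the nine rows visible in the truncated table above, so its result is not expected to match the 0.80 obtained from the full dataset; `pearson_r` is a hypothetical helper, not the <code>scipy</code> implementation.

```python
import math


def pearson_r(xs, ys):
    """Compute Pearson's r from its definition (covariance over spread)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sums of products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")

    return cov / math.sqrt(var_x * var_y)


# Only the nine rows shown in the table above (not the full dataset)
consumption = [532.73, 364.51, 115.99, 429.41, 407.11, 433.60, 277.39, 268.74, 57.11]
emission = [1777.85, 707.88, 412.99, 2172.40, 1109.93, 1634.91, 1104.10, 641.51, 225.30]

r = pearson_r(consumption, emission)
print(f"Pearson's r on the nine displayed rows: {r:.2f}")
```

A value close to 1 indicates that the points lie near an upward-sloping line, a value close to -1 indicates a downward-sloping line, and a value near 0 indicates no linear relationship.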

### 2. Between Food Source & Carbon Dioxide Emission

Next, we check if there is any correlation between the predominant food source (either animal or plant) of a country and carbon dioxide emission. 

The creator of the dataset, nu3 (2018), suggested taking the difference between the total animal-based food product-related emission and the total plant-based food product-related emission:

<blockquote> A low value means that a larger proportion of the population feeds on plant products which have a better carbon emission footprint. A negative value means that the majority of the population consumes more non-animal products than animal products and the carbon emissions caused by these products are higher in the country concerned than the total emissions caused by animal products. </blockquote>

In [31]:
animal_plant = animal_co2_data[['country', 'TOTAL']]
animal_plant = animal_plant.rename(columns = {"TOTAL": "Animal-Based Food Product Emission"})

animal_plant['Plant-Based Food Product Emission'] = plant_co2_data['TOTAL']
animal_plant['Total'] = animal_plant['Animal-Based Food Product Emission'] + animal_plant['Plant-Based Food Product Emission']
animal_plant['Difference'] = animal_plant['Animal-Based Food Product Emission'] - animal_plant['Plant-Based Food Product Emission']

animal_plant

food_category,country,Animal-Based Food Product Emission,Plant-Based Food Product Emission,Total,Difference
0,Albania,1733.73,44.12,1777.85,1689.61
1,Algeria,665.04,42.84,707.88,622.20
2,Angola,390.60,22.39,412.99,368.21
3,Argentina,2140.65,31.75,2172.40,2108.90
4,Armenia,1076.44,33.49,1109.93,1042.95
...,...,...,...,...,...
125,Uruguay,1597.66,37.25,1634.91,1560.41
126,Venezuela,1064.17,39.93,1104.10,1024.24
127,Vietnam,440.81,200.70,641.51,240.11
128,Zambia,206.88,18.42,225.30,188.46


We arrange the entries based on this difference. As a preliminary visualization, we employ red gradients to correspond to the intensity of the difference.

In [32]:
sorted_animal_plant = animal_plant.sort_values(by = 'Difference', ascending = False).reset_index(drop = True)
sorted_animal_plant.style.background_gradient(cmap = 'Reds')

food_category,country,Animal-Based Food Product Emission,Plant-Based Food Product Emission,Total,Difference
0,Argentina,2140.65,31.75,2172.4,2108.9
1,Australia,1895.56,43.1,1938.66,1852.46
2,Albania,1733.73,44.12,1777.85,1689.61
3,Iceland,1705.55,25.81,1731.36,1679.74
4,New Zealand,1709.81,41.14,1750.95,1668.67
5,United States,1680.79,38.07,1718.86,1642.72
6,Uruguay,1597.66,37.25,1634.91,1560.41
7,Luxembourg,1571.69,26.72,1598.41,1544.97
8,Brazil,1562.68,54.05,1616.73,1508.63
9,Kazakhstan,1539.0,36.08,1575.08,1502.92


Scrolling through the table seems to evince a correlation between the total emission and the said difference metric. We confirm this graphically through the scatterplot generated below.

*Note that this is an interactive plot. Hovering allows for a more granular look at the data.*

In [33]:
fig = go.Figure(data = [
    go.Scatter(x = animal_plant['Difference'],
          y = animal_plant['Total'],
          mode = 'markers')
])

fig.update_layout(title = "Total Country-Level Emission versus Animal-to-Plant-Based Emission Difference",
                 xaxis_title = "Animal-to-Plant-Based Emission Difference (kg CO<sub>2</sub>/person/year)",
                 yaxis_title = "Total Country-Level Emission (kg CO<sub>2</sub>/person/year)")

fig.show()

Visual inspection is sufficient to notice the markedly linear correlation between the variables being observed. Nevertheless, we still employ the **Pearson correlation coefficient** (also known as **Pearson's *r***) to assign a numerical score to this correlation.

In [34]:
x = np.array(animal_plant['Difference'])
y = np.array(animal_plant['Total'])

scipy.stats.pearsonr(x, y)

(0.9840473105876909, 7.569400961764421e-98)

Since the correlation coefficient (the first value in the tuple returned by the <code>scipy</code> function) is 0.98, there exists a **very high positive correlation between the animal-to-plant-based emission difference of a country (a proxy for its predominant food source) and its total carbon dioxide emission**.

# References

- European Union. (2021). *Animal products*. https://ec.europa.eu/info/food-farming-fisheries/animals-and-animal-products/animal-products/
- Mottaleb, K., Rahut, D.N., Kruseman, G., & Erenstein, O. (2017). Wheat production and consumption dynamics in an Asian rice economy: The Bangladesh case. *European Journal of Development Research, 30*(1), 1-24. doi:10.1057/s41287-017-0096-1
- nu3. (2018). *Food carbon footprint index 2018*. https://www.nu3.de/blogs/nutrition/food-carbon-footprint-index-2018
- Quinton, A. (2019, June 27). *Cows and climate change*. University of California, Davis. https://www.ucdavis.edu/food/news/making-cattle-more-sustainable
- Stylianou, N., Guibourg, C., & Briggs, H. (2019, August 9). *Climate change calculator: What's your diet's carbon footprint?* https://www.bbc.com/news/science-environment-46459714
- World Health Organization. (n.d.). *Congo*. https://www.who.int/countries/cog/