# Food for thought: how to eat sustainably

Student names: Ardjano Mark 14713926, Daan Huisman 14650797, Ivo de Brouwer 11045841, Mauro Dieters 14533391

Team number: I2

In [21]:
# Load image from link
url = 'https://cdn.cbs.nl/images/3743526a303256676975626e61366c7a383931695a513d3d/900x450.jpg'

# Display image from URL with smaller size and subtitle
from IPython.display import Image, display

# Set the desired image width and height
width = 600
height = 300

# Set the subtitle text
subtitle = "© ANP / Robin Utrecht"

# Create an Image instance with the URL
image = Image(url=url, width=width, height=height)

# Display the image and subtitle
display(image)
print(subtitle)

© ANP / Robin Utrecht


## Introduction

Our world is changing. Humans have dominated this planet and its resources since the Agricultural Revolution about 12,000 years ago, when hunter-gatherers exchanged their nomadic lifestyles for permanent settlements and farming.
Since then, our population and our resource exploitation have increased precipitously. In recent years, the downsides of this development have become ever more apparent.
The reports from the United Nations’ Intergovernmental Panel on Climate Change are increasingly alarming. Their 2021 report found that human activity is changing the Earth’s climate in “unprecedented” ways, with some of the changes now inevitable and “irreversible”.[1] It would seem drastic change at both the societal and individual level is an immediate necessity.
A significant piece of the puzzle is food. Recent estimates of the contribution of food emissions to worldwide greenhouse gas emissions range from one-quarter to one-third.[2] Moreover, agriculture takes up two-thirds of our freshwater use and about half of the world’s habitable land.[3] It also causes severe acidification and eutrophication, which have caused the ongoing ‘nitrogen crisis’ in the Netherlands.
In this data story, we explore how what we consume affects our environment. We hope it helps you make informed decisions about the food you eat.


## Dataset and Preprocessing

The dataset we used can be downloaded from https://www.science.org/doi/10.1126/science.aaq0216. It is called 'aaq0216_datas1.xls'; the code in this notebook reads it from a local copy named 'dataset1.xls'. The dataset covers variables such as land use (in m²), greenhouse gas emissions (in kg CO₂-eq), eutrophication (in kg PO₄³⁻-eq) and freshwater use (in liters) for 43 different agricultural products. It is an xls file containing multiple Excel sheets.
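The loading calls in this notebook rely on three `read_excel` options: `skiprows`, `nrows` and `usecols`. As a rough illustration of what they do, the sketch below applies the analogous `read_csv` options to a small in-memory table. The two banner lines are invented for the example, the data rows are copied from the greenhouse gas table shown further down, and `df_demo` is a hypothetical name. Note that `read_csv` selects columns by name, whereas `read_excel` also accepts Excel letters such as `"A, F:L"`.

```python
import io

import pandas as pd

# Two banner lines above the header, mimicking sheets whose header is not on the first row.
# The banner text is made up for illustration.
raw = """Illustrative banner line 1
Illustrative banner line 2
Product,LUC,Feed,Farm,Transport
Wheat & Rye (Bread),0.1,0.0,0.847,0.129
Maize (Meal),0.315,0.0,0.475,0.06
Barley (Beer),0.009,0.0,0.176,0.035
"""

df_demo = pd.read_csv(
    io.StringIO(raw),
    skiprows=2,                               # skip the two banner lines
    nrows=2,                                  # keep only the first two data rows
    usecols=["Product", "LUC", "Transport"],  # keep a subset of the columns
)
print(df_demo)
```

Restricting the rows and columns at load time keeps footnotes and summary rows out of the dataframe, at the cost of hard-coding the sheet layout.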

To look at the global totals of greenhouse gas emissions for each product, for example, we can look at the following dataframe:

In [22]:
import plotly.graph_objs as go
import plotly.express as px
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
from plotly.subplots import make_subplots
from ipywidgets import interact, interactive, fixed, interact_manual
from ipywidgets import GridspecLayout
import ipywidgets as widgets

layout = go.Layout(
    font=dict(
        family='Lato, "Helvetica Neue", Helvetica, Arial, "Liberation Sans", sans-serif',
    ),
    title=dict(font=dict(size=24)),
    newshape_label_padding=8,
    margin_pad=5,
    legend=dict(font=dict(size=14)),
)

df_ghg = pd.read_excel('dataset1.xls', sheet_name=2, skiprows=2, nrows=43, usecols="A, F:L")
df_ghg['Total'] = df_ghg[['LUC', 'Feed', 'Farm', 'Processing', 'Transport', 'Packging', 'Retail']].sum(axis=1)
df_ghg.head()

| | Product | LUC | Feed | Farm | Processing | Transport | Packging | Retail | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Wheat & Rye (Bread) | 0.1 | 0.0 | 0.847 | 0.217 | 0.129 | 0.09 | 0.058 | 1.441 |
| 1 | Maize (Meal) | 0.315 | 0.0 | 0.475 | 0.052 | 0.06 | 0.06 | 0.026 | 0.988 |
| 2 | Barley (Beer) | 0.009 | 0.0 | 0.176 | 0.128 | 0.035 | 0.497 | 0.264 | 1.109 |
| 3 | Oatmeal | 0.001 | 0.0 | 1.37 | 0.042 | 0.067 | 0.066 | 0.029 | 1.575 |
| 4 | Rice | -0.022 | 0.0 | 3.553 | 0.065 | 0.096 | 0.084 | 0.063 | 3.839 |


Here, we can see that the greenhouse gas emissions are divided into different stages of the production and distribution of the product. We added another column to see the total amount of greenhouse gas emissions, as that is often the most important metric to compare.
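To make the stage breakdown a little more tangible, the following sketch rebuilds the five rows shown above and looks up, for each product, the stage with the largest emissions. It is only an illustration on these five rows, and `df_head` and `dominant_stage` are names introduced here.

```python
import pandas as pd

# The five rows shown in the table above ('Packging' is the column name used in the dataset).
stages = ["LUC", "Feed", "Farm", "Processing", "Transport", "Packging", "Retail"]
rows = {
    "Wheat & Rye (Bread)": [0.1, 0.0, 0.847, 0.217, 0.129, 0.09, 0.058],
    "Maize (Meal)": [0.315, 0.0, 0.475, 0.052, 0.06, 0.06, 0.026],
    "Barley (Beer)": [0.009, 0.0, 0.176, 0.128, 0.035, 0.497, 0.264],
    "Oatmeal": [0.001, 0.0, 1.37, 0.042, 0.067, 0.066, 0.029],
    "Rice": [-0.022, 0.0, 3.553, 0.065, 0.096, 0.084, 0.063],
}
df_head = pd.DataFrame.from_dict(rows, orient="index", columns=stages)

# Stage with the largest emissions for each product.
dominant_stage = df_head.idxmax(axis=1)
print(dominant_stage)
```

For four of these five products the farm stage dominates; for barley (beer) it is packaging.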

## The type of food we consume is the most important factor


### Some food groups are thirstier than others


In [23]:
df_land = pd.read_excel('dataset1.xls', sheet_name=1, skiprows=2,nrows=43)
colormap = {
    'Wheat & Rye (Bread)':'darkblue',
    'Maize (Meal)':'darkblue',
    'Barley (Beer)':'darkblue',
    'Oatmeal':'darkblue',
    'Rice':'darkblue',
    'Potatoes':'darkblue',
    'Cassava':'darkblue',
    'Cane Sugar':'magenta',
    'Beet Sugar':'magenta',
    'Other Pulses':'brown',
    'Peas':'brown',
    'Nuts':'brown',
    'Groundnuts':'brown',
    'Soymilk':'purple',
    'Tofu':'purple',
    'Soybean Oil':'black',
    'Palm Oil':'black',
    'Sunflower Oil':'black',
    'Rapeseed Oil':'black',
    'Olive Oil':'black',
    'Tomatoes':'green',
    'Onions & Leeks':'green',
    'Root Vegetables':'green',
    'Brassicas':'green',
    'Other Vegetables':'green',
    'Citrus Fruit':'orange',
    'Bananas':'orange',
    'Apples':'orange',
    'Berries & Grapes':'orange',
    'Wine':'orange',
    'Other Fruit':'orange',
    'Coffee':'gold',
    'Dark Chocolate':'gold',
    'Bovine Meat (beef herd)':'red',
    'Bovine Meat (dairy herd)':'red',
    'Lamb & Mutton':'red',
    'Pig Meat':'red',
    'Poultry Meat':'red',
    'Milk':'grey',
    'Cheese':'grey',
    'Eggs':'grey',
    'Fish (farmed)':'blue',
    'Crustaceans (farmed)':'blue'
}
fig = px.histogram(df_land, x='Product', y='Mean.5', color='Product', color_discrete_map=colormap)
fig.update_layout(
    font=dict(
        family='Lato, "Helvetica Neue", Helvetica, Arial, "Liberation Sans", sans-serif',
        size=13,
    ),
    yaxis=dict(title='Average Freshwater Withdrawals (L/NU)'),
    xaxis=dict(title='Products'),
    title='Freshwater Withdrawals (L/NU)',
    showlegend=False,
)
values = ['starchy', 'sugars', 'legume', 'vegan alt', 'oils', 'vegetables', 'fruits','proc nuts', 'meat', 'animal prod', 'fish']
colors = ['darkblue', 'magenta', 'brown', 'purple', 'black', 'green', 'orange','gold', 'red', 'grey', 'blue']


for i in range(len(values)):
    fig.add_annotation(
        x=1.05, y=0.85 - i*0.10,
        xref='paper', yref='paper',
        text=values[i],
        showarrow=True,
        arrowcolor='white',
        font=dict(color='white'),
        borderwidth=1,
        bgcolor= colors[i]
    )
fig.show()

> *Figure 1: Freshwater withdrawals per nutritional unit.*

Figure 1 shows how many liters of freshwater are withdrawn per nutritional unit for each product. Nuts and cheese in particular use a lot of water relative to the nutrition they provide. Most meat substitutes are made from wheat, soy and different kinds of fungi (https://www.milieucentraal.nl/eten-en-drinken/milieubewust-eten/vleesvervangers/). The figure shows a substantial difference between the water used for the different kinds of meat and the water used for wheat and soymilk, which points to a large gap in water use between meat and plant-based products.
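The `colormap` dictionary and the legend lists in the cell above are maintained separately, so the two can drift apart. One possible alternative, sketched here under the assumption that each product belongs to exactly one group, is to define the groups once and derive the product-to-colour mapping from them. Only three of the eleven groups are spelled out, and `groups` and `colormap_demo` are hypothetical names.

```python
# Food groups with their colour and products, following the legend of Figure 1.
# Only three groups are listed to keep the sketch short.
groups = {
    "sugars": ("magenta", ["Cane Sugar", "Beet Sugar"]),
    "vegan alt": ("purple", ["Soymilk", "Tofu"]),
    "fish": ("blue", ["Fish (farmed)", "Crustaceans (farmed)"]),
}

# Invert the group definitions into a product -> colour mapping.
colormap_demo = {
    product: colour
    for colour, products in groups.values()
    for product in products
}

# The legend entries can be read from the same structure.
legend_labels = list(groups)
legend_colours = [colour for colour, _ in groups.values()]

print(colormap_demo["Tofu"])
```

With this layout, moving a product to another group is a single edit, and the legend cannot disagree with the bar colours.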

### Vegan is better

* Overall, meat is the biggest contributor to a large ecological footprint
* Lamb & mutton is the biggest contributor overall
* Non-vegan products have an ecological footprint that is several times larger than that of vegan products

In [24]:
#Mauro
non_vegan_products = ['Barley (Beer)','Cane Sugar','Milk','Cheese','Eggs','Fish (farmed)','Crustaceans (farmed)','Bovine Meat (beef herd)','Bovine Meat (dairy herd)','Lamb & Mutton','Pig Meat','Poultry Meat']
meat_products = ['Bovine Meat (beef herd)', 'Bovine Meat (dairy herd)', 'Lamb & Mutton', 'Pig Meat','Poultry Meat','Fish (farmed)','Crustaceans (farmed)']
dairy_products = ['Milk','Cheese','Eggs']
other_non_vegan_products = ['Cane Sugar','Barley (Beer)']  # crustaceans are already counted under meat_products

df_all_products = pd.read_excel("dataset1.xls", sheet_name="Results - Retail Weight", skiprows=2,nrows=43, index_col=None, na_values=["NA"])

#non vegan products
non_vegan = df_all_products["Product"].isin(non_vegan_products)
meat = df_all_products["Product"].isin(meat_products)
dairy = df_all_products["Product"].isin(dairy_products)
other_non_vegan = df_all_products["Product"].isin(other_non_vegan_products)

df_non_vegan_products = df_all_products[non_vegan]
df_meat_products = df_all_products[meat]
df_dairy_products = df_all_products[dairy]
df_other_non_vegan_products = df_all_products[other_non_vegan]

#non vegan
df_non_vegan_products_land_use = df_non_vegan_products.iloc[:, [0, 3]]
df_non_vegan_products_ghg_2013 = df_non_vegan_products.iloc[:, [0, 9]]
df_non_vegan_products_ghg_2007 = df_non_vegan_products.iloc[:, [0, 15]]
df_non_vegan_products_acid = df_non_vegan_products.iloc[:, [0, 21]]
df_non_vegan_products_eutro = df_non_vegan_products.iloc[:, [0, 27]]
df_non_vegan_products_fresh_water_withdraw = df_non_vegan_products.iloc[:, [0, 33]]
df_non_vegan_products_stress_water_use = df_non_vegan_products.iloc[:, [0, 39]]

#meat products
df_meat_products_land_use = df_meat_products.iloc[:, [0, 3]]
df_meat_products_ghg_2013 = df_meat_products.iloc[:, [0, 9]]
df_meat_products_ghg_2007 = df_meat_products.iloc[:, [0, 15]]
df_meat_products_acid = df_meat_products.iloc[:, [0, 21]]
df_meat_products_eutro = df_meat_products.iloc[:, [0, 27]]
df_meat_products_fresh_water_withdraw = df_meat_products.iloc[:, [0, 33]]
df_meat_products_stress_water_use = df_meat_products.iloc[:, [0, 39]]
#dairy
df_dairy_products_land_use = df_dairy_products.iloc[:, [0, 3]]
df_dairy_products_ghg_2013 = df_dairy_products.iloc[:, [0, 9]]
df_dairy_products_ghg_2007 = df_dairy_products.iloc[:, [0, 15]]
df_dairy_products_acid = df_dairy_products.iloc[:, [0, 21]]
df_dairy_products_eutro = df_dairy_products.iloc[:, [0, 27]]
df_dairy_products_fresh_water_withdraw = df_dairy_products.iloc[:, [0, 33]]
df_dairy_products_stress_water_use = df_dairy_products.iloc[:, [0, 39]]

#other non vegan
df_other_non_vegan_products_land_use = df_other_non_vegan_products.iloc[:, [0, 3]]
df_other_non_vegan_products_ghg_2013 = df_other_non_vegan_products.iloc[:, [0, 9]]
df_other_non_vegan_products_ghg_2007 = df_other_non_vegan_products.iloc[:, [0, 15]]
df_other_non_vegan_products_acid = df_other_non_vegan_products.iloc[:, [0, 21]]
df_other_non_vegan_products_eutro = df_other_non_vegan_products.iloc[:, [0, 27]]
df_other_non_vegan_products_fresh_water_withdraw = df_other_non_vegan_products.iloc[:, [0, 33]]
df_other_non_vegan_products_stress_water_use = df_other_non_vegan_products.iloc[:, [0, 39]]


#vegan products
grains_products = ['Wheat & Rye (Bread)', 'Maize (Meal)', 'Oatmeal', 'Rice']
vegetables_products = ['Potatoes', 'Cassava', 'Tomatoes', 'Onions & Leeks', 'Root Vegetables', 'Brassicas', 'Other Vegetables']
fruits_products = ['Citrus Fruit', 'Bananas', 'Apples', 'Berries & Grapes', 'Other Fruit']
legumes_products = ['Other Pulses', 'Peas', 'Nuts', 'Groundnuts']
plant_based_alternatives_products = ['Soymilk', 'Tofu']
oils_products = ['Soybean Oil', 'Palm Oil', 'Sunflower Oil', 'Rapeseed Oil', 'Olive Oil']
others_products = ['Coffee', 'Dark Chocolate']

vegan = ~df_all_products['Product'].isin(non_vegan_products)
grains = df_all_products["Product"].isin(grains_products)
vegetables = df_all_products["Product"].isin(vegetables_products)
fruits = df_all_products["Product"].isin(fruits_products)
legumes = df_all_products["Product"].isin(legumes_products)
plant_based_alternatives = df_all_products["Product"].isin(plant_based_alternatives_products)
oils = df_all_products["Product"].isin(oils_products)
others = df_all_products["Product"].isin(others_products)


df_vegan_products = df_all_products[vegan]
df_grains_products = df_all_products[grains]
df_vegetables_products = df_all_products[vegetables]
df_fruits_products = df_all_products[fruits]
df_legumes_products = df_all_products[legumes]
df_plant_based_alternatives_products = df_all_products[plant_based_alternatives]
df_oils_products = df_all_products[oils]
df_others_products = df_all_products[others]

#vegan
df_vegan_products_land_use = df_vegan_products.iloc[:, [0, 3]]
df_vegan_products_ghg_2013 = df_vegan_products.iloc[:, [0, 9]]
df_vegan_products_ghg_2007 = df_vegan_products.iloc[:, [0, 15]]
df_vegan_products_acid = df_vegan_products.iloc[:, [0, 21]]
df_vegan_products_eutro = df_vegan_products.iloc[:, [0, 27]]
df_vegan_products_fresh_water_withdraw = df_vegan_products.iloc[:, [0, 33]]
df_vegan_products_stress_water_use = df_vegan_products.iloc[:, [0, 39]]

# Define the data for the bar plots
categories = ['Land use', 'GHG emission 2013', 'GHG emission 2007', 'Acidification', 'Eutrophication', 'Fresh Water Withdrawal', 'Stress Water Use']
vegan_values = [df_vegan_products_land_use["Mean"].sum(),
                df_vegan_products_ghg_2013["Mean.1"].sum(),
                df_vegan_products_ghg_2007["Mean.2"].sum(),
                df_vegan_products_acid["Mean.3"].sum(),
                df_vegan_products_eutro["Mean.4"].sum(),
                df_vegan_products_fresh_water_withdraw["Mean.5"].sum(),
                df_vegan_products_stress_water_use["Mean.6"].sum()]
non_vegan_values = [df_non_vegan_products_land_use["Mean"].sum(),
                    df_non_vegan_products_ghg_2013["Mean.1"].sum(),
                    df_non_vegan_products_ghg_2007["Mean.2"].sum(),
                    df_non_vegan_products_acid["Mean.3"].sum(),
                    df_non_vegan_products_eutro["Mean.4"].sum(),
                    df_non_vegan_products_fresh_water_withdraw["Mean.5"].sum(),
                    df_non_vegan_products_stress_water_use["Mean.6"].sum()]
yaxvalues = ['(m²/FU)', '(kg CO₂eq/FU)', '(kg CO₂eq/FU)', '(g SO₂eq/FU)', '(g PO₄³⁻eq/FU)', '(L/FU)', '(L/FU)']

# function to create sunbursts    
def create_sunburst_figure(title, values):
    colors = ['darkgreen', 'darkred'] * 2  
    fig = go.Figure(go.Sunburst(
        labels=labels,
        parents=parents,
        values=values,
        marker=dict(colors=colors),
    ))
    
    fig.update_layout(title=title)
    return fig

labels = ["", "Non-vegan", "Vegan"] + list(df_non_vegan_products["Product"]) + list(df_vegan_products["Product"])
parents = ["", "", ""] + ["Non-vegan"] * len(df_non_vegan_products) + ["Vegan"] * len(df_vegan_products)

# Creating the sunbursts
# Land use
values = [0] * 3 + list(df_non_vegan_products_land_use["Mean"]) + list(df_vegan_products_land_use["Mean"])
data1 = create_sunburst_figure("Land use", values)
# GHG emission 2013
values = [0] * 3 + list(df_non_vegan_products_ghg_2013["Mean.1"]) + list(df_vegan_products_ghg_2013["Mean.1"])
data2 = create_sunburst_figure("GHG emission 2013", values)
# GHG emission 2007
values = [0] * 3 + list(df_non_vegan_products_ghg_2007["Mean.2"]) + list(df_vegan_products_ghg_2007["Mean.2"])
data3 = create_sunburst_figure("GHG emission 2007", values)
# Acidification
values = [0] * 3 + list(df_non_vegan_products_acid["Mean.3"]) + list(df_vegan_products_acid["Mean.3"])
data4 = create_sunburst_figure("Acidification", values)
# Eutrophication
values = [0] * 3 + list(df_non_vegan_products_eutro["Mean.4"]) + list(df_vegan_products_eutro["Mean.4"])
data5 = create_sunburst_figure("Eutrophication", values)
# Fresh Water Withdrawal
values = [0] * 3 + list(df_non_vegan_products_fresh_water_withdraw["Mean.5"]) + list(df_vegan_products_fresh_water_withdraw["Mean.5"])
data6 = create_sunburst_figure("Fresh Water Withdrawal", values)
# Stress Water Use
values = [0] * 3 + list(df_non_vegan_products_stress_water_use["Mean.6"]) + list(df_vegan_products_stress_water_use["Mean.6"])
data7 = create_sunburst_figure("Stress Water Use", values)

# Creating the figure: sunbursts in the top row, bar plots in the bottom row
specs_list = [
    [{"type": "sunburst"} for _ in categories],
    [{"type": "bar"} for _ in categories],
]

fig = make_subplots(rows=2, cols=len(categories), specs=specs_list, vertical_spacing=0.2)

# Creating and adding the bar plots to the figure
for category, vegan_value, non_vegan_value in zip(categories, vegan_values, non_vegan_values):
    fig.add_trace(go.Bar(
        x=['Vegan', 'Non-vegan'],
        y=[vegan_value, non_vegan_value],
        name=category,
        marker_color=['darkgreen', 'darkred']
    ), row=2, col=categories.index(category) + 1)

    fig.update_xaxes(title_text=category, row=2, col=categories.index(category) + 1)
    fig.update_yaxes(title_text=yaxvalues[categories.index(category)], row=2, col=categories.index(category) + 1)

# Adding sunbursts to the figure
fig.add_trace(data1.data[0], row=1, col=1)
fig.add_trace(data2.data[0], row=1, col=2)
fig.add_trace(data3.data[0], row=1, col=3)
fig.add_trace(data4.data[0], row=1, col=4)
fig.add_trace(data5.data[0], row=1, col=5)
fig.add_trace(data6.data[0], row=1, col=6)
fig.add_trace(data7.data[0], row=1, col=7)

fig.update_layout(width = 2250, height=800, showlegend=False)
fig.show()

> *Figure 2: Sunburst charts (top) and bar plots (bottom) comparing vegan and non-vegan products on seven environmental indicators.*

These figures show the significant difference in land use and greenhouse gas emissions between vegan and non-vegan products. Land use is expressed in square meters per functional unit (FU) and GHG emissions in kg CO2 equivalent per functional unit.

The figures are meant to support the perspective that the type of food matters more than the sourcing: we need to evaluate what it is we eat, not necessarily where it’s from.
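One detail worth keeping in mind when reading the bar plots is that they add up the per-product means within each class, and the two classes differ in size (12 non-vegan products against 31 vegan ones in the lists above). A per-product average can be a useful complement. The sketch below shows the calculation on made-up numbers; the products, values and the name `summary` are purely illustrative and are not taken from the dataset.

```python
import pandas as pd

# Made-up impact values, purely to illustrate the calculation.
df_demo = pd.DataFrame({
    "Product": ["A", "B", "C", "D", "E"],
    "Class": ["Vegan", "Vegan", "Vegan", "Non-vegan", "Non-vegan"],
    "Impact": [1.0, 2.0, 3.0, 10.0, 20.0],
})

# Sum (what the bar plots show), mean and group size side by side.
summary = df_demo.groupby("Class")["Impact"].agg(["sum", "mean", "count"])
print(summary)
```

Reporting the mean next to the sum makes the comparison independent of how many products happen to fall in each class.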


In [25]:
df_land = pd.read_excel('dataset1.xls', sheet_name=2, skiprows=2, nrows=43, usecols="A, C:E")
df_land['Total'] = df_land[['Arable', 'Fallow', 'Perm Past']].sum(axis=1)

df_ghg = pd.read_excel('dataset1.xls', sheet_name=2, skiprows=2, nrows=43, usecols="A, F:L")
df_ghg['Total'] = df_ghg[['LUC', 'Feed', 'Farm', 'Processing', 'Transport', 'Packging', 'Retail']].sum(axis=1)

df_eutr = pd.read_excel('dataset1.xls', sheet_name=2, skiprows=2, nrows=43, usecols="A, N")
df_eutr.rename(columns={"Total.1": "Total"}, inplace=True)

df_fresh = pd.read_excel('dataset1.xls', sheet_name=2, skiprows=2, nrows=43, usecols="A, O")
df_fresh.rename(columns={"Total.2": "Total"}, inplace=True)

df_stress = pd.read_excel('dataset1.xls', sheet_name=2, skiprows=2, nrows=43, usecols="A, P")
df_stress.rename(columns={"Total.3": "Total"}, inplace=True)

crop_color = '#35A85A'
meat_color = '#ff1a0d'
dairy_color = '#fd9a04'
seafood_color = '#1092d1'


# Combine the dataframes into a single dataframe
df_combined = pd.concat([df_land['Total'], df_ghg['Total'], df_eutr['Total'], df_fresh['Total'], df_stress['Total']], axis=1)
df_combined.columns = ['Land Use', 'GHG Emissions', 'Eutrophication', 'Freshwater Use', 'Stress-weighted Water Use']

# Bin every indicator into terciles labelled low, medium and high
df_combined[df_combined.columns] = df_combined[df_combined.columns].apply(lambda x: pd.qcut(x, q=3, labels=['low', 'medium', 'high']))

# assigns categories; the codes follow the order of the ticktext and colorscale
# used for the plot (0 = crops, 1 = seafood, 2 = dairy, 3 = meat)
df_combined['Type'] = 0
df_combined.loc[33:37, 'Type'] = 3  # meat
df_combined.loc[38:40, 'Type'] = 2  # dairy
df_combined.loc[41:42, 'Type'] = 1  # seafood

df_combined['Class'] = 'Vegan'
df_combined.loc[33:42, 'Class'] = 'Non-vegan'

colorscale = [[0, crop_color], [0.33, seafood_color], [0.66, dairy_color], [1, meat_color]]
colors = df_combined['Type']

highlighted = ['Olive Oil', 'Bovine Meat (beef herd)', 'Cheese', 'Lamb & Mutton', 'Dark Chocolate', 'Fish (farmed)', 'Eggs', 'Poultry Meat', 'Milk', 'Potatoes', 'Rice', 'Soymilk', 'Nuts', 'Bananas', 'Apples']


# Create Parallel categories plot
dimensions = [
    go.parcats.Dimension(values=df_combined['Class'], label='Vegan', categoryorder='array', categoryarray=['Vegan', 'Non-vegan'], ticktext=['Yes', 'No']),
    go.parcats.Dimension(values=df_combined['Type'], label='Type', categoryorder='array', categoryarray=[0, 1, 2, 3], ticktext=['Crops', 'Seafood', 'Dairy', 'Meat']),
    go.parcats.Dimension(values=df_combined['Land Use'], label='Land Use', categoryorder='array', categoryarray=['low', 'medium', 'high']),
    go.parcats.Dimension(values=df_combined['Eutrophication'], label='Eutrophication', categoryorder='array', categoryarray=['low', 'medium', 'high']),
    go.parcats.Dimension(values=df_combined['Freshwater Use'], label='Freshwater Use', categoryorder='array', categoryarray=['low', 'medium', 'high']),
    go.parcats.Dimension(values=df_combined['Stress-weighted Water Use'], label='Stress-weighted', categoryorder='array', categoryarray=['low', 'medium', 'high']),
    go.parcats.Dimension(values=df_combined['GHG Emissions'], label='GHG Emissions', categoryorder='array', categoryarray=['low', 'medium', 'high'])
]

parcats_trace = go.Parcats(dimensions=dimensions, line={'color': colors, 'colorscale':colorscale})

fig = go.Figure(data=parcats_trace)
fig.update_layout(title='Analysis of Environmental Factors')
fig.show()


> *Figure 3: Parallel categories plot of vegan vs. non-vegan emissions.*
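The low, medium and high levels in Figure 3 come from `pd.qcut`, which splits each indicator into three equally populated groups. As a small illustration, the sketch below applies the same call to the total greenhouse gas emissions of the five products listed in the table near the top of this notebook; `totals` and `levels` are names introduced here.

```python
import pandas as pd

# Total GHG emissions of the five products shown in the table earlier.
totals = pd.Series(
    [1.441, 0.988, 1.109, 1.575, 3.839],
    index=["Wheat & Rye (Bread)", "Maize (Meal)", "Barley (Beer)", "Oatmeal", "Rice"],
)

# Split into three equally populated bins (terciles) and label them.
levels = pd.qcut(totals, q=3, labels=["low", "medium", "high"])
print(levels)
```

Because the bins are quantile-based, "high" means high relative to the other products in the dataset, not above a fixed threshold. With 43 products, each level holds roughly 14 of them.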


## We need to grow food locally


### Transport emissions are crippling


In [26]:
standard_discrete = px.colors.qualitative.T10

df_per_product = pd.read_excel('dataset1.xls', sheet_name=2, skiprows=2, nrows=70, usecols="A, F:L")
df_per_product = df_per_product.rename(columns={'LUC': 'Land Use Change', 'Packging': 'Packaging'})

df = df_per_product[0:43].copy()
df['Total'] = df.iloc[:, 1:].sum(axis=1)
# display(df)
df = df.sort_values(by='Total')

# display(df.Product)

# Reshape the DataFrame into a "long" format
df_melted = pd.melt(df, id_vars='Product', value_vars=['Land Use Change', 'Feed', 'Farm', 'Processing', 'Transport', 'Packaging', 'Retail'], var_name='Stage', value_name='Emissions')


df_cat = pd.read_excel('dataset1.xls', sheet_name=2, skiprows=2, nrows=43, usecols="A:B, F:L")
df_cat = df_cat.rename(columns={'LUC': 'Land Use Change', 'Packging': 'Packaging', "Food and Waste ('000 t, 2009-11 avg.)": 'Amount produced'})
df_cat['Total emissions'] = df_cat.iloc[:, 2:].sum(axis=1)
df_cat = pd.concat([df_cat, pd.read_excel('dataset1.xls', sheet_name=2, skiprows=2, nrows=43, usecols="M:P")], axis=1)
df_cat = df_cat.rename(columns={'Total': 'Acidification', 'Total.1': 'Eutrophication', 'Total.2': 'Freshwater', 'Total.3': 'Stress-weighted water usage'})

# assigns categories
df_cat['Type'] = 'Crops'
df_cat.loc[33:37, 'Type'] = 'Meat'
df_cat.loc[38:40, 'Type'] = 'Dairy'
df_cat.loc[41:42, 'Type'] = 'Seafood'


highlighted = ['Crustaceans (farmed)','Olive Oil', 'Bovine Meat (beef herd)', 'Cheese', 'Lamb & Mutton', 'Dark Chocolate', 'Fish (farmed)', 'Eggs', 'Poultry Meat', 'Milk', 'Potatoes', 'Rice', 'Soymilk', 'Nuts', 'Bananas', 'Apples']


# creates subcategory for labels
df_cat['labeled'] = df_cat['Product'].where(df_cat['Product'].isin(highlighted))

fig = px.scatter(
    df_cat, 
    x="Amount produced",
    y="Total emissions",
    log_x=True,
    height=500,
    text='labeled',
    # log_y=True,
    # size="Total impact",
    size_max=60,
    hover_data=['Product'],
    title="GHG impact and global production"
    )

df_filtered = df_cat[df_cat['Product'].isin(highlighted)]

annotation = [{
    'x': np.log10(df_cat.loc[df_cat['Product'] == name, 'Amount produced'].iloc[0]),
    'y': df_cat.loc[df_cat['Product'] == name, 'Total emissions'].iloc[0],
    'text': name,  # label text
    'showarrow': True,  # draw an arrow from the label to the point
} for name in highlighted]


updatemenus = [
    {
        "buttons": [
            {
                "label": col,
                "method": "update",
                "args": [
                    {"y": [df_cat[col]]},
                    {"yaxis": {"title": {"text": col}}}
                ],
            }
            for col in list(df_cat.loc[:, 'Total emissions': 'Stress-weighted water usage'].columns)
        ],
        "x": 0,
        "y": 1.2,
    }
]


fig.update_layout(updatemenus=updatemenus, title_x=0.5, yaxis_title='Emissions (kg CO2eq / kg)', xaxis_title='Global production (tonnes)')





fig.update_traces(marker={
        "color": pd.Categorical(df_cat["Type"]).codes,
        "colorscale": [(0,crop_color), (0.33,dairy_color), (0.66,meat_color), (1,seafood_color)]
    },
    marker_size=10,
    marker_opacity=df_cat['labeled'].notnull().map({True: 0.8, False: 0.35}).values,
    textposition="top center",
    hoverinfo="name+x+y",
    hovertemplate='Amount produced=%{x}<br>Total=%{y}<br>Product=%{customdata[0]}<extra></extra>',
    textfont={'size': 12}
)

# adds custom legend
types = ['Crops', 'Seafood', 'Dairy', 'Meat']
colors = [crop_color, seafood_color, dairy_color, meat_color]
for t, color in zip(types,colors):
    fig.add_trace(
        go.Scatter(
            x=[None], y=[None],
            mode='markers',
            marker=dict(
                size=10,
                color=color
            ),
            showlegend=True,
            name=t
        )
    )

fig.show()

> *Figure 4: Scatter plot of global production (in tonnes) and greenhouse gas emissions (in kg CO2eq / kg).*

This scatter plot suggests that products with a high global production tend to have low emissions per kg. The figure is meant to support the perspective that it matters a lot whether food is produced locally or globally: the emissions produced during transport are significant and a large factor in the total emissions of the food we consume.

In [27]:
correlations = df_cat.loc[:, 'Total emissions': 'Stress-weighted water usage']


corr = correlations.corr(method='pearson')


fig = px.imshow(corr, text_auto='.2f', aspect='auto', color_continuous_scale='viridis_r')
fig.update_xaxes(side="top")
fig.show()

> *Figure 5: Heat map of the Pearson correlations between the environmental impact indicators.*
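Each cell of the heat map is a Pearson correlation coefficient between two columns. As a minimal sketch of what `DataFrame.corr` computes, the example below uses three stage columns of the five products from the table near the top of this notebook as stand-ins for the indicators, and checks one coefficient against the textbook definition. The names `df_demo`, `corr_demo` and `r_manual` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Three stage columns for the five products shown in the table earlier.
df_demo = pd.DataFrame({
    "Farm": [0.847, 0.475, 0.176, 1.37, 3.553],
    "Transport": [0.129, 0.06, 0.035, 0.067, 0.096],
    "Packging": [0.09, 0.06, 0.497, 0.066, 0.084],
})

# Pairwise Pearson correlations, as used for the heat map.
corr_demo = df_demo.corr(method="pearson")

# The same coefficient computed directly from the definition for one pair.
x = df_demo["Farm"] - df_demo["Farm"].mean()
y = df_demo["Transport"] - df_demo["Transport"].mean()
r_manual = (x * y).sum() / np.sqrt((x ** 2).sum() * (y ** 2).sum())

print(corr_demo.round(2))
print(round(r_manual, 2))
```

The matrix is symmetric with ones on the diagonal, so only one triangle of the heat map carries information. With very few rows, as in this sketch, single outliers can dominate the coefficient.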


In [28]:
df_transport =  pd.read_excel('dataset1.xls', sheet_name=2, skiprows=2, nrows=43, usecols="A, J")

# assigns categories
df_transport['Type'] = 'Crops'
df_transport.loc[33:37, 'Type'] = 'Meat'
df_transport.loc[38:40, 'Type'] = 'Dairy'
df_transport.loc[41:42, 'Type'] = 'Seafood'

df_transport = df_transport.sort_values(by='Transport')
df_transport = df_transport.set_index('Product')

fig1 = px.bar(
    df_transport,
    y=df_transport.index,
    x='Transport',
    orientation='h',
    color='Type',
    color_discrete_map={
        'Crops': crop_color,
        'Meat': meat_color,
        'Dairy': dairy_color,
        'Seafood': seafood_color
    },
    height=800,
    title="Transport emissions per product",
)

fig1.update_layout( yaxis={'categoryorder':'array', 'categoryarray':df_transport.index})
fig1.show()

> *Figure 6: Bar chart of transport emissions*


## Global food production is the answer


### Transport emissions are just a sliver of the whole


In [32]:
# Look up the position of a trace by its name
# (assumed helper; the original definition is not shown in this notebook)
def get_traceindex(name, figure):
    return [trace.name for trace in figure.data].index(name)


# Create a horizontal stacked bar plot using plotly.express
fig = px.bar(df_melted,
             y='Product',
             x='Emissions',
             color='Stage',
             height=800,
             color_discrete_sequence=standard_discrete,
             title="Greenhouse gas emissions per food type over the supply chain")

# Each frame fades every stage except 'Transport' one step further
transport_index = get_traceindex('Transport', fig)
frames = [
    go.Frame(data=[
        go.Bar(marker=dict(opacity=1 if i == transport_index else opacity))
        for i in range(len(fig.data))
    ])
    for opacity in [1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
]

updatemenus = [dict(
    type='buttons',
    showactive=True,
    xanchor='right',
    yanchor='top',
    x=1,
    y=1.09,
    direction="left",
    buttons=[
        dict(
            label='Play',
            method='animate',
            args=[None, dict(frame=dict(duration=5), transition=dict(duration=150))]
        ),
        dict(
            label='Reset',
            method='restyle',
            args=[{'marker.opacity': 1}]
        )
    ]
)]

fig.update_layout(barmode='relative')

fig.update_layout(
    layout,
    legend=dict(
        orientation="h",
        yanchor="top",
        y=1.065,
        xanchor="left",
        x=-0.06),
    updatemenus=updatemenus
)

fig.frames = frames

fig.show()

> *Figure 7: Animated bar chart of transport emissions per product compared to other types of emissions.*

This figure shows what part of the greenhouse gas emissions of each food type is caused by transporting the product. It is meant to show that transport is a tiny portion of the total greenhouse gas emissions, and it supports the perspective that the type of food matters more than the sourcing: we need to evaluate what it is we eat, not necessarily where it’s from.
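The share of transport can also be worked out by hand. The sketch below does so for the five products listed in the table near the top of this notebook, using their Transport and Total values; it is an illustration on those five crop products only, and `emissions` and `transport_share` are names introduced here.

```python
# Transport and total emissions for the five products in the table earlier.
emissions = {
    "Wheat & Rye (Bread)": (0.129, 1.441),
    "Maize (Meal)": (0.06, 0.988),
    "Barley (Beer)": (0.035, 1.109),
    "Oatmeal": (0.067, 1.575),
    "Rice": (0.096, 3.839),
}

# Share of each product's total that is caused by transport, in percent.
transport_share = {
    product: 100 * transport / total
    for product, (transport, total) in emissions.items()
}

for product, share in transport_share.items():
    print(f"{product}: {share:.1f}%")
```

For each of these five products, transport accounts for less than a tenth of the total.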

A pie chart more concretely shows the relative (total) CO2 impact of transport:

In [31]:
df2 = df_per_product.iloc[55]
fig = px.pie(df2, values=df2.values[1:], names=df2.index[1:], hole=.3, height=600, title='Percentage of total GHG emissions by part of supply chain')
fig.update_traces(textposition='outside', textinfo='label+percent', marker=dict(colors=standard_discrete, line=dict(color='#000000', width=2)), showlegend=False)
fig.show()

> *Figure 8: Pie chart of the total greenhouse gas emissions divided into each part of the supply chain.*

## Reflection

Following the feedback session, we improved the legend and the readability of the first plot.
We made sure to use a consistent style and color scheme for all figures.
For each plot, we made sure it relates to an argument that supports one of the perspectives described in the introduction.
Furthermore, we limited any dropdown menus to at most 5 options, and limited the number of different variables in each individual plot.
Aside from this, we continued working on our data story, adding supporting text to each visualisation and improving some minor visual details discussed during the feedback session.

## Work Distribution

#### Ardjano Mark 14713926
Visualisations 4-8, came up with the perspectives, animation
#### Daan Huisman 14650797
Visualisation 1, plot styles, making sure the visualisations look good and consistent
#### Ivo de Brouwer 11045841
Visualisations 3-4, introduction/dataset and preprocessing/reflection sections, supporting text under figures
#### Mauro Dieters 14533391
Visualisation 2, helped with most other visualisations

---

Every team member helped to some extent with all visualisations as well as editing the supporting text.

## References

[1] https://www.theguardian.com/environment/2021/aug/10/code-red-for-humanity-what-the-papers-say-about-the-ipcc-report-on-the-climate-crisis

[2] https://ourworldindata.org/greenhouse-gas-emissions-food

[3] https://www.science.org/doi/10.1126/science.aaq0216

## Appendix

Generative AI (ChatGPT with GPT 3.5) was used to facilitate the creation of this document, as shown in the table below.

| Reasons of Usage | In which parts? | Which prompts were used? |
| ------------------------ | --------------------------------- | -------------------------------------------- |
| Brainstorm research questions and identify keywords for further search | The entire project framing | "Give keywords about the current debate in climate change with brief explanations" |
| Improve writing clarity and enhance readability | All sections | "Edit the following text to make it more clear. Do not alter the meaning." |
| Enhance readability | All sections | "Revise the paragraph to improve readability." |
| Ensure grammatical accuracy |  All sections | "Correct any grammatical errors in the text." |
| Provide alternative phrasing | Descriptions of the perspectives | "Suggest alternative phrases for better clarity." |
| Optimize sentence structure | All sections | "Restructure the sentence for better flow." |
| Condense lengthy sentences | All sections | "Simplify the following sentences without losing important information."|

> *Table 1: Usage of generative AI to facilitate the creation of this document.*