# Emissions of Different Food Products
This dataset was sourced from *Science* and *Our World in Data* (OWID) by **AMANDAROSEKNUDSEN** and downloaded from [**kaggle.com**](https://www.kaggle.com/datasets/amandaroseknudsen/foodproductemissions).

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import plotly.express as px

In [2]:
data = pd.read_csv('Food_Product_Emissions.csv')

In [3]:
pd.options.display.float_format = '{:,.2f}'.format

### Preliminary Data Exploration

In [4]:
shape = data.shape
print(f"number of rows: {shape[0]}")
print(f"number of columns: {shape[1]}")

number of rows: 43
number of columns: 11


In [5]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 43 entries, 0 to 42
Data columns (total 11 columns):
 #   Column                                     Non-Null Count  Dtype  
---  ------                                     --------------  -----  
 0   Food product                               43 non-null     object 
 1   Land Use Change                            43 non-null     float64
 2   Feed                                       43 non-null     float64
 3   Farm                                       43 non-null     float64
 4   Processing                                 43 non-null     float64
 5   Transport                                  43 non-null     float64
 6   Packaging                                  43 non-null     float64
 7   Retail                                     43 non-null     float64
 8   Total from Land to Retail                  43 non-null     float64
 9   Total Global Average GHG Emissions per kg  43 non-null     float64
 10  Unit of GHG Emissions       

In [7]:
data.head()

| | Food product | Land Use Change | Feed | Farm | Processing | Transport | Packaging | Retail | Total from Land to Retail | Total Global Average GHG Emissions per kg | Unit of GHG Emissions |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Apples | -0.03 | 0.0 | 0.23 | 0.0 | 0.1 | 0.04 | 0.02 | 0.36 | 0.43 | kg CO2e per kg food produced |
| 1 | Bananas | -0.03 | 0.0 | 0.27 | 0.06 | 0.29 | 0.07 | 0.02 | 0.68 | 0.86 | kg CO2e per kg food produced |
| 2 | Barley | 0.01 | 0.0 | 0.18 | 0.13 | 0.04 | 0.5 | 0.26 | 1.11 | 1.18 | kg CO2e per kg food produced |
| 3 | Beef (beef herd) | 16.28 | 1.88 | 39.39 | 1.27 | 0.35 | 0.25 | 0.16 | 59.57 | 99.48 | kg CO2e per kg food produced |
| 4 | Beef (dairy herd) | 0.91 | 2.51 | 15.69 | 1.11 | 0.42 | 0.27 | 0.18 | 21.09 | 33.3 | kg CO2e per kg food produced |


In [8]:
print(f"any duplicates: {data.duplicated().values.any()}")

any duplicates: False


The data contains no duplicate rows and no NaN values, and every column already has a suitable dtype, so no cleaning is needed.
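The missing-value claim can also be checked explicitly instead of being read off the `data.info()` output. The following is a minimal sketch, assuming a small hand-built frame: `sample`, `missing_per_column` and `duplicate_rows` are illustrative names, and the three rows are copied from the `data.head()` output so the block runs without the CSV file. In the notebook itself the same two calls would be applied to `data`.

```python
import pandas as pd

# Illustrative subset copied from the data.head() output above
# (assumption: the real notebook would use the full `data` frame instead).
sample = pd.DataFrame({
    "Food product": ["Apples", "Bananas", "Barley"],
    "Farm": [0.23, 0.27, 0.18],
    "Transport": [0.10, 0.29, 0.04],
})

# Number of missing values in each column; all zeros means nothing to impute.
missing_per_column = sample.isna().sum()

# Number of rows that are exact copies of an earlier row.
duplicate_rows = int(sample.duplicated().sum())

print(missing_per_column)
print(f"duplicate rows: {duplicate_rows}")
```

Reporting counts per column has the advantage that, if a value were missing, the output would show which column is affected.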

In [9]:
data.describe()

| | Land Use Change | Feed | Farm | Processing | Transport | Packaging | Retail | Total from Land to Retail | Total Global Average GHG Emissions per kg |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| count | 43.0 | 43.0 | 43.0 | 43.0 | 43.0 | 43.0 | 43.0 | 43.0 | 43.0 |
| mean | 1.26 | 0.46 | 3.47 | 0.26 | 0.19 | 0.27 | 0.08 | 6.0 | 9.47 |
| std | 3.35 | 0.92 | 7.08 | 0.37 | 0.16 | 0.34 | 0.09 | 10.49 | 18.07 |
| min | -2.05 | 0.0 | 0.09 | 0.0 | 0.04 | 0.04 | 0.01 | 0.28 | 0.39 |
| 25% | 0.0 | 0.0 | 0.34 | 0.0 | 0.09 | 0.06 | 0.03 | 0.91 | 1.02 |
| 50% | 0.18 | 0.0 | 0.85 | 0.07 | 0.13 | 0.1 | 0.04 | 1.61 | 2.48 |
| 75% | 0.81 | 0.0 | 2.25 | 0.3 | 0.22 | 0.32 | 0.11 | 6.02 | 6.82 |
| max | 16.28 | 2.94 | 39.39 | 1.27 | 0.78 | 1.63 | 0.33 | 59.57 | 99.48 |
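In the five rows shown by `data.head()`, the seven stage columns add up to `Total from Land to Retail` to within 0.01. The sketch below checks that relationship on those five rows. The names `sample`, `stage_columns`, `difference` and `consistent` are illustrative, and the 0.05 tolerance is an assumption chosen to absorb two-decimal rounding. Whether the relationship holds for all 43 rows is not confirmed here and would need the same check on `data`.

```python
import pandas as pd

# Per-stage emission columns, as listed by data.info()
stage_columns = ["Land Use Change", "Feed", "Farm", "Processing",
                 "Transport", "Packaging", "Retail"]

# The five rows printed by data.head(), copied by hand (illustrative subset)
sample = pd.DataFrame({
    "Food product": ["Apples", "Bananas", "Barley",
                     "Beef (beef herd)", "Beef (dairy herd)"],
    "Land Use Change": [-0.03, -0.03, 0.01, 16.28, 0.91],
    "Feed": [0.0, 0.0, 0.0, 1.88, 2.51],
    "Farm": [0.23, 0.27, 0.18, 39.39, 15.69],
    "Processing": [0.0, 0.06, 0.13, 1.27, 1.11],
    "Transport": [0.10, 0.29, 0.04, 0.35, 0.42],
    "Packaging": [0.04, 0.07, 0.50, 0.25, 0.27],
    "Retail": [0.02, 0.02, 0.26, 0.16, 0.18],
    "Total from Land to Retail": [0.36, 0.68, 1.11, 59.57, 21.09],
})

# Row-wise sum of the stage columns
stage_sum = sample[stage_columns].sum(axis=1)

# Absolute gap between the recomputed sum and the reported total
difference = (stage_sum - sample["Total from Land to Retail"]).abs()

# Assumed tolerance of 0.05 to allow for rounding in the published values
consistent = difference <= 0.05

print(pd.DataFrame({"Food product": sample["Food product"],
                    "difference": difference,
                    "consistent": consistent}))
```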


### Data Analysis 

Which food has the highest and which the lowest total GHG emissions?

In [18]:
total_col = 'Total Global Average GHG Emissions per kg'
# Select the rows holding the largest and the smallest total emissions
emissions_max = data[data[total_col] == data[total_col].max()]
emissions_min = data[data[total_col] == data[total_col].min()]
print(f"Food with the most total GHG emissions: {emissions_max['Food product'].values[0]} - {emissions_max[total_col].values[0]} per kg")
print(f"Food with the least total GHG emissions: {emissions_min['Food product'].values[0]} - {emissions_min[total_col].values[0]} per kg")

Food with the most total GHG emissions: Beef (beef herd) - 99.48 per kg
Food with the least total GHG emissions: Citrus Fruit - 0.39 per kg
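To see more than the single extremes, the same question can be answered with a ranking. The sketch below uses `DataFrame.nlargest` and `DataFrame.nsmallest` on a hand-built frame holding only the five rows from `data.head()`. `sample`, `top_emitters` and `lowest_emitters` are illustrative names. Because the subset leaves out most of the 43 products, its lowest entry differs from the full-data result above; in the notebook the calls would be applied to `data`.

```python
import pandas as pd

total_col = "Total Global Average GHG Emissions per kg"

# Illustrative subset: the five rows printed by data.head()
sample = pd.DataFrame({
    "Food product": ["Apples", "Bananas", "Barley",
                     "Beef (beef herd)", "Beef (dairy herd)"],
    total_col: [0.43, 0.86, 1.18, 99.48, 33.3],
})

# Three products with the highest total emissions, largest first
top_emitters = sample.nlargest(3, total_col)

# Three products with the lowest total emissions, smallest first
lowest_emitters = sample.nsmallest(3, total_col)

print(top_emitters[["Food product", total_col]])
print(lowest_emitters[["Food product", total_col]])
```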
