<a href="https://colab.research.google.com/github/RaghadAlzahranii/Food-waste-analysis/blob/main/food_waste_analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Food Waste Analysis**
### By Raghad Alzahrani

**📖 Description**:  

This notebook presents an analysis of global food waste in honor of World Food Day. It explores food waste by region and sector, estimates the number of people who could have been fed using the wasted food, and calculates the environmental impact in terms of carbon emissions. The analysis provides actionable insights supported by data visualizations to raise awareness and drive discussion on reducing food waste, feeding more people, and addressing climate change.

📊 **Key Insights:**

* Total global food waste: about 931 million tonnes per year.
* People who could have been fed: about 1.86 billion per year, assuming 500 kg of food per person per year.
* Total CO2 emissions from food waste: about 2.33 billion tonnes per year, assuming 2.5 tonnes of CO2 per tonne of food waste.

Food waste by sector:

* Households: 569 million tonnes/year
* Retail: 118 million tonnes/year
* Food services: 244 million tonnes/year
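
As a quick arithmetic check on the sector figures above, the sketch below sums the rounded sector totals and derives each sector's share. It uses only the rounded numbers quoted in this list, so the shares are approximate; the variable names are illustrative and not part of the notebook.

```python
# Rounded sector totals quoted in the Key Insights list (million tonnes/year)
sector_totals_mt = {"Households": 569, "Retail": 118, "Food Services": 244}

# Sum of the rounded sector figures
total_mt = sum(sector_totals_mt.values())

# Each sector's share of that sum
sector_shares = {name: value / total_mt for name, value in sector_totals_mt.items()}

print(f"Sum of sector totals: {total_mt} million tonnes/year")
for name, share in sector_shares.items():
    print(f"{name}: {share:.1%}")
```

The rounded sector figures sum to 931 million tonnes, which matches the total of 930,857,271 tonnes computed in Step 3.1, and households account for roughly 61% of it.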

📂 **Dataset:**

The dataset used for this analysis is publicly available on Kaggle.
You can download it from the following link:

Kaggle Dataset: [Food Waste Data](https://www.kaggle.com/datasets/joebeachcapital/food-waste/data)

# Step 1: Import Libraries and Load Data

In [None]:
# Import libraries
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline

# Load the dataset
file_path = '/content/Food_waste.csv'
df = pd.read_csv(file_path)

# Step 2: Exploratory Data Analysis (EDA)

In [None]:
df.head(5)

Unnamed: 0,Country,combined figures (kg/capita/year),Household estimate (kg/capita/year),Household estimate (tonnes/year),Retail estimate (kg/capita/year),Retail estimate (tonnes/year),Food service estimate (kg/capita/year),Food service estimate (tonnes/year),Confidence in estimate,M49 code,Region,Source
0,Afghanistan,126,82,3109153,16,594982,28,1051783,Very Low Confidence,4,Southern Asia,https://www.unep.org/resources/report/unep-foo...
1,Albania,127,83,238492,16,45058,28,79651,Very Low Confidence,8,Southern Europe,https://www.unep.org/resources/report/unep-foo...
2,Algeria,135,91,3918529,16,673360,28,1190335,Very Low Confidence,12,Northern Africa,https://www.unep.org/resources/report/unep-foo...
3,Andorra,123,84,6497,13,988,26,1971,Low Confidence,20,Southern Europe,https://www.unep.org/resources/report/unep-foo...
4,Angola,144,100,3169523,16,497755,28,879908,Very Low Confidence,24,Sub-Saharan Africa,https://www.unep.org/resources/report/unep-foo...


In [None]:
df.tail(5)

Unnamed: 0,Country,combined figures (kg/capita/year),Household estimate (kg/capita/year),Household estimate (tonnes/year),Retail estimate (kg/capita/year),Retail estimate (tonnes/year),Food service estimate (kg/capita/year),Food service estimate (tonnes/year),Confidence in estimate,M49 code,Region,Source
209,Venezuela (Boliv. Rep. of),116,72,2065461,16,445994,28,788407,Very Low Confidence,862,Latin America and the Caribbean,https://www.unep.org/resources/report/unep-foo...
210,Viet Nam,120,76,7346717,16,1508689,28,2666991,Very Low Confidence,704,South-eastern Asia,https://www.unep.org/resources/report/unep-foo...
211,Yemen,148,104,3026946,16,456099,28,806270,Very Low Confidence,887,Western Asia,https://www.unep.org/resources/report/unep-foo...
212,Zambia,122,78,1391729,16,279350,28,493822,Very Low Confidence,894,Sub-Saharan Africa,https://www.unep.org/resources/report/unep-foo...
213,Zimbabwe,144,100,1458564,16,229059,28,404920,Very Low Confidence,716,Sub-Saharan Africa,https://www.unep.org/resources/report/unep-foo...


In [None]:
df.shape

(214, 12)

**The dataset consists of 12 features and 214 country records.**

In [None]:
df.isna().sum()

Unnamed: 0,0
Country,0
combined figures (kg/capita/year),0
Household estimate (kg/capita/year),0
Household estimate (tonnes/year),0
Retail estimate (kg/capita/year),0
Retail estimate (tonnes/year),0
Food service estimate (kg/capita/year),0
Food service estimate (tonnes/year),0
Confidence in estimate,0
M49 code,0


**No missing values in the dataset**

In [None]:
df.duplicated().sum()

0

**No duplicates in the dataset**

In [None]:
df.columns

Index(['Country', 'combined figures (kg/capita/year)',
       'Household estimate (kg/capita/year)',
       'Household estimate (tonnes/year)', 'Retail estimate (kg/capita/year)',
       'Retail estimate (tonnes/year)',
       'Food service estimate (kg/capita/year)',
       'Food service estimate (tonnes/year)', 'Confidence in estimate',
       'M49 code', 'Region', 'Source'],
      dtype='object')

In [None]:
df.drop(['M49 code', 'Source'], axis=1, inplace=True)

In [None]:
df['Confidence in estimate'].value_counts()

Unnamed: 0_level_0,count
Confidence in estimate,Unnamed: 1_level_1
Very Low Confidence,130
Low Confidence,61
Medium Confidence,13
High Confidence,10


### **Out of 214 countries:**

* 130 countries have a very low confidence estimate.
* 10 countries have a high confidence estimate.
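
To put these counts in proportion, the sketch below converts them to shares of the 214 countries. The counts are copied from the `value_counts()` output above; the variable names are illustrative.

```python
# Counts copied from the value_counts() output above
confidence_counts = {
    "Very Low Confidence": 130,
    "Low Confidence": 61,
    "Medium Confidence": 13,
    "High Confidence": 10,
}

total_countries = sum(confidence_counts.values())

# Share of countries at each confidence level
confidence_shares = {
    level: count / total_countries for level, count in confidence_counts.items()
}

# Combined share of the two lowest confidence levels
low_or_very_low_share = (
    confidence_counts["Very Low Confidence"] + confidence_counts["Low Confidence"]
) / total_countries

for level, share in confidence_shares.items():
    print(f"{level}: {share:.1%}")
print(f"Low or very low confidence: {low_or_very_low_share:.1%}")
```

On the full DataFrame, `df['Confidence in estimate'].value_counts(normalize=True)` should give the same proportions directly.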


## 2.1: Total Food Waste by Region and Sector

In [None]:
# Group data by region and calculate food waste per region
region_waste = df.groupby('Region')[['Household estimate (tonnes/year)',
                                     'Retail estimate (tonnes/year)',
                                     'Food service estimate (tonnes/year)']].sum().reset_index()

# Melt data to long format
region_waste_long = pd.melt(region_waste, id_vars='Region', var_name='Sector', value_name='Tonnes/year')

# Create an interactive bar chart using Plotly
fig = px.bar(region_waste_long, x='Region', y='Tonnes/year', color='Sector',
             title='Food Waste by Region and Sector (Tonnes/year)',
             labels={'Tonnes/year': 'Tonnes of Food Waste'},
             text_auto=True)

# Update layout for better readability
fig.update_layout(xaxis_tickangle=-45)
fig.show()

# Step 3: Key Insights and Calculations

## 3.1: Calculate Total Food Waste (in tonnes/year)

In [None]:
# Calculate total food waste across sectors
df['Total Waste (tonnes/year)'] = df['Household estimate (tonnes/year)'] + df['Retail estimate (tonnes/year)'] + df['Food service estimate (tonnes/year)']

# Get the total food waste globally (in tonnes)
total_waste_tonnes = df['Total Waste (tonnes/year)'].sum()

print(f"Total Food Waste (in tonnes/year): {total_waste_tonnes:.2f}")

Total Food Waste (in tonnes/year): 930857271.00
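
The same calculation can be sketched on a small sample to make the row-wise sum easy to verify by hand. The two rows below are copied from the `df.head()` output in Step 2; the names `sample` and `sector_cols` are illustrative, and `sum(axis=1)` is used here as an equivalent alternative to adding the three columns explicitly.

```python
import pandas as pd

# First two rows of the dataset, copied from the df.head() output
sample = pd.DataFrame({
    "Country": ["Afghanistan", "Albania"],
    "Household estimate (tonnes/year)": [3109153, 238492],
    "Retail estimate (tonnes/year)": [594982, 45058],
    "Food service estimate (tonnes/year)": [1051783, 79651],
})

sector_cols = [
    "Household estimate (tonnes/year)",
    "Retail estimate (tonnes/year)",
    "Food service estimate (tonnes/year)",
]

# Row-wise sum across the three sector columns
sample["Total Waste (tonnes/year)"] = sample[sector_cols].sum(axis=1)

# Sum over all rows of the sample
sample_total = sample["Total Waste (tonnes/year)"].sum()

print(sample[["Country", "Total Waste (tonnes/year)"]])
print(f"Sample total (tonnes/year): {sample_total}")
```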


## 3.2: Estimating the Number of People Who Could Have Been Fed

In [None]:
# Assumed amount of food per person per year (in kg)
food_per_person_per_year = 500  # in kg

# Convert tonnes to kg
total_waste_kg = total_waste_tonnes * 1000

# Calculate number of people that could be fed
people_fed = total_waste_kg / food_per_person_per_year

print(f"Number of people who could have been fed: {people_fed:.0f}")

Number of people who could have been fed: 1861714542
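
Because the result depends entirely on the assumed 500 kg of food per person per year, it can help to see how it moves under other assumptions. The sketch below wraps the calculation in a hypothetical helper, `people_fed`, and evaluates it for 500 kg (the notebook's value) and for 400 kg and 700 kg, which are illustrative alternatives rather than sourced figures.

```python
def people_fed(total_waste_tonnes: float, kg_per_person_per_year: float) -> float:
    """Return how many people the wasted food could feed for one year."""
    if kg_per_person_per_year <= 0:
        raise ValueError("kg_per_person_per_year must be positive")
    # Convert tonnes to kg, then divide by annual consumption per person
    return total_waste_tonnes * 1000 / kg_per_person_per_year


TOTAL_WASTE_TONNES = 930_857_271  # total computed in Step 3.1

# 500 kg is the notebook's assumption; 400 and 700 are hypothetical alternatives
for kg in (400, 500, 700):
    print(f"{kg} kg/person/year -> {people_fed(TOTAL_WASTE_TONNES, kg):,.0f} people")
```

The estimate is inversely proportional to the assumed consumption, so a higher assumed intake per person gives a smaller number of people fed.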


## 3.3: Estimating Carbon Emissions from Food Waste

In [None]:
# CO2 emission factor (tonnes of CO2 per tonne of food waste)
co2_emission_factor = 2.5  # tonnes of CO2/tonne of food waste

# Calculate total carbon emissions from food waste (in tonnes)
total_co2_emissions = total_waste_tonnes * co2_emission_factor

print(f"Total CO2 Emissions (in tonnes): {total_co2_emissions:.2f}")

Total CO2 Emissions (in tonnes): 2327143177.50
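
The same emission factor can be applied per sector to see where the estimated emissions come from. This sketch uses the rounded sector totals from the Key Insights list and the notebook's assumed factor of 2.5 tonnes of CO2 per tonne of waste, so the result is approximate; the variable names are illustrative.

```python
# Notebook's assumed emission factor (tonnes of CO2 per tonne of food waste)
CO2_EMISSION_FACTOR = 2.5

# Rounded sector totals from the Key Insights list (million tonnes/year)
sector_totals_mt = {"Households": 569, "Retail": 118, "Food Services": 244}

# Estimated emissions per sector (million tonnes of CO2/year)
sector_emissions_mt = {
    name: tonnes * CO2_EMISSION_FACTOR for name, tonnes in sector_totals_mt.items()
}

# Total in gigatonnes (1 Gt = 1000 Mt)
total_emissions_gt = sum(sector_emissions_mt.values()) / 1000

for name, emissions in sector_emissions_mt.items():
    print(f"{name}: {emissions:,.1f} Mt CO2/year")
print(f"Total: {total_emissions_gt:.2f} Gt CO2/year")
```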


# Step 4: Data Visualizations

## 4.1: Total Food Waste by Sector (Household, Retail, Food Service)

In [None]:
# Sum food waste per sector
sector_totals = df[['Household estimate (tonnes/year)', 'Retail estimate (tonnes/year)', 'Food service estimate (tonnes/year)']].sum().reset_index()
sector_totals.columns = ['Sector', 'Tonnes/year']

# Plotting using Plotly
fig = px.bar(sector_totals, x='Sector', y='Tonnes/year', text='Tonnes/year',
             title='Total Food Waste by Sector (Tonnes/year)',
             labels={'Tonnes/year': 'Tonnes of Food Waste'},
             color='Sector')

# Update layout for better readability
fig.update_layout(xaxis_tickangle=-45)
fig.show()

## 4.2: Scatter Plot of Food Waste vs. Confidence in Estimates

In [None]:
# Map confidence levels to numeric values for better visualization
confidence_mapping = {'Very Low Confidence': 1, 'Low Confidence': 2, 'Medium Confidence': 3, 'High Confidence': 4}
df['Confidence Level'] = df['Confidence in estimate'].map(confidence_mapping)

fig = px.scatter(df, x='Total Waste (tonnes/year)', y='Confidence Level',
                 color='Region', title='Food Waste vs. Confidence in Estimates',
                 labels={'Total Waste (tonnes/year)': 'Total Waste (tonnes/year)',
                         'Confidence Level': 'Confidence Level'},
                 hover_name='Country')

fig.update_layout(yaxis=dict(tickmode='array', tickvals=[1, 2, 3, 4],
                              ticktext=['Very Low', 'Low', 'Medium', 'High']))
fig.show()

This scatter plot shows total food waste against the confidence in each country's estimate, colored by region. It helps reveal whether waste levels are related to the reliability of the data.
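
One way to quantify the relationship the scatter plot suggests is a Spearman rank correlation, which suits an ordinal variable such as the confidence level. The sketch below computes it as the Pearson correlation of the ranks, using only the five rows shown in the `df.head()` output (totals are the sums of the three sector columns). A five-row sample is far too small to draw conclusions from; it only illustrates the mechanics, and this check is an addition rather than part of the original analysis.

```python
import pandas as pd

# Five rows from the df.head() output; totals are the sum of the three sector columns
sample = pd.DataFrame({
    "Country": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "Total Waste (tonnes/year)": [4755918, 363201, 5782224, 9456, 4547186],
    "Confidence in estimate": [
        "Very Low Confidence", "Very Low Confidence", "Very Low Confidence",
        "Low Confidence", "Very Low Confidence",
    ],
})

# Same ordinal mapping as in the notebook
confidence_mapping = {
    "Very Low Confidence": 1, "Low Confidence": 2,
    "Medium Confidence": 3, "High Confidence": 4,
}
sample["Confidence Level"] = sample["Confidence in estimate"].map(confidence_mapping)

# Spearman correlation is the Pearson correlation of the ranks
# (tied values receive their average rank)
waste_ranks = sample["Total Waste (tonnes/year)"].rank()
confidence_ranks = sample["Confidence Level"].rank()
spearman_rho = waste_ranks.corr(confidence_ranks)

print(f"Spearman rho on the 5-row sample: {spearman_rho:.3f}")
```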

## 4.3: Heatmap of Food Waste by Region and Sector

In [None]:
heatmap_data = region_waste.set_index('Region').T
fig = go.Figure(data=go.Heatmap(
                   z=heatmap_data.values,
                   x=heatmap_data.columns,
                   y=heatmap_data.index,
                   colorscale='Viridis'))

fig.update_layout(title='Heatmap of Food Waste by Region and Sector',
                  xaxis_title='Region',
                  yaxis_title='Sector',
                  xaxis=dict(tickangle=-45))
fig.show()

The heatmap shows food waste by region and sector, making it easy to see which regions contribute the most waste in each sector and where food waste reduction initiatives could be targeted.
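
Because absolute tonnes are dominated by the largest sector and the most populous regions, a heatmap of shares can be easier to compare. The sketch below shows one possible normalization: each region's share of the total within a sector. It uses only the five rows from the `df.head()` output, so the resulting shares describe that sample and not the full dataset; `region_shares` is an illustrative name.

```python
import pandas as pd

# Five rows from the df.head() output
sample = pd.DataFrame({
    "Region": ["Southern Asia", "Southern Europe", "Northern Africa",
               "Southern Europe", "Sub-Saharan Africa"],
    "Household estimate (tonnes/year)": [3109153, 238492, 3918529, 6497, 3169523],
    "Retail estimate (tonnes/year)": [594982, 45058, 673360, 988, 497755],
    "Food service estimate (tonnes/year)": [1051783, 79651, 1190335, 1971, 879908],
})

# Total waste per region and sector, indexed by region
region_waste = sample.groupby("Region").sum()

# Divide each sector column by its own total so every column sums to 1
region_shares = region_waste.div(region_waste.sum(axis=0), axis="columns")

print(region_shares.round(3))
```

The transposed result, `region_shares.T.values`, could be passed as `z` to `go.Heatmap` in the same way as `heatmap_data.values` above.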
