## **Environmental Impact of Food Production Analysis**

### Business Understanding  
The environmental impact of food production is a critical global issue, affecting climate change, water scarcity, and biodiversity. As food demand rises with population growth and economic development, the challenge is to produce food sustainably while minimizing negative environmental consequences. This project seeks to analyze the environmental footprint of different food products and identify ways to reduce their impact.

### Objective/Goals 
- Assess the environmental impact of food production across supply-chain stages and impact categories (e.g., land use, water usage, carbon emissions).  
- Identify the most and least sustainable food products based on environmental factors.  
- Provide data-driven insights to support policymakers, businesses, and consumers in making informed decisions.  
- Recommend strategies to reduce the environmental footprint of food production.  

### Problem Statement  
Food production contributes significantly to greenhouse gas emissions, water pollution, and deforestation. However, the environmental impact varies across food types and production methods. Without proper analysis, policymakers, businesses, and consumers lack the insights needed to adopt more sustainable food choices. This project aims to quantify and analyze the environmental impact of food production to guide sustainable practices.

### Stakeholders 
- **Government and policymakers** – To develop regulations and policies for sustainable food production.  
- **Agricultural and food industries** – To adopt environmentally friendly farming and production techniques.  
- **Consumers** – To make informed choices about sustainable food consumption.  
- **Environmental organizations** – To advocate for sustainable food systems.  
- **Researchers and data scientists** – To further study the impact of food production on the environment.  

### Features (Key Data Points)  
1. Land use change (kg CO₂-equivalents per kg product)  
2. Animal feed (kg CO₂-equivalents per kg product)  
3. Farm emissions (kg CO₂-equivalents per kg product)  
4. Processing emissions (kg CO₂-equivalents per kg product)  
5. Transport emissions (kg CO₂-equivalents per kg product)  
6. Packaging emissions (kg CO₂-equivalents per kg product)  
7. Retail emissions (kg CO₂-equivalents per kg product)  
8. Water usage per kg of food (liters per kilogram)  
9. Impact on eutrophication (water pollution due to excess nutrients, in gPO₄eq)  
10. Biodiversity impact  
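
Features 1 to 7 share one unit, so they can be summed into a single footprint per product and compared with the dataset's `Total_emissions` column. The following is a minimal sketch of that consistency check. It assumes the column names as they appear in the dataset preview (including the dataset's own `Packging` spelling) and uses only the five preview rows as inline data; the names `sample`, `stage_sum`, and `matches_total` and the 0.05 tolerance are illustrative choices, not part of the dataset.

```python
import pandas as pd

# Stage columns as spelled in the dataset ("Packging" is the dataset's own spelling)
STAGE_COLUMNS = ["Land use change", "Animal Feed", "Farm", "Processing",
                 "Transport", "Packging", "Retail"]

# The five rows shown in the dataset preview (kg CO2-equivalents per kg product)
rows = [
    ("Wheat & Rye (Bread)", 0.1, 0.0, 0.8, 0.2, 0.1, 0.1, 0.1, 1.4),
    ("Maize (Meal)",        0.3, 0.0, 0.5, 0.1, 0.1, 0.1, 0.0, 1.1),
    ("Barley (Beer)",       0.0, 0.0, 0.2, 0.1, 0.0, 0.5, 0.3, 1.1),
    ("Oatmeal",             0.0, 0.0, 1.4, 0.0, 0.1, 0.1, 0.0, 1.6),
    ("Rice",                0.0, 0.0, 3.6, 0.1, 0.1, 0.1, 0.1, 4.0),
]
sample = pd.DataFrame(rows, columns=["Food product", *STAGE_COLUMNS, "Total_emissions"])

# Recompute the total from the stage columns and compare it with the reported total.
# The tolerance (an assumption) absorbs rounding in values reported to one decimal.
sample["stage_sum"] = sample[STAGE_COLUMNS].sum(axis=1)
sample["matches_total"] = (sample["stage_sum"] - sample["Total_emissions"]).abs() < 0.05

print(sample[["Food product", "stage_sum", "Total_emissions", "matches_total"]])
```

Running the same check on the full DataFrame would flag any product whose reported total departs from the sum of its stages.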

### Hypotheses  
1. **Animal-based food production has a higher environmental footprint than plant-based food.**  
2. **Transport emissions significantly contribute to the total carbon footprint of food production.**  
3. **Processed foods have a higher carbon footprint due to additional energy consumption.**  
4. **Local food production results in lower transport-related emissions.**  
5. **Higher land use change values correlate with higher biodiversity loss.**  
6. **Water-intensive crops contribute more to water scarcity issues.**  
7. **Sustainable farming methods (e.g., organic, regenerative agriculture) result in lower environmental impact.**  
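
As an illustration of how the first hypothesis could be tested, the sketch below compares total emissions of animal-based and plant-based products with a one-sided Mann-Whitney U test. This is one reasonable choice of test, not necessarily the method used later in the analysis. The dataset preview shows no animal/plant label, so the function assumes a hand-made list of animal-based product names (`animal_products`, a hypothetical input); the function name is also illustrative.

```python
import pandas as pd
from scipy import stats


def compare_animal_vs_plant(df, animal_products,
                            value_col="Total_emissions",
                            name_col="Food product"):
    """One-sided Mann-Whitney U test: are animal-based values larger?

    `animal_products` is a hand-made list of product names (an assumption
    of this sketch); every other product is treated as plant-based.
    """
    is_animal = df[name_col].isin(animal_products)
    animal = df.loc[is_animal, value_col].dropna()
    plant = df.loc[~is_animal, value_col].dropna()

    # alternative="greater" tests whether the animal-based values tend to be larger
    result = stats.mannwhitneyu(animal, plant, alternative="greater")

    return {
        "animal_median": animal.median(),
        "plant_median": plant.median(),
        "u_statistic": result.statistic,
        "p_value": result.pvalue,
    }
```

A rank-based test is used here because emission values across food products are likely to be skewed, which makes a comparison of medians more robust than a comparison of means.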

### 7 Analytical Questions  
1. Which food products have the highest and lowest carbon emissions?  
2. How does the environmental impact of animal-based foods compare to plant-based foods?  
3. What is the correlation between land use change and biodiversity loss?  
4. Which stage of food production (farm, processing, transport, packaging, retail) contributes the most to carbon emissions?  
5. How does food transportation distance affect total emissions?  
6. Which foods contribute the most to eutrophication (water pollution)?  
7. What are the most sustainable food options based on environmental impact metrics?  
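
Questions 1 and 4 reduce to simple aggregations over the emission columns. The sketch below is one way to express them, assuming the column names from the dataset preview; the function names `stage_shares` and `extreme_products` are hypothetical helpers introduced here for illustration.

```python
import pandas as pd

# Stage columns as spelled in the dataset ("Packging" is the dataset's own spelling)
STAGE_COLUMNS = ["Land use change", "Animal Feed", "Farm", "Processing",
                 "Transport", "Packging", "Retail"]


def stage_shares(df, stage_cols=STAGE_COLUMNS):
    """Question 4: share of each stage in the summed emissions of all products.

    Assumes stage values are non-negative; with negative values the shares
    are harder to interpret.
    """
    totals = df[list(stage_cols)].sum()
    return (totals / totals.sum()).sort_values(ascending=False)


def extreme_products(df, n=5, value_col="Total_emissions", name_col="Food product"):
    """Question 1: the n products with the highest and lowest total emissions."""
    ranked = df.set_index(name_col)[value_col].dropna()
    return ranked.nlargest(n), ranked.nsmallest(n)
```

Once the data is loaded, `stage_shares(df_Food_Production)` would return the stages ordered from largest to smallest contribution, and `extreme_products(df_Food_Production)` would return the two ranked lists.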

This structured approach will help extract actionable insights from the dataset, supporting sustainable decision-making in food production and consumption.

### Data Understanding & Preparation
Importing all the relevant libraries

In [1]:
# Data manipulation and analysis
import pandas as pd
import numpy as np

# To load multiple files
import glob 

# Data visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Statistical analysis
from scipy import stats

# Date and time handling
from datetime import datetime

# Geospatial analysis (if needed for visualizing trade routes)
import geopandas as gpd

# Machine learning (if needed for predictive modeling)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# For handling large datasets (if needed)
import dask.dataframe as dd

# For interactive visualizations (optional)
import plotly.express as px
import plotly.graph_objects as go

# For data profiling with ydata-profiling (optional)
#import ydata_profiling
#from ydata_profiling import ProfileReport


# For handling missing data
from sklearn.impute import SimpleImputer

# For encoding categorical variables
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# For advanced visualizations (optional)
import altair as alt

# For working with Excel files (if your data is in Excel format)
import openpyxl

# For reading data from different file formats
import pyarrow

# For low-level CSV reading and writing (standard library)
import csv

# For system operations
import os
import sys

# For progress bars in data processing
from tqdm import tqdm

# Set plotting style (optional; the 'seaborn' style name was renamed to 'seaborn-v0_8' in Matplotlib 3.6)
# plt.style.use('seaborn-v0_8')

### Load all datasets from their sources

In [2]:
# Path to the CSV file
file_path = '../EIFPA_Data/Food_Production.csv'

# Fail early with a clear message if the file is missing, so that
# df_Food_Production is never referenced while undefined
if not os.path.exists(file_path):
    raise FileNotFoundError(f"File does not exist at the specified path: {file_path}")

print("File exists at the specified path.")

# Read the CSV file into a pandas DataFrame
df_Food_Production = pd.read_csv(file_path)

# Display the first five rows
df_Food_Production.head()

File exists at the specified path.


| Unnamed: 0 | Food product | Land use change | Animal Feed | Farm | Processing | Transport | Packging | Retail | Total_emissions | Eutrophying emissions per 1000kcal (gPO₄eq per 1000kcal) | ... | Freshwater withdrawals per 100g protein (liters per 100g protein) | Freshwater withdrawals per kilogram (liters per kilogram) | Greenhouse gas emissions per 1000kcal (kgCO₂eq per 1000kcal) | Greenhouse gas emissions per 100g protein (kgCO₂eq per 100g protein) | Land use per 1000kcal (m² per 1000kcal) | Land use per kilogram (m² per kilogram) | Land use per 100g protein (m² per 100g protein) | Scarcity-weighted water use per kilogram (liters per kilogram) | Scarcity-weighted water use per 100g protein (liters per 100g protein) | Scarcity-weighted water use per 1000kcal (liters per 1000 kilocalories) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Wheat & Rye (Bread) | 0.1 | 0.0 | 0.8 | 0.2 | 0.1 | 0.1 | 0.1 | 1.4 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | Maize (Meal) | 0.3 | 0.0 | 0.5 | 0.1 | 0.1 | 0.1 | 0.0 | 1.1 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2 | Barley (Beer) | 0.0 | 0.0 | 0.2 | 0.1 | 0.0 | 0.5 | 0.3 | 1.1 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3 | Oatmeal | 0.0 | 0.0 | 1.4 | 0.0 | 0.1 | 0.1 | 0.0 | 1.6 | 4.281357 | ... | 371.076923 | 482.4 | 0.945482 | 1.907692 | 2.897446 | 7.6 | 5.846154 | 18786.2 | 14450.92308 | 7162.104461 |
| 4 | Rice | 0.0 | 0.0 | 3.6 | 0.1 | 0.1 | 0.1 | 0.1 | 4.0 | 9.514379 | ... | 3166.760563 | 2248.4 | 1.207271 | 6.267606 | 0.759631 | 2.8 | 3.943662 | 49576.3 | 69825.77465 | 13449.89148 |
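
The first three rows of the preview have no values in the per-kilocalorie, per-protein, and per-kilogram columns, so quantifying missing data is a natural next step. The sketch below is one possible way to do it. The helper name `missing_summary` is hypothetical, and the inline `preview` frame reproduces only three columns of the five rows shown above.

```python
import numpy as np
import pandas as pd


def missing_summary(df):
    """Count and percentage of missing values per column, most incomplete first."""
    counts = df.isna().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (100 * counts / len(df)).round(1),
    })
    # Keep only columns that actually have gaps
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


# Three columns of the five preview rows; NaN marks the cells with no value
preview = pd.DataFrame({
    "Food product": ["Wheat & Rye (Bread)", "Maize (Meal)", "Barley (Beer)",
                     "Oatmeal", "Rice"],
    "Total_emissions": [1.4, 1.1, 1.1, 1.6, 4.0],
    "Land use per kilogram (m² per kilogram)": [np.nan, np.nan, np.nan, 7.6, 2.8],
})

print(missing_summary(preview))
```

Applied to the full dataset, `missing_summary(df_Food_Production)` would show which impact metrics are incomplete and for how many products, which informs whether to impute or to restrict an analysis to complete rows.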
