# Analysing the Impact of Agricultural CO2 Emissions on Climate Change


<div id="agriculture" align="center">

  <img src="agriculture.jpg" width="1000" height="500" alt=""/>

</div>

## Table of Contents

- <b> [1. Project Overview](#1-project-overview)</b>
    - [1.1. Introduction](#11-introduction)
    - [1.2. Objective](#12-objective)
- <b> [2. Importing Packages](#2-importing-packages)</b>
- <b> [3. Data Loading and Inspection](#3-data-loading-and-inspection)</b>
- <b> [4. Data Cleaning](#4-data-cleaning)</b>
- <b> [5. Exploratory Data Analysis (EDA)](#5-exploratory-data-analysis)</b>
- <b> [6. Feature Engineering](#6-feature-engineering)</b>
- <b> [7. Model Development](#7-model-development)</b>
- <b> [8. Model Performance](#8-model-performance)</b>
- <b> [9. Conclusion and Recommendations](#9-conclusion-and-recommendations)</b>


## 1. Project Overview <a class="anchor" id="chapter1"></a>

### 1.1 Introduction <a class="anchor" id="section_1_1"></a>

Climate change is a global issue with significant ramifications for the environment. It is defined as a long-term, significant shift in weather and temperature conditions. It is driven by greenhouse gases that trap the sun's heat, typically carbon dioxide (CO2) or methane (CH4), which results in rising global temperatures. Various industries and sectors contribute to global emissions. The agricultural sector is responsible for roughly a third of these emissions, making it one of the leading sources of CO2 emissions and biodiversity loss. Investigating and understanding the effect the sector has on climate change trends is paramount for devising effective mitigation strategies and implementing sustainable agricultural practices. Such strategies and practices may include agroforestry, crop rotation, biodynamic agriculture and renewable energy integration.

The main objective of this project is to investigate and understand the impact of agricultural activities on climate change. This is achieved by analysing and predicting the effect of various CO2 emission sources on climate change, thereby uncovering underlying patterns and key insights to inform strategy development. The analysis is conducted using data retrieved from the Food and Agriculture Organization (FAO) and the Intergovernmental Panel on Climate Change (IPCC) for various countries/areas. The dataset comprises the average temperature variations in degrees Celsius (&deg;C), the carbon dioxide emissions from several sources and the total emissions in kilotonnes (kt) for the years 1990 to 2020. Climate change is represented as the trend in temperature variations observed in an area. A larger variation denotes more extreme climate change, whereas a smaller variation indicates a more stable climate. The variation can be positive or negative, describing a warming or a cooling climate respectively. The emission sources and population features in the dataset are:
*   Savanna fires 
*   Forest fires 
*   Crop Residues
*   Rice Cultivation
*   Drained organic soils (CO2)
*   Pesticides Manufacturing
*   Food Transport
*   Forestland
*   Net Forest conversion
*   Food Household Consumption
*   Food Retail
*   On-farm Electricity Use
*   Food Packaging
*   Agrifood Systems Waste Disposal
*   Food Processing
*   Fertilizers Manufacturing
*   IPPU
*   Manure applied to Soils
*   Manure left on Pasture
*   Manure Management
*   Fires in organic soils
*   Fires in humid tropical forests
*   On-farm energy use
*   Rural population
*   Urban population
*   Total Male Population
*   Total Female Population

A thorough data analysis procedure is employed, followed by predictive regression modelling. The procedure is reflected in the structure of this notebook. The notebook consists of several sections and exploits Python's extensive capabilities through its regression, data analysis and visualisation libraries. The first section imports the Python packages essential for the analysis. The data loading section loads the dataset and inspects it for errors and inconsistencies. The subsequent section cleans the dataset by handling the errors and inconsistencies found, preparing it for analysis. The exploratory data analysis section draws valuable insights from the dataset using statistical techniques and data visualisations. In the feature engineering section, the dataset is prepared for regression modelling. The predictive model is developed and evaluated in the model development and model performance sections respectively. The insights gathered from both analyses highlight the impact of agricultural activities on climate change and are reviewed in the conclusion and recommendations section, where suitable mitigation strategies are discussed and recommended to stakeholders for implementation.
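
As a rough orientation, the modelling stages outlined above (splitting, scaling, fitting and evaluating) can be sketched on synthetic data. The feature matrix, coefficients and noise level below are invented for illustration and are not taken from the dataset. The choice of `LinearRegression` is also only an assumption, since the notebook imports several regressors.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for emission features (X) and temperature change (y).
rng = np.random.default_rng(seed=0)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + rng.normal(scale=0.05, size=200)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training set only, then apply it to both sets.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Fit the model and evaluate it on the unseen test set.
model = LinearRegression()
model.fit(X_train_scaled, y_train)
y_pred = model.predict(X_test_scaled)

rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2 = r2_score(y_test, y_pred)
print(f"RMSE: {rmse:.3f}, R2: {r2:.3f}")
```

Fitting the scaler on the training rows alone keeps information from the test set out of the model, which is why `fit_transform` and `transform` are called separately.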


### 1.2 Objective <a class="anchor" id="section_1_2"></a>

The key objectives are defined as follows:

*   To perform exploratory data analysis on the agricultural emissions dataset.
*   To identify the relationship between CO2 emissions and temperature variations.
*   To identify and describe the global trend for climate change.
*   To identify the emission sources that contribute most to CO2 emissions and have a major influence on temperature variations.
*   To perform regression analysis on the agricultural emissions dataset.
*   To develop a regression model to predict temperature variations.
*   To offer actionable insights to agricultural stakeholders including policymakers and agricultural organisations.


## 2. Importing Packages <a class="anchor" id="chapter2"></a>

In this section, the libraries required for data analysis and regression analysis are imported. The main libraries are:
*   <b>*Pandas:*</b> Stores data in DataFrames and facilitates data manipulation and analysis.
*   <b>*NumPy:*</b> Supports data manipulation through numerical computations on arrays.
*   <b>*Matplotlib:*</b> Used to create static and interactive data visualisations.
*   <b>*Seaborn:*</b> Builds upon the matplotlib library to create visually appealing data visualisations.
*   <b>*Scikit-learn (sklearn):*</b> Used to build and evaluate machine learning models.

The `%matplotlib inline` line ensures that the figures generated by matplotlib are displayed within the notebook. Additionally, the warnings library is imported to suppress raised warnings and avoid unnecessary interruptions during code execution.
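
Suppressing every warning globally can also hide messages that matter. A narrower alternative, sketched here under the assumption that only `FutureWarning` messages are noise, limits the filter to one category and one block of code. `noisy_step` is a hypothetical stand-in for a library call and does not appear in the notebook.

```python
import warnings


def noisy_step():
    """Hypothetical stand-in for a library call that emits a FutureWarning."""
    warnings.warn("this behaviour will change", FutureWarning)
    return 42


# Ignore only FutureWarning, and only inside this block.
# Other warning categories and code outside the block are unaffected.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=FutureWarning)
    result = noisy_step()

print(result)
```

The filters in place before the `with` block are restored when it exits, so later cells still report warnings as usual.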


In [9]:
## Libraries for data manipulation and analysis
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

## Libraries for regression analysis
from sklearn.model_selection import train_test_split  #Splits data into training and testing sets
from sklearn.preprocessing import StandardScaler #Scales data

from sklearn.linear_model import LinearRegression, Lasso, Ridge, ElasticNet  #Various linear regression model regressors
from sklearn.tree import DecisionTreeRegressor #Decision tree regressor

from sklearn.ensemble import RandomForestRegressor, BaggingRegressor,  AdaBoostRegressor #Ensemble methods regressors

from sklearn.metrics import mean_squared_error, r2_score    #Metrics for evaluating model performance

## Displays output inline
%matplotlib inline

## Libraries for Handling Warnings
import warnings
warnings.filterwarnings('ignore')

## 3. Data Loading and Inspection <a class="anchor" id="chapter3"></a>

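In this section, the dataset is read from a CSV file into a pandas DataFrame and its first five rows are displayed for inspection. A description of each feature follows the preview.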

In [10]:
#reading the dataset into a dataframe
df = pd.read_csv('co2_emissions_from_agri.csv')
#display first five rows of dataframe
df.head()

|  | Area | Year | Savanna fires | Forest fires | Crop Residues | Rice Cultivation | Drained organic soils (CO2) | Pesticides Manufacturing | Food Transport | Forestland | ... | Manure Management | Fires in organic soils | Fires in humid tropical forests | On-farm energy use | Rural population | Urban population | Total Population - Male | Total Population - Female | total_emission | Average Temperature °C |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | 1990 | 14.7237 | 0.0557 | 205.6077 | 686.0 | 0.0 | 11.807483 | 63.1152 | -2388.803 | ... | 319.1763 | 0.0 | 0.0 | NaN | 9655167.0 | 2593947.0 | 5348387.0 | 5346409.0 | 2198.963539 | 0.536167 |
| 1 | Afghanistan | 1991 | 14.7237 | 0.0557 | 209.4971 | 678.16 | 0.0 | 11.712073 | 61.2125 | -2388.803 | ... | 342.3079 | 0.0 | 0.0 | NaN | 10230490.0 | 2763167.0 | 5372959.0 | 5372208.0 | 2323.876629 | 0.020667 |
| 2 | Afghanistan | 1992 | 14.7237 | 0.0557 | 196.5341 | 686.0 | 0.0 | 11.712073 | 53.317 | -2388.803 | ... | 349.1224 | 0.0 | 0.0 | NaN | 10995568.0 | 2985663.0 | 6028494.0 | 6028939.0 | 2356.304229 | -0.259583 |
| 3 | Afghanistan | 1993 | 14.7237 | 0.0557 | 230.8175 | 686.0 | 0.0 | 11.712073 | 54.3617 | -2388.803 | ... | 352.2947 | 0.0 | 0.0 | NaN | 11858090.0 | 3237009.0 | 7003641.0 | 7000119.0 | 2368.470529 | 0.101917 |
| 4 | Afghanistan | 1994 | 14.7237 | 0.0557 | 242.0494 | 705.6 | 0.0 | 11.712073 | 53.9874 | -2388.803 | ... | 367.6784 | 0.0 | 0.0 | NaN | 12690115.0 | 3482604.0 | 7733458.0 | 7722096.0 | 2500.768729 | 0.37225 |
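
Beyond `df.head()`, a few other pandas calls are commonly used for inspection. The following sketch applies them to a small hypothetical frame (`sample`) built from a handful of the values shown above, so that it runs without the CSV file. The variable names are illustrative only.

```python
import pandas as pd

# Hypothetical three-row sample using a few values from the preview above.
sample = pd.DataFrame({
    "Area": ["Afghanistan", "Afghanistan", "Afghanistan"],
    "Year": [1990, 1991, 1992],
    "Rice Cultivation": [686.0, 678.16, 686.0],
    "On-farm energy use": [float("nan")] * 3,
})

n_rows, n_cols = sample.shape   # number of rows and columns
dtypes = sample.dtypes          # data type of each column
summary = sample.describe()     # summary statistics for numeric columns

print(f"{n_rows} rows x {n_cols} columns")
print(dtypes)
print(summary)
```

On the full dataset the same calls would be made on `df`. `df.info()` additionally reports the non-null count per column in a single view.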


*   **Savanna fires:** Emissions from fires in savanna ecosystems.
*   **Forest fires:** Emissions from fires in forested areas.
*   **Crop Residues:** Emissions from burning or decomposing leftover plant material after crop harvesting.
*   **Rice Cultivation:** Emissions from methane released during rice cultivation.
*   **Drained organic soils (CO2):** Emissions from carbon dioxide released when draining organic soils.
*   **Pesticides Manufacturing:** Emissions from the production of pesticides.
*   **Food Transport:** Emissions from transporting food products.
*   **Forestland:** Land covered by forests.
*   **Net Forest conversion:** Change in forest area due to deforestation and afforestation.
*   **Food Household Consumption:** Emissions from food consumption at the household level.
*   **Food Retail:** Emissions from the operation of retail establishments selling food.
*   **On-farm Electricity Use:** Electricity consumption on farms.
*   **Food Packaging:** Emissions from the production and disposal of food packaging materials.
*   **Agrifood Systems Waste Disposal:** Emissions from waste disposal in the agrifood system.
*   **Food Processing:** Emissions from processing food products.
*   **Fertilizers Manufacturing:** Emissions from the production of fertilizers.
*   **IPPU:** Emissions from industrial processes and product use.
*   **Manure applied to Soils:** Emissions from applying animal manure to agricultural soils.
*   **Manure left on Pasture:** Emissions from animal manure on pasture or grazing land.
*   **Manure Management:** Emissions from managing and treating animal manure.
*   **Fires in organic soils:** Emissions from fires in organic soils.
*   **Fires in humid tropical forests:** Emissions from fires in humid tropical forests.
*   **On-farm energy use:** Energy consumption on farms.
*   **Rural population:** Number of people living in rural areas.
*   **Urban population:** Number of people living in urban areas.
*   **Total Population - Male:** Total number of male individuals in the population.
*   **Total Population - Female:** Total number of female individuals in the population.
*   **total_emission:** Total greenhouse gas emissions from various sources.
*   **Average Temperature °C:** The average annual increase or decrease in temperature, in degrees Celsius.

CO2 is recorded in kilotonnes (kt), where 1 kt represents 1,000 tonnes (1,000,000 kg) of CO2.
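
The unit relationship can be made explicit with a small helper. This is a minimal sketch: the function and constant names are hypothetical and are not part of the notebook.

```python
KG_PER_TONNE = 1_000
TONNES_PER_KILOTONNE = 1_000


def kilotonnes_to_kilograms(kilotonnes: float) -> float:
    """Convert a mass in kilotonnes (kt) to kilograms (kg)."""
    return kilotonnes * TONNES_PER_KILOTONNE * KG_PER_TONNE


print(kilotonnes_to_kilograms(1))
print(kilotonnes_to_kilograms(2.5))
```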

## 4. Data Cleaning <a class="anchor" id="chapter4"></a>

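In this section, the columns are renamed to snake_case, missing values are counted and replaced with zero, and the dataset is checked for duplicated rows.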

In [11]:
#renaming columns to snake_case to follow Python naming conventions
df= df.rename(columns = {'Average Temperature °C' : 'average_temperature_change', 'Total Population - Female' : 'female_population', 'Total Population - Male':'male_population', 'Urban population': 'urban_population', 'Rural population':'rural_population', 'On-farm energy use' : 'on_farm_energy_use' , 'Fires in humid tropical forests' : 'fires_in_humid_tropical_forests'
                         ,'Fires in organic soils' : 'fires_in_organic_soils', 'Manure Management' : 'manure_management', 'Manure left on Pasture': 'manure_left_on_pasture', 'Manure applied to Soils' : 'manure_applied_to_soils','Fertilizers Manufacturing' : 'fertilizers_manufacturing', 'Food Processing' : 'food_processing', 'Agrifood Systems Waste Disposal' : 'agrifood_systems_waste_disposal'
                         ,'Food Packaging' :'food_packaging', 'On-farm Electricity Use': 'on_farm_electricity_use', 'Food Retail' : 'food_retail' , 'Food Household Consumption' : 'food_household_consumption' , 'Net Forest conversion' : 'net_forest_conversion', 'Forestland': 'forestland', 'Food Transport' : 'food_transport', 'Pesticides Manufacturing' : 'pesticides_manufacturing'
                         ,'Drained organic soils (CO2)': 'drained_organic_soils', 'Rice Cultivation' : 'rice_cultivation', 'Crop Residues':'crop_residues', 'Forest fires':'forest_fires', 'Savanna fires': 'savanna_fires', 'Area' :'area', 'Year' : 'year'})

df.head()                     

|  | area | year | savanna_fires | forest_fires | crop_residues | rice_cultivation | drained_organic_soils | pesticides_manufacturing | food_transport | forestland | ... | manure_management | fires_in_organic_soils | fires_in_humid_tropical_forests | on_farm_energy_use | rural_population | urban_population | male_population | female_population | total_emission | average_temperature_change |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | 1990 | 14.7237 | 0.0557 | 205.6077 | 686.0 | 0.0 | 11.807483 | 63.1152 | -2388.803 | ... | 319.1763 | 0.0 | 0.0 | NaN | 9655167.0 | 2593947.0 | 5348387.0 | 5346409.0 | 2198.963539 | 0.536167 |
| 1 | Afghanistan | 1991 | 14.7237 | 0.0557 | 209.4971 | 678.16 | 0.0 | 11.712073 | 61.2125 | -2388.803 | ... | 342.3079 | 0.0 | 0.0 | NaN | 10230490.0 | 2763167.0 | 5372959.0 | 5372208.0 | 2323.876629 | 0.020667 |
| 2 | Afghanistan | 1992 | 14.7237 | 0.0557 | 196.5341 | 686.0 | 0.0 | 11.712073 | 53.317 | -2388.803 | ... | 349.1224 | 0.0 | 0.0 | NaN | 10995568.0 | 2985663.0 | 6028494.0 | 6028939.0 | 2356.304229 | -0.259583 |
| 3 | Afghanistan | 1993 | 14.7237 | 0.0557 | 230.8175 | 686.0 | 0.0 | 11.712073 | 54.3617 | -2388.803 | ... | 352.2947 | 0.0 | 0.0 | NaN | 11858090.0 | 3237009.0 | 7003641.0 | 7000119.0 | 2368.470529 | 0.101917 |
| 4 | Afghanistan | 1994 | 14.7237 | 0.0557 | 242.0494 | 705.6 | 0.0 | 11.712073 | 53.9874 | -2388.803 | ... | 367.6784 | 0.0 | 0.0 | NaN | 12690115.0 | 3482604.0 | 7733458.0 | 7722096.0 | 2500.768729 | 0.37225 |
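
The explicit mapping above gives full control over each name. As an alternative, the renaming could be done programmatically. The sketch below uses a hypothetical helper, `to_snake_case`, which is not part of the notebook. It does not reproduce the hand-written names exactly: for example, `Drained organic soils (CO2)` becomes `drained_organic_soils_co2` rather than `drained_organic_soils`.

```python
import re


def to_snake_case(name: str) -> str:
    """Lower-case a column name and join its alphanumeric words with underscores."""
    words = re.findall(r"[A-Za-z0-9]+", name)
    return "_".join(word.lower() for word in words)


columns = ["Savanna fires", "On-farm energy use", "Drained organic soils (CO2)"]
renamed = {column: to_snake_case(column) for column in columns}
print(renamed)
```

Because `DataFrame.rename` accepts a function for `columns`, such a helper could be passed as `df.rename(columns=to_snake_case)`. Any names that need custom wording would then be adjusted with a second, much shorter mapping.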


In [12]:
#Checking for null values per column
print("Null Values in each column:")
print(df.isnull().sum())

Null Values in each column:
area                                  0
year                                  0
savanna_fires                        31
forest_fires                         93
crop_residues                      1389
rice_cultivation                      0
drained_organic_soils                 0
pesticides_manufacturing              0
food_transport                        0
forestland                          493
net_forest_conversion               493
food_household_consumption          473
food_retail                           0
on_farm_electricity_use               0
food_packaging                        0
agrifood_systems_waste_disposal       0
food_processing                       0
fertilizers_manufacturing             0
IPPU                                743
manure_applied_to_soils             928
manure_left_on_pasture                0
manure_management                   928
fires_in_organic_soils                0
fires_in_humid_tropical_forests     155
on_farm_ener

In [13]:
#Replacing missing values with zero
cols_with_nulls = ['savanna_fires', 'forest_fires', 'crop_residues', 'forestland', 'net_forest_conversion', 'food_household_consumption', 'IPPU','manure_applied_to_soils','manure_management', 'fires_in_humid_tropical_forests', 'on_farm_energy_use']
df[cols_with_nulls] = df[cols_with_nulls].fillna(0)

The missing values occur only for particular areas, so they are treated as absent emission sources and replaced with zero.
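
Whether the gaps really are confined to particular areas can be checked by counting missing values per area before filling them. The sketch below uses a hypothetical miniature dataset (`toy`) in which area `B` never reports `forest_fires`. The values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature dataset: area "B" never reports forest_fires.
toy = pd.DataFrame({
    "area": ["A", "A", "B", "B"],
    "year": [1990, 1991, 1990, 1991],
    "forest_fires": [0.5, 0.7, np.nan, np.nan],
    "rice_cultivation": [10.0, 12.0, 3.0, 4.0],
})

emission_cols = ["forest_fires", "rice_cultivation"]

# Count missing values per area and column to see where the gaps sit.
missing_by_area = toy[emission_cols].isnull().groupby(toy["area"]).sum()
print(missing_by_area)

# Replace the gaps with zero, mirroring the cell above.
toy[emission_cols] = toy[emission_cols].fillna(0)
```

If the counts were spread evenly across areas instead, zero-filling would be harder to justify and an imputation based on neighbouring years might be considered.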

In [14]:
#Checking for duplicates
print("Number of duplicated rows: ")
print(df.duplicated().sum())

Number of duplicated rows: 
0
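
`df.duplicated()` only flags rows that are identical in every column. Assuming each area should appear once per year, a stricter check restricts the comparison to those two key columns. The sketch below demonstrates the difference on a hypothetical frame (`records`) with invented values.

```python
import pandas as pd

# Hypothetical records: area "B" appears twice for 1990 with different totals.
records = pd.DataFrame({
    "area": ["A", "A", "B", "B"],
    "year": [1990, 1991, 1990, 1990],
    "total_emission": [1.0, 2.0, 3.0, 3.5],
})

# Rows identical in every column.
full_row_duplicates = records.duplicated().sum()

# Rows that repeat an (area, year) pair, even if other values differ.
key_duplicates = records.duplicated(subset=["area", "year"]).sum()

print(f"Full-row duplicates: {full_row_duplicates}")
print(f"Repeated (area, year) pairs: {key_duplicates}")
```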


The dataset contains no duplicated rows, so no records need to be removed.

## 5. Exploratory Data Analysis <a class="anchor" id="chapter5"></a>

*Brief section introduction

*Insights/comments

## 6. Feature Engineering <a class="anchor" id="chapter6"></a>

*Brief section introduction

*Insights

## 7. Model Development <a class="anchor" id="chapter7"></a>

*Brief section introduction

*Insights/comments

## 8. Model Performance <a class="anchor" id="chapter8"></a>

*Brief section introduction

*Insights/comments

## 9. Conclusion and Recommendations <a class="anchor" id="chapter9"></a>

*Summarise Insights
*Offer recommendations such as sustainable agricultural practices