# Global Ecological Footprint 2023
**Group 6**
- Jay Michael Carlos
- Seth Jovellana
- Janica Megan Reyes
- Abigail Vicencio

---
Dataset uploaded to Kaggle by user JAINA at https://www.kaggle.com/datasets/jainaru/global-ecological-footprint-2023.

Original dataset:
York University Ecological Footprint Initiative & Global Footprint Network. Public Data Package of the National Footprint and Biocapacity Accounts, 2023 edition. Produced for the Footprint Data Foundation and distributed by Global Footprint Network. Available online at: https://data.footprintnetwork.org.

# Dataset Description
This dataset contains measures of various ecological assets that a country needs to produce the natural resources that its population consumes. Six categories are tracked:
- Crop land
- Grazing land
- Fishing grounds
- Built-up (urban) land
- Forest area
- Carbon demand on land

The dataset also includes socioeconomic indicators such as per capita GDP and income group.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [2]:
# An explicit encoding is needed to read the file without a decoding error.
eco_df = pd.read_csv('global-ecological-footprint-2023.csv', encoding='unicode_escape')
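
The need for an explicit encoding can be illustrated with a small, self-contained sketch. The sample bytes, the helper name `read_csv_bytes`, and the candidate encodings are all hypothetical; the actual encoding of the Kaggle file is not stated in this notebook, so this only shows the general fallback idea.

```python
import io
import pandas as pd

# Hypothetical sample: a country name with a non-ASCII character,
# encoded as Latin-1 rather than UTF-8.
raw = "Country,Region\nCôte d'Ivoire,Africa\n".encode("latin-1")

def read_csv_bytes(data, encodings=("utf-8", "latin-1")):
    """Try each candidate encoding in order; return (DataFrame, encoding_used)."""
    for enc in encodings:
        try:
            text = data.decode(enc)
        except UnicodeDecodeError:
            continue  # this encoding cannot decode the bytes, try the next one
        return pd.read_csv(io.StringIO(text)), enc
    raise ValueError("none of the candidate encodings could decode the data")

sample_df, used = read_csv_bytes(raw)
print(used)
print(sample_df)
```

Checking that country names with accented characters come out intact is a quick way to confirm that the chosen encoding is a sensible one.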

In [3]:
eco_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 182 entries, 0 to 181
Data columns (total 24 columns):
 #   Column                                    Non-Null Count  Dtype  
---  ------                                    --------------  -----  
 0   Country                                   182 non-null    object 
 1   Region                                    182 non-null    object 
 2   SDGi                                      159 non-null    object 
 3   Life Exectancy                            176 non-null    object 
 4   HDI                                       173 non-null    object 
 5   Per Capita GDP                            165 non-null    object 
 6   Income Group                              178 non-null    object 
 7   Population (millions)                     182 non-null    object 
 8   Cropland Footprint                        152 non-null    float64
 9   Grazing Footprint                         152 non-null    float64
 10  Forest Product Footprint              

## Collection process and methodology

## Structure
Each row of the dataset is one country. Each column is a variable measured for that country, such as its average life expectancy or its population.

In [4]:
eco_df.head()

Unnamed: 0,Country,Region,SDGi,Life Exectancy,HDI,Per Capita GDP,Income Group,Population (millions),Cropland Footprint,Grazing Footprint,...,Total Ecological Footprint (Consumption),Cropland,Grazing land,Forest land,Fishing ground,Built up land.1,Total biocapacity,Ecological (Deficit) or Reserve,Number of Earths required,Number of Countries required
0,Afghanistan,Middle East/Central Asia,52.5,62,0.48,,LI,40.8,0.4,0.1,...,0.8,0.3,0.1,0.012981,0.000565,0.028232,0.513827,-0.287638,0.530696,1.559795
1,Albania,Other Europe,71.6,76,0.8,"$14,889",UM,2.9,0.8,0.2,...,2.1,0.6,0.2,0.223326,0.081392,0.073006,1.176752,-0.894486,1.371485,1.760131
2,Algeria,Africa,71.5,76,0.75,"$11,137",UM,45.4,0.7,0.2,...,2.2,0.4,0.2,0.023912,0.007179,0.037775,0.663375,-1.559593,1.471955,3.350998
3,Angola,Africa,50.9,62,0.59,"$6,304",LM,35.0,0.2,0.1,...,0.9,0.2,0.8,0.416888,0.153499,0.06136,1.588191,0.730346,0.568029,0.54014
4,Antigua and Barbuda,Central America/Caribbean,,78,0.79,"$18,749",HI,0.1,,,...,2.9,,,,,,0.917277,-2.019458,1.94458,3.201578
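
The `info()` output reports several numeric-looking columns (SDGi, Life Exectancy, HDI, Population) as `object`. A minimal sketch of one way to convert such columns is shown below, assuming the values are numeric strings. The small frame is a hypothetical stand-in that reuses a few values from the preview; it is not the notebook's actual cleaning step.

```python
import pandas as pd

# Hypothetical stand-in: values copied from the preview, stored as strings
# to mimic the object dtype reported by info().
sample = pd.DataFrame({
    "Country": ["Afghanistan", "Albania", "Antigua and Barbuda"],
    "SDGi": ["52.5", "71.6", None],
    "HDI": ["0.48", "0.8", "0.79"],
})

numeric_cols = ["SDGi", "HDI"]
# errors="coerce" turns unparseable or missing entries into NaN instead of raising.
sample[numeric_cols] = sample[numeric_cols].apply(pd.to_numeric, errors="coerce")
print(sample.dtypes)
```

Coercing silently hides malformed entries, so it is worth comparing the number of missing values before and after the conversion.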


## Column definitions
0. Country: The name of the country being observed.
1. Region: The region of the world that the country is in, like Asia-Pacific, North America, Africa, etc.
2. SDGi: The SDG Index is a measure of progress of all UN member states on the Sustainable Development Goals.
3. Life Exectancy *\(sic\)*: The number of years a person can expect to live. (Esteban Ortiz-Ospina, 2017)
4. HDI: The United Nations' Human Development Index is an approximate measure of economic and social development based on three dimensions: health, education, and standard of living. Health is measured by life expectancy at birth, education by mean and expected years of schooling, and standard of living by gross national income per capita. Each country is then given a value from 0 to 1. An HDI of 0.7 or above is considered "high human development". (United Nations Development Programme, 2023)
5. Per Capita GDP: Measured as the GDP (gross domestic product) of a nation divided by its population. It is used to measure the prosperity of a nation. (Investopedia, 2024)
6. Income Group: The World Bank Group (2023) assigns countries around the world to one of four income groups: low, lower-middle, upper-middle, and high. This is represented in the dataset as LI (low), LM (lower-middle), UM (upper-middle), and HI (high).
7. Population (millions): All people inhabiting a country, shown in millions in the dataset. (Merriam-Webster, 2024)
8. Cropland Footprint: The ecological footprint (defined below) used to produce food and fiber for human consumption, feed for livestock, and other uses. All footprints are measured in global hectares per person. (York University Ecological Footprint Initiative & Global Footprint Network, 2023)
9. Grazing Footprint: 
10. Forest Product Footprint: 
11. Carbon Footprint: 
12. Fish Footprint: 
13. Built up land: 
14. Total Ecological Footprint (Consumption): A measure of the productive land and water area that a population requires to produce all the resources it consumes and to absorb the waste that it generates. This is measured in **global hectares per person**, where a global hectare is a biologically productive hectare with world-average productivity. (York University Ecological Footprint Initiative & Global Footprint Network, 2023)
15. Cropland: The biocapacity (defined below) of land used to produce food and fiber for human consumption and feed. (York University Ecological Footprint Initiative & Global Footprint Network, 2023)
16. Grazing land: 
17. Forest land: 
18. Fishing ground: 
19. Built up land.1: 
20. Total biocapacity: The capacity of biologically productive areas to provide for human demand (footprints). It is also measured in global hectares per person. (York University Ecological Footprint Initiative & Global Footprint Network, 2023)
21. Ecological (Deficit) or Reserve: This variable measures a country's ecological footprint relative to its biocapacity. If the country's ecological footprint exceeds its biocapacity, there is an ecological deficit. If the country's biocapacity exceeds its ecological footprint, there is an ecological reserve. Like biocapacity and ecological footprint, this is measured in global hectares per person. (York University Ecological Footprint Initiative & Global Footprint Network, 2023)
22. Number of Earths required: The number of planet Earths required for everyone in the world to sustain the resources needed to live the average lifestyle of a person in the observed country. (York University Ecological Footprint Initiative & Global Footprint Network, 2023)
23. Number of Countries required: Represents how many times the observed country's total biocapacity is needed to provide for its consumption footprint. (York University Ecological Footprint Initiative & Global Footprint Network, 2023)
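
The last three definitions describe quantities derived from the footprint and biocapacity totals. The sketch below checks those relationships against three rows of the `head()` preview. The formulas are assumptions read off the definitions (reserve = biocapacity − footprint, countries required = footprint ÷ biocapacity, Earths required = footprint ÷ world biocapacity per person), and the world biocapacity figure is inferred from the rows because it does not appear in the dataset.

```python
import pandas as pd

# Values copied from the head() preview; only three countries are used here.
rows = pd.DataFrame({
    "Country": ["Afghanistan", "Albania", "Angola"],
    "Total biocapacity": [0.513827, 1.176752, 1.588191],
    "Ecological (Deficit) or Reserve": [-0.287638, -0.894486, 0.730346],
    "Number of Earths required": [0.530696, 1.371485, 0.568029],
    "Number of Countries required": [1.559795, 1.760131, 0.540140],
})

# Assumed relationship: reserve = biocapacity - footprint, so the unrounded
# footprint can be recovered from the two published columns.
footprint = rows["Total biocapacity"] - rows["Ecological (Deficit) or Reserve"]

# Assumed relationship: countries required = footprint / national biocapacity.
countries_required = footprint / rows["Total biocapacity"]

# Assumed relationship: Earths required = footprint / world biocapacity per person.
# The world figure is not in the dataset, so it is inferred from each row.
implied_world_biocapacity = footprint / rows["Number of Earths required"]

check = rows[["Country"]].assign(
    footprint=footprint.round(3),
    countries_required=countries_required.round(3),
    implied_world_biocapacity=implied_world_biocapacity.round(3),
)
print(check)
```

For these three rows the recomputed "countries required" values match the published column closely, and the implied world biocapacity comes out at roughly 1.51 global hectares per person in each case, which is consistent with the assumed formulas.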

# Data Cleaning

One of the variables to be used is Per Capita GDP. It is stored as text with a dollar sign and comma thousands separators for readability. To visualize it properly, it must be converted to a plain float value without the dollar sign and commas. Additionally, the 17 rows with missing values must be dropped from the table.

In [35]:
nan_per_capita_gdp = eco_df['Per Capita GDP'].isna().sum()
print('There are', nan_per_capita_gdp, 'countries with no recorded per capita GDP.')

There are 17 countries with no recorded per capita GDP.


In [38]:
# Treat blank strings as missing, then keep only rows with a recorded per capita GDP.
no_nan_pcgdp_eco_df = eco_df.replace(' ', np.nan)
no_nan_pcgdp_eco_df = no_nan_pcgdp_eco_df[no_nan_pcgdp_eco_df['Per Capita GDP'].notna()]
no_nan_pcgdp_eco_df.head(50)

Unnamed: 0,Country,Region,SDGi,Life Exectancy,HDI,Per Capita GDP,Income Group,Population (millions),Cropland Footprint,Grazing Footprint,...,Total Ecological Footprint (Consumption),Cropland,Grazing land,Forest land,Fishing ground,Built up land.1,Total biocapacity,Ecological (Deficit) or Reserve,Number of Earths required,Number of Countries required
1,Albania,Other Europe,71.6,76.0,0.8,"$14,889",UM,2.9,0.8,0.2,...,2.1,0.6,0.2,0.223326,0.081392,0.073006,1.176752,-0.894486,1.371485,1.760131
2,Algeria,Africa,71.5,76.0,0.75,"$11,137",UM,45.4,0.7,0.2,...,2.2,0.4,0.2,0.023912,0.007179,0.037775,0.663375,-1.559593,1.471955,3.350998
3,Angola,Africa,50.9,62.0,0.59,"$6,304",LM,35.0,0.2,0.1,...,0.9,0.2,0.8,0.416888,0.153499,0.06136,1.588191,0.730346,0.568029,0.54014
4,Antigua and Barbuda,Central America/Caribbean,,78.0,0.79,"$18,749",HI,0.1,,,...,2.9,,,,,,0.917277,-2.019458,1.94458,3.201578
5,Argentina,South America,72.8,75.0,0.84,"$22,117",UM,46.0,0.9,0.5,...,3.2,1.8,1.2,0.591673,1.527615,0.083517,5.231663,2.011045,2.132556,0.615601
6,Armenia,Middle East/Central Asia,71.1,72.0,0.76,"$13,548",LM,3.0,0.7,0.2,...,2.3,0.4,0.3,0.0982,0.016853,0.052182,0.846625,-1.47977,1.540439,2.747847
7,Australia,Asia-Pacific,75.6,83.0,0.95,"$53,053",HI,26.1,0.1,0.5,...,5.8,1.8,4.5,1.861992,2.827503,0.024587,11.021401,5.244362,3.825307,0.524166
8,Austria,EU-27,82.3,81.0,0.92,"$55,460",HI,9.1,1.0,0.3,...,5.6,0.6,0.1,1.95226,0.005609,0.194413,2.893775,-2.732866,3.725721,1.944395
9,Azerbaijan,Middle East/Central Asia,73.5,69.0,0.75,"$14,692",UM,10.3,0.8,0.2,...,2.4,0.6,0.2,0.103819,0.014092,0.045343,0.936955,-1.420134,1.560763,2.515692


In [37]:
# Strip the dollar sign, commas, and surrounding whitespace, then convert to float.
cleaned_pcgdp = no_nan_pcgdp_eco_df['Per Capita GDP'].str.replace('$', '', regex=False).str.replace(',', '', regex=False)
cleaned_pcgdp = cleaned_pcgdp.str.strip().astype(float)
cleaned_pcgdp.head()

1    14889.0
2    11137.0
3     6304.0
4    18749.0
5    22117.0
Name: Per Capita GDP, dtype: float64
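
The conversion described in this section can also be packaged as a small reusable helper. The following is a minimal sketch, not the notebook's own method: the function name `parse_currency` is hypothetical, the sample values are taken from the preview, and the stray whitespace in the raw strings is an assumption.

```python
import pandas as pd

def parse_currency(series):
    """Convert strings such as '$14,889 ' to floats; missing values become NaN."""
    cleaned = (
        series.str.replace("$", "", regex=False)   # literal dollar sign, not a regex anchor
              .str.replace(",", "", regex=False)   # thousands separators
              .str.strip()                          # stray leading/trailing whitespace
    )
    return pd.to_numeric(cleaned, errors="coerce")

# Hypothetical sample: one missing value plus three values from the preview.
sample = pd.Series([None, "$14,889 ", "$11,137 ", "$6,304 "], name="Per Capita GDP")

gdp = parse_currency(sample)
gdp_no_missing = gdp.dropna()  # drop countries with no recorded per capita GDP
print(gdp_no_missing)
```

Passing `regex=False` matters because `$` is an end-of-string anchor in regular expressions, so relying on the default behaviour can differ between pandas versions.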

# Exploratory Data Analysis

# Research Questions

# References
@JAINA. (2024). *Global Ecological Footprint 2023🌐\[latest report\]*. Kaggle. https://www.kaggle.com/datasets/jainaru/global-ecological-footprint-2023.

Esteban Ortiz-Ospina. (2017). *"Life Expectancy" – What does this actually mean?*. Published online at OurWorldInData.org. https://ourworldindata.org/life-expectancy-how-is-it-calculated-and-how-should-it-be-interpreted.

Investopedia. (2024). *GDP Per Capita: Definition, Uses, and Highest Per Country*. https://www.investopedia.com/terms/p/per-capita-gdp.asp.

Merriam-Webster. (2024). *Population definition*. https://www.merriam-webster.com/dictionary/population

UN Sustainable Development Solutions Network. (2022). *Sustainable Development Goals Index*. https://www.sdgindex.org/.

United Nations Development Programme. (2023). *Human Development Report*. http://hdr.undp.org/en/data.

International Monetary Fund. (2023). *World Economic Outlook*. https://www.imf.org/en/Publications/WEO.

Hamadeh, N., Van Rompaey, C., & Metreau, E. (2023). *World Bank Group country classifications by income level for FY24 (July 1, 2023 - June 30, 2024)*. https://blogs.worldbank.org/en/opendata/new-world-bank-group-country-classifications-income-level-fy24.

York University Ecological Footprint Initiative & Global Footprint Network. (2023). *Public Data Package of the National Footprint and Biocapacity Accounts, 2023 edition*. Produced for the Footprint Data Foundation and distributed by Global Footprint Network. https://data.footprintnetwork.org.