<a href="https://colab.research.google.com/github/abdulSalamKagaji97/world_development_explorer/blob/main/Part_B/wdx_analysis_Part_B_Draft.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **How do population density, fossil fuel consumption, and renewable energy consumption affect greenhouse gas emissions?**

*Abdul Salam Kagaji*

*Data Science Master's Student, UMBC*

*April 1, 2022*

Global population growth leads to overpopulation, which makes the Earth's natural resources scarce. Population growth is influenced by multiple factors, such as improved health care and better, safer urban infrastructure. Overpopulation makes life unsustainable in densely populated areas.
Advances in science and health care have lengthened human life. Despite these benefits, they have also contributed to overpopulation, which in turn increases the consumption of natural resources for energy generation. Modern innovations, from automobiles to semiconductor devices, demand energy in the form of electricity or fuel.

Fossil fuels are the most prominent natural resources for energy production. Growing populations need more energy for everyday activities, and this demands greater consumption of fossil fuels such as coal and crude oil. Despite the production of hydroelectricity, the majority of energy is still generated from fossil fuels.
Continued use of energy and electronic devices has increased carbon emissions, strengthening the greenhouse effect on Earth. Greenhouse gases cause global warming, which disrupts environmental conditions and could lead to devastating consequences.

Let's analyze the effects of overpopulation, fossil fuel consumption, and renewable energy consumption on greenhouse gas emissions using data from 2010 to 2021.

## Approach

**a.	Source of data & graphs:** THE WORLD DEVELOPMENT EXPLORER (https://www.worlddev.xyz/) 

**b.	Regions Compared:** East Asia & Pacific, Europe & Central Asia, Latin America & Caribbean, Middle East & North Africa, North America, South Asia, Sub-Saharan Africa.

**c.	Timeline:** 2010 to 2021

**d.	Topics and Indicators:**

-	**Urban Development** : Population density (people per sq. km of land area - EN.POP.DNST) 
- **Energy & Mining** : Renewable energy consumption (% of total final energy consumption - EG.FEC.RNEW.ZS)
- **Energy & Mining** : Fossil fuel energy consumption (% of total - EG.USE.COMM.FO.ZS)
- **Climate Change** : Total greenhouse gas emissions (kt of CO2 equivalent - EN.ATM.GHGT.KT.CE)
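
The indicator codes above appear verbatim in the `indicator` column of the dataset, so a small lookup table can make later tables and chart labels easier to read. The sketch below is an illustration only: the names `INDICATOR_LABELS` and `label_for` are hypothetical and are not part of the original notebook.

```python
# Hypothetical lookup table: World Bank indicator code -> readable label.
# Codes and descriptions are copied from the list above.
INDICATOR_LABELS = {
    "EN.POP.DNST": "Population density (people per sq. km of land area)",
    "EG.FEC.RNEW.ZS": "Renewable energy consumption (% of total final energy consumption)",
    "EG.USE.COMM.FO.ZS": "Fossil fuel energy consumption (% of total)",
    "EN.ATM.GHGT.KT.CE": "Total greenhouse gas emissions (kt of CO2 equivalent)",
}


def label_for(code: str) -> str:
    """Return the readable label for an indicator code, or the code itself if unknown."""
    return INDICATOR_LABELS.get(code, code)


print(label_for("EN.POP.DNST"))
```

Falling back to the raw code for unknown keys keeps the helper safe to use if more indicators are added later.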

## Visual Analysis:

In [1]:
import pandas as pd
import plotly.express as px

In [2]:
URL = "https://raw.githubusercontent.com/abdulSalamKagaji97/world_development_explorer/main/Part_B/wdi_data_part_b.csv"

In [5]:
df = pd.read_csv(URL)
df.sample(10)

Unnamed: 0.1,Unnamed: 0,Year,value,indicator,Country Code,Country Name,Region,Income Group,Lending Type
5971,5971,2019,14.647442,EN.POP.DNST,NOR,Norway,Europe & Central Asia,High income,Not classified
734,734,2015,89510.0,EN.ATM.GHGT.KT.CE,ISR,Israel,Middle East & North Africa,High income,Not classified
5905,5905,2019,18.91041,EN.POP.DNST,NZL,New Zealand,East Asia & Pacific,High income,Not classified
6491,6491,2012,142.688889,EN.POP.DNST,TON,Tonga,East Asia & Pacific,Upper middle income,IDA
6459,6459,2013,133.383936,EN.POP.DNST,THA,Thailand,East Asia & Pacific,Upper middle income,IBRD
4937,4937,2016,143.20025,EN.POP.DNST,DNK,Denmark,Europe & Central Asia,High income,Not classified
6544,6544,2010,34.376842,EN.POP.DNST,TCA,Turks and Caicos Islands,Latin America & Caribbean,High income,Not classified
827,827,2018,112970.0,EN.ATM.GHGT.KT.CE,KWT,Kuwait,Middle East & North Africa,High income,Not classified
6216,6216,2011,90.928761,EN.POP.DNST,SLE,Sierra Leone,Sub-Saharan Africa,Low income,IDA
4857,4857,2013,92.873306,EN.POP.DNST,CRI,Costa Rica,Latin America & Caribbean,Upper middle income,IBRD


In [6]:
df.indicator.unique()

array(['EN.ATM.GHGT.KT.CE', 'EG.USE.COMM.FO.ZS', 'EG.FEC.RNEW.ZS',
       'EN.POP.DNST'], dtype=object)

### Population distribution among regions:

In [7]:
df_population_density = df.query("indicator == 'EN.POP.DNST'")
df_population_density.sample(5)

Unnamed: 0.1,Unnamed: 0,Year,value,indicator,Country Code,Country Name,Region,Income Group,Lending Type
5352,5352,2013,46.957172,EN.POP.DNST,IRN,"Iran, Islamic Rep.",Middle East & North Africa,Lower middle income,IBRD
6565,6565,2020,393.066667,EN.POP.DNST,TUV,Tuvalu,East Asia & Pacific,Upper middle income,IDA
5296,5296,2012,6809.619048,EN.POP.DNST,HKG,"Hong Kong SAR, China",East Asia & Pacific,High income,Not classified
5823,5823,2014,33.426832,EN.POP.DNST,MOZ,Mozambique,Sub-Saharan Africa,Low income,IDA
4397,4397,2015,165.942553,EN.POP.DNST,AND,Andorra,Europe & Central Asia,High income,Not classified


In [12]:
df_population_region_group = df_population_density.groupby("Region").value.sum()
df_population_region_group = df_population_region_group.reset_index()
df_population_region_group

Unnamed: 0,Region,value
0,East Asia & Pacific,431804.961155
1,Europe & Central Asia,319456.010751
2,Latin America & Caribbean,92435.011488
3,Middle East & North Africa,68586.107779
4,North America,13577.721557
5,South Asia,44202.455145
6,Sub-Saharan Africa,53924.04833


In [17]:
df_pop_sorted = df_population_region_group.sort_values(by="value",ascending=False)
fig = px.bar(
    data_frame=df_pop_sorted,
    x = "Region", 
    y = "value",
    color = "Region",
    height = 600,
    labels={'value':'population density (group sum)'},
    template = "plotly_white"
    )
fig.update_layout(showlegend = True)
fig.show()

- The bar chart shows that East Asia & Pacific has the highest summed population density, followed by Europe & Central Asia; both are far above regions such as North America and Middle East & North Africa.
- North America has the lowest summed population density, followed by South Asia and Sub-Saharan Africa.
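
Because `value` is summed over every country and year, a region with many countries can rank high even when its typical density is modest. One way to check this, sketched below on made-up numbers (the regions `A` and `B` and all values are hypothetical, not taken from the dataset), is to compare the group sum with the group mean.

```python
import pandas as pd

# Hypothetical toy data: region A has three sparse countries, region B one dense country.
toy = pd.DataFrame(
    {
        "Region": ["A", "A", "A", "B"],
        "value": [50.0, 60.0, 70.0, 150.0],
    }
)

# Aggregate the same column two ways to see how the ranking depends on the statistic.
summary = toy.groupby("Region")["value"].agg(["sum", "mean"])
print(summary)
```

In this toy case region A leads on the sum while region B leads on the mean, so the choice of aggregate is worth stating alongside any ranking.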

### Population Change between 2010 and 2021:

In [47]:
df_population_region_group_trend = df_population_density.groupby(["Year","Region"]).value.sum()
df_population_region_group_trend = df_population_region_group_trend.reset_index()
df_population_region_group_trend
fig = px.line(
    df_population_region_group_trend, 
    x="Year", 
    labels={"value":"Population Density (group sum)"},
    y="value",
    color="Region",
    height=600,
    title="Population density change"
)

fig.update_layout(showlegend=True)

fig.show()

- The time series graph shows a gradual increase in population density in every region over time; East Asia & Pacific shows the largest change.
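
To put a number on the change visible in the line chart, the yearly sums can be pivoted into one column per region and the first year subtracted from the last. The sketch below uses made-up values (the regions `A` and `B` and the numbers are hypothetical); with the real data, `df_population_region_group_trend` from the cell above would take the place of `toy`.

```python
import pandas as pd

# Hypothetical yearly sums shaped like df_population_region_group_trend.
toy = pd.DataFrame(
    {
        "Year": [2010, 2010, 2021, 2021],
        "Region": ["A", "B", "A", "B"],
        "value": [100.0, 200.0, 130.0, 210.0],
    }
)

# One row per year, one column per region.
wide = toy.pivot(index="Year", columns="Region", values="value").sort_index()

# Absolute change between the first and last year for each region.
change = wide.iloc[-1] - wide.iloc[0]
print(change.sort_values(ascending=False))
```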

### Fossil Fuel Consumption by region:

In [48]:
df_fossil_fuel_consumption = df.query("indicator == 'EG.USE.COMM.FO.ZS'")
df_fossil_fuel_consumption.sample(5)

Unnamed: 0.1,Unnamed: 0,Year,value,indicator,Country Code,Country Name,Region,Income Group,Lending Type
1797,1797,2013,68.826033,EG.USE.COMM.FO.ZS,BWA,Botswana,Sub-Saharan Africa,Upper middle income,IBRD
2291,2291,2012,66.832868,EG.USE.COMM.FO.ZS,SVK,Slovak Republic,Europe & Central Asia,High income,Not classified
2015,2015,2013,98.958888,EG.USE.COMM.FO.ZS,IRN,"Iran, Islamic Rep.",Middle East & North Africa,Lower middle income,IBRD
2281,2281,2012,87.089408,EG.USE.COMM.FO.ZS,SRB,Serbia,Europe & Central Asia,Upper middle income,IBRD
2382,2382,2011,79.554036,EG.USE.COMM.FO.ZS,UKR,Ukraine,Europe & Central Asia,Lower middle income,IBRD


In [55]:
df_fossil_fuel_consumption = df_fossil_fuel_consumption.groupby(["Region"]).value.sum()
df_fossil_fuel_consumption = df_fossil_fuel_consumption.reset_index()
df_fossil_fuel_consumption

Unnamed: 0,Region,value
0,East Asia & Pacific,6620.593711
1,Europe & Central Asia,19345.555666
2,Latin America & Caribbean,7896.366823
3,Middle East & North Africa,8107.266415
4,North America,944.73127
5,South Asia,1328.998062
6,Sub-Saharan Africa,4254.209111


In [56]:
fig = px.pie(df_fossil_fuel_consumption, values='value', names='Region', title = "Fossil fuel consumption by region")
fig.show()

- The pie chart above shows that Europe & Central Asia accounts for the largest share of the summed fossil fuel consumption values (39.9%), followed by Middle East & North Africa.
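
The percentages in the pie chart are each region's share of the summed values. As a quick check, the sketch below recomputes them from the table printed above; the variable names `fossil_sums` and `shares` are illustrative.

```python
import pandas as pd

# Regional sums of EG.USE.COMM.FO.ZS, copied from the table above.
fossil_sums = pd.Series(
    {
        "East Asia & Pacific": 6620.593711,
        "Europe & Central Asia": 19345.555666,
        "Latin America & Caribbean": 7896.366823,
        "Middle East & North Africa": 8107.266415,
        "North America": 944.73127,
        "South Asia": 1328.998062,
        "Sub-Saharan Africa": 4254.209111,
    }
)

# Share of the total in percent, largest first.
shares = (fossil_sums / fossil_sums.sum() * 100).round(1).sort_values(ascending=False)
print(shares)
```

The same two lines apply unchanged to the renewable energy and greenhouse gas tables.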

### Renewable Energy Consumption by region:

In [57]:
df_renewable_energy_consumption = df.query("indicator == 'EG.FEC.RNEW.ZS'")
df_renewable_energy_consumption.sample(5)

Unnamed: 0.1,Unnamed: 0,Year,value,indicator,Country Code,Country Name,Region,Income Group,Lending Type
4343,4343,2014,80.775002,EG.FEC.RNEW.ZS,ZWE,Zimbabwe,Sub-Saharan Africa,Lower middle income,Blend
3924,3924,2018,21.144699,EG.FEC.RNEW.ZS,SRB,Serbia,Europe & Central Asia,Upper middle income,IBRD
3962,3962,2011,10.3571,EG.FEC.RNEW.ZS,SVK,Slovak Republic,Europe & Central Asia,High income,Not classified
3123,3123,2018,11.4736,EG.FEC.RNEW.ZS,GRL,Greenland,Europe & Central Asia,High income,Not classified
2446,2446,2016,39.587299,EG.FEC.RNEW.ZS,ALB,Albania,Europe & Central Asia,Upper middle income,IBRD


In [58]:
df_renewable_energy_consumption = df_renewable_energy_consumption.groupby(["Region"]).value.sum()
df_renewable_energy_consumption = df_renewable_energy_consumption.reset_index()
df_renewable_energy_consumption

Unnamed: 0,Region,value
0,East Asia & Pacific,6426.95789
1,Europe & Central Asia,10159.76558
2,Latin America & Caribbean,7438.931716
3,Middle East & North Africa,886.370204
4,North America,285.972101
5,South Asia,3268.956908
6,Sub-Saharan Africa,27506.435389


In [59]:
fig = px.pie(df_renewable_energy_consumption, values='value', names='Region', title = "Renewable Energy consumption by region")
fig.show()

- In contrast to the fossil fuel chart, Sub-Saharan Africa has the highest renewable energy consumption of all regions, with 49.1% of the summed values, and North America has the lowest.

### Greenhouse Gas Emission by region:

In [60]:
df_greenhouse_gas_emission = df.query("indicator == 'EN.ATM.GHGT.KT.CE'")
df_greenhouse_gas_emission.sample(5)

Unnamed: 0.1,Unnamed: 0,Year,value,indicator,Country Code,Country Name,Region,Income Group,Lending Type
556,556,2017,7530.0,EN.ATM.GHGT.KT.CE,GAB,Gabon,Sub-Saharan Africa,Upper middle income,IBRD
566,566,2018,2790.0,EN.ATM.GHGT.KT.CE,GMB,"Gambia, The",Sub-Saharan Africa,Low income,IDA
1366,1366,2017,40480.0,EN.ATM.GHGT.KT.CE,SVK,Slovak Republic,Europe & Central Asia,High income,Not classified
598,598,2014,89910.0,EN.ATM.GHGT.KT.CE,GRC,Greece,Europe & Central Asia,High income,Not classified
1715,1715,2015,31280.0,EN.ATM.GHGT.KT.CE,ZWE,Zimbabwe,Sub-Saharan Africa,Lower middle income,Blend


In [61]:
df_greenhouse_gas_emission = df_greenhouse_gas_emission.groupby(["Region"]).value.sum()
df_greenhouse_gas_emission = df_greenhouse_gas_emission.reset_index()
df_greenhouse_gas_emission

Unnamed: 0,Region,value
0,East Asia & Pacific,149743170.0
1,Europe & Central Asia,77151910.0
2,Latin America & Caribbean,28464090.0
3,Middle East & North Africa,27901280.0
4,North America,60954130.0
5,South Asia,33008140.0
6,Sub-Saharan Africa,19835920.0


In [62]:
fig = px.pie(df_greenhouse_gas_emission, values='value', names='Region', title = "Greenhouse gas emission by region")
fig.show()

- East Asia & Pacific makes a surprisingly high contribution to greenhouse gas emissions, accounting for 37.7% of the total.

### Relation between population density, fossil fuel consumption and greenhouse gas emission:

In [121]:

# Sum each indicator per region, leaving out renewable energy for this comparison.
df_relations = df.groupby(["indicator", "Region"]).value.sum()
df_relations = df_relations.reset_index().query("indicator != 'EG.FEC.RNEW.ZS'")

# Build one row per region with one column per remaining indicator.
data = []
columns = df_relations.indicator.unique()
for region in df.Region.unique():
    row = [region]
    for indicator in columns:
        dff = df_relations[(df_relations.Region == region) & (df_relations.indicator == indicator)]
        # Use NaN when a region has no value so the columns stay aligned.
        row.append(dff.value.iloc[0] if not dff.empty else float("nan"))
    data.append(row)
df_population_fossil_fuel_greenhouse_gas_relation = pd.DataFrame(data, columns=["Region", *columns])
df_population_fossil_fuel_greenhouse_gas_relation

Unnamed: 0,Region,EG.USE.COMM.FO.ZS,EN.ATM.GHGT.KT.CE,EN.POP.DNST
0,South Asia,1328.998062,33008140.0,44202.455145
1,Europe & Central Asia,19345.555666,77151910.0,319456.010751
2,Middle East & North Africa,8107.266415,27901280.0,68586.107779
3,Sub-Saharan Africa,4254.209111,19835920.0,53924.04833
4,Latin America & Caribbean,7896.366823,28464090.0,92435.011488
5,East Asia & Pacific,6620.593711,149743170.0,431804.961155
6,North America,944.73127,60954130.0,13577.721557
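
The nested loop above reshapes the long table into one row per region. pandas' `pivot` performs the same kind of reshaping in one call; the sketch below shows it on a small made-up table (all values are hypothetical, and `long_df` and `wide_df` are illustrative names). Note that `pivot` sorts regions and indicator columns alphabetically, so the row order differs from the loop's.

```python
import pandas as pd

# Hypothetical long-format sums shaped like df_relations.
long_df = pd.DataFrame(
    {
        "indicator": ["EN.POP.DNST", "EN.POP.DNST", "EN.ATM.GHGT.KT.CE", "EN.ATM.GHGT.KT.CE"],
        "Region": ["South Asia", "North America", "South Asia", "North America"],
        "value": [4.0, 1.0, 30.0, 60.0],
    }
)

# One row per region, one column per indicator.
wide_df = long_df.pivot(index="Region", columns="indicator", values="value").reset_index()
wide_df.columns.name = None  # drop the leftover "indicator" axis label
print(wide_df)
```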


In [122]:
df_population_fossil_fuel_greenhouse_gas_relation.columns = ["Region", "fossil_fuel_consumption", "greenhouse_gas_emission", "population_density"]
df_population_fossil_fuel_greenhouse_gas_relation

Unnamed: 0,Region,fossil_fuel_consumption,greenhouse_gas_emission,population_density
0,South Asia,1328.998062,33008140.0,44202.455145
1,Europe & Central Asia,19345.555666,77151910.0,319456.010751
2,Middle East & North Africa,8107.266415,27901280.0,68586.107779
3,Sub-Saharan Africa,4254.209111,19835920.0,53924.04833
4,Latin America & Caribbean,7896.366823,28464090.0,92435.011488
5,East Asia & Pacific,6620.593711,149743170.0,431804.961155
6,North America,944.73127,60954130.0,13577.721557


In [124]:
# Bubble chart: x = population density, y = fossil fuel consumption, bubble size = emissions.
fig = px.scatter(df_population_fossil_fuel_greenhouse_gas_relation,
                 x="population_density",
                 y="fossil_fuel_consumption",
                 color="Region",
                 size="greenhouse_gas_emission",
                 labels={"population_density":"Population Density (group sum)","fossil_fuel_consumption":"Fossil Fuel Energy Consumption"},
                 height=600,
                 title="Relation between population density, fossil fuel consumption and greenhouse gas emission")
fig.show()

- The graph suggests that greenhouse gas emissions do not follow population density alone. North America has the lowest summed population density yet the third-highest greenhouse gas emissions, while East Asia & Pacific has both the highest summed population density and the highest emissions. Europe & Central Asia has the highest summed fossil fuel consumption and the second-highest emissions.
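
The bubble chart encodes three quantities at once, so exact rankings are hard to read off it. The sketch below rebuilds the table printed above (values copied from it; the short column names `fossil_fuel`, `ghg` and `pop_density` are illustrative) and ranks the regions on each measure, with 1 as the highest.

```python
import pandas as pd

# Regional sums copied from the table above.
relation = pd.DataFrame(
    {
        "Region": [
            "South Asia",
            "Europe & Central Asia",
            "Middle East & North Africa",
            "Sub-Saharan Africa",
            "Latin America & Caribbean",
            "East Asia & Pacific",
            "North America",
        ],
        "fossil_fuel": [1328.998062, 19345.555666, 8107.266415, 4254.209111,
                        7896.366823, 6620.593711, 944.73127],
        "ghg": [33008140.0, 77151910.0, 27901280.0, 19835920.0,
                28464090.0, 149743170.0, 60954130.0],
        "pop_density": [44202.455145, 319456.010751, 68586.107779, 53924.04833,
                        92435.011488, 431804.961155, 13577.721557],
    }
).set_index("Region")

# Rank every column so that 1 marks the largest value; there are no ties in this table.
ranks = relation.rank(ascending=False).astype(int)
print(ranks.sort_values("ghg"))
```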

### Relation between population density, renewable energy consumption and greenhouse gas emission