# Global CO2 Emissions

The aim of this EDA is to identify which countries are the largest contributors to CO2 emissions and how the main contributors have tackled their emissions over the last 10 years. An early prediction, based on general knowledge of CO2 emissions, is that the production powerhouses (China and the USA) will lead all countries in CO2 emissions. Because their populations are significantly larger than those of most other countries, each country's most recent CO2 emissions figure will be compared with its population, providing an insight into each country's social consciousness around CO2 emissions.
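
The population comparison described above could be sketched as follows. This is only a minimal illustration: the `population` table, its `Population` column, and every number in it are hypothetical placeholders, since the emissions CSV used in this notebook does not contain population data.

```python
import pandas as pd

# Illustrative values only: these are NOT real emissions or population figures.
emissions = pd.DataFrame({
    "Entity": ["A", "B", "C"],
    "CO2": [9.0e9, 5.0e9, 1.0e9],          # tonnes in the most recent year
})
population = pd.DataFrame({                # hypothetical table, not part of the CSV
    "Entity": ["A", "B", "C"],
    "Population": [1.5e9, 3.0e8, 5.0e7],
})

# Join on the entity name and compute tonnes of CO2 per person.
per_capita = emissions.merge(population, on="Entity", how="inner")
per_capita["CO2_per_capita"] = per_capita["CO2"] / per_capita["Population"]
per_capita = (per_capita
              .sort_values("CO2_per_capita", ascending=False)
              .reset_index(drop=True))
print(per_capita)
```

In this toy data the largest total emitter (A) has the lowest per-person figure, which is the kind of contrast a per-capita view is meant to expose.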

Graphs to generate:

1. CO2 emissions trend over the years
2. Worst year for CO2 emissions
3. Top 10 largest CO2 contributors since 1975
4. Ratio of CO2 emissions in the worst year for top 10 countries

In [None]:
# Import Dependencies
%matplotlib inline

# Data Manipulation
import numpy as np
import pandas as pd

# Data Visualisation
import matplotlib.pyplot as plt
import missingno
import seaborn as sns
# 'seaborn-whitegrid' was renamed in newer Matplotlib releases; set the style via seaborn instead
sns.set_style('whitegrid')

# Plotly Libraries
import plotly.express as px
import plotly.graph_objects as go

# Ignoring warning 
import warnings
warnings.filterwarnings('ignore')

print("Setup Complete")

In [None]:
co2_filepath = '../input/co2-ghg-emissionsdata/co2_emission.csv'
co2_df = pd.read_csv(co2_filepath)
print('Import complete')

# Data Description

The CSV file gives all CO2 emissions for each country (from 1750 to 2017).

One line will give you an emission value for a given country at a given year.
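
To make the "one line per country per year" layout concrete, here is a small sketch on made-up rows. The entity names, codes, and values are placeholders, and the column names assume the renamed `CO2` column used later in this notebook.

```python
import pandas as pd

# Toy rows mimicking the long-format layout described above (values are made up).
toy = pd.DataFrame({
    "Entity": ["X", "X", "Y", "Y"],
    "Code": ["XXX", "XXX", "YYY", "YYY"],
    "Year": [2016, 2017, 2016, 2017],
    "CO2": [10.0, 12.0, 3.0, 4.0],
})

# In a long-format table each (Entity, Year) pair should appear only once.
has_duplicates = toy.duplicated(subset=["Entity", "Year"]).any()
print("Duplicate (Entity, Year) rows:", has_duplicates)

# Pivot to a wide table: one row per year, one column per entity.
wide = toy.pivot(index="Year", columns="Entity", values="CO2")
print(wide)
```

The same duplicate check can be run on `co2_df` before grouping, as a quick guard against double counting.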

In [None]:
#Renaming columns
co2_df=co2_df.rename(columns={"Annual CO₂ emissions (tonnes )":"CO2"})
print('Column renamed')

In [None]:
df_bin = co2_df.copy()

list(df_bin.columns)

In [None]:
df_bin.isnull().sum()

There are many missing values for `Code`, which is an irrelevant column for this analysis; hence, it is dropped from the subset.
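
Before dropping a column it can help to see how much of it is missing and which entities are affected. The sketch below uses a made-up frame; the assumption that the missing codes belong to aggregate regions is illustrative and has not been checked against the real CSV.

```python
import numpy as np
import pandas as pd

# Toy data: 'Region R' stands in for an entity without a country code.
toy = pd.DataFrame({
    "Entity": ["Region R", "Region R", "Country X", "Country Y"],
    "Code": [np.nan, np.nan, "XXX", "YYY"],
    "Year": [2016, 2017, 2017, 2017],
    "CO2": [20.0, 21.0, 12.0, 4.0],
})

# Fraction of missing values per column.
missing_share = toy.isnull().mean()
print(missing_share)

# Which entities have no code at all?
entities_without_code = sorted(toy.loc[toy["Code"].isnull(), "Entity"].unique())
print(entities_without_code)
```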

In [None]:
#Dropping 'Code' from subset
df_bin.drop(['Code'], axis = 1, inplace = True)

In [None]:
df_bin.head()

# 1. Cumulative CO2 Emissions by Country

In [None]:
list(df_bin.columns)

In [None]:
df_bin.drop(['Year'], axis = 1, inplace = True)

list(df_bin.columns)

In [None]:
df_bin.head()

In [None]:
df_bin=df_bin.groupby("Entity").sum()

In [None]:
df_bin.reset_index(inplace=True)

list(df_bin.columns)

In [None]:
df_bin.head()

In [None]:
df_bin=df_bin.sort_values('CO2',ascending=False)

In [None]:
df_bin15=df_bin.head(15)
df_bin15.head()

### Barchart

In [None]:
plt.figure(figsize=(20,8))
sns.barplot(x = df_bin15["Entity"], y = df_bin15['CO2'])
plt.title("Cumulative CO2 emission per country")

### Treemap 

In [None]:
fig = px.treemap(df_bin15,path = ['Entity'],values = 'CO2')
fig.update_layout(title='Cumulative CO2 Emissions (Top 15 Entities)',title_x=0.5)
fig.show()

Excluding the global total ('World'), the USA accounts for the largest share of cumulative CO2 emissions since 1975, with the EU lagging slightly behind.

# 2. Worst Year for CO2 Emission Crisis

As 'World' is the summation of the CO2 released by every country that year, all other countries will be dropped from the subset except 'World'.

In [None]:
# .copy() gives an independent frame, so later edits do not touch co2_df
df_worst = co2_df.loc[co2_df['Entity'] == 'World'].copy()

In [None]:
# Entity and Code are not needed, so they are dropped.
df_worst = df_worst.drop(columns=['Entity', 'Code'])

In [None]:
df_worst.head()

In [None]:
df_worst=df_worst.groupby("Year").sum()
df_worst.reset_index(inplace=True)
list(df_worst.columns)


In [None]:
#Sorting values
df_worst=df_worst.sort_values('Year',ascending=False)
df_worst.head()

### Barchart

In [None]:
plt.figure(figsize=(20,8))
sns.barplot(x = df_worst["Year"], y = df_worst['CO2'])
plt.title('1975 - 2017 Global CO2 Emissions')

The x-axis labels are unreadable, but since the axis runs from 1975 to 2017 (left to right), it can easily be seen that the worst years are the most recent ones. Further analysis will explore the most recent decades.

In [None]:
# Keep 1991 onwards so the data matches the chart titles below
df_worst17 = df_worst[df_worst['Year'] >= 1991]

### Barchart

In [None]:
plt.figure(figsize=(20,8))
sns.barplot(x = df_worst17["Year"], y = df_worst17['CO2'])
plt.title("1991 - 2017 Global CO2 Emissions")
plt.ylabel('CO2 (Tonnes)')

### Piechart

In [None]:
fig = go.Figure([go.Pie(labels=df_worst17['Year'], values=df_worst17['CO2'],
                        pull=[0.2, 0.1, 0, 0],
                        hole=0.3)])  # can change the size of hole 

fig.update_traces(hoverinfo='label+percent', textinfo='percent', textfont_size=15)
fig.update_layout(title="1991-2017 Global CO2 Emission",title_x=0.5,
                 annotations=[dict(text='CO2', x=0.50, y=0.5, font_size=20, showarrow=False)])
fig.show()

Despite how the first bar chart looks, zooming into the last 17 years shows that the increase in CO2 emissions has been relatively incremental, which is expected given increasing demands on productivity and technology and a growing global population. Nevertheless, the increase in CO2 emissions since 2000 is roughly 6.2 x10^7 tonnes, which is still a dauntingly large number.
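
A figure such as "the increase since 2000" can be computed directly from the yearly totals instead of being read off the chart. The sketch below uses a made-up series, and `change_between` is a hypothetical helper, not something defined elsewhere in this notebook.

```python
import pandas as pd

# Made-up yearly totals indexed by year (NOT the real global figures).
yearly = pd.Series({2000: 100.0, 2005: 120.0, 2010: 135.0, 2017: 150.0}, name="CO2")

def change_between(series, start, end):
    """Return the absolute and percentage change between two index labels."""
    absolute = series.loc[end] - series.loc[start]
    percent = absolute / series.loc[start] * 100
    return absolute, percent

absolute, percent = change_between(yearly, 2000, 2017)
print(f"Absolute change: {absolute:.1f}, relative change: {percent:.1f}%")
```

With the notebook's data, the series could be built as `df_worst.set_index('Year')['CO2']`, assuming `df_worst` still holds one row per year.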

# 3. CO2 Released by Each Country

As the analysis aims to critique the results of recent efforts to mitigate the CO2 emissions crisis, it focuses on emissions since 2010.

In [None]:
df_ent = co2_df.loc[co2_df['Entity'] != 'World'].copy()
# `in` on a Series tests the index labels, so compare the values explicitly
print("Is 'World' in subset?: {}".format((df_ent['Entity'] == 'World').any()))
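
A note on membership checks: applied to a pandas Series, the `in` operator looks at the index labels rather than the stored values, so it can report `False` even when the value is present. A minimal sketch on a toy Series:

```python
import pandas as pd

entities = pd.Series(["World", "China", "India"])

in_index = "World" in entities            # tests the index labels 0, 1, 2
in_values = "World" in entities.values    # tests the stored values
any_match = (entities == "World").any()   # vectorised check on the values
label_hit = 1 in entities                 # 1 is an index label

print(in_index, in_values, any_match, label_hit)
# prints: False True True True
```

Either `.values` or the `==` comparison with `.any()` gives a reliable check on the data itself.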

In [None]:
df_ent.drop(['Code'], axis = 1, inplace = True)

df_ent.head()

In [None]:
df_ent2 = df_ent[(df_ent['Year']>= 2010)]

In [None]:
df_ent2.head()

In [None]:
# Sum only the CO2 column; summing 'Year' would be meaningless
df_ent2=df_ent2.groupby("Entity")[["CO2"]].sum()
df_ent2.reset_index(inplace=True)
list(df_ent2.columns)

In [None]:
df_ent2=df_ent2.sort_values('CO2',ascending=False)
df_ent2.head()

### Treemap

In [None]:
fig = px.treemap(df_ent2,path = ['Entity'],values = 'CO2')
fig.update_layout(title='CO2 Emissions by Entity, 2010 - 2017',title_x=0.5)
fig.show()

In [None]:
# Showing the top 10 countries
df_ent10 = df_ent2.head(10)

#Barchart
plt.figure(figsize=(20,8))
sns.barplot(x = df_ent10["Entity"], y = df_ent10['CO2'])
plt.title("2010 - 2017 CO2 Emissions by Country")
plt.ylabel('CO2 (Tonnes)')
plt.xlabel('Country')

As expected, two of the modern-day superpowers (China and the USA) are the largest contributors amongst all countries and continents.

## Showing top 10 CO2 emissions in 2017 per country in ratio form

In [None]:
# Global ('World') CO2 emissions in 2017, extracted as a single number
world_info = co2_df[(co2_df["Entity"] == "World") & (co2_df["Year"] == 2017)]
world_CO2 = world_info["CO2"].iloc[0]

# Select the 2017 values for all other entities
df_ent17 = df_ent[df_ent['Year'] == 2017].copy()
# Ratio column: each entity's share of global emissions, in percent
df_ent17["ratio"] = (df_ent17["CO2"] / world_CO2) * 100
# Keep the 10 largest contributors
df_ent17 = df_ent17.sort_values("ratio", ascending=False)[:10]

# Plot to show ratio of CO2 emissions amongst top 10 contributors
fig = go.Figure(go.Funnel(
    y=df_ent17["Entity"],
    x=df_ent17["ratio"] ))
fig.update_layout(title='Ratio of 2017 Emissions by Entity',xaxis_title="Ratio (%)",yaxis_title="Entity",title_x=0.5)
fig.show()

# 4. Conclusion

Whilst the USA historically accounts for the most CO2 emissions (since 1975) amongst all entities, China accounted for 27% of global CO2 emissions in 2017, with the USA contributing ~15% and India ~7%. Global CO2 emissions have risen sharply since approximately the 1990s; however, in recent years this increase has slowed with growing global awareness of CO2 emissions and new global policies to combat them (e.g. the Paris Agreement and the Kyoto Protocol). It is reasonable to predict that CO2 emissions will eventually reach a steady state, but this cannot be achieved without proactive efforts from all countries, especially China, the USA, India, and Russia.
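
The claim that growth has slowed could be checked by looking at year-on-year percentage changes. The following is a minimal sketch on made-up totals, not the real global figures.

```python
import pandas as pd

# Made-up yearly totals (NOT real data), chosen so that growth shrinks each year.
yearly = pd.Series({2013: 100.0, 2014: 110.0, 2015: 115.5, 2016: 117.81}, name="CO2")

# Year-on-year growth in percent; the first year has no predecessor and is NaN.
growth_pct = yearly.pct_change() * 100
print(growth_pct.round(2))

# True when each year's growth rate is no larger than the previous year's.
slowing = bool(growth_pct.dropna().is_monotonic_decreasing)
print("Growth is slowing:", slowing)
```

Applied to the 'World' rows of this dataset, the same calculation would show whether the recent bars really flatten or only appear to because of the chart's scale.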