<a href="https://colab.research.google.com/github/SandyCOG/Forecasting-Carbon-Emissions-across-Continents/blob/main/Forecasting_carbon_emissions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This project centers on analyzing and predicting carbon emissions, utilizing global CO2 and GHG datasets. The data encompasses emissions by country, sector, per GDP, and per capita, offering a comprehensive perspective for enhanced insights.


Introduction: Provide a brief overview of the purpose of the analysis and what the report aims to communicate. Ensure that the language is simple, jargon-free, and engaging.
Key Findings: Highlight the main insights and discoveries from your data analysis in a straightforward and digestible manner. Utilize layman's terms to explain any analytical concepts or findings.
Data Visualizations: Include clear and intuitive visualizations that complement your findings. Ensure that charts, graphs, and tables are easily interpreted and include concise explanations.
In-depth Analysis: Detail your analysis of each question, ensuring that the explanations of your findings are clear, compelling, and non-technical. Use analogies or simple comparisons where possible to elucidate your points.
Conclusion: Summarize the key takeaways and potential implications of your findings. Provide any recommendations or next steps straightforwardly.
Appendix (if necessary): Include any additional information, charts, or data that support your analysis but may not be critical for the main body of the report.


In [42]:
#import libaries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


In [43]:
#load datasets
fossil_c0_df1 = pd.read_csv("https://raw.githubusercontent.com/SandyCOG/Forecasting-Carbon-Emissions-across-Continents/main/fossil_CO2_by_sector_country_su.csv", sep=';') #fossil_CO2_by_sector_country_su.csv
fossil_c0_df2 = pd.read_csv("https://raw.githubusercontent.com/SandyCOG/Forecasting-Carbon-Emissions-across-Continents/main/fossil_CO2_per_GDP_by_country.csv", sep=';')  #fossil_CO2_per_GDP_by_country.csv
fossil_c0_df3 = pd.read_csv("https://raw.githubusercontent.com/SandyCOG/Forecasting-Carbon-Emissions-across-Continents/main/fossil_CO2_per_GDP_by_country.csv", sep=';')  #fossil_CO2_per_GDP_by_country.csv
fossil_c0_df4 = pd.read_csv("https://raw.githubusercontent.com/SandyCOG/Forecasting-Carbon-Emissions-across-Continents/main/fossil_CO2_totals_by_country.csv", sep=';')   #fossil_CO2_totals_by_country.csv
fossil_ghg_df1 = pd.read_csv("https://raw.githubusercontent.com/SandyCOG/Forecasting-Carbon-Emissions-across-Continents/main/GHG_by_sector_and_country.csv", sep=';')    #GHG_by_sector_and_country.csv
fossil_ghg_df2 = pd.read_csv("https://raw.githubusercontent.com/SandyCOG/Forecasting-Carbon-Emissions-across-Continents/main/GHG_per_GDP_by_country.csv", sep=';')      #GHG_per_GDP_by_country.csv
fossil_ghg_df3 = pd.read_csv("https://raw.githubusercontent.com/SandyCOG/Forecasting-Carbon-Emissions-across-Continents/main/GHG_per_capita_by_country.csv", sep=';')   #GHG_per_capita_by_country.csv
fossil_ghg_df4 = pd.read_csv("https://raw.githubusercontent.com/SandyCOG/Forecasting-Carbon-Emissions-across-Continents/main/GHG_totals_by_country.csv", sep=';')       #GHG_totals_by_country.csv


In [50]:
dataframe_list = [fossil_c0_df1, fossil_c0_df2, fossil_c0_df3, fossil_c0_df4, fossil_ghg_df1, fossil_ghg_df2, fossil_ghg_df3, fossil_ghg_df4]

for df, name in zip(dataframe_list, ['fossil_c0_df1', 'fossil_c0_df2', 'fossil_c0_df3', 'fossil_c0_df4', 'fossil_ghg_df1', 'fossil_ghg_df2', 'fossil_ghg_df3', 'fossil_ghg_df4']):
  print(f'{name}: {df.shape}')

fossil_c0_df1: (1467, 57)
fossil_c0_df2: (212, 36)
fossil_c0_df3: (212, 36)
fossil_c0_df4: (214, 56)
fossil_ghg_df1: (4830, 57)
fossil_ghg_df2: (212, 35)
fossil_ghg_df3: (212, 55)
fossil_ghg_df4: (214, 55)


In [31]:
fossil_c0_df1.isna().sum()

Substance              2
Sector                 2
EDGAR Country Code     2
Country                2
1970                  50
1971                  50
1972                  49
1973                  45
1974                  47
1975                  45
1976                  40
1977                  39
1978                  36
1979                  36
1980                  47
1981                  48
1982                  49
1983                  46
1984                  47
1985                  48
1986                  47
1987                  48
1988                  48
1989                  49
1990                  47
1991                  50
1992                  48
1993                  51
1994                  50
1995                  48
1996                  45
1997                  45
1998                  48
1999                  50
2000                  50
2001                  48
2002                  51
2003                  46
2004                  31
2005                  30


**Removing column separator from columns**

The columns in the dataset make use of column separator for decimals, converting this to point.

In [55]:
#converting columns from object to numeric

def convert_to_numeric(df):
    # Replace commas with periods and convert to numeric
    df.iloc[:, 4:] = df.iloc[:, 4:].replace(',', '.', regex=True).apply(pd.to_numeric)
    return df

list_of_dataframes = [fossil_c0_df1, fossil_ghg_df1]

# Apply the conversion function to each dataframe in the list
list_of_dataframes = [convert_to_numeric(df) for df in list_of_dataframes]


In [56]:
#converting columns from object to numeric

def convert_to_numeric(df):
    # Replace commas with periods and convert to numeric
    df.iloc[:, 3:] = df.iloc[:, 3:].replace(',', '.', regex=True).apply(pd.to_numeric)
    return df

list_of_dataframes = [fossil_c0_df2, fossil_c0_df3, fossil_c0_df4]

# Apply the conversion function to each dataframe in the list
list_of_dataframes = [convert_to_numeric(df) for df in list_of_dataframes]

In [57]:
#converting columns from object to numeric

def convert_to_numeric(df):
    # Replace commas with periods and convert to numeric
    df.iloc[:, 2:] = df.iloc[:, 2:].replace(',', '.', regex=True).apply(pd.to_numeric)
    return df

list_of_dataframes = [fossil_ghg_df2, fossil_ghg_df3, fossil_ghg_df4]

# Apply the conversion function to each dataframe in the list
list_of_dataframes = [convert_to_numeric(df) for df in list_of_dataframes]

In [44]:
#checking dataset
fossil_c0_df1.head()

Unnamed: 0,Substance,Sector,EDGAR Country Code,Country,1970,1971,1972,1973,1974,1975,...,2013,2014,2015,2016,2017,2018,2019,2020,2021,2022
0,CO2,Agriculture,AFG,Afghanistan,0.029228567,0.029228567,0.029228567,0.029228567,0.039966661,0.045309517,...,0.055157133,0.084490461,0.116966646,0.162799971,0.310880897,0.160914257,0.15043807,0.064795227,0.054360944,0.046161495
1,CO2,Agriculture,ALB,Albania,0.1133,0.1133,0.1133,0.1133,0.113614286,0.112514286,...,0.032738093,0.056623805,0.058719042,0.049604756,0.056676186,0.048976186,0.069404755,0.063066662,0.063936185,0.064880163
2,CO2,Agriculture,ARG,Argentina,0.10434285,0.10434285,0.10434285,0.10434285,0.087214278,0.077314278,...,0.999166539,1.145152229,0.892257036,1.359547443,1.278199828,1.636118833,1.703061665,1.925471156,2.108102551,2.323374909
3,CO2,Agriculture,ARM,Armenia,0.055288203,0.055288203,0.055288203,0.055288203,0.059966435,0.059966435,...,0.021685714,0.022628571,0.022628571,0.022471428,0.034257142,0.035985713,0.043738093,0.035409523,0.036038094,0.036771428
4,CO2,Agriculture,AUS,Australia,0.311142842,0.311142842,0.311142842,0.311142842,0.311142842,0.268190461,...,2.128866419,2.182923567,2.291771194,2.505223526,2.641204403,2.155371198,2.290199762,2.433042605,2.452371173,2.471969198


In [52]:
#checking dataset
fossil_c0_df2.head()

Unnamed: 0,Substance,EDGAR Country Code,Country,1990,1991,1992,1993,1994,1995,1996,...,2013,2014,2015,2016,2017,2018,2019,2020,2021,2022
0,CO2,ABW,Aruba,0.093471292,0.09849099,0.103218745,0.097050738,0.099128235,0.106855794,0.069570012,...,0.115619417,0.118461207,0.114797587,0.116795852,0.104161173,0.106899738,0.122983685,0.128353085,0.105767784,0.094146157
1,CO2,AFG,Afghanistan,0.094548388,0.101383388,0.06068518,0.077302468,0.094959089,0.056064539,0.057971681,...,0.127254112,0.118014858,0.122507514,0.10905088,0.114397788,0.107525189,0.094807687,0.071289398,0.092701446,0.093302385
2,CO2,AGO,Angola,0.164742737,0.171108906,0.187014898,0.246363328,0.230244439,0.22569967,0.24047217,...,0.131328892,0.136870117,0.145461308,0.139891371,0.119901827,0.114588797,0.118564695,0.102793394,0.105699576,0.096035342
3,CO2,AIA,Anguilla,0.024729646,0.039857308,0.031428805,0.038148938,0.048995916,0.060998789,0.062552307,...,0.092174944,0.096833334,0.112299701,0.125703897,0.148668854,0.187873051,0.122306011,0.107834823,0.098602277,0.090136584
4,CO2,ALB,Albania,0.419026004,0.383600484,0.223042246,0.191501385,0.186461492,0.145310473,0.132203156,...,0.145043454,0.152408698,0.142281288,0.128107144,0.145769656,0.138137054,0.127189117,0.11418586,0.106195393,0.104264978


In [53]:
#checking dataset

fossil_c0_df3.head()

Unnamed: 0,Substance,EDGAR Country Code,Country,1990,1991,1992,1993,1994,1995,1996,...,2013,2014,2015,2016,2017,2018,2019,2020,2021,2022
0,CO2,ABW,Aruba,0.093471292,0.09849099,0.103218745,0.097050738,0.099128235,0.106855794,0.069570012,...,0.115619417,0.118461207,0.114797587,0.116795852,0.104161173,0.106899738,0.122983685,0.128353085,0.105767784,0.094146157
1,CO2,AFG,Afghanistan,0.094548388,0.101383388,0.06068518,0.077302468,0.094959089,0.056064539,0.057971681,...,0.127254112,0.118014858,0.122507514,0.10905088,0.114397788,0.107525189,0.094807687,0.071289398,0.092701446,0.093302385
2,CO2,AGO,Angola,0.164742737,0.171108906,0.187014898,0.246363328,0.230244439,0.22569967,0.24047217,...,0.131328892,0.136870117,0.145461308,0.139891371,0.119901827,0.114588797,0.118564695,0.102793394,0.105699576,0.096035342
3,CO2,AIA,Anguilla,0.024729646,0.039857308,0.031428805,0.038148938,0.048995916,0.060998789,0.062552307,...,0.092174944,0.096833334,0.112299701,0.125703897,0.148668854,0.187873051,0.122306011,0.107834823,0.098602277,0.090136584
4,CO2,ALB,Albania,0.419026004,0.383600484,0.223042246,0.191501385,0.186461492,0.145310473,0.132203156,...,0.145043454,0.152408698,0.142281288,0.128107144,0.145769656,0.138137054,0.127189117,0.11418586,0.106195393,0.104264978


In [54]:
#checking dataset

fossil_c0_df4.head()

Unnamed: 0,Substance,EDGAR Country Code,Country,1970,1971,1972,1973,1974,1975,1976,...,2013,2014,2015,2016,2017,2018,2019,2020,2021,2022
0,CO2,ABW,Aruba,0.025213789,0.028827752,0.039472108,0.044289439,0.043469147,0.057396274,0.05642291,...,0.424889796,0.435250166,0.436735938,0.453740207,0.42684207,0.461101331,0.53384758,0.453586967,0.437952665,0.455064864
1,CO2,AFG,Afghanistan,1.734053007,1.733842398,1.693672367,1.733882869,2.190253892,2.028877749,1.892591452,...,8.69107407,8.279686389,8.719594839,7.937268307,8.546887543,8.128904313,7.44790224,5.468681001,5.639214369,5.675770659
2,CO2,AGO,Angola,8.948152992,8.533778718,10.38370373,11.36732743,11.82785646,10.92409928,7.310833038,...,27.81283283,30.38516604,32.59642457,30.53968511,26.13739933,24.65034199,25.32660459,20.7190364,21.56059954,20.18566845
3,CO2,AIA,Anguilla,0.002177587,0.002177689,0.00227319,0.00211848,0.002359836,0.002593654,0.002444145,...,0.027961269,0.02791705,0.028026636,0.028361313,0.029084088,0.028244835,0.027602021,0.022801673,0.022015917,0.022819879
4,CO2,AIR,International Aviation,168.6025154,168.6025154,178.8060502,186.4739584,179.4544919,173.5397005,173.8524567,...,481.6120758,497.3298679,525.8753816,548.4834487,583.3149976,609.3204042,619.2339043,295.392661,340.8376858,420.3664792


In [61]:
#checking dataset
fossil_ghg_df1.head()

Unnamed: 0,Substance,Sector,EDGAR Country Code,Country,1970,1971,1972,1973,1974,1975,...,2013,2014,2015,2016,2017,2018,2019,2020,2021,2022
0,CO2,Agriculture,AFG,Afghanistan,0.029229,0.029229,0.029229,0.029229,0.039967,0.04531,...,0.055157,0.08449,0.116967,0.1628,0.310881,0.160914,0.150438,0.064795,0.054361,0.046161
1,CO2,Agriculture,ALB,Albania,0.1133,0.1133,0.1133,0.1133,0.113614,0.112514,...,0.032738,0.056624,0.058719,0.049605,0.056676,0.048976,0.069405,0.063067,0.063936,0.06488
2,CO2,Agriculture,ARG,Argentina,0.104343,0.104343,0.104343,0.104343,0.087214,0.077314,...,0.999167,1.145152,0.892257,1.359547,1.2782,1.636119,1.703062,1.925471,2.108103,2.323375
3,CO2,Agriculture,ARM,Armenia,0.055288,0.055288,0.055288,0.055288,0.059966,0.059966,...,0.021686,0.022629,0.022629,0.022471,0.034257,0.035986,0.043738,0.03541,0.036038,0.036771
4,CO2,Agriculture,AUS,Australia,0.311143,0.311143,0.311143,0.311143,0.311143,0.26819,...,2.128866,2.182924,2.291771,2.505224,2.641204,2.155371,2.2902,2.433043,2.452371,2.471969


In [58]:
#checking dataset
fossil_ghg_df2.head()

Unnamed: 0,EDGAR Country Code,Country,1990,1991,1992,1993,1994,1995,1996,1997,...,2013,2014,2015,2016,2017,2018,2019,2020,2021,2022
0,ABW,Aruba,0.10584,0.110602,0.115211,0.108833,0.110675,0.11868,0.08093,0.118309,...,0.126221,0.129302,0.12524,0.127188,0.114007,0.116399,0.132654,0.139996,0.115727,0.102756
1,AFG,Afghanistan,0.449669,0.507757,0.465382,0.628269,0.817943,0.561866,0.651665,0.737066,...,0.452435,0.444916,0.438469,0.419966,0.425273,0.415871,0.385993,0.366971,0.475962,0.478661
2,AGO,Angola,0.510143,0.503236,0.520981,0.762617,0.97466,0.975982,1.027785,0.998562,...,0.391848,0.383724,0.402118,0.398956,0.37565,0.363591,0.362508,0.355639,0.341278,0.316286
3,AIA,Anguilla,0.035708,0.054927,0.043255,0.050091,0.061326,0.074167,0.076553,0.066703,...,0.10935,0.115066,0.133502,0.149504,0.176325,0.223699,0.146227,0.132924,0.122122,0.110988
4,ALB,Albania,0.729208,0.795336,0.584378,0.536527,0.550559,0.460684,0.405479,0.401371,...,0.258876,0.269044,0.258515,0.239347,0.252916,0.241784,0.224396,0.20867,0.192454,0.185543


In [59]:
#checking dataset
fossil_ghg_df3.head()

Unnamed: 0,EDGAR Country Code,Country,1970,1971,1972,1973,1974,1975,1976,1977,...,2013,2014,2015,2016,2017,2018,2019,2020,2021,2022
0,ABW,Aruba,0.764874,0.842243,1.017002,1.093028,1.075665,1.30451,1.290385,1.472615,...,4.495219,4.577104,4.566385,4.713839,4.438256,4.751366,5.429571,4.648065,4.478428,4.641895
1,AFG,Afghanistan,1.558155,1.514547,1.308854,1.32743,1.4023,1.428421,1.400988,1.389203,...,0.973788,0.95288,0.925066,0.882018,0.894256,0.864369,0.814933,0.739738,0.744043,0.731973
2,AGO,Angola,2.971847,2.889799,3.202473,3.349278,3.355909,3.109663,2.526316,3.028038,...,3.191958,3.164378,3.234489,3.022751,2.749379,2.541601,2.436028,2.183618,2.053871,1.900082
3,AIA,Anguilla,0.664745,0.659985,0.670368,0.640851,0.674738,0.707861,0.679611,0.692013,...,2.319825,2.294322,2.280345,2.284678,2.313661,2.235358,2.174801,1.839092,1.81783,1.873253
4,ALB,Albania,3.84137,3.726499,4.055711,3.74412,3.776578,3.779692,3.951595,4.08805,...,2.917092,3.083604,3.025938,2.891528,3.167459,3.145276,2.975907,2.672676,2.682786,2.708985


In [60]:
#checking dataset
fossil_ghg_df4.head()

Unnamed: 0,EDGAR Country Code,Country,1970,1971,1972,1973,1974,1975,1976,1977,...,2013,2014,2015,2016,2017,2018,2019,2020,2021,2022
0,ABW,Aruba,0.045176,0.050063,0.060857,0.065847,0.065108,0.079128,0.078179,0.088896,...,0.463848,0.475081,0.476461,0.494114,0.467189,0.502077,0.575822,0.494731,0.479192,0.496683
1,AFG,Afghanistan,17.336192,17.292793,15.342251,15.96606,17.278443,17.984253,17.989107,18.153414,...,30.899954,31.214446,31.208495,30.56723,31.773007,31.43984,30.322851,28.150674,28.953697,29.117879
2,AGO,Angola,20.138364,20.018416,22.721001,24.375911,25.083208,23.889924,19.960419,24.620945,...,82.985493,85.186645,90.110604,87.096037,81.888045,78.215749,77.435411,71.682513,69.613898,66.480058
3,AIA,Anguilla,0.004256,0.004258,0.004358,0.004201,0.004451,0.004692,0.004526,0.004627,...,0.033171,0.033174,0.033318,0.033731,0.034494,0.033631,0.033,0.028107,0.027267,0.028099
4,AIR,International Aviation,171.160455,171.147982,181.492019,189.261213,182.123246,176.107638,176.412012,192.354128,...,488.025465,503.952576,532.878284,555.78522,591.079314,617.430356,627.478302,299.325868,345.376001,425.963735


In [45]:
fossil_c0_df1.dtypes

Substance             object
Sector                object
EDGAR Country Code    object
Country               object
1970                  object
1971                  object
1972                  object
1973                  object
1974                  object
1975                  object
1976                  object
1977                  object
1978                  object
1979                  object
1980                  object
1981                  object
1982                  object
1983                  object
1984                  object
1985                  object
1986                  object
1987                  object
1988                  object
1989                  object
1990                  object
1991                  object
1992                  object
1993                  object
1994                  object
1995                  object
1996                  object
1997                  object
1998                  object
1999                  object
2000          