### Source: Hannah Ritchie and Max Roser (2020) - "CO₂ and Greenhouse Gas Emissions". Published online at OurWorldInData.org. Retrieved from: 'https://ourworldindata.org/co2-and-other-greenhouse-gas-emissions' [Online Resource]

#### Reuse our work freely
#### All visualizations, data, and code produced by Our World in Data are completely open access under the Creative Commons BY license. You have the permission to use, distribute, and reproduce these in any medium, provided the source and authors are credited.

#### The data produced by third parties and made available by Our World in Data is subject to the license terms from the original third-party authors. We will always indicate the original source of the data in our documentation, so you should always check the license of any such third-party data before use and redistribution.

#### All of our charts can be embedded in any site.

In [1]:
import pandas as pd

In [2]:
df = pd.read_csv('../../resources/annual-co-emissions-by-region.csv')

In [3]:
df.head()

Unnamed: 0,Entity,Code,Year,Annual CO2 emissions
0,Afghanistan,AFG,1750,0.0
1,Afghanistan,AFG,1751,0.0
2,Afghanistan,AFG,1752,0.0
3,Afghanistan,AFG,1753,0.0
4,Afghanistan,AFG,1754,0.0


In [4]:
# keep 2010 onwards; copy so the in-place edits below act on an independent frame
df = df[df['Year'] >= 2010].copy()

In [5]:
df.drop(columns='Code', inplace=True)
df.rename(columns={'Entity': 'Country'}, inplace=True)
df.reset_index(drop=True, inplace=True)

In [6]:
df.head(20)

Unnamed: 0,Country,Year,Annual CO2 emissions
0,Afghanistan,2010,8397779.0
1,Afghanistan,2011,12105790.0
2,Afghanistan,2012,10218510.0
3,Afghanistan,2013,8440766.0
4,Afghanistan,2014,7774340.0
5,Afghanistan,2015,7904133.0
6,Afghanistan,2016,6744628.0
7,Afghanistan,2017,6859825.0
8,Afghanistan,2018,10452670.0
9,Afghanistan,2019,10720330.0


In [7]:
df[df['Country'] == 'North Macedonia']

Unnamed: 0,Country,Year,Annual CO2 emissions
1560,North Macedonia,2010,8500480.0
1561,North Macedonia,2011,9196640.0
1562,North Macedonia,2012,8734976.0
1563,North Macedonia,2013,7771344.0
1564,North Macedonia,2014,7459904.0
1565,North Macedonia,2015,7023888.0
1566,North Macedonia,2016,6987248.0
1567,North Macedonia,2017,7507536.0
1568,North Macedonia,2018,6980909.0
1569,North Macedonia,2019,8041363.0


In [8]:
# rename North Macedonia to Macedonia
# (.loc assigns on df itself; chained indexing would write to a temporary copy)
df.loc[df['Country'] == 'North Macedonia', 'Country'] = 'Macedonia'


In [9]:
# verify change
df.loc[df['Country'] == 'Macedonia', 'Country']

1560    Macedonia
1561    Macedonia
1562    Macedonia
1563    Macedonia
1564    Macedonia
1565    Macedonia
1566    Macedonia
1567    Macedonia
1568    Macedonia
1569    Macedonia
Name: Country, dtype: object

In [10]:
# load main dataset
main_df = pd.read_csv('../../dataset/MAIN_happ_temp_cw_gg_pop_pm.csv')

In [11]:
# get countries in main dataset
main_countries = main_df['Country'].unique()

In [12]:
# get countries in this dataset
df_countries = df['Country'].unique()

In [13]:
# keep only countries that also appear in the main dataset
df = df[df['Country'].isin(main_countries)]

In [14]:
new_countries = df['Country'].unique()

In [15]:
# print main-dataset countries absent from this dataset (one is missing)
for country in main_countries:
    if country not in new_countries:
        print(country)
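
The comparison above can also be done in both directions with set differences. This is only a minimal sketch on toy data: the frames `emissions` and `main` and the placeholder names in them are assumptions for illustration, not rows from the actual files.

```python
import pandas as pd

# Toy stand-ins for the two frames; the names are placeholders, not real data.
emissions = pd.DataFrame({"Country": ["Afghanistan", "Macedonia", "Region Y"]})
main = pd.DataFrame({"Country": ["Afghanistan", "Macedonia", "Country X"]})

emission_names = set(emissions["Country"])
main_names = set(main["Country"])

# Names the main dataset expects but the emissions data lacks.
missing_from_emissions = sorted(main_names - emission_names)
# Names only the emissions data has (aggregates, alternative spellings, ...).
extra_in_emissions = sorted(emission_names - main_names)

print(missing_from_emissions)
print(extra_in_emissions)
```

Looking at both lists side by side can help spot a naming mismatch, such as the North Macedonia / Macedonia case handled earlier.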

In [16]:
# reset index
df.reset_index(drop=True, inplace=True)

In [17]:
# verify change
df.head(20)

Unnamed: 0,Country,Year,Annual CO2 emissions
0,Afghanistan,2010,8397779.0
1,Afghanistan,2011,12105788.0
2,Afghanistan,2012,10218514.0
3,Afghanistan,2013,8440766.0
4,Afghanistan,2014,7774340.0
5,Afghanistan,2015,7904133.0
6,Afghanistan,2016,6744628.0
7,Afghanistan,2017,6859825.0
8,Afghanistan,2018,10452666.0
9,Afghanistan,2019,10720332.0


In [18]:
# export for potential use
df.to_csv('../../dataset/annual_co2_2010-2019(updated_all_countries).csv', index=False)
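
Before relying on the exported file, it may be worth checking that no country/year pair is duplicated and that no country has gaps in its years. The sketch below uses a made-up toy frame and a shortened year range; for the real data the expected range would presumably be 2010 to 2019.

```python
import pandas as pd

# Toy frame standing in for the filtered emissions data (values are made up).
toy = pd.DataFrame({
    "Country": ["A"] * 3 + ["B"] * 2,
    "Year": [2010, 2011, 2012, 2010, 2012],
    "Annual CO2 emissions": [1.0, 2.0, 3.0, 4.0, 5.0],
})
expected_years = set(range(2010, 2013))  # shortened range for the toy example

# No (Country, Year) pair should appear twice.
has_duplicates = toy.duplicated(subset=["Country", "Year"]).any()

# Map each country to the expected years it lacks, keeping only real gaps.
gaps = {
    country: sorted(expected_years - set(group["Year"]))
    for country, group in toy.groupby("Country")
}
gaps = {country: years for country, years in gaps.items() if years}

print(has_duplicates, gaps)
```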