# Actors in Climate Negotiations

This notebook shows how to collect several characteristics of countries or groups of countries. In particular, we want to combine:
- population data
- economic data (GDP, both nominal and PPP)
- emissions data
- electricity consumption data

## Load prerequisites

(The prerequisites must already be installed, e.g. by running ``pip install -r notebooks/requirements.txt`` in a terminal.)

In [1]:
import numpy as np
import pandas as pd
import world_bank_data as wb
import pycountry



## Population and economic data

We can load the data from World Bank:
- [SP.POP.TOTL](https://data.worldbank.org/indicator/SP.POP.TOTL) indicator = population data
- [NY.GDP.MKTP.CD](https://data.worldbank.org/indicator/NY.GDP.MKTP.CD) indicator = nominal GDP
- [NY.GDP.MKTP.PP.KD](https://data.worldbank.org/indicator/NY.GDP.MKTP.PP.KD) indicator = GDP, PPP in constant 2017 international dollars

Nominal GDP makes more sense when we compare the overall economic sizes of countries, whereas GDP expressed in purchasing power parity (PPP) is better for showing the differences in economic wellbeing across countries.

World Bank data are keyed by [alpha-3 ISO 3166](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-3) country codes, so it is very easy to join them with additional country data later. We load only the most recent value for each country.

In [2]:
to_drop = ['Series', 'Year']

# population
pop = wb.get_series('SP.POP.TOTL', id_or_value='id', mrv=1).reset_index().drop(columns=to_drop)

# GDP PPP
gdp_ppp = wb.get_series('NY.GDP.MKTP.PP.KD', id_or_value='id', mrv=1, gapfill='Y').reset_index().drop(columns=to_drop)

# GDP
gdp = wb.get_series('NY.GDP.MKTP.CD', id_or_value='id', mrv=1, gapfill='Y').reset_index().drop(columns=to_drop)

In [3]:
# merge and rename
df = pd.merge(pd.merge(pop, gdp), gdp_ppp) \
    .rename(columns={'SP.POP.TOTL': 'pop', 'NY.GDP.MKTP.CD': 'gdp', 'NY.GDP.MKTP.PP.KD': 'gdp_ppp', 'Country': 'code'})

In [4]:
df.head()

Unnamed: 0,code,pop,gdp,gdp_ppp
0,AFE,677243299.0,898474100000.0,2294226000000.0
1,AFW,458803476.0,786585000000.0,1836663000000.0
2,ARB,436080728.0,2530186000000.0,5997727000000.0
3,CSS,7442291.0,68415250000.0,110476800000.0
4,CEB,102246330.0,1644372000000.0,3203326000000.0


## GHG emissions data

Emissions data are loaded from the EDGAR database (Emissions Database for Global Atmospheric Research), published by the Joint Research Centre of the European Commission. We use [EDGAR v6.0](https://edgar.jrc.ec.europa.eu/dataset_ghg60), the newest version at the time of writing, which contains emissions data for CO2, CH4 and N2O for all countries up to 2018.

The EDGAR files were downloaded and unzipped into ``../data/edgar/v6.0/``.

In [5]:
# EDGAR v6.0: gas name -> file-name stem
edgar_files = {'CO2': 'CO2_excl_short-cycle_org_C', 'CH4': 'CH4', 'N2O': 'N2O'}
ghgs = list(edgar_files)
edgar = None

for gas in ghgs:
    ef = edgar_files[gas]
    ey = 2018  # last year covered by the files
    filename = f'../data/edgar/v6.0/v60_{ef}_1970_{ey}.xls'
    frame = pd.read_excel(filename, sheet_name='TOTALS BY COUNTRY', header=9)
    frame = frame[['Country_code_A3'] + [f'Y_{y}' for y in range(1970, ey + 1)]] \
                  .rename(columns={'Country_code_A3': 'code', **{f'Y_{y}': y for y in range(1970, ey + 1)}})
    frame = frame.groupby('code').sum()
    frame.columns = frame.columns.rename('year')
    frame = frame.unstack().rename(gas).reset_index()
    frame = frame[~frame['code'].isin(['SEA', 'AIR'])]
    if edgar is None:
        edgar = frame
    else:
        edgar = pd.merge(edgar, frame, how='outer')

In [6]:
edgar.tail()

Unnamed: 0,year,code,CO2,CH4,N2O
11265,2018,PCN,,0.0,
11266,2018,SGS,,0.0,
11267,2018,TKL,,0.009969,0.000401
11268,2018,UMI,,0.0,
11269,2018,WLF,,0.212954,0.01298


We use only values for 2018 and combine these three gases into CO2eq based on GWP100 values from IPCC AR5.

In [7]:
ghg = edgar.query('year == 2018').drop(columns=['year']).sort_values('code').reset_index(drop=True)
ghg['ghg'] = ghg['CO2'] + 28 * ghg['CH4'] + 265 * ghg['N2O']

In [8]:
ghg.head()

Unnamed: 0,code,CO2,CH4,N2O,ghg
0,ABW,874.590901,1.14347,0.055769,921.386896
1,AFG,8710.281298,620.084743,15.055613,30062.391483
2,AGO,26343.833629,1721.085685,16.852362,79000.108746
3,AIA,28.222981,0.153591,0.001592,32.945529
4,ALB,5479.763088,96.805143,3.410042,9093.968249
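
The CO2eq combination can be checked on a small example. This is a minimal sketch on made-up numbers (not EDGAR data); the column names `ghg_strict` and `ghg_filled` are hypothetical. It also shows that a missing value for any single gas makes the plain sum `NaN`, which is what happens in the cell above. The `fillna(0)` variant is an assumption about how such gaps could be handled, not what this notebook does.

```python
import numpy as np
import pandas as pd

# GWP100 factors from IPCC AR5, the same as used above
GWP100 = {'CO2': 1, 'CH4': 28, 'N2O': 265}

# Toy data in Gg; 'BBB' has no CO2 value
toy = pd.DataFrame({
    'code': ['AAA', 'BBB'],
    'CO2': [100.0, np.nan],
    'CH4': [2.0, 1.0],
    'N2O': [1.0, 0.5],
})

gases = list(GWP100)
weights = pd.Series(GWP100)

# Strict variant: a NaN in any gas propagates to the total
toy['ghg_strict'] = (toy[gases] * weights).sum(axis=1, skipna=False)

# Lenient variant (an assumption, not used in this notebook): treat gaps as zero
toy['ghg_filled'] = (toy[gases].fillna(0) * weights).sum(axis=1)
```

Rows with a `NaN` total are skipped by `sum()`, so territories with an incomplete set of gases do not contribute to the world total computed from the strict variant.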


In [9]:
# And combine emissions data with World Bank data

df = pd.merge(df, ghg)
print(f'The total emissions were {ghg.ghg.sum() / 1e6:.4g} Gt CO2eq before the merge and {df.ghg.sum() / 1e6:.4g} Gt CO2eq after the merge.')

The total emissions were 49.26 Gt CO2eq before the merge and 48.84 Gt CO2eq after the merge.


The sets of countries in the two datasets are not identical, so the total world emissions are slightly lower after the merge, by less than 1 percent. This should not affect the comparison of the selected groups of countries.
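
To see which codes an inner merge drops, one option is an outer merge with pandas' `indicator=True`. The sketch below uses made-up toy frames (`wb_toy` and `ghg_toy` are hypothetical stand-ins for the World Bank and EDGAR frames); it is not part of the original analysis.

```python
import pandas as pd

# Toy stand-ins for the World Bank frame and the EDGAR frame (made-up values)
wb_toy = pd.DataFrame({'code': ['CZE', 'DEU', 'XKX'], 'pop': [10.7, 83.2, 1.8]})
ghg_toy = pd.DataFrame({'code': ['CZE', 'DEU', 'TKL'], 'ghg': [128.0, 850.0, 0.4]})

# An outer merge with indicator=True labels each row by its origin
check = pd.merge(wb_toy, ghg_toy, how='outer', indicator=True)

# Codes present in only one of the two sources
only_wb = check.loc[check['_merge'] == 'left_only', 'code'].tolist()
only_ghg = check.loc[check['_merge'] == 'right_only', 'code'].tolist()

# Share of emissions that an inner merge would drop
lost = check.loc[check['_merge'] == 'right_only', 'ghg'].sum()
lost_share = lost / ghg_toy['ghg'].sum()
```

Applied to the real frames, the same pattern would list the territories responsible for the gap between the two totals.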

## Electricity consumption data

We use data provided by EMBER in their [Global Electricity Review 2021](https://ember-climate.org/project/global-electricity-review-2021/). The full dataset was downloaded into ``../data/ember/``. Unfortunately, the dataset does not contain ISO 3166 country codes, so we use the ``pycountry`` package, together with some manual renaming, to map the countries correctly. We use the ``Demand`` variable for 2019, because the data for 2020 are incomplete in the dataset.

In [10]:
# Fixing country names in EMBER data

missing = ['EU-27', 'EU27+1', 'Kosovo', 'Netherlands Antilles', 'U.S. Pacific Islands', 'Wake Island', 'World']
convert = {
    'Bolivia': 'Bolivia, Plurinational State of',
    'Brunei': 'Brunei Darussalam',
    'Burma': 'Myanmar',
    'Congo-Brazzaville': 'Congo',
    'Congo-Kinshasa': 'Congo, The Democratic Republic of the',
    'Cote d\'Ivoire': 'Côte d\'Ivoire',
    'Falkland Islands': 'Falkland Islands (Malvinas)',
    'Gambia, The': 'Gambia',
    'Iran': 'Iran, Islamic Republic of',
    'Laos': 'Lao People\'s Democratic Republic',
    'Macau': 'Macao',
    'North Korea': 'Korea, Democratic People\'s Republic of',
    'Palestinian Territories': 'Palestine, State of',
    'Reunion': 'Réunion',
    'Russia': 'Russian Federation',
    'Saint Helena': 'Saint Helena, Ascension and Tristan da Cunha',
    'Saint Vincent/Grenadines': 'Saint Vincent and the Grenadines',
    'South Korea': 'Korea, Republic of',
    'Syria': 'Syrian Arab Republic',
    'The Bahamas': 'Bahamas',
    'U.S. Virgin Islands': 'Virgin Islands, U.S.'
}

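def country_lookup(x):
    """Return the alpha-3 code for an EMBER area name, or None for areas listed in `missing`."""
    if x in missing:
        return None
    # translate EMBER naming to the names known to pycountry
    x = convert.get(x, x)
    return pycountry.countries.lookup(x).alpha_3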

In [11]:
ember = pd.read_excel('../data/ember/Data-Global-Electricity-Review-2021.xlsx', sheet_name='Data', skiprows=1)
ember = ember[(ember['Year'] == 2019) & (ember['Variable'] == 'Demand')].copy()
ember['code'] = ember.Area.apply(country_lookup)
ember = ember.dropna(subset=['code'])[['code', 'Generation (TWh)']].rename(columns={'Generation (TWh)': 'electricity'})

In [12]:
ember.head()

Unnamed: 0,code,electricity
71399,AFG,6.420333
71416,ALB,2.26294
71433,DZA,74.237227
71450,ASM,0.163
71467,AGO,12.477753


In [13]:
# combine all the data sources together again

df = pd.merge(pd.merge(pop, gdp), gdp_ppp) \
    .rename(columns={'SP.POP.TOTL': 'pop', 'NY.GDP.MKTP.CD': 'gdp', 'NY.GDP.MKTP.PP.KD': 'gdp_ppp', 'Country': 'code'})
df = pd.merge(df, ghg)
df = pd.merge(df, ember, how='left')
df['electricity'] = df['electricity'].fillna(0)

In [14]:
# add country names to the dataset
df['country'] = df['code'].apply(lambda x: pycountry.countries.get(alpha_3=x).name)
df = df[['country', 'code', 'pop', 'gdp', 'gdp_ppp', 'ghg', 'electricity']].sort_values('country').copy()

In [15]:
df.head()

Unnamed: 0,country,code,pop,gdp,gdp_ppp,ghg,electricity
0,Afghanistan,AFG,38928341.0,19807070000.0,77037690000.0,30062.391483,6.420333
1,Albania,ALB,2837743.0,14799620000.0,37728960000.0,9093.968249,2.26294
2,Algeria,DZA,43851043.0,145163900000.0,468402800000.0,265761.145392,74.237227
3,American Samoa,ASM,55197.0,638000000.0,,16.363991,0.163
4,Angola,AGO,32866268.0,62306910000.0,203707900000.0,79000.108746,12.477753


- ``pop`` = country population (the most recent values are for 2020 for most countries)
- ``gdp`` = nominal GDP (ditto)
- ``gdp_ppp`` = GDP PPP in constant 2017 international dollars (ditto)
- ``ghg`` = greenhouse gas emissions in 2018 in Gg CO2eq (gigagrams, i.e. thousands of tonnes, the default unit in the EDGAR database; includes CO2, CH4 and N2O only)
- ``electricity`` = electricity demand in TWh in 2019

In [16]:
df.to_csv('../outputs/actors-in-climate-negotiations-countries.csv', index=False)

## Actors in negotiations

Some countries are relatively similar, in terms of their characteristics and in terms of the impact of climate change. We define several groups, so it is easier to reason about the main actors in the negotiations.

In [17]:
# EU 27
eu = ['AUT', 'BEL', 'BGR', 'HRV', 'CYP', 'CZE', 'DNK', 'EST', 'FIN', 'FRA', 'DEU', 'GRC', 'HUN', 'IRL', 'ITA', 'LVA', 'LTU', 'LUX', 'MLT', 'NLD', 'POL', 'PRT', 'ROU', 'SVK', 'SVN', 'ESP', 'SWE']

# Oil states of Persian Gulf
oil = ['IRQ', 'IRN', 'KWT', 'SAU', 'ARE', 'BHR', 'QAT', 'OMN']

# Africa, excluding only South Africa as it differs significantly from other African countries
africa = ['LBY', 'DZA', 'TUN', 'EGY', 'DJI', 'ERI', 'MYT', 'SHN', 'SOM', 'ESH', 'GNQ', 'SYC', 'MUS', 'GAB', 'BWA', 'NAM', 'SWZ', 'MAR', 'AGO', 'CPV', 'COG', 'NGA', 'SDN', 'GHA', 'MRT', 'ZMB', 'CIV', 'CMR', 'SEN', 'STP', 'LSO', 'KEN', 'TZA', 'ZWE', 'COM', 'TCD', 'BEN', 'MLI', 'SSD', 'GIN', 'RWA', 'UGA', 'BFA', 'ETH', 'GMB', 'GNB', 'TGO', 'MDG', 'SLE', 'LBR', 'MOZ', 'MWI', 'NER', 'COD', 'BDI', 'CAF']

# Small islands states (includes a few similar coastal countries)
islands = ['ATG', 'BHS', 'BHR', 'BRB', 'BLZ', 'CPV', 'COM', 'CUB', 'DMA', 'DOM', 'FJI', 'GRD', 'GNB',
           'GUY', 'HTI', 'JAM', 'KIR', 'MDV', 'MHL', 'FSM', 'MUS', 'NRU', 'PLW', 'PNG', 'WSM', 'STP',
           'SGP', 'KNA', 'LCA', 'VCT', 'SYC', 'SLB', 'SUR', 'TLS', 'TON', 'TTO', 'TUV', 'VUT', 'ASM',
           'ABW', 'BMU', 'VGB', 'CYM', 'CUW', 'PYF', 'GUM', 'NCL', 'PRI', 'SXM', 'TCA', 'VIR']

# Southeast Asia: Indonesia, Philippines, Malaysia, Brunei, Thailand, Cambodia, Vietnam, Laos, Myanmar, Bangladesh
se_asia = ['IDN', 'PHL', 'MYS', 'BRN', 'THA', 'KHM', 'VNM', 'LAO', 'MMR', 'BGD']

# South and central America (excluding what is already included in small islands group)
cs_america_all = ['VIR', 'AIA', 'ATG', 'ARG', 'ABW', 'BHS', 'BRB', 'BLZ', 'BOL', 'BVT', 'VGB', 'CHL', 'CUW', 'DMA', 'DOM', 'ECU', 'FLK', 'GUF', 'GRD', 'GLP', 'GTM', 'GUY', 'HTI', 'HND', 'JAM', 'SGS', 'CYM', 'COL', 'CRI', 'CUB', 'MTQ', 'MEX', 'MSR', 'NIC', 'PAN', 'PRY', 'PER', 'PRI', 'SLV', 'SUR', 'LCA', 'BLM', 'KNA', 'MAF', 'SXM', 'VCT', 'TTO', 'TCA', 'URY', 'VEN']
cs_america = [x for x in cs_america_all if x not in islands]

In [18]:
groups = {
    'European Union + Great Britain': eu + ['GBR'],
    'United States': ['USA'],
    'China': ['CHN'],
    'India': ['IND'],
    'Russia': ['RUS'],
    'Brazil': ['BRA'],
    'Japan + Korea': ['JPN', 'KOR'],
    'Oil states of Persian Gulf': oil,
    'Small islands': islands,
    'Africa excluding South Africa': africa,  # already contains LBY, DZA, TUN and EGY
    'Southeast Asia': se_asia,
    'South and central America': cs_america
}

In [19]:
# to aggregate the dataframe, create the inverted dictionary
inv_groups = {v: k for k, vs in groups.items() for v in vs}
df['region'] = df['code'].apply(lambda x: inv_groups[x] if x in inv_groups else None)
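
Some codes appear in more than one list above (for example `BHR` in both the oil states and the small islands, and `CPV` in both the small islands and Africa). Because the inverted dictionary stores one group per code, the group listed last in `groups` wins. The following sketch on a hypothetical subset (`toy_groups` is not part of the notebook) shows one way to list such overlaps and how they are resolved.

```python
from collections import defaultdict

# Toy groups (hypothetical subset; the codes mirror overlaps present above)
toy_groups = {
    'Oil states': ['SAU', 'BHR'],
    'Small islands': ['BHR', 'CPV', 'FJI'],
    'Africa': ['CPV', 'NGA'],
}

# Collect every group each code belongs to
membership = defaultdict(list)
for name, codes in toy_groups.items():
    for code in codes:
        membership[code].append(name)

# Codes that belong to more than one group
overlaps = {code: names for code, names in membership.items() if len(names) > 1}

# The inverted dictionary keeps the last group listed for each code
inv = {code: name for name, codes in toy_groups.items() for code in codes}
```

With the ordering used in `groups`, this means that Bahrain is counted among the small islands and Cape Verde among the African countries, so no country is counted twice in the aggregates.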

In [20]:
# world totals
num_cols = ['pop', 'gdp', 'gdp_ppp', 'ghg', 'electricity']
world = df[num_cols].sum()

In [21]:
# aggregate regions and add Czechia
agg = df.groupby('region')[num_cols].sum().reset_index()
cze = df[df['code'] == 'CZE'][num_cols].copy()  # copy to avoid SettingWithCopyWarning below
cze['region'] = 'Czechia'
agg = pd.concat([agg, cze]).sort_values('region').reset_index(drop=True)

In [22]:
# add relative values

agg['pop_ratio'] = agg['pop'] / world['pop']
agg['gdp_ratio'] = agg['gdp'] / world['gdp']
agg['gdp_per_capita'] = agg['gdp_ppp'] / agg['pop']
agg['ghg_ratio'] = agg['ghg'] / world['ghg']
agg['ghg_per_capita'] = 1000 * agg['ghg'] / agg['pop']  # convert to t CO2eq per capita
agg['electricity_ratio'] = agg['electricity'] / world['electricity']
agg['electricity_per_capita'] = agg['electricity'] / agg['pop'] * 1e6  # convert to MWh per capita

In [23]:
agg

Unnamed: 0,region,pop,gdp,gdp_ppp,ghg,electricity,pop_ratio,gdp_ratio,gdp_per_capita,ghg_ratio,ghg_per_capita,electricity_ratio,electricity_per_capita
0,Africa excluding South Africa,1264772000.0,2060606000000.0,5562327000000.0,2606189.0,586.966648,0.164262,0.024697,4397.889635,0.053361,2.0606,0.022959,0.464089
1,Brazil,212559400.0,1444733000000.0,2989432000000.0,1293380.0,640.30928,0.027606,0.017316,14063.982505,0.026482,6.084792,0.025046,3.012378
2,China,1402112000.0,14722730000000.0,23009780000000.0,13389080.0,7314.84,0.182099,0.176458,16410.797797,0.274137,9.54922,0.286119,5.217015
3,Czechia,10698900.0,243530400000.0,409974600000.0,128246.5,72.423118,0.00139,0.002919,38319.337663,0.002626,11.986894,0.002833,6.769214
4,European Union + Great Britain,515010000.0,17900400000000.0,21383320000000.0,4287960.0,3224.381668,0.066887,0.214544,41520.208299,0.087795,8.325975,0.126121,6.260814
5,India,1380004000.0,2622984000000.0,8443360000000.0,3614166.0,1376.58234,0.179228,0.031438,6118.35733,0.073999,2.618953,0.053845,0.99752
6,Japan + Korea,177616600.0,6695398000000.0,7412659000000.0,1994473.0,1498.822344,0.023068,0.080247,41734.043715,0.040836,11.229091,0.058626,8.438526
7,Oil states of Persian Gulf,181178000.0,1839105000000.0,4206790000000.0,2721666.0,1056.890886,0.02353,0.022042,23219.101983,0.055725,15.02206,0.04134,5.83344
8,Russia,144104100.0,1483498000000.0,3875686000000.0,2315331.0,1040.03548,0.018715,0.01778,26895.046298,0.047406,16.067073,0.040681,7.217252
9,Small islands,66124220.0,805540400000.0,1108568000000.0,346150.9,174.899182,0.008588,0.009655,16764.931656,0.007087,5.234858,0.006841,2.64501


- ``pop`` = population (the most recent values are for 2020 for most countries)
- ``gdp`` = nominal GDP (ditto)
- ``gdp_ppp`` = GDP PPP in constant 2017 international dollars (ditto)
- ``ghg`` = greenhouse gas emissions in 2018 in Gg CO2eq (gigagrams, i.e. thousands of tonnes, the default unit in the EDGAR database; includes CO2, CH4 and N2O only)
- ``electricity`` = electricity demand in TWh in 2019

- ratios are proportions of world totals
- ``gdp_per_capita`` = GDP PPP per capita, in constant 2017 international dollars
- ``ghg_per_capita`` = greenhouse gas emissions per capita in 2018 in t CO2eq
- ``electricity_per_capita`` = electricity demand per capita in 2019 in MWh
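
The unit conversions behind the per-capita columns can be verified by hand. The sketch below redoes them for Czechia, using the rounded values displayed in the table above, so the results agree with the table only approximately. The variable names are illustrative.

```python
# Czechia values as displayed in the table above (rounded)
pop = 10698900.0             # people
ghg_gg = 128246.5            # Gg CO2eq
electricity_twh = 72.423118  # TWh

# 1 Gg = 1,000 t, so multiply by 1,000 to get tonnes per person
ghg_per_capita_t = ghg_gg * 1000 / pop

# 1 TWh = 1,000,000 MWh, so multiply by 1e6 to get MWh per person
electricity_per_capita_mwh = electricity_twh * 1e6 / pop
```

Both results match the ``ghg_per_capita`` and ``electricity_per_capita`` values in the Czechia row to about three decimal places.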

In [24]:
agg.to_csv('../outputs/actors-in-climate-negotiations-regions.csv', index=False)