# Running CEDS Scenarios

This notebook documents how to read, check, reshape and convert CEDS-format emissions data so that it can be run with MAGICC.

In [1]:
%matplotlib inline

from os import listdir
from os.path import join, dirname
from pprint import pprint

import pandas as pd
import pyam
import pymagicc
from pymagicc.io import MAGICCData
import matplotlib.pyplot as plt
plt.style.use('bmh') 

import expectexception


In [2]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

In [3]:
TEST_DATA_PATH = join("..", "tests", "test_data")

## Reading in a CEDS csv

To read CEDS CSV files, we use the `pyam` library, which is designed for working with IAM scenario data of this kind.

In [4]:
def read_ceds_csv(file_to_read):
    return pyam.IamDataFrame(
        data=file_to_read,
        encoding="utf-8"
    )

ceds_pyam_df = read_ceds_csv(join(TEST_DATA_PATH, "ceds-format-example.csv"))
ceds_pyam_df  # this just shows the type of ceds_pyam_df
ceds_pyam_df.data  # this returns the underlying DataFrame

INFO:root:Reading `../tests/test_data/ceds-format-example.csv`


<pyam.core.IamDataFrame at 0x1135b3be0>

,model,scenario,region,variable,unit,year,value
1138,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,R5ASIA,Emissions|BC,Mt BC/yr,2015,2
1229,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,R5LAM,Emissions|BC,Mt BC/yr,2015,5
1312,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,R5MAF,Emissions|BC,Mt BC/yr,2015,9
1395,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,R5OECD,Emissions|BC,Mt BC/yr,2015,2
1486,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,R5REF,Emissions|BC,Mt BC/yr,2015,10
1577,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,World,Emissions|BC,Mt BC/yr,2015,10
3414,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,R5ASIA,Emissions|BC,Mt BC/yr,2020,7
3505,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,R5LAM,Emissions|BC,Mt BC/yr,2020,9
3588,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,R5MAF,Emissions|BC,Mt BC/yr,2020,9
3671,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,R5OECD,Emissions|BC,Mt BC/yr,2020,8
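
The table above is in "long" format, with one row per model, scenario, region, variable and year. As a rough illustration of how such a table relates to a "wide" CSV with one column per year, here is a minimal sketch using plain pandas. It assumes the input uses the wide layout shown in the toy string below; the real layout of `ceds-format-example.csv` is not shown in this notebook, and `pyam` handles the conversion for us.

```python
import io

import pandas as pd

# Toy data in an ASSUMED wide layout (one column per year); illustrative only.
wide_csv = io.StringIO(
    "Model,Scenario,Region,Variable,Unit,2015,2020\n"
    "M,S,World,Emissions|BC,Mt BC/yr,10,4\n"
    "M,S,R5ASIA,Emissions|BC,Mt BC/yr,2,7\n"
)

wide_df = pd.read_csv(wide_csv)
# Lower-case the column names to match the long table shown above.
wide_df.columns = [str(c).lower() for c in wide_df.columns]

id_cols = ["model", "scenario", "region", "variable", "unit"]
# Turn the year columns into rows: one (year, value) pair per row.
long_df = wide_df.melt(id_vars=id_cols, var_name="year", value_name="value")
long_df["year"] = long_df["year"].astype(int)
```

Each wide row becomes one long row per year column, so the two toy rows above yield four rows.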


## Checking an `IamDataFrame`

With an `IamDataFrame`, we can check that the sum of a given variable's sub-categories equals its declared total and that the sum over regions gives the world total.

The next cell is reserved for a demonstration using the `check_internal_consistency` method.

In [5]:
# show check_internal_consistency method here
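
Until that cell is filled in, the sub-category part of the check can be sketched with plain pandas on a long-format table like `ceds_pyam_df.data`. This is not pyam's implementation: the function name `find_inconsistent_totals`, the tolerance and the toy data are all hypothetical, and only direct children of the total are summed.

```python
import pandas as pd

INDEX_COLS = ["model", "scenario", "region", "unit", "year"]


def find_inconsistent_totals(long_df, total_var, tol=1e-9):
    """Return declared-minus-summed differences that exceed `tol`."""
    variables = long_df["variable"]
    # Direct children have exactly one more "|" than the total variable.
    child_depth = total_var.count("|") + 1
    is_direct_child = variables.str.startswith(total_var + "|") & (
        variables.str.count(r"\|") == child_depth
    )
    child_sum = long_df[is_direct_child].groupby(INDEX_COLS)["value"].sum()
    declared = long_df[variables == total_var].set_index(INDEX_COLS)["value"]
    # Rows without any sub-categories produce NaN and are ignored.
    difference = (declared - child_sum).dropna()
    return difference[difference.abs() > tol]


# Hypothetical toy data: 2015 is consistent, 2020 is off by 1.
toy_df = pd.DataFrame(
    [
        ["M", "S", "World", "Emissions|BC", "Mt BC/yr", 2015, 10],
        ["M", "S", "World", "Emissions|BC|Energy Sector", "Mt BC/yr", 2015, 6],
        ["M", "S", "World", "Emissions|BC|Waste", "Mt BC/yr", 2015, 4],
        ["M", "S", "World", "Emissions|BC", "Mt BC/yr", 2020, 7],
        ["M", "S", "World", "Emissions|BC|Energy Sector", "Mt BC/yr", 2020, 3],
        ["M", "S", "World", "Emissions|BC|Waste", "Mt BC/yr", 2020, 3],
    ],
    columns=["model", "scenario", "region", "variable", "unit", "year", "value"],
)

inconsistent = find_inconsistent_totals(toy_df, "Emissions|BC")
```

The same pattern, grouping over everything except `region`, would give the regional check against the `World` total.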

## Reshaping an `IamDataFrame`

Here we show how to reshape an `IamDataFrame` into the format expected by `openscm` so that we can then write the data to file.

Note: we normally want to take this step last, after we have done all our aggregation etc., as it means that we no longer have an `IamDataFrame` and can't use all the helpful tools it provides any more.

In [23]:
def reshape_pyam_df(pyam_df):
    raw_df = pyam_df.data
    
    reindexed_df = raw_df.set_index(
        ["model", "scenario", "region", "variable", "unit", "year"]
    ).unstack().T
    
    reindexed_df.index = reindexed_df.index.get_level_values("year")
    reindexed_df.index.name = "YEAR"
    
    models = reindexed_df.columns.get_level_values("model")
    scenarios = reindexed_df.columns.get_level_values("scenario")
    regions = reindexed_df.columns.get_level_values("region")
    variables = reindexed_df.columns.get_level_values("variable")
    units = reindexed_df.columns.get_level_values("unit")
    todos = ["SET"] * len(units)
    
    reindexed_df.columns = pd.MultiIndex.from_arrays(
        [models, scenarios, variables, todos, units, regions],
        names=("MODEL", "SCENARIO", "VARIABLE", "TODO", "UNITS", "REGION"),
    )
    
    return reindexed_df
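
To see what the `set_index(...).unstack().T` chain in `reshape_pyam_df` does, it may help to run it on a tiny long-format table. The toy values below are made up for illustration; the column names match those of `pyam_df.data`.

```python
import pandas as pd

# Hypothetical long-format data: two variables, two years each.
long_df = pd.DataFrame(
    {
        "model": ["M"] * 4,
        "scenario": ["S"] * 4,
        "region": ["World"] * 4,
        "variable": ["Emissions|BC", "Emissions|BC", "Emissions|OC", "Emissions|OC"],
        "unit": ["Mt BC/yr", "Mt BC/yr", "Mt OC/yr", "Mt OC/yr"],
        "year": [2015, 2020, 2015, 2020],
        "value": [10, 4, 3, 5],
    }
)

id_cols = ["model", "scenario", "region", "variable", "unit", "year"]
# unstack() moves the last index level (year) into the columns;
# transposing then leaves years as rows and one column per timeseries.
wide_df = long_df.set_index(id_cols).unstack().T
# Drop the leftover "value" level so that the row index is just the year.
wide_df.index = wide_df.index.get_level_values("year")

oc_series = wide_df[("M", "S", "World", "Emissions|OC", "Mt OC/yr")]
```

The result has one row per year and one column per (model, scenario, region, variable, unit) combination, which is the layout `reshape_pyam_df` then relabels.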

In [24]:
reshape_pyam_df(ceds_pyam_df)

MODEL,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,...,NAME-MODEL-2,NAME-MODEL-2,NAME-MODEL-2,NAME-MODEL-2,NAME-MODEL-2,NAME-MODEL-2,NAME-MODEL-2,NAME-MODEL-2,NAME-MODEL-2,NAME-MODEL-2
SCENARIO,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,...,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS
VARIABLE,Emissions|BC,Emissions|BC|Agricultural Waste Burning,Emissions|BC|Energy Sector,Emissions|BC|Forest Burning,Emissions|BC|Grassland Burning,Emissions|BC|Industrial Sector,Emissions|BC|Peat Burning,Emissions|BC|Residential Commercial Other,Emissions|BC|Transportation Sector,Emissions|BC|Waste,...,Emissions|VOC|Energy Sector,Emissions|VOC|Forest Burning,Emissions|VOC|Grassland Burning,Emissions|VOC|Industrial Sector,Emissions|VOC|International Shipping,Emissions|VOC|Peat Burning,Emissions|VOC|Residential Commercial Other,Emissions|VOC|Solvents Production and Application,Emissions|VOC|Transportation Sector,Emissions|VOC|Waste
TODO,SET,SET,SET,SET,SET,SET,SET,SET,SET,SET,...,SET,SET,SET,SET,SET,SET,SET,SET,SET,SET
UNITS,Mt BC/yr,Mt BC/yr,Mt BC/yr,Mt BC/yr,Mt BC/yr,Mt BC/yr,Mt BC/yr,Mt BC/yr,Mt BC/yr,Mt BC/yr,...,Mt VOC/yr,Mt VOC/yr,Mt VOC/yr,Mt VOC/yr,Mt VOC/yr,Mt VOC/yr,Mt VOC/yr,Mt VOC/yr,Mt VOC/yr,Mt VOC/yr
REGION,R5ASIA,R5ASIA,R5ASIA,R5ASIA,R5ASIA,R5ASIA,R5ASIA,R5ASIA,R5ASIA,R5ASIA,...,World,World,World,World,World,World,World,World,World,World
YEAR,,,,,,,,,,,,,,,,,,,,,
2015,2,8,7,5,6,1,5,1,8,8,...,5,2,8,5,2,1,0,8,2,6
2020,7,1,4,2,7,9,5,8,2,1,...,10,2,5,4,4,6,4,6,9,7
2030,3,9,6,1,7,4,5,4,2,1,...,4,7,6,5,2,1,1,4,7,5
2040,10,8,6,10,4,9,8,8,4,8,...,2,0,1,4,8,0,8,1,8,10
2050,7,2,9,8,3,0,4,2,6,6,...,4,1,0,7,1,10,4,5,4,2
2060,0,9,1,10,2,5,0,8,10,8,...,8,5,9,0,9,8,6,3,6,10
2070,0,6,3,10,3,0,0,8,6,6,...,6,2,4,5,1,5,7,9,2,6
2080,2,1,6,0,4,6,6,3,9,7,...,2,7,9,5,2,6,3,1,4,0
2090,10,6,6,4,3,10,4,7,1,0,...,7,10,2,4,7,8,8,10,6,6
2100,8,8,4,4,1,3,1,2,0,2,...,7,3,8,0,1,6,3,9,5,5


## Converting IAM data to MAGICC data

The `pyam` library provides convenient ways of filtering its `IamDataFrame` objects, which are detailed in [its tutorial](https://github.com/IAMconsortium/pyam/blob/master/tutorial/pyam_first_steps.ipynb). Here we use them to help convert IAM data into the emissions variables, regions and units used by MAGICC.

In [25]:
tdf = ceds_pyam_df.filter(
    level=1,
    model="MODEL-NAME-HYPHENS",
    scenario="SCENARIO-A-B-CDE-2",
    region="World",
)
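
The `level` argument selects variables by their depth in the `|`-separated hierarchy. The pure-Python sketch below illustrates the idea. It assumes, as we understand pyam's behaviour, that the level is the number of `|` separators in the variable name and that strings such as `'1-'` and `'1+'` mean "at most" and "at least". The helper `matches_level` is hypothetical and not part of pyam.

```python
def matches_level(variable, level):
    """Return True if `variable` sits at the requested hierarchy depth.

    Hypothetical helper; the interpretation of `level` is an assumption.
    """
    depth = variable.count("|")
    if isinstance(level, int):
        return depth == level
    if level.endswith("-"):
        return depth <= int(level[:-1])
    if level.endswith("+"):
        return depth >= int(level[:-1])
    return depth == int(level)


variables = [
    "Emissions|BC",
    "Emissions|BC|Energy Sector",
    "Emissions|BC|Waste",
]
# level=1 keeps only the species totals.
totals = [v for v in variables if matches_level(v, 1)]
# Dropping everything at level '1-' leaves only the sectoral breakdown.
sectoral = [v for v in variables if not matches_level(v, "1-")]
```

Under this reading, `level=1` keeps totals such as `Emissions|BC`, while removing level `'1-'` leaves only the sectoral variables.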

In [34]:
def convert_ceds_to_openscm_variables(ceds_var):
    raw_ceds_var = ceds_var.replace("Emissions|", "")
    
    special_cases = {
        "Sulfur": "SOX",
    }
    
    if raw_ceds_var in special_cases:
        raw_var = special_cases[raw_ceds_var]
    else:
        raw_var = raw_ceds_var.replace("-", "").upper()
        
    return raw_var + "_EMIS"

def convert_ceds_to_openscm_world_table(pyam_df):
    magicc_df = pyam.IamDataFrame(data=pyam_df.data.copy())
    magicc_df = magicc_df.filter(
        level=1,
        region="World"
    )
    magicc_df.data.variable = magicc_df.data.variable.apply(convert_ceds_to_openscm_variables)
    magicc_df.data.region = magicc_df.data.region.str.upper()
    return magicc_df

In [36]:
openscm_world_df = convert_ceds_to_openscm_world_table(ceds_pyam_df)
openscm_world_df.head()
openscm_world_df.variables()

,model,scenario,region,variable,unit,year,value
1577,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,WORLD,BC_EMIS,Mt BC/yr,2015,10
3853,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,WORLD,BC_EMIS,Mt BC/yr,2020,4
6129,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,WORLD,BC_EMIS,Mt BC/yr,2030,4
8405,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,WORLD,BC_EMIS,Mt BC/yr,2040,4
10681,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,WORLD,BC_EMIS,Mt BC/yr,2050,6


0            BC_EMIS
1          C2F6_EMIS
2          CCL4_EMIS
3           CF4_EMIS
4         CFC11_EMIS
5        CFC113_EMIS
6        CFC114_EMIS
7        CFC115_EMIS
8         CFC12_EMIS
9           CH3_EMIS
10      CH3CCL3_EMIS
11        CH3CL_EMIS
12          CH4_EMIS
13           CO_EMIS
14          CO2_EMIS
15     HCFC141B_EMIS
16     HCFC142B_EMIS
17       HCFC22_EMIS
18          HFC_EMIS
19    HALON1202_EMIS
20    HALON1211_EMIS
21    HALON1301_EMIS
22    HALON2402_EMIS
23          N2O_EMIS
24          NH3_EMIS
25          NOX_EMIS
26           OC_EMIS
27          SF6_EMIS
28          SOX_EMIS
29          VOC_EMIS
Name: variable, dtype: object

In [37]:
reshape_pyam_df(openscm_world_df)

MODEL,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,MODEL-NAME-HYPHENS,...,NAME-MODEL-2,NAME-MODEL-2,NAME-MODEL-2,NAME-MODEL-2,NAME-MODEL-2,NAME-MODEL-2,NAME-MODEL-2,NAME-MODEL-2,NAME-MODEL-2,NAME-MODEL-2
SCENARIO,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,SCENARIO-A-B-CDE-2,...,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS,SCENARIO-NAME-HYPHENS
VARIABLE,BC_EMIS,C2F6_EMIS,CCL4_EMIS,CF4_EMIS,CFC113_EMIS,CFC114_EMIS,CFC115_EMIS,CFC11_EMIS,CFC12_EMIS,CH3CCL3_EMIS,...,HCFC142B_EMIS,HCFC22_EMIS,HFC_EMIS,N2O_EMIS,NH3_EMIS,NOX_EMIS,OC_EMIS,SF6_EMIS,SOX_EMIS,VOC_EMIS
TODO,SET,SET,SET,SET,SET,SET,SET,SET,SET,SET,...,SET,SET,SET,SET,SET,SET,SET,SET,SET,SET
UNITS,Mt BC/yr,kt C2F6/yr,kt CCl4/yr,kt CF4/yr,kt CFC-113/yr,kt CFC-114/yr,kt CFC-115/yr,kt CFC-11/yr,kt CFC-12/yr,kt CH3CCl3/yr,...,kt HCFC-142b/yr,kt HCFC-22/yr,Mt CO2-equiv/yr,kt N2O/yr,Mt NH3/yr,Mt NOx/yr,Mt OC/yr,kt SF6/yr,Mt SO2/yr,Mt VOC/yr
REGION,WORLD,WORLD,WORLD,WORLD,WORLD,WORLD,WORLD,WORLD,WORLD,WORLD,...,WORLD,WORLD,WORLD,WORLD,WORLD,WORLD,WORLD,WORLD,WORLD,WORLD
YEAR,,,,,,,,,,,,,,,,,,,,,
2015,10,10,5,8,6,2,5,10,4,1,...,1,3,8,3,10,2,10,7,4,5
2020,4,8,3,9,6,10,10,1,7,6,...,2,6,5,3,4,0,4,2,1,0
2030,4,7,4,2,8,0,5,7,7,2,...,4,0,3,0,3,7,9,1,5,5
2040,4,1,5,8,9,0,4,3,5,0,...,9,0,7,3,2,5,4,1,5,0
2050,6,7,1,2,7,1,10,0,7,7,...,8,7,6,6,1,9,3,5,9,8
2060,1,1,6,8,6,10,5,10,0,1,...,1,4,3,10,8,10,5,0,6,8
2070,4,2,5,2,0,10,7,3,0,0,...,7,3,4,3,8,4,0,0,5,7
2080,9,2,8,8,3,2,7,7,6,7,...,8,8,9,2,9,10,1,0,1,5
2090,1,8,6,7,2,9,3,6,8,4,...,6,5,8,1,5,4,1,7,4,2
2100,2,8,8,6,6,2,7,4,0,10,...,3,4,2,10,4,0,7,4,1,9


## WIP from here down

In [66]:
# ceds openscm mapping
ceds_openscm_var_mapping = {
    "AIR": ["Aircraft"],
    "SHIP": ["International Shipping"],
    "AFOLULUC": ["Agricultural Waste Burning", "Agriculture", "Forest Burning", "Grassland Burning", "Peat Burning", "Aggregate - Agriculture and LUC"],
    "FOSSIL": ["Energy Sector", "Industrial Sector", "Residential Commercial Other", "Solvents Production and Application", "Transportation Sector", "Waste"]
}
# need better name for this
TMP_INDEX = ['model', 'scenario', 'region', 'year', 'unit']

In [121]:
def convert_ceds_to_openscm_regional_sectoral_table(pyam_df):
    magicc_df = pyam.IamDataFrame(data=pyam_df.data.copy())
    magicc_df = magicc_df.filter(
        level='1-', 
        keep=False
    )
#     filter(
#         region="World", 
#         keep=False
#     )
    
    handled_vars = []
    metadata = {}
    output_df = pd.DataFrame()
    for variable in magicc_df.variables():
        base_var = "|".join(variable.split("|")[:2])
        
        for category, suffixes in ceds_openscm_var_mapping.items():
            openscm_var = "{}_{}".format(
                convert_ceds_to_openscm_variables(variable.split("|")[1]),
                category,
            )
            if openscm_var in handled_vars:
                continue
            handled_vars.append(openscm_var)
            
            contrib_vars = ["{}|{}".format(base_var, s) for s in suffixes]
            
            var_cat_df = magicc_df.data[magicc_df.data.variable.isin(contrib_vars)]
            var_cat_df = pd.DataFrame(var_cat_df.groupby(TMP_INDEX).sum()['value'])
            var_cat_df = pd.concat([var_cat_df], keys=[openscm_var], names=['variable'])
            
            output_df = pd.concat([output_df, var_cat_df])
            metadata[openscm_var] = "Sum of {}".format(", ".join(contrib_vars))
    
    return output_df, metadata
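
As a possible alternative to the nested loops above, the same sectoral sums can be sketched with a single `groupby` by first mapping each sector name to its category. This is only an illustration on made-up data, not a drop-in replacement: the function name `aggregate_sectors` and the inverted lookup `sector_to_category` are hypothetical, and no metadata is produced.

```python
import pandas as pd

# Copied from the mapping cell above.
ceds_openscm_var_mapping = {
    "AIR": ["Aircraft"],
    "SHIP": ["International Shipping"],
    "AFOLULUC": ["Agricultural Waste Burning", "Agriculture", "Forest Burning",
                 "Grassland Burning", "Peat Burning", "Aggregate - Agriculture and LUC"],
    "FOSSIL": ["Energy Sector", "Industrial Sector", "Residential Commercial Other",
               "Solvents Production and Application", "Transportation Sector", "Waste"],
}
# Invert the mapping: sector name -> category.
sector_to_category = {
    sector: category
    for category, sectors in ceds_openscm_var_mapping.items()
    for sector in sectors
}


def aggregate_sectors(long_df):
    """Sum sectoral variables into <SPECIES>_EMIS_<CATEGORY> timeseries."""
    # Columns: 0 = "Emissions", 1 = species, 2 = sector (missing for totals).
    parts = long_df["variable"].str.split("|", expand=True)
    species = parts[1].replace({"Sulfur": "SOX"})
    species = species.str.replace("-", "", regex=False).str.upper()
    category = parts[2].map(sector_to_category)
    # Rows without a mapped sector get a missing name and are dropped.
    renamed = long_df.assign(variable=species + "_EMIS_" + category)
    renamed = renamed.dropna(subset=["variable"])
    index_cols = ["variable", "model", "scenario", "region", "year", "unit"]
    return renamed.groupby(index_cols)["value"].sum()


# Hypothetical toy data, including a total row that should be ignored.
toy_df = pd.DataFrame(
    [
        ["M", "S", "World", "Emissions|BC", "Mt BC/yr", 2015, 9],
        ["M", "S", "World", "Emissions|BC|Aircraft", "Mt BC/yr", 2015, 4],
        ["M", "S", "World", "Emissions|BC|Energy Sector", "Mt BC/yr", 2015, 2],
        ["M", "S", "World", "Emissions|BC|Waste", "Mt BC/yr", 2015, 3],
        ["M", "S", "World", "Emissions|Sulfur|Waste", "Mt SO2/yr", 2015, 1],
    ],
    columns=["model", "scenario", "region", "variable", "unit", "year", "value"],
)
aggregated = aggregate_sectors(toy_df)
```

Building the lookup once avoids the `handled_vars` bookkeeping, at the cost of silently dropping any sector that is missing from the mapping.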

In [128]:
convert_ceds_to_openscm_regional_sectoral_table(ceds_pyam_df)[0].head(100)

variable,model,scenario,region,year,unit,value
BC_EMIS_AIR,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,World,2015,Mt BC/yr,4
BC_EMIS_AIR,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,World,2020,Mt BC/yr,2
BC_EMIS_AIR,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,World,2030,Mt BC/yr,4
BC_EMIS_AIR,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,World,2040,Mt BC/yr,3
BC_EMIS_AIR,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,World,2050,Mt BC/yr,4
BC_EMIS_AIR,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,World,2060,Mt BC/yr,2
BC_EMIS_AIR,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,World,2070,Mt BC/yr,5
BC_EMIS_AIR,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,World,2080,Mt BC/yr,2
BC_EMIS_AIR,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,World,2090,Mt BC/yr,5
BC_EMIS_AIR,MODEL-NAME-HYPHENS,SCENARIO-A-B-CDE-2,World,2100,Mt BC/yr,6
