## Contents of this notebook

- 0: Load packages, scripts etc.
- 1: Construct trees. Uses the technology data stored in an .xlsx-file to construct dictionaries and other objects that fully characterize the production tree. These are then combined into an actual tree object using the `nesting_tree.nesting_tree` class.
- 2: Adding parameters to the database. Only some of the share-parameters etc. can be deduced directly from technology data. The rest of these as well as starting values for endogenous and exogenous variables must be collected in a database.
- 3: The model. Uses the tree object as well as the prepared database to construct a model using the *gmspython* class and its childclass *abate*.
The script ends by exporting the model as a pickle to be loaded in the calibration script.

### Loads

In [1]:
clean_up=True # removes gams-related files in work-folder if true
%run StdPackages.ipynb
os.chdir(py['main'])
import abatement, techdata_to_tree, sys, ShockFunction
os.chdir(curr)
data_folder = os.getcwd()+'\\Data'
gams_folder = data_folder + "\\..\\gamsmodels\\Main"
#Functions
def flatten_list(list_):
    return [item for sublist in list_ for item in sublist]

The file_gams_py_gdb0.gdx is still active and was not deleted.
The file_gams_py_gdb1.gdx is still active and was not deleted.


# **1: Construct trees**

#### Load technology data.
This runs the script located in the file *techdata_to_tree.py*:

In [2]:
inputfile = "techdata_newID.xlsx"
output = techdata_to_tree.load_techcats(pd.read_excel(data_folder + "/" + inputfile, sheet_name=["inputdisp", "endofpipe", "inputprices"]))

#### The output of the code is a dictionary with three keys: the two modules (`ID` and `EOP`) and `inputprices`, a list of input prices that is also stored in the xlsx-file:

In [3]:
output.keys()

dict_keys(['ID', 'inputprices', 'EOP'])

The two different modules correspond to the two types of technology catalogs that the model can handle.\
*Input-displacing* (ID) and *End of pipe* (EOP).

In [4]:
modules = ["ID", "EOP"]

ID contains a dictionary related to input-displacing technologies, EOP does so for end of pipe (we use this naming convention throughout).\
They contain the following keys:

In [5]:
for module in modules:
    print("Keys of " + module + ": ", output[module].keys(), "\n")

Keys of ID:  dict_keys(['techs_inputs', 'techs', 'components', 'upper_categories', 'mu', 'Q2P', 'unit_costs', 'current_coverages', 'coverage_potentials', 'IO_tech', 'IO_tech_inputs', 'baseline_U_inputs']) 

Keys of EOP:  dict_keys(['techs_inputs', 'techs', 'components', 'upper_categories', 'mu', 'Q2P', 'unit_costs', 'current_coverages', 'coverage_potentials']) 



As an example, "techs" shows, for each technology, which technology goods it produces, e.g. for input-displacing:

In [6]:
output["ID"]["techs"]

{'ID_1': ['U_ID_1_1', 'U_ID_1_2', 'U_ID_1_3'],
 'ID_2': ['U_ID_2_1', 'U_ID_2_2', 'U_ID_2_3'],
 'ID_3': ['U_ID_3_1', 'U_ID_3_2'],
 'ID_4': ['U_ID_4_1', 'U_ID_4_2', 'U_ID_4_3'],
 'ID_5': ['U_ID_5_1', 'U_ID_5_2']}

And "upper_categories" shows the mapping between energy services (E) and their respective components (C) for ID.\
All of the trees in the `output` object are dictionaries where the keys are nodes in the tree and the values are lists of connected branches.

In [7]:
output["ID"]["upper_categories"]

{'EL': ['C_EL_1', 'C_EL_2', 'C_EL_3', 'C_EL_4', 'C_EL_5', 'C_EL_base'],
 'EH': ['C_EH_1', 'C_EH_2', 'C_EH_3', 'C_EH_base'],
 'ER': ['C_ER_1', 'C_ER_2', 'C_ER_base']}
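
To make the dictionary convention concrete, here is a minimal sketch in plain Python that does not depend on the `nesting_tree` package. The helper `tree_to_pairs` is hypothetical (it is not part of the package) and the tree is a shortened excerpt of the one printed above. It flattens a tree dictionary into (branch, node) pairs, which is the orientation used for the `n`/`nn` levels of $\mu$ later in the notebook.

```python
# Shortened excerpt of the ID "upper_categories" tree printed above.
tree = {
    "EH": ["C_EH_1", "C_EH_base"],
    "ER": ["C_ER_1", "C_ER_base"],
}


def tree_to_pairs(tree):
    """Flatten {node: [branches]} into a list of (branch, node) pairs.

    Hypothetical helper for illustration only; not part of nesting_tree.
    """
    return [(branch, node) for node, branches in tree.items() for branch in branches]


pairs = tree_to_pairs(tree)
print(pairs)
```

Each pair identifies one edge of the tree, so the number of pairs equals the total number of branches across all nodes.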

... and the mapping between emission types M ($CO_2$, $SO_2$ etc.) and the components C.\
Note importantly that this mapping does not constitute an actual part of the tree, because components are the outputs of the EOP sector.

In [8]:
output["EOP"]["upper_categories"]

{'CO2': [], 'SO2': ['C_SO2_1'], 'NOX': []}

### Initialize nesting tree, call it "Abatement"
The construction of the production tree starts with an initialization of the *nesting_tree* class:

In [9]:
nt = nesting_tree.nesting_tree(name='Abatement')

The "trees" attribute starts off empty, reflecting that no (sub)trees have been added yet. 

In [10]:
nt.trees

{}

#### The following cells add each of the subtrees to the tree-object, "nt"

##### First, input-displacing subtrees

Upper part, connecting energy services to components (this is an input-tree, which is the default, so we do not need to supply the `type_io` keyword).\
Again, we use the naming convention that a prefix reflects whether the tree is related to input-displacement (ID) or end of pipe (EOP).
The prefix is followed by an underscore and then letters reflecting which elements in the tree are connected.

The first tree connects energy services (E) to components (C), related to input-displacing technologies (ID)

In [11]:
nt.add_tree(output["ID"]["upper_categories"], tree_name = 'ID_EC', **{"type_f":"CES_norm"})

Before proceeding to the next tree, let's print some information from this tree.\
First, we see that the tree has been added to the list of trees in the "aggregate" tree, nt:

In [12]:
nt.trees

{'ID_EC': <nesting_trees.nt at 0x168879dc808>}

Second, information about the tree is automatically added when the tree is added, e.g. the "type_f" attribute states that the functional form in this subtree is constant elasticity of substitution (CES)

In [13]:
nt.trees["ID_EC"].__dict__

{'name': 'ID_EC',
 'tree': {'EL': ['C_EL_1',
   'C_EL_2',
   'C_EL_3',
   'C_EL_4',
   'C_EL_5',
   'C_EL_base'],
  'EH': ['C_EH_1', 'C_EH_2', 'C_EH_3', 'C_EH_base'],
  'ER': ['C_ER_1', 'C_ER_2', 'C_ER_base']},
 'type_io': 'input',
 'version': 'std',
 'n': 'n',
 'nn': 'nn',
 'nnn': 'nnn',
 'map_': 'map_ID_EC',
 'kno': 'kno_ID_EC',
 'bra': 'bra_ID_EC',
 'inp': 'inp_ID_EC',
 'out': 'out_ID_EC',
 'temp_namespace': None,
 'type_f': 'CES_norm',
 'database': <DataBase.GPM_database at 0x168879dc608>}

However, even though the namespace includes e.g. "map_":"map_ID_EC", the database of the tree still does not contain this mapping. This requires running the run() method.\
Instead of doing that for each tree individually, we add all trees and then call the run_all() method, which calls run() on each tree.

Next, we add the middle part, connecting components C to their technology goods U

In [14]:
nt.add_tree(output["ID"]["components"], tree_name = "ID_CU", **{"type_f":"MNL"})

The two remaining parts are the bottom ones. \
First, the one that connects technologies to their outputs (technology goods). That these are outputs is specified explicitly by using the `type_io` keyword.\
Second, the one that connects technologies to their inputs (non-capital inputs X, and capital K)

In [15]:
nt.add_tree(output["ID"]["techs"], tree_name="ID_TU", **{'type_io': 'output', 'type_f': 'CET_norm'})
nt.add_tree(output["ID"]["techs_inputs"], tree_name="ID_TX")

Baseline components and technology goods, and their respective sets of inputs, are also part of the aggregate tree. We distinguish between two kinds of baseline technology goods. Those with their own inputs are the baseline technologies that come from knowing which energy mix a (set of) technologies replaces. The remaining baseline technology goods (and all baseline components by construction) are outputs of the "IO technology". This IO technology in turn draws on all inputs from the economy. This is the one we calibrate to make sure we replicate IO data.

In [16]:
nt.add_tree(output["ID"]["IO_tech"], tree_name="ID_IOCU", **{"type_io":"output", "type_f":"CET_norm"})
nt.add_tree(output["ID"]["IO_tech_inputs"], tree_name="ID_IOX")
nt.add_tree(output["ID"]["baseline_U_inputs"], tree_name="ID_UbaseX")

#### Next, we add the end of pipe subtrees.
First, the one connecting components (which are the final product in the end-of-pipe tree) to the technology goods from which they are created:

In [17]:
nt.add_tree(output["EOP"]["components"], tree_name = "EOP_CU")

Second, the bottom part. This consists of technologies and their outputs and inputs respectively, as with the input-displacing technologies:

In [18]:
nt.add_tree(output["EOP"]["techs"], tree_name="EOP_TU", **{'type_io': 'output', 'type_f': 'CET_norm'})
nt.add_tree(output["EOP"]["techs_inputs"], tree_name="EOP_TX")

##### Now, all trees have been added:

In [19]:
nt.trees

{'ID_EC': <nesting_trees.nt at 0x168879dc808>,
 'ID_CU': <nesting_trees.nt at 0x168879ea388>,
 'ID_TU': <nesting_trees.nt at 0x168879f74c8>,
 'ID_TX': <nesting_trees.nt at 0x168879f7488>,
 'ID_IOCU': <nesting_trees.nt at 0x16887a07788>,
 'ID_IOX': <nesting_trees.nt at 0x168879f3908>,
 'ID_UbaseX': <nesting_trees.nt at 0x168879f3a48>,
 'EOP_CU': <nesting_trees.nt at 0x168879dcb08>,
 'EOP_TU': <nesting_trees.nt at 0x168879de108>,
 'EOP_TX': <nesting_trees.nt at 0x168879de448>}


The next step is to use the method `run_all`. This runs through a number of steps, where it sets up sets/subsets/mappings identifying which elements e.g. are inputs, intermediate goods, and outputs in the aggregate tree, i.e. the combination of the subtrees we just added. *Tutorial_nesting_tree* includes a brief review of these.\
There are still some of the objects in `output` that we have not used. We will return to these later as they become relevant.

In [20]:
nt.run_all()

This constructs the aggregate tree. \
For a few highlights, let's check:
1. The outputs of the aggregate tree (aggregate simply refers to the combination of all the individual (sub)trees).\
This includes energy services (from ID) and components (from EOP)

In [21]:
list(nt.database.series["out"].vals)

['ER', 'EH', 'EL', 'C_SO2_1']

2. The outputs of a particular tree, say EOP_TU. Note that in this particular case, the reported list is empty. This is because an 'output' refers to whether it is an output of the aggregate tree and not just this particular one.

In [22]:
list(nt.trees["EOP_TU"].database.series["out_EOP_TU"].vals)

[]

3. Likewise, we can check the inputs of EOP_TU\
This reports the Us as inputs, which they are not. This is not a problem, because the tree is still constructed correctly. But note that the Us are not inputs of the aggregate tree (where they are intermediates), nor of the TU-tree, where they are outputs (they are produced by technologies).

In [23]:
nt.trees["EOP_TU"].database.series["inp_EOP_TU"].vals

Index(['U_EOP_t1_1'], dtype='object', name='n')
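
The distinction between inputs, intermediates and outputs of the aggregate tree rests on simple bookkeeping, which can be sketched in plain Python without the `nesting_tree` package. The function `classify` and the toy subtrees below are hypothetical and only illustrate the logic; they are not how the package computes `inp`, `out` and `int`. An element counts as an output of the aggregate tree if it is produced but never used, as an input if it is used but never produced, and as an intermediate if it is both.

```python
# Each subtree is (type_io, {node: [branches]}). Toy data that loosely follows
# the ID naming convention; not the real technology catalog.
subtrees = {
    "ID_EC": ("input", {"EL": ["C_EL_1"]}),
    "ID_CU": ("input", {"C_EL_1": ["U_ID_1_1"]}),
    "ID_TU": ("output", {"ID_1": ["U_ID_1_1"]}),
    "ID_TX": ("input", {"ID_1": ["ID_1_electricity", "ID_1_K"]}),
}


def classify(subtrees):
    """Split all elements into outputs, inputs and intermediates (illustrative)."""
    produced, used = set(), set()
    for type_io, tree in subtrees.values():
        for node, branches in tree.items():
            if type_io == "input":
                # Input tree: the node is produced from its branches.
                produced.add(node)
                used.update(branches)
            else:
                # Output tree: the node (a technology) produces its branches.
                used.add(node)
                produced.update(branches)
    return {
        "out": produced - used,
        "inp": used - produced,
        "int": produced & used,
    }


sets = classify(subtrees)
print(sets)
```

In this toy example the technology good `U_ID_1_1` ends up as an intermediate: it is produced in `ID_TU` and used in `ID_CU`.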

Lastly, let's print the objects that the `nt` class instance contains:

In [24]:
nt.__dict__

{'name': 'Abatement',
 'version': 'std',
 'trees': {'ID_EC': <nesting_trees.nt at 0x168879dc808>,
  'ID_CU': <nesting_trees.nt at 0x168879ea388>,
  'ID_TU': <nesting_trees.nt at 0x168879f74c8>,
  'ID_TX': <nesting_trees.nt at 0x168879f7488>,
  'ID_IOCU': <nesting_trees.nt at 0x16887a07788>,
  'ID_IOX': <nesting_trees.nt at 0x168879f3908>,
  'ID_UbaseX': <nesting_trees.nt at 0x168879f3a48>,
  'EOP_CU': <nesting_trees.nt at 0x168879dcb08>,
  'EOP_TU': <nesting_trees.nt at 0x168879de108>,
  'EOP_TX': <nesting_trees.nt at 0x168879de448>},
 'database': <DataBase.GPM_database at 0x168879ea5c8>,
 'n': 'n',
 'nn': 'nn',
 'nnn': 'nnn',
 'inp': 'inp',
 'out': 'out',
 'int': 'int',
 'fg': 'fg',
 'wT': 'wT',
 'map_all': 'map_all',
 'kno_out': 'kno_out',
 'kno_inp': 'kno_inp',
 'prune_trees': {'OnlyQ', 'bra', 'inp', 'kno', 'out'}}

# **2: Adding parameters to the database**

The next step is to construct a database that contains all the share-parameters that we can deduce from technology data, as well as appropriate starting values for endogenous variables. These starting values will be set to the value that they would have if the entire tree was Leontief.
The end goal is to make sure the database contained by `nt` includes all these parameters and starting values. 
To do so, we construct an empty database, add the share-parameters and starting values to this, and then finally merge that database with the one already in `nt`.

The GPM database of the nesting_tree instance (`nt`) does not contain the $\mu$ parameters calculated from technology data yet.\
The $\mu$-parameters are stored under the key of the same name in `output`.\
Let's check out the database of `nt` now, to see that it does not include any symbol called $\mu$ yet:

In [25]:
print("The database nt.database is of type " + str(type(nt.database)))
print("Does the database include a symbol called mu? ") 
if "mu" in nt.database.series:
    print("YES!")
else:
    print("NO!")

The database nt.database is of type <class 'DataBase.GPM_database'>
Does the database include a symbol called mu? 
NO!


First we create an empty database:

In [26]:
db = DataBase.GPM_database()

We store the share-parameters from technology data in an object called `mu`. This does not contain all share-parameters of the tree (e.g. it does not include the share parameters of the inputs under baseline technology goods, or the share parameters for technology goods under components).

In [27]:
mu = pd.concat([output["ID"]["mu"], output["EOP"]["mu"]])  # Series.append was removed in pandas 2.0

Check out the mu parameters (prints just the five first entries here).\
Note: The mu-object is a series with a multiindex, where the first level is called `n` and the second level `nn`. The first level contains branches and the second level contains nodes.

In [28]:
mu.head()

n                 nn  
ID_1_electricity  ID_1    0.475
ID_1_oil          ID_1    0.475
ID_1_K            ID_1    4.050
C_EL_1            EL      0.050
U_ID_1_1          ID_1    0.500
Name: mu, dtype: float64

#### Calculate starting values (corresponding to the solution if the entire tree was Leontief).
For demands/quantities, we use the fact that CES demand is given by
$$q_j = \mu_j \left(\frac{p_i}{p_j}\right)^\sigma q_i,$$ which reduces to $q_j = \mu_j q_i$ in the Leontief case ($\sigma = 0$). Here $i$ refers to a node and $j$ refers to a branch. The CET and MNL forms are identical under Leontief.\
For prices, we generally use that when we know the price of each branch under a node as well as the quantities of these branches, the zero profit condition can be rearranged to give us the price of the node:
$$ q_i p_i = \sum_j q_j p_j \quad \Leftrightarrow \quad  p_i = \frac{\sum_j q_j p_j}{q_i}$$
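
The two equations can be applied by hand to a single nest. The sketch below uses plain Python dictionaries; the share of `C_EL_1` matches the value printed in `mu.head()`, while the share of `C_EL_base` and both branch prices are hypothetical numbers chosen for illustration.

```python
# One nest: node EL with two branches (shares sum to one in this toy case).
mu_nest = {"C_EL_1": 0.05, "C_EL_base": 0.95}   # share parameters mu_j
q_node = 100.0                                   # quantity of the node, q_i
p_branch = {"C_EL_1": 2.0, "C_EL_base": 1.0}     # hypothetical branch prices p_j

# Leontief demand: q_j = mu_j * q_i
q_branch = {j: mu_j * q_node for j, mu_j in mu_nest.items()}

# Zero profits: p_i = sum_j(q_j * p_j) / q_i
p_node = sum(q_branch[j] * p_branch[j] for j in q_branch) / q_node

print(q_branch, p_node)
```

The node price is the quantity-weighted average of the branch prices, so revenue on the node equals the total cost of its branches.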

Output quantities are held in the `qS` object. Output quantities are held fixed in the partial equilibrium (the output prices are the variables that adjust), and we simply set them to `output_quantity` in this example.

In [29]:
output_quantity = 100
qS = pd.Series([], name="qS", dtype="float64")

`qS` is now an empty series to which we add the value of 100 for each output of the tree:

In [30]:
for out in nt.database.series["out"]:
    qS[out] = output_quantity

All other quantities are kept in the object `qD`, which we initialize:

In [31]:
qD = pd.Series([], name="qD", dtype="float64")

First, we calculate components' starting values for input-displacing (we did it for EOP with qS, because components are outputs there). \
We use the demand equation stated earlier (multiplication of share-parameter and relevant node/component). This calculates quantities for all components, including baseline components

In [32]:
for E in output["ID"]["upper_categories"]:
    for C in output["ID"]["upper_categories"][E]:
        qD[C] = mu.loc[(C, E)] * qS[E]

Next, we use that we have estimates of U to set these (current coverages split according to overlap), and then calculate the $\mu$s residually

In [33]:
for index, curr_coverage in output["ID"]["current_coverages"].items():  # iteritems() was removed in pandas 2.0
    qD[index[0]] = curr_coverage * qS[index[1]]

We then calculate the $\mu$s of technology goods (in their component nest) residually. This includes the share-parameters for the baseline technology goods $\bar U$. Since we still have not set the starting values for the quantities of these baseline technology goods, we set these using the share-parameters that are calculated here as well.

In [34]:
for C in output["ID"]["components"]:
    #The ':-1' leaves out the baseline technology good, which we handle separately afterwards
    nonbase_U = output["ID"]["components"][C][:-1]
    base_mu = 1
    for U in nonbase_U:
        mu[(U, C)] =  qD[U]/qD[C]
        base_mu -= mu[(U, C)]
    #Quantities of baseline U
    mu[(output["ID"]["components"][C][-1], C)] = base_mu
    qD[output["ID"]["components"][C][-1]] = base_mu * qD[C]
    if base_mu <= 0:
        print(C)
        print(nonbase_U)
        raise Exception("base_mu is not positive")
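
The residual calculation can be isolated in a small function. The sketch below is plain Python with made-up quantities, and `residual_shares` is a hypothetical helper rather than part of the notebook's packages. Unlike the cell above, it checks the sign of the baseline share before returning anything, so no invalid value is ever stored.

```python
def residual_shares(q_component, q_nonbase):
    """Return (shares of non-baseline goods, baseline share, baseline quantity).

    Hypothetical helper: shares are q_U / q_C, and the baseline good
    receives whatever is left so that all shares sum to one.
    """
    shares = {U: q / q_component for U, q in q_nonbase.items()}
    base_share = 1 - sum(shares.values())
    if base_share <= 0:
        raise ValueError("baseline share is not positive: " + str(base_share))
    return shares, base_share, base_share * q_component


# Made-up numbers: a component of size 5 covered by two technology goods.
shares, base_share, q_base = residual_shares(5.0, {"U_ID_1_1": 1.0, "U_ID_2_1": 1.5})
print(shares, base_share, q_base)
```

A non-positive baseline share means that the current coverages of the non-baseline goods already exhaust (or exceed) the component, which is what the exception in the cell above guards against.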


For end of pipe, we simply set the share-parameters of the technology goods under components as $1/N$ where $N$ is the number of branches/technology goods under a given component. Having set these, we can calculate the corresponding quantities using the CES Leontief demand equation.

In [35]:
for C in output["EOP"]["components"]:
    for U in output["EOP"]["components"][C]:
        mu.loc[(U, C)] = 1/len(output["EOP"]["components"][C])
        qD[U] = mu.loc[(U, C)] * qS[C]

Then, we calculate the starting values of technologies $\tau$ as the sum of their relevant $U$, since the CET function should be scale-preserving.\
The drawback here, for EOP only, is that the quantities of $U$ do not necessarily adhere to the relative sizes of their share-parameters (which are set based on the technology catalog). 

In [36]:
for module in modules:
    for tech in output[module]["techs"]:
        qD[tech] = qD[output[module]["techs"][tech]].sum()

Lastly, we set the starting values of the inputs $X$ that go into technologies. These again use the demand equation. The share-parameters of these were given directly from technology data.

In [37]:
for module in modules:
    techs = output[module]["techs_inputs"]
    for tech in techs:
        for inp in techs[tech]:
            qD[inp] = mu.loc[(inp, tech)] * qD[tech]

Set the quantity of the IO-technology as the sum of the quantities of the goods ($U_0$ and $C_0$) that it provides. Afterwards, calculate the relevant share parameters as the ratio of each good's quantity to the quantity of the IO-technology.

In [38]:
#Quantity of the IO technology
qD["IO_tech"] = 0
for base in output["ID"]["IO_tech"]["IO_tech"]:
    qD["IO_tech"] += qD[base]
#share parameters for each of the non-replacing baseline technologies:
for base in output["ID"]["IO_tech"]["IO_tech"]:
    mu[(base, "IO_tech")] = qD[base] / qD["IO_tech"]

##### Prices:

The next part contains the setting of starting values for prices.
The prices of inputs (the very bottom of the tree) are fully exogenous, whereas all other prices are endogenous. The prices of outputs (those whose quantities are contained in `qS`) are kept in the object `PbT` (refers to "prices before taxes"), whereas the rest of the prices are kept in `PwT`.

Objects for storing prices:

In [39]:
PwT = pd.Series([], name="PwT", dtype="float64")
PbT = pd.Series([], name="PbT", dtype="float64")

Prices of inputs are also stated in the catalog:

In [40]:
for module in modules:
    for t, inputs in output[module]["techs_inputs"].items():
        for inp in inputs:
            PwT[inp] = output["inputprices"][inp.split("_")[-1]]
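
The price lookup above relies on a naming convention: the last underscore-separated token of an input's name identifies the energy good whose price applies. A minimal sketch, with hypothetical price values and a hypothetical helper `price_of`:

```python
# Hypothetical input prices keyed by energy good / capital.
input_prices = {"electricity": 1.2, "oil": 0.8, "K": 0.1}


def price_of(input_name, prices):
    """Look up the price of an input from the last token of its name."""
    good = input_name.split("_")[-1]
    return prices[good]


print(price_of("ID_1_electricity", input_prices))
print(price_of("ID_1_K", input_prices))

# Caveat: a price key that itself contains an underscore would be cut short.
print("ID_2_natural_gas".split("_")[-1])
```

The last line shows the limit of the convention: an input priced under a key such as `natural_gas` would be looked up as `gas`, so price keys need to be free of underscores for this lookup to work.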

Add the prices of technologies: $p^\tau$. These correspond to the weighted average of the prices of its inputs.

In [41]:
for module in modules:
    for t, inputs in output[module]["techs_inputs"].items():
        PwT[t] = pd.concat([PwT[inputs], qD[inputs]], axis=1).product(axis=1).sum() / qD[t]

These prices are equal to unit costs by construction, which we check here for good measure. The differences between prices and unit costs are:

In [42]:
round(output["ID"]["unit_costs"].set_index("tech")["unit_cost"] - PwT[output["ID"]["unit_costs"]["tech"]], 2)

tech
ID_1    0.0
ID_2    0.0
ID_3    0.0
ID_4    0.0
ID_5   -0.0
dtype: float64
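
Why the technology price does not depend on the scale of the technology can be seen in a few lines of plain Python. With Leontief starting values, $q_j = \mu_j q_\tau$, so the weighted average collapses to $\sum_j \mu_j p_j$. The share parameters below are those printed for `ID_1` in `mu.head()`; the input prices are hypothetical.

```python
mu_inputs = {"ID_1_electricity": 0.475, "ID_1_oil": 0.475, "ID_1_K": 4.05}
prices = {"ID_1_electricity": 1.0, "ID_1_oil": 2.0, "ID_1_K": 0.1}  # hypothetical


def tech_price(q_tech):
    """Quantity-weighted average input price for a technology of size q_tech."""
    q_inputs = {j: m * q_tech for j, m in mu_inputs.items()}   # Leontief demand
    return sum(q_inputs[j] * prices[j] for j in q_inputs) / q_tech


# The same number computed directly from shares and prices.
cost_per_unit = sum(mu_inputs[j] * prices[j] for j in mu_inputs)

print(tech_price(1.0), tech_price(37.5), cost_per_unit)
```

All three printed numbers coincide up to rounding, which is consistent with the zero differences reported in the check above.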

For $p^U$, we simply assume $p^U = p^\tau$, which, since the share-parameters in the CET nest splitting $\tau$ into its produced technology goods sum to one, is consistent with zero profits (more generally, scale preservation ensures zero profits with this assumption).

In [43]:
for module in modules:
    for t in output[module]["techs"]:
        for U in output[module]["techs"][t]:
            PwT[U] = PwT[t]
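
The zero-profit claim can be verified in one line. Recall that the starting quantity of a technology was set to the sum of its technology goods, $q^\tau = \sum_U q^U$. With $p^U = p^\tau$ for every $U$ produced by $\tau$, the revenue from the technology goods is
$$\sum_U p^U q^U = p^\tau \sum_U q^U = p^\tau q^\tau,$$
which equals the value of the technology itself, so the CET split leaves no profits under these starting values.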

Prices on components for end of pipe should be allocated to PbT because that includes the prices of outputs:

In [44]:
for C, U_list in output["EOP"]["components"].items():
    PbT[C] = pd.concat([PwT[U_list], qD[U_list]], axis=1).product(axis=1).sum() / qS[C]

Insert baseline input quantities and baseline technology good prices (only relevant for ID since baselines do not exist in EOP).

This also (for now, should be set with input-output macro data later) sets the share-parameters of the inputs for the baseline technology goods. It assumes that the baseline uses all energy inputs in equal proportions and that these share-parameters sum to 1. Using these share-parameters, the corresponding quantities are added to `qD`.

In [45]:
inputs = output["ID"]["IO_tech_inputs"]["IO_tech"]
for inp in inputs:
    mu[(inp, "IO_tech")] = 1/len(inputs)
    PwT[inp] = output["inputprices"][inp.split("_")[-1]]
    qD[inp] = mu[(inp, "IO_tech")] * qD["IO_tech"]
#Price of the IO technology
PwT["IO_tech"] = pd.concat([PwT[inputs], qD[inputs]], axis=1).product(axis=1).sum() / qD["IO_tech"]

In [46]:
for base in output["ID"]["IO_tech"]["IO_tech"]:
    PwT[base] = PwT["IO_tech"] 

Quantities and prices of inputs to replacement-baseline-technologies and the cost of these as well:

In [47]:
for base_U, inputs in output["ID"]["baseline_U_inputs"].items():
    for inp in inputs:
#         mu[(inp, base_U)] = 1/len(inputs)    
        qD[inp] = mu[(inp, base_U)] * qD[base_U]
        PwT[inp] = output["inputprices"][inp.split("_")[-1]]
    PwT[base_U] =  pd.concat([PwT[inputs], qD[inputs]], axis=1).product(axis=1).sum() / qD[base_U]

Prices of components:

In [48]:
for C, U_list in output["ID"]["components"].items():
    PwT[C] = pd.concat([PwT[U_list], qD[U_list]], axis=1).product(axis=1).sum() / qD[C]

Prices of energy services ("upper categories" in ID). 

In [49]:
for E, C_list in output["ID"]["upper_categories"].items():
    PbT[E] = pd.concat([PwT[C_list], qD[C_list]], axis=1).product(axis=1).sum() / qS[E]

We now add the calculated values to the (still) empty database `db` instantiated earlier.

In [50]:
qS.index.name = "n"
qD.index.name = "n"
PwT.index.name = "n"
PbT.index.name = "n"
db["qS"] = qS
db["qD"] = qD
db["PwT"] = PwT
db["PbT"] = PbT
db["mu"] = mu

Merge `db` with the database attached to the nesting tree object `nt`:

In [51]:
DataBase.GPM_database.merge_dbs(nt.database, db, "first")

As a confirmation that this went successfully, we check that the database in `nt` contains a symbol called 'mu':

In [52]:
nt.database.series["mu"].vals.head()

n                 nn  
ID_1_electricity  ID_1    0.475
ID_1_oil          ID_1    0.475
ID_1_K            ID_1    4.050
C_EL_1            EL      0.050
U_ID_1_1          ID_1    0.500
Name: mu, dtype: float64

Success! We now proceed to use the class *abate*, a childclass of *gmspython*.

## Add sets used for calibration to the database

#### g_exo_vals consists of $\bar \sigma$ and $\bar \mu$. These are targets in the minimization objective

$\bar \sigma$ parameters used for the minimization objective. These are the sigmas in the MNL nests (C->U)

In [53]:
sigmabar = pd.Series([], name="sigmabar", dtype="float64")
for c in output["ID"]["components"].keys():
    sigmabar[c] = 1

$\bar \mu$ parameters used for the minimization objective. These are the share parameters in the CET split from technologies to technology goods.

In [54]:
mubar = mu.loc[pd.IndexSlice[mu.index.get_level_values(0).isin(flatten_list(list(output["ID"]["techs"].values()))), \
                     mu.index.get_level_values(1).isin(list(output["ID"]["techs"].keys()))]]
mubar.name = "mubar"

#### g_tech_endo consists of parameters that are endogenized in the calibration procedure

In [55]:
mu_EC = mu.loc[pd.IndexSlice[pd.Series(mu.index.get_level_values(0).str.startswith("C")), \
                     pd.Series(mu.index.get_level_values(1).str.startswith("E"))]].index

mu_IOtech = mu.loc[pd.IndexSlice[mu.index.get_level_values(0).str.startswith("IO"), mu.index.get_level_values(0).str.startswith("IO")]].index

sigma_CU = sigmabar.index
sigma_CU.name = "n"

mu_CU = mu.loc[mu.index.get_level_values(0).isin(flatten_list(output["ID"]["components"].values())), mu.index.get_level_values(1).isin(output["ID"]["components"].keys())].index
mu_CU.name = "n"

mu_tautoU = mu.loc[pd.IndexSlice[mu.index.get_level_values(0).str.startswith("U"), mu.index.get_level_values(1).isin(output["ID"]["techs"].keys())]].index

Share-parameters of Leontief technologies (to be kept exogenous throughout)

In [56]:
t = pd.IndexSlice[pd.Series(mu.index.get_level_values(1).isin(output["ID"]["techs"].keys())), pd.Series(mu.index.get_level_values(1).isin(output["ID"]["techs"].keys()))]

t2 = pd.IndexSlice[pd.Series(mu.index.get_level_values(1).isin(output["ID"]["baseline_U_inputs"].keys())), \
                   pd.Series(mu.index.get_level_values(1).isin(output["ID"]["baseline_U_inputs"].keys()))]

mu_leontief_techs = mu.loc[t2].index.append(mu.loc[t].index)

tech_endo_mu = mu_EC.append(mu_IOtech).append(mu_CU).append(mu_tautoU)

tech_endo_sigma = sigma_CU

g_endovars_exoincalib includes the endogenous variables that we exogenize in calibration because we have data on them (C due to potentials, sum of U due to current applications, sum of dX for IO tech and replacing tech due to IO data) 

In [57]:
def multiindex_series(idx_level_names, idx_name=None, series_name=None):
    if idx_name is None and series_name is not None:
        idx_name = series_name
    elif idx_name is not None and series_name is None:
        series_name = idx_name
    elif idx_name is None and series_name is None:
        raise Exception("Supply either index name or series name")
    idx = pd.MultiIndex(levels=[[]]*len(idx_level_names), codes=[[]]*len(idx_level_names), names=idx_level_names)
    ser = pd.Series(index=idx, dtype=float)
    ser.rename(series_name, inplace=True)
    #ser.index.name = idx_name
    return ser

In [58]:
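def find_key_from_value(d, value):
    """Return the single key of d whose list of values contains value."""
    assert isinstance(d, dict)
    out = [k for k, v in d.items() if value in v]
    if len(out) == 1:
        return out[0]
    elif len(out) == 0:
        raise Exception("Value does not exist under any key")
    else:
        raise Exception("Value exists in multiple keys")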

The sumU and sumX objects are constructed because we need to calibrate to those. sumU hits current application and sumX hits IO input use data.

In [59]:
sumU = multiindex_series(idx_level_names=["n", "nn"], series_name="sumU")
for t in output["ID"]["techs"]:
    for U in output["ID"]["techs"][t]:
        C = find_key_from_value(output["ID"]["components"], U)
        E = find_key_from_value(output["ID"]["upper_categories"], C)
        sumU[("sumU_" + t + "_" + E, U)] = qD[U]

In [60]:
sumU_map = sumU.index

In [61]:
sumU_calibvalues = sumU.copy()
sumU_calibvalues.index = sumU_calibvalues.index.droplevel(1)
sumU_calibvalues = sumU_calibvalues.groupby("n").sum()
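
The construction of `sumU` amounts to labelling each technology good with an aggregate name and summing within each label. A plain-Python sketch of the same idea, with made-up quantities and a hypothetical mapping from technology goods to energy services:

```python
from collections import defaultdict

# Made-up quantities of the technology goods produced by technology ID_1.
q_U = {"U_ID_1_1": 1.0, "U_ID_1_2": 2.0, "U_ID_1_3": 0.5}
# Hypothetical lookup of the energy service each good ultimately feeds into.
U_to_E = {"U_ID_1_1": "EL", "U_ID_1_2": "EL", "U_ID_1_3": "EH"}
tech = "ID_1"

# Mapping from aggregate name to member, like the (n, nn) pairs in sumU_map.
sumU_pairs = [("sumU_" + tech + "_" + U_to_E[U], U) for U in q_U]

# Sum quantities within each aggregate, like groupby("n").sum().
totals = defaultdict(float)
for agg, U in sumU_pairs:
    totals[agg] += q_U[U]

print(dict(totals))
```

The mapping and the totals are kept separate for the same reason as in the notebook: the mapping tells the model which goods belong to an aggregate, while the totals are the values the aggregate is calibrated to.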

In [62]:
def find_true_input(inp, Q2P):
    true = list(Q2P[Q2P.get_level_values(0).isin([inp])].get_level_values(1))[0]
    return true

In [63]:
#sumX. The sum of energy use for baseline technologies (IO_tech + replacing baselines)
sumX = multiindex_series(idx_level_names=["n", "nn"], series_name="sumX")
for x in output["ID"]["IO_tech_inputs"]["IO_tech"]:
    inp = find_true_input(x, output["ID"]["Q2P"])
    sumX[(inp, x)] = np.nan
#pd.IndexSlice[pd.Series(mu.index.get_level_values(1).isin(output["ID"]["techs"].keys())), pd.Series(mu.index.get_level_values(1).isin(output["ID"]["techs"].keys()))]
#output["ID"]["Q2P"]

for t in output["ID"]["techs_inputs"]:
    for x in output["ID"]["techs_inputs"][t]:
        sumX[(find_true_input(x, output["ID"]["Q2P"]), x)] = np.nan

for baseU in output["ID"]["baseline_U_inputs"]:
    for x in output["ID"]["baseline_U_inputs"][baseU]:
        sumX[(find_true_input(x, output["ID"]["Q2P"]), x)] = np.nan

#change name of the sum of an input to "sum_x"
sumX = sumX.reset_index()
sumX["n"] = "sum_" + sumX["n"]
sumX = sumX.set_index(["n", "nn"])["sumX"].index

In [64]:
endovars_exoincalib_sumU = sumU_map.copy()
endovars_exoincalib_sumX = sumX
endovars_exoincalib_C = mu_EC.get_level_values(0)

In [65]:
qD_exo_U =  mu_tautoU.get_level_values(0)

In [66]:
calib_values_currentapplications = pd.concat([qD[qD.index.str.startswith("U") & ~qD.index.str.contains("EOP")], qD[qD.index.str.startswith("C")]]) 

In [67]:
calib_values_potentials = mu.loc[pd.IndexSlice[pd.Series(mu.index.get_level_values(0).str.startswith("C")), \
                                 pd.Series(mu.index.get_level_values(1).str.startswith("E"))]]

In [68]:
db = DataBase.GPM_database()
alwaysexo_mu = mu.drop(tech_endo_mu).index

In [69]:
db["tech_endoincalib_sigma"] = tech_endo_sigma
db["tech_endoincalib_mu"] = tech_endo_mu
db["params_alwaysexo_mu"] = alwaysexo_mu
db["endovars_exoincalib_sumU"] = endovars_exoincalib_sumU
db["endovars_exoincalib_sumX"] = endovars_exoincalib_sumX
db["endovars_exoincalib_C"] = endovars_exoincalib_C
db["calib_values_currentapplications"] = calib_values_currentapplications
db["calib_values_potentials"] = calib_values_potentials

In [70]:
tech_endo_sigma

Index(['C_EL_1', 'C_EL_2', 'C_EL_3', 'C_EH_1', 'C_EL_4', 'C_EL_5', 'C_EH_2',
       'C_ER_1', 'C_EH_3', 'C_ER_2'],
      dtype='object', name='n')

In [71]:
#Minimization stuff
db["minobj"] = 1
db["weight_sigma"] = 1
db["weight_mu"] = 1
db["minobj_mu"] = mubar
db["minobj_sigma"] = sigmabar
db["minobj_mu_subset"] = mubar.index
db["minobj_sigma_subset"] = sigmabar.index
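
The symbols stored above suggest an objective that penalises deviations of the calibrated $\mu$ and $\sigma$ from their targets $\bar\mu$ and $\bar\sigma$, weighted by `weight_mu` and `weight_sigma`. The actual objective is defined in the GAMS code written by `abate`, which is not shown in this notebook, so the quadratic form below is purely an assumption used to illustrate how targets and weights could enter. The function `calibration_penalty` and all numbers are hypothetical.

```python
def calibration_penalty(mu, mubar, sigma, sigmabar, weight_mu=1.0, weight_sigma=1.0):
    """Weighted sum of squared deviations from the targets (assumed form)."""
    pen_mu = sum((mu[k] - mubar[k]) ** 2 for k in mubar)
    pen_sigma = sum((sigma[k] - sigmabar[k]) ** 2 for k in sigmabar)
    return weight_mu * pen_mu + weight_sigma * pen_sigma


mubar = {("U_ID_1_1", "ID_1"): 0.5}      # target share (value from mu.head())
sigmabar = {"C_EL_1": 1.0}               # target elasticity, as set above

at_target = calibration_penalty(mubar, mubar, sigmabar, sigmabar)
off_target = calibration_penalty(
    {("U_ID_1_1", "ID_1"): 0.6}, mubar, {"C_EL_1": 1.5}, sigmabar
)
print(at_target, off_target)
```

Under this assumed form, the penalty is zero when the calibrated parameters equal their targets and grows with the squared distance from them.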

In [72]:
db["sumUaggs"] = endovars_exoincalib_sumU.get_level_values(0).unique()
db["sumU2U"] = endovars_exoincalib_sumU
db["qsumU"] = sumU_calibvalues

In [73]:
db["sumXaggs"] = endovars_exoincalib_sumX.get_level_values(0).unique()
db["sumX2X"] = endovars_exoincalib_sumX
db["qsumX"] = pd.Series([10]*len(endovars_exoincalib_sumX.get_level_values(0).unique()), index=endovars_exoincalib_sumX.get_level_values(0).unique(), name="qsumX")

# **3: The model**

We now build a gams model using a *gmspython* childclass (*abate*). We build the partial equilibrium model from the following principles:
* The firm takes input prices as given. These are prices with taxes ('PwT') that are defined over the global set *inp*. In this case this includes e.g. the price on electricity and capital.
* The firm further takes the quantity of supply as given (it has to be either this or output prices for the model to be square). We denote the quantity of supply 'qS', and define it over the subset of goods that are outputs from the sector *out*. 
* The firm's demand for intermediate goods, and their prices, are endogenous to the firm: In our case this involves $\tau$, $U$ and $C$ in ID and only $\tau$ and $U$ in EOP. These intermediate goods are not traded on any market, but are simply constructs we use to build the nests. In this way the quantity of $U$ is both supplied and demanded by the firm/sector itself. As we do not need both a supply and demand variable (they are the same in this case), we let the quantity/prices for intermediate goods go under the variables 'qD'/'PwT', defined over the subset *int*.
* The demand for inputs is also endogenous to the firm. We denote this 'qD' defined over the subset of goods 'inp'.
* As briefly mentioned above, either the output prices or the output quantities have to be endogenous (with the other exogenous) for the module to be square. In this case we let the price on outputs be endogenous. We denote the price 'PbT' for price before taxes. This is defined over the subset *out*. Distinguishing between PbT and PwT allows for the flexibility of including a tax module / including a mark-up on profits at a later point.
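
The principles above can be summarised as a small table of (variable, subset) groups. The sketch below is plain Python bookkeeping, not the `endo_groups`/`exo_groups` implementation of `gmspython`; the goods listed in `subsets` are a hypothetical toy example. Every good has one price variable and one quantity variable, and the two exogenous groups are removed to obtain the endogenous ones.

```python
# Toy subsets of goods (hypothetical members).
subsets = {
    "inp": ["ID_1_electricity", "ID_1_K"],
    "int": ["ID_1", "U_ID_1_1", "C_EL_1"],
    "out": ["EL"],
}

# Which variable holds the price / quantity of goods in each subset.
price_var = {"inp": "PwT", "int": "PwT", "out": "PbT"}
quantity_var = {"inp": "qD", "int": "qD", "out": "qS"}

all_groups = {(price_var[s], s) for s in subsets} | {(quantity_var[s], s) for s in subsets}
exo = {("PwT", "inp"), ("qS", "out")}   # input prices and output quantities are given
endo = all_groups - exo


def n_scalars(groups):
    """Number of scalar variables covered by a set of (variable, subset) groups."""
    return sum(len(subsets[s]) for _, s in groups)


print(sorted(endo), n_scalars(endo), n_scalars(exo))
```

For the model to be square, the number of equations has to match the number of endogenous scalars counted here.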

We will get back to the way we initialize the *gmspython* module next; here we initialize the class to be able to print the variables needed for this specific model.
The model, denoted `m`, is an instance of the class `abate` which is itself a childclass of the more general `gmspython` class.

In [74]:
m = abatement.abate(nt=nt, tech_db=db, work_folder=work_folder, **{'data_folder': gams_folder, "name":"Abatement"})
m.model.database.update_all_sets(clean_up=False) #Makes n include the sum variables

Recall that running a model requires constructing three methods/properties for `m`:

1. `m.initialize_variables()`
2. `m.endo_groups` and `m.exo_groups`
3. `m.add_blocks()`

We briefly review each in turn.

`m.initialize_variables()` sets default initial values for the variables and parameters that are not already present in the attached database (which we constructed earlier). For example, we did not specify the values for the mark-up parameters anywhere, so the method `initialize_variables()` *must* specify the value that these parameters should have. In this case the default value is zero, i.e. no markup:

In [75]:
m.default_var_series("markup")

n
ER         0
EH         0
EL         0
C_SO2_1    0
Name: markup, dtype: int64

We run the method using the keyword argument `check_variables:True`. This makes the program check whether the database contains all the necessary values of a specific variable/parameter (say, $\mu$), and if not, merges the default value (in this case equal to 1) onto the series of share-parameters already contained in the database. 

In [76]:
m.initialize_variables(**{"check_variables":True})
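
The idea behind `check_variables` can be illustrated with a small stand-alone sketch. It is not the implementation used by `gmspython`: the helper `fill_with_defaults` and the existing value 0.05 are hypothetical, and plain dictionaries stand in for the database series.

```python
def fill_with_defaults(existing, default_value, domain):
    """Return a value for every element in `domain`, keeping existing entries."""
    return {n: existing.get(n, default_value) for n in domain}

# Elements the parameter should be defined over (taken from the markup output above).
domain = ["ER", "EH", "EL", "C_SO2_1"]

# Hypothetical: the database already holds a value for one element only.
existing_markup = {"ER": 0.05}

# Existing values are kept; missing elements receive the default of 0.
markup = fill_with_defaults(existing_markup, 0, domain)
print(markup)
```

Values already in the database take precedence, so a default never overwrites something that was set deliberately.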

Since all of our starting values were set 'as if' the entire tree was Leontief, it is worth noting that the default values for the elasticities of substitution/transformation are 1, e.g. for $\sigma$:

In [77]:
m.default_var_series("sigma").head()

n
U_ID_C_EL_3_base    1
ER                  1
ID_2                1
C_EH_3              1
IO_tech             1
Name: sigma, dtype: int64

The second requirement is the pair of properties `endo_groups` and `exo_groups`. These collect the variables in groups and specify which of these subgroups belong to the two upper groups of endogenous and exogenous variables/parameters, respectively. Whether a variable should be endogenous or exogenous depends, for example, on whether the model is about to be solved for calibration purposes or not.
In the basic case (not calibration mode), the endogenous variables are split into multiple groups:

In [78]:
m.endo_groups.keys()

dict_keys(['Abatement_g_prices_alwaysendo', 'Abatement_g_quants_alwaysendo', 'Abatement_g_prices_exoincalib', 'Abatement_g_quants_exoincalib'])

... which themselves include the actual variables. Here we print the values of prices with taxes in the endovars group (the first 5 only). This uses the method *.var_endo*.

In [79]:
m.var_endo("PwT").head()

n
C_EH_1       2.000000
C_EH_2       2.777778
C_EH_3       2.700000
C_EH_base    1.166667
C_EL_1       1.800000
Name: PwT, dtype: float64

Next we print the exogenous prices with taxes by doing the same thing for the exogenous groups instead, using the method *.var_exo*:

In [80]:
m.var_exo("PwT").head()

n
EOP_t1_K            1.0
EOP_t1_oil          1.0
ID_1_K              1.0
ID_1_electricity    1.0
ID_1_oil            1.0
Name: PwT, dtype: float64

Finally, we need to specify the *add_blocks* method. This is the method that writes blocks of equations to the model. In our case, we write a different block of equations for each tree.
For example, one block of equations is:

In [81]:
print(m.eqtext(list(m.ns_local)[0]))

E_zp_out_ID_EC[n]$(out_ID_EC[n])..	PbT[n]*qS[n] =E= sum(nn$(map_ID_EC[nn,n]), qD[nn]*PwT[nn]);
	E_zp_nout_ID_EC[n]$(kno_no_ID_EC[n])..	PwT[n]*qD[n] =E= sum(nn$(map_ID_EC[nn,n]), qD[nn]*PwT[nn]);
	E_q_out_ID_EC[n]$(bra_o_ID_EC[n])..	qD[n] =E= sum(nn$(map_ID_EC[n,nn]), mu[n,nn] * (PbT[nn]/PwT[n])**(sigma[nn]) * qS[nn] / sum(nnn$(map_ID_EC[nnn,nn]), mu[nnn,nn] * (PbT[nn]/PwT[nnn])**(sigma[nn])));
	E_q_nout_ID_EC[n]$(bra_no_ID_EC[n])..	qD[n] =E= sum(nn$(map_ID_EC[n,nn]), mu[n,nn] * (PwT[nn]/PwT[n])**(sigma[nn]) * qD[nn] / sum(nnn$(map_ID_EC[nnn,nn]), mu[nnn,nn] * (PwT[nn]/PwT[nnn])**(sigma[nn])));


#### Model summary:

To sum up, the model we have in mind here has the following main settings (where a \$ denotes a condition):
* Endogenous variables: $PwT\$(int)$, $PbT\$(out)$, $qD\$(wT)$. (Recall that the subset wT is the union of intermediate goods and inputs)
* Exogenous variables: $PwT\$(inp)$, $qS\$(out)$. 
* Equations: For the CES nest we have the following two equations:
    $$\begin{align}
        q_j =& \mu_j\left(\dfrac{p_j}{p_i}\right)^{-\sigma}q_i, \tag{CES-1}\\
        p_iq_i =& \sum_j q_jp_j \tag{CES-2}
    \end{align}$$
    (CES-1) is the CES demand function that has to hold for all *branches* ($q_j$) where $q_i$ is the relevant knot in the nesting tree. (CES-2) is a zero-profit condition that has to hold for production functions with constant returns to scale (alternatively, we can use a price index). This has to hold for every *knot* where $j$ sums over the relevant branches. A similar thing has to hold for the CET tree.
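
As a numerical check of (CES-1) and (CES-2), the sketch below evaluates the demand functions for a single knot with three branches. All numbers and names are hypothetical. It assumes $\sigma \neq 1$ and that the knot price $p_i$ equals the CES price index $\left(\sum_j \mu_j p_j^{1-\sigma}\right)^{1/(1-\sigma)}$, which is the price-index alternative mentioned above.

```python
sigma = 0.5                                            # elasticity of substitution (must differ from 1 here)
mu = {"K": 0.3, "electricity": 0.5, "oil": 0.2}        # share parameters (hypothetical)
p = {"K": 1.0, "electricity": 1.5, "oil": 2.0}         # branch prices (hypothetical)
q_i = 10.0                                             # quantity at the knot

# CES price index for the knot
p_i = sum(mu[j] * p[j] ** (1 - sigma) for j in mu) ** (1 / (1 - sigma))

# (CES-1): demand for each branch j
q = {j: mu[j] * (p[j] / p_i) ** (-sigma) * q_i for j in mu}

# (CES-2): value of the knot equals the total cost of its branches
cost = sum(p[j] * q[j] for j in mu)
revenue = p_i * q_i

print("knot price:", p_i)
print("cost:", cost, "revenue:", revenue)
```

With the price index in place, the zero-profit condition holds up to floating-point error, so only one of the two needs to be imposed as an equation.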

### **Running the model**
To run the model, including invoking the methods and properties described in the previous section, we simply run the following:

In [82]:
m.write_and_run(kwargs_init={'check_variables':True})

To check whether the model was solved successfully, we check the modelstat (16.0 means solved; 5.0 means locally infeasible, i.e. not solved):

In [83]:
m.model_instances["baseline"].modelstat

5.0
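
For readability, the numeric status can be translated into a label. The sketch below lists a subset of the GAMS model status codes; the helper names `describe_modelstat` and `is_solved` are hypothetical and not part of `gmspython`. Treating 15 and 16 as success is a choice made for this sketch.

```python
# Subset of GAMS model status codes (labels as used in the GAMS documentation).
GAMS_MODELSTAT = {
    1: "Optimal",
    2: "Locally Optimal",
    4: "Infeasible",
    5: "Locally Infeasible",
    15: "Solved Unique",
    16: "Solved",
    17: "Solved Singular",
}

def describe_modelstat(code):
    """Return a readable label for a modelstat value such as 16.0."""
    return GAMS_MODELSTAT.get(int(code), "Other status (see the GAMS documentation)")

def is_solved(code):
    """Treat 'Solved Unique' and 'Solved' as success (an assumption of this sketch)."""
    return int(code) in {15, 16}

print(describe_modelstat(5.0), is_solved(5.0))
print(describe_modelstat(16.0), is_solved(16.0))
```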

The model does not solve when the sigmas are left at their default value of 1 instead of being set (close) to Leontief values.

We therefore solve the model with the substitution and transformation elasticities set equal to zero (in practice not exactly, but very close) instead of their default values.
Since the starting values have been set under the assumption of Leontief nests, this will help the solver find an initial solution.

In [84]:
m.model.settings.databases["Abatement_0"]["sigma"].vals.loc[:] = 0.0001
m.model.settings.databases["Abatement_0"]["eta"].vals.loc[:] = -0.0001
#m.model.settings.databases["Abatement"]["sigma"].vals[condition] = 0.2
#m.write_and_run(options_run={'output':sys.stdout})
m.write_and_run()
if m.model_instances["baseline"].modelstat == 16.0:
    print("\nSuccess! The modelstat was 16.0")


Success! The modelstat was 16.0
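
Why a near-zero elasticity matches starting values computed under a Leontief assumption can be seen from the CES demand function $q_j = \mu_j (p_j/p_i)^{-\sigma} q_i$. The sketch below doubles one branch price while holding the knot price and quantity fixed; the numbers are hypothetical. The demand ratio is $2^{-\sigma}$, which approaches 1 as $\sigma$ goes to zero.

```python
def ces_demand(mu_j, p_j, p_i, q_i, sigma):
    """CES demand for branch j, given the knot price p_i and knot quantity q_i."""
    return mu_j * (p_j / p_i) ** (-sigma) * q_i

ratios = {}
for sigma in (1.0, 0.1, 0.0001):
    base = ces_demand(0.5, 1.0, 1.0, 10.0, sigma)
    shocked = ces_demand(0.5, 2.0, 1.0, 10.0, sigma)  # branch price doubles
    ratios[sigma] = shocked / base                    # equals 2 ** (-sigma)
    print(f"sigma={sigma}: demand after shock / demand before = {ratios[sigma]:.5f}")
```

At $\sigma = 0.0001$ demand is practically in fixed proportion to the knot quantity, so the Leontief starting values are very close to a solution.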


Save model by pickling it, using the export method:

In [85]:
m.export()

'C:\\Users\\zgr679\\Documents\\GitHub\\GPM_v05\\examples\\Abatement\\Data\\..\\gamsmodels\\Main\\gmspython_Abatement'

The model is calibrated in the next Jupyter notebook.