In [1]:
%run stdPackages.ipynb # this imports a lot of useful packages 
import base

# The ```lpBlock``` class - Part II: Compilation and execution

We start by importing the database and the instance of the block with the structured arguments:

In [2]:
with open(os.path.join(d['data'],'blockPartI'), "rb") as file:
    block = pickle.load(file)
with open(os.path.join(d['data'],'blockPartI_db'), "rb") as file:
    db = pickle.load(file)

Part II walks through the compilation process, which can help with troubleshooting when a model is misspecified. Compilation is implemented in the ```__call__(self, execute = None)``` method, which is invoked with the syntax ```self(execute = None)```. The method runs through the five core steps covered in sections 1-5 below; section 6 summarizes the workflow (the kwarg ```execute``` can be passed to run only a subset of the steps):

>[**1. compileParameters**](#compile): Takes the arguments that we specified in [**part I**](./lpCompiler_PartI.ipynb) and constructs new components.  
>[**2. settingsFromCompiled**](#settings): Extracts some useful meta-data from the compiled data.  
>[**3. inferGlobalDomains**](#globalDomains): Uses the compiled elements to establish global indices for the entire model.  
>[**4. getDenseArgs**](#denseArgs): Stacks and rearranges the compiled data into large vectors/matrices that are consistent with the shapes that the solver ```scipy.optimize.linprog``` takes as inputs.  
>[**5. lp_args**](#lpArgs): Returns the 'denseArgs' as a dictionary of sparse arrays that can be used directly in the solver.  
>[**6. High level summary**](#summary): Summarizes the code required to compile and solve the model. 
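
To make the call pattern concrete, here is a minimal sketch of a callable class that runs named steps in order. The class `MiniBlock`, its placeholder steps, and the assumption that `execute` is a list of method names are all hypothetical; the actual `lpBlock.__call__` may interpret `execute` differently.

```python
class MiniBlock:
    """Hypothetical stand-in for lpBlock; only the call pattern is illustrated."""

    steps = ['compileParameters', 'settingsFromCompiled',
             'inferGlobalDomains', 'getDenseArgs']

    def __init__(self):
        self.log = []  # records which steps have been run

    # Placeholder steps: each one only records that it was called.
    def compileParameters(self):
        self.log.append('compileParameters')

    def settingsFromCompiled(self):
        self.log.append('settingsFromCompiled')

    def inferGlobalDomains(self):
        self.log.append('inferGlobalDomains')

    def getDenseArgs(self):
        self.log.append('getDenseArgs')

    @property
    def lp_args(self):
        # The real property returns solver arguments; here we return the log.
        return {'ran': list(self.log)}

    def __call__(self, execute=None):
        # Assumption: execute=None means "run every step in order".
        for name in (self.steps if execute is None else execute):
            getattr(self, name)()
        return self.lp_args


full = MiniBlock()()  # same as MiniBlock().__call__()
partial = MiniBlock()(execute=['compileParameters'])
```

Running only a subset of steps in this way makes it possible to stop after, say, the first step and inspect the intermediate result.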

## 0. Part I

Part I ends with the model structure added to the lpBlock instance. Specifically, the model structure is added to the dictionary in ```parameters```: 

In [3]:
block.parameters.keys()

dict_keys(['c', 'l', 'u', 'b_eq', 'b_ub', 'A_eq', 'A_ub'])

Each type (```c,l,u,b_eq,b_ub,A_eq,A_ub```) stores the arguments from earlier in dictionaries as well. Recall that we added a component using the syntax:

```python
block.add_c(varName = 'E', value = pyDbs.adjMultiIndex.bc(db['mc'], block.globalDomains['E']), component = None, conditions = None)
```

This is added to ```block.parameters['c']``` with the key tuple ```(varName, component)``` and the value ```value```; in this case:

In [4]:
block.parameters['c'][('E',None)]

id            h
Conv. plant   1    15
              2    15
Wind turbine  1     5
              2     5
Name: 0, dtype: int64

The ```component``` option can be used to define the costs of $E$ in two steps. For instance, say that we want a variation on the model, where a new generator (e.g. Photovoltaics) is added with a cost of 3 in both hours. This can be done by calling:

```python
block.add_c(varName = 'E', value = pd.Series([3, 3], index = pd.MultiIndex.from_tuples([('PV',1,),('PV',2)], names = ['id','h'])), component = 'added PV')
```

If we had not specified a new ```component``` here, we would instead overwrite the old argument and remove the conventional plant and the wind turbine from the model.
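
The role of the ```(varName, component)``` key can be mimicked with a plain dictionary. In this sketch, `params_c` is a hypothetical stand-in for ```block.parameters['c']```; the values are the ones used in the text, and nothing here calls the actual `lpBlock` methods.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [('Conv. plant', 1), ('Conv. plant', 2),
     ('Wind turbine', 1), ('Wind turbine', 2)], names=['id', 'h'])
pv_idx = pd.MultiIndex.from_tuples([('PV', 1), ('PV', 2)], names=['id', 'h'])

# Hypothetical stand-in for block.parameters['c'].
params_c = {('E', None): pd.Series([15, 15, 5, 5], index=idx)}

# A new component gets its own key, so the original entry is kept.
params_c[('E', 'added PV')] = pd.Series([3, 3], index=pv_idx)

# Reusing the key ('E', None) replaces the original entry instead.
overwritten = dict(params_c)
overwritten[('E', None)] = pd.Series([3, 3], index=pv_idx)
remaining_ids = list(overwritten[('E', None)].index.get_level_values('id').unique())
```

After the overwrite, `remaining_ids` only contains the PV generator, which is the situation the ```component``` argument is meant to avoid.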

## 1. compileParameters <a id='compile'></a>

In [5]:
block.compileParameters()

The first step of the compilation takes the arguments defined in the ```self.parameters``` dictionary and reshapes them into a common structure. We want this common structure because we ultimately need to stack everything into vectors and matrices, even though the variables and constraints may be defined over different sets or a different number of sets. The common structure also lets us create a single sort order for all model components.

The compilation strategy here boils down to representing all coefficients on variables (e.g. from ```c, l, u```) with a 2-dimensional index: The first level is always called ```_vsymbol``` and simply stores the name of the variable. The second level ```_vindex``` is a 1d representation of whatever domain the variable in question may be defined over. Generally, we use the following nd-to-1d mapping of domains:
* If a variable $x$ is a scalar: The variable is defined over no sets. In this case ```_vindex = None```.
* If a variable $x[s]$ is defined over a 1d set: In this case ```_vindex = s```.
* If a variable $x[s_1,..., s_n]$ is defined over nd sets: In this case ```_vindex``` represents the nd sets as a 1d set of tuples.
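
The nd-to-1d mapping for the last two cases can be sketched with plain pandas. The helper `flatten_variable` is hypothetical and is not the library's own implementation; it keeps the level order of its input and leaves out the scalar case.

```python
import pandas as pd

def flatten_variable(name, value):
    """Hypothetical helper: index `value` by (_vsymbol, _vindex)."""
    if isinstance(value.index, pd.MultiIndex):
        # nd domain: each combination of set elements becomes one tuple
        flat = value.index.to_flat_index()
    else:
        # 1d domain: the set elements are used as they are
        flat = value.index
    index = pd.MultiIndex.from_product(
        [[name], flat], names=['_vsymbol', '_vindex'])
    return pd.Series(value.to_numpy(), index=index)

cost = pd.Series([15, 15, 5, 5], index=pd.MultiIndex.from_tuples(
    [(1, 'Conv. plant'), (2, 'Conv. plant'),
     (1, 'Wind turbine'), (2, 'Wind turbine')], names=['h', 'id']))

x_nd = flatten_variable('E', cost)
x_1d = flatten_variable('y', pd.Series([1.0, 2.0], index=pd.Index([1, 2], name='h')))
```

Because every variable ends up with the same two index levels, variables defined over different sets can later be stacked into one vector.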

This step is carried out using the ```base.fIndexVariable(variableName, value, btype = 'v')``` function. For example:

In [6]:
x = base.fIndexVariable('E', block.parameters['c'][('E',None)])
x 

_vsymbol  _vindex          
E         (1, Conv. plant)     15
          (2, Conv. plant)     15
          (1, Wind turbine)     5
          (2, Wind turbine)     5
Name: 0, dtype: int64

We have also included a simple function that reverses this mapping, i.e. one that returns the original symbol (we use it again later). This is done by calling ```base.vIndexVariable```:

In [7]:
base.vIndexVariable(x, 'E', x.index._n) 

h  id          
1  Conv. plant     15.0
2  Conv. plant     15.0
1  Wind turbine     5.0
2  Wind turbine     5.0
dtype: float64
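
The reverse direction can be sketched in the same spirit: select one symbol from the flattened series and expand the tuples in ```_vindex``` back into separate index levels. The helper `unflatten_variable` is hypothetical; it only illustrates the idea and does not reproduce ```base.vIndexVariable```.

```python
import pandas as pd

# A flattened variable, built here by hand so the example is self-contained.
flat = pd.Index([(1, 'Conv. plant'), (2, 'Conv. plant'),
                 (1, 'Wind turbine'), (2, 'Wind turbine')], tupleize_cols=False)
x_flat = pd.Series([15, 15, 5, 5], index=pd.MultiIndex.from_product(
    [['E'], flat], names=['_vsymbol', '_vindex']))

def unflatten_variable(series, name, level_names):
    """Hypothetical helper: recover the original index of one variable."""
    symbols = series.index.get_level_values('_vsymbol')
    selected = series[symbols == name]
    tuples = list(selected.index.get_level_values('_vindex'))
    return pd.Series(selected.to_numpy(),
                     index=pd.MultiIndex.from_tuples(tuples, names=level_names))

original = unflatten_variable(x_flat, 'E', ['h', 'id'])
```

The level names have to be supplied separately because the tuples themselves do not carry them.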

We deal with constraint vectors ```b_eq, b_ub``` in a similar manner: They are defined with a 2-dimensional index, where the first level captures the constraint name and the second level a 1d representation of the relevant domain. In our case, for instance, the equilibrium constraint has a vector of zeros defined over the hours in the model:

In [8]:
block.compiled['b_eq']['equilibrium']

_eqsymbol    _eqindex
equilibrium  1           0
             2           0
Name: 0, dtype: int64

Finally, we reshape the coefficient matrices ```A_eq, A_ub``` into a combination of the variable and constraint forms outlined above. They are defined over 4 index levels: the first two levels represent the variable and the last two represent the relevant constraint. In our case, ```A_eq``` for the equilibrium constraint and the variable $E$ is returned as:

In [9]:
block.compiled['A_eq'][('equilibrium','E')]

_vsymbol  _vindex            _eqsymbol    _eqindex
E         (1, Conv. plant)   equilibrium  1           1
          (2, Conv. plant)   equilibrium  2           1
          (1, Wind turbine)  equilibrium  1           1
          (2, Wind turbine)  equilibrium  2           1
dtype: int64
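
As a rough illustration of the 4-level layout, the series above can be rebuilt by hand: each element $E[h, id]$ enters the equilibrium constraint for its own hour $h$ with coefficient 1. This is a sketch of the resulting structure only, not of how the library derives it.

```python
import pandas as pd

var_index = [(1, 'Conv. plant'), (2, 'Conv. plant'),
             (1, 'Wind turbine'), (2, 'Wind turbine')]

# One row per variable element; v[0] is the hour h, which is also
# the index of the equilibrium constraint that the element enters.
rows = [('E', v, 'equilibrium', v[0]) for v in var_index]
a_eq = pd.Series(1, index=pd.MultiIndex.from_tuples(
    rows, names=['_vsymbol', '_vindex', '_eqsymbol', '_eqindex']))
```

Storing one row per nonzero coefficient is what keeps this representation sparse.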

## 2. settingsFromCompiled <a id='settings'></a>

This intermediate step collects some meta-data on the model structure. Specifically, we create four attributes: ```allvars, allconstr, alldomains, allconstrdomains```:

In [10]:
block.settingsFromCompiled()

1. *```allvars```: The list of variables in the model:*

In [11]:
block.allvars

['D', 'E']

2. *```allconstr```: Dictionary with keys = constraint type (```eq,ub```) and values = lists of constraints:*

In [12]:
block.allconstr

{'eq': ['equilibrium'], 'ub': []}

3. *```alldomains```: Dictionary with keys = variable names and values = list of domains for the relevant variables:*

In [13]:
block.alldomains

{'E': ['h', 'id'], 'D': ['c', 'h']}

4. *```allconstrdomains```: Dictionary with keys = constraint names and values = list of domains for the relevant constraints:*

In [14]:
block.allconstrdomains

{'equilibrium': ['h_constr']}

## 3. inferGlobalDomains <a id='globalDomains'></a>

This step sums up the domains for all variables and constraints. Specifically, we collect four attributes: ```gIndex, globalVariableIndex, globalConstraintIndex, globalMaps```.

In [15]:
block.inferGlobalDomains()

1. ```gIndex```: This is a dictionary with keys = variable names and values = full indices for the relevant symbol. We collect this "full index" for each variable by looking at how the variable is used in the ```c,l,u``` vectors *and* in ```A_eq, A_ub```. This feature is what enables us to rely on *default values*, which simplifies how we specify the model structure. For instance, recall from part I that we did not specify anywhere that the lower bound on both $E$ and $D$ was zero. When we solve the model later on, we still have these lower bounds, because we can look up the full domain of each variable in ```gIndex``` and fill in zeros on it.

In [16]:
block.gIndex

{'D': MultiIndex([('D', ('Consumer 1', 1)),
             ('D', ('Consumer 1', 2)),
             ('D', ('Consumer 2', 1)),
             ('D', ('Consumer 2', 2))],
            names=['_vsymbol', '_vindex']),
 'E': MultiIndex([('E',  (1, 'Conv. plant')),
             ('E', (1, 'Wind turbine')),
             ('E',  (2, 'Conv. plant')),
             ('E', (2, 'Wind turbine'))],
            names=['_vsymbol', '_vindex'])}

2. ```globalVariableIndex```: Stacks the indices from ```gIndex``` to establish one single index that spans the entire vector $\mathbf{x}$ that the solver eventually returns as the solution. In our case:

In [17]:
block.globalVariableIndex

MultiIndex([('D',   ('Consumer 1', 1)),
            ('D',   ('Consumer 1', 2)),
            ('D',   ('Consumer 2', 1)),
            ('D',   ('Consumer 2', 2)),
            ('E',  (1, 'Conv. plant')),
            ('E', (1, 'Wind turbine')),
            ('E',  (2, 'Conv. plant')),
            ('E', (2, 'Wind turbine'))],
           names=['_vsymbol', '_vindex'])

3. ```globalConstraintIndex```: Does a similar thing to ```globalVariableIndex```, but for the constraints instead (split into equality constraints ('eq') and upper bound constraints ('ub') respectively):

In [18]:
block.globalConstraintIndex

{'eq': MultiIndex([('equilibrium', 1),
             ('equilibrium', 2)],
            names=['_eqsymbol', '_eqindex']),
 'ub': None}

4. ```globalMaps```: For the three types of global indices - *variables ('v'), equality constraints ('eq'), and upper bound constraints ('ub')* - this defines mappings from our global pandas indices to an integer index. Recall from part I that the solver ```optimize.linprog``` simply works over stacked arrays (without any indices/symbol names). These maps are useful for translating the elements of the stacked arrays passed to/from the solver into the indices that we find meaningful.

Consider, for instance, the variables in our model ($D, E$):

In [19]:
block.globalMaps['v']

_vsymbol  _vindex          
D         (Consumer 1, 1)      0
          (Consumer 1, 2)      1
          (Consumer 2, 1)      2
          (Consumer 2, 2)      3
E         (1, Conv. plant)     4
          (1, Wind turbine)    5
          (2, Conv. plant)     6
          (2, Wind turbine)    7
dtype: int64

When the solver returns the solution vector $\mathbf{x}$ as an array, we can use this to extract the $D$ and $E$ vectors ($\mathbf{x}[0:4]$ and $\mathbf{x}[4:8]$, respectively). 
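
The lookup can be sketched with a plain dictionary standing in for ```block.globalMaps['v']``` (the real object is a pandas Series; the dictionary and the variable names below are assumptions made for illustration). The solution vector is the one printed by the solver later in this note.

```python
import numpy as np

d_keys = [('D', ('Consumer 1', 1)), ('D', ('Consumer 1', 2)),
          ('D', ('Consumer 2', 1)), ('D', ('Consumer 2', 2))]
e_keys = [('E', (1, 'Conv. plant')), ('E', (1, 'Wind turbine')),
          ('E', (2, 'Conv. plant')), ('E', (2, 'Wind turbine'))]
global_keys = d_keys + e_keys

# label -> integer position in the stacked solution vector
global_map = {key: pos for pos, key in enumerate(global_keys)}

x = np.array([0., 0., 0.5, 0.5, 0., 0.5, 0.25, 0.25])

# Positions of all elements of E, and the solution relabelled by (h, id)
e_pos = [pos for (symbol, _), pos in global_map.items() if symbol == 'E']
E = {key[1]: x[global_map[key]] for key in e_keys}
```

Here `e_pos` recovers the slice $\mathbf{x}[4:8]$ without hard-coding the positions.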

## 4. getDenseArgs <a id='denseArgs'></a>

This step collects the arguments that we pass to the solver in a dictionary ```self.denseArgs```. It relies on the global indices established in step 3, ```inferGlobalDomains```, to fill in default values where they are missing, and it stacks and sorts the full vectors to get them ready for the ```scipy.optimize.linprog``` solver. The method is called 'dense' because it fills in the default values for the vectors. Note, however, that the coefficient matrices ```A_eq, A_ub``` are *still* defined sparsely (i.e. without most of the zero entries). This makes the model run much faster as its size increases.

In [20]:
block.getDenseArgs()

For instance, this is the step where you can verify that the lower bound vector ```l``` is filled with zeros, even if we did not specify this in the model structure in part I:

In [21]:
block.denseArgs['l']

_vsymbol  _vindex          
D         (Consumer 1, 1)      0.0
          (Consumer 1, 2)      0.0
          (Consumer 2, 1)      0.0
          (Consumer 2, 2)      0.0
E         (1, Conv. plant)     0.0
          (1, Wind turbine)    0.0
          (2, Conv. plant)     0.0
          (2, Wind turbine)    0.0
dtype: float64
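
Filling in defaults amounts to reindexing a partially specified vector on the full domain. The sketch below uses an ordinary ```(id, h)``` index rather than the flattened global index, and the single lower bound of 0.1 is purely hypothetical (the model in this note specifies no lower bounds at all).

```python
import pandas as pd

# Full domain of E, as it would be inferred from the other arguments.
full_domain = pd.MultiIndex.from_product(
    [['Conv. plant', 'Wind turbine'], [1, 2]], names=['id', 'h'])

# Hypothetical: a lower bound specified for a single element only.
l_partial = pd.Series([0.1], index=pd.MultiIndex.from_tuples(
    [('Conv. plant', 1)], names=['id', 'h']))

# Every element without an explicit value gets the default 0.
l_dense = l_partial.reindex(full_domain, fill_value=0.0)
```

With no lower bound specified at all, the same logic yields a vector of zeros on the entire domain, as in the output above.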

If we look at the coefficient matrix, however, note that this still does not include a lot of the zeros:

In [22]:
block.denseArgs['A_eq']

_vsymbol  _vindex            _eqsymbol    _eqindex
D         (Consumer 1, 1)    equilibrium  1          -1.0
          (Consumer 1, 2)    equilibrium  2          -1.0
          (Consumer 2, 1)    equilibrium  1          -1.0
          (Consumer 2, 2)    equilibrium  2          -1.0
E         (1, Conv. plant)   equilibrium  1           1.0
          (2, Conv. plant)   equilibrium  2           1.0
          (1, Wind turbine)  equilibrium  1           1.0
          (2, Wind turbine)  equilibrium  2           1.0
dtype: float64

When we pass this to the solver, this is rearranged into a sparse matrix with rows corresponding to the constraint index and columns as variable indices:

In [23]:
block.lp_A_eq

<2x8 sparse matrix of type '<class 'numpy.float64'>'
	with 8 stored elements in COOrdinate format>

This sparse matrix can be printed and inspected by calling:

In [24]:
block.lp_A_eq.toarray()

array([[-1.,  0., -1.,  0.,  1.,  1.,  0.,  0.],
       [ 0., -1.,  0., -1.,  0.,  0.,  1.,  1.]])
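
The rearrangement can be reproduced by hand with ```scipy.sparse.coo_matrix```. The row and column positions below are read off the global constraint and variable maps shown earlier; how the library builds them internally is not shown here, so treat this as a sketch of the idea.

```python
import numpy as np
from scipy import sparse

# One entry per row of the long-format A_eq series:
# column = position of the variable in the global variable index,
# row    = position of the constraint in the global constraint index.
cols = np.array([0, 1, 2, 3, 4, 6, 5, 7])
rows = np.array([0, 1, 0, 1, 0, 1, 0, 1])
data = np.array([-1., -1., -1., -1., 1., 1., 1., 1.])

A_eq = sparse.coo_matrix((data, (rows, cols)), shape=(2, 8))
dense = A_eq.toarray()
```

Only the 8 nonzero coefficients are stored; the remaining entries of the 2x8 matrix are implicit zeros.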

## 5. lp_args <a id='lpArgs'></a>

This is the property that the ```__call__``` method ultimately returns. It is a dictionary of arguments that can be used directly in the solver:

In [25]:
block.lp_args

{'c': array([-10., -10., -20., -20.,  15.,   5.,  15.,   5.]),
 'A_ub': None,
 'b_ub': None,
 'A_eq': <2x8 sparse matrix of type '<class 'numpy.float64'>'
 	with 8 stored elements in COOrdinate format>,
 'b_eq': array([0., 0.]),
 'bounds': array([[0.  , 0.5 ],
        [0.  , 0.5 ],
        [0.  , 0.5 ],
        [0.  , 0.5 ],
        [0.  , 0.5 ],
        [0.  , 0.5 ],
        [0.  , 0.5 ],
        [0.  , 0.25]])}

Thus, we solve the linear model by calling (*the notation ```**dict``` is used to 'unpack' the dictionary to function kwargs*):

In [26]:
sol = optimize.linprog(**block.lp_args)
sol

        message: Optimization terminated successfully. (HiGHS Status 7: Optimal)
        success: True
         status: 0
            fun: -12.5
              x: [-0.000e+00  0.000e+00  5.000e-01  5.000e-01  0.000e+00
                   5.000e-01  2.500e-01  2.500e-01]
            nit: 2
          lower:  residual: [-0.000e+00  0.000e+00  5.000e-01  5.000e-01
                              0.000e+00  5.000e-01  2.500e-01  2.500e-01]
                 marginals: [ 0.000e+00  5.000e+00  0.000e+00  0.000e+00
                              5.000e+00  0.000e+00  0.000e+00  0.000e+00]
          upper:  residual: [ 5.000e-01  5.000e-01  0.000e+00  0.000e+00
                              5.000e-01  0.000e+00  2.500e-01  0.000e+00]
                 marginals: [ 0.000e+00  0.000e+00 -1.000e+01 -5.000e+00
                              0.000e+00 -5.000e+00  0.000e+00 -1.000e+01]
          eqlin:  residual: [ 0.000e+00  0.000e+00]
                 marginals: [ 1.000e+01  1.500e+01]
        ineqlin:  r
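
For readers without access to the pickled block, the same problem can be set up directly from the values printed in ```block.lp_args``` above. The sketch uses a dense ```A_eq``` for readability; apart from that, the numbers are copied from the output.

```python
import numpy as np
from scipy import optimize

lp_args = {
    'c': np.array([-10., -10., -20., -20., 15., 5., 15., 5.]),
    'A_ub': None,
    'b_ub': None,
    'A_eq': np.array([[-1., 0., -1., 0., 1., 1., 0., 0.],
                      [0., -1., 0., -1., 0., 0., 1., 1.]]),
    'b_eq': np.array([0., 0.]),
    # One (lower, upper) pair per variable, in the order of the global index
    'bounds': np.array([[0., 0.5]] * 7 + [[0., 0.25]]),
}

# **lp_args unpacks the dictionary into keyword arguments of linprog
sol = optimize.linprog(**lp_args)
```

The objective value `sol.fun` and the solution `sol.x` should agree with the output shown above, up to solver tolerance.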

## 6. High level summary <a id='summary'></a>

As mentioned at the outset of this note, the compilation steps are collected in the ```__call__``` method. Thus, we compile and run the model by simply calling:

In [27]:
optimize.linprog(**block()) # the notation block() is python shorthand for block.__call__()

        message: Optimization terminated successfully. (HiGHS Status 7: Optimal)
        success: True
         status: 0
            fun: -12.5
              x: [-0.000e+00  0.000e+00  5.000e-01  5.000e-01  0.000e+00
                   5.000e-01  2.500e-01  2.500e-01]
            nit: 2
          lower:  residual: [-0.000e+00  0.000e+00  5.000e-01  5.000e-01
                              0.000e+00  5.000e-01  2.500e-01  2.500e-01]
                 marginals: [ 0.000e+00  5.000e+00  0.000e+00  0.000e+00
                              5.000e+00  0.000e+00  0.000e+00  0.000e+00]
          upper:  residual: [ 5.000e-01  5.000e-01  0.000e+00  0.000e+00
                              5.000e-01  0.000e+00  2.500e-01  0.000e+00]
                 marginals: [ 0.000e+00  0.000e+00 -1.000e+01 -5.000e+00
                              0.000e+00 -5.000e+00  0.000e+00 -1.000e+01]
          eqlin:  residual: [ 0.000e+00  0.000e+00]
                 marginals: [ 1.000e+01  1.500e+01]
        ineqlin:  r