In [2]:
import numpy as np
import pandas as pd

# The *constraints* Argument

Most of the optimizers wrapped in estimagic cannot deal natively with anything but box constraints. Thus, all other constraints have to be built into the utility function to convert the problem into an unconstrained optimization problem. This is called reparametrization or kernel transformation.

Typically, users implement such reparametrizations manually and write functions to convert between the parameters of interest and their reparametrized version. 
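As a minimal sketch of what such a manual reparametrization can look like, consider a single parameter that must be positive. The function names (`to_internal`, `from_internal`, `criterion`) and the toy criterion are assumptions made for illustration; they are not part of estimagic.

```python
import numpy as np


def to_internal(sigma):
    """Map a positive parameter to an unconstrained one."""
    return np.log(sigma)


def from_internal(z):
    """Map an unconstrained value back to a positive parameter."""
    return np.exp(z)


def criterion(sigma):
    """Toy criterion that is only defined for sigma > 0."""
    return (np.log(sigma) - 1.0) ** 2


def internal_criterion(z):
    """Criterion as seen by an unconstrained optimizer."""
    return criterion(from_internal(z))


# Any real number is a valid input for the internal criterion.
z_values = np.array([-3.0, 0.0, 1.0, 5.0])
sigma_values = from_internal(z_values)
```

An unconstrained optimizer would work on `z`, while the user only ever sees `sigma`. Writing and testing such pairs of functions by hand for every constraint is the work that estimagic takes over.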

Estimagic does this for you, for a large number of constraints that are typically used in econometric applications. Below we show you how to use those constraints with simplified examples inspired by real projects.

You don't have to understand any of the examples in detail. It is enough to look at the index of their ``params`` DataFrame to see how you can use the constraints in your own projects.

## Selecting Elements of DataFrames

Typically, a constraint will only apply to a subset of parameters. Before starting to explain how to specify constraints in estimagic, we will therefore briefly explain how to select subsets of rows of a DataFrame. Feel free to skip this part if you are familiar with pandas.

Let's first look at a simple example DataFrame:

In [6]:
index = pd.MultiIndex.from_product(
    [["a", "b"], np.arange(3)], names=["category", "number"]
)

df = pd.DataFrame(
    data=[0.1, 0.45, 0.55, 0.75, 0.85, -1.0], index=index, columns=["value"]
)

In [7]:
df

category,number,value
a,0,0.1
a,1,0.45
a,2,0.55
b,0,0.75
b,1,0.85
b,2,-1.0


To select subsets of the rows we have two options: [loc](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html) and [query](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.query.html). 

``loc`` is best if the rows we want to select correspond to an entry in the index, reading from the left. For example, we can select all parameters of category "a" by:

In [8]:
df.loc["a"]

number,value
0,0.1
1,0.45
2,0.55


In order to only get the second row, we would do:

In [9]:
df.loc[("a", 1)]

value    0.45
Name: (a, 1), dtype: float64

For these examples, ``query`` would be much more verbose:

In [10]:
df.query("category == 'a'")

category,number,value
a,0,0.1
a,1,0.45
a,2,0.55


However, if we wanted to select all rows where number equals 1, ``loc`` would be more cumbersome:

In [11]:
df.loc[[("a", 1), ("b", 1)]]

category,number,value
a,1,0.45
b,1,0.85


Imagine what that would look like if we had twenty categories! For such cases, ``query`` is a much better solution:

In [12]:
df.query("number == 1")

category,number,value
a,1,0.45
b,1,0.85


To specify which parameters a constraint applies to, you specify either ``loc`` or ``query``. The corresponding value will be passed on as an argument to `params_df.loc[]` or `params_df.query()`, respectively.

Note that the "value" entry of a "fixed" constraint is optional. If you don't specify it, estimagic will fix the parameter at the start value.
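The selection step can be sketched as follows. The helper `select_params` is hypothetical and only illustrates how a "loc" or "query" entry could be forwarded to pandas; it is not estimagic's actual implementation.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product(
    [["a", "b"], np.arange(3)], names=["category", "number"]
)
params_df = pd.DataFrame(
    data=[0.1, 0.45, 0.55, 0.75, 0.85, -1.0], index=index, columns=["value"]
)


def select_params(params_df, constraint):
    """Return the rows of params_df selected by a constraint dictionary.

    Hypothetical helper for illustration, not part of estimagic.
    """
    # Exactly one of the two selectors has to be present.
    if ("loc" in constraint) == ("query" in constraint):
        raise ValueError("Specify exactly one of 'loc' and 'query'.")
    if "loc" in constraint:
        return params_df.loc[constraint["loc"]]
    return params_df.query(constraint["query"])


by_loc = select_params(params_df, {"loc": "a", "type": "fixed"})
by_query = select_params(params_df, {"query": "number == 1", "type": "fixed"})
```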

## General Structure of Constraints

``minimize`` and ``maximize`` can take a list with any number of constraints. A constraint in estimagic is a dictionary. The following keys are mandatory for all types of constraints:

- "loc" or "query" but not both: This will select the subset of parameters to which the constraint applies. If you use "loc", the corresponding value can be any expression that is valid for ``DataFrame.loc``. Check the examples above or the pandas [documentation](https://tinyurl.com/y5dgptct) to see what is valid. If you use "query", the corresponding value can be any condition accepted by ``DataFrame.query``. Again, check the examples above or the [documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.query.html) if you are not familiar with this method.

- "type": This can take any of the following values:
    - **"fixed"**: The selected parameters are fixed to a value.
    - **"probability"**: The selected parameters sum to one and are between zero and one.
    - **"increasing"**: The selected parameters are increasing. 
    - **"decreasing"**: The selected parameters are decreasing.
    - **"equality"**: The selected parameters are equal to each other.
    - **"pairwise_equality"**: Several sets of parameters are pairwise equal to each other.
    - **"covariance"**: The selected parameters are variances and covariances.
    - **"sdcorr"**: The selected parameters are standard deviations and correlations.
    - **"linear"**: The selected parameters satisfy a linear constraint with equality or inequalities.
    
   
Depending on the type of constraint, there might be additional entries. Each type of constraint is described in more detail below.


## fixed constraints

To diagnose what goes wrong in difficult optimizations, you often want to fix some of the parameters. Of course, you could just remove them from your parameter vector, but it is very handy if the parameter vector that arrives in your utility function always looks exactly the same. Therefore, estimagic can fix the parameters for you. A good example of a parameter that is often fixed is the discount factor in a structural model. In the Robinson Crusoe example used in the sections below, this looks as follows:

In [10]:
constraints = [{"loc": "delta", "type": "fixed", "value": 0.95}]
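Conceptually, fixing a parameter means that the optimizer only sees the free parameters while the utility function still receives the full vector. The following is a rough sketch of that idea under simplifying assumptions (a flat index and a single fixed constraint); `to_internal` and `from_internal` are hypothetical names, not estimagic functions.

```python
import numpy as np
import pandas as pd

params = pd.DataFrame(
    {"value": [0.95, 0.1, 0.4]},
    index=["delta", "exp_fishing", "contemplation_with_friday"],
)
constraint = {"loc": "delta", "type": "fixed", "value": 0.95}

# Boolean mask of the parameters the optimizer is allowed to change.
is_free = np.ones(len(params), dtype=bool)
is_free[params.index.get_loc(constraint["loc"])] = False


def to_internal(params):
    """Keep only the free parameters."""
    return params.loc[is_free, "value"].to_numpy()


def from_internal(internal, params):
    """Rebuild the full params DataFrame from the free parameters."""
    full = params.copy()
    full.loc[is_free, "value"] = internal
    # Fall back to the start value if the constraint has no "value" entry.
    start_value = params.loc[constraint["loc"], "value"]
    full.loc[constraint["loc"], "value"] = constraint.get("value", start_value)
    return full


internal = to_internal(params)
rebuilt = from_internal(np.array([0.2, 0.3]), params)
```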

## probability constraints

Probability constraints are similar to sum constraints, but the selected parameters always sum to one and there is the additional constraint that they all lie between zero and one. Probability constraints are therefore also practical for shares or for parameters of certain production functions. Let's assume we have a params DataFrame with "shares" in the first index level. As you have probably guessed by now, the constraint looks as follows:

In [11]:
constraints = [{"loc": "shares", "type": "probability"}]
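To get a feeling for how a probability constraint can be turned into an unconstrained problem, here is a softmax-style sketch. This is one common approach and an assumption on our part; estimagic may use a different transformation internally. The inverse mapping requires strictly positive probabilities.

```python
import numpy as np


def probabilities_from_internal(internal):
    """Map k - 1 unconstrained numbers to k probabilities (softmax-style)."""
    # Append a zero so that the last probability serves as the reference.
    logits = np.append(internal, 0.0)
    # Subtract the maximum for numerical stability.
    exp_logits = np.exp(logits - logits.max())
    return exp_logits / exp_logits.sum()


def internal_from_probabilities(probs):
    """Inverse mapping: log-odds relative to the last probability."""
    probs = np.asarray(probs, dtype=float)
    return np.log(probs[:-1] / probs[-1])


internal = np.array([-2.0, 0.5, 3.0])
shares = probabilities_from_internal(internal)
```

Note that four probabilities only need three free numbers because the sum is pinned to one.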

## increasing and decreasing constraints

As the name suggests, increasing constraints ensure that the selected parameters are increasing. The prime example is the cutoffs in ordered choice models, such as the ordered logit model in the [Ordered Logit Example](../getting_started/ordered_logit_example.ipynb).

The constraint then looks as follows:

In [12]:
constraints = [{"loc": "cutoffs", "type": "increasing"}]

Decreasing constraints are defined analogously.
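One way to reparametrize an increasing constraint by hand is to optimize over the first element and the logarithms of the step sizes. The sketch below illustrates this idea; it enforces strictly increasing values and is not necessarily the transformation estimagic uses. The function names are hypothetical.

```python
import numpy as np


def increasing_from_internal(internal):
    """First entry is free; later entries add strictly positive steps."""
    internal = np.asarray(internal, dtype=float)
    steps = np.exp(internal[1:])
    return np.concatenate([internal[:1], internal[0] + np.cumsum(steps)])


def internal_from_increasing(cutoffs):
    """Inverse mapping: first cutoff and log of the differences."""
    cutoffs = np.asarray(cutoffs, dtype=float)
    return np.concatenate([cutoffs[:1], np.log(np.diff(cutoffs))])


internal = np.array([-1.0, 0.0, -2.0, 1.5])
cutoffs = increasing_from_internal(internal)
```

A decreasing constraint could be handled in the same way by subtracting the cumulative steps instead of adding them.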

## equality constraints

Equality constraints ensure that all selected parameters are equal. This may sound useless because one could simply leave out all but one of the parameters. However, it often makes parsing the parameter vector much easier, for example in dynamic models where you sometimes want to keep parameters time-invariant and sometimes not. The code often becomes much simpler if you do not need if-conditions to handle those two (or potentially many more) cases and instead let estimagic handle them for you. An example could be the simple DataFrame from the very beginning, where the entries of "category" ("a" and "b") could be the names of parameters and "number" could enumerate the periods in the model.

In [13]:
# make sure the equality constraint is satisfied
df = df.copy()
df.loc["b", "value"] = 2
df

category,number,value
a,0,0.1
a,1,0.45
a,2,0.55
b,0,2.0
b,1,2.0
b,2,2.0


Keeping parameter "b" time-invariant is as simple as:

In [40]:
constraints = [{"loc": "b", "type": "equality"}]

Under the hood this will optimize over just one b-parameter and set the other b-parameters equal to that one parameter.
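The idea of optimizing over a single value and copying it to all selected parameters can be sketched like this. `expand_equality` is a hypothetical helper for illustration, not an estimagic function.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product(
    [["a", "b"], np.arange(3)], names=["category", "number"]
)
params = pd.DataFrame({"value": [0.1, 0.45, 0.55, 2.0, 2.0, 2.0]}, index=index)
constraint = {"loc": "b", "type": "equality"}


def expand_equality(params, constraint, free_value):
    """Set all selected parameters to the single value chosen by the optimizer."""
    full = params.copy()
    full.loc[constraint["loc"], "value"] = free_value
    return full


# The optimizer proposes one number; all three b-parameters receive it.
updated = expand_equality(params, constraint, free_value=1.5)
```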

## pairwise_equality constraints

Pairwise equality constraints are different from all other constraints because they refer to several sets of parameters. Let's assume we want to keep the parameters "a" and "b" pairwise equal, i.e. the first "a" parameter equals the first "b" parameter and so on. Then the constraint looks like this:

In [15]:
constraints = [{"locs": ["a", "b"], "type": "pairwise_equality"}]

Alternatively, you could have an entry "queries" where the corresponding value is a list of query strings. Both "locs" and "queries" can have any number of entries. 
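To make the meaning of "pairwise equal" concrete, the following sketch checks whether a params DataFrame satisfies such a constraint. `satisfies_pairwise_equality` is a hypothetical helper written for this illustration and only handles the "locs" variant.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product(
    [["a", "b"], np.arange(3)], names=["category", "number"]
)
params = pd.DataFrame({"value": [0.1, 0.45, 0.55, 0.1, 0.45, 0.55]}, index=index)
constraint = {"locs": ["a", "b"], "type": "pairwise_equality"}


def satisfies_pairwise_equality(params, constraint):
    """Check that all selected sets have elementwise equal values.

    Hypothetical helper for illustration, not part of estimagic.
    """
    selections = [
        params.loc[loc, "value"].to_numpy() for loc in constraint["locs"]
    ]
    first = selections[0]
    return all(
        len(sel) == len(first) and np.allclose(sel, first)
        for sel in selections[1:]
    )


is_satisfied = satisfies_pairwise_equality(params, constraint)

# Changing a single b-parameter violates the constraint.
violating = params.copy()
violating.loc[("b", 1), "value"] = 0.9
is_violated = not satisfies_pairwise_equality(violating, constraint)
```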

## Covariance Constraints

In maximum likelihood estimations, you often have to estimate the covariance matrix of a distribution.

Of course, such a covariance matrix has to be valid, i.e. positive semi-definite. This is where the "covariance" constraint comes in handy. The covariance constraint assumes that the parameters selected by its ``"loc"`` or ``"query"`` field correspond to the lower triangle of a covariance matrix. The elements are ordered in C-order, i.e. row by row: first the single element of the first row that belongs to the lower triangle, then the first and second elements of the second row, and so on.

It's easier to see this in an example taken from the [respy](https://github.com/OpenSourceEconomics/respy) package. In this example, Robinson chooses between three options: fishing, relaxing in his hammock, and talking to Friday.

In [None]:
params = pd.read_csv("robinson-crusoe-covariance.csv").set_index(["category", "name"])
params

The parameters that form the covariance matrix are the ones where category equals ``shocks_cov``. The constraint could not be easier to express:

In [14]:
constraints = [{"loc": "shocks_cov", "type": "covariance"}]

That's all. To look at the resulting covariance matrix, we can use another nice function from estimagic:

In [44]:
from estimagic.optimization.utilities import cov_params_to_matrix

cov_params_to_matrix(params.loc["shocks_cov", "value"])

array([[ 1. ,  0. , -0.2],
       [ 0. ,  1. ,  0. ],
       [-0.2,  0. ,  1. ]])

Behind the scenes, estimagic will not estimate the covariance matrix but its Cholesky factor and then construct the covariance matrix for you. This guarantees that the covariance matrix is valid. If you are interested in the details, you can check out this [paper](https://tinyurl.com/y2n55cfb), but the main message of this example is that you don't have to bother about what happens behind the scenes and can instead spend your time on doing research or - if you are like Robinson - relax in a hammock.

Note that the names in the index are not used at all to determine which element goes where. Otherwise estimagic would have to make assumptions about your index and we don't want to do that.

A covariance constraint is not compatible with any other type of constraint, including box constraints. Also, you don't have to add box constraints to keep the variances positive because estimagic does this for you.
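The two ideas of this section, the C-order layout of the lower triangle and the Cholesky reparametrization, can be sketched with plain NumPy. The function names below are hypothetical and the code is a simplified illustration, not estimagic's implementation. The parameter values correspond to the covariance matrix displayed above.

```python
import numpy as np


def lower_triangle_to_matrix(flat, dim):
    """Fill a symmetric matrix from its lower triangle in C-order."""
    lower = np.zeros((dim, dim))
    # np.tril_indices enumerates the lower triangle row by row.
    lower[np.tril_indices(dim)] = flat
    return lower + np.tril(lower, k=-1).T


def cov_from_cholesky_params(internal, dim):
    """Build a covariance matrix from the free entries of a Cholesky factor."""
    chol = np.zeros((dim, dim))
    chol[np.tril_indices(dim)] = internal
    return chol @ chol.T


# Lower triangle of the covariance matrix, row by row.
cov_params = np.array([1.0, 0.0, 1.0, -0.2, 0.0, 1.0])
cov = lower_triangle_to_matrix(cov_params, dim=3)

# Internal parameters: lower triangle of the Cholesky factor.
internal = np.linalg.cholesky(cov)[np.tril_indices(3)]
reconstructed = cov_from_cholesky_params(internal, dim=3)
```

Whatever values an optimizer proposes for `internal`, the product `chol @ chol.T` is symmetric and positive semi-definite, which is why no further constraints are needed.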

## sdcorr Constraints

Most of the time, it is more intuitive to look at correlations and standard deviations than at covariance matrices. If this is the case, you want to use an "sdcorr" constraint instead of the "covariance" constraint. The sdcorr constraint assumes that the first elements are standard deviations and the rest is the lower triangle (excluding the diagonal) of a correlation matrix. Again, the names in the index are ignored by estimagic.

Under the hood, the same transformation as in the covariance constraint is used. It is also not possible to combine the sdcorr constraint with other constraints.

Let's look at the same example:

In [45]:
params = pd.read_csv("robinson-crusoe-sdcorr.csv").set_index(["category", "name"])
params

category,name,value
delta,delta,0.95
wage_fishing,exp_fishing,0.1
wage_fishing,contemplation_with_friday,0.4
nonpec_fishing,constant,-1.0
nonpec_friday,constant,-1.0
nonpec_friday,not_fishing_last_period,-1.0
nonpec_hammock,constant,2.5
nonpec_hammock,not_fishing_last_period,-1.0
shocks_sdcorr,sd_fishing,1.0
shocks_sdcorr,sd_friday,1.0


The constraint is then just:

In [20]:
constraints = [{"loc": "shocks_sdcorr", "type": "sdcorr"}]

And of course there is another helper function in the utilities module:

In [21]:
from estimagic.optimization.utilities import sdcorr_params_to_sds_and_corr

In [22]:
sds, corr = sdcorr_params_to_sds_and_corr(params.loc["shocks_sdcorr", "value"])
sds

array([1., 1., 1.])

In [23]:
corr

array([[ 1. ,  0. , -0.2],
       [ 0. ,  1. ,  0. ],
       [-0.2,  0. ,  1. ]])
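The layout of sdcorr parameters can be made explicit with a small NumPy sketch that converts them into a covariance matrix. `sdcorr_to_cov` is a hypothetical function for illustration, and the parameter values are an assumption chosen to reproduce the `sds` and `corr` output above.

```python
import numpy as np


def sdcorr_to_cov(sdcorr, dim):
    """Convert [sds, lower-triangle correlations] to a covariance matrix."""
    sdcorr = np.asarray(sdcorr, dtype=float)
    sds = sdcorr[:dim]
    corr = np.eye(dim)
    # Strictly lower triangle, enumerated row by row.
    corr[np.tril_indices(dim, k=-1)] = sdcorr[dim:]
    corr = corr + np.tril(corr, k=-1).T
    # cov[i, j] = sd[i] * corr[i, j] * sd[j]
    return sds.reshape(-1, 1) * corr * sds


# Three standard deviations followed by corr[1, 0], corr[2, 0], corr[2, 1].
sdcorr_params = np.array([1.0, 1.0, 1.0, 0.0, -0.2, 0.0])
cov = sdcorr_to_cov(sdcorr_params, dim=3)
```

Because all standard deviations equal one in this example, the covariance matrix coincides with the correlation matrix.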

## linear constraints

"linear" constraints have many of the above constraints as special cases. They are a bit more complicated to write but can be very powerful. You should only write a linear constraint if your constraint can't be expressed as one of the special cases. 

They can be used to express constraints of the form:

`lower <=  weights.dot(x) <= upper` or `weights.dot(x) = value`, where `x` are the selected parameters. 

Linear constraints have the following additional fields besides the `loc` or `query` and `type` fields:

- weights: This will be used to construct the vector `weights` in the expressions above. It can be a numpy array, a pandas Series, a list, or a float, in which case the weights for all selected parameters are equal to that number.
- value: float
- lower: float
- upper: float

You can specify either "value" or bounds ("lower", "upper", or both). Suppose you have the following params DataFrame:

In [15]:
params = pd.DataFrame(
    index=pd.MultiIndex.from_product([["a", "b", "c"], [0, 1, 2]]),
    # The start values satisfy all three constraints specified below.
    data=[[2], [1], [0], [1], [5], [4], [1], [1], [1]],
    columns=["value"],
)
params

,,value
a,0,2
a,1,1
a,2,0
b,0,1
b,1,5
b,2,4
c,0,1
c,1,1
c,2,1


Suppose you want to express the following constraints:

1. The first parameter in the a category is two times the second parameter in that category.
2. The mean of the b parameters is larger than 3.
3. The sum of the last three parameters is between 0 and 5.

Then the constraints would look as follows:

In [None]:
constraints = [
    # a[0] - 2 * a[1] = 0
    {"loc": "a", "type": "linear", "weights": [1, -2, 0], "value": 0},
    # mean of b is at least 3, so 3 is a lower bound
    {"loc": "b", "type": "linear", "weights": 1 / 3, "lower": 3},
    # 0 <= sum of c <= 5
    {"loc": "c", "type": "linear", "weights": 1, "lower": 0, "upper": 5},
]
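To see how the fields map to the expression `lower <= weights.dot(x) <= upper`, the following sketch evaluates the three constraints from the numbered list at the parameter values displayed above. `check_linear` is a hypothetical helper for illustration, not an estimagic function.

```python
import numpy as np
import pandas as pd

params = pd.DataFrame(
    index=pd.MultiIndex.from_product([["a", "b", "c"], [0, 1, 2]]),
    data=[[2], [1], [0], [1], [5], [4], [1], [1], [1]],
    columns=["value"],
)

constraints = [
    {"loc": "a", "type": "linear", "weights": [1, -2, 0], "value": 0},
    {"loc": "b", "type": "linear", "weights": 1 / 3, "lower": 3},
    {"loc": "c", "type": "linear", "weights": 1, "lower": 0, "upper": 5},
]


def check_linear(params, constraint, tol=1e-8):
    """Check whether the selected parameters satisfy a linear constraint.

    Hypothetical helper for illustration, not part of estimagic.
    """
    x = params.loc[constraint["loc"], "value"].to_numpy(dtype=float)
    # A scalar weight is broadcast to all selected parameters.
    weights = np.broadcast_to(
        np.asarray(constraint["weights"], dtype=float), x.shape
    )
    lhs = weights.dot(x)
    if "value" in constraint:
        return bool(abs(lhs - constraint["value"]) <= tol)
    lower = constraint.get("lower", -np.inf)
    upper = constraint.get("upper", np.inf)
    return bool(lower - tol <= lhs <= upper + tol)


results = [check_linear(params, c) for c in constraints]
```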

## Constraint killers

All constraints can have an additional key called "id". An example could be:

In [17]:
constraints = [
    {"loc": "a", "type": "equality", "id": 0},
    {"loc": "b", "type": "increasing", "id": 1}
]

In structural economic models, the list of constraints can become quite large and cumbersome to write. Therefore packages that implement such models will often write the constraints for you and only allow you to complement them with additional user constraints. But what if you want to relax some of the constraints they implement automatically? For this we have constraint killers. They take the following form:

In [18]:
killer = {"kill": 0}

For example, the following two lists of constraints will be equivalent:

In [19]:
constraints1 = [
    {"loc": "a", "type": "equality", "id": 0}, 
    {"loc": "b", "type": "increasing", "id": 1},
    {"kill": 0}
]
constraints2 = [{"loc": "b", "type": "increasing", "id": 1}]
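The equivalence of the two lists can be expressed in a few lines of Python. `apply_killers` is a hypothetical helper that sketches the idea; it is not estimagic's implementation.

```python
def apply_killers(constraints):
    """Drop killer entries and every constraint whose "id" they name.

    Hypothetical helper for illustration, not part of estimagic.
    """
    killed_ids = {c["kill"] for c in constraints if "kill" in c}
    return [
        c
        for c in constraints
        if "kill" not in c and c.get("id") not in killed_ids
    ]


constraints1 = [
    {"loc": "a", "type": "equality", "id": 0},
    {"loc": "b", "type": "increasing", "id": 1},
    {"kill": 0},
]
constraints2 = [{"loc": "b", "type": "increasing", "id": 1}]

remaining = apply_killers(constraints1)
```

Constraints without an "id" entry can never be killed in this sketch, which is one reason to give every automatically generated constraint an "id".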

If you write a package that implements constraints for the user, the following are best practices:
1. Give the user the chance to add additional constraints.
2. Add "id" entries to all constraints.
3. Give the user the possibility to look at the constraints that were constructed automatically.