# Example of a D-efficient RUM design with ChoiceDesign

This notebook illustrates how to use **ChoiceDesign** to generate a simple D-efficient experimental design for a Random Utility Maximisation (RUM) model. Given a set of attributes, coding, availability and prior parameters, ChoiceDesign uses a variation of the random swapping algorithm [1] to minimise the D-error of the information matrix of a Multinomial Logit (MNL) model.
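To make the objective concrete, the following is a minimal pure-Python sketch of a D-error computation for an MNL model with linear utilities and generic coefficients. It assumes the usual textbook definition (the determinant of the inverse Fisher information matrix, raised to the power 1/K for K parameters); the function names `mnl_probs`, `determinant` and `d_error` are hypothetical, and the exact computation inside ChoiceDesign may differ (for instance, in scaling).

```python
import math

def mnl_probs(utilities):
    """Logit choice probabilities for one choice situation."""
    top = max(utilities)  # subtract the maximum for numerical stability
    expu = [math.exp(u - top) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # numerically singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def d_error(design, betas):
    """D-error of a design; design[s][j] is the attribute vector of alternative j
    in choice situation s, and betas are the prior parameters (assumed layout)."""
    k = len(betas)
    info = [[0.0] * k for _ in range(k)]
    for situation in design:
        utils = [sum(b * x for b, x in zip(betas, alt)) for alt in situation]
        probs = mnl_probs(utils)
        # Probability-weighted mean attribute vector of the situation
        mean = [sum(p * alt[i] for p, alt in zip(probs, situation)) for i in range(k)]
        for p, alt in zip(probs, situation):
            dev = [alt[i] - mean[i] for i in range(k)]
            for i in range(k):
                for j in range(k):
                    info[i][j] += p * dev[i] * dev[j]
    det = determinant(info)
    if det <= 0:
        return math.inf  # the parameters are not identified by this design
    return det ** (-1.0 / k)

# Toy design: 2 choice situations, 2 alternatives, 2 attributes, zero priors
toy_design = [[[1, 0], [0, 0]],
              [[0, 1], [0, 0]]]
toy_d_error = d_error(toy_design, [0.0, 0.0])
```

For this toy design the information matrix is diagonal with entries 0.25, so the D-error evaluates to 4. A lower D-error means smaller expected standard errors for the estimated parameters.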

## Step 1: Load modules, define design parameters and set attributes

The following lines load `EffDesign`, the class of efficient designs, together with the Biogeme objects used later to define the utility functions:

In [30]:
from choicedesign.design import EffDesign
from biogeme.expressions import Beta, Variable
import biogeme.database as db
import biogeme.models as models

The main parameters of `EffDesign` are:
* `alts`: List of alternative names, used to generate the design matrix.
* `ncs`: Number of choice situations.

Each attribute is defined as a dictionary that contains the following keys:

* `name`: a string with the attribute name,
* `levels`: a list of levels of the attribute,
* `avail`: a list of the alternatives in which the attribute appears. Each element must be one of the alternative names passed in `alts`.

The following lines define 2 alternatives, named `alt1` and `alt2`, 16 choice situations and 4 attributes, named $A$ to $D$, which are available in both alternatives.

In [31]:
Alts = ['alt1','alt2']
NCS = 16

A = {'name': 'A',
     'levels': [1, 2, 3],
     'avail': ['alt1', 'alt2']}

B = {'name': 'B',
     'levels': [10, 15, 15.5],
     'avail': ['alt1', 'alt2']}

C = {'name': 'C',
     'levels': [0, 3, 5],
     'avail': ['alt1', 'alt2']}

D = {'name': 'D',
     'levels': [0, 1, 2],
     'avail': ['alt1', 'alt2']}
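As a rough illustration of how `alts`, `name` and `avail` combine, the sketch below derives the design-matrix column names from the attribute dictionaries. The naming pattern `<alternative>_<attribute>` is inferred from the design matrices printed in this notebook; the helper `design_columns`, the attribute `E` and the column ordering for attributes that are not available in every alternative are assumptions made for illustration only.

```python
def design_columns(alts, atts_list):
    """One column name per (alternative, attribute) pair where the attribute is available."""
    return [f"{alt}_{att['name']}"
            for alt in alts
            for att in atts_list
            if alt in att['avail']]

example_alts = ['alt1', 'alt2']
example_atts = [
    {'name': 'A', 'levels': [1, 2, 3], 'avail': ['alt1', 'alt2']},
    {'name': 'B', 'levels': [10, 15, 15.5], 'avail': ['alt1', 'alt2']},
    # Hypothetical attribute that only appears in the first alternative
    {'name': 'E', 'levels': [0, 1], 'avail': ['alt1']},
]

example_columns = design_columns(example_alts, example_atts)
print(example_columns)
```

With four attributes available in both alternatives, as defined above, the same logic yields the eight columns `alt1_A` to `alt2_D`.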

## Step 2: Construct experimental design object and generate initial design matrix

The second step consists of constructing the experimental design object. The following lines define an object named `design` using `EffDesign`, which requires the following parameters:

* `alts`: The list of alternative names
* `ncs`: The number of choice situations
* `atts_list`: The list of attribute dictionaries defined in Step 1

In [32]:
design = EffDesign(
    alts=Alts,
    ncs=NCS,
    atts_list=[A,B,C,D])

After the design object is defined, the method `gen_initdesign()` generates the initial design matrix. This method accepts the following optional parameters:

* `cond`: List of conditions that the final design must satisfy. Each element is a string that contains a single condition. A condition can be a binary relation (e.g., `X > Y`, where `X` and `Y` are attributes of a specific alternative) or a conditional relation (e.g., `if X > a then Y < b`, where `a` and `b` are values). Within an `if` condition, several conditions can be combined with the operator `&`.

* `seed`: Random seed
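The idea behind a seeded, condition-respecting initial design can be sketched with simple rejection sampling. This is not ChoiceDesign's implementation: the function `random_initial_design` is hypothetical, conditions are written as Python callables rather than the string syntax described above, and no level balance is enforced.

```python
import random

def random_initial_design(columns, ncs, conditions=(), seed=None, max_tries=10_000):
    """Draw `ncs` rows at random, redrawing each row until all conditions hold.

    `columns` maps a column name to its list of levels (assumed layout).
    """
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    rows = []
    for _ in range(ncs):
        for _ in range(max_tries):
            row = {name: rng.choice(levels) for name, levels in columns.items()}
            if all(cond(row) for cond in conditions):
                rows.append(row)
                break
        else:
            raise ValueError("could not draw a row that satisfies the conditions")
    return rows

sketch_columns = {'alt1_A': [1, 2, 3], 'alt2_A': [1, 2, 3],
                  'alt1_D': [0, 1, 2], 'alt2_D': [0, 1, 2]}

# Binary relation: attribute A of alt1 must exceed attribute A of alt2
sketch_conditions = [lambda row: row['alt1_A'] > row['alt2_A']]

sketch_design = random_initial_design(sketch_columns, ncs=16,
                                      conditions=sketch_conditions, seed=42)
```

Calling the function twice with the same seed returns the same rows, which is the usual reason to fix `seed` when a design has to be reproduced.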

In [33]:
init_design = design.gen_initdesign()
init_design

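|   | alt1_A | alt1_B | alt1_C | alt1_D | alt2_A | alt2_B | alt2_C | alt2_D |
|---|--------|--------|--------|--------|--------|--------|--------|--------|
| 0 | 1.0 | 10.0 | 3.0 | 1.0 | 1.0 | 10.0 | 3.0 | 1.0 |
| 1 | 3.0 | 15.0 | 3.0 | 1.0 | 1.0 | 15.5 | 5.0 | 2.0 |
| 2 | 3.0 | 15.5 | 5.0 | 0.0 | 3.0 | 15.5 | 0.0 | 0.0 |
| 3 | 1.0 | 15.5 | 0.0 | 0.0 | 1.0 | 15.5 | 0.0 | 0.0 |
| 4 | 3.0 | 10.0 | 5.0 | 1.0 | 1.0 | 10.0 | 3.0 | 1.0 |
| 5 | 1.0 | 10.0 | 3.0 | 1.0 | 2.0 | 15.5 | 0.0 | 2.0 |
| 6 | 1.0 | 15.5 | 5.0 | 1.0 | 1.0 | 15.5 | 3.0 | 2.0 |
| 7 | 3.0 | 10.0 | 5.0 | 2.0 | 1.0 | 15.0 | 0.0 | 1.0 |
| 8 | 2.0 | 10.0 | 0.0 | 2.0 | 2.0 | 15.0 | 0.0 | 0.0 |
| 9 | 1.0 | 15.0 | 0.0 | 2.0 | 3.0 | 10.0 | 0.0 | 0.0 |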


## Step 3: Set the utility functions and the Biogeme model object

`ChoiceDesign` allows users to customise the utility functions of the efficient design. The syntax is based on Biogeme: each attribute of the utility function is defined as a Biogeme `Variable` object, while each prior parameter is defined as a Biogeme `Beta` object.

There must be one `Variable` object for each column of the initial design matrix. Furthermore, the `name` parameter of each `Variable` object must match the corresponding column name in the initial design matrix.

In [34]:
alt1_A = Variable('alt1_A')
alt1_B = Variable('alt1_B')
alt1_C = Variable('alt1_C')
alt1_D = Variable('alt1_D')

alt2_A = Variable('alt2_A')
alt2_B = Variable('alt2_B')
alt2_C = Variable('alt2_C')
alt2_D = Variable('alt2_D')

The `Beta` parameters follow the same syntax as Biogeme objects:

* `name`: The parameter name
* `value`: The prior value
* `lowerbound` and `upperbound`: Must be set to `None`
* `fixed`: Must be set to zero

The following lines define four parameters:

In [35]:
beta_A = Beta('beta_A',-0.1,None,None,0)
beta_B = Beta('beta_B',-0.02,None,None,0)
beta_C = Beta('beta_C',0.1,None,None,0)
beta_D = Beta('beta_D',0.15,None,None,0)


Then, the utility functions must be defined. We will assume a linear utility function for each alternative.

In [36]:
V1 = beta_A * alt1_A + beta_B * alt1_B + beta_C * alt1_C + beta_D * alt1_D
V2 = beta_A * alt2_A + beta_B * alt2_B + beta_C * alt2_C + beta_D * alt2_D

The utility functions must be stored in a dictionary object. In this dictionary, each key is a consecutive number from 1 to the number of alternatives, and each value is the corresponding utility function:

In [37]:
V = {1: V1, 2: V2}
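To see what these utility functions imply numerically, the following plain-Python sketch evaluates them for the second row of the initial design shown above (index 1) and converts the utilities into MNL choice probabilities. It does not use Biogeme; the helper names `linear_utility` and `logit_probabilities` are hypothetical.

```python
import math

# Prior parameters, as defined in the Beta objects above
priors = {'A': -0.1, 'B': -0.02, 'C': 0.1, 'D': 0.15}

# Row with index 1 of the initial design matrix
row = {'alt1_A': 3.0, 'alt1_B': 15.0, 'alt1_C': 3.0, 'alt1_D': 1.0,
       'alt2_A': 1.0, 'alt2_B': 15.5, 'alt2_C': 5.0, 'alt2_D': 2.0}

def linear_utility(row, alt, priors):
    """Linear-in-parameters utility of one alternative."""
    return sum(beta * row[f"{alt}_{name}"] for name, beta in priors.items())

def logit_probabilities(utilities):
    """MNL probabilities from a dictionary {alternative key: utility}."""
    denominator = sum(math.exp(v) for v in utilities.values())
    return {key: math.exp(v) / denominator for key, v in utilities.items()}

# Same keys as the dictionary V: 1 for alt1, 2 for alt2
utilities = {1: linear_utility(row, 'alt1', priors),
             2: linear_utility(row, 'alt2', priors)}
probabilities = logit_probabilities(utilities)
```

Under these priors the utilities are about -0.15 and 0.39, so the second alternative is chosen with a probability of roughly 0.63 in this choice situation.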

## Step 4: Optimise the initial design, given the utility functions and priors

The method `optimise()` starts the D-error minimisation routine, given the initial design matrix and the utility functions. This method requires the following parameters:

* `init_design`: The initial design matrix to optimise
* `V`: The dictionary object with utility functions
* `model`: The base model of the efficient design. The default is `mnl`, a Multinomial Logit model.

In addition, `optimise()` accepts the following optional parameters:

* `n_blocks`: Number of blocks of the final design. The number of choice situations must be a multiple of the number of blocks.
* `iter_lim`: Number of iterations before the algorithm stops
* `noimprov_lim`: Number of iterations without improvement before the algorithm stops
* `time_lim`: Time (in minutes) before the algorithm stops
* `seed`: Random seed
* `verbose`: Whether status messages and progress are shown
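The interplay of the three stopping criteria can be sketched with a plain random swapping loop. This is a simplified illustration, not ChoiceDesign's variation of the algorithm: the function `swap_optimise` is hypothetical, it works on a list of rows with a user-supplied objective function, and it only accepts swaps that strictly reduce the objective.

```python
import random
import time

def swap_optimise(design, objective, iter_lim=1000, noimprov_lim=None,
                  time_lim=None, seed=None):
    """Swap two values within a random column and keep the swap if it improves.

    Returns (best design, initial value, final value, iterations performed).
    `time_lim` is expressed in minutes.
    """
    rng = random.Random(seed)
    best = [row[:] for row in design]  # work on a copy
    init_perf = best_perf = objective(best)
    n_rows, n_cols = len(best), len(best[0])
    start = time.monotonic()
    iteration = 0
    since_improvement = 0
    while True:
        if iter_lim is not None and iteration >= iter_lim:
            break
        if noimprov_lim is not None and since_improvement >= noimprov_lim:
            break
        if time_lim is not None and time.monotonic() - start >= time_lim * 60:
            break
        iteration += 1
        col = rng.randrange(n_cols)
        r1, r2 = rng.sample(range(n_rows), 2)
        candidate = [row[:] for row in best]
        candidate[r1][col], candidate[r2][col] = candidate[r2][col], candidate[r1][col]
        perf = objective(candidate)
        if perf < best_perf:
            best, best_perf = candidate, perf  # store the improved design
            since_improvement = 0
        else:
            since_improvement += 1
    return best, init_perf, best_perf, iteration

# Toy objective standing in for the D-error: total gap between the two columns
def toy_objective(design):
    return sum(abs(row[0] - row[1]) for row in design)

toy_start = [[1, 3], [2, 1], [3, 2]]
toy_best, toy_init, toy_final, toy_iter = swap_optimise(
    toy_start, toy_objective, iter_lim=200, noimprov_lim=50, seed=1)
```

Because values are only swapped within a column, every attribute keeps the level frequencies of the initial design; only their allocation across choice situations changes.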

The outputs of `optimise` are:

* `optimal_design`: The optimised design matrix
* `init_perf`: The initial D-Error
* `final_perf`: The D-error of the last stored design
* `final_iter`: The last iteration number
* `ubalance_ratio`: The utility balance ratio. A value of 0% indicates strict dominance of one alternative, whereas 100% indicates equal market shares.
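The behaviour of the utility balance ratio at its two extremes can be reproduced with the short sketch below. It assumes one common definition, namely the product of each choice probability divided by 1/J (for J alternatives), averaged over the choice situations. Whether ChoiceDesign uses exactly this formula is an assumption, and the function name `utility_balance` is hypothetical.

```python
def utility_balance(probabilities):
    """Average utility balance in percent.

    probabilities[s] lists the choice probabilities of choice situation s.
    """
    total = 0.0
    for probs in probabilities:
        n_alts = len(probs)
        balance = 1.0
        for p in probs:
            balance *= p * n_alts  # equals p / (1 / J)
        total += balance
    return 100.0 * total / len(probabilities)

balanced = utility_balance([[0.5, 0.5], [0.5, 0.5]])   # equal market shares
dominated = utility_balance([[1.0, 0.0], [0.0, 1.0]])  # strict dominance
mixed = utility_balance([[0.5, 0.5], [1.0, 0.0]])      # one of each
```

Under this definition `balanced` is 100, `dominated` is 0 and `mixed` is 50, which matches the interpretation of the two extremes given above.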

The following line runs the optimisation routine for 1 minute:

In [38]:
optimal_design, init_perf, final_perf, final_iter, ubalance_ratio = design.optimise(
    init_design=init_design,
    V=V,
    model='mnl',
    time_lim=1,
    verbose=True)

Evaluating initial design
Optimization complete 0:00:59 / D-error: 0.037242
Elapsed time: 0:01:00
D-error of initial design:  0.08166
D-error of last stored design:  0.037242
Utility Balance ratio:  94.6 %
Algorithm iterations:  28732
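A quick way to read this output is to express the gain as a relative reduction of the D-error. The values below are copied from the log above; the variable names are chosen only for this illustration.

```python
init_d_error = 0.08166     # D-error of initial design
final_d_error = 0.037242   # D-error of last stored design

# Relative reduction of the D-error, in percent
reduction = 100 * (init_d_error - final_d_error) / init_d_error
print(f"D-error reduction: {reduction:.1f} %")
```

In this run the optimisation reduces the D-error by about 54% relative to the initial design.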



Lastly, the optimal design can be printed:

In [39]:
optimal_design

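|   | CS | alt1_A | alt1_B | alt1_C | alt1_D | alt2_A | alt2_B | alt2_C | alt2_D |
|---|----|--------|--------|--------|--------|--------|--------|--------|--------|
| 0 | 1.0 | 1.0 | 10.0 | 5.0 | 1.0 | 3.0 | 15.0 | 0.0 | 1.0 |
| 1 | 2.0 | 3.0 | 10.0 | 5.0 | 0.0 | 1.0 | 15.5 | 0.0 | 2.0 |
| 2 | 3.0 | 2.0 | 15.0 | 3.0 | 0.0 | 1.0 | 10.0 | 3.0 | 2.0 |
| 3 | 4.0 | 2.0 | 15.0 | 5.0 | 2.0 | 2.0 | 15.0 | 0.0 | 0.0 |
| 4 | 5.0 | 3.0 | 15.5 | 3.0 | 2.0 | 1.0 | 10.0 | 0.0 | 0.0 |
| 5 | 6.0 | 3.0 | 15.0 | 0.0 | 1.0 | 1.0 | 15.0 | 5.0 | 1.0 |
| 6 | 7.0 | 1.0 | 15.5 | 0.0 | 1.0 | 3.0 | 10.0 | 3.0 | 1.0 |
| 7 | 8.0 | 2.0 | 10.0 | 5.0 | 0.0 | 2.0 | 15.5 | 0.0 | 2.0 |
| 8 | 9.0 | 3.0 | 10.0 | 0.0 | 2.0 | 1.0 | 15.5 | 5.0 | 0.0 |
| 9 | 10.0 | 1.0 | 15.5 | 3.0 | 1.0 | 3.0 | 10.0 | 3.0 | 0.0 |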


## (Optional) Evaluate the design

The method `evaluate()` evaluates a design stored in a data frame, under the specification provided when `EffDesign` was initialised. `evaluate()` requires the following parameters:

* `optimal_design`: The design matrix to evaluate
* `V`: The dictionary object with utility functions
* `model`: The base model of the efficient design. The default is `mnl`, a Multinomial Logit model.

In [40]:
perf, ubalance = design.evaluate(optimal_design,V,model='mnl')

print(perf, ubalance)

0.03724166070894624 94.6010523299103


# References

[1] Quan, W., Rose, J. M., Collins, A. T., & Bliemer, M. C. (2011). A comparison of algorithms for generating efficient choice experiments.

