# Example of a D-efficient RUM design with ChoiceDesign

This notebook illustrates how to use **ChoiceDesign** to generate a D-efficient experimental design for a Random Utility Maximisation (RUM) model with alternative-specific constants. Given a set of attributes, their coding and prior parameters, ChoiceDesign uses a variation of the random swapping algorithm [1] to minimise the D-error of a Multinomial Logit (MNL) model, which is computed from the model's information matrix.
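
For reference, the D-error is usually defined as follows in the efficient-design literature. The notation is an assumption of this sketch and not taken from ChoiceDesign: $x_{js}$ is the coded attribute vector of alternative $j$ in choice situation $s$, $\beta$ is the vector of $K$ prior parameters, $J$ is the number of alternatives and $S$ the number of choice situations. The exact normalisation used inside ChoiceDesign may differ.

$$
P_{js} = \frac{\exp(x_{js}^\top \beta)}{\sum_{i=1}^{J} \exp(x_{is}^\top \beta)}, \qquad
\bar{x}_s = \sum_{j=1}^{J} P_{js}\, x_{js}, \qquad
I(\beta) = \sum_{s=1}^{S} \sum_{j=1}^{J} P_{js}\,(x_{js} - \bar{x}_s)(x_{js} - \bar{x}_s)^\top
$$

$$
D\text{-error} = \det\!\left(I(\beta)^{-1}\right)^{1/K}
$$

A lower D-error corresponds to a smaller generalised variance of the parameter estimates under the assumed priors.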

## Step 1: Load modules and define attributes

The following line loads `RUMDesign`, which is the class of designs for RUM models:

In [1]:
from choicedesign.design import RUMDesign

Each attribute is defined as a dictionary that contains the following keys:

* `name`: a string with the attribute name,
* `levels`: a list of levels of the attribute,
* `coding`: if `numeric`, the attribute is not coded and enters the utility function with its level values; if `dummy`, the attribute is coded as a set of dummy variables. The first level is assumed to be the base level. Dummy attributes must have consecutive integer levels starting from zero (e.g., `[0, 1, ...]`),
* `par`: the prior parameters of the attribute. If `coding` is `dummy`, one parameter per non-base level must be defined, i.e., (number of levels $-\,1$) parameters.
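
The coding rules above can be illustrated with a small sketch. The helpers `dummy_code` and `check_attribute` are hypothetical and not part of ChoiceDesign; they only mirror the rules stated in the list, plus the assumption that a numeric attribute takes exactly one prior parameter.

```python
def dummy_code(level, levels):
    """Expand one level into len(levels) - 1 dummies; the first level is the base."""
    return [1 if level == lv else 0 for lv in levels[1:]]


def check_attribute(att):
    """Return the expected number of priors, or raise ValueError if the dict is inconsistent."""
    n_levels = len(att["levels"])
    if att["coding"] == "dummy":
        # Dummy attributes need consecutive integer levels starting from zero.
        if att["levels"] != list(range(n_levels)):
            raise ValueError(f"{att['name']}: dummy levels must be 0, 1, 2, ...")
        n_expected = n_levels - 1
    else:
        # Assumption: a numeric attribute has a single slope parameter.
        n_expected = 1
    if len(att["par"]) != n_expected:
        raise ValueError(f"{att['name']}: expected {n_expected} prior(s), got {len(att['par'])}")
    return n_expected


example = {"name": "A", "levels": [0, 1, 2], "coding": "dummy", "par": [-0.1, -0.2]}
n_priors = check_attribute(example)
coded = [dummy_code(level, example["levels"]) for level in example["levels"]]
print(n_priors, coded)
```

The base level maps to a vector of zeros, so its utility contribution is zero and the priors in `par` are read as differences from the base.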

The following lines define seven attributes, named $A$ to $G$. The first six attributes are dummy coded, whereas the last attribute is numeric:

In [2]:
Att1 = {'name': 'A',
        'levels': [0, 1, 2],
        'coding': 'dummy',
        'par': [-0.1, -0.2]}

Att2 = {'name': 'B',
        'levels': [0, 1, 2],
        'coding': 'dummy',
        'par': [0.1, 0.15]}

Att3 = {'name': 'C',
        'levels': [0, 1],
        'coding': 'dummy',
        'par': [-0.1]}

Att4 = {'name': 'D',
        'levels': [0, 1],
        'coding': 'dummy',
        'par': [-0.4]}

Att5 = {'name': 'E',
        'levels': [0, 1],
        'coding': 'dummy',
        'par': [-0.1]}

Att6 = {'name': 'F',
        'levels': [0, 1],
        'coding': 'dummy',
        'par': [-0.4]}

Att7 = {'name': 'G',
        'levels': [100, 250, 500, 750],
        'coding': 'numeric',
        'par': [-0.004]}

## Step 2: Construct experimental design object

The second step consists of constructing the experimental design object. `RUMDesign` admits the following parameters:

* `atts_list`: a list of attributes,
* `n_alts`: an integer with the number of alternatives of the design,
* `ncs`: an integer with the number of choice situations.

Additionally, `RUMDesign` admits the following optional parameters:

* `optout`: whether the design must contain an opt-out alternative. If `True`, a new alternative is created at the end of the design. By default `False`.
* `asc`: a dictionary that specifies alternative-specific constants (ASC) and their prior parameters. Each key is the number of the alternative that receives an ASC, and each value is the prior parameter of that ASC.

The following lines define an object named `design` that contains a RUM design with the seven attributes, two alternatives, 60 choice situations and an opt-out alternative with an ASC. Because the opt-out is added after the two regular alternatives, its ASC is declared under key `3`, with a prior parameter of -1:

In [3]:
design = RUMDesign(
    atts_list=[Att1,Att2,Att3,Att4,Att5,Att6,Att7],
    n_alts=2,
    ncs=60,
    optout=True,
    asc={3: -1})
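
To see how the priors and the coding combine, the following sketch evaluates the MNL choice probabilities for one illustrative choice situation. It does not call ChoiceDesign: the helper names and the two profiles are chosen for illustration, and the opt-out utility is assumed to consist of its ASC only.

```python
import math

# Priors copied from the attribute definitions: one value per non-base level
# for dummy attributes, and a single slope for the numeric attribute G.
DUMMY_PRIORS = {"A": [-0.1, -0.2], "B": [0.1, 0.15], "C": [-0.1],
                "D": [-0.4], "E": [-0.1], "F": [-0.4]}
NUMERIC_PRIORS = {"G": -0.004}
ASC_OPTOUT = -1.0  # assumption: the opt-out utility is just its ASC


def utility(profile):
    """Systematic utility of one alternative given its attribute levels."""
    v = 0.0
    for name, pars in DUMMY_PRIORS.items():
        level = profile[name]
        if level > 0:  # level 0 is the base level and contributes nothing
            v += pars[level - 1]
    for name, par in NUMERIC_PRIORS.items():
        v += par * profile[name]
    return v


def mnl_probabilities(utilities):
    """Logit probabilities, shifted by the maximum for numerical stability."""
    m = max(utilities)
    expv = [math.exp(u - m) for u in utilities]
    total = sum(expv)
    return [e / total for e in expv]


alt1 = {"A": 2, "B": 2, "C": 1, "D": 0, "E": 1, "F": 1, "G": 100}
alt2 = {"A": 1, "B": 1, "C": 0, "D": 1, "E": 0, "F": 0, "G": 250}
utilities = [utility(alt1), utility(alt2), ASC_OPTOUT]
probs = mnl_probabilities(utilities)
print([round(u, 3) for u in utilities], [round(p, 3) for p in probs])
```

With these levels the three utilities are -1.05, -1.4 and -1, which gives choice probabilities of roughly 0.36, 0.26 and 0.38 for the two alternatives and the opt-out.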

## Step 3: Optimise and show the final design

The method `optimise()` starts the optimisation routine using the random swapping algorithm. The user can supply conditions that the final design must satisfy and the number of blocks of the final design, and must define at least one stopping criterion.
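
The general idea of random swapping can be sketched in a few lines. This is a simplified illustration and not ChoiceDesign's implementation: it assumes two alternatives, two numeric attributes with made-up priors, no conditions and no blocks. The function names are hypothetical; the stopping arguments are named after the `optimise()` options `iter_lim`, `noimprov_lim` and `time_lim`.

```python
import math
import random
import time

BETA = (-0.4, -0.004)  # illustrative priors for two numeric attributes


def d_error(design, beta=BETA):
    """D-error of a two-alternative, two-parameter MNL design."""
    i11 = i12 = i22 = 0.0
    for alt1, alt2 in design:
        v1 = sum(b * x for b, x in zip(beta, alt1))
        v2 = sum(b * x for b, x in zip(beta, alt2))
        p1 = 1.0 / (1.0 + math.exp(v2 - v1))
        w = p1 * (1.0 - p1)  # binary-logit weight of the attribute difference
        d1, d2 = alt1[0] - alt2[0], alt1[1] - alt2[1]
        i11 += w * d1 * d1
        i12 += w * d1 * d2
        i22 += w * d2 * d2
    det = i11 * i22 - i12 * i12  # determinant of the information matrix
    return math.inf if det <= 1e-12 else det ** (-1.0 / 2.0)


def random_swap(design, iter_lim=2000, noimprov_lim=500, time_lim=None, seed=0):
    """Swap levels of one attribute between two choice situations; keep improvements."""
    rng = random.Random(seed)
    design = [[list(alt) for alt in cs] for cs in design]  # work on a copy
    best = d_error(design)
    start = time.monotonic()
    it = stale = 0
    while it < iter_lim and stale < noimprov_lim:
        if time_lim is not None and time.monotonic() - start > time_lim * 60:
            break  # time_lim is expressed in minutes
        it += 1
        s, t = rng.sample(range(len(design)), 2)
        j, k = rng.randrange(2), rng.randrange(2)
        design[s][j][k], design[t][j][k] = design[t][j][k], design[s][j][k]
        new = d_error(design)
        if new < best:
            best, stale = new, 0
        else:  # undo the swap
            design[s][j][k], design[t][j][k] = design[t][j][k], design[s][j][k]
            stale += 1
    return design, best, it


rng0 = random.Random(1)
initial = [[[rng0.choice([0, 1]), rng0.choice([100, 250, 500, 750])]
            for _ in range(2)] for _ in range(12)]
init_err = d_error(initial)
final_design, final_err, n_iter = random_swap(initial)
print(init_err, final_err, n_iter)
```

Because a swap only exchanges values within the same attribute column of the same alternative, the number of times each level appears in that column is preserved throughout the search.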

`optimise()` admits the following options:

* A list of conditions that the final design must satisfy. Each element is a string that contains a single condition. Conditions can be binary relations (e.g., `X > Y`, where `X` and `Y` are attributes of a specific alternative) or conditional relations (e.g., `if X > a then Y < b`, where `a` and `b` are values). Within an `if` condition, several conditions can be combined with the operator `&`,
* `n_blocks`: number of blocks of the final design. The number of choice situations must be a multiple of `n_blocks`,
* `iter_lim`: number of iterations before the algorithm stops,
* `noimprov_lim`: number of iterations without improvement before the algorithm stops,
* `time_lim`: time (in minutes) before the algorithm stops,
* `seed`: random seed,
* `verbose`: whether status messages and progress are shown.
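
As a rough illustration of blocking, the sketch below assigns choice situations to blocks of equal size, which requires the number of choice situations to be divisible by the number of blocks. The function `assign_blocks` is hypothetical; how ChoiceDesign actually allocates choice situations to blocks is not described here and may differ.

```python
import random


def assign_blocks(ncs, n_blocks, seed=None):
    """Return one block label (1..n_blocks) per choice situation, in equal-sized blocks."""
    if ncs % n_blocks != 0:
        raise ValueError("the number of choice situations must be divisible by n_blocks")
    rng = random.Random(seed)
    labels = [b for b in range(1, n_blocks + 1) for _ in range(ncs // n_blocks)]
    rng.shuffle(labels)  # random allocation of choice situations to blocks
    return labels


labels = assign_blocks(ncs=60, n_blocks=4, seed=1)
print({b: labels.count(b) for b in sorted(set(labels))})
```

With 60 choice situations, four blocks would give each respondent 15 choice tasks instead of 60.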

The following line runs the optimisation routine for one minute:

In [4]:
optimal_design, init_perf, final_perf, final_iter, ubalance_ratio = design.optimise(time_lim = 1, verbose = True)

Generating the initial design matrix
Optimization complete 0:00:59 / D-error: 0.053359
Elapsed time: 0:01:00
D-error of initial design:  0.066302
D-error of last stored design:  0.053359
Utility Balance ratio:  46.67 %
Algorithm iterations:  237023



The optimisation returns the following objects:

* `optimal_design`: a pandas data frame with the final (optimised) design,
* `init_perf`: the D-error of the initial design,
* `final_perf`: the D-error of the final design,
* `final_iter`: the total number of iterations,
* `ubalance_ratio`: the utility balance ratio.
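
The returned values can be used to summarise the run. The short sketch below hard-codes the figures from the log above instead of calling `optimise()`, and takes the elapsed time as 60 seconds.

```python
# Figures copied from the optimisation log shown above.
init_perf = 0.066302
final_perf = 0.053359
final_iter = 237023
elapsed_seconds = 60

# Relative reduction of the D-error achieved by the search.
improvement = 100 * (init_perf - final_perf) / init_perf
iters_per_second = final_iter / elapsed_seconds

print(f"D-error reduction: {improvement:.1f} %")
print(f"Iterations per second: {iters_per_second:.0f}")
```

In this run, one minute of swapping lowered the D-error by about 19.5 % relative to the initial random design.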

Because the final design is a pandas data frame, it can be displayed directly and exported to a CSV file or spreadsheet:

In [5]:
optimal_design

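| Unnamed: 0 | CS | alt1_asc3 | alt1_A | alt1_B | alt1_C | alt1_D | alt1_E | alt1_F | alt1_G | alt2_asc3 | alt2_A | alt2_B | alt2_C | alt2_D | alt2_E | alt2_F | alt2_G | optout_asc3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 2 | 2 | 1 | 0 | 1 | 1 | 100 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 250 | 1 |
| 1 | 2 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 750 | 0 | 2 | 0 | 0 | 1 | 1 | 1 | 750 | 1 |
| 2 | 3 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 100 | 0 | 0 | 2 | 0 | 1 | 1 | 0 | 250 | 1 |
| 3 | 4 | 0 | 2 | 0 | 0 | 1 | 0 | 1 | 100 | 0 | 1 | 2 | 1 | 0 | 1 | 0 | 250 | 1 |
| 4 | 5 | 0 | 1 | 2 | 0 | 1 | 1 | 1 | 750 | 0 | 2 | 1 | 1 | 0 | 0 | 0 | 750 | 1 |
| 5 | 6 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 750 | 0 | 1 | 2 | 0 | 1 | 1 | 0 | 750 | 1 |
| 6 | 7 | 0 | 2 | 2 | 0 | 1 | 1 | 1 | 100 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 500 | 1 |
| 7 | 8 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 250 | 0 | 1 | 2 | 0 | 0 | 1 | 1 | 250 | 1 |
| 8 | 9 | 0 | 2 | 2 | 1 | 0 | 0 | 1 | 100 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 250 | 1 |
| 9 | 10 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 750 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 750 | 1 |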
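
Exporting relies only on standard pandas methods. The sketch below uses a small stand-in frame with a few of the columns shown above, and writes to an in-memory buffer so that it runs without touching the disk; in practice a file path such as `"design.csv"` would be passed instead. A column named `Unnamed: 0` usually appears when a frame is written with its index and later read back without `index_col`, and passing `index=False` avoids it.

```python
import io

import pandas as pd

# Stand-in for `optimal_design`: two rows and a few columns only.
optimal_design = pd.DataFrame({
    "CS": [1, 2],
    "alt1_A": [2, 0],
    "alt1_G": [100, 750],
    "alt2_A": [1, 2],
    "alt2_G": [250, 750],
})

buffer = io.StringIO()  # replace with a file path to write to disk
optimal_design.to_csv(buffer, index=False)  # index=False drops the row index

buffer.seek(0)
reloaded = pd.read_csv(buffer)
print(list(reloaded.columns))
```

For a spreadsheet, `optimal_design.to_excel("design.xlsx", index=False)` works the same way, provided an Excel writer such as `openpyxl` is installed.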


# References

[1] Quan, W., Rose, J. M., Collins, A. T., & Bliemer, M. C. (2011). A comparison of algorithms for generating efficient choice experiments.

