logitr: An R package for estimating multinomial and mixed logit models in "preference" and "willingness to pay" space utility specifications

logitr

logitr estimates multinomial (MNL) and mixed logit (MXL) models in R. Models can be estimated using "Preference" space or "Willingness-to-pay (WTP)" space utility parameterizations. The current version includes support for:

  • Homogeneous multinomial logit (MNL) models
  • Heterogeneous mixed logit (MXL) models (support for normal and log-normal parameter distributions).
  • Preference space utility parameterization.
  • WTP space utility parameterization.
  • A multistart optimization loop with random starting points in each iteration (useful for non-convex problems like MXL models or models with WTP space utility parameterizations).
  • A simulation function for computing the expected shares of a set of alternatives using an estimated model.

MXL models assume uncorrelated heterogeneity covariances and are estimated using maximum simulated likelihood based on the algorithms in Kenneth Train's book Discrete Choice Methods with Simulation, 2nd Edition (New York: Cambridge University Press, 2009).
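To make the two supported parameter distributions concrete, here is a minimal sketch of how uncorrelated normal and log-normal coefficient draws can be built from standard normal draws. The names (`mu`, `sigma`, `betaNormal`, `betaLogNormal`) and all values are illustrative assumptions, not the package's internal code.

```r
# Illustrative only: uncorrelated random-coefficient draws built from
# standard normal draws. All names and values here are made up.
set.seed(123)
numDraws <- 200
z <- matrix(rnorm(numDraws * 2), ncol = 2)  # independent standard normal draws

mu    <- c(a = 0.5, b = -1.0)  # means (for 'ln', the mean of log(beta))
sigma <- c(a = 0.3, b = 0.2)   # standard deviations

# Normal parameter: beta = mu + sigma * z
betaNormal <- mu[["a"]] + sigma[["a"]] * z[, 1]

# Log-normal parameter: beta = exp(mu + sigma * z), which is always positive
betaLogNormal <- exp(mu[["b"]] + sigma[["b"]] * z[, 2])

summary(betaNormal)
summary(betaLogNormal)
```

Because each parameter uses its own column of draws, the draws are independent across parameters, which is what "uncorrelated" means here.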

Installation

  1. Make sure you have the devtools library installed:

install.packages('devtools')

  2. Load the devtools library and install the logitr package:

library('devtools')
install_github('jhelvy/logitr')
library('logitr')

Required Libraries

logitr requires the nloptr library.

The main optimization loop uses the nloptr function to minimize the negative log-likelihood function. nloptr is used instead of the base R optim function because it allows the objective and the gradient to be returned by a single function. This speeds up computation considerably: the objective and gradient share many calculations (e.g. computing the probabilities), and with nloptr these only have to be computed once per evaluation. optim requires separate objective and gradient functions, so many calculations are repeated within each iteration of the optimization loop.
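The idea of sharing calculations can be sketched with a small MNL example. The function `negLLAndGrad` and the data below are made up for illustration and are not the package's own code; the sketch relies on nloptr accepting an `eval_f` that returns a list with `objective` and `gradient` elements. The nloptr call is skipped if the package is not installed.

```r
# Sketch: MNL negative log-likelihood and its gradient computed together,
# so the choice probabilities are evaluated only once per call.
negLLAndGrad <- function(beta, X, choice, obsID) {
  expV  <- exp(as.vector(X %*% beta))
  denom <- ave(expV, obsID, FUN = sum)  # sum of exp(v) within each obsID
  p     <- expV / denom                 # logit probabilities
  list(
    objective = -sum(choice * log(p)),
    gradient  = -as.vector(t(X) %*% (choice - p))
  )
}

# Made-up data: six choice occasions with two alternatives each
X <- cbind(
  price = c(rep(c(1, 2), 3), rep(c(1, 1), 3)),
  size  = c(rep(c(1, 1), 3), rep(c(2, 1), 3))
)
choice <- c(1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1)
obsID  <- rep(1:6, each = 2)

out <- negLLAndGrad(c(0, 0), X, choice, obsID)
out$objective  # equals 6 * log(2) when all coefficients are 0

if (requireNamespace("nloptr", quietly = TRUE)) {
  fit <- nloptr::nloptr(
    x0 = c(0, 0),
    eval_f = negLLAndGrad,
    X = X, choice = choice, obsID = obsID,
    opts = list(algorithm = "NLOPT_LD_LBFGS", xtol_rel = 1e-8, maxeval = 1000)
  )
  print(fit$solution)
}
```

For this particular data set the maximum likelihood solution can be worked out by hand as `c(-log(2), log(2))`, which gives a quick check that the gradient is zero at the optimum.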

Using logitr()

(For a detailed example, see the './examples/example.R' file)

The main model estimation function is the logitr() function:

model = logitr(data, choiceName, obsIDName, parNames, priceName=NULL,
               randPars=NULL, randPrice=NULL, modelSpace='pref',
               options=list(...))

The function returns a list of values, so assign the model output to a variable (e.g. "model") to store the output values.

Arguments

| Argument | Description | Default |
|---|---|---|
| data | The choice data, formatted as a data.frame object (see the Data File Setup Section for details). | -- |
| choiceName | The name of the column that identifies the choice variable. | -- |
| obsIDName | The name of the column that identifies the obsID variable. | -- |
| parNames | The names of the parameters to be estimated in the model. Must be the same as the column names in the data argument. For WTP space models, do not include price in parNames. See the Details About parNames Argument Section for more details. | -- |
| priceName | The name of the column that identifies the price variable. Only required for WTP space models. | NULL |
| randPars | A named vector whose names are the random parameters and whose values are the distributions: 'n' for normal or 'ln' for log-normal. | NULL |
| randPrice | The random distribution for the price parameter: 'n' for normal or 'ln' for log-normal. Only used for WTP space MXL models. | NULL |
| modelSpace | Set to 'wtp' for WTP space models. | 'pref' |
| options | A list of options (see the Options Section for details). | -- |

Options

| Argument | Description | Default |
|---|---|---|
| numMultiStarts | Number of times to run the optimization loop, each time starting from a different random starting point for each parameter between startParBounds. Recommended for non-convex models, such as WTP space models and MXL models. | 1 |
| keepAllRuns | Set to TRUE to keep all the model information for each multistart run. If TRUE, the logitr() function will return a list with two values: models (a list of each model), and bestModel (the model with the largest log-likelihood value). | FALSE |
| startParBounds | Set the lower and upper bounds for the starting parameters for each optimization run, which are generated by runif(n, lower, upper). | c(-1, 1) |
| startVals | A vector of values to be used as starting values for the optimization. Only used for the first run if numMultiStarts > 1. | NULL |
| useAnalyticGrad | Set to FALSE to use numerically approximated gradients (which is slower) instead of analytic gradients during estimation. | TRUE |
| scaleInputs | By default each variable in data is scaled to be between 0 and 1 before running the optimization routine because it usually helps with stability, especially if some of the variables have very large or very small values (e.g. > 10^3 or < 10^-3). Set to FALSE to turn this feature off. | TRUE |
| standardDraws | By default, a new set of standard normal draws is generated during each call to logitr (the same draws are used during each multistart too). The user can override those draws by providing a matrix of standard normal draws if desired. | NULL |
| numDraws | The number of draws to use for MXL models for the maximum simulated likelihood. | 200 |
| drawType | The type of draw to use for MXL models for the maximum simulated likelihood. Set to 'normal' to use random normal draws or 'halton' for Halton draws. | 'halton' |
| printLevel | The print level of the nloptr optimization loop. Type nloptr.print.options() for more details. | 0 |
| xtol_rel | The relative x tolerance for the nloptr optimization loop. Type nloptr.print.options() for more details. | 1.0e-8 |
| xtol_abs | The absolute x tolerance for the nloptr optimization loop. Type nloptr.print.options() for more details. | 1.0e-8 |
| ftol_rel | The relative f tolerance for the nloptr optimization loop. Type nloptr.print.options() for more details. | 1.0e-8 |
| ftol_abs | The absolute f tolerance for the nloptr optimization loop. Type nloptr.print.options() for more details. | 1.0e-8 |
| maxeval | The maximum number of function evaluations for the nloptr optimization loop. Type nloptr.print.options() for more details. | 1000 |
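As a rough illustration of the multistart options, the sketch below draws one set of random starting values per run with runif(), as the startParBounds description states. The variable names mirror the options above, but the loop structure and the `modelOptions` list are assumptions for illustration, not the package's code.

```r
# Sketch: random starting points for a multistart loop, drawn uniformly
# between startParBounds. Parameter names here are hypothetical.
set.seed(42)
numMultiStarts <- 5
startParBounds <- c(-1, 1)
parNames <- c("price", "size")

startPars <- t(replicate(
  numMultiStarts,
  runif(length(parNames), startParBounds[1], startParBounds[2])
))
colnames(startPars) <- parNames
startPars  # one row of starting values per multistart run

# An options list using only option names documented above
modelOptions <- list(
  numMultiStarts = numMultiStarts,
  keepAllRuns    = TRUE,
  startParBounds = startParBounds
)
```

Each row of `startPars` would seed one optimization run; the run with the largest log-likelihood is then the natural candidate for the best model.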

Values

| Value | Description |
|---|---|
| coef | The model coefficients at convergence. |
| standErrs | The standard errors of the model coefficients at convergence. |
| logLik | The log-likelihood value at convergence. |
| nullLogLik | The null log-likelihood value (if all coefficients are 0). |
| gradient | The gradient of the log-likelihood at convergence. |
| hessian | The hessian of the log-likelihood at convergence. |
| numObs | The number of observations. |
| numParams | The number of model parameters. |
| startPars | The starting values used. |
| multistartNumber | The multistart run number for this model. |
| time | The user, system, and elapsed time to run the optimization. |
| iterations | The number of iterations until convergence. |
| message | A more informative message with the status of the optimization result. |
| status | An integer value with the status of the optimization (positive values are successes). Type logitr.statusCodes() for a detailed description. |
| modelSpace | The model space ('pref' or 'wtp'). |
| standardDraws | The draws used during maximum simulated likelihood (for MXL models). |
| randParSummary | A summary of any random parameters (for MXL models). |
| parSetup | A summary of the distributional assumptions on each model parameter ("f"="fixed", "n"="normal distribution", "ln"="log-normal distribution"). |
| options | A list of all the model options. |
| multistartSummary | A summary of the log-likelihood values for each multistart run. |
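The nullLogLik value can be reproduced by hand. With all coefficients at 0, every alternative in a choice occasion with J alternatives has probability 1/J, so each occasion contributes log(1/J). The sketch below uses a made-up obsID vector and is only meant to show the arithmetic.

```r
# Sketch: null log-likelihood for made-up choice occasions with
# 2, 3, and 2 alternatives.
obsID <- c(1, 1, 2, 2, 2, 3, 3)
altsPerObs <- as.numeric(table(obsID))  # number of alternatives per occasion
nullLogLik <- -sum(log(altsPerObs))     # sum of log(1 / J) over occasions
nullLogLik
```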

Data File Setup

The data must be a data.frame object and arranged such that each row is an alternative from a choice observation. The choice observations do not have to be symmetric (i.e. they could each have a different number of alternatives). The columns must include all variables that will be used as model covariates. In addition, the following variables must be included:

  • obsID: A sequence of numbers that identifies each unique choice occasion. For example, if the first three choice occasions had 2 alternatives each, then the first 6 rows of the obsID variable would be 1,1,2,2,3,3.
  • choice: A dummy variable that identifies which alternative was chosen (1=chosen, 0=not chosen).

WTP space models: You must include a price variable (entries should be the price values).
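A small made-up data.frame in this layout might look like the following. The attribute columns (`price`, `size`) and their values are hypothetical; only `obsID` and `choice` are required names from the description above.

```r
# Made-up example of the expected layout: one row per alternative, with
# choice occasions of different sizes (2, 3, and 2 alternatives).
data <- data.frame(
  obsID  = c(1, 1, 2, 2, 2, 3, 3),
  choice = c(1, 0, 0, 0, 1, 0, 1),
  price  = c(10, 12, 9, 11, 14, 10, 13),
  size   = c(2, 3, 1, 2, 4, 2, 3)
)

# Sanity check: exactly one alternative is chosen in each choice occasion
choicesPerObs <- tapply(data$choice, data$obsID, sum)
stopifnot(all(choicesPerObs == 1))
data
```

Running a check like this before estimation catches rows with a missing or duplicated choice.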

Details About parNames Argument:

The model assumes that the deterministic part of the utility function is linear in parameters (v = beta ' x). Accordingly, each parameter in the parNames argument is an additive part of v. For example, for the utility model u = beta1 * price + beta2 * size + error, then the parNames argument should be c('price', 'size'), and there should be two columns of data in the data argument called price and size. If you wanted to add a third parameter, say price^2, then you should create a separate variable in the data.frame called something like priceSquared and your parNames argument would be c('price', 'size', 'priceSquared').

WTP space models: The parNames should be the WTP parameters, and the price parameter is denoted by the separate argument priceName. For example, for the utility model u = lambda(beta1 * size - price) + error, then the parNames argument should be c('size') and the priceName argument should be 'price'.
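The two parameterizations can be compared numerically. The sketch below uses made-up attribute values and illustrative values for `lambda` and `beta1`; it shows the argument vectors described above and checks that the WTP space utility equals a preference space utility with coefficients -lambda on price and lambda * beta1 on size.

```r
# Made-up attribute values for three alternatives
alts <- data.frame(price = c(10, 12, 15), size = c(2, 3, 4))

# Preference space with a quadratic price term: add the extra column
alts$priceSquared <- alts$price^2
prefParNames <- c("price", "size", "priceSquared")

# WTP space: u = lambda * (beta1 * size - price) + error
wtpParNames <- c("size")
priceName   <- "price"
lambda <- 0.2  # scale parameter, illustrative value
beta1  <- 5    # WTP for one unit of size, illustrative value
vWtp <- lambda * (beta1 * alts$size - alts$price)

# The same deterministic utility written in preference space
vPref <- -lambda * alts$price + (lambda * beta1) * alts$size
all.equal(vWtp, vPref)
```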

Using summary() with logitr

The logitr package also includes a summary function that has several variations:

  • For a single model run, it prints some summary information, including the model space, log-likelihood value at the solution, and a summary table of the model coefficients.
  • For MXL models, the function also prints a summary of the random parameters.
  • If the keepAllRuns option is set to TRUE, the function will print a summary of all the multistart runs followed by a summary of the best model (as determined by the largest log-likelihood value).

To understand the status code of any model, type logitr.statusCodes(), which prints a description of each status code from the nloptr optimization routine.

Computing and Comparing WTP

For models in the preference space, you can get a summary table of the implied WTP by using:

wtp(prefSpaceModel, priceName)

To compare the WTP between two equivalent models estimated in the preference and WTP spaces, use:

wtpCompare(prefSpaceModel, wtpSpaceModel, priceName)
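For intuition, the implied WTP point values from a preference space model follow the standard ratio of each non-price coefficient to the negative of the price coefficient. The sketch below applies that ratio to hypothetical coefficients; it is not the internals of wtp(), whose summary table may report more than these point values.

```r
# Hypothetical preference space coefficients (not real estimates)
prefCoefs <- c(price = -0.4, size = 1.2, brandA = 0.8)
priceName <- "price"

# Implied WTP: each non-price coefficient divided by the negative of the
# price coefficient
nonPrice   <- setdiff(names(prefCoefs), priceName)
impliedWtp <- prefCoefs[nonPrice] / (-prefCoefs[[priceName]])
impliedWtp
```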

Simulation

After estimating a model, modelers often want to use the results to simulate the expected shares of a particular set of alternatives. This can be done using the function simulateShares(). The simulation reports the expected share as well as a confidence interval for each alternative:

shares = simulateShares(model, alts, priceName=NULL, alpha=0.025)

Arguments:

| Argument | Description | Default |
|---|---|---|
| model | A MNL or MXL model estimated using the logitr package. | -- |
| alts | A data frame of the alternatives. Each row should be an alternative, and each column an attribute for which there is a corresponding coefficient in the estimated model. | -- |
| priceName | The name of the column in alts that identifies price (only required for WTP space models). | NULL |
| alpha | The probability in each tail of the confidence interval (e.g. 0.025 results in a 95% CI). | 0.025 |
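For an MNL model, the expected shares are the logit probabilities of the alternatives. The sketch below computes them for hypothetical coefficients and a made-up set of alternatives; it does not reproduce the confidence intervals that simulateShares() reports, and it is not the package's implementation.

```r
# Hypothetical MNL coefficients and a made-up set of alternatives
coefs <- c(price = -0.4, size = 1.2)
alts  <- data.frame(price = c(10, 12, 15), size = c(2, 3, 4))

# Deterministic utility of each alternative, then logit shares
v      <- as.vector(as.matrix(alts[, names(coefs)]) %*% coefs)
expV   <- exp(v - max(v))  # subtract max(v) for numerical stability
shares <- expV / sum(expV)
shares
```

Selecting the columns by `names(coefs)` keeps the attribute order aligned with the coefficient order, which matters if `alts` has extra or reordered columns.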

Author, Version, and License Information

  • Author: John Paul Helveston (www.jhelvy.com)
  • Date First Written: Sunday, September 28, 2014
  • Most Recent Update: Thursday, September 6, 2018
  • License: GPL-3
  • Latest Version: 1.2

Citation Information

If you use this package in a publication, I would greatly appreciate it if you cited it. You can get the citation information by typing this into R:

citation('logitr')