
# Discrete Choice Methods with Simulation 
## Introduction

An agent faces a choice among a set of options.
The outcome of the decision is denoted $y$ and is discrete.
Some factors determining the agent's choice are observed by the researcher and some are not. 
The observed factors are denoted $x$ and the unobserved ones $\epsilon$.
The function $h$ such that $y=h(x, \epsilon)$ is called the behavioral process.  

The probability that the agent chooses an outcome is the probability that the unobserved factors are such that the behavioral process results in that outcome.
$P(y|x) = P(\epsilon \mbox{ s.t. } h(x, \epsilon) = y)$.  
This probability can be expressed as the integral $\int 1_{h(x, \epsilon)=y}f(\epsilon)d\epsilon$ where $f(\epsilon)$ is the density of the unobserved terms.  
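There are three possibilities for evaluating that integral:
- Complete closed-form expression: solve the integral analytically.
- Complete simulation: approximate the integral by averaging the indicator over random draws of $\epsilon$ from $f$, since integration over a density is a form of averaging.
- Partial simulation, partial closed form: integrate analytically over some of the unobserved terms and simulate over the rest.  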
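
As a minimal sketch of the complete-simulation approach, the code below approximates $P(y|x)$ by the share of random draws of $\epsilon$ for which $h(x, \epsilon) = y$. The behavioral process `h`, the standard normal density, and the value $x = 0.5$ are assumptions chosen only for illustration; they are not taken from the text.

```python
import math
import random


def simulate_probability(h, x, y, draw_eps, n_draws=100_000, seed=0):
    """Approximate P(y | x) as the share of draws with h(x, eps) == y."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_draws) if h(x, draw_eps(rng)) == y)
    return hits / n_draws


# Hypothetical behavioral process: outcome 1 when x + eps > 0, else 0.
def h(x, eps):
    return 1 if x + eps > 0 else 0


# Assumed density f of the unobserved term: standard normal.
def draw_eps(rng):
    return rng.gauss(0.0, 1.0)


p_sim = simulate_probability(h, 0.5, 1, draw_eps)

# For this particular h and f the integral has a closed form, Phi(x),
# which gives a reference value for the simulated probability.
p_exact = 0.5 * (1.0 + math.erf(0.5 / math.sqrt(2.0)))

print(f"simulated: {p_sim:.4f}  closed form: {p_exact:.4f}")
```

The simulated share should approach the closed-form value as the number of draws grows.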

We will take a look at behavioral models that have been proposed to describe the choice process.

## Properties of Discrete Choice Models

### The Choice Set
Decision makers have to choose among alternatives and the set of alternatives is called the choice set.
- The alternatives must be mutually exclusive from the decision maker's perspective
- The decision maker necessarily chooses one of the alternatives
- The number of alternatives must be finite

### Derivation of Choice Probabilities
We assume that the decision maker maximizes utility.  
Random Utility Models: A decision maker $n$ faces a choice among $J$ alternatives. $U_{nj} = $ Utility that $n$ obtains from alternative $j$.  
Behavioral model: choose alternative $i$ if and only if $U_{ni} > U_{nj} \forall j \neq i $  
The researcher does not observe the decision maker's utility. The researcher observes some attributes of the alternatives as faced by the decision maker denoted $x_{nj}$ and some attributes of the decision maker $s_n$.  
Representative utility: $V_{nj} = V(x_{nj}, s_n)$.  
Utility is decomposed as $U_{nj} = V_{nj} + \epsilon_{nj}$ where the researcher does not know $\epsilon_{nj}$ and thus treats these terms as random.
The joint density of the random vector $\epsilon_n' = (\epsilon_{n1}, ..., \epsilon_{nJ})$ is denoted $f(\epsilon_n)$.  
Probability that $n$ chooses alternative $i$:   
$$P_{ni} = P(U_{ni}>U_{nj}, \forall j\neq i) $$  
$$P_{ni} = P(\epsilon_{nj} - \epsilon_{ni} < V_{ni} - V_{nj}, \forall j\neq i) $$   
$$P_{ni} = \int 1_{\{\epsilon_{nj} - \epsilon_{ni} < V_{ni} - V_{nj}, \forall j\neq i\}}f(\epsilon_n)d\epsilon_n $$ where $f(\epsilon_n)$ represents the density of the unobserved portion of utility.  
Logit and nested logit have closed-form expressions for this integral.

### Specific Models
Different choice models are derived under different specifications of the density of unobserved factors $f(\epsilon_n)$.  
Logit assumes that the unobserved factors are uncorrelated over alternatives and have the same variance for all alternatives.  
GEV (generalized extreme value) models allow the unobserved factors to be correlated over alternatives. The most widely used member, nested logit, places the alternatives into several groups called nests, with unobserved factors having the same correlation for all alternatives within a nest and no correlation for alternatives in different nests.

## Logit
A decision maker $n$ faces $J$ alternatives. 
The utility that the decision maker obtains from alternative $j$ is decomposed into a part known by the researcher and an unknown part treated by the researcher as random: $U_{nj} = V_{nj} + \epsilon_{nj}$.  
The logit model assumes that each $\epsilon_{nj}$ is independently, identically distributed extreme value. 
The assumption that the unobserved portion of utility for one alternative is unrelated to the unobserved portion of utility for another alternative is very restrictive.   
The density for each unobserved component of utility is $f(\epsilon_{nj}) = e^{-\epsilon_{nj}}e^{-e^{-\epsilon_{nj}}}$.  
The cumulative distribution is $F(\epsilon_{nj}) = e^{-e^{-\epsilon_{nj}}}$.  
The difference between two independent extreme value variables follows a logistic distribution.  
Thus if $\epsilon_{nji}^* = \epsilon_{nj} - \epsilon_{ni}$ then $F(\epsilon_{nji}^*) = \frac{e^{\epsilon_{nji}^*}}{1+e^{\epsilon_{nji}^*}}$.  
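
The logistic result can be checked numerically. The sketch below draws pairs of independent extreme value variables by inverting $F(\epsilon) = e^{-e^{-\epsilon}}$ and compares the empirical distribution of their difference with the logistic CDF $e^{x}/(1+e^{x})$. The evaluation points and the number of draws are arbitrary choices made for illustration.

```python
import math
import random


def gumbel_draw(rng):
    """Inverse-CDF draw from F(e) = exp(-exp(-e))."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() returns values in [0, 1)
        u = rng.random()
    return -math.log(-math.log(u))


def logistic_cdf(x):
    """Logistic CDF, e^x / (1 + e^x), written in an equivalent form."""
    return 1.0 / (1.0 + math.exp(-x))


rng = random.Random(0)
n_draws = 100_000
diffs = [gumbel_draw(rng) - gumbel_draw(rng) for _ in range(n_draws)]

# Empirical CDF of the difference at a few arbitrary points.
empirical_cdf = {}
for x in (-1.0, 0.0, 1.5):
    empirical_cdf[x] = sum(d <= x for d in diffs) / n_draws
    print(f"x={x:+.1f}  empirical={empirical_cdf[x]:.4f}  logistic={logistic_cdf(x):.4f}")
```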

We now derive the logit choice probabilities:  
$$ P_{ni} = P(V_{ni} + \epsilon_{ni} > V_{nj} + \epsilon_{nj} \forall j \neq i)$$  
$$ P_{ni} = P(\epsilon_{nj} < \epsilon_{ni} + V_{ni} - V_{nj} \; \forall j \neq i) $$  
Integrating this probability over the extreme value density of $\epsilon_{ni}$ gives the closed form  
$$ P_{ni} = \frac{e^{V_{ni}}}{\sum_j e^{V_{nj}}} = \frac{e^{\beta x_{ni}}}{\sum_j e^{\beta x_{nj}}}$$  
where the second equality assumes that representative utility is linear in parameters, $V_{nj} = \beta x_{nj}$, with $x_{nj}$ a vector of observed variables relating to alternative $j$.
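
The closed-form logit probabilities can be compared with a direct simulation of the behavioral model. In the sketch below, `logit_probabilities` evaluates $e^{V_{ni}}/\sum_j e^{V_{nj}}$, while `simulated_probabilities` draws iid extreme value errors and counts how often each alternative has the highest utility. The representative utilities in `v` are hypothetical values used only for illustration.

```python
import math
import random


def logit_probabilities(v):
    """Closed-form logit probabilities exp(V_i) / sum_j exp(V_j)."""
    v_max = max(v)  # subtracting the max leaves the ratio unchanged and avoids overflow
    exp_v = [math.exp(vj - v_max) for vj in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]


def gumbel_draw(rng):
    """Inverse-CDF draw from the extreme value CDF exp(-exp(-e))."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def simulated_probabilities(v, n_draws=100_000, seed=0):
    """Share of draws in which each alternative has the highest utility."""
    rng = random.Random(seed)
    counts = [0] * len(v)
    for _ in range(n_draws):
        utilities = [vj + gumbel_draw(rng) for vj in v]
        counts[utilities.index(max(utilities))] += 1
    return [c / n_draws for c in counts]


v = [0.0, 1.0, -0.5]  # hypothetical representative utilities
p_closed = logit_probabilities(v)
p_sim = simulated_probabilities(v)

for i, (pc, ps) in enumerate(zip(p_closed, p_sim)):
    print(f"alternative {i}: closed form {pc:.4f}, simulated {ps:.4f}")
```

Subtracting the largest $V_{nj}$ before exponentiating is a numerical safeguard only; it cancels in the ratio.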

## GEV
When the unobserved portions of utility are correlated, a more general model than standard logit is needed.  
The set of alternatives is partitioned into $K$ nonoverlapping subsets denoted $B_1, ..., B_K$ and called nests.
The utility that person $n$ obtains from alternative $j$ in nest $B_k$ is $U_{nj} = V_{nj} + \epsilon_{nj}$  
The nested logit model is obtained by assuming that the vector of unobserved utility $\epsilon_n = \{\epsilon_{n1}, ..., \epsilon_{nJ}\}$ has cumulative distribution 
$\exp(-\sum_{k=1}^K (\sum_{j \in B_k} e^{-\frac{\epsilon_{nj}}{\lambda_k}})^{\lambda_k})$  

The probability of choosing alternative $i \in B_k$ is then: $P_{ni} = \frac{e^{V_{ni}/\lambda_k}(\sum_{j \in B_k}e^{V_{nj}/\lambda_k})^{\lambda_k -1}}{\sum_{l=1}^K(\sum_{j \in B_l} e^{V_{nj}/\lambda_l})^{\lambda_l}}$
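
The nested logit formula translates directly into code. The following is a minimal sketch: the function name, the dictionary and list data layout, and the numerical values are assumptions made for illustration. It also checks that setting every $\lambda_k = 1$ recovers the standard logit probabilities.

```python
import math


def nested_logit_probabilities(v, nests, lambdas):
    """Nested logit choice probabilities.

    v       : dict mapping each alternative to its representative utility
    nests   : list of lists of alternatives (a partition of the keys of v)
    lambdas : list with one lambda_k per nest
    """
    # Inner sums: sum over j in B_k of exp(V_j / lambda_k), one per nest.
    inner = [sum(math.exp(v[j] / lam) for j in members)
             for members, lam in zip(nests, lambdas)]
    # Denominator: sum over nests of (inner sum) ** lambda_l.
    denom = sum(s ** lam for s, lam in zip(inner, lambdas))
    probs = {}
    for members, lam, s in zip(nests, lambdas, inner):
        for i in members:
            probs[i] = math.exp(v[i] / lam) * s ** (lam - 1.0) / denom
    return probs


# Hypothetical example: alternatives "a" and "b" share a nest, "c" is alone.
v = {"a": 0.2, "b": -0.1, "c": 0.5}
nests = [["a", "b"], ["c"]]

p_nested = nested_logit_probabilities(v, nests, [0.5, 1.0])
p_lambda_one = nested_logit_probabilities(v, nests, [1.0, 1.0])

# Standard logit probabilities for comparison with the lambda = 1 case.
total = sum(math.exp(u) for u in v.values())
p_logit = {k: math.exp(u) / total for k, u in v.items()}

print(p_nested)
print(p_lambda_one)
print(p_logit)
```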

# Application to the Revenue Management Problem

We assume that the decision maker can choose either to buy a ticket for a flight or to not buy it.  
If he chooses to buy it, he can then buy it from airline $1$ or from airline $2$.  
The two alternatives corresponding to the two airlines are part of the same nest. The parameter $\lambda$ measures the degree of independence in unobserved utility between the two alternatives of this nest.  
A value of $\lambda = 1$ indicates complete independence within the nest.  

We assume the following representative utilities for the $3$ possible alternatives:
- $V_{\mbox{no go}} = 0$
- $V_{\mbox{go with 1}} = \beta x_1$
- $V_{\mbox{go with 2}} = \beta x_2$

Here $x_1$ and $x_2$ are the respective prices of company $1$ and company $2$.  

## Logit 
With a simple logit model, the probability that the decision maker chooses company $i$ is:  
$P_i = \frac{e^{\beta x_{i}}}{1 + e^{\beta x_{1}} + e^{\beta x_{2}}}$  


## GEV
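With a nested logit (GEV) model in which the two airlines form one nest with parameter $\lambda$ and the no-purchase alternative forms its own nest, the probability that the decision maker chooses company $i$ is:  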
$P_i = \frac{e^{\beta x_i/\lambda}(e^{\beta x_1/\lambda} + e^{\beta x_2/\lambda})^{\lambda -1}}{1 + (e^{\beta x_1/\lambda} + e^{\beta x_2/\lambda})^{\lambda}}$ 
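
To illustrate how the two specifications differ, the sketch below evaluates both formulas for the three alternatives and then raises the price of airline 1. The values of $\beta$, $\lambda$, and the prices are hypothetical and chosen only for illustration. Under logit the ratio of the airline 2 probability to the no-purchase probability does not depend on $x_1$, whereas under the nested model with $\lambda < 1$ that ratio rises when $x_1$ increases, because airline 2 is a closer substitute for airline 1 than not flying.

```python
import math


def logit_airline(beta, x1, x2):
    """Logit probabilities for: no purchase, airline 1, airline 2."""
    e1, e2 = math.exp(beta * x1), math.exp(beta * x2)
    denom = 1.0 + e1 + e2
    return {"no go": 1.0 / denom, "airline 1": e1 / denom, "airline 2": e2 / denom}


def nested_airline(beta, x1, x2, lam):
    """Nested logit probabilities with both airlines in one nest."""
    e1, e2 = math.exp(beta * x1 / lam), math.exp(beta * x2 / lam)
    s = e1 + e2
    denom = 1.0 + s ** lam
    return {
        "no go": 1.0 / denom,
        "airline 1": e1 * s ** (lam - 1.0) / denom,
        "airline 2": e2 * s ** (lam - 1.0) / denom,
    }


beta, lam = -0.01, 0.5  # hypothetical price sensitivity and nest parameter
x2 = 120.0              # hypothetical price of airline 2

ratios = {}
for x1 in (100.0, 150.0):  # airline 1 raises its (hypothetical) price
    p_logit = logit_airline(beta, x1, x2)
    p_nested = nested_airline(beta, x1, x2, lam)
    ratios[x1] = (
        p_logit["airline 2"] / p_logit["no go"],
        p_nested["airline 2"] / p_nested["no go"],
    )
    print(f"x1={x1:.0f}  logit ratio={ratios[x1][0]:.4f}  nested ratio={ratios[x1][1]:.4f}")
```

With $\lambda = 1$ the nested formula reduces to the logit formula, consistent with complete independence within the nest.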