## Introduction

An agent faces a choice among a set of options.
The outcome of the decision is denoted $y$ and is discrete.
Some factors determining the agent's choice are observed by the researcher and some are not. 
The observed factors are denoted $x$ and the unobserved ones $\epsilon$.
The function $h$ such that $y=h(x, \epsilon)$ is called the behavioral process.  

The probability that the agent chooses an outcome is the probability that the unobserved factors are such that the behavioral process results in that outcome.
$P(y|x) = P(\epsilon \mbox{ s.t. } h(x, \epsilon) = y)$.  
This probability can be expressed as the integral $\int 1_{h(x, \epsilon)=y}f(\epsilon)d\epsilon$ where $f(\epsilon)$ is the density of the unobserved terms.  
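There are three ways to evaluate this integral:
- Complete closed-form expression: solve the integral analytically.
- Complete simulation: since integration over a density is a form of averaging, draw values of $\epsilon$ from $f$, evaluate the indicator for each draw, and average the results.
- Partial simulation, partial closed form: integrate analytically over some of the unobserved factors and simulate over the rest.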
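
The simulation approach can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and does not come from the text: the binary behavioral process `behavioral_process` ($y = 1$ if $\beta x + \epsilon > 0$), the standard normal density for $\epsilon$, and the function names. The simulated probability is the average of the indicator over draws of $\epsilon$; for this particular example it can be checked against the standard normal CDF.

```python
import math
import random


def behavioral_process(x, eps, beta=1.0):
    """Hypothetical binary process h: y = 1 if beta * x + eps > 0, else 0."""
    return 1 if beta * x + eps > 0 else 0


def simulated_probability(y, x, n_draws=100_000, seed=0):
    """Approximate P(y | x) by averaging the indicator 1{h(x, eps) = y} over draws of eps."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        eps = rng.gauss(0.0, 1.0)  # assumed density f: standard normal
        if behavioral_process(x, eps) == y:
            hits += 1
    return hits / n_draws


def reference_probability(y, x, beta=1.0):
    """Reference value for this example: P(y = 1 | x) = Phi(beta * x), the normal CDF."""
    p1 = 0.5 * (1.0 + math.erf(beta * x / math.sqrt(2.0)))
    return p1 if y == 1 else 1.0 - p1


if __name__ == "__main__":
    print(simulated_probability(1, 0.5), reference_probability(1, 0.5))
```

The two printed values should agree up to simulation noise, which shrinks as `n_draws` grows.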

We will take a look at behavioral models that have been proposed to describe the choice process.

## Properties of Discrete Choice Models

### The Choice Set
Decision makers have to choose among alternatives and the set of alternatives is called the choice set.
- The alternatives must be mutually exclusive from the decision maker's perspective
- The decision maker necessarily chooses one of the alternatives
- The number of alternatives must be finite

### Derivation of Choice Probabilities
We assume that the decision maker maximizes utility.  
Random Utility Models: A decision maker $n$ faces a choice among $J$ alternatives. $U_{nj} = $ Utility that $n$ obtains from alternative $j$.  
Behavioral model: choose alternative $i$ if and only if $U_{ni} > U_{nj} \forall j \neq i $  
The researcher does not observe the decision maker's utility. The researcher observes some attributes of the alternatives as faced by the decision maker denoted $x_{nj}$ and some attributes of the decision maker $s_n$.  
Representative utility: $V_{nj} = V(x_{nj}, s_n)$.  
Utility is decomposed as $U_{nj} = V_{nj} + \epsilon_{nj}$ where the researcher does not know $\epsilon_{nj}$ and thus treats these terms as random.
The joint density of the random vector $\epsilon_n' = (\epsilon_{n1}, ..., \epsilon_{nJ})$ is denoted $f(\epsilon_n)$.  
Probability that $n$ chooses alternative $i$: 
$$P_{ni} = P(U_{ni}>U_{nj}, \forall j\neq i) $$
$$P_{ni} = P(\epsilon_{nj} - \epsilon_{ni} < V_{ni} - V_{nj}, \forall j\neq i) $$ 
$$P_{ni} = \int 1_{\{\epsilon_{nj} - \epsilon_{ni} < V_{ni} - V_{nj}, \forall j\neq i\}}f(\epsilon_n)d\epsilon_n $$ where $f(\epsilon_n)$ represents the density of the unobserved portion of utility.  
Logit and nested logit have closed-form expressions for this integral.
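
As an illustration of a closed form, the sketch below uses the standard logit formula $P_{ni} = e^{V_{ni}} / \sum_j e^{V_{nj}}$, which holds when the $\epsilon_{nj}$ are independent draws from a type I extreme value (Gumbel) distribution. The text above does not state this formula, so treat it as supplied background. The function names and the example utilities are hypothetical. The second function evaluates the same integral by simulation, counting how often each alternative has the highest utility.

```python
import math
import random


def logit_probabilities(v):
    """Closed-form logit probabilities exp(V_i) / sum_j exp(V_j) for one decision maker."""
    v_max = max(v)  # subtracting the max avoids overflow and leaves the ratios unchanged
    exp_v = [math.exp(v_j - v_max) for v_j in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]


def draw_gumbel(rng):
    """Inverse-CDF draw from the standard Gumbel (type I extreme value) distribution."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def simulated_probabilities(v, n_draws=100_000, seed=0):
    """Share of draws in which each alternative has the highest utility V_j + eps_j."""
    rng = random.Random(seed)
    counts = [0] * len(v)
    for _ in range(n_draws):
        utilities = [v_j + draw_gumbel(rng) for v_j in v]
        counts[utilities.index(max(utilities))] += 1
    return [c / n_draws for c in counts]


if __name__ == "__main__":
    v = [1.0, 0.5, 0.0]  # hypothetical representative utilities V_nj
    print(logit_probabilities(v))
    print(simulated_probabilities(v))
```

Only differences in representative utility matter here: adding the same constant to every entry of `v` leaves both sets of probabilities unchanged, in line with the expression in terms of $V_{ni} - V_{nj}$ above.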

### Specific Models
Different choice models are derived under different specifications of the density of unobserved factors $f(\epsilon_n)$.  
Logit assumes that the unobserved factors are uncorrelated over alternatives and have the same variance for all alternatives.
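GEV (generalized extreme value) models allow the unobserved factors to be correlated over alternatives. The simplest GEV model, nested logit, places the alternatives into several groups called nests, with unobserved factors having the same correlation for all alternatives within a nest and no correlation for alternatives in different nests.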
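
The nest structure described above is that of nested logit, a member of the GEV family. A minimal sketch of its choice probabilities follows, using the standard formula $P_{ni} = e^{V_{ni}/\lambda_k} \left(\sum_{j \in B_k} e^{V_{nj}/\lambda_k}\right)^{\lambda_k - 1} / \sum_l \left(\sum_{j \in B_l} e^{V_{nj}/\lambda_l}\right)^{\lambda_l}$ for alternative $i$ in nest $B_k$. The formula and the parameter $\lambda_k$ (a measure of independence within nest $k$) are not given in the text and are supplied here as background; the function and argument names are hypothetical.

```python
import math


def nested_logit_probabilities(v, nests, lam):
    """Nested logit choice probabilities for one decision maker.

    v     : list of representative utilities V_nj
    nests : list of lists of alternative indices, one list per nest B_k
    lam   : list of lambda_k, one per nest (lambda_k = 1 means no within-nest correlation)
    """
    # Sum of exp(V_j / lambda_k) over the members of each nest.
    nest_sums = [
        sum(math.exp(v[j] / lam_k) for j in members)
        for members, lam_k in zip(nests, lam)
    ]
    denom = sum(s ** lam_k for s, lam_k in zip(nest_sums, lam))
    probs = [0.0] * len(v)
    for members, lam_k, s in zip(nests, lam, nest_sums):
        for i in members:
            probs[i] = math.exp(v[i] / lam_k) * s ** (lam_k - 1.0) / denom
    return probs


if __name__ == "__main__":
    v = [1.0, 0.5, 0.0]       # hypothetical representative utilities
    nests = [[0, 1], [2]]     # alternatives 0 and 1 share a nest
    print(nested_logit_probabilities(v, nests, [0.5, 1.0]))
    print(nested_logit_probabilities(v, nests, [1.0, 1.0]))  # reduces to logit
```

With every $\lambda_k = 1$ the expression collapses to the logit formula, which is a quick sanity check on any implementation.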