# Binary Discrete Choice Models
Both observable and unobservable features can influence an individual's decisions. A utility function that accounts for both kinds of factors takes the form:

\begin{equation}
U_{ij} = V_{ij} + \epsilon_{ij}
\end{equation}

The term $U_{ij}$ is the utility of alternative $j$ for individual $i$, $V_{ij}$ is the _deterministic_ component of the utility, 
i.e., the utility associated with the observable features, and $\epsilon_{ij}$ is the random component of the utility (error model). 

## Logit choice model
One of the most common choice models is the `Logit` model. If the random components $\epsilon_{ij}$ of the utility $U_{ij}$ are _independently and identically distributed_ (IID) across the $J$ alternatives and are [Gumbel distributed](https://en.wikipedia.org/wiki/Gumbel_distribution), 
then the probability that individual $i$ chooses alternative $j$ is given by the [logit choice model](https://en.wikipedia.org/wiki/Discrete_choice):

\begin{equation}
P_{ij} = \frac{e^{V_{ij}/\mu}}{\displaystyle \sum_{k=1}^{J}e^{V_{ik}/\mu}}\qquad{j=1,\dotsc,J}
\end{equation}

where $P_{ij}$ is the probability that individual $i$ chooses alternative $j$, $V_{ij}$ is the deterministic component of the utility, and $\mu$ is a scale parameter.
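
The formula above can be evaluated in a few lines. The following is a minimal sketch, not part of the course package: `logit_probabilities` is a hypothetical helper name, the three utilities are made-up values, and subtracting the maximum before exponentiating is a standard numerical-stability step that leaves the probabilities unchanged.

```julia
# Logit choice probabilities for one individual.
# V : deterministic utilities, one entry per alternative
# μ : scale parameter (μ = 1 gives the familiar softmax form)
function logit_probabilities(V::AbstractVector{<:Real}; μ::Real = 1.0)
    z = V ./ μ
    w = exp.(z .- maximum(z))   # shift by the maximum to avoid overflow
    return w ./ sum(w)
end

# Hypothetical utilities for three alternatives
P = logit_probabilities([1.0, 2.0, 0.5])
println("P = ", P, ", sum = ", sum(P))
```

The probabilities always sum to one, and the alternative with the largest deterministic utility receives the largest probability.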

## Learning Objectives
In this example, our goal is to calculate the probability of a binary choice between purchasing a Tesla Model S or a Honda Odyssey. We will be using the `Logit` choice model to achieve this. Our objectives include:

- Introducing students to Random Utility Models (RUMs) and the `Logit` discrete choice model
- Familiarizing students with `Bernoulli` random variables and how to simulate binary choices
- Teaching students how to simulate the Random Utility Model (RUM) behind the `Logit` discrete choice model by directly sampling the `Gumbel` distribution.

By the end of this exercise, students should have a better understanding of these concepts and be able to apply them in real-world scenarios.

## Setup
The computations in this example are enabled by the [VLDecisionsPackage.jl](https://github.com/varnerlab/VLDecisionsPackage.jl.git) and several external `Julia` packages. To load the required packages and any custom code the teaching team has developed to work with these packages, we [include](https://docs.julialang.org/en/v1/manual/code-loading/) the `Include.jl` file:

In [1]:
include("Include.jl");

[32m[1m  Activating[22m[39m project at `~/Desktop/julia_work/CHEME-5760-Examples-F23`
[32m[1m    Updating[22m[39m registry at `~/.julia/registries/General.toml`
[32m[1m    Updating[22m[39m git-repo `https://github.com/varnerlab/VLDecisionsPackage.jl.git`
[32m[1m   Installed[22m[39m XML2_jll ────── v2.10.4+0
[32m[1m   Installed[22m[39m LoggingExtras ─ v1.0.2
[32m[1m    Updating[22m[39m `~/Desktop/julia_work/CHEME-5760-Examples-F23/Project.toml`
  [90m[10f378ab] [39m[93m~ VLDecisionsPackage v0.1.0 `https://github.com/varnerlab/VLDecisionsPackage.jl.git#main` ⇒ v0.1.0 `https://github.com/varnerlab/VLDecisionsPackage.jl.git#main`[39m
[32m[1m    Updating[22m[39m `~/Desktop/julia_work/CHEME-5760-Examples-F23/Manifest.toml`
  [90m[e6f89c97] [39m[93m↑ LoggingExtras v1.0.1 ⇒ v1.0.2[39m
  [90m[10f378ab] [39m[93m~ VLDecisionsPackage v0.1.0 `https://github.com/varnerlab/VLDecisionsPackage.jl.git#main` ⇒ v0.1.0 `https://github.com/varnerlab/VLDecisionsPackage.

## Data
The dataset we explore is the Tesla versus Odyssey survey presented in class. We load this dataset into the notebook using the `HondaTeslaDataSet()` function. This function returns the data as a [DataFrame type](https://dataframes.juliadata.org/stable/), which we store in the `dataset` variable.

In [2]:
dataset = HondaTeslaDataSet()

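| Row | feature (String15) | exponent (Float64) | Tesla (Float64) | Honda (Float64) |
|-----|--------------------|--------------------|-----------------|-----------------|
| 1 | sustainability | 0.2 | 5.0 | 3.0 |
| 2 | affordability | 0.1 | 2.0 | 4.0 |
| 3 | styling | 0.05 | 5.0 | 2.0 |
| 4 | usefulness | 0.3 | 2.0 | 5.0 |
| 5 | costownership | 0.1 | 4.0 | 2.0 |
| 6 | performance | 0.05 | 5.0 | 1.0 |
| 7 | safety | 0.2 | 5.0 | 5.0 |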


### Analytical Logit Choice Model
The deterministic component of the random utility function $V_{ij}$ can be any of the utility functions we studied in class or one of your creations. Let's use the `log-transformed` [Cobb-Douglas utility function](https://varnerlab.github.io/CHEME-5760-Decisions-Book/unit-1-simpledecisions/utilityfunctions.html#cobb-douglas-utility-functions):

\begin{equation}
V_{i} = \sum_{k=1}^{m}\alpha_{i,k}\ln x_{i,k}
\end{equation}

where we constrain the exponents $\alpha_{i,k}$ to sum to unity, i.e., $\sum_{k=1}^{m}\alpha_{i,k} = 1$.
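
To make the formula concrete, here is a minimal sketch that evaluates it by hand, without the course package. The function name `log_cobb_douglas` is a hypothetical helper, and the exponents and scores are copied from the survey table above.

```julia
# Log-transformed Cobb-Douglas utility: V = Σₖ αₖ ln(xₖ)
log_cobb_douglas(α, x) = sum(α .* log.(x))

# Exponents and feature scores copied from the survey table
α     = [0.2, 0.1, 0.05, 0.3, 0.1, 0.05, 0.2]
tesla = [5.0, 2.0, 5.0, 2.0, 4.0, 5.0, 5.0]
honda = [3.0, 4.0, 2.0, 5.0, 2.0, 1.0, 5.0]

@assert sum(α) ≈ 1.0   # the exponents must sum to unity

V_tesla = log_cobb_douglas(α, tesla)
V_honda = log_cobb_douglas(α, honda)
println("V_tesla = ", V_tesla, ", V_honda = ", V_honda)
```

Note that a feature score of `1.0` (the Honda `performance` score) contributes nothing to the utility, because $\ln 1 = 0$.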

#### Implementation
We construct an instance of the `VLLogTransformedCobbDouglasUtilityFunction` model type (which holds the `Cobb-Douglas` model parameters) using the `build(...)` function, where we pass the $\alpha$-vector and base `b` as parameters to the `build(...)` function:

In [3]:
model = build(VLLogTransformedCobbDouglasUtilityFunction, (
        α = dataset[:,:exponent], b = ℯ)
);

Now that we have the `VLLogTransformedCobbDouglasUtilityFunction` model, we can compute the values of the utility function (log-transformed Cobb-Douglas) using the `model(x)` short-cut syntax:

In [82]:
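V = zeros(2);
V[1] = model(dataset[:,:Tesla]); # deterministic utility of the Tesla
V[2] = model(dataset[:,:Honda]); # deterministic utility of the Honda

# print the deterministic (explained) utilities
println("The explained utility of the Tesla = $(V[1]) and Honda = $(V[2])")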

The explained utility of the Tesla = 1.2206072645530175 and Honda = 1.267042927146653


Finally, taking the scale parameter $\mu = 1$, we can compute the probability that decision maker $i$ will choose the `Tesla` option (index `1`) and store this value in the variable `p`:

In [83]:
p = exp(V[1])/(exp(V[1])+exp(V[2]))

0.4883931698990063
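
As a check on this value, note that for two alternatives and $\mu = 1$ the logit probability depends only on the difference of the deterministic utilities:

\begin{equation}
P_{i1} = \frac{e^{V_{i1}}}{e^{V_{i1}}+e^{V_{i2}}} = \frac{1}{1+e^{-(V_{i1}-V_{i2})}}
\end{equation}

With the utilities printed above, $V_{i1}-V_{i2}\approx 1.2206 - 1.2670 = -0.0464$, so $P_{i1}\approx 1/(1+e^{0.0464})\approx 1/2.0475\approx 0.4884$, in agreement with the computed `p`.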

### Bernoulli random variable
A Bernoulli random variable $X$ models a binary outcome: either `1` or `0`, 
where `1` occurs with probability $p$ and `0` occurs with probability $1-p$. 
The probability mass function (pmf) of the Bernoulli random variable $X$ is:

\begin{equation}
p_{X}(x) = \begin{cases}
    p & \text{if } x = 1 \\
    1 - p & \text{if } x = 0
  \end{cases}
\end{equation}

where $0<p<1$ is called the Bernoulli parameter. The expectation of a Bernoulli random variable equals:

\begin{equation}
\mathbb{E}\left[X\right] = p
\end{equation}

while the variance $\text{Var}\left[X\right]$ equals:

\begin{equation}
\text{Var}\left[X\right] = p(1-p)
\end{equation}

Bernoulli random variables model many binary events: coin flips (`H` or `T`), `true` or `false`, `yes` or `no`, `present` or `absent`, etc.
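
The mean and variance formulas can be checked numerically. The sketch below uses only the Julia standard library; the parameter value `p₀ = 0.3`, the seed, and the sample size are arbitrary choices made for illustration.

```julia
using Random, Statistics

rng = MersenneTwister(1234)   # fixed seed so the run is repeatable
p₀  = 0.3                     # hypothetical Bernoulli parameter
N   = 100_000                 # number of draws

# rand(rng) is uniform on [0,1), so the comparison is true with probability p₀
X = [rand(rng) < p₀ ? 1 : 0 for _ in 1:N]

println("sample mean     = ", mean(X), "  (theory: ", p₀, ")")
println("sample variance = ", var(X),  "  (theory: ", p₀*(1 - p₀), ")")
```

The sample statistics should land close to $p$ and $p(1-p)$, with the gap shrinking as `N` grows.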

In choosing between the `Tesla` and the `Odyssey`, we let `Tesla` be outcome `1` and the `Odyssey` be outcome `0`. With this encoding, the `p` computed from the `logit` model is the `Bernoulli parameter`.

#### Implementation
We create a `Bernoulli` distribution with parameter `p` using the function `Bernoulli(p)`, which is exported by the [Distributions.jl](https://github.com/JuliaStats/Distributions.jl) package. We store this distribution in the variable `d`:

In [6]:
d = Bernoulli(p);

Next, we generate $N_{s}$ samples from the `Bernoulli` distribution `d` to simulate repeated trials of choosing between `Tesla` and `Honda` using the `rand(...)` function. The values of these trials are stored in the `S` array:

In [7]:
Nₛ = 20000 # number of simulated trials
S = rand(d, Nₛ);

Finally, we estimate the probability of choosing the `Tesla` from the samples (the sample mean of a Bernoulli random variable estimates its expected value $p$):
* We count the number of `1` values (`Tesla` values) in the sample array `S` using the `findall(...)` function
* We then divide by the number of samples $N_{s}$, which gives the observed frequency of choosing `Tesla`

In [8]:
p_tesla = findall(x-> x == 1, S) |> x-> length(x)/Nₛ

0.4872
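
How close should this estimate be to `p`? The sample frequency $\hat{p}$ of $N_{s}$ independent Bernoulli trials has variance $p(1-p)/N_{s}$, so its standard error is:

\begin{equation}
\text{SE}\left[\hat{p}\right] = \sqrt{\frac{p(1-p)}{N_{s}}} \approx \sqrt{\frac{0.4884\times 0.5116}{20000}} \approx 0.0035
\end{equation}

The observed frequency `0.4872` differs from the analytical value `0.4884` by about `0.0012`, which is well within one standard error.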

### Direct Simulation of Discrete Binary Choices
There is an alternative way to simulate the choices of individuals (binary or multicategory): directly sample the random utility function for a particular error model, and then estimate the probability of choosing the Tesla `T` or Honda `H` from the sampled utilities.

#### Implementation
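First, we construct a `Gumbel` distribution with location `0` and scale $\theta = \sqrt{\pi^{2}/6}\approx 1.28$ using the function `Gumbel(0,θ)`, which is exported by the [Distributions.jl](https://github.com/JuliaStats/Distributions.jl) package. Note that the two arguments of `Gumbel` are the location and the scale, not the mean and the standard deviation: the standard Gumbel distribution is `Gumbel(0,1)`, which has mean $\gamma\approx 0.577$ and standard deviation $\pi/\sqrt{6}$. Because the scale used here is $\theta$ rather than `1`, the simulated choice frequency corresponds to the logit model with scale parameter $\mu = \theta$, which for the utilities computed above gives a `Tesla` probability of about `0.491`. We store the `Gumbel(0,θ)` instance in the `ϵ` variable: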

In [76]:
ϵ = Gumbel(0,√((π^2)/6));
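
To see what the location and scale arguments of a Gumbel distribution control, the following sketch draws Gumbel samples using only the Julia standard library and compares the sample mean and standard deviation with the theoretical values $\gamma\theta$ and $\theta\pi/\sqrt{6}$ for location `0`. The helper `sample_gumbel` is a hypothetical name, not part of Distributions.jl; it relies on the fact that if $E$ is a unit exponential random variable, then $-\ln E$ is standard Gumbel.

```julia
using Random, Statistics

# Draw n samples from a Gumbel distribution with the given location and scale.
# If E ~ Exponential(1), then -log(E) follows the standard Gumbel distribution.
function sample_gumbel(rng::AbstractRNG, location::Real, scale::Real, n::Integer)
    return location .- scale .* log.(randexp(rng, n))
end

rng = MersenneTwister(2023)          # fixed seed so the run is repeatable
γ   = Base.MathConstants.eulergamma  # Euler-Mascheroni constant ≈ 0.5772
θ   = 1.0                            # scale 1 gives the standard Gumbel
g   = sample_gumbel(rng, 0.0, θ, 200_000)

println("mean ≈ ", mean(g), "  (theory γ·θ   = ", γ*θ, ")")
println("std  ≈ ", std(g),  "  (theory θπ/√6 = ", θ*π/sqrt(6), ")")
```

Changing `θ` rescales both the mean and the standard deviation, while changing the location only shifts the mean.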

Next, we add the random perturbation, describing the unobservable component of the utility, to the deterministic utility values:

In [85]:
U₁ = V[1] .+ rand(ϵ, Nₛ);
U₂ = V[2] .+ rand(ϵ, Nₛ);

Next, we iterate through the utility function values using a `for` loop, and check which option, `T` or `H`, has the larger utility:
* If `U[i,1] > U[i,2]`, then we choose `Tesla` and add a `T` to the `choice_vector`
* Otherwise, we add an `H` to the `choice_vector`

In [86]:
U = [U₁ U₂];
choice_vector = Array{Char,1}();
for i ∈ 1:Nₛ
    if (U[i,1] > U[i,2])
        push!(choice_vector,'T')
    else
        push!(choice_vector,'H')
    end
end

In [84]:
choice_vector

20000-element Vector{Char}:
 'T': ASCII/Unicode U+0054 (category Lu: Letter, uppercase)
 'T': ASCII/Unicode U+0054 (category Lu: Letter, uppercase)
 'H': ASCII/Unicode U+0048 (category Lu: Letter, uppercase)
 'H': ASCII/Unicode U+0048 (category Lu: Letter, uppercase)
 'T': ASCII/Unicode U+0054 (category Lu: Letter, uppercase)
 'T': ASCII/Unicode U+0054 (category Lu: Letter, uppercase)
 'H': ASCII/Unicode U+0048 (category Lu: Letter, uppercase)
 'H': ASCII/Unicode U+0048 (category Lu: Letter, uppercase)
 'H': ASCII/Unicode U+0048 (category Lu: Letter, uppercase)
 'T': ASCII/Unicode U+0054 (category Lu: Letter, uppercase)
 'H': ASCII/Unicode U+0048 (category Lu: Letter, uppercase)
 'H': ASCII/Unicode U+0048 (category Lu: Letter, uppercase)
 'H': ASCII/Unicode U+0048 (category Lu: Letter, uppercase)
 ⋮
 'H': ASCII/Unicode U+0048 (category Lu: Letter, uppercase)
 'T': ASCII/Unicode U+0054 (category Lu: Letter, uppercase)
 'H': ASCII/Unicode U+0048 (category Lu: Letter, uppercase)
 'T': ASCII/Unicode U+0054 (category Lu: Letter, uppercase)

Finally, we estimate the probability of choosing the `Tesla` by analyzing the values stored in the `choice_vector`:
* We count the number of `T` values (`Tesla` values) in the sample array `choice_vector` using the `findall(...)` function
* We then divide by the number of samples $N_{s}$, which gives the observed frequency of choosing `Tesla`

In [87]:
findall(x-> x == 'T', choice_vector) |> x-> length(x)/(Nₛ)

0.4927
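
The direct simulation can be cross-checked against the closed-form binary logit probability. The sketch below is a standalone illustration using only the Julia standard library: the helper names `gumbel`, `simulate_choice`, and `binary_logit` are hypothetical, the utilities are the values printed earlier, and the seed and sample size are arbitrary. For IID Gumbel errors with scale $\theta$, the matching analytical probability uses $\mu = \theta$ in the logit formula.

```julia
using Random

# -log(Exp(1)) is standard Gumbel; multiplying by θ sets the scale
gumbel(rng, θ, n) = -θ .* log.(randexp(rng, n))

# Fraction of n simulated individuals whose utility for option 1 exceeds option 2
function simulate_choice(rng, V₁, V₂, θ, n)
    U₁ = V₁ .+ gumbel(rng, θ, n)
    U₂ = V₂ .+ gumbel(rng, θ, n)
    return count(U₁ .> U₂) / n
end

# Closed-form binary logit probability of option 1 with scale θ
binary_logit(V₁, V₂, θ) = 1 / (1 + exp(-(V₁ - V₂)/θ))

V₁, V₂ = 1.2206072645530175, 1.267042927146653   # Tesla, Honda utilities
rng = MersenneTwister(5760)

for θ in (1.0, sqrt(π^2/6))
    println("θ = ", round(θ, digits=4),
            ": simulated = ", simulate_choice(rng, V₁, V₂, θ, 200_000),
            ", analytic = ", round(binary_logit(V₁, V₂, θ), digits=4))
end
```

A larger scale adds more noise relative to the utility difference, so the choice probability moves closer to `0.5`.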