# Question 2: Homogeneous Coefficients with Demographics

In [35]:
using Plots, DataFrames, CSV, GLM
using Optim, Distributions, Random, ForwardDiff
using LinearAlgebra, StatsFuns, LaTeXTabulars

In [6]:
df = DataFrame(CSV.File("../data/ps1_ex2.csv"));
products = sort(unique(df, ["choice"]));

# Construct the data matrices: demographics D, product characteristics X, choice indicators Y
D = Array(df[:,["d.1", "d.2"]]);
X = Array(products[:, ["x.1", "x.2", "x.3"]]);
C = Array(df[:,["choice"]]);
Y = zeros(size(D)[1], size(X)[1]);
# Brute-force one-hot encoding of each individual's choice; there is probably a vectorized way to do this
for i in 1:size(Y)[1]
    Y[i, C[i]] = 1;
end 
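
One possible vectorized alternative to the loop above, shown as a sketch on hypothetical toy data (`choices`, `J`, `Y_loop` and `Y_bcast` are illustrative names, not part of the problem set). Like the loop, it assumes the choices are coded as integers `1:J`.

```julia
# Hypothetical toy data: 5 individuals choosing among J = 3 products
choices = [2, 1, 3, 3, 2]
J = 3

# Loop version, mirroring the cell above
Y_loop = zeros(length(choices), J)
for i in eachindex(choices)
    Y_loop[i, choices[i]] = 1
end

# Broadcast version: compare every choice against the row vector 1:J
Y_bcast = Float64.(choices .== (1:J)')

Y_loop == Y_bcast  # true
```

The comparison `choices .== (1:J)'` broadcasts a length-5 column against a 1×3 row, so the result is a 5×3 indicator matrix with exactly one `1` per row.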

# Part 5: Estimating $(\delta, \Gamma)$
The log-likelihood to be maximized is:
$$ \sum_i \sum_j y_{ij} \left[ \delta_j + d_i' \Gamma x_j - \log \left( {\sum_{k=1}^{31} \exp(\delta_k + d_i' \Gamma x_k)} \right) \right]  $$

Inputs are:
- $X$ is a $31 \times 3$ matrix of 31 products with 3 characteristics
- $D$ is a $4000 \times 2$ matrix of 4000 individuals with 2 demographic observables
- $Y$ is a $4000 \times 31$ matrix of 4000 individuals choosing one of 31 products

The parameters to be estimated have the following dimensions:
- $\delta$ is a vector of 31 product-specific intercepts (the wrapper below fixes $\delta_{31} = 0$ as a normalization)
- $\Gamma$ is a $2 \times 3$ matrix of coefficients

An analytic gradient could speed this up, but forward-mode automatic differentiation is fast enough here, so I do not supply one.
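
For reference, a minimal sketch of what an analytic gradient would look like, run on simulated data rather than the problem-set data. The function name `negll_and_grad` and the variables `Xs`, `Ds`, `Ys`, `δs`, `Γs` are hypothetical. The formulas come from differentiating the log-likelihood above, and a central finite-difference check guards against algebra mistakes.

```julia
using LinearAlgebra, Random

# Negative log-likelihood and its gradient; P[i, j] is the logit choice probability
function negll_and_grad(δ, Γ, X, D, Y)
    U = D * Γ * X' .+ δ'                          # N × J utilities δ_j + d_i' Γ x_j
    m = maximum(U, dims = 2)                      # row maxima, for numerical stability
    lse = m .+ log.(sum(exp.(U .- m), dims = 2))  # row-wise log-sum-exp
    P = exp.(U .- lse)                            # N × J choice probabilities
    negll = -(sum(Y .* U) - sum(lse))
    gδ = -vec(sum(Y .- P, dims = 1))              # gradient w.r.t. δ (length J)
    gΓ = -(D' * (Y .- P) * X)                     # gradient w.r.t. Γ (2 × 3)
    return negll, gδ, gΓ
end

# Simulated (hypothetical) inputs with the same shapes as in the notebook, but smaller
Random.seed!(1)
N, J = 200, 5
Xs = randn(J, 3)
Ds = randn(N, 2)
Ys = zeros(N, J)
for i in 1:N
    Ys[i, rand(1:J)] = 1
end
δs = randn(J)
Γs = randn(2, 3)

negll, gδ, gΓ = negll_and_grad(δs, Γs, Xs, Ds, Ys)

# Central finite differences on one element of each parameter block
h = 1e-6
e1 = zeros(J); e1[1] = h
fd_δ1 = (negll_and_grad(δs + e1, Γs, Xs, Ds, Ys)[1] -
         negll_and_grad(δs - e1, Γs, Xs, Ds, Ys)[1]) / (2h)
E = zeros(2, 3); E[2, 3] = h
fd_Γ23 = (negll_and_grad(δs, Γs + E, Xs, Ds, Ys)[1] -
          negll_and_grad(δs, Γs - E, Xs, Ds, Ys)[1]) / (2h)

isapprox(fd_δ1, gδ[1]; atol = 1e-4) && isapprox(fd_Γ23, gΓ[2, 3]; atol = 1e-4)
```

If this were wired into the estimation, Optim's `optimize(f, g!, x0, LBFGS())` form accepts an in-place gradient `g!(G, x)`; the gradient entry for the normalized $\delta_{31}$ would have to be set to zero there.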

In [7]:
# Negative log-likelihood of the multinomial logit
ll = function(δ, Γ)
    likelihood = 0
    for i in 1:size(D, 1)
        u = δ + X * Γ' * D[i, :]   # utilities of the 31 products for individual i
        likelihood += Y[i, :]' * u - log(sum(exp.(u)))
    end
    return -likelihood # note: this returns the negative log-likelihood, since Optim minimizes
end

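# Optim wrapper (Optim takes a single parameter vector as its argument)
ll_wrap = function(x)
    δ = x[1:31]                  # slicing copies, so x itself is not modified
    δ[31] = 0                    # normalization: δ_31 = 0, so x[31] is ignored
    Γ = reshape(x[32:37],2,3)    # column-major: x[32:37] = (γ11, γ21, γ12, γ22, γ13, γ23)
    return ll(δ, Γ)
end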

#7 (generic function with 1 method)

In [41]:
Γ

2×3 Matrix{Float64}:
 0.68645   0.300399  0.714732
 0.500846  0.262963  0.822377

In [8]:
# Minimize the negative log-likelihood
params0 = zeros(37);
optimum = optimize(ll_wrap, params0, LBFGS(), autodiff=:forward)
MLE = optimum.minimizer;
# x[31] never enters ll_wrap, so MLE[31] should stay at its starting value of 0
δ = MLE[1:31];
Γ = reshape(MLE[32:37],2,3);

In [42]:
string([11, 21, 12, 22, 13, 23])

"[11, 21, 12, 22, 13, 23]"

In [43]:
# delta (rows labelled \delta_j)
delta = hcat("\$\\delta_{" .* string.(1:31) .* "}\$", round.(δ, digits = 3))
latex_tabular("output/ps1_q2_deltas.tex",
              Tabular("cc"),
              [Rule(:top),
               delta,
               Rule(:bottom)])

# Gamma
gamma = hcat("\$\\gamma_{" .* ["11", "21", "12", "22", "13", "23"] .* "}\$", round.(MLE[32:37], digits = 3))
latex_tabular("output/ps1_q2_gammas.tex",
              Tabular("cc"),
              [Rule(:top),
               gamma,
               Rule(:bottom)])


## Double check coefficients
Here I use the first-order condition (FOC) for $\delta_j$ to verify that the predicted shares at the estimated coefficients match the observed shares:
$$ \frac{1}{N} \sum_i y_{ij} = \frac{1}{N} \sum_i \frac{\exp(\delta_j + d_i ' \Gamma x_j)}{\sum_k \exp(\delta_k + d_i ' \Gamma x_k)} $$
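
The condition above follows from differentiating the log-likelihood with respect to $\delta_j$. As a short derivation, writing $\ell$ for the log-likelihood and $P_{ij}$ for the logit choice probability (both symbols are notation introduced here):

$$ P_{ij} = \frac{\exp(\delta_j + d_i' \Gamma x_j)}{\sum_k \exp(\delta_k + d_i' \Gamma x_k)}, \qquad \frac{\partial \ell}{\partial \delta_j} = \sum_i \left( y_{ij} - P_{ij} \sum_m y_{im} \right) = \sum_i \left( y_{ij} - P_{ij} \right) = 0 $$

The middle equality uses $\sum_m y_{im} = 1$, since each individual chooses exactly one product. Dividing by $N$ gives the share-matching condition. With $\delta_{31}$ normalized to 0 there are only 30 such conditions, but the one for $j = 31$ holds as well because both the observed and the predicted shares sum to one.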

In [11]:
# Predicted shares
pred_s = zeros(31)
for i in 1:size(D)[1]
    pred_s += exp.(δ + X * Γ' * D[i,:]) / sum(exp.(δ + X * Γ' * D[i,:]))
end
pred_s = pred_s / size(D, 1)

# Actual shares
data_s = mean(Y,dims=1)'
 
# Largest signed gap between predicted and observed shares (wrap the difference in abs. for a two-sided check)
maximum(pred_s .- data_s)

1.8898771436681727e-12

# Part 7: Obtaining estimate of $\beta$
The proposed moment condition is exogeneity: the product-specific unobservable $\xi_j$ is uncorrelated with the observed characteristics $x_j$:

$$E[x_j \xi_j] = 0$$

Under this condition, $\beta$ is identified by an OLS regression of $\delta_j$ on $x_j$, with $\xi_j$ as the error term:

$$ \delta_j = x_j' \beta + \xi_j $$
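
A small sketch of why OLS delivers the sample analogue of this moment condition, using simulated data (`Xc`, `β_true`, `δ_sim`, `β_hat` and `ξ_hat` are hypothetical names and values, not the notebook's estimates). There is no intercept, matching the `0 +` in the regression formula.

```julia
using LinearAlgebra, Random

Random.seed!(2)
J = 31
Xc = randn(J, 3)                        # hypothetical product characteristics
β_true = [0.5, -1.0, 2.0]               # hypothetical coefficients
δ_sim = Xc * β_true + 0.1 * randn(J)    # simulated mean utilities

β_hat = Xc \ δ_sim                      # least-squares solution, no intercept
ξ_hat = δ_sim - Xc * β_hat              # residuals play the role of ξ_j

# Sample analogue of E[x_j ξ_j] = 0: residuals are orthogonal to every column of Xc
norm(Xc' * ξ_hat)                       # zero up to floating-point error
```

The backslash solve gives the same coefficients as the normal equations `(Xc' * Xc) \ (Xc' * δ_sim)`, which are exactly the condition $\sum_j x_j \hat{\xi}_j = 0$.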

In [15]:
# Build dataframe
products[!, :delta] = δ
rename!(products,[:choice,:x1, :x2, :x3, :d1, :d2, :delta])

# Regression
est = lm(@formula(delta ~ 0 + x1 + x2 + x3), products)

# Store estimates
ξ = δ .- predict(est)
β = coef(est)

3-element Vector{Float64}:
 0.12662593406877717
 1.0801670028728954
 0.6291652102638514

In [None]:
# beta (written to its own file so the delta table above is not overwritten)
beta_tab = hcat("\$\\beta_{" .* string.(1:3) .* "}\$", round.(β, digits = 3))
latex_tabular("output/ps1_q2_betas.tex",
              Tabular("cc"),
              [Rule(:top),
               beta_tab,
               Rule(:bottom)])