# Empirical IO PS 2
Maximilian Huber

This code is stored at: https://github.com/MaximilianJHuber/NYU/blob/master/EmpIO/PS2.ipynb.
The notation follows Berry, Levinsohn, and Pakes (1995): http://www.tcd.ie/Economics/staff/ppwalsh/papers/BLP.pdf

In [4]:
using DataFrames
using GLM
using Optim
using LaTeXStrings

data = readtable("data_ps2.txt", header=false, separator=',')
rename!(data, names(data), [:car, :year, :firm, :price, :quantity, :weight, :hp, :ac, :nest3, :nest4]);

A car can change characteristics over the years:

In [5]:
data[data[:car] .== 91, :]

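| | car | year | firm | price | quantity | weight | hp | ac | nest3 | nest4 |
|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| 1 | 91 | 1990 | 19 | 5995 | 52409.0 | 1620 | 55 | 0 | 1 | 1 |
| 2 | 91 | 1991 | 19 | 6807 | 55056.0 | 1620 | 55 | 1 | 1 | 1 |
| 3 | 91 | 1992 | 19 | 8219 | 66046.0 | 1576 | 55 | 1 | 1 | 1 |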


Therefore, one can argue that the product and time dimensions collapse into one. But for the time being, let's treat good $j$ as a _car_, time $t$ as a _year_, the characteristics $x$ as _weight_, _horse power_ and _air conditioning_, and the price $p$ as _price_.

## Part A: Logit

This part follows BLP (1995) section 6.1.

### 1
Agent $i$ derives utility from good $j$ at time $t$ in the following manner:
$$u_{ijt} = \delta_{jt}^* + \epsilon_{ijt} \quad \text{and} \quad \delta_{jt}^* = x'_{jt}\beta_t - \alpha_t p_{jt}+ \xi_{jt} \quad \forall t \in T$$
where $\epsilon_{ijt}$ is i.i.d. type I extreme value distributed.
This logit model has market shares:
$$s_{jt} = \frac{e^{\delta_{jt}^*}}{1 + \sum_{k=1}^J e^{\delta_{kt}^*}}$$
Taking logs yields:
$$\log s_{jt} - \log s_{0t} = \delta_{jt}^* = x'_{jt}\beta_t - \alpha_t p_{jt} + \xi_{jt} $$
where $s_{0t} = 1 - \sum_{k=1}^J s_{kt}$ is the share of the outside good and $\xi_{jt}$ is the unobserved good- and time-specific utility.

This equation will be estimated year-by-year.
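
The inversion from shares to mean utilities can be checked numerically. The following is a minimal sketch with made-up mean utilities (the values in `δ` are hypothetical, not taken from the data set): it computes logit shares and confirms that $\log s_{jt} - \log s_{0t}$ returns $\delta_{jt}^*$.

```julia
# Hypothetical mean utilities for three inside goods in one market.
δ = [-1.0, -2.0, -0.5]

# Logit shares: s_j = exp(δ_j) / (1 + Σ_k exp(δ_k)); the outside good has utility 0.
denom = 1 + sum(exp.(δ))
s  = exp.(δ) ./ denom
s0 = 1 / denom

# Inversion: log s_j - log s_0 recovers δ_j.
δ_recovered = log.(s) .- log(s0)

println(maximum(abs.(δ_recovered .- δ)))
```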

### 2 
If I had instead chosen to pool the years, the derivation would not change, but the pooled (panel) logit relies on $\xi_{jt}$ being independent over time, which is very implausible.

### 3 
Now I estimate the model with GMM using the BLP instrument for $p$:

#### Data Preparation
Market shares are calculated with an assumed market size of 100 million:

In [6]:
data[:share] = data[:quantity] / 1e8;

Instruments for the price are constructed from the good's competitors, defined as the goods produced by other firms and available in the same year: I average the characteristics of those competing goods. A second set of instruments averages the characteristics of the other products of the same firm in the same year.

I normalize weight, horse power and price BEFORE creating the instruments, to get the condition number of the matrices down:

In [7]:
meanprice = mean(data[:price])

data[:weight] = data[:weight] / mean(data[:weight])
data[:hp]     = data[:hp] / mean(data[:hp])
data[:price]  = data[:price] / mean(data[:price]);

In [8]:
function competitor_mean_characteristic(good, char::Symbol)
    mean(data[(data[:firm] .!= good[:firm]) .* (data[:year] .== good[:year]), char]) #same year, different firm
end

function firm_mean_characteristic(good, char::Symbol)
    mean(data[(data[:firm] .== good[:firm]) .* (data[:year] .== good[:year]) .* 
            (data[:car] .!= good[:car]), char]) #same year, same firm, different product
end

firm_mean_characteristic (generic function with 1 method)

In [9]:
data[:comp_weight] = [competitor_mean_characteristic(good, :weight) for good in eachrow(data)]
data[:comp_hp]     = [competitor_mean_characteristic(good, :hp) for good in eachrow(data)]
data[:comp_ac]     = [competitor_mean_characteristic(good, :ac) for good in eachrow(data)]

data[:firm_weight] = [firm_mean_characteristic(good, :weight) for good in eachrow(data)]
data[:firm_hp]     = [firm_mean_characteristic(good, :hp) for good in eachrow(data)]
data[:firm_ac]     = [firm_mean_characteristic(good, :ac) for good in eachrow(data)];

Some firms have only one product in a given year, so the mean over the (empty) set of their other products is NaN. These values are replaced with zeros:

In [10]:
data[isnan.(data[:firm_weight]),:firm_weight] = 0.0
data[isnan.(data[:firm_hp]),:firm_hp] = 0.0
data[isnan.(data[:firm_ac]),:firm_ac] = 0.0;

And the left hand side of the model is:

In [11]:
data[:LHS] = zeros(size(data)[1])

for y in [1990, 1991, 1992]
    data[data[:year] .== y, :LHS] = 
        log.(data[data[:year] .== y, :share]) .- log(1 - sum(data[data[:year] .== y, :share]))
end

In [12]:
head(data)

Unnamed: 0,car,year,firm,price,quantity,weight,hp,ac,nest3,nest4,share,comp_weight,comp_hp,comp_ac,firm_weight,firm_hp,firm_ac,LHS
1,91,1990,19,0.2970233932704021,52409.0,0.5559956020578457,0.4081228050300215,0,1,1,0.00052409,0.9900775851227992,0.9764801886258072,0.4270833333333333,1.053999288105845,1.0502069186120124,0.5588235294117647,-7.462877776838659
2,35,1990,7,0.4307953969117842,17122.0,0.7619198991163071,0.7420414636909483,0,2,2,0.00017122,1.007985368233883,1.001208568345631,0.4508196721311475,0.9629105932244092,0.8709711680072505,0.625,-8.581591922822227
3,61,1990,16,0.4382271748918609,65590.0,0.8974867280131275,0.6900985612325818,0,1,1,0.0006559,1.0072831756766172,1.0122090818191225,0.5043478260869565,0.98029117590808,0.8508742116989539,0.1333333333333333,-7.238532863836876
4,52,1990,10,0.5360293731096715,49877.0,0.8662548762925941,0.6826781465956724,0,1,1,0.00049877,1.0086561218201344,1.001989689829612,0.4724409448818897,0.824612407331883,0.6406291303198519,0.0,-7.51239613448586
5,26,1990,4,0.6837235741670641,35944.0,0.7488780269692712,0.8607680978814999,0,2,3,0.00035944,1.0075772348148713,0.996691248532191,0.4682539682539682,0.9339524889505632,0.8533476832445905,0.25,-7.839993937374661
6,54,1990,11,0.8420204451426995,4640.0,0.9376419659395274,0.9498130735244136,1,2,2,4.64e-05,1.0036015934934468,0.9906541153244582,0.4496124031007752,1.037172042851117,1.1130621955364224,1.0,-9.887241742904356


#### GMM Estimation
I estimate $\hat{\theta}$ using the following GMM procedure:
$$\max_\theta Q_n\big(\theta \big)$$
where $Q_n\big(\theta\big) = -\frac{1}{2}g_n\big(\theta\big)'\,\hat{W}\,g_n\big(\theta\big)$ and $g_n\big(\theta\big) = \frac{1}{n}\sum_{r=1}^n g\big(w_r; \theta\big)$. The code equivalently minimizes $-Q_n$.

$g\big(w_r; \theta\big) = z_r \cdot (y_r - x_r \cdot \theta)$ is the residual $\xi_{jt}$ from the model above times the instruments $z_r$, both taken from $w_r$, a row of data.

Efficient GMM uses the estimated weighting matrix $\hat{W}=\left(\frac{1}{n} \sum_{r=1}^n{g(w_r,\theta_{first})g(w_r,\theta_{first})'}\right)^{-1}$, where $\theta_{first}$ is the first-step estimate obtained with the identity matrix as weighting matrix:

In [13]:
# w is data, with the left-hand-side as its first column, z are included regressors and excluded instruments
function gn(w, z, θ)

    1/size(w)[1] .* z' * (w[:,1] - w[:, 2:end] * θ)
    
end

#Wrapper creates a closure around the provided data set 
function Qn_wrapper(w, z)
    return θ -> 1/2 * (gn(w,z,θ)' * gn(w,z,θ))[1,1]
end

function Qn_wrapper(w, z, W)
    return θ -> 1/2 * (gn(w,z,θ)' * W * gn(w,z,θ))[1,1]
end


function g(w, z, θ)
    z * (w[1] - w[2:end]' * θ)
end

function What(w, z, θ)
    result = zeros(Float64, size(z)[2], size(z)[2])
    for i in 1:size(w)[1]
        result .+= g(w[i,:],z[i,:],θ) * g(w[i,:],z[i,:],θ)'
    end
    return inv(result/size(z)[1])
end

What (generic function with 1 method)
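
Because the moments are linear in $\theta$, the GMM objective is quadratic and each step also has a closed-form solution, $\hat\theta = (X'Z\hat{W}Z'X)^{-1}X'Z\hat{W}Z'y$. The sketch below runs the two-step procedure on synthetic data as a cross-check for a numerical optimizer; the helper names `linear_gmm` and `efficient_weight` and all simulated numbers are assumptions made for this illustration, not part of the problem set.

```julia
using LinearAlgebra, Random

# Closed-form linear GMM estimator for a given weighting matrix W.
linear_gmm(y, X, Z, W) = (X' * Z * W * Z' * X) \ (X' * Z * W * Z' * y)

# Inverse of the average outer product of the moments z_r * ξ_r evaluated at θ.
function efficient_weight(y, X, Z, θ)
    ξ = y - X * θ
    G = Z .* ξ                       # row r holds z_r' * ξ_r
    return inv(G' * G / size(Z, 1))
end

# Synthetic data: one exogenous regressor and one endogenous "price".
Random.seed!(1)
n   = 500
Zex = randn(n, 3)                    # excluded instruments
x1  = randn(n)                       # exogenous regressor
u   = randn(n)                       # structural error
p   = Zex * [1.0, 0.5, -0.5] + 0.8 .* u + randn(n)
X   = hcat(x1, p)
Z   = hcat(x1, Zex)
θ_true = [1.0, -2.0]
y   = X * θ_true + u

θ_first = linear_gmm(y, X, Z, Matrix(1.0I, 4, 4))    # first step with W = I
W_eff   = efficient_weight(y, X, Z, θ_first)
θ_eff   = linear_gmm(y, X, Z, W_eff)                 # second step
println(θ_eff)
```

With real data, comparing such a closed-form estimate against the optimizer's result is a cheap way to detect convergence problems.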

In [14]:
table3 = DataFrame()
table3[:year] = [1990, 1991, 1992]

table3 = hcat(table3, convert(DataFrame, hcat(
    [begin
        w = convert(Array{Float64}, data[data[:year] .== year, [:LHS, :weight, :hp, :ac, :price]])
        z = convert(Array{Float64}, data[data[:year] .== year, [:weight, :hp, :ac, :comp_weight, :comp_hp, :comp_ac, 
                                                                                   :firm_weight, :firm_hp, :firm_ac]])

        Qn = Qn_wrapper(w, z)

        optres = optimize(Qn, zeros(4), BFGS())
        θfirst = optres.minimizer

        Qn = Qn_wrapper(w, z, What(w, z, θfirst))
        optres = optimize(Qn, θfirst, BFGS())
        optres.minimizer
    end for year in [1990, 1991, 1992]]...)')
)

rename!(table3, names(table3)[2:end], [:βw, :βhp, :βac, :α])
table3[:α] = - table3[:α]

table3

| | year | βw | βhp | βac | α |
|---:|---:|---:|---:|---:|---:|
| 1 | 1990 | -8.197922377649943 | -0.7720898816400485 | 1.337509448955808 | -0.7734738450632531 |
| 2 | 1991 | -8.76171834665696 | -0.1911418367860624 | 0.9205738817135896 | -0.8297096740966895 |
| 3 | 1992 | -7.597282668253585 | -2.60711231509024 | 0.6780209898810258 | -2.0777062702576394 |


### 4
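The negative $\alpha$ values are very implausible: they imply that utility increases in price.
The cross-price elasticity with respect to the price of good $k$, $\alpha p_{kt} s_{kt}$, is the same for all other goods, which is rather unrealistic. For example, increasing the price of an Aston Martin DB11 would shift demand to the Toyota Prius and the Porsche 911 in proportion to their market shares, regardless of how similar they are to the DB11.

Similarly, the own-price elasticity, $-\alpha p_{jt}(1 - s_{jt})$, is driven entirely by the good's price and market share, whereas in reality cars with similar prices and shares can have quite different elasticities.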
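
To make the restriction concrete, the following sketch builds the full logit elasticity matrix for three hypothetical cars (the values of `α`, `p` and `s` are made up for illustration). Every off-diagonal entry in a column is identical, so all rivals respond to a price change of good $k$ with the same elasticity.

```julia
# Hypothetical price coefficient, prices and market shares for three cars.
α = 2.0
p = [1.0, 3.0, 0.5]
s = [0.10, 0.01, 0.20]

J = length(s)
# η[j, k] = (∂s_j / ∂p_k) * p_k / s_j
η = [j == k ? -α * p[j] * (1 - s[j]) : α * p[k] * s[k] for j in 1:J, k in 1:J]

println(η)
```
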
## Part B: Nested Logit
### 5
Using Berry (1994) notation:

$$\log s_{jt} - \log s_{0t} = \frac{\delta_{jt}^*}{1-\sigma} - \sigma \log{D_g}$$
where $D_g=\sum_{j \in g} e^{\delta_{jt}^*/(1-\sigma)}$. That yields:

$$\log s_{jt} - \log s_{0t} = x'_{jt}\beta_t - \alpha_t p_{jt} + \sigma \log s_{j|g} + \xi_{jt}  $$
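
The step from the first to the second equation uses the within-group share $s_{j|g} = e^{\delta_{jt}^*/(1-\sigma)}/D_g$. A short sketch of the algebra, in the notation above:

$$\log D_g = \frac{\delta_{jt}^*}{1-\sigma} - \log s_{j|g}$$
$$\log s_{jt} - \log s_{0t} = \frac{\delta_{jt}^*}{1-\sigma} - \sigma\Big(\frac{\delta_{jt}^*}{1-\sigma} - \log s_{j|g}\Big) = \delta_{jt}^* + \sigma \log s_{j|g}$$

Plugging in $\delta_{jt}^* = x'_{jt}\beta_t - \alpha_t p_{jt} + \xi_{jt}$ gives the estimating equation.
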
### 6
Let me calculate the log within-group shares $\log s_{j|g}$:

In [15]:
for nest_structure in [:nest3, :nest4]
    # initialize a column of zeros
    data[Symbol(string("share_", string(nest_structure)))] = zeros(size(data)[1])

    for y in [1990, 1991, 1992]
        for nest in 1:maximum(data[nest_structure])
            data[(data[:year] .== y) .* (data[nest_structure] .== nest), Symbol(string("share_", string(nest_structure)))] = 
                log.(data[(data[:year] .== y) .* (data[nest_structure] .== nest), :share]) .- 
                log(sum(data[(data[:year] .== y) .* (data[nest_structure] .== nest), :share]))
        end
    end
end

And the mean characteristics of the within-group competitors (ignoring the producing firm), which are used as instruments for the within-group shares:

In [16]:
function nest_competitor_mean_characteristic(good, char::Symbol, nest_structure::Symbol)
    mean(data[(data[nest_structure] .== good[nest_structure]) .* (data[:year] .== good[:year]) .* 
            (data[:car] .!= good[:car]), char]) #same year, same nest, different car
end

nest_competitor_mean_characteristic (generic function with 1 method)

In [17]:
for nest_structure in [:nest3, :nest4]

    data[Symbol(string("comp_weight_", string(nest_structure)))] = 
        [nest_competitor_mean_characteristic(good, :weight, nest_structure) for good in eachrow(data)]

    data[Symbol(string("comp_hp_", string(nest_structure)))] = 
        [nest_competitor_mean_characteristic(good, :hp, nest_structure) for good in eachrow(data)]

    data[Symbol(string("comp_ac_", string(nest_structure)))] = 
        [nest_competitor_mean_characteristic(good, :ac, nest_structure) for good in eachrow(data)];
    
end

In [18]:
head(data)

Unnamed: 0,car,year,firm,price,quantity,weight,hp,ac,nest3,nest4,share,comp_weight,comp_hp,comp_ac,firm_weight,firm_hp,firm_ac,LHS,share_nest3,share_nest4,comp_weight_nest3,comp_hp_nest3,comp_ac_nest3,comp_weight_nest4,comp_hp_nest4,comp_ac_nest4
1,91,1990,19,0.2970233932704021,52409.0,0.5559956020578457,0.4081228050300215,0,1,1,0.00052409,0.9900775851227992,0.9764801886258072,0.4270833333333333,1.053999288105845,1.0502069186120124,0.5588235294117647,-7.462877776838659,-4.304576788942753,-3.684711188040598,0.9528997625875184,0.7518815787529369,0.2608695652173913,0.9115978224718132,0.6863883539141272,0.1153846153846153
2,35,1990,7,0.4307953969117842,17122.0,0.7619198991163071,0.7420414636909483,0,2,2,0.00017122,1.007985368233883,1.001208568345631,0.4508196721311475,0.9629105932244092,0.8709711680072505,0.625,-8.581591922822227,-5.573016845671248,-5.7614217762818,1.009372262748225,1.027351029368062,0.4927536231884058,0.9905522864668314,0.927348530582537,0.4246575342465753
3,61,1990,16,0.4382271748918609,65590.0,0.8974867280131275,0.6900985612325818,0,1,1,0.0006559,1.0072831756766172,1.0122090818191225,0.5043478260869565,0.98029117590808,0.8508742116989539,0.1333333333333333,-7.238532863836876,-4.08023187594097,-3.460366275038815,0.9454760424580558,0.7457516710094029,0.2608695652173913,0.89846354839661,0.6755431325217209,0.1153846153846153
4,52,1990,10,0.5360293731096715,49877.0,0.8662548762925941,0.6826781465956724,0,1,1,0.00049877,1.0086561218201344,1.001989689829612,0.4724409448818897,0.824612407331883,0.6406291303198519,0.0,-7.51239613448586,-4.354095146589954,-3.7342295456877994,0.9461549957563284,0.7459129843710749,0.2608695652173913,0.8996647734627844,0.675828533084679,0.1153846153846153
5,26,1990,4,0.6837235741670641,35944.0,0.7488780269692712,0.8607680978814999,0,2,3,0.00035944,1.0075772348148713,0.996691248532191,0.4682539682539682,0.9339524889505632,0.8533476832445905,0.25,-7.839993937374661,-4.831418860223682,-3.3802007968344165,1.0095612753880372,1.025630353510228,0.4927536231884058,1.1384493562225968,1.3454560903014512,0.8636363636363636
6,54,1990,11,0.8420204451426995,4640.0,0.9376419659395274,0.9498130735244136,1,2,2,4.64e-05,1.0036015934934468,0.9906541153244582,0.4496124031007752,1.037172042851117,1.1130621955364224,1.0,-9.887241742904356,-6.878666665753377,-7.067071596363929,1.0068255661275984,1.0243398466168523,0.4782608695652174,0.9881451348665132,0.924502344146462,0.410958904109589


#### 6.a.

In [19]:
nest_structure = :nest3

table6 = DataFrame()
table6[:year] = [1990, 1991, 1992, 1990, 1991, 1992]
table6[:nesting] = [3, 3, 3, 4, 4, 4]

table6 = hcat(table6, convert(DataFrame, hcat(
    [begin
        # do 2SLS:
        X = convert(Array{Float64}, data[data[:year] .== year, 
                [:weight, :hp, :ac, :price, Symbol(string("share_", string(nest_structure)))]])
                    
        Z = convert(Array{Float64}, data[data[:year] .== year, 
                [:weight, :hp, :ac, :comp_weight, :comp_hp, :comp_ac, :firm_weight, :firm_hp, :firm_ac, 
                Symbol(string("comp_weight_", string(nest_structure))), 
                Symbol(string("comp_hp_", string(nest_structure))), 
                Symbol(string("comp_ac_", string(nest_structure)))]])

        y = convert(Array{Float64}, data[data[:year] .== year, [:LHS]])

        θ2SLS = ((X'Z * (Z'Z)^(-1) * Z'X) \ (X'Z * (Z'Z)^(-1) * Z'y))[:,1]

        # so efficient GMM starting from 2SLS:
        w = hcat(data[data[:year] .== year, :LHS], X)
        z = Z

        Qn = Qn_wrapper(w, z)

        optres = optimize(Qn, θ2SLS, BFGS())
        θfirst = optres.minimizer

        Qn = Qn_wrapper(w, z, What(w, z, θfirst))
        optres = optimize(Qn, θfirst, BFGS())
        optres.minimizer
    end for nest_structure in [:nest3, :nest4] for year in [1990, 1991, 1992]] ...)')
)

rename!(table6, names(table6)[2:end], [:nest, :βw, :βhp, :βac, :α, :σ])
table6[:α] = - table6[:α]

table6

| | year | nest | βw | βhp | βac | α | σ |
|---:|---:|---:|---:|---:|---:|---:|---:|
| 1 | 1990 | 3 | -0.7418715285156718 | -1.537810896867352 | 1.2587274829433068 | 1.018694982828985 | 1.1705192268386202 |
| 2 | 1991 | 3 | -0.3086694806618532 | -1.3500850216564932 | 0.4819906728231137 | 0.2926606541592382 | 1.3209458308141642 |
| 3 | 1992 | 3 | -0.2867003146098926 | -0.6726029061783617 | 0.462677635604744 | 0.9158809465467384 | 1.3177975893123737 |
| 4 | 1990 | 4 | -0.7103334215237568 | -2.627998232244124 | 1.0809704270978155 | 0.0142702404378559 | 1.1350848060724337 |
| 5 | 1991 | 4 | -0.3349547335574603 | -3.73343050185694 | 0.8184402624117959 | -0.5791239363759064 | 1.0569568403624847 |
| 6 | 1992 | 4 | 1.6904310865782943 | -5.272087890724476 | -0.2255628898265702 | -1.4632992788244452 | 1.261110883286385 |


The estimates for the price coefficients are higher and more plausible, except those with the 4-nest structure!

#### 6.b.
The conditional error must be time-independent. This is very implausible with real data, since brands have autocorrelation in how much they are liked by buyers.

#### 6.c.
All estimates of $\sigma$ lie above one, between 1.06 and 1.32. In 1991 the two nesting structures differ the most: $\sigma \approx 1.06$ with 4 nests versus $\sigma \approx 1.32$ with 3 nests. Values close to one indicate very strong correlation within the nests, but since random utility maximization requires $0 \le \sigma < 1$, these estimates are not consistent with it!
#### 6.d.
The $\alpha$ also varies a lot over the three years! 
### 7
#### 7.a.
Let's have a look at the price derivatives of the market shares, which determine the price elasticities, in the logit and the nested logit model:

| | Standard logit | Nested logit |
|:---:|:---:|:---:|
| $\frac{\partial s_{jt}}{\partial p_{jt}}$ | $-\alpha s_{jt}\big(1-s_{jt}\big)$ | $-\alpha s_{jt}\big(\frac{1}{1-\sigma}-\frac{\sigma}{1-\sigma}s_{jt\mid g}-s_{jt}\big)$ |
| $\frac{\partial s_{jt}}{\partial p_{kt}}$ | $\alpha s_{jt}s_{kt}$ | $\begin{cases}\alpha s_{kt}\big(\frac{\sigma}{1-\sigma}s_{jt\mid g}+s_{jt}\big) & \text{same nest}\\ \alpha s_{jt}s_{kt} & \text{different nests}\end{cases}$ |

The nested logit allows for a more flexible substitution pattern!
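
The nested logit expressions can be verified against finite differences. This is a minimal sketch on a made-up market with four goods in two nests; the function `nested_shares`, the helper `fd` and all parameter values are assumptions for illustration only.

```julia
# Nested logit shares for mean utilities δ_j = ν_j - α * p_j.
# nest[j] is the nest index of good j; returns (shares, within-nest shares).
function nested_shares(p, ν, α, σ, nest)
    e = exp.((ν .- α .* p) ./ (1 - σ))
    D = [sum(e[nest .== g]) for g in 1:maximum(nest)]
    s_within = e ./ D[nest]
    s_group  = D .^ (1 - σ) ./ (1 + sum(D .^ (1 - σ)))
    return s_within .* s_group[nest], s_within
end

# Hypothetical market: goods 1 and 2 share a nest, goods 3 and 4 share the other.
ν = [1.0, 0.5, 0.2, -0.3]
p = [1.0, 0.8, 1.2, 0.5]
nest = [1, 1, 2, 2]
α, σ = 1.5, 0.4

s, sw = nested_shares(p, ν, α, σ, nest)

# Analytic derivatives of s_1 from the nested logit column of the table.
own        = -α * s[1] * (1 / (1 - σ) - σ / (1 - σ) * sw[1] - s[1])
cross_same =  α * s[2] * (σ / (1 - σ) * sw[1] + s[1])    # k = 2, same nest
cross_diff =  α * s[1] * s[3]                            # k = 3, different nest

# Central finite difference of s_1 with respect to p_k.
function fd(k; h = 1e-6)
    up = copy(p); up[k] += h
    dn = copy(p); dn[k] -= h
    s_up = nested_shares(up, ν, α, σ, nest)[1]
    s_dn = nested_shares(dn, ν, α, σ, nest)[1]
    return (s_up[1] - s_dn[1]) / (2 * h)
end

println((own, fd(1)))
println((cross_same, fd(2)))
println((cross_diff, fd(3)))
```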
#### 7.b.
One could grow a decision tree (or use some other machine learning method, such as nearest neighbors) to flexibly use the existing data (characteristics only, no market shares) to create nests. As validation, I would check whether the categories resemble something like: hypercar/supercar, sports car, SUV, hatchback, ...
#### 7.c.
We need to define the nesting structure a priori and must not make any mistakes. It is clear that we would need multiple layers of nesting to get a realistic classification, but then it is hard to justify splitting first by one criterion rather than another.

### 8
#### 8.a.
Probit would allow for a very flexible substitution pattern, but with more than 120 cars per year the covariance matrix would get very big!
#### 8.b.
The pure characteristics model could not reflect the (unobserved) emotional value of people buying an Alfa Romeo Giulia Quadrifoglio although it is a bad car for the price.

## Part C: Mixed Logit
### 9
#### 9.a.
Now:
$$u_{ijt} = x'_{jt}\beta_t - \alpha_{i} p_{jt}+ \xi_{jt} + \epsilon_{ijt} \quad \forall t \in T$$
where $\alpha_i=\frac{1}{y_i}$ and $y_i \sim \mathcal{LN}$, the log-normal distribution.

Compared to BLP (1995), this differs in that my $\delta_{jt}$ is now $x'_{jt}\beta_t - \mu_\alpha p_{jt} + \xi_{jt}$; the individual $\alpha_i$ is integrated out and its distributional parameters are estimated. More precisely, I estimate $\theta=\Big[\beta_w, \beta_{hp}, \beta_{ac}, \mu_y, \sigma_y\Big]$, where the latter two are such that $\log y_i \sim \mathcal{N}(\mu_y, \sigma_y^2)$, and recover the moments of $\alpha$.

The mixed logit shares are in this setting:
$$s_{jt}=\int_{0}^{\infty}\frac{\exp\big(x'_{jt}\beta_{t}-(\mu_{\alpha}-\mu_{\alpha}+\frac{1}{y_{i}})p_{jt}+\xi_{jt}\big)}{1+\sum_{k}\exp\big(x'_{kt}\beta_{t}-\frac{1}{y_{i}}p_{kt}+\xi_{kt}\big)}\frac{1}{y_{i}\sigma_{y}\sqrt{2\pi}}\exp\Big(-\frac{(\log y_{i}-\mu_{y})^{2}}{2\sigma_{y}^{2}}\Big)dy_{i}=$$

$$\int_{0}^{\infty}\frac{\exp\big(\delta_{jt}+(\mu_{\alpha}-\frac{1}{y_{i}})p_{jt}\big)}{1+\sum_{k}\exp\big(\delta_{kt}+(\mu_{\alpha}-\frac{1}{y_{i}})p_{kt}\big)}\frac{1}{y_{i}\sigma_{y}\sqrt{2\pi}}\exp\Big(-\frac{(\log y_{i}-\mu_{y})^{2}}{2\sigma_{y}^{2}}\Big)dy_{i}=$$

Changing the integration variable:

$$\int_{-\infty}^{\infty}\frac{\exp\big(\delta_{jt}+(\mu_{\alpha}-\frac{1}{\exp(e_{i})})p_{jt}\big)}{1+\sum_{k}\exp\big(\delta_{kt}+(\mu_{\alpha}-\frac{1}{\exp(e_{i})})p_{kt}\big)}\frac{1}{\exp(e_{i})\sigma_{y}\sqrt{2\pi}}\exp\Big(-\frac{(e_{i}-\mu_{y})^{2}}{2\sigma_{y}^{2}}\Big)\exp(e_{i})de_{i}=$$

$$\int_{-\infty}^{\infty}\frac{\exp\Big(\delta_{jt}+(\mu_{\alpha}-\exp(-e_{i}))p_{jt}\Big)}{1+\sum_{k}\exp\Big(\delta_{kt}+(\mu_{\alpha}-\exp(-e_{i}))p_{kt}\Big)}\frac{1}{\sigma_{y}\sqrt{2\pi}}\exp\Big(-\frac{(e_{i}-\mu_{y})^{2}}{2\sigma_{y}^{2}}\Big)de_{i}=$$

Changing the integration variable once more, $e_i \to \mu_y + \sqrt{2}\sigma_y e_i$ (I recycle the symbol $e_i$), so that the density becomes the Gauss-Hermite weight function $\exp(-e_i^2)$ times $\frac{1}{\sqrt{\pi}}$, and plugging in $\mu_\alpha = \exp(-\mu_{y}+\frac{1}{2}\sigma_{y}^{2})$:

$$\int_{-\infty}^{\infty}\underset{h(e_{i};\delta_{jt},\,p_{jt})}{\underbrace{\frac{1}{\sqrt{\pi}}\frac{\exp\Big(\delta_{jt}+\Big(\exp(-\mu_{y}+\frac{1}{2}\sigma_{y}^{2})-\exp(-\mu_{y}-\sqrt{2}\sigma_{y}e_{i})\Big)\,p_{jt}\Big)}{1+\sum_{k}\exp\Big(\delta_{kt}+\Big(\exp(-\mu_{y}+\frac{1}{2}\sigma_{y}^{2})-\exp(-\mu_{y}-\sqrt{2}\sigma_{y}e_{i})\Big)\,p_{kt}\Big)}}}\exp(-e_{i}^{2})\,de_{i}\approx$$

$$\approx\sum_{l=1}^n q_l \, h\Big(e_{l};\delta_{jt},\,p_{jt}\Big)$$
where $q_l$ and $e_l$ are the Gauss-Hermite weights and points.
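
The change of variable and the factor $\frac{1}{\sqrt{\pi}}$ can be illustrated with a small rule. The sketch below hard-codes the three-point Gauss-Hermite nodes and weights for the weight function $\exp(-u^2)$ so that it needs no package (the notebook itself uses 30 points from `FastGaussQuadrature`); the function name `gh_expectation` and the test values are assumptions for illustration. A three-point rule is exact for polynomials up to degree five, so the second moment of a normal variable is reproduced exactly.

```julia
# Three-point Gauss-Hermite rule for the weight function exp(-u^2).
nodes   = [-sqrt(1.5), 0.0, sqrt(1.5)]
weights = [sqrt(π) / 6, 2 * sqrt(π) / 3, sqrt(π) / 6]

# E[f(X)] for X ~ N(μ, σ^2), using x = μ + √2 σ u and the factor 1/√π.
gh_expectation(f, μ, σ) = sum(weights .* f.(μ .+ sqrt(2) .* σ .* nodes)) / sqrt(π)

μ, σ = 0.3, 1.2
println(gh_expectation(x -> x^2, μ, σ))   # quadrature approximation of E[X^2]
println(μ^2 + σ^2)                        # exact second moment
```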

The moment function is now: $g\big(w_r; \theta\big) = z_r \cdot (\delta_r - x_r \cdot \beta + \mu_\alpha \cdot p_r)$ where the $\delta_r$ comes from the solution to the BLP iteration: 

$$\delta^{(k)}(\theta) = \delta^{(k-1)}(\theta) + \log(\tilde{s}_j) - \log\Big(s_j\big(\delta^{(k-1)}, \theta\big)\Big)$$
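
The iteration can be tried out on a toy market where the true $\delta$ is known. In this sketch the distribution of the price coefficient is replaced by a hypothetical discrete one (values `αs` with probabilities `πs`) to keep it self-contained; the function names and all numbers are assumptions for illustration. Shares are generated from `δ_true` and then inverted back.

```julia
# Predicted shares when the price coefficient takes the values αs with probabilities πs.
# As in the text, δ contains the mean price effect, so only the deviation enters here.
function predicted_shares(δ, p, αs, πs)
    α_mean = sum(πs .* αs)
    s = zeros(length(δ))
    for (α, w) in zip(αs, πs)
        num = exp.(δ .+ (α_mean - α) .* p)
        s .+= w .* num ./ (1 + sum(num))
    end
    return s
end

# Iterate δ ← δ + log(s_obs) - log(s(δ)) until the largest update is below tol.
function invert_shares(s_obs, p, αs, πs; tol = 1e-12, maxiter = 10_000)
    δ = zeros(length(s_obs))
    for _ in 1:maxiter
        δ_new = δ .+ log.(s_obs) .- log.(predicted_shares(δ, p, αs, πs))
        maximum(abs.(δ_new .- δ)) < tol && return δ_new
        δ = δ_new
    end
    error("contraction did not converge")
end

δ_true = [-1.0, -2.0, -1.5]
p  = [1.0, 2.0, 1.5]
αs = [0.5, 1.0, 2.0]
πs = [0.25, 0.5, 0.25]

s_obs = predicted_shares(δ_true, p, αs, πs)
δ_hat = invert_shares(s_obs, p, αs, πs)
println(maximum(abs.(δ_hat .- δ_true)))
```

Capping the number of iterations, as done here with `maxiter`, avoids an endless loop when the tolerance is tighter than floating-point precision allows.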

In [20]:
using FastGaussQuadrature

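# Sample moment: z' * (δ - x * β + θ[4] * p) / n, with δ in the first column of w.
# Note: the callers below pass μ_y as θ[4], whereas the moment function in the text
# multiplies the price by μ_α = exp(-μ_y + σ_y^2 / 2).
function gn_BLP(w, z, θ)
    1/size(w)[1] .* z' * (w[:,1] .- w[:, 2:4] * θ[1:3] .+ θ[4] .* w[:, 5])
end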

#Wrapper creates a closure around the provided data set 
function Qn_wrapper_BLP(w, z)
    
    δ = zeros(size(w)[1])
    (e, q) = gausshermite(30)
    p = w[:, 5] #price
    s = w[:, 1] #share
    
    
    return θ -> begin #w, hp, ac, mu, sigma
        μ_y = θ[4]
        σ_y = θ[5]
        
        h(e) = 1/(√π) .* exp.(δ .+ (exp(-μ_y + 1/2 * σ_y^2) - exp(-μ_y - (√2) * σ_y * e)) .* p) / 
            (1 + sum(exp.(δ .+ (exp(-μ_y + 1/2 * σ_y^2) - exp(-μ_y - (√2) * σ_y * e)) .* p)))
        
        error = 1
        δ = zeros(size(w)[1])
        
        while error > 1e-13
            δold = copy(δ)
            δ = δold .+ log.(s) .- log.(hcat(h.(e)...) * q)
            error = maximum(abs.(δ - δold))
        end

        gvec = gn_BLP(hcat(δ, w[:, 2:5]), z, θ) 
        1/2 * (gvec' * gvec)[1,1]
    end
end

Qn_wrapper_BLP (generic function with 1 method)

In [136]:
year = 1990

w = convert(Array{Float64}, data[data[:year] .== year, [:share, :weight, :hp, :ac, :price]])
z = convert(Array{Float64}, data[data[:year] .== year, [:weight, :hp, :ac, :comp_weight, :comp_hp, :comp_ac,
                                                                           :firm_weight, :firm_hp, :firm_ac]])

Qn = Qn_wrapper_BLP(w, z)

optres = optimize(Qn, vcat(zeros(4), 1), NelderMead())

Results of Optimization Algorithm
 * Algorithm: Nelder-Mead
 * Starting Point: [0.0,0.0,0.0,0.0,1.0]
 * Minimizer: [-3.616161297173018,-6.8367452078433795, ...]
 * Minimum: 2.812222e-02
 * Iterations: 1000
 * Convergence: false
   *  √(Σ(yᵢ-ȳ)²)/n < 1.0e-08: false
   * Reached Maximum Number of Iterations: true
 * Objective Calls: 1629

In [137]:
optres.minimizer

5-element Array{Float64,1}:
 -3.61616
 -6.83675
  2.71727
 14.1569 
 -5.81497

In [138]:
optres = optimize(Qn, optres.minimizer, NelderMead())

Results of Optimization Algorithm
 * Algorithm: Nelder-Mead
 * Starting Point: [-3.616161297173018,-6.8367452078433795, ...]
 * Minimizer: [-9.086903471330073,0.3650066573758027, ...]
 * Minimum: 1.008921e-02
 * Iterations: 568
 * Convergence: true
   *  √(Σ(yᵢ-ȳ)²)/n < 1.0e-08: true
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 938

In [139]:
θ = optres.minimizer

5-element Array{Float64,1}:
 -9.0869  
  0.365007
  1.77367 
 13.6707  
 -5.70867 

Hence $\mu_\alpha$ is:

In [128]:
exp(-θ[4] + 1/2 * θ[5]^2)

13.786551417871351

Well, it is very positive, which is good. Car buyers dislike prices!

#### 9.b.
$\alpha_i = \alpha_1+\alpha_2/y_i \implies \mu_\alpha = \alpha_1+\alpha_2 \cdot \exp(-\mu_{y}+\frac{1}{2}\sigma_{y}^{2})$ hence $\alpha_1$ is not identified, $\alpha_2$ might be by the variance or higher moments.
The moment conditions would be derived by plugging the new $\mu_\alpha$ into the integral above.

### 10
#### 10.a.
With fixed parameters on the distribution of $y_i$ there are only 3 parameters to estimate. 
$$35000 = \exp(\mu_y + \frac{\sigma_y^2}{2})$$
$$45000^2 = (\exp(\sigma_y^2) - 1)\cdot\exp(2\mu_y+\sigma_y^2)$$
Hence:
$$2(\log 35000 - \mu_y) = \sigma_y^2$$
$$2\log45000 = \log(\exp(\sigma_y^2) - 1) + 2\mu_y+\sigma_y^2$$
And:
$$\mu_y = \log\Big(\frac{35000^2}{\sqrt{45000^2 + 35000^2}}\Big)$$
$$\sigma_y = \sqrt{\log\Big(1+\frac{45000^2}{35000^2}\Big)}$$
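
As a quick sanity check of the algebra, the sketch below computes $\mu_y$ and $\sigma_y$ from the target mean and standard deviation and plugs them back into the log-normal moment formulas (the variable names `m`, `sd`, `implied_mean` and `implied_sd` are introduced here for illustration).

```julia
m, sd = 35_000.0, 45_000.0   # target mean and standard deviation of income y_i

μ_y = log(m^2 / sqrt(sd^2 + m^2))
σ_y = sqrt(log(1 + sd^2 / m^2))

# Moments implied by (μ_y, σ_y) under the log-normal distribution.
implied_mean = exp(μ_y + σ_y^2 / 2)
implied_sd   = sqrt((exp(σ_y^2) - 1) * exp(2 * μ_y + σ_y^2))

println((μ_y, σ_y))
println((implied_mean, implied_sd))
```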

In [34]:
#Wrapper creates a closure around the provided data set 
function Qn_wrapper_BLP_fixed(w, z)
    
    δ = zeros(size(w)[1])
    (e, q) = gausshermite(30)
    p = w[:, 5] #price
    s = w[:, 1] #share
    
    μ_y = log(35000^2 / sqrt(45000^2 + 35000^2))
    σ_y = sqrt(log(1+(45000^2)/(35000^2)))
    
    return θ -> begin #w, hp, ac
        # Append the fixed income parameters without mutating the optimizer's vector;
        # push!(θ, ...) would grow θ on every call and break the optimizer.
        θfull = vcat(θ, μ_y, σ_y)
        
        h(e) = 1/(√π) .* exp.(δ .+ (exp(-μ_y + 1/2 * σ_y^2) - exp(-μ_y - (√2) * σ_y * e)) .* p) / 
            (1 + sum(exp.(δ .+ (exp(-μ_y + 1/2 * σ_y^2) - exp(-μ_y - (√2) * σ_y * e)) .* p)))
        
        error = 1
        δ = zeros(size(w)[1])
        
        while error > 1e-13
            δold = copy(δ)
            δ = δold .+ log.(s) .- log.(hcat(h.(e)...) * q)
            error = maximum(abs.(δ - δold))
        end

        gvec = gn_BLP(hcat(δ, w[:, 2:5]), z, θfull) 
        1/2 * (gvec' * gvec)[1,1]
    end
end

Qn_wrapper_BLP_fixed (generic function with 1 method)

In [39]:
year = 1990

w = convert(Array{Float64}, data[data[:year] .== year, [:share, :weight, :hp, :ac, :price]])
z = convert(Array{Float64}, data[data[:year] .== year, [:weight, :hp, :ac, :comp_weight, :comp_hp, :comp_ac,
                                                                           :firm_weight, :firm_hp, :firm_ac]])

Qn = Qn_wrapper_BLP_fixed(w, z)

optres = optimize(Qn, [-9.0869, 0.365007, 1.77367], NelderMead())