# Section 2: Random Coefficient Model
So far we have assumed that every consumer has the same taste for each characteristic, which is unrealistic. In addition, some features of a good may be observable to the consumer but not to the econometrician. To allow for both, we introduce a new specification for utility:
$$
\begin{equation}
u_{ijt} = \alpha \ln(y_{it}-p_{jt}) + x_{jt}\beta + \sum_{k=1}^K \sigma_k v_{ik} x_{jkt} + \xi_{jt} + \epsilon_{ijt}
\end{equation}
$$
where $K$ is the number of (observable) characteristics, $\xi_{jt}$ is the component observed by the consumer but not by the econometrician, $v_{ik}$ is a standard normal random variable, and $y_{it}$ is the income of individual $i$ at time $t$. There are a couple of rearrangements I want to make before going on. First, let's rewrite the first term, using $\ln(1-x) \approx -x$ for small $x$.
$$
\begin{align}
\ln(y_{it}-p_{jt}) &= \ln(1-\frac{p_{jt}}{y_{it}}) + \ln(y_{it}) \\
& \approx  -\frac{p_{jt}}{y_{it}} + \ln(y_{it}) \\
\end{align}
$$
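As a quick numerical check of the first-order approximation $\ln(1-x) \approx -x$ used above, the sketch below compares the two sides for a few price-to-income ratios. The ratios are made-up illustrative values, not taken from the data; the point is only that the approximation is tight when price is small relative to income and degrades as $p_{jt}/y_{it}$ grows.

```python
import numpy as np

# Hypothetical price-to-income ratios p/y (illustrative values, not from the data).
ratio = np.array([0.01, 0.05, 0.10, 0.30])

exact = np.log(1.0 - ratio)   # ln(1 - p/y)
approx = -ratio               # first-order Taylor approximation

# The absolute error grows roughly like (p/y)^2 / 2.
abs_err = np.abs(exact - approx)
for r, e in zip(ratio, abs_err):
    print(f"p/y = {r:.2f}  abs. error = {e:.5f}")
```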
Next, define $$\delta_{jt} \equiv x_{jt}\beta + \xi_{jt} $$ and $$\mu_{ijt} \equiv -\alpha\frac{p_{jt}}{y_{it}} + \sum_{k=1}^K \sigma_k v_{ik} x_{jkt} $$
Now, we have the following model:
$$
\begin{equation}
u_{ijt} = \alpha \ln(y_{it}) + \delta_{jt} + \mu_{ijt} + \epsilon_{ijt}
\end{equation}
$$

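So, again, if individual $i$ buys good $j$, good $j$ gives them more utility than any other good $k$, including the outside good ($k=0$). Assuming, as in the plain logit model, that $\epsilon_{ijt}$ is i.i.d. type I extreme value and that $\delta_{0t} + \mu_{i0t}$ is normalized to zero for the outside good, the probability that individual $i$ buys good $j$ in year $t$ is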

$$ \begin{align}
Pr(i \text{ buys good } j \text{ in year } t) &= Pr(u_{ijt}> u_{ikt}, \forall k \in \{0,1,\ldots,J_t\}, k\neq j) \\
&= Pr(\alpha \ln(y_{it}) + \delta_{jt} + \mu_{ijt} + \epsilon_{ijt} > \alpha \ln(y_{it}) + \delta_{kt} + \mu_{ikt} + \epsilon_{ikt}  , \forall k \in \{0,1,\ldots,J_t\}, k\neq j) \\
&= Pr(\delta_{jt} + \mu_{ijt} + \epsilon_{ijt} > \delta_{kt} + \mu_{ikt} + \epsilon_{ikt}, \forall k \in \{0,1,\ldots,J_t\}, k\neq j)\\
&= \frac{\exp(\delta_{jt} + \mu_{ijt})}{1 + \sum_{k=1}^{J_t} \exp(\delta_{kt} + \mu_{ikt})}
\end{align} 
$$
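To make the last line of the derivation concrete, here is a minimal sketch that evaluates the choice probabilities of a single individual in a single market. The function name `choice_probs` and the numbers for $\delta$ and $\mu$ are hypothetical, chosen only for illustration.

```python
import numpy as np

def choice_probs(delta, mu_i):
    """Logit choice probabilities for one individual in one market.

    delta, mu_i: 1-D arrays over the inside goods j = 1..J.
    Returns (probabilities of the inside goods, probability of the outside good).
    """
    v = np.exp(delta + mu_i)
    denom = 1.0 + v.sum()      # the 1 is exp(0) for the outside good
    return v / denom, 1.0 / denom

delta = np.array([0.5, -0.2, 1.0])   # hypothetical mean utilities
mu_i = np.array([0.1, 0.3, -0.4])    # hypothetical individual deviations
p_inside, p_outside = choice_probs(delta, mu_i)
print(p_inside, p_outside)
```

The inside-good probabilities and the outside-good probability sum to one, which is what the normalization of the outside good buys us.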

Assuming that the individuals are distributed with cdf $T$, we can write the probability of good $j$ being sold in year $t$ as
$$
\begin{align}
Pr(\text{good } j \text{ being sold in year } t) &= \int_i Pr( i \text{ buys good } j \text{ in year } t) dT(i) \\
& = \int_i \frac{\exp(\delta_{jt} + \mu_{ijt})}{1 + \sum_{k=1}^{J_t} \exp(\delta_{kt} + \mu_{ikt})} dT(i) \\
& \approx \frac{1}{ns} \sum_{i=1}^{ns} \frac{\exp(\delta_{jt} + \mu_{ijt})}{1 + \sum_{k=1}^{J_t} \exp(\delta_{kt} + \mu_{ikt})} \\
\end{align}
$$

Here $ns$ is the number of simulated individuals drawn from $T$. Here is how we code this function.

In [275]:
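def s_comp(expdelta0,expmu):
    # Numerator exp(delta_jt + mu_ijt): one row per product-year, one column per draw
    nom = expdelta0 @ np.ones((1,ns)) * expmu
    # A holds year dummies, so A @ (A.T @ nom) gives each row the sum over its own market
    denom = np.dot(A,np.dot(A.T,nom))
    frac = nom/(1+denom)
    # Average the individual choice probabilities over the ns draws
    s_computed = (1/ns)*(frac.sum(1))
    return(s_computed)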
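The sketch below reproduces the same computation as `s_comp` on a toy data set, so the role of the year-dummy matrix is easier to see. It is an illustration under stated assumptions: `simulated_shares` is a hypothetical name, `A` is passed in explicitly instead of being read from a global, and the two markets and all numbers are made up.

```python
import numpy as np

def simulated_shares(expdelta, expmu, A):
    """Same computation as s_comp, with A passed in explicitly.

    expdelta: (n, 1) exp of mean utilities, one row per product-year.
    expmu:    (n, ns) exp of individual deviations, one column per draw.
    A:        (n, T) market (year) dummies.
    """
    num = expdelta * expmu        # broadcasting replaces the ones-matrix product
    denom = A @ (A.T @ num)       # within-market sums, copied back to each row
    return (num / (1.0 + denom)).mean(axis=1)

rng = np.random.default_rng(0)
n, ns = 5, 200
market = np.array([0, 0, 0, 1, 1])    # three products in market 0, two in market 1
A = np.eye(2)[market]                 # (5, 2) dummy matrix
expdelta = np.exp(rng.normal(size=(n, 1)))
expmu = np.exp(0.5 * rng.normal(size=(n, ns)))

s = simulated_shares(expdelta, expmu, A)
inside_share = A.T @ s                # total inside share per market
print(s, inside_share)
```

Because every simulated individual keeps some probability on the outside good, the inside shares of each market add up to less than one.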

To calculate the probabilities, we need the $\delta$'s and $\mu$'s. However, we do not know $\delta_{jt}$, because we have no data on $\xi_{jt}$. So how do we calculate $\delta_{jt}$? We solve for it numerically.

Let's write the following recursion

$$
\begin{equation}
\delta^{n+1} = \delta^{n} + \log(s^{DATA}) - \log(s(\theta)) \; \forall n=0,1,2,\ldots
\end{equation}
$$

where $\theta \equiv (\beta,\alpha,\sigma)$ and $\sigma \equiv (\sigma_1,\ldots,\sigma_K)$. Define $\theta_B \equiv (\alpha,\sigma)$, the parameters that enter outside of $\delta$, for later use.

Define $s(\theta) \equiv Pr(\text{good } j \text{ being sold in year } t)$. So, $\theta$ is the set of parameters that determines $Pr(\text{good } j \text{ being sold in year } t)$. 

BLP show that the above recursion is a contraction mapping, so it converges to a unique fixed point. In particular, for all $\epsilon>0$ there exists an integer $J_\epsilon$ such that for all $m>J_\epsilon$, $||\delta^{m+1}-\delta^{m}||<\epsilon$. Therefore, we can iterate on the recursion to find the $\delta$ that matches the market shares in the data for a given $\theta_B$.


In [276]:
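def contraction(sigma,alpha):
    # Start from the global initial guess exp(delta) as a column vector (deltam itself keeps its shape)
    expdelta0 = deltam.reshape(nrow,1)
    # Scale each column of the standard normal draws v by the matching sigma_k
    draws = v * np.kron(np.ones((ns,1)),sigma)
    # exp(mu_ijt) with mu_ijt = sum_k sigma_k v_ik x_jkt - alpha * p_jt / y_it
    expmu = (np.exp(np.dot(X,draws.T) - np.dot(price,np.ones((1,ns))) * alpha / y))
    dif = 1
    it = 0
    # Iterate exp(delta) <- exp(delta) * s_data / s(delta) until the relative change is below tol
    while(dif > tol and it < 1000):
        rat = s_data/s_comp(expdelta0,expmu)
        rat.shape = (nrow,1)
        expdelta1 = expdelta0 * rat
        dif = (abs(expdelta1/expdelta0-1)).max()
        expdelta0 = expdelta1
        it += 1
    return(np.log(expdelta0))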
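To see the share inversion at work without the data, here is a self-contained sketch on a single simulated market. It is an illustration, not the estimation code: the names `shares` and `invert_shares`, the true $\delta$, and the distribution of $\mu$ are all made up. We compute shares from a known $\delta$, then check that iterating on the recursion recovers it.

```python
import numpy as np

def shares(delta, mu):
    """Simulated shares for one market: delta is (J,), mu is (J, ns)."""
    v = np.exp(delta[:, None] + mu)
    return (v / (1.0 + v.sum(axis=0))).mean(axis=1)

def invert_shares(s_obs, mu, tol=1e-10, max_iter=1000):
    """Iterate delta <- delta + log(s_obs) - log(s(delta)) until the step is below tol."""
    delta = np.log(s_obs) - np.log(1.0 - s_obs.sum())   # plain-logit starting value
    for it in range(max_iter):
        step = np.log(s_obs) - np.log(shares(delta, mu))
        delta = delta + step
        if np.abs(step).max() < tol:
            break
    return delta, it + 1

rng = np.random.default_rng(1)
J, ns = 4, 500
delta_true = np.array([-2.0, -1.5, -1.0, -0.5])   # hypothetical mean utilities
mu = 0.8 * rng.normal(size=(J, ns))               # hypothetical individual deviations
s_obs = shares(delta_true, mu)                    # play the role of the observed shares

delta_hat, n_iter = invert_shares(s_obs, mu)
print(n_iter, np.abs(delta_hat - delta_true).max())
```

The sketch iterates in logs for readability; the notebook code iterates on $\exp(\delta)$ instead, which is the same recursion.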

For computational speed, I iterate on the exponentiated recursion, $\exp(\delta^{n+1}) = \exp(\delta^{n}) \, s^{DATA}/s(\theta)$, which avoids taking logs inside the loop. What next?

## GMM estimation

The moment condition we use for the estimation is
$$
\begin{equation}
E[\xi\mid Z] = 0
\end{equation}
$$
where $Z$ is the instrument matrix. Before going into the GMM objective function, let's first calculate $\xi$'s.

We noted earlier that $ \delta_{jt} \equiv x_{jt}\beta + \xi_{jt}$. Now that we "estimated" $\delta_{jt}$, we can back out $\xi_{jt}$. But first, we need $\beta$. 

We can get $\beta$ by 2SLS. How? Use the above equation to regress the $\delta$ we got from the contraction on the product characteristics $X$, using $Z$ as the instruments; $\xi$ is the error term of this regression.

$$
\begin{equation}
\hat{\beta} = (X^{'} Z \Phi^{-1} Z^{'} X)^{-1} X^{'} Z \Phi^{-1} Z^{'} \delta(\theta_B)
\end{equation}
$$
where $\Phi \equiv Z^{'}Z$, so that $\Phi^{-1}$ is the 2SLS weighting matrix.

Then, use $\hat{\beta}$ to back out $\xi$:
$$ \xi_{jt} = \delta_{jt}(\theta_B) - x_{jt}\hat{\beta} $$
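The 2SLS formula and the residual step can be sketched on synthetic data as follows. Everything here is an assumption made for illustration: the function name `two_sls`, the data-generating process, and the true coefficients. The regressor is deliberately correlated with the error so that the instruments matter.

```python
import numpy as np

def two_sls(X, Z, delta):
    """2SLS coefficients (X'P_Z X)^{-1} X'P_Z delta with P_Z = Z (Z'Z)^{-1} Z'."""
    ZtX = Z.T @ X
    Ztd = Z.T @ delta
    ZtZ = Z.T @ Z
    lhs = ZtX.T @ np.linalg.solve(ZtZ, ZtX)
    rhs = ZtX.T @ np.linalg.solve(ZtZ, Ztd)
    return np.linalg.solve(lhs, rhs)

rng = np.random.default_rng(2)
n = 5000
z1, z2 = rng.normal(size=n), rng.normal(size=n)
xi = rng.normal(size=n)                          # unobserved error
x1 = z1 + z2 + 0.5 * xi + rng.normal(size=n)     # regressor correlated with xi
X = np.column_stack([np.ones(n), x1])
Z = np.column_stack([np.ones(n), z1, z2])
beta_true = np.array([1.0, 2.0])                 # hypothetical true coefficients
delta = X @ beta_true + xi

beta_hat = two_sls(X, Z, delta)
xi_hat = delta - X @ beta_hat                    # backed-out residual
print(beta_hat)
```

Using `np.linalg.solve` instead of explicit inverses is a numerical-stability choice; it evaluates the same formula as above.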

Next, define $$g \equiv \frac{1}{J} Z^{'}\xi$$
where $J$ here is the total number of product-year observations (`nrow` in the code).

Finally, define the GMM objective function as

$$ Q(\theta_B) \equiv g^{'}Wg $$

Any symmetric, positive-definite $W$ gives consistent estimates of the parameters, so we use the identity matrix, i.e. $W = I$.
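As a small illustration of the objective, the sketch below evaluates $g^{'}Wg$ for two residual vectors: one raw, and one that has been made orthogonal to the instruments by construction. The function name `gmm_objective` and the simulated $Z$ and residuals are hypothetical.

```python
import numpy as np

def gmm_objective(xi, Z, W=None):
    """Q = g'Wg with g = Z'xi / n; W defaults to the identity matrix."""
    n = Z.shape[0]
    g = Z.T @ xi / n
    if W is None:
        W = np.eye(Z.shape[1])
    return float(g @ W @ g)

rng = np.random.default_rng(3)
n = 1000
Z = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
e = rng.normal(size=n)

# Residuals from projecting e on Z are orthogonal to Z by construction.
e_orth = e - Z @ np.linalg.lstsq(Z, e, rcond=None)[0]

q_raw = gmm_objective(e, Z)
q_orth = gmm_objective(e_orth, Z)
print(q_raw, q_orth)
```

The objective is zero (up to rounding) exactly when the sample moments $Z^{'}\xi$ vanish, and positive otherwise, which is why minimizing it pushes $\xi$ toward orthogonality with the instruments.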



In [277]:

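def obj(params):
    t = time.time()
    sigma = params[0:5]
    alpha = params[5]
    # Invert the market shares to get delta at the current (sigma, alpha)
    delta = contraction(sigma,alpha)
    # 2SLS estimate of beta; prem is the precomputed 2SLS projection matrix
    beta = np.dot(prem,delta)
    # Note: this binds a local name only; the global deltam used by contraction() is not updated
    deltam = delta
    xi = delta - np.dot(X,beta)
    # Sample moments g = Z'xi / nrow and objective g'g (W = I)
    g = (1/nrow)* (np.dot(Z.T,xi))
    ob = np.dot(g.T,g)
    ob = ob[0,0]
    t1 = time.time()-t
    # Print the evaluation time, then |sigma|, |alpha| and the objective value
    print(t1)
    print('{:6.5f}   {:6.5f}   {:6.5f}   {:6.5f}   {:6.5f}   {:6.5f}   {:6.5f}'.format(abs(sigma[0]), abs(sigma[1]), abs(sigma[2]),  abs(sigma[3]), abs(sigma[4]), abs(alpha), ob))
    return(ob)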

Finally, we find the parameters $\sigma$ and $\alpha$ by solving

$$
\begin{equation}
\min_{\theta_B = (\sigma,\alpha)} Q(\theta_B)
\end{equation}
$$


## Implementation

Step 0: Load necessary packages.

In [278]:
import pandas as pd
import numpy as np
from scipy import stats
from numba import jit
import time
from numpy.linalg import inv
from scipy.optimize import minimize

Step 1: Import the data (prepared earlier with the R code above).

In [279]:
data = pd.read_csv("data.csv")

Step 2: Preparation.

In [280]:
ns = 1500
nrow = data.shape[0]

X = data[["const", "hpwt", "air", "mpd", "size"]]
X = np.asarray(X)
sizeX = X.shape[1]
Z = data[["const", "hpwt", "air", "mpd", "size", "own_const", "own_hpwt", "own_air", "own_mpd", "own_size", "all_const", "all_hpwt", "all_air", "all_mpd", "all_size"]]
Z = np.asarray(Z)
phi = np.dot(Z.T,Z)
# 2SLS projection matrix, so that beta_hat = prem @ delta
prem = inv(X.T @ Z @ inv(phi) @ Z.T @ X) @ X.T @ Z @ inv(phi) @ Z.T


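# Year dummies: A[r, t] = 1 if row r belongs to year t
A = pd.get_dummies(data["year"])
A = np.asarray(A, dtype = np.float32)
AAT =  np.dot(A,A.T)
nyears = A.shape[1]

# Standard normal taste draws v_ik, one row per simulated individual
np.random.seed(6128)
v = np.random.randn(ns,sizeX)

# Income draws: log income in year t is normal with mean ym[t] and standard deviation sigma_y
ym = np.array([2.01156, 2.06526, 2.07843, 2.05775, 2.02915, 2.05346, 2.06745, 2.09805, 2.10404, 2.07208, 2.06019, 2.06561, 2.07672, 2.10437, 2.12608, 2.16426, 2.18071, 2.18856, 2.21250, 2.18377])
sigma_y = np.sqrt(1.72)
ym.shape = (nyears,1)
# Re-seeding with the same value means ar reuses the random stream that generated v
np.random.seed(6128)
ar = np.random.randn(nyears,ns)
# y has one row per product-year and one column per simulated individual
y = A @ np.exp(ym + sigma_y*ar)

# Initial guess for exp(delta), taken from the logdif column of the data
deltam = np.exp(data["logdif"])
deltam = np.asarray(deltam)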


s_data = np.asarray(data["share"])

price = np.asarray(data["price"])
price.shape = (nrow,1)


tol = 1e-6



Step 3: Estimation via minimization of GMM objective function with respect to the parameters $\sigma$ and $\alpha$.

In [281]:
#init = np.array([-7.16200,-1.45583,-0.08428,0.05325,-0.24379,37.61207])
init = np.array([3.612, 4.628, 1.818, 1.050, 2.056,43.501])
t0 = time.time()
soln = minimize(obj,init,method="L-BFGS-B", tol = 1e-2, options={'disp': True},)
elapsed_time = time.time()-t0
print(elapsed_time)

9.456685066223145
3.61200   4.62800   1.81800   1.05000   2.05600   43.50100   433.45228
9.666469097137451
3.61200   4.62800   1.81800   1.05000   2.05600   43.50100   433.45228
10.25153112411499
3.61200   4.62800   1.81800   1.05000   2.05600   43.50100   433.45228
10.417877912521362
3.61200   4.62800   1.81800   1.05000   2.05600   43.50100   433.45228
9.866004943847656
3.61200   4.62800   1.81800   1.05000   2.05600   43.50100   433.45229
10.619967937469482
3.61200   4.62800   1.81800   1.05000   2.05600   43.50100   433.45228
9.99603009223938
3.61200   4.62800   1.81800   1.05000   2.05600   43.50100   433.45228
9.885231971740723
3.55042   4.60724   1.70305   0.06136   1.99206   43.46846   197.92273
9.820586919784546
3.55042   4.60724   1.70305   0.06136   1.99206   43.46846   197.92273
10.03408694267273
3.55042   4.60724   1.70305   0.06136   1.99206   43.46846   197.92273
10.198553085327148
3.55042   4.60724   1.70305   0.06136   1.99206   43.46846   197.92273
10.621930122375488


10.318263053894043
7.24321   1.63011   0.10612   0.08418   0.14763   37.21567   12.39601
10.256860971450806
7.24321   1.63011   0.10612   0.08418   0.14763   37.21567   12.39601
10.155671119689941
7.24321   1.63011   0.10612   0.08418   0.14763   37.21567   12.39601
10.666969776153564
7.24321   1.63011   0.10612   0.08418   0.14763   37.21567   12.39601
10.37922978401184
7.24321   1.63011   0.10612   0.08418   0.14763   37.21567   12.39601
10.802839994430542
7.12487   1.53315   0.08139   0.05211   0.24392   37.52093   11.87359
10.674896955490112
7.12487   1.53315   0.08139   0.05211   0.24392   37.52093   11.87359
10.235674858093262
7.12487   1.53315   0.08139   0.05211   0.24392   37.52093   11.87359
10.261731147766113
7.12487   1.53315   0.08139   0.05211   0.24392   37.52093   11.87359
10.70065188407898
7.12487   1.53315   0.08139   0.05211   0.24392   37.52093   11.87359
10.156445980072021
7.12487   1.53315   0.08139   0.05211   0.24392   37.52093   11.87359
10.209495067596436
7.12