# IV estimation of the Similarity model

This notebook introduces a BLP estimator of Similarity demand parameters. Since standard instruments often result in biased parameter estimates and higher standard errors, we also implement optimal instruments (See e.g. Reynaert & Verboven, 2014) in the Similarity Model setting.

In [1]:
import numpy as np
import pandas as pd
#pd.options.mode.chained_assignment = None
pd.set_option('display.max_rows', 500)
import os
import sys
from numpy import linalg as la
from scipy import optimize
import scipy.stats as scstat
from matplotlib import pyplot as plt
import itertools as iter
%load_ext line_profiler

# Files
module_path = os.path.abspath(os.path.join('../..'))
if module_path not in sys.path:
    sys.path.append(module_path)

data_path = os.path.join(module_path, 'data')

from utilities.Logit_file import estimate_logit, logit_se, logit_t_p, q_logit, logit_score, logit_score_unweighted, logit_ccp, LogitBLP_estimator, LogitBLP_se
from data.Eurocarsdata_file import Eurocars_cleandata

In [2]:
# Load dataset and variable names
descr = (pd.read_stata(os.path.join(data_path,'eurocars.dta'), iterator = True)).variable_labels() # Obtain variable descriptions
dat_file = pd.read_csv(os.path.join(data_path, 'eurocars.csv')) # reads in the data set as a pandas dataframe.

In [3]:
# Outside option is included if OO == True, otherwise analysis is done on the inside options only.
OO = True

# Choose which variables to include in the analysis, and assign them either as discrete variables or continuous.

x_discretevars = [ 'brand', 'home', 'cla']
x_contvars = ['cy', 'hp', 'we', 'le', 'wi', 'he', 'li', 'sp', 'ac', 'pr']
z_IV_contvars = ['xexr']
z_IV_discretevars = []
x_allvars =  [*x_contvars, *x_discretevars]
z_allvars = [*z_IV_contvars, *z_IV_discretevars]

if OO:
    nest_contvars = [var for var in x_contvars if var != 'pr'] # We nest over all variables other than price, but an alternative list can be specified here if desired.
    nest_discvars = ['in_out', *x_discretevars]
    nest_vars = ['in_out', *nest_contvars, *x_discretevars]
else:
    nest_contvars = [var for var in x_contvars if (var != 'pr')]
    nest_discvars = x_discretevars # See above
    nest_vars = [*nest_contvars, *nest_discvars]

G = len(nest_vars)

# Print list of chosen variables as a dataframe
pd.DataFrame(descr, index=['description'])[x_allvars].transpose().reset_index().rename(columns={'index' : 'variable names'})

Unnamed: 0,variable names,description
0,cy,cylinder volume or displacement (in cc)
1,hp,horsepower (in kW)
2,we,weight (in kg)
3,le,length (in cm)
4,wi,width (in cm)
5,he,height (in cm)
6,li,"average of li1, li2, li3 (used in papers)"
7,sp,maximum speed (km/hour)
8,ac,time to acceleration (in seconds from 0 to 100...
9,pr,price (in destination currency including V.A.T.)


In [4]:
dat, dat_org, x_vars, z_vars, N, pop_share, T, J, K = Eurocars_cleandata(dat_file, x_contvars, x_discretevars, z_IV_contvars, z_IV_discretevars, outside_option=OO)

In [5]:
# Create dictionaries of numpy arrays for each market. This allows the size of the data set to vary over markets.

dat = dat.reset_index(drop = True).sort_values(by = ['market', 'co']) # Sort data so that reshape is successfull

x = {t: dat[dat['market'] == t][x_vars].values.reshape((J[t],K)) for t in np.arange(T)} # Dict of explanatory variables
y = {t: dat[dat['market'] == t]['ms'].to_numpy().reshape((J[t])) for t in np.arange(T)} # Dict of market shares

# BLP Estimation and instruments

The setting is now a bit different. Instead of the noise coming from random sampling of individuals, we now have an additional source of uncertainty, stemming frm the random sampling of the fixed effects $\xi_{tj}$ for each market and each product. The number of ”observations” is therefore

$$
S = T \cdot \sum_t J_t
$$

Note that while random sampling of individuals choices (number of observations
in the hundreds of millions) still has an effect on the estimated parameters in
principle, this effect is completely drowned out by the sampling variance of the
fixed effects (number of observations $S \approx 150^2 \cdot 50$), so we choose to ignore it
here. When estimating random coefficients Models, there is also a third source
of uncertainty stemming from approximation of numerical integrals. This is not
an issue in Similarity, as we have the inverse demand in closed form.

The principles are pretty similar to what we have been doing already. When
applicable, we will use the same notation as in the FKN section. Define the
residual,

$$\xi_t(\theta) = u(X_t, \beta) − \nabla_q \Omega(q_t^0|\lambda)$$

In the Similarity Model, this residual is a linear function of $\theta$ which has the form

$$\xi_t(\theta) =  G^0_t \theta − r_t^0$$

where $ G^0_t=[X_t, -\nabla_{q,\lambda}\Omega(q_t^0|\lambda)]$ and $r^0_t = \ln q^0_t$ as in the FKN section with $q^0_t$ being e.g. the observed market shares in market $t = 1, \ldots, T$. For the BLP estimator, we set this residual orthogonal to a matrix of instruments $ Z_t$ of size $J_t \times (K+G)$, and find the estimator $ \hat \theta^{IV}$ which solves the moment conditions

$$\frac{1}{T} \sum_t  Z_t' \xi(\hat \theta^{IV}) = 0$$

Since $\hat \xi$ is linear, the moment equations have a unique solution,

$$\hat \theta^{IV} = \left(\frac{1}{T}\sum_t  Z_t' G^0_t \right)^{-1}\left(\frac{1}{T}\sum_t  Z_t' r^0_t \right)$$

We require an instrument for the price of the goods. This is something which is correlated with the price, but uncorrelated with the error term $\xi_t$ (in the BLP Model, $\xi_{tj}$ represents unobserved components of car quality). A standard instrument in this case would be a measure of marginal cost (or something which is correlated with marginal cost, like a production price index). For everything other than price, we can simply use the regressor itself as the instrument i.e. $  Z^{tjd} = G^0_{tjd}$, for all other dimensions than price.

First we construct our instruments $ Z$. We'll use the average exchange rate of the destination country relative to average exchange rate of the origin country. 

In [None]:
S = T*np.sum(np.array([x[t].shape[0] for t in np.arange(T)]))

xexr = {t: dat[dat['market'] == t][z_vars[0]].values for t in np.arange(T)}
G0 = G_array(y, x, Model)
pr_index = len(x_contvars)
for t in np.arange(T):
    G0[t][:,pr_index] = xexr[t] / xexr[t].max()

z = G0

We then calculate the moment estimator $\hat \theta^{IV}$.

In [None]:
def BLP_estimator(q_obs, z, x, sample_share, model):
    '''
    Args.
        q_obs: a dictionary of T numpy arrasy (J[t],) of observed or nonparametrically estimated market shares for each market t
        z: a dictionary of T numpy arrays (J[t],K+G) of instruments for each market t
        x: a dictionary of T numpy arrays (J[t],K) of covariates for each market t
        sample_share: A (T,) numpy array of the fraction of observations in each market t 
        model: a dictionary of the Similarity Model specification as outputted by 'Similarity_specification'

    Returns
        theta_hat: a numpy array (K+G,) of BLP parameter estimates
    '''
    T = len(z)

    G = G_array(q_obs, x, model)
    d = G[0].shape[1]
    r = {t: np.log(q_obs[t], out = np.NINF*np.ones_like((q_obs[t])), where = (q_obs[t] > 0)) for t in np.arange(T)}
    
    sZG = np.empty((T,d,d))
    sZr = np.empty((T,d))

    for t in np.arange(T):
        sZG[t,:,:] = sample_share[t]*np.einsum('jd,jp->dp', z[t], G[t])
        sZr[t,:] = sample_share[t]*np.einsum('jd,j->d', z[t], r[t])

    theta_hat = la.solve(sZG.sum(axis=0), sZr.sum(axis=0))
    
    return theta_hat

In [None]:
BLP_theta = BLP_estimator(y, z, x, np.ones((T,)), Model)

In the Logit Model we get the parameter estimates:

In [None]:
G_logit = x
for t in np.arange(T):
    G_logit[t][:,pr_index] = xexr[t] / xexr[t].max()

z_logit = G_logit

In [None]:
LogitBLP_beta = LogitBLP_estimator(y, z_logit, x, np.ones((T,)))
LogitBLP_SE = LogitBLP_se(LogitBLP_beta, y, z_logit, x)
LogitBLP_t,LogitBLP_p = logit_t_p(LogitBLP_beta, logit_score_unweighted(LogitBLP_beta, y, x), np.ones((T,)), S)
LogitBLP_beta

array([-14.92920752,  -2.3589754 ,  -6.76421995,   0.02963003,
        -2.05176127,  10.84731336,  -1.04140126,  -0.58331478,
         5.15289118,   0.51808091,  -0.17336342,  -2.037768  ,
        -0.81720168,  -1.44357757,  -1.04059281,  -1.16245013,
        -1.74530433,  -0.85123531,  -2.72300281,  -1.08758839,
        -0.68958989,  -0.95909482,  -2.11727698,  -2.93039275,
        -2.90655875,  -2.05527142,  -1.82107985,   0.51974428,
        -2.02980519,  -0.79701277,  -0.86356478,  -0.86816254,
        -0.81661044,  -1.48858878,  -0.9378501 ,  -1.87621854,
        -3.7657435 ,  -1.52567201,  -3.14936663,  -2.07998398,
        -1.85954898,  -0.7631942 ,  -1.94891051,  -1.60837966,
        -1.15784827,  -0.48973547,  -2.57588437,   1.56903974,
         0.0275757 ,   0.04982496,  -0.30342661,  -0.3829885 ])

### BLP approximation to optimal instruments

BLP propose an algorithm for constructing an approximation to the optimal instruments. It is described in simple terms in Reynaert & Verboven (2014), and it has the following steps.
It requires a consistent initial parameter estimate $\hat \theta = (\hat \beta', \hat \lambda')'$; here we can just use the MLE or the FKN estimates we have already computed. Let $Z_t$ denote the matrix of instruments (this is the matrix $X_t$ with the price replaced by the exchange rate). The steps are then as follows:

First we form the regression equation of the covariates on the instruments:
$$
X_t = Z_t \Pi + \Epsilon_t
$$

The OLS estimate is then given as:
$$
\hat \Pi = \left( \frac{1}{T}\sum_t Z_t' Z_t \right)^{-1}\left( \frac{1}{T}\sum_t Z_t' X_t\right)
$$

Thus the predicted covariates given the instruments $W$ are:
$$
\hat X_t = Z_t \hat \Pi
$$

Having constructed $\hat X_t$ (which consists of the exogenous regressors, and the predicted price given $Z_t$), we compute the predicted mean utility:

$$
\hat u_t = \hat X_t \hat \beta
$$

and then the predicted market shares at the mean utility:

$$
\hat q_t^{*} = P(\hat u_t | \hat \lambda)
$$

Computationally, here we just use $\hat X_t$ in place of $X_t$ in the CCP function.
Given the predicted market shares, we compute

$$
\hat G_t^{*} = \left[\hat X_t, -\nabla_{q,\lambda} \Omega (\hat q_t^{*} | \hat \lambda)\right]
$$

which is the same as the function $\hat G_t^0$ we already have constructed, except we evaluate it at the
predictions $\hat X_t$ and $\hat q_t^{*}$ instead of at $X_t$ and $\hat q_t^0$.

The procedure above gives an approximation to the optimal instruments. We also require a weight matrix. The optimal weight matrix is the (generalized) inverse of the conditional (on the instruments) covariance of the fixed effects. Assuming $\xi_{tj}$ is independently and identically distributed over markets t and products j, the conditional covariance simplifies to a scalar $\sigma^2$ times an identity matrix (of size $J_t$).
This means that all fixed effects are weighted equally, and the weights therefore drop out of the IV regression. The optimal IV estimator is therefore

$$
\hat \theta^{\text{IV}} = \left(\frac{1}{T}\sum_t (\hat G_t^*)'\hat G_t^0\right)^{-1}\left( \frac{1}{T}\sum_t (\hat G_t^*)'\hat r_t^0 \right)
$$

Let $\hat \xi^*$ denote the estimated residual evaluated at the new parameter estimates,

$$
\hat \xi_{tj}^* = \hat \xi_{tj}(\hat \theta^{\text{IV}})
$$

We may estimate the constant $\sigma^2$ by

$$
\hat \sigma^2 = \frac{1}{S}\sum_{t}\sum_{j = 1}^{J_t} \left(\hat \xi_{tj}^*\right)^2 
$$

The distribution of the estimator $\hat \theta^{\text{IV}}$ is then

$$
\hat \theta^{\text{IV}} \sim \mathcal{N}(\theta_0, \Sigma^{\text{IV}})
$$

which can be consistently estimated by

$$
\hat \Sigma^{\text{IV}} = \hat \sigma^2 \left( \sum_t (\hat G_t^*)'\hat G_t^0 \right)^{-1}
$$

and the standard errors are then the square root of the diagonal elements.

In [None]:
def predict_x(x, w, sample_share):
    ''' 
    This function computes the predicted covariates from a regression on the instruments

    Args:
        x: a dictionary of T numpy arrays (J[t],K) of covariates for each market t
        w: a dictionary of T numpy arrays (J[t],K) of instruments for each covariate for each market t
        sample_share: A (T,) numpy array of the fraction of observations in each market t 

    Returns.
        X_hat: a dictionary of T numpy arrays (J[t],K) of predicted covariates for each market t
    '''
    
    T = len(w)
    K = w[0].shape[1]

    sWW = np.empty((T,K,K))
    sWX = np.empty((T,K,K))

    for t in np.arange(T):
        sWW[t,:,:] = sample_share[t]*np.einsum('jk,jl->kl', w[t], w[t])
        sWX[t,:,:] = sample_share[t]*np.einsum('jk,jl->kl', w[t], x[t])

    Pi_hat = la.solve(sWW.sum(axis=0), sWX.sum(axis=0))
    X_hat = {t: np.einsum('jl,lk->jk', w[t], Pi_hat) for t in np.arange(T)}

    return X_hat

In [None]:
def BLP_se(Theta, y, x, model):
    '''
    This function computes BLP standard errors which are consistent when using optimal instruments

    Args:
        Theta: a numpy array (K+G,) of BLP estimated 
        y: a dictionary of T numpy arrasy (J[t],) of observed or nonparametrically estimated market shares for each market t
        x: a dictionary of T numpy arrays (J[t],K) of covariates for each market t 
        model: a dictionary of the Similarity Model specification as outputted by 'Similarity_specification'

    Returns.
        SE: a numpy array (K+G,) of estimated BLP standard errors using optimal instruments
    '''
    T = len(x)
    S = T * np.array([x[t].shape[0] for t in np.arange(T)]).sum()

    G = G_array(y, x, model)
    d = G[0].shape[1]
    r = {t: np.log(y[t]) for t in np.arange(T)}
    
    # We calculate \sigma^2
    xi = {t: np.einsum('jd,d->j', G[t], Theta) - r[t] for t in np.arange(T)}
    sum_xij2 = np.empty((T,))

    for t in np.arange(T):
        sum_xij2[t] = (xi[t]**2).sum()
    
    sigma2 = np.sum(sum_xij2) / S

    # We calculate GG for each market t
    GG = np.empty((T,d,d))

    for t in np.arange(T):
        GG[t,:,:] = np.einsum('jd,jp->dp', G[t], G[t])

    # Finally we compute \Sigma and the standard errors
    Sigma = sigma2*la.inv(GG.sum(axis=0))
    SE = np.sqrt(np.diag(Sigma))

    return SE

In [None]:
def OptimalBLP_estimator(Theta0, q_obs, w, x, sample_share, model):
    '''
    This function estimates the Similarity demand model using optimal instruments in the BLP setting
    
    Args:
        Theta0: a numpy array (K+G,) of consistent parameter estimates from estimation using the covariates ('first-stage parameters')
        q_obs: a dictionary of T numpy arrasy (J[t],) of observed or nonparametrically estimated market shares for each market t
        w: a dictionary of T numpy arrays (J[t],K+G) of instruments for each market t
        x: a dictionary of T numpy arrays (J[t],K) of covariates for each market t
        sample_share: a (T,) numpy array of the fraction of observations in each market t 
        model: a dictionary of the Similarity Model specification as outputted by 'Similarity_specification'

    Returns.
        Theta_IV: a numpy array (K+G,) of BLP parameter estimates in the Similarity Model using optimal instruments
        SE_IV: a numpy array (K+G,) of estimated BLP standard errors using optimal instruments
    '''
    
    T = len(x)
    K = x[0].shape[1]
    
    X_hat = predict_x(x, w, sample_share)
    q0 = Similarity_ccp(Theta0, X_hat, model)
    G_star = G_array(q0, X_hat, model)
    G0 = G_array(q_obs, x, model)
    
    r = {t: np.log(q_obs[t]) for t in np.arange(T)}

    d = G0[0].shape[1]

    sGG = np.empty((T,d,d))
    sGr = np.empty((T,d))

    for t in np.arange(T):
        sGG[t,:,:] = sample_share[t]*np.einsum('jd,jp->dp', G_star[t], G0[t])
        sGr[t,:] = sample_share[t]*np.einsum('jd,j->d', G_star[t], r[t])

    Theta_IV = la.solve(sGG.sum(axis=0), sGr.sum(axis=0))
    SE_IV = BLP_se(Theta_IV, q_obs, x, model)

    return Theta_IV, SE_IV

In [None]:
ThetaOptBLP, SEOptBLP = OptimalBLP_estimator(FKN_theta, y, z_logit, x, np.ones((T,)), Model)
OptBLP_t, OptBLP_p = Similarity_t_p(SEOptBLP, ThetaOptBLP, S)

In [None]:
np.array([p for p in ThetaOptBLP[K:]  if p > 0]).sum()

1.259433210474408

In [None]:
reg_table(ThetaOptBLP, SEOptBLP, N, x_vars, nest_vars)

variables,theta,se,t (theta == 0),p
in_out,-11.7985***,0.03458,341.164,0.0
cy,-0.7545***,0.02002,37.691,0.0
hp,-5.587***,0.02601,214.837,0.0
we,0.2574***,0.02089,12.326,0.0
le,-2.4046***,0.02323,103.5,0.0
wi,5.9706***,0.03394,175.927,0.0
he,0.8705***,0.02738,31.794,0.0
li,-0.77***,0.01218,63.199,0.0
sp,5.02***,0.02657,188.901,0.0
ac,1.1385***,0.01299,87.662,0.0


In [None]:
qOpt = Similarity_ccp(ThetaOptBLP, z_logit, Model)