# Random Coefficients Logit Tutorial - PyBLP


This tutorial follows Section 4 of Aviv Nevo (2000), "A practitioner’s guide to estimation of random-coefficients logit models of demand," *Journal of Economics & Management Strategy*, 9(4), 513-548.

The paper illustrates the random-coefficients logit model with an application to breakfast cereals. We use the same data and solve the paper’s cereal problem. The data is fake and should only be used to learn the method.

We will use the PyBLP package for Python 3. Documentation for this package can be found at https://pyblp.readthedocs.io/en/stable/index.html.

### Theory of Random Coefficients Logit
This method retains the benefits of simpler discrete-choice models: it can be estimated using only market-level price and quantity data and it deals with the endogeneity of prices. Moreover, it returns more realistic demand elasticities than Logit/Nested Logit models. 

The chosen specification of the indirect utility of consumer $i$ from consuming product $j$ in market $t$ is:

$u_{ijt} = \alpha_i p_{jt} + x'_{jt} \beta_i + \xi_{jt} + \epsilon_{ijt}$

where $p_{jt}$ is the price, $x'_{jt}$ is the (row) vector of $K$ observable product characteristics, $\epsilon_{ijt}$ is a mean-zero stochastic term that is i.i.d. Type I Extreme Value (Gumbel), and $\xi_{jt}$ captures the product characteristics that are unobserved by the econometrician.

This specification can be derived from a quasilinear utility function (free of wealth effects) because of the way price enters the indirect utility function. Including wealth effects could be more reasonable for other types of products (e.g. cars). Notice that $\xi_{jt}$, which among other things captures the elements of vertical product differentiation, is identical for all consumers, while $\alpha_i$ varies: this is consistent with the theoretical literature of vertical product differentiation.

The mean utility of the outside good is normalized to zero, so that $u_{i0t} = \epsilon_{i0t}$.

We can separate the linear component of utility from the nonlinear one: $u_{ijt} = \delta_{jt} + \mu_{ijt} + \epsilon_{ijt} \;$, where $\delta_{jt} = \alpha p_{jt} + x'_{jt} \beta + \xi_{jt} \;$ is the mean utility, common to all individuals, and $\mu_{ijt}(\theta) \;$ is the individual-specific deviation from that mean (let $\theta$ be a vector with all parameters of the model).

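Consumers are assumed to purchase one unit of the good that gives the highest utility. The set of individual attributes that lead to the choice of good $j$ is:

$A_{jt}(\delta) = \{(\mu_i, \epsilon_i) \; | \; \delta_{jt} + \mu_{ijt} + \epsilon_{ijt} > \delta_{j't} + \mu_{ij't} + \epsilon_{ij't},\; \text{for all} \; j' \neq j \}$

The predicted market share of product $j$ is the probability mass of $A_{jt}$. Because $\epsilon_{ijt}$ is Type I Extreme Value, it can be integrated out analytically, which leaves an integral of logit choice probabilities over the distribution $F$ of the individual heterogeneity:

$S_{jt}(\delta_{jt}, \theta) = \int \frac{\exp{(\delta_{jt} + \mu_{ijt})}}{1+\sum_k \exp{(\delta_{kt} + \mu_{ikt})}}dF(\mu_i) $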
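
In practice the integral has no closed form and is approximated by averaging logit choice probabilities over simulated consumers. The sketch below illustrates this under simplifying assumptions that are not part of the tutorial: a single product characteristic `x`, a single random coefficient with standard deviation `sigma` (so $\mu_{ijt} = \sigma x_{jt} v_i$), and made-up values for `delta` and `x`. It uses only the standard library and is not how PyBLP computes shares internally.

```python
import math
import random


def choice_probabilities(delta, mu_i):
    """Logit choice probabilities for one consumer; the outside good has utility 0."""
    exp_u = [math.exp(d + m) for d, m in zip(delta, mu_i)]
    denom = 1.0 + sum(exp_u)  # the 1 is exp(0) for the outside good
    return [e / denom for e in exp_u]


def simulated_shares(delta, x, sigma, draws):
    """Average the choice probabilities over simulated consumers (Monte Carlo)."""
    shares = [0.0] * len(delta)
    for v in draws:
        mu_i = [sigma * x_j * v for x_j in x]  # assumed form: mu_ijt = sigma * x_jt * v_i
        probs = choice_probabilities(delta, mu_i)
        shares = [s + p / len(draws) for s, p in zip(shares, probs)]
    return shares


rng = random.Random(0)
draws = [rng.gauss(0.0, 1.0) for _ in range(500)]  # v_i ~ N(0, 1)
delta = [-1.0, -2.0, -1.5]  # hypothetical mean utilities
x = [1.0, 0.5, 2.0]         # hypothetical product characteristic

shares = simulated_shares(delta, x, sigma=0.8, draws=draws)
outside_share = 1.0 - sum(shares)
print(shares, outside_share)
```

Setting `sigma=0.0` removes the heterogeneity, and the function returns the plain logit shares.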

For each $\theta$ there is a unique $\delta_{jt}(\theta)$ that solves $S^{obs}_{jt} - S_{jt}(\delta_{jt}, \theta) = 0 \;$ (Berry, 1994). This system of equations is nonlinear and is solved numerically. It can be solved with the contraction mapping suggested by Berry, Levinsohn, and Pakes (BLP, 1995), which means computing the series:

$\delta_{jt}^{h+1} = \delta_{jt}^{h} + \ln(S^{obs}_{jt}) - \ln(S_{jt}(\delta_{jt}^{h}, \theta))\;\;\;\;$ (see BLP Appendix I for proof of convergence)

where $h=0, \ldots, H$, $H$ is the smallest integer such that $||\delta_{jt}^{H}-\delta_{jt}^{H-1}||$ is smaller than some tolerance level, and $\delta_{jt}^{H}$ is the approximation to $\delta_{jt}$. In words, for a given $\theta$ we evaluate the right-hand side of the series at an initial guess $\delta_{jt}^{0}$, obtain $\delta_{jt}^{1}$, substitute it back into the right-hand side, and repeat the process until convergence.
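
The iteration can be sketched in a few lines of plain Python. This is an illustration only, under assumptions that are not in the tutorial: one market, one product characteristic `x`, one random coefficient with standard deviation `sigma`, made-up observed shares, and the plain-logit inversion $\ln(s_j) - \ln(s_0)$ as the starting guess. PyBLP uses its own, more elaborate fixed-point routines.

```python
import math
import random


def simulated_shares(delta, x, sigma, draws):
    """Monte Carlo approximation of the predicted shares S(delta, theta)."""
    shares = [0.0] * len(delta)
    for v in draws:
        exp_u = [math.exp(d + sigma * x_j * v) for d, x_j in zip(delta, x)]
        denom = 1.0 + sum(exp_u)
        shares = [s + e / denom / len(draws) for s, e in zip(shares, exp_u)]
    return shares


def solve_delta(observed, x, sigma, draws, tol=1e-12, max_iter=1000):
    """Iterate delta <- delta + ln(s_obs) - ln(S(delta)) until the update is below tol."""
    s0 = 1.0 - sum(observed)  # share of the outside good
    delta = [math.log(s) - math.log(s0) for s in observed]  # plain-logit starting guess
    for iteration in range(1, max_iter + 1):
        predicted = simulated_shares(delta, x, sigma, draws)
        new_delta = [d + math.log(s) - math.log(p)
                     for d, s, p in zip(delta, observed, predicted)]
        change = max(abs(n - d) for n, d in zip(new_delta, delta))
        delta = new_delta
        if change < tol:
            return delta, iteration
    raise RuntimeError("contraction mapping did not converge")


rng = random.Random(0)
draws = [rng.gauss(0.0, 1.0) for _ in range(200)]
observed = [0.20, 0.10, 0.05]  # hypothetical observed shares
x = [1.0, 0.5, 2.0]            # hypothetical product characteristic

delta_hat, n_iter = solve_delta(observed, x, sigma=0.8, draws=draws)
fitted = simulated_shares(delta_hat, x, 0.8, draws)
print(n_iter, delta_hat, fitted)
```

With `sigma=0.0` the starting guess already solves the system, so the loop stops after its first pass.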

We then compute the error term $\xi_{jt}(\theta) = \delta_{jt}(\theta) - \alpha p_{jt} - x'_{jt} \beta$. Let $z_{jt}$ be a (row) vector of instruments such that $E[z'_{jt} \xi_{jt}(\theta)] = 0$, and let $Z$ be the matrix that stacks $z_{jt}$ over all products and markets. The GMM estimate is then

$\hat\theta_{GMM} = \underset{\theta}{\operatorname{argmin}} \; \xi(\theta)' Z \Phi^{-1} Z' \xi(\theta)$

where $\Phi$ is a consistent estimate of the variance-covariance matrix of the moments. The inverse is used to give less weight to those moments that have higher variance.
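
To make the quadratic form concrete, here is a small sketch that evaluates a GMM objective for a toy model with one parameter. Everything in it is hypothetical: the four observations, the two instruments, the residual function $\xi(\alpha) = \delta - \alpha p$, and the identity weighting matrix `W`, which stands in for $\Phi^{-1}$. The data are constructed so that the residuals at $\alpha = -2$ are exactly orthogonal to the instruments.

```python
def gmm_objective(xi, instruments, weights):
    """Quadratic form g' W g, where g stacks the sums of instrument times residual."""
    n_moments = len(instruments[0])
    g = [sum(row[m] * e for row, e in zip(instruments, xi)) for m in range(n_moments)]
    return sum(g[a] * weights[a][b] * g[b]
               for a in range(n_moments) for b in range(n_moments))


prices = [1.0, 2.0, 3.0, 4.0]
delta = [-1.0, -5.0, -5.0, -9.0]  # built as -2 * price + [1, -1, 1, -1]
Z = [[1.0, 1.0], [1.0, 1.0], [1.0, 2.0], [1.0, 2.0]]  # two hypothetical instruments
W = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, an assumption replacing Phi^{-1}


def residuals(alpha):
    """xi(alpha) = delta - alpha * price for the toy linear model."""
    return [d - alpha * p for d, p in zip(delta, prices)]


grid = [-3.0, -2.5, -2.0, -1.5, -1.0]
best_alpha = min(grid, key=lambda a: gmm_objective(residuals(a), Z, W))
print(best_alpha, gmm_objective(residuals(best_alpha), Z, W))
```

A real estimation replaces the grid with a numerical optimizer and, in a second step, replaces `W` with the inverse of the estimated moment covariance.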

### Specification of Random Taste Parameters

We have to specify an initial guess for the nonlinear parameters. This serves two primary purposes: it speeds up estimation, and initial values of zero tell the solver which parameters are restricted to always be zero.

It is common to assume that the random taste parameters follow a multivariate normal distribution, and to break them up into three parts:

$\begin{pmatrix} \alpha_i \\ \beta_i \end{pmatrix} =\begin{pmatrix} \alpha \\ \beta \end{pmatrix} + \Pi d_i + \Sigma v_i$.

where $\alpha$ and $\beta$ are the mean tastes common to all individuals, $d_i$ is a $D\times1$ vector of demographic variables, $\Pi$ is a $(K+1)\times D$ matrix of coefficients that measure how tastes vary with demographics, $\Sigma$ is a $(K+1)\times (K+1)$ matrix of parameters, and $v_i$ represents unobserved individual characteristics. We do not observe $d_i$ or $v_i$ for individual consumers. The difference between the two is that we know something about the distribution of the demographics $d_i$ (e.g. through census data).
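
As a small numerical illustration of this decomposition, the sketch below computes the taste vector of one consumer. All numbers are invented for the example ($K = 1$ characteristic plus price, $D = 2$ demographics), and the helper names are hypothetical.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def individual_tastes(mean, Pi, d_i, Sigma, v_i):
    """(alpha_i, beta_i) = (alpha, beta) + Pi d_i + Sigma v_i."""
    demographic_part = mat_vec(Pi, d_i)
    unobserved_part = mat_vec(Sigma, v_i)
    return [m + a + b for m, a, b in zip(mean, demographic_part, unobserved_part)]


mean = [-30.0, 1.0]               # hypothetical mean tastes (alpha, beta)
Pi = [[2.0, 0.0], [0.0, 0.5]]     # (K+1) x D, hypothetical
Sigma = [[1.0, 0.0], [0.0, 0.2]]  # (K+1) x (K+1), hypothetical
d_i = [0.5, -1.0]                 # demographics of consumer i (e.g. a census draw)
v_i = [1.0, 2.0]                  # unobserved draws, e.g. from N(0, 1)

tastes = individual_tastes(mean, Pi, d_i, Sigma, v_i)
# alpha_i = -30 + 2 * 0.5 + 1 * 1 = -28 and beta_i = 1 + 0.5 * (-1) + 0.2 * 2 = 0.9
print(tastes)
```

Setting an entry of `Pi` or `Sigma` to zero switches off that source of heterogeneity for the corresponding taste parameter.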

### The Data
The data used for the analysis below consists of `shares` and `prices` for 24 brands of breakfast cereals (a differentiated product) in 47 cities over 2 quarters (`quarter`). `market_ids` are the unique market identifiers (which we subscript $t$). Within a market, the sum of all `shares` must be less than 1. Firms and brands are identified by the columns `firm_ids` and `product_ids`. There are two product characteristics: `sugar`, which measures sugar content, and `mushy`, a dummy variable equal to one if the product gets soggy in milk. There are 20 pre-computed instruments (`demand_instruments0`, ... , `demand_instruments19`). These represent only the excluded instruments. The exogenous regressors will be automatically added to the set of instruments. Finally, demographic variables include the log of income (`income`), the log of income squared (`income_squared`), `age`, and `child`, a dummy variable equal to one if the individual is less than sixteen.
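
The requirement that shares sum to less than one within each market is what leaves room for the outside good. A minimal check of that condition might look as follows; the rows are made up for the example and the function name is hypothetical, so this is only a sketch of the idea, not part of PyBLP.

```python
from collections import defaultdict


def outside_shares(rows):
    """Return 1 minus the sum of inside shares for each market, checking the sum is in (0, 1)."""
    inside = defaultdict(float)
    for market_id, share in rows:
        inside[market_id] += share
    result = {}
    for market_id, total in inside.items():
        if not 0.0 < total < 1.0:
            raise ValueError(f"inside shares in market {market_id} sum to {total}")
        result[market_id] = 1.0 - total
    return result


# Hypothetical (market_ids, shares) pairs
rows = [("A", 0.20), ("A", 0.10), ("A", 0.05), ("B", 0.30), ("B", 0.25)]
outside = outside_shares(rows)
print(outside)
```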




## Explaining the Results

## Code

We start with a plain logit model and then progressively move towards random coefficients.

Before jumping into the code, let's go over some reserved variable names:

1. `market_ids`: unique market identifiers
2. `shares`: specifies market shares within `market_ids`
3. `prices`: endogenous prices
4. `demand_instruments0`, `demand_instruments1`, ...: excluded instruments

In [3]:
import pyblp
import numpy as np
import pandas as pd

In [4]:
# Import data
product_data = pd.read_csv(pyblp.data.NEVO_PRODUCTS_LOCATION)
product_data.head()

## TODO: How were demand instruments computed? Check: https://pyblp.readthedocs.io/en/stable/_api/pyblp.data.html#module-pyblp.data

Unnamed: 0,market_ids,city_ids,quarter,product_ids,firm_ids,brand_ids,shares,prices,sugar,mushy,...,demand_instruments10,demand_instruments11,demand_instruments12,demand_instruments13,demand_instruments14,demand_instruments15,demand_instruments16,demand_instruments17,demand_instruments18,demand_instruments19
0,C01Q1,1,1,F1B04,1,4,0.012417,0.072088,2,1,...,2.116358,-0.154708,-0.005796,0.014538,0.126244,0.067345,0.068423,0.0348,0.126346,0.035484
1,C01Q1,1,1,F1B06,1,6,0.007809,0.114178,18,1,...,-7.374091,-0.576412,0.012991,0.076143,0.029736,0.087867,0.110501,0.087784,0.049872,0.072579
2,C01Q1,1,1,F1B07,1,7,0.012995,0.132391,4,1,...,2.187872,-0.207346,0.003509,0.091781,0.163773,0.111881,0.108226,0.086439,0.122347,0.101842
3,C01Q1,1,1,F1B09,1,9,0.00577,0.130344,3,0,...,2.704576,0.040748,-0.003724,0.094732,0.135274,0.08809,0.101767,0.101777,0.110741,0.104332
4,C01Q1,1,1,F1B11,1,11,0.017934,0.154823,12,0,...,1.261242,0.034836,-0.000568,0.102451,0.13064,0.084818,0.101075,0.125169,0.133464,0.121111


In [6]:
# Set up: Need to specify the formulation of our demand model
logit_formulation = pyblp.Formulation('prices', absorb='C(product_ids)')
print(logit_formulation)

prices + Absorb[C(product_ids)]


In [9]:
# With the formulation, we can set up the problem
problem = pyblp.Problem(logit_formulation, product_data)

Initializing the problem ...
Absorbing demand-side fixed effects ...
Initialized the problem after 00:00:00.

Dimensions:
 T    N     F    K1    MD    ED 
---  ----  ---  ----  ----  ----
94   2256   5    1     20    1  

Formulations:
     Column Indices:          0   
--------------------------  ------
X1: Linear Characteristics  prices


In the output above, we have the following information:

$T$: number of markets

$N$: number of product-market observations

$F$: number of firms (not used in this example)

$K_1$: dimension of linear demand parameters

$M_D$: dimension of the instrument variables (excluded instruments + exogenous variables)

$E_D$: number of fixed effect dimensions (e.g. one-dimensional fixed effects)

In [10]:
# Let's compute it
logit_results = problem.solve()

Solving the problem ...
Updating the weighting matrix ...
Computed results after 00:00:00.

Problem Results Summary:
GMM     Objective    Clipped  Weighting Matrix
Step      Value      Shares   Condition Number
----  -------------  -------  ----------------
 1    +1.899432E+02     0      +6.927228E+07  

Estimating standard errors ...
Computed results after 00:00:00.

Problem Results Summary:
GMM     Objective    Clipped  Weighting Matrix
Step      Value      Shares   Condition Number
----  -------------  -------  ----------------
 2    +1.874555E+02     0      +5.682065E+07  

Cumulative Statistics:
Computation   Objective 
   Time      Evaluations
-----------  -----------
 00:00:00         2     

Beta Estimates (Robust SEs in Parentheses):
    prices     
---------------
 -3.004710E+01 
(+1.008589E+00)
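
To see what this price coefficient implies, recall the standard plain-logit elasticity formulas: the own-price elasticity of product $j$ is $\alpha p_{jt} (1 - s_{jt})$ and the cross-price elasticity with respect to the price of product $k$ is $-\alpha p_{kt} s_{kt}$. The sketch below plugs in the estimate above together with the prices and shares of the first two products shown in the data preview (`F1B04` and `F1B06` in market `C01Q1`). The function names are hypothetical.

```python
alpha = -30.04710  # price coefficient from the logit estimates above


def own_price_elasticity(alpha, price, share):
    """Plain-logit own-price elasticity: alpha * p_j * (1 - s_j)."""
    return alpha * price * (1.0 - share)


def cross_price_elasticity(alpha, price_k, share_k):
    """Plain-logit elasticity of any other product's share with respect to p_k."""
    return -alpha * price_k * share_k


# Prices and shares of the first two rows of the data preview
own = own_price_elasticity(alpha, price=0.072088, share=0.012417)     # F1B04
cross = cross_price_elasticity(alpha, price_k=0.114178, share_k=0.007809)  # w.r.t. F1B06
print(round(own, 3), round(cross, 4))
```

The cross-price elasticity depends only on the product whose price changes, not on how similar the two products are. This is the restrictive substitution pattern that the random coefficients model relaxes.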


In [1]:
# We can check out the estimated parameter(s).
# Run this after the cell that defines `logit_results`; in a fresh kernel it raises a NameError.
logit_results.parameters

## Random coefficients

If we want to allow tastes for different product characteristics to vary and correlate across consumers, and later to depend on demographic information, we can estimate a random coefficients model. As described in the theory section, this means that we must now include the nonlinear demand-side parameters collected in $\mu_{ijt}$.

The problem summary printed by PyBLP reports the same dimensions as in the logit model ($T$, $N$, $F$, $K_1$, $M_D$, $E_D$), plus two new ones:

$I$: total number of agents (simulated individuals) across all markets

$K_2$: dimension of nonlinear demand parameters (number of characteristics with random coefficients)


For the estimation, we proceed with the following steps:

1. Define a formulation for the (linear) $X_1$ demand model.
2. Define a formulation for the (nonlinear) $X_2$ demand model. This should only include the variables for which we want random coefficients.
3. Define an `Integration` configuration.
4. Combine the `Formulation` classes, `product_data` and the `Integration` configuration to construct a `Problem`.

In [25]:
X1_formulation = pyblp.Formulation('0 + prices', absorb='C(product_ids)')
X2_formulation = pyblp.Formulation('1 + prices + sugar + mushy')
product_formulations = (X1_formulation, X2_formulation)
product_formulations

(prices + Absorb[C(product_ids)], 1 + prices + sugar + mushy)

In [26]:
# Define integration configuration (Monte Carlo draws from a standard normal for 50 individuals per market)
mc_integration = pyblp.Integration('monte_carlo', size=50, specification_options={'seed': 0})
mc_integration

Configured to construct nodes and weights with Monte Carlo simulation with options {seed: 0}.

In [27]:
# Set up the problem we are trying to solve
mc_problem = pyblp.Problem(product_formulations, product_data, integration=mc_integration)

Initializing the problem ...
Absorbing demand-side fixed effects ...
Initialized the problem after 00:00:00.

Dimensions:
 T    N     F    I     K1    K2    MD    ED 
---  ----  ---  ----  ----  ----  ----  ----
94   2256   5   4700   1     4     20    1  

Formulations:
       Column Indices:           0       1       2      3  
-----------------------------  ------  ------  -----  -----
 X1: Linear Characteristics    prices                      
X2: Nonlinear Characteristics    1     prices  sugar  mushy


In [28]:
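# Set up optimization configuration (BFGS with gradient tolerance 1e-4)
bfgs = pyblp.Optimization('bfgs', {'gtol': 1e-4})
# Start every element of Sigma at one; the solver output lists only its lower triangle
results1 = mc_problem.solve(sigma=np.ones((4, 4)), optimization=bfgs)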

Solving the problem ...

Nonlinear Coefficient Initial Values:
Sigma:        1           prices          sugar          mushy      |  Sigma Squared:        1           prices          sugar          mushy    
------  -------------  -------------  -------------  -------------  |  --------------  -------------  -------------  -------------  -------------
  1     +1.000000E+00                                               |        1         +1.000000E+00  +1.000000E+00  +1.000000E+00  +1.000000E+00
prices  +1.000000E+00  +1.000000E+00                                |      prices      +1.000000E+00  +2.000000E+00  +2.000000E+00  +2.000000E+00
sugar   +1.000000E+00  +1.000000E+00  +1.000000E+00                 |      sugar       +1.000000E+00  +2.000000E+00  +3.000000E+00  +3.000000E+00
mushy   +1.000000E+00  +1.000000E+00  +1.000000E+00  +1.000000E+00  |      mushy       +1.000000E+00  +2.000000E+00  +3.000000E+00  +4.000000E+00
Starting optimization ...

GMM   Optimization   Objective   F

 1         13           20           643         1998         0     +1.709813E+02  +2.530586E+00  +2.145849E+01  +2.668277E-01, -3.685002E+00, +1.016783E+01, +2.170148E-02, -1.421849E-01, +3.124917E-02, -4.305572E-02, -4.198917E-01, -2.551813E-02, +4.623964E-01
 1         14           21           635         1973         0     +1.707025E+02  +2.788304E-01  +3.608216E+00  +3.570481E-01, -3.817709E+00, +1.019190E+01, +1.429127E-02, -1.406186E-01, +2.726278E-02, -8.912507E-02, -3.848715E-01, +4.248616E-02, +3.913389E-01
 1         15           22           633         1974         0     +1.706346E+02  +6.788879E-02  +4.507758E+00  +4.085872E-01, -3.930566E+00, +1.028899E+01, +1.243444E-02, -1.412215E-01, +2.663432E-02, -1.380862E-01, -4.095786E-01, +6.102722E-02, +3.764763E-01
 1         16           23           640         1997         0     +1.705610E+02  +7.362754E-02  +3.823741E+00  +5.049743E-01, -4.234224E+00, +1.049397E+01, +1.149021E-02, -1.431276E-01, +2.599731E-02, -2.486737E-

 2         0             2          1773         5411         0     +1.717821E+03                 +2.842536E+03  +1.338712E+00, -1.372131E+01, +6.754997E+00, -5.709669E-01, +6.400537E-01, +2.723030E-01, -6.598161E-01, -5.001782E-01, -2.036819E-02, +3.948961E-01
 2         0             3           500         1571         0     +1.495374E+02  +3.308679E-01  +3.296787E+01  +1.322686E+00, -1.372533E+01, +6.745247E+00, +7.578622E-02, -7.120688E-02, +3.693199E-02, -6.719974E-01, -5.042749E-01, -3.578379E-02, +3.417797E-01
 2         1             4           563         1770         0     +1.550263E+02                 +1.648920E+02  +1.344295E+00, -1.372213E+01, +6.746390E+00, +1.021161E-01, -4.349559E-02, +6.070821E-02, -6.587312E-01, -5.102611E-01, -3.452186E-02, +3.584631E-01
 2         1             5           500         1578         0     +1.495197E+02  +1.773914E-02  +2.339699E+01  +1.323813E+00, -1.372516E+01, +6.745307E+00, +7.715889E-02, -6.976219E-02, +3.817153E-02, -6.713058E-
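
The initial-values table at the top of the solver output lists `Sigma` next to `Sigma Squared`. The printed numbers are consistent with `Sigma Squared` being $\Sigma \Sigma'$, where $\Sigma$ keeps only the lower triangle of the `np.ones((4, 4))` starting matrix. The short check below reproduces that table in plain Python; the helper names are hypothetical and the interpretation is inferred from the printed output, not taken from the PyBLP source.

```python
def lower_triangle(matrix):
    """Zero out every entry above the main diagonal."""
    return [[value if col <= row else 0.0 for col, value in enumerate(values)]
            for row, values in enumerate(matrix)]


def times_transpose(a):
    """Return a @ a' for a square matrix stored as a list of rows."""
    n = len(a)
    return [[sum(a[i][k] * a[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


sigma0 = lower_triangle([[1.0] * 4 for _ in range(4)])  # starting values, lower triangle only
sigma_squared = times_transpose(sigma0)
for row in sigma_squared:
    print(row)
```

The rows match the `Sigma Squared` block of the initial-values table (1, 1, 1, 1 / 1, 2, 2, 2 / 1, 2, 3, 3 / 1, 2, 3, 4).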

In [46]:
# Let's check out the results
results1

Problem Results Summary:
GMM     Objective      Gradient         Hessian         Hessian     Clipped  Weighting Matrix  Covariance Matrix
Step      Value          Norm       Min Eigenvalue  Max Eigenvalue  Shares   Condition Number  Condition Number 
----  -------------  -------------  --------------  --------------  -------  ----------------  -----------------
 2    +1.483665E+02  +8.703745E-05  +8.523925E-02   +6.535573E+03      0      +5.150953E+07      +8.252073E+05  

Cumulative Statistics:
Computation  Optimizer  Optimization   Objective   Fixed Point  Contraction
   Time      Converged   Iterations   Evaluations  Iterations   Evaluations
-----------  ---------  ------------  -----------  -----------  -----------
 00:00:59       Yes          58           75          86976       266894   

Nonlinear Coefficient Estimates (Robust SEs in Parentheses):
Sigma:         1             prices            sugar            mushy       |  Sigma Squared:         1             prices           

In [48]:
# We can access the estimated parameters of our model (see https://pyblp.readthedocs.io/en/stable/_api/pyblp.ProblemResults.html#pyblp.ProblemResults)
results1.parameters

array([[ 1.20756626e+00],
       [-1.14476023e+01],
       [ 8.42360026e+00],
       [ 6.05755506e-02],
       [-9.13679006e-02],
       [ 3.78337369e-02],
       [-5.87911416e-01],
       [-6.21828137e-01],
       [-2.26170352e-02],
       [ 4.80004753e-01],
       [-3.13734983e+01]])
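
The array has 11 entries. A count that is consistent with this, assuming the vector stacks the free lower-triangular elements of $\Sigma$ followed by the linear coefficients, is $K_2 (K_2 + 1) / 2 + K_1$ with $K_2 = 4$ and $K_1 = 1$ from the problem summary. The helper below is a hypothetical illustration of that arithmetic, and the stacking order is an assumption to verify against the `ProblemResults` documentation.

```python
def count_stacked_parameters(k1, k2):
    """Free lower-triangular Sigma elements plus linear coefficients (assumed layout)."""
    sigma_elements = k2 * (k2 + 1) // 2  # diagonal and below
    return sigma_elements + k1


n_parameters = count_stacked_parameters(k1=1, k2=4)
print(n_parameters)
```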

## Adding demographic information

In [None]:
# Import demographic information
agent_data = pd.read_csv(pyblp.data.NEVO_AGENTS_LOCATION)
agent_data.head()

# FIXME: Check out integration nodes.

In [8]:
agent_formulation = pyblp.Formulation('0 + income + income_squared + age + child')
agent_formulation

income + income_squared + age + child

In [7]:
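# Requires `product_formulations`, `product_data`, `agent_formulation`, and `agent_data`
# from the cells above; in a fresh kernel this cell raises a NameError.
nevo_problem = pyblp.Problem(
    product_formulations,
    product_data,
    agent_formulation,
    agent_data
)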