In [1]:
%load_ext autoreload
%autoreload 2

import numpy as np
import scipy.optimize
from functools import partial


# Structural Econometrics in Labor and IO.

# Problem Set: Estimation of Static Differentiated Product Demand Models.

  
## Description

Consider a scenario where only market-level data on product demand are available: market shares, prices, and product characteristics. We wish to estimate consumer preferences, compute the resulting price elasticities, and ultimately learn the firms' implied profit margins and marginal costs. To do so, we use a model of individual utility maximization and imperfect competition in the product market. Note that the crucial assumption in Berry (1994) and virtually all subsequent work is that market shares are generated by a large number (technically, an infinite number) of consumers in each market and can therefore be used as a measure of individual choice probabilities.



#### Demand
We assume consumer $i$ chooses one unit of product $j\in J$ or an outside good $j=0$ (e.g. no purchase) to obtain utility

\begin{equation}
u_{ijt} = x_{jt}\beta_i + \alpha p_{jt} + \xi_{jt} + \varepsilon_{ijt} = \delta_{jt} + \mu_{jt}(\nu_i) + \varepsilon_{ijt}\tag{1}
\end{equation}

where $(x_{jt},p_{jt})$ are observable characteristics and price, $\xi_{jt}$ is the unobservable characteristic, $\beta_i = \beta + \sigma\nu_i$ is the individual taste vector, $\delta_{jt} = x_{jt}\beta + \alpha p_{jt} + \xi_{jt}$ is the mean utility, $\mu_{jt}(\nu_i) = x_{jt}\sigma \nu_i$ is the individual deviation from it, and $\varepsilon_{ijt}$ is i.i.d. extreme value type 1. The utility of the outside good is normalized such that $\delta_{0t}=0$. Utility maximization and the distribution of $\varepsilon_{ijt}$ yield logit choice probabilities conditional on $\nu_i$; integrating them over the distribution of $\nu_i$ gives the market shares:

\begin{equation}
s_{jt}(\delta_t,\sigma) = \int \frac{\exp(\delta_{jt} + \mu_{jt}(\nu_i))}{1+\sum_{l=1}^J \exp(\delta_{lt} + \mu_{lt}(\nu_i))} dP_{\nu}(\nu),
\tag{2}
\end{equation}

where $P_{\nu}$ is typically assumed to take a specific parametric form.
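As an illustration of equation (2), the following is a minimal sketch for a single market with one random coefficient and $\nu \sim N(0,1)$, using Gauss-Hermite quadrature for the integral. The function names `normal_quadrature` and `mixed_logit_shares` and all numbers are hypothetical; they are not the notebook's own `quadrature_hermite` or `compute_shares`, whose implementations are not shown here.

```python
import numpy as np
from scipy.special import roots_hermite


def normal_quadrature(n_points, mu=0.0, sigma=1.0):
    """Gauss-Hermite nodes and weights rescaled to integrate against N(mu, sigma^2)."""
    nodes, weights = roots_hermite(n_points)  # rule for the weight function exp(-t^2)
    draws = mu + np.sqrt(2.0) * sigma * nodes  # change of variables t -> mu + sqrt(2)*sigma*t
    return draws, weights / np.sqrt(np.pi)  # rescaled weights sum to one


def mixed_logit_shares(delta, x1, sigma_1, draws, weights):
    """Equation (2) for one market: delta and x1 have one entry per product."""
    # Rows index integration nodes, columns index products.
    utility = delta[None, :] + sigma_1 * np.outer(draws, x1)
    exp_utility = np.exp(utility)
    choice_probs = exp_utility / (1.0 + exp_utility.sum(axis=1, keepdims=True))
    # Weighted average of the conditional logit probabilities over the nodes.
    return weights @ choice_probs


draws, weights = normal_quadrature(15)
delta = np.array([0.5, -0.2, 0.1])  # made-up mean utilities
x1 = np.array([1.2, 1.5, 1.8])  # made-up characteristic
shares_rc = mixed_logit_shares(delta, x1, 1.0, draws, weights)
shares_logit = mixed_logit_shares(delta, x1, 0.0, draws, weights)
print(shares_rc, shares_logit)
```

Setting `sigma_1` to zero removes the dependence on $\nu$, so the result collapses to the plain logit formula, which gives a quick correctness check.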

#### Supply
Assume that each firm $f$ maximizes profits given by

\begin{equation}
\pi_f = \sum_{k\in F_f}(p_{kt} - c_{kt})s_{kt}L_{t},
\tag{3}
\end{equation}

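where $c_{kt}$ is marginal cost, $L_{t}$ is market size, and $F_f$ is the set of products owned by firm $f$. The system of first-order conditions (FOC) for a Nash equilibrium in prices is, for each product $j \in F_f$: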

\begin{equation}
s_{jt} + \sum_{k\in F_f}(p_{kt} - c_{kt})\frac{\partial s_{kt}}{\partial p_{jt}} = 0
\tag{4}
\end{equation}

For implementation, it is useful to write expression (4) in vector notation as $s_t+\Delta_t(p_t-c_t)=0$, where the $(j,k)$ element of $\Delta_t$ is $\partial s_{kt}/\partial p_{jt}$ if products $j$ and $k$ belong to the same firm and zero otherwise. With single-product firms, $\Delta_t$ is therefore diagonal and contains only own-price derivatives. Given prices, shares, and the share derivatives, solving the system for $c_t$ yields the implied marginal cost; conversely, if marginal cost is known, the same system implicitly defines the equilibrium prices:

\begin{equation}
p_t+\Delta_t^{-1}s_t = c_t
\tag{5}
\end{equation}

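When changes in market structure are of interest (e.g. merger analysis), it is useful to generate the full derivative matrix $\tilde{\Delta}_t$ and an equally sized matrix $O_t$ designating firms' product ownership (equal to one if two products belong to the same firm and zero otherwise), and to multiply $\tilde{\Delta}_t$ and $O_t$ element-wise to obtain $\Delta_t$. In the plain logit case ($\sigma=0$), $\partial s_{kt} / \partial p_{jt}$ is given by $\alpha s_{jt} (1-s_{jt})$ if $j=k$ and $-\alpha s_{jt} s_{kt}$ otherwise; with random coefficients, the same expressions hold for the individual choice probabilities and must be integrated over $\nu$.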
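The ownership construction and equation (5) can be sketched as follows for the plain logit case ($\sigma=0$). The helper names `logit_share_derivatives` and `implied_marginal_cost` and the example numbers are hypothetical and serve only as an illustration.

```python
import numpy as np


def logit_share_derivatives(shares, alpha):
    """Full matrix with entry (j, k) equal to d s_k / d p_j in the plain logit model."""
    full = -alpha * np.outer(shares, shares)  # cross-price terms
    full[np.diag_indices_from(full)] = alpha * shares * (1.0 - shares)  # own-price terms
    return full


def implied_marginal_cost(prices, shares, alpha, ownership):
    """Equation (5): c = p + Delta^{-1} s, with Delta the ownership-masked derivatives."""
    delta_matrix = ownership * logit_share_derivatives(shares, alpha)
    return prices + np.linalg.solve(delta_matrix, shares)


shares = np.array([0.2, 0.1, 0.3])  # made-up shares in one market
prices = np.array([3.0, 2.5, 3.5])  # made-up prices
alpha = -2.0

own_single = np.eye(3)  # single-product firms
own_merged = np.array(  # products 0 and 1 jointly owned
    [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
)

mc_single = implied_marginal_cost(prices, shares, alpha, own_single)
mc_merged = implied_marginal_cost(prices, shares, alpha, own_merged)
print(mc_single, mc_merged)
```

For the same observed prices, joint ownership implies larger markups on the jointly owned products and hence lower implied marginal costs, while the third product is unaffected.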

## Problems


### 1. Simulation

Simulate data based on the described model assuming the following

- $J=10$ products are sold in $T=250$ markets (size $L_t=1$) by single-product firms.
- Two observable product characteristics $x_{jt}=(1,x_{jt}^1)$, with $x_{jt}^1 \sim U(1,2)$.
- Marginal cost $c_{jt}=x_{jt}\gamma_1 + w_{jt}\gamma_2 + \omega_{jt}$.
- Three observable cost shifters $w_{jt}=(w_{jt}^1,w_{jt}^2,w_{jt}^3)$, all i.i.d. $U(0,1)$.
- Marginal cost parameters: $\gamma_1=(0.7,0.7)$ and $\gamma_2=(1,1,1)$.
- Unobserved demand and cost characteristic $(\xi_{jt},\omega_{jt}) \sim N(0,\sigma_c)$ 
    with $\sigma_c = \left[ \begin{array}{cc} 1 & 0.7 \\ 0.7 & 1 \\ \end{array}\right]$.  
- Preference parameters $\beta=(2,2), \alpha=-2, \sigma=(0,1)$.
- $\nu \sim N(0,1)$.
-  $L_t=1$ in each market.

To simulate the data, you must code functions computing


- a $T\times J$ matrix containing market shares using equation (2). This requires numerical integration, which can be done for example in the following ways:


    - MC integration: Draw many values (at least 100, although the required number depends on the function) from the distribution over which you want to integrate, evaluate the function at each draw, and take the mean over all integration points.


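    - Quadrature: You can think of this as a smart way to choose the values from the distribution, which drastically reduces the number of integration points. The points are not equally weighted (as in the mean) but instead carry particular weights. You can find the mathematical details here: https://en.wikipedia.org/wiki/Gauss%E2%80%93Hermite_quadrature and the appropriate scipy function is *scipy.special.roots_hermite*. Note that this function returns nodes and weights for the weight function $e^{-t^2}$, so integrating against $N(0,1)$ requires scaling the nodes by $\sqrt{2}$ and dividing the weights by $\sqrt{\pi}$.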
    
  
- a $T\times (J \times J)$ matrix containing market share derivatives with respect to price (own-price on the diagonal and cross-price off the diagonal),
- a $T\times J$ matrix containing (Bertrand-Nash) equilibrium prices, using equation (5) and scipy's root-finding function *scipy.optimize.fsolve* or *scipy.optimize.root*.
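As a rough illustration of the price-solving step, the sketch below finds Bertrand-Nash prices in a single market with *scipy.optimize.root*. It assumes the plain logit model ($\sigma=0$) and single-product firms to stay short; the function names and all numbers are hypothetical and do not reproduce the notebook's `compute_bertrand_prices`.

```python
import numpy as np
from scipy.optimize import root


def logit_shares(prices, base_utility, alpha):
    """Plain logit shares in one market; base_utility stands for x*beta + xi."""
    exp_utility = np.exp(base_utility + alpha * prices)
    return exp_utility / (1.0 + exp_utility.sum())


def pricing_residual(prices, marginal_cost, base_utility, alpha):
    """Equation (5) written as p + Delta^{-1} s - c, with Delta diagonal."""
    shares = logit_shares(prices, base_utility, alpha)
    own_derivative = alpha * shares * (1.0 - shares)
    return prices + shares / own_derivative - marginal_cost


marginal_cost = np.array([2.0, 2.5, 3.0])  # made-up marginal costs
base_utility = np.array([5.0, 5.5, 6.0])  # made-up x*beta + xi
alpha = -2.0

# Start the search slightly above marginal cost.
solution = root(
    pricing_residual,
    x0=marginal_cost + 0.5,
    args=(marginal_cost, base_utility, alpha),
)
eq_prices = solution.x
eq_shares = logit_shares(eq_prices, base_utility, alpha)
print(solution.success, eq_prices, eq_shares)
```

With random coefficients, the same residual function applies once the shares and the derivative matrix are replaced by their integrated counterparts.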

In [2]:
# Import custom functions from python files:
from generate_data import simulate_data
from integration import quadrature_hermite
from shares_and_derivatives import compute_bertrand_prices, compute_shares, compute_mean_utility

In [3]:
seed = 100

In [4]:
params = {
    "gamma_1": [0.7, 0.7],
    "gamma_2": [1, 1, 1],
    "beta": [2, 2],
    "alpha": -2,
    "sigma_c": np.array([[1, 0.7], [0.7, 1]]),
    "sigma": [0, 1],
}

# number of markets, T
num_markets = 250
# number of brands per market, J
num_products = 10

quad_draws, quad_weights = quadrature_hermite(n_quad_points=15, mu=0, sigma=1)

In [5]:
df = simulate_data(
    params=params, num_products=num_products, num_markets=num_markets, seed=seed
)
df

market,product,obs_char_0,obs_char_1,obs_cost_shifter_1,obs_cost_shifter_2,obs_cost_shifter_3,xi,omega,marginal_costs
0,0,1,1.543405,0.249526,0.439082,0.940456,-1.092337,-0.577983,2.831465
0,1,1,1.278369,0.267269,0.713405,0.164054,0.561220,-0.261381,2.478206
0,2,1,1.424518,0.621049,0.785709,0.495000,0.988689,1.020035,4.618956
0,3,1,1.844776,0.150104,0.780814,0.246959,2.028907,2.356624,5.525845
0,4,1,1.004719,0.391014,0.516103,0.681441,0.483138,-0.448913,2.542949
...,...,...,...,...,...,...,...,...,...
249,5,1,1.200090,0.237092,0.212944,0.609612,-1.948449,-1.706798,0.892914
249,6,1,1.919083,0.768674,0.085858,0.295009,0.267682,-0.254561,2.938338
249,7,1,1.478302,0.650803,0.993083,0.360738,0.466113,0.551296,4.290731
249,8,1,1.635912,0.861580,0.559942,0.033316,-0.013543,-1.457153,1.842822


In [6]:
df["bertrand_prices"] = compute_bertrand_prices(
    df=df, params=params, quad_draws=quad_draws, quad_weights=quad_weights
)

In [7]:
# First calculate the mean utility
mean_utility = compute_mean_utility(
    params=params,
    x_0=df["obs_char_0"].to_numpy(),
    x_1=df["obs_char_1"].to_numpy(),
    price=df["bertrand_prices"].to_numpy(),
    xi=df["xi"].to_numpy(),
)

# Then the implied market shares
df["market_shares"] = compute_shares(
    params=params,
    x_0=df["obs_char_0"].to_numpy(),
    x_1=df["obs_char_1"].to_numpy(),
    delta=mean_utility,
    num_products=num_products,
    num_markets=num_markets,
    quad_draws=quad_draws,
    quad_weights=quad_weights,
).reshape(num_products * num_markets)
df

market,product,obs_char_0,obs_char_1,obs_cost_shifter_1,obs_cost_shifter_2,obs_cost_shifter_3,xi,omega,marginal_costs,bertrand_prices,market_shares
0,0,1,1.543405,0.249526,0.439082,0.940456,-1.092337,-0.577983,2.831465,3.346933,0.030008
0,1,1,1.278369,0.267269,0.713405,0.164054,0.561220,-0.261381,2.478206,3.061323,0.142540
0,2,1,1.424518,0.621049,0.785709,0.495000,0.988689,1.020035,4.618956,5.121515,0.005092
0,3,1,1.844776,0.150104,0.780814,0.246959,2.028907,2.356624,5.525845,6.029497,0.007251
0,4,1,1.004719,0.391014,0.516103,0.681441,0.483138,-0.448913,2.542949,3.079195,0.067592
...,...,...,...,...,...,...,...,...,...,...,...
249,5,1,1.200090,0.237092,0.212944,0.609612,-1.948449,-1.706798,0.892914,1.459464,0.117466
249,6,1,1.919083,0.768674,0.085858,0.295009,0.267682,-0.254561,2.938338,3.492021,0.096957
249,7,1,1.478302,0.650803,0.993083,0.360738,0.466113,0.551296,4.290731,4.792255,0.003038
249,8,1,1.635912,0.861580,0.559942,0.033316,-0.013543,-1.457153,1.842822,2.515484,0.256684


### 2. Estimation

With the simulated data at hand, forget the parameters specified in Problem 1 and

- assume that $\sigma = (0,0)$. Estimate $\{\alpha,\beta\}$ using OLS and report the results.
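With $\sigma=(0,0)$ the integral in (2) drops out. A short derivation, using the normalization $\delta_{0t}=0$, shows why a linear regression applies:

\begin{equation}
s_{jt} = \frac{\exp(\delta_{jt})}{1+\sum_{l=1}^J \exp(\delta_{lt})}, \qquad
s_{0t} = \frac{1}{1+\sum_{l=1}^J \exp(\delta_{lt})}
\quad\Longrightarrow\quad
\ln s_{jt} - \ln s_{0t} = \delta_{jt} = x_{jt}\beta + \alpha p_{jt} + \xi_{jt}.
\end{equation}

The dependent variable is therefore the log share of product $j$ minus the log share of the outside good, and $\xi_{jt}$ plays the role of the regression error. OLS is consistent only if price is uncorrelated with $\xi_{jt}$, which does not hold in the simulated data because equilibrium prices depend on $\xi_{jt}$ and $\omega_{jt}$ is correlated with it.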

In [8]:
# Import estimation functions:
from estimation_formulas import ols_formula, two_sls_formula

In [9]:
log_share_outside_good = np.log(1 - df.groupby("market")["market_shares"].transform("sum"))

In [10]:
y = np.log(df["market_shares"].to_numpy()) - log_share_outside_good
ols_formula(
    y=y,
    x=df[["obs_char_0", "obs_char_1", "bertrand_prices"]].to_numpy(),
)

(array([-0.52382639,  1.90183727, -1.3516473 ]),
 array([0.08951068, 0.05433568, 0.01448492]))

- still assume that $\sigma = (0,0)$. Estimate $\{\alpha,\beta\}$ using linear IV and report the results for two alternative sets of instrumental variables: 

    a) $x_{jt}$, the squared term $(x_{jt}^1)^2$, and BLP instruments (sums of other firms' $x_{jt}$)

In [11]:
# GMM estimation: BLP instruments
df["blp_instruments"] = (
    df.groupby("market")["obs_char_1"].transform("sum") - df["obs_char_1"]
)
df["obs_char_1_squared"] = df["obs_char_1"] ** 2

In [12]:
two_sls_formula(
    y=np.log(df["market_shares"].to_numpy()) - log_share_outside_good,
    x=df[["obs_char_0", "obs_char_1", "bertrand_prices"]].to_numpy(),
    z=df[["obs_char_0", "obs_char_1", "blp_instruments", "obs_char_1_squared"]].to_numpy(),
)

(array([ 27.7389772 ,   9.6820547 , -11.94995035]),
 array([251.09974805,  69.12668453,  94.15915088]))

- still assume that $\sigma = (0,0)$. Estimate $\{\alpha,\beta\}$ using linear IV and report the results for two alternative sets of instrumental variables:     

    b) all variables in a) plus cost shifters ($w_{jt}$).

In [13]:
two_sls_formula(
    y=np.log(df["market_shares"].to_numpy()) - log_share_outside_good,
    x=df[["obs_char_0", "obs_char_1", "bertrand_prices"]].to_numpy(),
    z=df[
        [
            "obs_char_0",
            "obs_char_1",
            "blp_instruments",
            "obs_char_1_squared",
            "obs_cost_shifter_1",
            "obs_cost_shifter_2",
            "obs_cost_shifter_3",
        ]
    ].to_numpy(),
)

(array([ 1.03100966,  2.32985434, -1.93469716]),
 array([0.15486035, 0.07539416, 0.04313364]))
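To illustrate why the OLS and IV estimates of the price coefficient differ, here is a small self-contained sketch of the closed-form estimators on synthetic data. The functions `ols` and `two_sls` are hypothetical stand-ins that return point estimates only; they are not the notebook's `ols_formula` and `two_sls_formula`, and the data-generating process is made up for the example.

```python
import numpy as np


def ols(y, x):
    """OLS coefficients (X'X)^{-1} X'y."""
    return np.linalg.solve(x.T @ x, x.T @ y)


def two_sls(y, x, z):
    """2SLS: project x on the instruments z, then use the fitted values."""
    x_hat = z @ np.linalg.solve(z.T @ z, z.T @ x)
    return np.linalg.solve(x_hat.T @ x, x_hat.T @ y)


rng = np.random.default_rng(0)
n = 5000
cost_shifter = rng.uniform(size=n)  # excluded instrument
xi = rng.normal(size=n)  # unobserved quality
# Price rises with the cost shifter and with unobserved quality.
price = 1.0 + 2.0 * cost_shifter + 0.5 * xi + 0.1 * rng.normal(size=n)
y = 1.0 - 2.0 * price + xi  # true price coefficient is -2

x = np.column_stack([np.ones(n), price])
z = np.column_stack([np.ones(n), cost_shifter])

beta_ols = ols(y, x)
beta_iv = two_sls(y, x, z)
print(beta_ols, beta_iv)
```

Because price is positively correlated with the error, OLS is biased toward zero, while the cost shifter moves price without entering the error and lets 2SLS recover a coefficient close to the true value. The same pattern appears above: the estimates move much closer to the simulated $\alpha=-2$ once the cost shifters are added as instruments.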

### 3. Contraction mapping

Code the BLP contraction mapping to obtain $\delta$ for a given set of parameters (requires observed market shares and the market share function as input): $\delta_t=s_t^{-1}(s_t,\sigma) \equiv \delta_t(s_t,\sigma)$. The algorithm should iterate as follows: $\delta^{h+1} = \delta^h + \ln (s) - \ln (s(\delta^h,\sigma))$.

In [14]:
def contraction_mapping(
    obs_market_shares,
    params,
    x_0,
    x_1,
    num_products,
    num_markets,
    quad_draws,
    quad_weights,
    threshold=1e-6,
):
    """The BLP contraction mapping.

    Args:
        obs_market_shares (numpy.ndarray): 1d array of shape (num_markets * num_products,)
            containing the observed market shares for all products in all markets.
        params (dict): Model parameters.
        x_0 (numpy.ndarray): 1d array of shape (num_markets * num_products,) containing
            the first observable characteristic (constant) for all products in all
            markets.
        x_1 (numpy.ndarray): 1d array of shape (num_markets * num_products,) containing
           the second observable characteristic for all products in all markets.
        num_products (int): Number of products.
        num_markets (int): Number of markets.
        quad_draws (numpy.ndarray): Quadrature nodes.
        quad_weights (numpy.ndarray): Quadrature weights.
        threshold (float): Convergence threshold.

    Returns:
        numpy.ndarray: 1d array of shape (num_markets * num_products,) containing the
        converged delta.
    """
    delta_current = np.ones(len(x_0))
    delta_new = delta_current + 1

    while np.abs(delta_new - delta_current).max() > threshold:
        delta_current = delta_new.copy()

        market_share_new = compute_shares(
            params=params,
            x_0=x_0,
            x_1=x_1,
            delta=delta_current,
            num_products=num_products,
            num_markets=num_markets,
            quad_draws=quad_draws,
            quad_weights=quad_weights,
        ).reshape(num_markets * num_products)

        # Calculate new delta
        delta_new = delta_current + np.log(obs_market_shares) - np.log(market_share_new)

    return delta_new

In [15]:
# Calculate delta as the fixed point of the contraction mapping
delta_fixp = contraction_mapping(
    obs_market_shares=df["market_shares"].to_numpy(),
    params=params,
    x_0=df["obs_char_0"].to_numpy(),
    x_1=df["obs_char_1"].to_numpy(),
    num_products=num_products,
    num_markets=num_markets,
    quad_draws=quad_draws,
    quad_weights=quad_weights,
    threshold=1e-12,
)
# Check against the mean utility computed earlier.
delta_fixp - mean_utility

array([ 4.44089210e-16,  4.44089210e-16,  0.00000000e+00, ...,
        0.00000000e+00,  1.11022302e-16, -2.22044605e-16])
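A further sanity check of the iteration is possible in the plain logit case, where the inversion has the closed form $\delta_{jt} = \ln s_{jt} - \ln s_{0t}$. The sketch below is a hypothetical, self-contained version for a single market; it adds an iteration cap, which the function above does not have.

```python
import numpy as np


def logit_shares(delta):
    """Plain logit shares for one market with the outside good normalized to zero."""
    exp_delta = np.exp(delta)
    return exp_delta / (1.0 + exp_delta.sum())


def contraction_logit(obs_shares, threshold=1e-12, max_iter=1000):
    """Iterate delta <- delta + ln(s) - ln(s(delta)) until the update is tiny."""
    delta = np.zeros_like(obs_shares)
    for _ in range(max_iter):
        delta_next = delta + np.log(obs_shares) - np.log(logit_shares(delta))
        if np.abs(delta_next - delta).max() < threshold:
            return delta_next
        delta = delta_next
    raise RuntimeError("contraction mapping did not converge")


true_delta = np.array([0.3, -0.5, 1.0])  # made-up mean utilities
obs_shares = logit_shares(true_delta)

delta_hat = contraction_logit(obs_shares)
closed_form = np.log(obs_shares) - np.log(1.0 - obs_shares.sum())
print(delta_hat - true_delta, delta_hat - closed_form)
```

The iteration converges more slowly when the outside-good share is small, which is one reason to keep an explicit cap on the number of iterations.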

### 4. Estimation with contraction mapping

Build the function that nests the $\delta$ contraction mapping and the GMM objective function. Then, find the parameter values that minimize this function, for example using *scipy.optimize.minimize* (the counterpart of Matlab's fminunc or fminsearch routines).
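In the implementation that follows, only $\sigma$ is searched over numerically; the linear parameters $\theta=(\beta,\alpha)$ are concentrated out by linear IV for each candidate $\sigma$. Writing $X$ for the matrix of $(x_{jt},p_{jt})$, $Z$ for the instrument matrix, and $\delta(\sigma)$ for the output of the contraction mapping, the objective is:

\begin{equation}
\hat{\theta}(\sigma) = (X'ZWZ'X)^{-1}X'ZWZ'\delta(\sigma), \qquad
\xi(\sigma) = \delta(\sigma) - X\hat{\theta}(\sigma), \qquad
Q(\sigma) = \xi(\sigma)'ZWZ'\xi(\sigma), \qquad
W = (Z'Z)^{-1}.
\end{equation}

The normalization by the mean of $Z'Z$ in the code cancels algebraically, so the weighting matrix equals $(Z'Z)^{-1}$; the rescaling only affects numerical conditioning.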

In [16]:
def estimate_gmm_contraction(
    param_vector,
    obs_market_shares,
    x,
    z,
    num_products,
    num_markets,
    quad_draws,
    quad_weights,
    threshold=1e-12,
):
    """Estimate sigma with the contraction mapping.

    Args:
        param_vector (numpy.ndarray): 1d array of shape (1,) containing the nonlinear
            parameter to be estimated (the random coefficient sigma on x_1).
        obs_market_shares (numpy.ndarray): 1d array of shape (num_markets * num_products,)
            containing the observed market shares for all products in all markets.
        x (numpy.ndarray): 2d array of shape (num_markets * num_products, num_regressors)
            containing the regressors of the mean utility for all products in all
            markets. The first two columns must be the first and second observable
            characteristics; further columns (e.g. price) enter linearly.
        z (numpy.ndarray): 2d array of shape
            (num_markets * num_products, num_non_instrumented_obs + num_instruments)
            containing the non instrumented observables and instruments for all products
            in all markets.
        num_products (int): Number of products.
        num_markets (int): Number of markets.
        quad_draws (numpy.ndarray): Quadrature nodes.
        quad_weights (numpy.ndarray): Quadrature weights.
        threshold (float): Convergence threshold.

    Returns:
        float: The GMM objective function value.
    """
    # Fill the dictionary of parameters
    params = {
        "sigma": np.array([0, param_vector[0]]),
    }
    delta_contract = contraction_mapping(
        obs_market_shares=obs_market_shares,
        params=params,
        x_0=x[:, 0],
        x_1=x[:, 1],
        num_products=num_products,
        num_markets=num_markets,
        quad_draws=quad_draws,
        quad_weights=quad_weights,
        threshold=threshold,
    )

    # Create weighting matrix
    norm = np.mean(np.mean(z.T @ z))
    weight_matrix = np.linalg.inv((z.T @ z) / norm) / norm

    # Estimate coefficients
    projection_matrix = z @ weight_matrix @ z.T
    coeffs = (
        np.linalg.inv(x.T @ projection_matrix @ x)
        @ x.T
        @ projection_matrix
        @ delta_contract
    )
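    # Side effect: store the linear coefficients of the latest evaluation on disk so
    # that they can be read back after the optimizer has finished.
    np.savetxt("coeffs_gmm.txt", coeffs)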

    # Calculate residuals
    residuals = delta_contract - x @ coeffs

    # Calculate criterion function values
    func_value = residuals.T @ z @ weight_matrix @ z.T @ residuals
    return func_value

In [17]:
# Fix all arguments except param_vector
partial_gmm = partial(
    estimate_gmm_contraction,
    obs_market_shares=df["market_shares"].to_numpy(),
    x=df[["obs_char_0", "obs_char_1", "bertrand_prices"]].to_numpy(),
    z=df[
        [
            "obs_char_0",
            "obs_char_1",
            "obs_char_1_squared",
            "blp_instruments",
            "obs_cost_shifter_1",
            "obs_cost_shifter_2",
            "obs_cost_shifter_3",
        ]
    ].to_numpy(),
    num_products=num_products,
    num_markets=num_markets,
    quad_draws=quad_draws,
    quad_weights=quad_weights,
    threshold=1e-12,
)

In [18]:
# Minimize the function
result = scipy.optimize.minimize(
    fun=partial_gmm, x0=np.array([0.5]), method="L-BFGS-B"
)
result

  message: CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL
  success: True
   status: 0
      fun: 3.5043515914215893
        x: [ 9.176e-01]
      nit: 5
      jac: [ 7.505e-06]
     nfev: 14
     njev: 7
 hess_inv: <1x1 LbfgsInvHessProduct with dtype=float64>

In [19]:
# Load saved coefficients
print(np.loadtxt("coeffs_gmm.txt"), result.x)

[ 1.74856468  2.09012939 -1.98274491] [0.91758058]
