Merge branch 'custom-ownership'

jeffgortmaker · May 2, 2018 · 72cdd41
2 parents c39664c + 3e3e12c
commit 72cdd41
Showing 12 changed files with 250 additions and 77 deletions.
diff --git a/docs/api.rst b/docs/api.rst
@@ -36,6 +36,7 @@ There are also a number of convenience functions that can be used to construct c
 
    build_id_data
    build_indicators
+   build_ownership
    build_blp_instruments
 
 

diff --git a/docs/background.rst b/docs/background.rst
@@ -7,7 +7,7 @@ The following sections provide a brief overview of the BLP model and how it is e
 The Model
 ---------
 
-At a high level, there are :math:`t = 1, 2, \dotsc, T` markets, each with :math:`j = 1, 2, \dotsc, J_t` products produced by :math:`f = 1, 2, \dotsc, F_t` firms. There are :math:`i = 1, 2, \dotsc, I_t` agents who choose among the :math:`J_t` products and an outside good, denoted by :math:`j = 0`. 
+At a high level, there are :math:`t = 1, 2, \dotsc, T` markets, each with :math:`j = 1, 2, \dotsc, J_t` products produced by :math:`f = 1, 2, \dotsc, F_t` firms. There are :math:`i = 1, 2, \dotsc, I_t` agents who choose among the :math:`J_t` products and an outside good, denoted by :math:`j = 0`. The set :math:`\mathscr{J}_{ft} \subset \{1, 2, \dotsc, J_t\}` denotes the products produced by firm :math:`f` in market :math:`t`.
 
 
 Demand-Side
@@ -65,7 +65,9 @@ Called the BLP-markup equation in :ref:`Morrow and Skerlos (2011) <ms11>`, the m
 
 .. math:: \eta = -\left(O \odot \frac{\partial s}{\partial p}\right)^{-1}s,
 
-in which :math:`O_{jk}` is :math:`1` if the same firm produces products :math:`j` and :math:`k`, and is :math:`0` otherwise. The Jacobian is
+in which the market's ownership matrix, :math:`O`, is defined in terms of its corresponding cooperation matrix, :math:`\kappa`, by :math:`O_{jk} = \kappa_{fg}` where :math:`j \in \mathscr{J}_{ft}`, the set of products produced by firm :math:`f` in the market, and similarly, :math:`k \in \mathscr{J}_{gt}`. Usually, :math:`\kappa = I`, the identity matrix, so :math:`O_{jk}` is simply :math:`1` if the same firm produces products :math:`j` and :math:`k`, and is :math:`0` otherwise.
+
+The Jacobian in the BLP-markup equation is
 
 .. math:: \frac{\partial s}{\partial p} = \Lambda - \Gamma,
 

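As a rough illustration of the definition :math:`O_{jk} = \kappa_{fg}` in the hunk above, the sketch below builds one market's ownership matrix with plain NumPy. The firm IDs and the values in ``kappa`` are made up for the example, and the firm IDs are assumed to double as row and column indices of ``kappa``; none of this is taken from pyblp itself.

```python
import numpy as np

# Hypothetical single market: firm IDs (used as indices into kappa) for five products.
firm_ids = np.array([0, 0, 1, 2, 2])

# Hypothetical cooperation matrix: firms 0 and 1 partially internalize each other's profits.
kappa = np.array([
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# O[j, k] = kappa[f, g], where firm f produces product j and firm g produces product k.
ownership = kappa[np.ix_(firm_ids, firm_ids)]

# With kappa = I, the same indexing gives the standard 0/1 ownership matrix.
standard = np.eye(3)[np.ix_(firm_ids, firm_ids)]
```

Here ``ownership`` is :math:`5 \times 5`, and the entries linking products of firms 0 and 1 equal ``0.5`` rather than ``0``.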
diff --git a/docs/examples.rst b/docs/examples.rst
@@ -377,7 +377,7 @@ Since we included two columns of firm IDs in both problems, we can use :meth:`Re
    blp_changed_prices = blp_results.solve_merger(blp_costs)
    nevo_changed_prices = nevo_results.solve_merger(nevo_costs)
 
-If the problems were configured with more than two columns of firm IDs, we could estimate post-merger prices for the other mergers with the `firm_ids_index` argument, which is by default ``1``.
+If the problems were configured with more than two columns of firm IDs, we could estimate post-merger prices for the other mergers with the `firms_index` argument, which is by default ``1``.
 
 We'll compute post-merger shares with :meth:`Results.compute_shares`.
 
@@ -390,8 +390,8 @@ Post-merger prices and shares are used to compute other post-merger outputs. For
 
 .. ipython:: python
 
-   blp_changed_hhi = blp_results.compute_hhi(blp_changed_shares, firm_ids_index=1)
-   nevo_changed_hhi = nevo_results.compute_hhi(nevo_changed_shares, firm_ids_index=1)
+   blp_changed_hhi = blp_results.compute_hhi(blp_changed_shares, firms_index=1)
+   nevo_changed_hhi = nevo_results.compute_hhi(nevo_changed_shares, firms_index=1)
    bins = np.linspace(0, 3000, 50)
    plt.hist(blp_changed_hhi - blp_hhi, bins, alpha=0.5, color='maroon');
    plt.hist(nevo_changed_hhi - nevo_hhi, bins, alpha=0.5, color='navy');

diff --git a/docs/notation.rst b/docs/notation.rst
@@ -4,25 +4,26 @@ Notation
 The notation in pyblp is a customized amalgamation of the notation employed by :ref:`Berry, Levinsohn, and Pakes (1995) <blp95>`, :ref:`Nevo (2000) <n00>`, and :ref:`Morrow and Skerlos (2011) <ms11>`.
 
 
-Dimensions
-----------
+Dimensions and Sets
+-------------------
 
-===========  ==================================
-Symbol       Description
-===========  ==================================
-:math:`N`    Products across all markets.
-:math:`T`    Markets.
-:math:`J_t`  Products in market :math:`t`.
-:math:`F_t`  Firms in market :math:`t`.
-:math:`I_t`  Agents in market :math:`t`.
-:math:`K_1`  Linear product characteristics.
-:math:`K_2`  Nonlinear product characteristics.
-:math:`K_3`  Cost product characteristics.
-:math:`D`    Demographic variables.
-:math:`M_D`  Demand-side instruments.
-:math:`M_S`  Supply-side instruments.
-:math:`P`    Unknown nonlinear parameters.
-===========  ==================================
+========================  =======================================================
+Symbol                    Description
+========================  =======================================================
+:math:`N`                 Products across all markets.
+:math:`T`                 Markets.
+:math:`J_t`               Products in market :math:`t`.
+:math:`F_t`               Firms in market :math:`t`.
+:math:`I_t`               Agents in market :math:`t`.
+:math:`K_1`               Linear product characteristics.
+:math:`K_2`               Nonlinear product characteristics.
+:math:`K_3`               Cost product characteristics.
+:math:`D`                 Demographic variables.
+:math:`M_D`               Demand-side instruments.
+:math:`M_S`               Supply-side instruments.
+:math:`P`                 Unknown nonlinear parameters.
+:math:`\mathscr{J}_{ft}`  Products produced by firm :math:`f` in market :math:`t`.
+========================  =======================================================
 
 
 Matrices, Vectors, and Scalars
@@ -50,6 +51,8 @@ Symbol                     Dimensions                    Description
 :math:`\zeta^*`            :math:`N \times 1`            Post-merger markup term from the :math:`\zeta`-markup equation.
 :math:`O`                  :math:`J_t \times J_t`        Ownership matrix in market :math:`t`.
 :math:`O^*`                :math:`J_t \times J_t`        Post-merger ownership matrix in market :math:`t`.
+:math:`\kappa`             :math:`F_t \times F_t`        Cooperation matrix in market :math:`t`.
+:math:`\kappa^*`           :math:`F_t \times F_t`        Post-merger cooperation matrix in market :math:`t`.
 :math:`\Lambda`            :math:`J_t \times J_t`        Diagonal matrix used to decompose :math:`\eta` and :math:`\zeta` in market :math:`t`.
 :math:`\Gamma`             :math:`J_t \times J_t`        Another matrix used to decompose :math:`\eta` and :math:`\zeta` in market :math:`t`.
 :math:`d`                  :math:`I_t \times D`          Observed agent characteristics called demographics in market :math:`t`.

diff --git a/pyblp/__init__.py b/pyblp/__init__.py
@@ -1,6 +1,6 @@
 """Loads public-facing objects into the top-level namespace."""
 
-from .construction import build_id_data, build_indicators, build_blp_instruments
+from .construction import build_id_data, build_indicators, build_ownership, build_blp_instruments
 from .utilities import Iteration, Integration, Optimization
 from . import data, options, primitives
 from .simulation import Simulation

diff --git a/pyblp/construction.py b/pyblp/construction.py
@@ -37,8 +37,6 @@ def build_id_data(T, J, F, mergers=()):
 
     Example
     -------
-
-
     The following code builds a small panel of market and firm IDs with an extra column of firm IDs that represents a
     simple acquisition:
 
@@ -47,7 +45,7 @@ def build_id_data(T, J, F, mergers=()):
        @suppress
        np.set_printoptions(linewidth=1)
 
-       id_data = pyblp.build_id_data(T=2, J=4, F=3, mergers=[{0: 1}])
+       id_data = pyblp.build_id_data(T=2, J=5, F=4, mergers=[{2: 0}])
        id_data
 
     """
@@ -111,6 +109,118 @@ def build_indicators(ids):
     return np.hstack([np.where(np.c_[ids] == i, 1, 0) for i in np.unique(ids)]).astype(options.dtype)
 
 
+def build_ownership(id_data, kappa_specification=None):
+    r"""Build ownership matrices, :math:`O`.
+
+    Ownership matrices are defined by their cooperation matrix counterparts, :math:`\kappa`. For each market :math:`t`,
+    :math:`O_{jk} = \kappa_{fg}` where :math:`j \in \mathscr{J}_{ft}`, the set of products produced by firm :math:`f` in
+    the market, and similarly, :math:`k \in \mathscr{J}_{gt}`.
+
+    Parameters
+    ----------
+    id_data : `structured array-like`
+        IDs that associate products with markets and firms. Each row corresponds to a product. Fields:
+
+            - **market_ids** : (`object`) - IDs that associate products with markets.
+
+            - **firm_ids** : (`object`) - IDs that associate products with firms. Each column will be used to construct
+              one stack of ownership matrices. If there are multiple columns, this field can either be a matrix or it
+              can be broken up into multiple one-dimensional fields with column index suffixes that start at zero. For
+              example, if there are two columns, this field can be replaced with two one-dimensional fields: `firm_ids0`
+              and `firm_ids1`.
+
+    kappa_specification : `callable, optional`
+        A function that specifies each market's cooperation matrix, :math:`\kappa`. The function is of the following
+        form::
+
+            kappa(f, g) -> value
+
+        where `value` is :math:`O_{jk}` and both `f` and `g` are firm IDs from the `firm_ids` field of `id_data`.
+
+        The default specification, ``lambda f, g: int(f == g)``, constructs traditional ownership matrices. That is,
+        :math:`\kappa = I`, the identity matrix, implies that :math:`O_{jk}` is :math:`1` if the same firm produces
+        products :math:`j` and :math:`k`, and is :math:`0` otherwise.
+
+        If `firm_ids` happen to be indices for an actual :math:`\kappa` matrix, ``lambda f, g: kappa[f, g]`` will
+        build ownership matrices according to the matrix ``kappa``.
+
+    Returns
+    -------
+    `ndarray`
+        Stacked :math:`J_t \times J_t` ownership matrices, :math:`O`, for each market :math:`t`. Each stack is
+        associated with a `firm_ids` column. If a market has fewer products than others, extra columns will contain
+        ``numpy.nan``.
+
+    Example
+    -------
+    The following code uses IDs created by the example for :func:`build_id_data` to build two stacks of standard
+    ownership matrices.
+
+    .. ipython:: python
+
+       @suppress
+       np.set_printoptions(threshold=100)
+
+       id_data = pyblp.build_id_data(T=2, J=5, F=4, mergers=[{2: 0}])
+       ownership = pyblp.build_ownership(id_data)
+       ownership
+
+    We'll now modify the default :math:`\kappa` specification so that the elements associated with firm IDs ``0``
+    and ``1`` are equal to ``0.5``.
+
+    .. ipython:: python
+
+       def kappa_specification(f, g):
+           if f == g:
+               return 1
+           return 0.5 if f < 2 and g < 2 else 0
+
+    The following code uses this specification to build two more stacks of non-standard ownership matrices.
+
+    .. ipython:: python
+
+       @suppress
+       np.set_printoptions(threshold=100)
+
+       ownership = pyblp.build_ownership(id_data, kappa_specification)
+       ownership
+
+    """
+
+    # extract and validate IDs
+    market_ids = extract_matrix(id_data, 'market_ids')
+    firm_ids = extract_matrix(id_data, 'firm_ids')
+    if market_ids is None:
+        raise KeyError("id_data must have a market_ids field.")
+    if firm_ids is None:
+        raise KeyError("id_data must have a firm_ids field.")
+    if market_ids.shape[1] > 1:
+        raise ValueError("The market_ids field of id_data must be one-dimensional.")
+
+    # validate or use the default kappa specification
+    if kappa_specification is None:
+        kappa_specification = lambda f, g: int(f == g)
+    elif not callable(kappa_specification):
+        raise ValueError("kappa_specification must be None or callable.")
+    kappa_specification = np.vectorize(kappa_specification, [options.dtype])
+
+    # determine the maximum number of products in a market
+    J = np.unique(market_ids, return_counts=True)[1].max()
+
+    # stack ownership matrices vertically for each market and horizontally for each set of firm IDs
+    stacks = []
+    for ids in firm_ids.T:
+        matrices = []
+        for t in np.unique(market_ids):
+            ids_t = ids[market_ids.flat == t]
+            tiled_ids_t = np.tile(np.c_[ids_t], ids_t.size)
+            ownership_t = np.full((ids_t.size, J), np.nan, options.dtype)
+            ownership_t[:, :ids_t.size] = kappa_specification(tiled_ids_t, tiled_ids_t.T)
+            matrices.append(ownership_t)
+        stacks.append(np.vstack(matrices))
+    return np.hstack(stacks)
+
+
 def build_blp_instruments(characteristic_data, average=False):
     r"""Construct traditional BLP instruments.
 

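To make the stacking and ``numpy.nan`` padding described for ``build_ownership`` easier to picture, here is a minimal standalone sketch with a single column of firm IDs. The helper name ``stack_ownership`` and the toy IDs are hypothetical, and the code only loosely follows the loop in the diff; it does not rely on pyblp internals such as ``extract_matrix`` or ``options.dtype``.

```python
import numpy as np

def stack_ownership(market_ids, firm_ids, kappa=lambda f, g: float(f == g)):
    """Stack J_t x J_t ownership matrices vertically, padding narrower markets with NaN."""
    market_ids = np.asarray(market_ids)
    firm_ids = np.asarray(firm_ids)
    vectorized = np.vectorize(kappa, otypes=[np.float64])
    # width of the stack: the largest number of products in any market
    J = np.unique(market_ids, return_counts=True)[1].max()
    blocks = []
    for t in np.unique(market_ids):
        ids_t = firm_ids[market_ids == t]
        block = np.full((ids_t.size, J), np.nan)
        # broadcasting a column against a row evaluates kappa for every product pair
        block[:, :ids_t.size] = vectorized(ids_t[:, None], ids_t[None, :])
        blocks.append(block)
    return np.vstack(blocks)

# Toy data: market 0 has three products, market 1 has two.
stacked = stack_ownership(market_ids=[0, 0, 0, 1, 1], firm_ids=[0, 0, 1, 0, 1])
```

The result has one row per product and three columns; the two rows for the smaller market carry ``numpy.nan`` in the last column.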
diff --git a/pyblp/primitives.py b/pyblp/primitives.py
@@ -4,18 +4,23 @@
 import scipy.linalg
 
 from . import options
+from .construction import build_ownership
 from .utilities import output, extract_matrix, Matrices, Integration
 
 
 class Products(Matrices):
-    """Structured product data.
+    r"""Structured product data.
 
     Attributes
     ----------
     market_ids : `ndarray`
         IDs that associate products with markets.
     firm_ids : `ndarray`
         IDs that associate products with firms. Any columns after the first represent changes such as mergers.
+    ownership : `ndarray`
+        Stacked :math:`J_t \times J_t` ownership matrices, :math:`O`, for each market :math:`t`. Each stack is
+        associated with a :attr:`Products.firm_ids` column. If a market has fewer products than others, extra columns
+        will contain ``numpy.nan``.
     shares : `ndarray`
         Shares, :math:`s`.
     prices : `ndarray`
@@ -45,6 +50,24 @@ def __new__(cls, product_data, nonlinear_prices=True):
         if market_ids.shape[1] > 1:
             raise ValueError("The market_ids field of product_data must be one-dimensional.")
 
+        # determine the maximum number of products in a market
+        J = np.unique(market_ids, return_counts=True)[1].max()
+
+        # load or build ownership matrices
+        ownership = extract_matrix(product_data, 'ownership')
+        if firm_ids is None:
+            ownership = None
+        elif ownership is None:
+            ownership = build_ownership({'market_ids': market_ids, 'firm_ids': firm_ids})
+        elif ownership.shape[1] % J > 0 or ownership.shape[1] > J * firm_ids.shape[1]:
+            raise ValueError(
+                f"The ownership field of product_data must have a number of columns that is a multiple of {J} and that "
+                f"does not exceed {J * firm_ids.shape[1]}."
+            )
+        elif ownership.shape[1] < J * firm_ids.shape[1]:
+            unmatched_firm_ids = firm_ids[:, ownership.shape[1] // J:]
+            ownership = np.c_[ownership, build_ownership({'market_ids': market_ids, 'firm_ids': unmatched_firm_ids})]
+
         # load shares
         shares = extract_matrix(product_data, 'shares')
         if shares is None:
@@ -86,6 +109,7 @@ def __new__(cls, product_data, nonlinear_prices=True):
         return super().__new__(cls, {
             'market_ids': (market_ids, np.object),
             'firm_ids': (firm_ids, np.object),
+            'ownership': (ownership, options.dtype),
             'shares': (shares, options.dtype),
             'prices': (prices, options.dtype),
             'X1': (np.hstack(X1_list), options.dtype),
@@ -246,10 +270,10 @@ def get_characteristic(self, X1_index=None, X2_index=None):
         """Get the values for a product characteristic in X1 or X2 (or both)."""
         return self.products.X1[:, [X1_index]] if X2_index is None else self.products.X2[:, [X2_index]]
 
-    def get_ownership_matrix(self, firm_ids_index=0):
-        """Get the ownership matrix. By default, unchanged firm IDs are used."""
-        tiled_ids = np.tile(self.products.firm_ids[:, [firm_ids_index]], self.J)
-        return np.where(tiled_ids == tiled_ids.T, 1, 0)
+    def get_ownership_matrix(self, firms_index=0):
+        """Get an ownership matrix. By default, unchanged firm IDs are used."""
+        offset = firms_index * self.products.ownership.shape[1] // self.products.firm_ids.shape[1]
+        return self.products.ownership[:, offset:offset + self.J]
 
     def compute_delta(self, X1=None):
         """Compute delta. By default, the X1 with which this market was initialized is used."""

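The offset arithmetic in the new ``get_ownership_matrix`` can be checked on a small array. The sketch below assumes a market with two products whose rows hold two horizontally joined stacks (pre- and post-merger), each padded to three columns; the array values and the helper ``select_stack`` are illustrative only.

```python
import numpy as np

# Hypothetical market rows: two stacks (pre- and post-merger), each padded to 3 columns,
# for a market with only J_t = 2 products.
ownership = np.array([
    [1.0, 0.0, np.nan, 1.0, 1.0, np.nan],
    [0.0, 1.0, np.nan, 1.0, 1.0, np.nan],
])
num_stacks = 2  # one stack per firm_ids column
J_t = ownership.shape[0]

def select_stack(firms_index):
    """Return the J_t x J_t matrix for one stack, dropping NaN padding."""
    offset = firms_index * ownership.shape[1] // num_stacks
    return ownership[:, offset:offset + J_t]

pre_merger = select_stack(0)
post_merger = select_stack(1)
```

Slicing ``J_t`` columns from each offset, rather than the full stack width, is what keeps the padding out of the returned matrix.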
diff --git a/pyblp/problem.py b/pyblp/problem.py
@@ -34,6 +34,14 @@ class Problem(object):
               supply side of the problem. These are also needed to compute some post-estimation outputs. Any columns
               after the first can be used to compute post-estimation outputs for changes, such as mergers.
 
+            - **ownership** : (`numeric, optional`) - Custom stacked :math:`J_t \times J_t` ownership matrices,
+              :math:`O`, for each market :math:`t`, which can be built with :func:`build_ownership`. Each stack is
+              associated with a `firm_ids` column and must have as many columns as there are products in the market with
+              the most products. If a market has fewer products than others, extra columns will be ignored and may be
+              filled with any value, such as ``numpy.nan``. If an ownership matrix stack is unspecified, its
+              corresponding column in `firm_ids` is used by :func:`build_ownership` to build a stack of standard
+              ownership matrices.
+
             - **shares** : (`numeric`) - Shares, :math:`s`.
 
             - **prices** : (`numeric`) - Prices, :math:`p`, which will always be included in :math:`X_1`. If