ENH/API: refactor score computation

- Add score computation methods to ProblemResults via an intermediate EconomyResults class.
- Clarify which methods are in SimulationResults and BootstrappedResults.
- Simplify micro data simulation.
- Split up score computation into two methods: compute_micro_scores and compute_agent_scores.
- Add new micro_data validation.
- Simplify the score unit test.
jeffgortmaker · Jun 18, 2022 · 61b1928
1 parent 68e0989
Showing 10 changed files with 1,094 additions and 956 deletions.
diff --git a/README.rst b/README.rst
@@ -108,6 +108,7 @@ Features
 - Product-specific demographics
 - Flexible micro moments that can match statistics based on survey data
 - Support for micro moments based on second choice data
+- Support for optimal micro moments that match micro data scores
 - Fixed effect absorption
 - Nonlinear functions of product characteristics
 - Concentrating out linear parameters

diff --git a/docs/api.rst b/docs/api.rst
@@ -103,7 +103,7 @@ The following methods test the validity of overidentifying and model restriction
    ProblemResults.run_lm_test
    ProblemResults.run_wald_test
 
-In addition to class attributes, other post-estimation outputs can be estimated market-by-market with the following methods, which each return an array.
+In addition to class attributes, other post-estimation outputs can be estimated market-by-market with the following methods, each of which returns an array.
 
 .. autosummary::
    :toctree: _api
@@ -129,7 +129,7 @@ In addition to class attributes, other post-estimation outputs can be estimated
    ProblemResults.compute_profits
    ProblemResults.compute_consumer_surpluses
 
-A parametric bootstrap can be used, for example, to compute standard errors for the above post-estimation outputs. The following method returns a results class with all of the above methods, which returns a distribution of post-estimation outputs corresponding to different bootstrapped samples.
+A parametric bootstrap can be used, for example, to compute standard errors for post-estimation outputs. The following method returns a results class with the same methods as in the list directly above, each of which returns a distribution of post-estimation outputs corresponding to different bootstrapped samples.
 
 .. autosummary::
    :toctree: _api
@@ -150,6 +150,16 @@ Importance sampling can be used to create new integration nodes and weights. Its
 
    ProblemResults.importance_sampling
 
+The following methods can compute micro moment values, compute scores from micro data, or simulate such data.
+
+.. autosummary::
+   :toctree: _api
+
+   ProblemResults.compute_micro_values
+   ProblemResults.compute_micro_scores
+   ProblemResults.compute_agent_scores
+   ProblemResults.simulate_micro_data
+
 
 Bootstrapped Problem Results Class
 ----------------------------------
@@ -163,7 +173,7 @@ Parametric bootstrap computation returns the following class.
 
    BootstrappedResults
 
-This class has all of the same methods as :class:`ProblemResults`, except for :meth:`ProblemResults.bootstrap`, :meth:`ProblemResults.compute_optimal_instruments`, :meth:`ProblemResults.importance_sampling`, :meth:`ProblemResults.run_distance_test`, :meth:`ProblemResults.run_hansen_test`, :meth:`ProblemResults.run_lm_test`, and :meth:`ProblemResults.run_wald_test`. It can also be pickled or converted into a dictionary.
+This class has many of the same methods as :class:`ProblemResults`. It can also be pickled or converted into a dictionary.
 
 .. autosummary::
    :toctree: _api
@@ -284,7 +294,7 @@ Solved simulations return the following results class.
 
    SimulationResults
 
-This class has all of the same methods as :class:`ProblemResults`, except for :meth:`ProblemResults.bootstrap`, :meth:`ProblemResults.compute_optimal_instruments`, :meth:`ProblemResults.importance_sampling`, :meth:`ProblemResults.run_distance_test`, :meth:`ProblemResults.run_hansen_test`, :meth:`ProblemResults.run_lm_test`, and :meth:`ProblemResults.run_wald_test`. It can also be pickled or converted into a dictionary.
+This class has many of the same methods as :class:`ProblemResults`. It can also be pickled or converted into a dictionary.
 
 .. autosummary::
    :toctree: _api
@@ -299,15 +309,6 @@ It can also be converted into a :class:`Problem` with the following method.
 
    SimulationResults.to_problem
 
-The following methods can compute micro moment values, compute scores from micro data, or simulate such data.
-
-.. autosummary::
-   :toctree: _api
-
-   SimulationResults.compute_micro_values
-   SimulationResults.compute_micro_scores
-   SimulationResults.build_micro_data
-
 
 Structured Data Classes
 -----------------------

diff --git a/pyblp/markets/results_market.py → pyblp/markets/economy_results_market.py b/pyblp/markets/results_market.py → pyblp/markets/economy_results_market.py
@@ -7,12 +7,13 @@
 from .market import Market
 from .. import exceptions, options
 from ..configurations.iteration import Iteration
+from ..micro import MicroDataset, Moments
 from ..utilities.algebra import approximately_invert
 from ..utilities.basics import Array, Bounds, Error, SolverStats, NumericalErrorHandler
 
 
-class ResultsMarket(Market):
-    """A market in structured BLP results."""
+class EconomyResultsMarket(Market):
+    """A market in structured results for an economy underlying the BLP model."""
 
     @NumericalErrorHandler(exceptions.EquilibriumRealizationNumericalError)
     def safely_solve_equilibrium_realization(
@@ -404,3 +405,104 @@ def safely_compute_consumer_surplus(
         # integrate over agents
         surplus = surpluses @ self.agents.weights
         return surplus, errors
+
+    @NumericalErrorHandler(exceptions.SyntheticMicroDataNumericalError)
+    def safely_compute_micro_weights(self, dataset: MicroDataset) -> Tuple[Array, List[Error]]:
+        """Compute probabilities needed for simulating micro data, handling any numerical errors."""
+        errors: List[Error] = []
+        weights_mapping, _, _, _ = self.compute_micro_dataset_contributions([dataset])
+        return weights_mapping[dataset], errors
+
+    @NumericalErrorHandler(exceptions.SyntheticMicroMomentsNumericalError)
+    def safely_compute_micro_contributions(self, moments: Moments) -> Tuple[Array, Array, List[Error]]:
+        """Compute micro moment value contributions, handling any numerical errors."""
+        errors: List[Error] = []
+        micro_numerator, micro_denominator, _, _, _, _, _ = self.compute_micro_contributions(moments)
+        return micro_numerator, micro_denominator, errors
+
+    @NumericalErrorHandler(exceptions.MicroScoresNumericalError)
+    def safely_compute_score_denominator_contributions(
+            self, dataset: MicroDataset) -> Tuple[Array, Array, Array, List[Error]]:
+        """Compute denominator contributions to micro scores, handling any numerical errors."""
+
+        # compute probabilities and their derivatives
+        probabilities, conditionals = self.compute_probabilities()
+        probabilities_tangent_mapping, conditionals_tangent_mapping = (
+            self.compute_probabilities_by_parameter_tangent_mapping(probabilities, conditionals)
+        )
+        xi_jacobian, errors = self.compute_xi_by_theta_jacobian(
+            probabilities, conditionals, probabilities_tangent_mapping
+        )
+        self.update_probabilities_by_parameter_tangent_mapping(
+            probabilities_tangent_mapping, conditionals_tangent_mapping, probabilities, conditionals, xi_jacobian
+        )
+
+        # compute contributions
+        _, denominator_mapping, _, tangent_mapping = self.compute_micro_dataset_contributions(
+            [dataset], self.delta, probabilities, probabilities_tangent_mapping, compute_jacobians=True
+        )
+        if dataset in denominator_mapping:
+            denominator = denominator_mapping[dataset]
+            jacobian = np.array([tangent_mapping[(dataset, p)] for p in range(self.parameters.P)])
+        else:
+            denominator = 0
+            jacobian = np.zeros(self.parameters.P, options.dtype)
+
+        return xi_jacobian, denominator, jacobian, errors
+
+    @NumericalErrorHandler(exceptions.MicroScoresNumericalError)
+    def safely_compute_score_numerator_contributions(
+            self, dataset: MicroDataset, j: Optional[Any], k: Optional[Any], xi_jacobian: Array) -> (
+            Tuple[Array, Array, List[Error]]):
+        """Compute numerator contributions to micro scores, handling any numerical errors."""
+        errors: List[Error] = []
+
+        # compute probabilities and their derivatives
+        probabilities, conditionals = self.compute_probabilities()
+        probabilities_tangent_mapping, conditionals_tangent_mapping = (
+            self.compute_probabilities_by_parameter_tangent_mapping(probabilities, conditionals)
+        )
+        self.update_probabilities_by_parameter_tangent_mapping(
+            probabilities_tangent_mapping, conditionals_tangent_mapping, probabilities, conditionals, xi_jacobian
+        )
+
+        # obtain weights and their derivatives
+        weights_mapping, _, tangent_mapping, _ = self.compute_micro_dataset_contributions(
+            [dataset], self.delta, probabilities, probabilities_tangent_mapping, compute_jacobians=True
+        )
+        if dataset in weights_mapping:
+            weights = weights_mapping[dataset]
+            tangent = np.stack([tangent_mapping[(dataset, p)] for p in range(self.parameters.P)], axis=-1)
+        else:
+            weights = np.zeros_like(self.compute_micro_weights(dataset))
+            tangent = np.zeros(list(weights.shape) + [self.parameters.P], options.dtype)
+
+        # validate choices and select corresponding weights if specified
+        if j is not None:
+            try:
+                weights = weights[:, j]
+                tangent = tangent[:, j]
+            except IndexError as exception:
+                message = f"In market '{self.t}', choice index '{j}' is not between 0 and {weights.shape[1] - 1}."
+                raise ValueError(message) from exception
+
+        # validate second choices and select corresponding weights if specified and there are second choices
+        if len(weights.shape) == 1 + int(j is None) + 1:
+            if j is not None and k is None:
+                raise ValueError(
+                    "The dataset is configured to support second choice data, so micro_data must have "
+                    "second_choice_indices."
+                )
+            if k is not None:
+                try:
+                    weights = weights[:, k]
+                    tangent = tangent[:, k]
+                except IndexError as exception:
+                    message = f"In market '{self.t}', second choice index '{k}' is not between 0 and {weights.shape[-1] - 1}."
+                    raise ValueError(message) from exception
+
+        # integrate over agents to get the numerator contributions
+        numerator = weights.sum(axis=0)
+        jacobian = tangent.sum(axis=0)
+
+        return numerator, jacobian, errors
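
The final steps of safely_compute_score_numerator_contributions above (select the weights for a chosen product, then sum over agents) can be sketched in isolation. This is a minimal illustration, not pyblp's implementation: the function name `sum_selected_weights` and its `market_id` argument are hypothetical, plain Python lists stand in for NumPy arrays, and an explicit bounds check replaces the `IndexError` handling in the diff. One consequence of that substitution is that negative indices are rejected here, whereas NumPy indexing would accept them.

```python
from typing import List, Optional, Sequence


def sum_selected_weights(
        weights: Sequence[Sequence[float]], j: Optional[int], market_id: str) -> List[float]:
    """Sum agent-by-choice weights over agents, optionally keeping only choice j.

    Hypothetical helper: rows of `weights` are agents and columns are choices.
    """
    if not weights:
        raise ValueError("weights must contain at least one agent row.")
    num_choices = len(weights[0])

    # no choice specified: return one agent-summed total per choice
    if j is None:
        return [sum(row[c] for row in weights) for c in range(num_choices)]

    # validate the choice index before selecting its column
    if not 0 <= j < num_choices:
        raise ValueError(
            f"In market '{market_id}', choice index '{j}' is not between 0 and {num_choices - 1}."
        )

    # integrate (sum) the selected column over agents
    return [sum(row[j] for row in weights)]
```

Calling `sum_selected_weights([[0.1, 0.2], [0.3, 0.4]], 1, "t1")` sums the second column over both agents, while an index of 2 raises a `ValueError` that names the market, mirroring the style of message used in the diff.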
diff --git a/pyblp/markets/simulation_results_market.py b/pyblp/markets/simulation_results_market.py