
Commit

ENH: support agent-specific product availability
- New availability field in agent_data, similar to product-specific demographics.
- Multiplies exponentiated utilities when computing choice probabilities.
- Typically 0s or 1s, but can be other numbers to model known probabilities of availability that differ by demographic.
jeffgortmaker committed Jun 27, 2023
1 parent 0a592fd commit 5d4a71f
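As a rough illustration of what the commit message describes, the sketch below weights exponentiated utilities by availability before normalizing. The function name `availability_weighted_shares` and the agents-by-products layout are assumptions made for this note, not code from the commit.

```python
import numpy as np


def availability_weighted_shares(V, a):
    """Logit choice probabilities with availability weights (hypothetical helper).

    V and a are both (I, J) arrays: agents in rows, products in columns.
    The outside option has utility normalized to zero, hence the 1 in the denominator.
    """
    weighted = a * np.exp(V)
    return weighted / (1.0 + weighted.sum(axis=1, keepdims=True))


# two agents, three products, all utilities equal to zero
V = np.zeros((2, 3))
a = np.array([
    [1.0, 1.0, 1.0],  # agent 0 sees every product
    [1.0, 0.0, 0.5],  # agent 1 cannot buy product 1 and sees product 2 half the time
])
shares = availability_weighted_shares(V, a)
```

Setting an entry of `a` to zero removes that product from the agent's choice set, while a fractional entry scales its weight in both the numerator and the denominator.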
Showing 10 changed files with 150 additions and 19 deletions.
1 change: 1 addition & 0 deletions README.rst
@@ -117,6 +117,7 @@ Features
- Multiple equation GMM
- Demographic interactions
- Product-specific demographics
+- Consumer-specific product availability
- Flexible micro moments that can match statistics based on survey data
- Support for micro moments based on second choice data
- Support for optimal micro moments that match micro data scores
1 change: 1 addition & 0 deletions docs/notation.rst
@@ -89,6 +89,7 @@ Symbol Dimensions
:math:`\Gamma` :math:`J_t \times J_t` Another matrix used to decompose :math:`\eta` and :math:`\zeta` in market :math:`t`
:math:`d` :math:`I_t \times D` Observed agent characteristics called demographics in market :math:`t`
:math:`\nu` :math:`I_t \times K_2` Unobserved agent characteristics called integration nodes in market :math:`t`
+:math:`a` :math:`I_t \times J_t` Agent-specific product availability in market :math:`t`
:math:`w` :math:`I_t \times 1` Integration weights in market :math:`t`
:math:`\delta` :math:`N \times 1` Mean utility
:math:`\mu` :math:`J_t \times I_t` Agent-specific portion of utility in market :math:`t`
23 changes: 21 additions & 2 deletions pyblp/economies/problem.py
@@ -1385,8 +1385,8 @@ class Problem(ProblemEconomy):
same ID within a market.
Along with ``market_ids`` and ``agent_ids``, the names of any additional fields can typically be used as
-variables in ``agent_formulation``. The exception is the name ``'demographics'``, which is reserved for use by
-:class:`Agents`.
+variables in ``agent_formulation``. Exceptions are the names ``'demographics'`` and ``'availability'``, which
+are reserved for use by :class:`Agents`.
In addition to standard demographic variables :math:`d_{it}`, it is also possible to specify product-specific
demographics :math:`d_{ijt}`. A typical example is geographic distance of agent :math:`i` from product
@@ -1397,6 +1397,25 @@ class Problem(ProblemEconomy):
the market, as ordered in ``product_data``. The last index should be the number of products in the largest
market, minus one. For markets with fewer products than this maximum number, latter columns will be ignored.
+Finally, by default each agent :math:`i` in market :math:`t` is faced with the same choice set of products
+:math:`j`, but it is possible to specify agent-specific availability :math:`a_{ijt}` much in the same way that
+product-specific demographics are specified. To do so, the following field can be specified:
+- **availability** : (`numeric, optional`) - Agent-specific product availability, :math:`a`. Choice
+probabilities in :eq:`probabilities` are modified according to
+.. math:: s_{ijt} = \frac{a_{ijt} \exp V_{ijt}}{1 + \sum_{k \in J_t} a_{ikt} \exp V_{ikt}},
+and similarly for the nested logit model and consumer surplus calculations. By default, all
+:math:`a_{ijt} = 1`. To have a product :math:`j` be unavailable to agent :math:`i`, set
+:math:`a_{ijt} = 0`.
+Agent-specific availability is specified in the same way that product-specific demographics are specified.
+In ``agent_data``, one can include ``'availability0'``, ``'availability1'``, ``'availability2'``, and so
+on, where the index corresponds to the order in which products appear within a market in ``product_data``.
+The last index should be the number of products in the largest market, minus one. For markets with fewer
+products than this maximum number, latter columns will be ignored.
integration : `Integration, optional`
:class:`Integration` configuration for how to build nodes and weights for integration over agent choice
probabilities, which will replace any ``nodes`` and ``weights`` fields in ``agent_data``. This configuration is
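To make the column naming in the docstring concrete, here is a minimal sketch of how the ``availability0``, ``availability1``, ... fields might be laid out. The market IDs, the number of agents, and the `agent_columns` dict are all hypothetical; the dict only stands in for whatever structure is eventually passed as ``agent_data``, and no pyblp call is made.

```python
import numpy as np

# hypothetical setup: market 'A' has 3 products, market 'B' has 2, with two agents in each
max_products = 3
market_ids = np.array(['A', 'A', 'B', 'B'])

# start from full availability, then switch off specific agent-product pairs
availability = np.ones((market_ids.size, max_products))
availability[1, 2] = 0.0  # second agent in market 'A' cannot buy the third product

# one column per product index, ordered as products appear within each market
agent_columns = {'market_ids': market_ids}
for j in range(max_products):
    agent_columns[f'availability{j}'] = availability[:, j]
```

Under the docstring's rule, the ``availability2`` values for agents in market 'B' would simply be ignored, since that market only has two products.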
25 changes: 23 additions & 2 deletions pyblp/economies/simulation.py
@@ -187,6 +187,25 @@ class Simulation(Economy):
the market, as ordered in ``product_data``. The last index should be the number of products in the largest
market, minus one. For markets with fewer products than this maximum number, latter columns will be ignored.
+Finally, by default each agent :math:`i` in market :math:`t` is faced with the same choice set of products
+:math:`j`, but it is possible to specify agent-specific availability :math:`a_{ijt}` much in the same way that
+product-specific demographics are specified. To do so, the following field can be specified:
+- **availability** : (`numeric, optional`) - Agent-specific product availability, :math:`a`. Choice
+probabilities in :eq:`probabilities` are modified according to
+.. math:: s_{ijt} = \frac{a_{ijt} \exp V_{ijt}}{1 + \sum_{k \in J_t} a_{ikt} \exp V_{ikt}},
+and similarly for the nested logit model and consumer surplus calculations. By default, all
+:math:`a_{ijt} = 1`. To have a product :math:`j` be unavailable to agent :math:`i`, set
+:math:`a_{ijt} = 0`.
+Agent-specific availability is specified in the same way that product-specific demographics are specified.
+In ``agent_data``, one can include ``'availability0'``, ``'availability1'``, ``'availability2'``, and so
+on, where the index corresponds to the order in which products appear within a market in ``product_data``.
+The last index should be the number of products in the largest market, minus one. For markets with fewer
+products than this maximum number, latter columns will be ignored.
integration : `Integration, optional`
:class:`Integration` configuration for how to build nodes and weights for integration over agent choice
probabilities, which will replace any ``nodes`` and ``weights`` fields in ``agent_data``. This configuration is
@@ -459,12 +478,13 @@ def __init__(
if not isinstance(integration, Integration):
raise ValueError("integration must be None or an Integration instance.")
agent_market_ids, nodes, weights = integration._build_many(products.X2.shape[1], np.unique(market_ids))
-agent_ids = None
+agent_ids = availability = None
elif agent_data is not None:
agent_market_ids = extract_matrix(agent_data, 'market_ids')
agent_ids = extract_matrix(agent_data, 'agent_ids')
nodes = extract_matrix(agent_data, 'nodes')
weights = extract_matrix(agent_data, 'weights')
+availability = extract_matrix(agent_data, 'availability')
else:
raise ValueError("At least one of agent_data or integration must be specified.")

@@ -473,7 +493,8 @@ def __init__(
'market_ids': (agent_market_ids, np.object_),
'agent_ids': (agent_ids, np.object_),
'nodes': (nodes, options.dtype),
-'weights': (weights, options.dtype)
+'weights': (weights, options.dtype),
+'availability': (availability, options.dtype),
}
if agent_formulation is not None:
for name in sorted(agent_formulation._names - set(agent_mapping)):
4 changes: 4 additions & 0 deletions pyblp/markets/economy_results_market.py
@@ -377,6 +377,10 @@ def safely_compute_consumer_surplus(
exp_utilities = np.exp(utilities - utility_reduction)
scale_weights = 1

+# optionally adjust for agent-specific product availability
+if self.agents.availability.size > 0:
+exp_utilities *= self.agents.availability.T

# eliminate any products from the choice set
if eliminate_product_ids is not None:
for j, product_id in enumerate(self.products.product_ids[:, product_ids_index]):
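The hunk above applies availability to the exponentiated utilities that feed the consumer surplus calculation. A minimal sketch of the corresponding log-sum term is given below, assuming a plain logit model, utilities stored as products by agents, and the same style of re-scaling by a utility reduction. The helper `inclusive_value` is hypothetical, and the division by the marginal utility of income that a full surplus calculation needs is deliberately left out.

```python
import numpy as np


def inclusive_value(utilities, availability):
    """Availability-weighted log-sum term per agent (hypothetical helper).

    utilities is (J, I) and availability is (I, J), so availability is transposed
    before multiplying, as in the hunk above.
    """
    # subtract a non-negative per-agent reduction so that np.exp cannot overflow
    reduction = np.clip(utilities.max(axis=0, keepdims=True), 0, None)
    exp_utilities = np.exp(utilities - reduction)
    exp_utilities *= availability.T

    # the outside option's exp(0) = 1 is re-scaled by the same reduction
    scale = np.exp(-reduction)
    return (np.log(scale + exp_utilities.sum(axis=0, keepdims=True)) + reduction).ravel()


# two products, two agents, zero utilities; agent 1 cannot buy product 1
values = inclusive_value(np.zeros((2, 2)), np.array([[1.0, 1.0], [1.0, 0.0]]))

# a very large utility stays finite thanks to the reduction
large = inclusive_value(np.array([[800.0], [0.0]]), np.array([[1.0, 1.0]]))
```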
39 changes: 31 additions & 8 deletions pyblp/markets/market.py
@@ -76,7 +76,7 @@ def __init__(
self.products = update_matrices(self.products, products_update_mapping)

# fill missing columns of integration nodes (associated with zeros in sigma) with zeros and drop extra
-# product-specific demographic values for product indices not in this market
+# product-specific demographic/agent-specific product availability values for products not in this market
agents_update_mapping: Dict[str, Tuple[Optional[Array], Any]] = {}
if self.agents.nodes.shape[1] != economy.K2 and not parameters.nonzero_sigma_index.all():
nodes = np.zeros((self.agents.shape[0], economy.K2), self.agents.nodes.dtype)
@@ -85,6 +85,9 @@ def __init__(
if len(self.agents.demographics.shape) == 3:
demographics = self.agents.demographics[..., :self.products.size]
agents_update_mapping['demographics'] = (demographics, demographics.dtype)
+if self.agents.availability.size > 0:
+availability = self.agents.availability[..., :self.products.size]
+agents_update_mapping['availability'] = (availability, availability.dtype)
if agents_update_mapping:
self.agents = update_matrices(self.agents, agents_update_mapping)
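The slicing in this hunk drops availability columns for product indices that do not exist in the current market. A small standalone illustration follows, with made-up numbers and a hypothetical `products_in_market` standing in for ``self.products.size``:

```python
import numpy as np

# two agents, with columns for up to three products (the largest market in the data)
availability = np.array([
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
])

# this market only has two products, so the third column is irrelevant here
products_in_market = 2
trimmed = availability[..., :products_in_market]
```

Using `...` keeps the slice on the last axis regardless of how many leading axes the array has, which mirrors how the three-dimensional demographics array is trimmed just above.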

@@ -331,8 +334,8 @@ def compute_probabilities(
self, delta: Array = None, mu: Optional[Array] = None, linear: bool = True, safe: bool = True,
utility_reduction: Optional[Array] = None, numerator: Optional[Array] = None,
eliminate_outside: bool = False, eliminate_product: Optional[int] = None,
-eliminate_product_id: Optional[Any] = None, product_ids_index: Optional[int] = None) -> (
-Tuple[Array, Optional[Array]]):
+eliminate_product_id: Optional[Any] = None, product_ids_index: Optional[int] = None,
+availability: Optional[Array] = None) -> Tuple[Array, Optional[Array]]:
"""Compute choice probabilities. By default, use unchanged delta and mu values. If linear is False, delta and mu
must be specified and already be exponentiated. If safe is True, scale the logit equation by the exponential of
negative the maximum utility for each agent, and if utility_reduction is specified, it should be values that
@@ -384,6 +387,12 @@ def compute_probabilities(
if eliminate_outside:
scale = 0

+# optionally adjust for agent-specific product availability
+if availability is None and self.agents.availability.size > 0:
+availability = self.agents.availability
+if availability is not None:
+exp_utilities *= availability.T

# optionally eliminate a product from the choice set
if eliminate_product is not None:
exp_utilities[eliminate_product] = 0
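The fallback logic in this hunk can be read as: use an explicitly passed availability matrix if there is one, otherwise fall back to the stored one when it is non-empty, and only then eliminate products. The sketch below reproduces that order for a plain logit model. The function `sketch_probabilities` and its `stored_availability` argument are hypothetical stand-ins for the method and for ``self.agents.availability``, and it omits nesting and the safe re-scaling.

```python
import numpy as np


def sketch_probabilities(utilities, stored_availability, availability=None, eliminate_product=None):
    """Plain logit probabilities, shaped (J, I), with optional availability weights."""
    exp_utilities = np.exp(utilities)

    # an explicit argument wins; otherwise use the stored matrix if it has any entries
    if availability is None and stored_availability.size > 0:
        availability = stored_availability
    if availability is not None:
        exp_utilities = exp_utilities * availability.T  # (I, J) transposed to (J, I)

    # eliminating a product zeroes out its whole row, for every agent
    if eliminate_product is not None:
        exp_utilities[eliminate_product] = 0

    return exp_utilities / (1 + exp_utilities.sum(axis=0, keepdims=True))


utilities = np.zeros((2, 2))  # two products, two agents
stored = np.array([[1.0, 0.0], [1.0, 1.0]])  # agent 0 cannot buy product 1

no_availability = sketch_probabilities(utilities, np.empty((2, 0)))
with_availability = sketch_probabilities(utilities, stored)
with_elimination = sketch_probabilities(utilities, stored, eliminate_product=0)
```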
@@ -1023,11 +1032,21 @@ def compute_probabilities_by_parameter_tangent(
# compute the tangent of marginal probabilities with respect to the parameter (re-scale for robustness)
utility_reduction = np.clip(utilities.max(axis=0, keepdims=True), 0, None)
with np.errstate(divide='ignore', invalid='ignore'):
+exp_utilities = np.exp(utilities - utility_reduction)

+# handle agent-specific product availability
+if self.agents.availability.size > 0:
+availability = self.agents.availability
+if agent_indices is not None:
+availability = availability[agent_indices]
+exp_utilities *= availability.T

B = marginals * (
A_sums * (1 - self.group_rho) -
-(np.log(self.groups.sum(np.exp(utilities - utility_reduction))) + utility_reduction)
+(np.log(self.groups.sum(exp_utilities)) + utility_reduction)
)
marginals_tangent = group_associations * B - marginals * (group_associations.T @ B)

marginals_tangent[~np.isfinite(marginals_tangent)] = 0

else:
@@ -1421,8 +1440,11 @@ def compute_micro_dataset_contributions(
delta = self.delta

mu = None
+availability = None
if agent_indices is not None:
mu = self.mu[:, agent_indices]
+if self.agents.availability.size > 0:
+availability = self.agents.availability[agent_indices]

# pre-compute and validate micro dataset weights, multiplying these with probabilities and using these to
# compute micro value denominators
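Because ``mu`` is stored as products by agents while availability is agents by products, the same `agent_indices` select columns of one and rows of the other. The toy arrays below (all values made up) check that the two subsets still line up after transposing:

```python
import numpy as np

mu = np.arange(6.0).reshape(2, 3)  # J = 2 products, I = 3 agents
availability = np.array([
    [1.0, 1.0],
    [0.0, 1.0],
    [1.0, 0.0],
])  # I = 3 agents, J = 2 products

agent_indices = np.array([2, 0])  # hypothetical subset of agents, in a custom order

mu_subset = mu[:, agent_indices]  # select columns: shape (J, 2)
availability_subset = availability[agent_indices]  # select rows: shape (2, J)

# after transposing, availability matches mu element by element
weighted = np.exp(mu_subset) * availability_subset.T
```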
@@ -1454,7 +1476,7 @@ def compute_micro_dataset_contributions(

# pre-compute probabilities
if probabilities is None:
-probabilities, _ = self.compute_probabilities(delta, mu)
+probabilities, _ = self.compute_probabilities(delta, mu, availability=availability)

# pre-compute outside probabilities
need_outside_probabilities = len(weights.shape) == 2 and weights.shape[1] == 1 + self.J
@@ -1485,7 +1507,7 @@ def compute_micro_dataset_contributions(
# re-compute probabilities if there is nesting or there was a numerical error
if eliminated_probabilities_j is None or not np.isfinite(eliminated_probabilities_j).all():
eliminated_probabilities_j, eliminated_conditionals_j = self.compute_probabilities(
-delta, mu, eliminate_product=j
+delta, mu, eliminate_product=j, availability=availability
)

eliminated_probabilities_list.append(eliminated_probabilities_j)
@@ -1510,7 +1532,8 @@ def compute_micro_dataset_contributions(
# re-compute probabilities if there is nesting or there was a numerical error
if eliminated_probabilities_j is None or not np.isfinite(eliminated_probabilities_j).all():
eliminated_probabilities_j, eliminated_conditionals_j = self.compute_probabilities(
-delta, mu, eliminate_product_id=product_id, product_ids_index=ids_index
+delta, mu, eliminate_product_id=product_id, product_ids_index=ids_index,
+availability=availability
)

eliminated_probabilities_list.append(eliminated_probabilities_j)
@@ -1602,7 +1625,7 @@ def compute_micro_dataset_contributions(
# re-compute probabilities if there is nesting or there was a numerical error
if outside_eliminated_probabilities is None or not np.isfinite(outside_eliminated_probabilities).all():
outside_eliminated_probabilities, outside_eliminated_conditionals = self.compute_probabilities(
-delta, mu, eliminate_outside=True
+delta, mu, eliminate_outside=True, availability=availability
)

if compute_jacobians:
