Add auto-batched (low-rank) multivariate normal guides. #1737

Merged · 6 commits · Feb 21, 2024

Conversation

@tillahoffmann (Contributor) commented on Feb 17, 2024

This PR implements auto-guides that support batching along leading dimensions of the parameters. The guides are motivated by models that have conditional independence structure but possibly strong correlation within each instance of a plate. The interface is exactly the same as Auto[LowRank]MultivariateNormal with an additional argument batch_ndims that specifies the number of dimensions to treat as independent in the posterior approximation.

Example

Consider a random walk model with n time series of t observations each. The number of parameters is then n * t + 2 * n: an n-by-t matrix of latent innovations, one scale parameter for the random walk innovations of each series, and one scale parameter for the observation noise of each series. For concreteness, here's the model.

import numpyro
from numpyro import distributions


def model(n, t):
    with numpyro.plate("n", n):
        # Model for time series.
        innovation_scale = numpyro.sample(
            "innovation_scale",
            distributions.HalfCauchy(1),
        )
        innovations = numpyro.sample(
            "innovations",
            distributions.Normal().expand([t]).to_event(1),
        )
        series = numpyro.deterministic(
            "series",
            innovations.cumsum(axis=-1),
        )
        
        # Model for observations.
        noise_scale = numpyro.sample(
            "noise_scale",
            distributions.HalfCauchy(1),
        )
        data = numpyro.sample(
            "data",
            distributions.Normal(series, noise_scale[:, None]).to_event(1),
        )

Suppose we use different auto-guides and count the number of parameters we need to optimize. The example below is for n = 10 and t = 20.

# [guide class] [total number of parameters]
# 	[parameter shapes]
AutoDiagonalNormal 440
	 {'auto_loc': (220,), 'auto_scale': (220,)}
AutoLowRankMultivariateNormal 3740
	 {'auto_loc': (220,), 'auto_cov_factor': (220, 15), 'auto_scale': (220,)}
AutoMultivariateNormal 48620
	 {'auto_loc': (220,), 'auto_scale_tril': (220, 220)}
AutoBatchedLowRankMultivariateNormal 1540
	 {'auto_loc': (10, 22), 'auto_cov_factor': (10, 22, 5), 'auto_scale': (10, 22)}
AutoBatchedMultivariateNormal 5060
	 {'auto_loc': (10, 22), 'auto_scale_tril': (10, 22, 22)}

AutoDiagonalNormal of course has the fewest parameters and AutoMultivariateNormal the most. The number of location parameters is the same across all guides. The batched versions have significantly fewer scale/covariance parameters, but cannot model dependence between different series. There is no free lunch, but I believe these batched guides can strike a reasonable compromise between modeling dependence and computational cost.
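The counts in the table follow directly from the latent dimensions: with n = 10 and t = 20 there are d = n * t + 2 * n = 220 unconstrained latent dimensions in total, or 22 per series for the batched guides. A quick sanity check of the arithmetic (assuming ranks 15 and 5 for the low-rank guides, as the parameter shapes above show):

```python
# Reproduce the parameter counts from the table, using only the latent
# dimensions: n series, t steps -> d = n * t + 2 * n total dimensions.
n, t = 10, 20
d = n * t + 2 * n  # 220 latent dimensions in total
d_batch = d // n   # 22 latent dimensions per series

# Non-batched guides operate on the flattened d-dimensional vector.
diagonal = d + d           # loc + scale
low_rank = d + d * 15 + d  # loc + cov_factor (rank 15) + scale
full = d + d * d           # loc + scale_tril

# Batched guides repeat a d_batch-dimensional approximation n times.
batched_low_rank = n * (d_batch + d_batch * 5 + d_batch)  # rank 5
batched_full = n * (d_batch + d_batch * d_batch)

print(diagonal, low_rank, full, batched_low_rank, batched_full)
# -> 440 3740 48620 1540 5060
```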

Implementation

The implementation uses a mixin AutoBatchedMixin to

  1. determine the batch shape (and verify that a batched guide is appropriate for the model) and
  2. apply a reshaping transformation to account for the existence of batches in the variational approximation.

The two batched guides are implemented analogously to the non-batched guides with the addition of the mixin and slight modifications to the parameters.

I added a ReshapeTransform to take care of the shapes. That could probably also be squeezed into the UnpackTransform. I decided on the former approach because

  1. it separates the concerns rather than packing more logic into UnpackTransform and
  2. I've found myself wanting a transform for reshaping samples in other settings.
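The transform itself is conceptually tiny: a bijection between two shapes with a zero log-Jacobian. Here is a minimal numpy sketch of the idea (not the numpyro implementation; the class and attribute names are mine):

```python
import numpy as np


class ReshapeSketch:
    """Bijection between two shapes; reshaping has unit Jacobian determinant."""

    def __init__(self, forward_shape, inverse_shape):
        if np.prod(forward_shape) != np.prod(inverse_shape):
            raise ValueError("shapes must contain the same number of elements")
        self.forward_shape = tuple(forward_shape)
        self.inverse_shape = tuple(inverse_shape)

    def __call__(self, x):
        return np.reshape(x, self.forward_shape)

    def inv(self, y):
        return np.reshape(y, self.inverse_shape)

    def log_abs_det_jacobian(self, x, y):
        # Reshaping only rearranges elements, so the log-Jacobian is zero.
        return np.zeros(())


# Map a (10, 22) batched posterior sample to the flat 220-vector that the
# model-side unpacking expects, and back.
transform = ReshapeSketch(forward_shape=(220,), inverse_shape=(10, 22))
sample = np.arange(220.0).reshape(10, 22)
flat = transform(sample)
print(flat.shape)  # -> (220,)
```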

Note

I didn't implement the get_base_dist, get_transform, and get_posterior methods because I couldn't find the corresponding tests.

for site in self.prototype_trace.values():
    if site["type"] == "sample" and not site["is_observed"]:
        shape = site["value"].shape
        if site["value"].ndim < self.batch_ndim:
A reviewer (Member) commented:
I think a safer check is site["value"].ndim < self.batch_ndim + site["fn"].event_dim.
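To see why the suggested check is safer, note that a site's event dimensions count towards its ndim: a sample declared with `.to_event(1)` already has one trailing dimension from its event shape, so comparing against `batch_ndim` alone can silently accept a site whose plate dimension is missing. A toy illustration (plain Python, not numpyro internals):

```python
# Suppose the guide expects one leading batch dimension, but a site declared
# with `.to_event(1)` only carries its event shape (t,) and no plate dim.
batch_ndim = 1
value_shape = (20,)  # only the event shape is present
event_dim = 1

# The naive check compares against batch_ndim alone and misses the problem.
naive_check_flags_site = len(value_shape) < batch_ndim
# The safer check accounts for the event dimensions and catches it.
safer_check_flags_site = len(value_shape) < batch_ndim + event_dim

print(naive_check_flags_site, safer_check_flags_site)  # -> False True
```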

@fehiepsi (Member) left a comment:

Thanks for the great contribution, @tillahoffmann! LGTM pending the comment above.

@fehiepsi fehiepsi merged commit b35fcec into pyro-ppl:master Feb 21, 2024
4 checks passed
@tillahoffmann tillahoffmann deleted the batched branch February 22, 2024 16:13
@tillahoffmann (Contributor, PR author) commented:
It turns out that for larger datasets, we run into google/jax#19885. The issue could probably be worked around in numpyro by slightly rearranging operations in the LowRankMultivariateNormal implementation. Is that of interest, or should we just wait for the upstream fix? (I don't know how quick the jax folks usually are.)

@fehiepsi (Member) commented:
Oh, what a subtle issue. It would be nice to have a fix here (if the solution is as simple as rearranging a few operations).

OlaRonning pushed a commit to aleatory-science/numpyro that referenced this pull request May 6, 2024
* Add `ReshapeTransform`.

* Add `AutoBatchedMultivariateNormal`.

* Refactor to use `AutoBatchedMixin`.

* Add `AutoLowRankMultivariateNormal`.

* Fix import order.

* Disable batching along event dimensions.