Implement NanMaskedNormal, NanMaskedMultivariateNormal #3116

fritzo · 2022-07-09T01:40:54Z

This implements two distributions to serve as likelihoods for partially observed data, where unobserved elements are specified as NaN values. This is new functionality beyond pyro.mask() and Distribution.mask() in that it allows NaN values within an event of MultivariateNormal; in this case we can analytically marginalize out the missing values. NanMaskedNormal is similar to Normal.mask(...), but I've included it for easier compatibility with the nontrivial NanMaskedMultivariateNormal.
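For readers unfamiliar with the marginalization step: the marginal of a multivariate normal over a subset of coordinates is again a multivariate normal, obtained by keeping only the corresponding entries of the mean and the corresponding rows and columns of the covariance. A minimal single-event sketch follows. The helper name `nan_marginal_log_prob` is hypothetical and this is not the code added in the PR; returning 0 for a fully unobserved event is an assumption of the sketch.

```python
import torch
from torch.distributions import MultivariateNormal


def nan_marginal_log_prob(loc, cov, value):
    """Log density of the observed (non-NaN) entries of one event.

    Hypothetical helper for illustration only.
    loc: (d,), cov: (d, d), value: (d,) with NaN marking missing entries.
    """
    ok = ~torch.isnan(value)  # True where an entry is observed
    if not ok.any():
        # Assumption: a fully unobserved event contributes log(1) = 0.
        return value.new_zeros(())
    # Marginalizing a Gaussian drops the missing rows and columns.
    sub_loc = loc[ok]
    sub_cov = cov[ok][:, ok]
    return MultivariateNormal(sub_loc, sub_cov).log_prob(value[ok])


loc = torch.zeros(3)
cov = torch.tensor([[2.0, 0.5, 0.0], [0.5, 1.0, 0.3], [0.0, 0.3, 1.5]])
value = torch.tensor([0.1, float("nan"), -0.4])  # middle entry is missing
log_p = nan_marginal_log_prob(loc, cov, value)
```

In this example the two observed coordinates happen to be uncorrelated, so `log_p` equals the sum of two univariate normal log densities with variances 2.0 and 1.5.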

My motivating example is a Bayesian multivariate linear regression model with a learned multivariate noise distribution and a partially observed response, as specified in a pandas dataframe. Each response column has its own pattern of missing entries.

Tested

unit test of NanMaskedNormal
unit test of NanMaskedMultivariateNormal
end-to-end smoke test of NanMaskedMultivariateNormal

pyro/distributions/nanmasked.py

fehiepsi · 2022-07-09T02:24:29Z

Nice, I remember that this is requested by many forum users.

martinjankowiak

lgtm. obviously there are various ways the computation could be sped up in different regimes, but this is probably most useful in the relatively low-dimensional setting anyway...

martinjankowiak · 2022-07-09T18:21:47Z

pyro/distributions/nanmasked.py

+        result = value.new_zeros(n)
+
+        # Evaluate ok elements.
+        for pattern in sorted(set(map(tuple, ok.tolist()))):


oh i thought you were computing one big marginalized covariance with 0s/1s where appropriate so that everything could be vectorized (no for loop)

fritzo · 2022-07-10

😄 that's beyond my linear algebra skills / patience. In practice I'm working with 3 columns so there are at most 7 patterns.
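To make the pattern count concrete, here is a small pure-Python sketch of the grouping idiom from the diff. The mask below is made-up data, and a plain list of lists stands in for `ok.tolist()`. Rows sharing a missingness pattern fall into one group, so the loop runs once per distinct pattern rather than once per row.

```python
from itertools import product

# Hypothetical observation mask: True where a response entry is observed.
ok = [
    [True, True, True],
    [True, False, True],
    [True, True, True],
    [False, False, True],
    [True, False, True],
]

# Same idiom as in the diff: one loop iteration per distinct pattern.
patterns = sorted(set(map(tuple, ok)))

# Rows that share a pattern can be evaluated together in one batched call.
rows_by_pattern = {
    pattern: [i for i, row in enumerate(ok) if tuple(row) == pattern]
    for pattern in patterns
}

# With d columns there are 2**d possible patterns; excluding the fully
# missing one leaves 2**d - 1 patterns with at least one observed entry.
d = 3
n_partial = sum(1 for p in product([False, True], repeat=d) if any(p))
```

Here five rows collapse to three patterns, and with `d = 3` the count of patterns that have at least one observed entry is `2**3 - 1 = 7`, matching the figure quoted above.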

martinjankowiak · 2022-07-09T18:22:47Z

pyro/distributions/nanmasked.py

+            ok_value = value[row_mask][:, col_mask]
+            ok_loc = loc[row_mask][:, col_mask]
+            ok_cov = cov[row_mask][:, col_mask][:, :, col_mask]
+            marginal = MultivariateNormal(ok_loc, ok_cov, validate_args=False)


do these invocations not need covariance_matrix=?

i guess one nice thing about this pattern is that you don't need to worry about factors of log 2pi explicitly...

fritzo · 2022-07-10

covariance_matrix is the first positional argument after loc, so no kwarg is necessary.
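A quick way to check both points in this thread, using `torch.distributions.MultivariateNormal` directly (the numbers are arbitrary illustration values): the positional and keyword forms give the same log density, and that density already contains the `log 2pi` normalizer that a hand-written formula would have to include.

```python
import math

import torch
from torch.distributions import MultivariateNormal

loc = torch.zeros(2)
cov = torch.tensor([[1.0, 0.2], [0.2, 0.5]])
x = torch.tensor([0.3, -0.1])

# Second positional argument is interpreted as covariance_matrix.
positional = MultivariateNormal(loc, cov).log_prob(x)
keyword = MultivariateNormal(loc, covariance_matrix=cov).log_prob(x)

# Hand-written density for comparison, with the log(2*pi) term explicit.
diff = x - loc
manual = -0.5 * (
    len(x) * math.log(2 * math.pi)
    + torch.logdet(cov)
    + diff @ torch.linalg.solve(cov, diff)
)
```

All three values agree up to floating-point error, which is why delegating to the marginal distribution avoids bookkeeping the dimension-dependent constant for each missingness pattern.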

* Implement NanMaskedNormal, NanMaskedMultivariateNormal
* Fix test
* Add test for fully-unobserved data

Implement NanMaskedNormal, NanMaskedMultivariateNormal

7a7455d

fritzo added the enhancement label Jul 9, 2022

Fix test

6669a3a

fritzo added the awaiting review label Jul 9, 2022

fehiepsi reviewed Jul 9, 2022

Add test for fully-unobserved data

b3732cd

martinjankowiak approved these changes Jul 9, 2022

martinjankowiak merged commit 38facc1 into dev Jul 10, 2022

martinjankowiak deleted the nan-masked branch July 10, 2022 17:08

OlaRonning pushed a commit to aleatory-science/pyro that referenced this pull request Aug 2, 2022

Implement NanMaskedNormal, NanMaskedMultivariateNormal (pyro-ppl#3116)

34cb317

