Add unnormalized distributions to the language #63

VMatthijs · 2019-02-25T21:29:56Z

Based on discussions with @seantalts and @bob-carpenter :

For each distribution, e.g. poisson_lpmf, make the compiler accept the function name poisson_unnormalized_lpmf as well.
This should code gen to poisson_lpmf<true>.
This allows us to treat
x ~ poisson(42);
as syntactic sugar for
target += poisson_unnormalized_lpmf(x | 42);
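
To make the role of the template flag concrete, here is a minimal C++ sketch of a density with a boolean `propto` parameter. It is a hypothetical stand-in, not the Stan Math implementation: it assumes `lambda` is the only quantity that counts as a parameter, so the only term that can be dropped is `-log(x!)`. The real library decides term by term from the argument types.

```cpp
#include <cmath>

// Hypothetical stand-in for a density with a propto template flag.
// Assumption: lambda is the only parameter, so terms that do not
// involve lambda are constants and may be dropped when propto is true.
template <bool propto>
double poisson_lpmf(int x, double lambda) {
  // Terms that depend on lambda are always kept.
  double lp = x * std::log(lambda) - lambda;
  if (!propto) {
    // log(x!) does not depend on lambda; only the normalized form pays for it.
    lp -= std::lgamma(x + 1.0);
  }
  return lp;
}
```

Under this assumption, `poisson_lpmf<false>(x, lambda) - poisson_lpmf<true>(x, lambda)` is the same for every `lambda`, which is why a sampler that only needs the density up to a constant can use the cheaper form.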

This is related to issue stan-dev/stan#1299 .

Also make sure that algebraic optimizations like bernoulli_lpmf(x | inv_logit(theta)) -> bernoulli_logit_lpmf(x | theta) and automatic vectorization optimizations apply to these unnormalized distributions as well.
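
As an illustration of why the bernoulli rewrite is worth preserving, the following C++ sketch compares the two forms numerically. The function bodies are simplified scalar stand-ins written for this example, not the Stan Math code; only the function names come from the rewrite rule above.

```cpp
#include <cmath>

// Simplified scalar stand-ins, for illustration only.
double inv_logit(double theta) { return 1.0 / (1.0 + std::exp(-theta)); }

// log Bernoulli(x | p), parameterized by the probability p.
double bernoulli_lpmf(int x, double p) {
  return x == 1 ? std::log(p) : std::log1p(-p);
}

// log(1 + exp(a)), arranged so that exp never overflows.
double log1p_exp(double a) {
  return a > 0 ? a + std::log1p(std::exp(-a)) : std::log1p(std::exp(a));
}

// log Bernoulli(x | inv_logit(theta)), computed directly on the logit scale.
double bernoulli_logit_lpmf(int x, double theta) {
  return x == 1 ? -log1p_exp(-theta) : -log1p_exp(theta);
}
```

For moderate `theta` the two forms agree to rounding error. For large `theta`, `inv_logit(theta)` rounds to exactly 1 in double precision, so the composed form returns negative infinity for `x == 0`, while the logit form stays finite.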

seantalts · 2019-05-31T17:55:07Z

Just hitting this now - should this actually be inverted? We have a propto__ type parameter to log_prob, and it seems like we sometimes explicitly want to set it to false, rather than sometimes explicitly setting it to true (false is the expensive, more accurate case). So perhaps poisson_normalized_lpmf? Or what about poisson_lpmf_normalized?

bob-carpenter · 2019-05-31T18:05:49Z

_lpmf should remain the suffix to keep the name mangling rules and doc simple.

I think we should support marked cases and leave the unmarked cases the same. So the following two groups would contain equivalent statements, leading off with our existing forms.

y ~ foo(theta);

y ~ foo_propto(theta);
target += foo_propto_lpdf(y | theta);

target += foo_lpdf(y | theta);

y ~ foo_normalized(theta);
target += foo_normalized_lpdf(y | theta);

You can think of the first statement in each of these two groups as syntactic sugar and the ~ statements can now be fully defined as syntactic sugar for target +=.
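
The two groups can be read as a small name-resolution rule: an explicit marker wins, and otherwise the statement form picks the default. The C++ sketch below is only an illustration of that rule. The `Resolved` struct and the `resolve` and `strip_suffix` helpers are hypothetical names, not part of the compiler, and the sketch assumes the `_lpdf`/`_lpmf` suffix has already been removed from the name.

```cpp
#include <string>

// Hypothetical result of resolving a distribution name as written by the user.
struct Resolved {
  std::string dist;  // bare distribution name, e.g. "foo"
  bool propto;       // true: constants may be dropped; false: fully normalized
};

// Remove `marker` from the end of `name` if present; report whether it was there.
static bool strip_suffix(std::string& name, const std::string& marker) {
  if (name.size() > marker.size() &&
      name.compare(name.size() - marker.size(), marker.size(), marker) == 0) {
    name.erase(name.size() - marker.size());
    return true;
  }
  return false;
}

// `name` is "foo", "foo_propto", or "foo_normalized" (no _lpdf/_lpmf suffix).
// `tilde` is true for `y ~ name(theta)` and false for a `target +=` statement.
Resolved resolve(std::string name, bool tilde) {
  if (strip_suffix(name, "_propto")) return {name, true};
  if (strip_suffix(name, "_normalized")) return {name, false};
  // Unmarked forms keep today's behavior: ~ is propto, target += is normalized.
  return {name, tilde};
}
```

A real implementation would also have to handle a distribution whose own name happens to end in one of the markers, which this sketch ignores.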

I could be convinced that propto should be replaced with unnormalized, but it's a handful to type.

seantalts · 2019-05-31T18:16:41Z

Oh, for supporting these for users? Sure, makes sense.

I'm having trouble parsing the two groups. Are you saying y ~ foo(theta) = target += foo_lpdf(y | theta) and so on, line 1 = line 1, line 2 = line 2, line 3 = line 3? Or also that line 1 = line 2 = line 3?

seantalts · 2019-05-31T18:45:43Z

@VMatthijs likes "_unnormalized" and Bob likes "_propto." Unfortunately I don't really care - propto is nice in that it's what we use in code gen.

VMatthijs · 2019-05-31T18:49:07Z

I don't really care either. Let's go with _propto.

VMatthijs · 2019-05-31T18:50:16Z

I don't think these have been exposed to the language yet. They're in the backend now, but the type checker should still be made to accept them.

seantalts · 2019-05-31T18:50:42Z

Yep, sorry about that.

bob-carpenter · 2019-05-31T21:18:45Z

Our current y ~ foo(theta) is propto/unnormalized. Our current target += foo_lpdf(y | theta) is normalized. The ones grouped together are equivalent. I don't really care so much about propto vs. unnormalized. I can see arguments either way. We should ask a wider group of statisticians about what they think is natural.

seantalts · 2019-05-31T21:27:49Z

Agreed! I'll write up a discourse post.

seantalts · 2019-05-31T21:33:42Z

poll! https://discourse.mc-stan.org/t/statisticians-propto-vs-unnormalized-vs/9054

VMatthijs · 2019-06-03T12:04:34Z

Just a reminder to ourselves that whatever ends up being implemented should be documented in the wiki. (And then in the manual when we are ready to release.)

seantalts · 2019-06-03T16:26:43Z

Does anyone remember why we wanted to expose this to the end users in the language? Was there a motivating use-case for when you needed to use target += foo but didn't want it to be normalized?

VMatthijs · 2019-06-03T16:35:01Z

@bob-carpenter , weren't you concerned about the normalizing constant of a gamma distribution being expensive to compute or something like that?

bob-carpenter · 2019-06-03T19:40:34Z

Yes, in general there's a lot of expensive computation that's avoided if we can do unnormalized/propto computations. But then in some contexts, like mixtures, we need the normalized forms. Also, if we want to reduce sampling statements to target increment statements, then we need this flexibility to be able to actually implement the sampling statements as they're currently implemented (unnormalized, that is).
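
One way to see the mixture point is a two-component normal mixture whose scales are fixed data. The C++ sketch below is an illustration under that assumption, with simplified stand-in functions and an arbitrary mixing weight of 0.3. With `propto` set, each component drops its own `-log(sigma)` term, and because the components have different scales, the dropped amounts no longer cancel inside the mixture.

```cpp
#include <cmath>

const double kHalfLog2Pi = 0.9189385332046727;  // 0.5 * log(2 * pi)

// Normal log density with sigma treated as fixed data (an assumption of this
// sketch). When propto is true, the terms not involving mu are dropped.
template <bool propto>
double normal_lpdf(double y, double mu, double sigma) {
  const double z = (y - mu) / sigma;
  double lp = -0.5 * z * z;
  if (!propto) lp -= std::log(sigma) + kHalfLog2Pi;
  return lp;
}

// log(lambda * exp(lp1) + (1 - lambda) * exp(lp2)), computed stably.
double log_mix(double lambda, double lp1, double lp2) {
  const double a = std::log(lambda) + lp1;
  const double b = std::log1p(-lambda) + lp2;
  const double m = a > b ? a : b;
  return m + std::log(std::exp(a - m) + std::exp(b - m));
}

// Mixture of normal(mu1, 1) and normal(mu2, 5) with weight 0.3 on the first.
template <bool propto>
double mixture_lp(double y, double mu1, double mu2) {
  return log_mix(0.3, normal_lpdf<propto>(y, mu1, 1.0),
                 normal_lpdf<propto>(y, mu2, 5.0));
}
```

For a single component, `normal_lpdf<false> - normal_lpdf<true>` is the same for every `mu`, so dropping it is harmless. For the mixture, `mixture_lp<false> - mixture_lp<true>` changes with `mu1`, so the propto form is not the same target up to a constant.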

seantalts · 2019-06-03T19:43:24Z

> Also, if we want to reduce sampling statements to target increment statements, then we need this flexibility to be able to actually implement the sampling statements as they're currently implemented (unnormalized, that is).

We don't actually need this exposed to end users to do that, right?

> Yes, in general there's a lot of expensive computation that's avoided if we can do unnormalized/propto computations. But then in some contexts, like mixtures, we need the normalized forms.

Agreed, but are there cases where you can't use ~ but need to get the propto=true version?

In other words, are there specific motivating examples for exposing this to end users of the Stan language?

rok-cesnovar · 2020-06-17T18:05:00Z

Superseded by #253 and #541. Closing.

VMatthijs added the feature New feature or request label Feb 25, 2019

seantalts mentioned this issue May 31, 2019

Fix #63, which fixed low_dim_gauss_mix #144

Merged

seantalts closed this as completed in e1c70ed May 31, 2019

seantalts added a commit that referenced this issue May 31, 2019

Merge pull request #144 from stan-dev/code-gen9

13f859d

Fix #63, which fixed low_dim_gauss_mix

VMatthijs reopened this May 31, 2019

VMatthijs mentioned this issue May 31, 2019

Fixing semantic check to accept _propto distributions #145

Merged

rok-cesnovar mentioned this issue May 8, 2020

allow access to propto template param as int in functions ending in _log / _lpdf / _lpmf stan-dev/stan#1299

Closed

2 tasks

rok-cesnovar closed this as completed Jun 17, 2020
