Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added chapter on proportionality constants (Issue #287) #288

Merged
merged 5 commits into from Mar 4, 2021

Conversation

bbbales2
Copy link
Member

Submission Checklist

  • Builds locally
  • Declare copyright holder and open-source license: see below

Summary

@rok-cesnovar @jgabry Can both of you have a look at this?

I'm not sold on the organization yet, and we need to make sure we really want to do this as a separate chapter (the chapter titles need to stay compatible into the future cause this is how links pointing to the old docs are forwarded to the new docs)

I was realizing when I wrote this that maybe I should have done the reference manual version of this first. I'm not sure how much of what I wrote here is duplicate or what.

Copyright and Licensing

Issue: #287

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Columbia University

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

@rok-cesnovar rok-cesnovar self-requested a review November 17, 2020 21:31
Copy link
Member

@rok-cesnovar rok-cesnovar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review!


## User-defined Distributions

While it is possible to define custom distributions in Stan, it is not yet
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure we need to say "not yet". No one is currently thinking about/working on allowing these. We might end up adding it soon, but if we don't this will be a dangling "soon" :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just mention that currently you always define a UDF as normalized but can call it unnormalized. Calling it as unnormalized will however only help with any calls to Stan-defined lpdfs/lpmfs.

@bbbales2
Copy link
Member Author

@rok-cesnovar @jgabry actually I forgot to ask.

proportionality constants/unnormalized distributions/additive constants (for the log)

What lingo should I be leaning towards? These are all kinda referring to the same thing, but it might be confusing to carelessly swap between all three (which I do).

@rok-cesnovar
Copy link
Member

Not a native speaker nor a statistician by profession, so I have no clue which one is preffered so will let Jonah take this one.

@jgabry
Copy link
Member

jgabry commented Nov 19, 2020

proportionality constants/unnormalized distributions/additive constants (for the log)

What lingo should I be leaning towards? These are all kinda referring to the same thing, but it might be confusing to carelessly swap between all three (which I do).

Yeah that's a good question. Maybe at the beginning just mention that all of these terms will be used? I think the reality is that it's pretty natural to use all of these terms and people will see all of them in different places already. Maybe just one sentence towards the beginning (if it's not there already) that connects these three concepts to make sure the reader is prepared, i.e., unnormalized distributions are those without the constant term and when working on the log scale (like Stan does) that results in additive rather than multiplicative constant.

That said, if you want to just pick one way to talk about it then that's fine too, but I think it will be tricky!

Comment on lines +5 to +6
compute the functions up to a proportionality constant (or similarly compute
log densities up to an additive constant). In MCMC this comes from the fact that
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In relation to the question you asked about which terminology to use, I think you've done a good job here! You use both "proportionality constant" and "additive constant" but you make it clear that additive is when working on the log scale.

MCMC, variational inference, or optimization, it is usually only necessary to
compute the functions up to a proportionality constant (or similarly compute
log densities up to an additive constant). In MCMC this comes from the fact that
the distribution being sampled does not need to be normalized (similarly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also in relation to your question about terminology, maybe just make the connection between "normalized" and the "constants" that you previously mentioned explicit here. Then I think you can freely use all the terms below and not worry about it.

Comment on lines 11 to 13
There are three different syntaxes to work with distributions in Stan. The way
to select between them is by determining if the proportionality constants are
necessary.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's worth mentioning the speed issue here. Because really the user shouldn't even bother thinking about using the unnormalized versions if they're not worried about the computation time, right? Plenty of models are fast enough that the user could ignore this and save themselves the time understanding the differences between the different versions. What do you think?


```
x ~ normal(0, 1);
target += normal_lupdf(x | 0, 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first time we show this syntax should we say that "u" stands for "unnormalized" to make the connection between the name and the purpose clear?

@jgabry
Copy link
Member

jgabry commented Nov 19, 2020

In addition to my other comments: thanks for working on this!

@bbbales2
Copy link
Member Author

Edits made. I'm still not totally happy with how this renders (there's a separate page for each section and it's not obvious how they feed into each other).

As a follow up, where in the reference manual should we add stuff? May as well be part of this pull request. I guess I add a couple sections here: https://mc-stan.org/docs/2_25/reference-manual/increment-log-prob-section.html ?

@jgabry
Copy link
Member

jgabry commented Nov 20, 2020

Edits made. I'm still not totally happy with how this renders (there's a separate page for each section and it's not obvious how they feed into each other).

Is whether it creates a separate page or not determined by the header level (is that the right terminology?), that is how many # there are?

As a follow up, where in the reference manual should we add stuff? May as well be part of this pull request. I guess I add a couple sections here: https://mc-stan.org/docs/2_25/reference-manual/increment-log-prob-section.html ?

Yeah that seems like a good spot. Maybe also a very quick mention of lupdf in the section after that on sampling statements since they're equivalent?

@bbbales2
Copy link
Member Author

I added docs to the reference manual for lupdf and lupmf. @rok-cesnovar @jgabry have a look at these.

@@ -156,7 +156,9 @@ or `add`, but it may be named `pi` or `e`.
Variable names will also conflict with the names of distributions
suffixed with `_lpdf`, `_lpmf`, `_lcdf`, and `_lccdf`, `_cdf`, and
`_ccdf`, such as `normal_lcdf_log`; this also holds for the deprecated
forms `_log`, `_cdf_log`, and `_ccdf_log`.
forms `_log`, `_cdf_log`, and `_ccdf_log`. No user-defined variable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, we need to revise this section actually. As of 2.25 Stan Math function names are allowed as identifiers. The only restriction is that identifier can not end with _lupdf/_lupmf

So allowed: add, normal_lpdf, sum,...
Not allowed: normal_lupdf, p_lupmf

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to push directly into this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do!

Copy link
Member

@jgabry jgabry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a few comments, but this looks good. Thanks @bbbales2!

src/reference-manual/statements.Rmd Outdated Show resolved Hide resolved
The `normal_lupdf` function returns the log density of an unnormalized distribution.
With the unnormalized version of the function, Stan does not define what the
normalization constant will be, though usually as many terms as possible are dropped
to make the calculation fast. One possible definition of `normal_lupdf` is:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why "One possible definition of" and not "The definition of"? Is this saying that Stan has multiple possible versions of normal_lupdf?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this to try to make it less ambiguous

src/reference-manual/statements.Rmd Show resolved Hide resolved
Comment on lines 257 to 258
Other `_lupdf` and `_lupmf` functions used in the definition of
`foo_lpdf` will drop additive constants. If there are no `_lupdf`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might just be confused (as is often the case!) but it says just above that

if any other unnormalized density functions are used inside the user-defined function, the _lpdf and _lpmf forms of the user-defined function will change these densities to be normalized

Does that contradict this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is right but I rewrote it to make it clearer.

@bbbales2
Copy link
Member Author

@jgabry I must have just forgotten documentation existed sometime in December. This is the second piece of nearly finished doc that was just chilling.

@bbbales2 bbbales2 merged commit d8800a6 into master Mar 4, 2021
@WardBrian WardBrian deleted the feature/lupxf_proportionality_constants branch March 28, 2022 13:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants