New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added chapter on proportionality constants (Issue #287) #288
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review!
|
||
## User-defined Distributions | ||
|
||
While it is possible to define custom distributions in Stan, it is not yet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure we need to say "not yet". No one is currently thinking about/working on allowing these. We might end up adding it soon, but if we don't this will be a dangling "soon" :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just mention that currently you always define a UDF as normalized but can call it unnormalized. Calling it as unnormalized will however only help with any calls to Stan-defined lpdfs/lpmfs.
@rok-cesnovar @jgabry actually I forgot to ask. proportionality constants/unnormalized distributions/additive constants (for the log) What lingo should I be leaning towards? These are all kinda referring to the same thing, but it might be confusing to carelessly swap between all three (which I do). |
Not a native speaker nor a statistician by profession, so I have no clue which one is preffered so will let Jonah take this one. |
Yeah that's a good question. Maybe at the beginning just mention that all of these terms will be used? I think the reality is that it's pretty natural to use all of these terms and people will see all of them in different places already. Maybe just one sentence towards the beginning (if it's not there already) that connects these three concepts to make sure the reader is prepared, i.e., unnormalized distributions are those without the constant term and when working on the log scale (like Stan does) that results in additive rather than multiplicative constant. That said, if you want to just pick one way to talk about it then that's fine too, but I think it will be tricky! |
compute the functions up to a proportionality constant (or similarly compute | ||
log densities up to an additive constant). In MCMC this comes from the fact that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In relation to the question you asked about which terminology to use, I think you've done a good job here! You use both "proportionality constant" and "additive constant" but you make it clear that additive is when working on the log scale.
MCMC, variational inference, or optimization, it is usually only necessary to | ||
compute the functions up to a proportionality constant (or similarly compute | ||
log densities up to an additive constant). In MCMC this comes from the fact that | ||
the distribution being sampled does not need to be normalized (similarly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also in relation to your question about terminology, maybe just make the connection between "normalized" and the "constants" that you previously mentioned explicit here. Then I think you can freely use all the terms below and not worry about it.
There are three different syntaxes to work with distributions in Stan. The way | ||
to select between them is by determining if the proportionality constants are | ||
necessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it's worth mentioning the speed issue here. Because really the user shouldn't even bother thinking about using the unnormalized versions if they're not worried about the computation time, right? Plenty of models are fast enough that the user could ignore this and save themselves the time understanding the differences between the different versions. What do you think?
|
||
``` | ||
x ~ normal(0, 1); | ||
target += normal_lupdf(x | 0, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first time we show this syntax should we say that "u" stands for "unnormalized" to make the connection between the name and the purpose clear?
In addition to my other comments: thanks for working on this! |
Edits made. I'm still not totally happy with how this renders (there's a separate page for each section and it's not obvious how they feed into each other). As a follow up, where in the reference manual should we add stuff? May as well be part of this pull request. I guess I add a couple sections here: https://mc-stan.org/docs/2_25/reference-manual/increment-log-prob-section.html ? |
Is whether it creates a separate page or not determined by the header level (is that the right terminology?), that is how many
Yeah that seems like a good spot. Maybe also a very quick mention of lupdf in the section after that on sampling statements since they're equivalent? |
I added docs to the reference manual for |
@@ -156,7 +156,9 @@ or `add`, but it may be named `pi` or `e`. | |||
Variable names will also conflict with the names of distributions | |||
suffixed with `_lpdf`, `_lpmf`, `_lcdf`, and `_lccdf`, `_cdf`, and | |||
`_ccdf`, such as `normal_lcdf_log`; this also holds for the deprecated | |||
forms `_log`, `_cdf_log`, and `_ccdf_log`. | |||
forms `_log`, `_cdf_log`, and `_ccdf_log`. No user-defined variable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, we need to revise this section actually. As of 2.25 Stan Math function names are allowed as identifiers. The only restriction is that identifier can not end with _lupdf/_lupmf
So allowed: add, normal_lpdf, sum,...
Not allowed: normal_lupdf, p_lupmf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to push directly into this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a few comments, but this looks good. Thanks @bbbales2!
src/reference-manual/statements.Rmd
Outdated
The `normal_lupdf` function returns the log density of an unnormalized distribution. | ||
With the unnormalized version of the function, Stan does not define what the | ||
normalization constant will be, though usually as many terms as possible are dropped | ||
to make the calculation fast. One possible definition of `normal_lupdf` is: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why "One possible definition of" and not "The definition of"? Is this saying that Stan has multiple possible versions of normal_lupdf
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this to try to make it less ambiguous
Other `_lupdf` and `_lupmf` functions used in the definition of | ||
`foo_lpdf` will drop additive constants. If there are no `_lupdf` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might just be confused (as is often the case!) but it says just above that
if any other unnormalized density functions are used inside the user-defined function, the
_lpdf
and_lpmf
forms of the user-defined function will change these densities to be normalized
Does that contradict this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is right but I rewrote it to make it clearer.
@jgabry I must have just forgotten documentation existed sometime in December. This is the second piece of nearly finished doc that was just chilling. |
Submission Checklist
Summary
@rok-cesnovar @jgabry Can both of you have a look at this?
I'm not sold on the organization yet, and we need to make sure we really want to do this as a separate chapter (the chapter titles need to stay compatible into the future cause this is how links pointing to the old docs are forwarded to the new docs)
I was realizing when I wrote this that maybe I should have done the reference manual version of this first. I'm not sure how much of what I wrote here is duplicate or what.
Copyright and Licensing
Issue: #287
Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Columbia University
By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses: