Accuracy of normal_lccdf much worse than normal_lcdf #1985

Open
lciti opened this issue Jul 25, 2020 · 8 comments
Labels
distributions (Issues that deal with distribution functions: pdf, pmf, cdf), numerics (Numerical issues)

Comments

@lciti commented Jul 25, 2020

Description

The implementation of normal_lcdf was improved significantly in #1284, which introduced a polynomial approximation of the Mills ratio based on Cody (1969). It looks like normal_lccdf has not been updated accordingly and still uses the older, much less accurate approach.

Incidentally, I also noticed that normal_lcdf and std_normal_lcdf implement the same solution with very similar code. I wonder if one could retain the current implementation of std_normal_lcdf and write std_normal_lccdf, normal_lcdf and normal_lccdf to simply call std_normal_lcdf with appropriate parameters (-x, (x-mu)/sigma, (mu-x)/sigma, respectively).
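
For concreteness, here is a rough double-only sketch of the delegation I have in mind (the log_Phi helper, the names, and the signatures are purely illustrative, not the actual Stan Math templates; the naive placeholder body is exactly the kind of thing the accurate approximation would replace):

```cpp
#include <cmath>

// Placeholder for the accurate std_normal_lcdf implementation
// (Cody-based approximation). The naive erfc/log formula below is only
// for illustration; it loses accuracy in the left tail.
double log_Phi(double z) {
  return std::log(0.5 * std::erfc(-z / std::sqrt(2.0)));
}

// The other three functions would just forward with an affine-transformed
// argument (illustrative names, not the real signatures).
double std_normal_lccdf_sketch(double x) { return log_Phi(-x); }
double normal_lcdf_sketch(double y, double mu, double sigma) {
  return log_Phi((y - mu) / sigma);
}
double normal_lccdf_sketch(double y, double mu, double sigma) {
  return log_Phi((mu - y) / sigma);
}
```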

Current Version:

v3.2.0

@bbbales2 (Member)

@PhilClemson what do you think?

@PhilClemson (Contributor)

Sorry, this is something that was mentioned in the original pull request for the normal_lcdf changes, but it seems it was forgotten about.

@nhuurre mentioned that the functions could be standardised in a log_Phi function (#1411 (comment)) but this would require separate numerical approximations.

It seems std_normal_lcdf was added by @mcol to follow the structure of the other probability distributions. I guess this might be a wider issue, since there's also the std_normal sampling statement within Stan?

In any case, I don't think there's anything to stop us from adding the improvements to normal_lccdf, so that should probably be done as a minimum.

@bob-carpenter (Contributor) commented Jul 27, 2020

> call std_normal_lcdf with appropriate parameters

What parameters? As a distribution, std-normal doesn't have any parameters. It's just

std_normal_lpdf(y) == normal_lpdf(y | 0, 1).

> Accuracy of normal_lccdf much worse than normal_lcdf

lccdf should be more accurate when the cdf is near 1, and the lcdf should be more accurate when the cdf is near 0.

Most of our lcdf and lccdf functions are naive log(cdf) and log(1 - cdf) implementations, so the lccdf form is actually terrible when the cdf is near 1, rather than better.
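
For example, in plain C++ (not our code), the naive log(1 - cdf) form already returns -inf at x = 10, even though the true value is a perfectly representable -53.2 or so:

```cpp
#include <cmath>
#include <cstdio>

int main() {
  double x = 10;
  double cdf = 0.5 * std::erfc(-x / std::sqrt(2.0));   // rounds to exactly 1.0
  // naive lccdf = log(1 - cdf): 1 - cdf cancels to 0, so the log is -inf
  std::printf("log(1 - cdf)       = %g\n", std::log(1 - cdf));   // -inf
  // computing the upper tail directly and taking its log is fine
  double ccdf = 0.5 * std::erfc(x / std::sqrt(2.0));    // ~7.6e-24
  std::printf("log(ccdf directly) = %g\n", std::log(ccdf));      // ~-53.2
}
```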

@lciti (Author) commented Jul 27, 2020

@bob-carpenter Sorry, I should have been clearer. I meant "parameters" in a CS sense, as in arguments. There could be a common C++ function (which @nhuurre called log_Phi) implementing the log CDF of a standard normal, and all four relevant Stan functions could call that same C++ function under the hood with appropriate arguments. For example, std_normal_lccdf could call this function with argument -x.

This is a log CDF, so I am not sure your argument about the accuracy near 0 or 1 really applies. The domain is not (0, 1) but (-inf, 0), and we do not need to sacrifice accuracy on either side. The problem with the CDF and CCDF is that when they are near 1, the significant digits are pushed out of the significand by the "big" (implicit) one on the left. This is not the case for the log CDF or log CCDF: where the CDF (or CCDF) is close to zero, its log is large and negative and retains all its significant digits if computed right (as in @PhilClemson's implementation); where the CDF (or CCDF) is near one, its log is near zero and can also retain all its significant digits if implemented right (as it does in @PhilClemson's implementation). If we computed log(CDF(x)) or log(1 - CDF(x)), then it would be a problem.
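
To make that concrete with plain C++ (not the Stan Math code): at x = 10 the CDF rounds to 1, so taking the log of the rounded CDF loses everything, while computing the log CDF from the upper tail keeps full relative accuracy:

```cpp
#include <cmath>
#include <cstdio>

int main() {
  double x = 10;                                        // CDF(x) is very close to 1
  double ccdf = 0.5 * std::erfc(x / std::sqrt(2.0));    // upper tail, ~7.6e-24
  // log(CDF) from the rounded CDF: all information is lost
  std::printf("log(1 - ccdf) = %g\n", std::log(1 - ccdf));   // 0 exactly
  // log(CDF) as log1p(-ccdf): full relative accuracy
  std::printf("log1p(-ccdf)  = %g\n", std::log1p(-ccdf));    // ~-7.6e-24
}
```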

The current implementation of std_normal_lcdf(x) works extremely well. I am currently writing my Stan code using std_normal_lcdf(-x) (which is no big deal) but I would normally write it as std_normal_lccdf(x), which is simpler (I also have to leave a comment for myself to explain why I am doing the former rather than the latter so I don't risk changing it in the future).

@lciti (Author) commented Jul 27, 2020

@PhilClemson

> @nhuurre mentioned that the functions could be standardised in a log_Phi function (#1411 (comment)) but this would require separate numerical approximations.

I think the caveat "this would require separate numerical approximations" only applies when you try to extend log_Phi to non-normal distributions (e.g. exp_mod_normal_lpdf, etc.). Within the four functions std_normal_lcdf, std_normal_lccdf, normal_lcdf and normal_lccdf, the same approximation works fine, since their arguments are just affine transformations of one another.

@bob-carpenter (Contributor) commented Jul 27, 2020

> I meant "parameters" in a CS sense, as in arguments.

This gets confusing in stats. But whatever you want to call it, std_normal_lpdf is a function from real numbers to real numbers.

The move from std_normal to normal is the usual location/scale generalization. That's also used for most of our other univariate distributions, though we don't have std_logistic or std_student_t.

The whole point of having std_normal was to have a more efficient way of doing things that didn't involve subtracting 0 and dividing by 1. So presumably we're talking about converting normal to std_normal by (y - mu) / sigma, not the other way around. That will get the right density for the normal, but the derivatives are the bigger concern. We don't want to just autodiff (y - mu) / sigma.
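
For reference, the partials have a simple closed form in terms of an inverse-Mills-style ratio; a double-only sketch (illustrative names, not how our reverse-mode code is organized, and the naive ratio below is itself unstable in the far left tail, which is exactly where the careful approximation matters):

```cpp
#include <cmath>

// Value and partials of normal_lcdf(y | mu, sigma), with
// z = (y - mu) / sigma and r = phi(z) / Phi(z).
struct NormalLcdfGrad {
  double val, d_y, d_mu, d_sigma;
};

NormalLcdfGrad normal_lcdf_with_grad(double y, double mu, double sigma) {
  const double inv_sqrt_2pi = 0.3989422804014327;
  const double z = (y - mu) / sigma;
  const double Phi = 0.5 * std::erfc(-z / std::sqrt(2.0));
  const double phi = inv_sqrt_2pi * std::exp(-0.5 * z * z);
  const double r = phi / Phi;  // needs a tail-stable formulation in real code
  // d/dy log Phi(z) = r / sigma, d/dmu = -r / sigma, d/dsigma = -z * r / sigma
  return {std::log(Phi), r / sigma, -r / sigma, -z * r / sigma};
}
```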

> I am not sure your argument about the accuracy near 0 or 1 really applies.

As the cdf approaches 0, log(cdf) approaches -infinity and log(1 - cdf) approaches 0. So you're right that we should be able to get decent accuracy both ways. I was tied up in our current implementations, which are a mess.

Edit: I fat-fingered submitting this before finishing.

We do need the complementary forms for the arguments, though. Just consider a beta(y | 0.5, 0.5) distribution. Its cdf goes to 0 or 1 as y goes to 0 or 1, but you can't represent a y very close to 1 in floating point, so you have to use the lcdf or lccdf as appropriate just for the arguments. Same with any distribution truncated at 0, which needs the cdf.

So I don't think we can just drop the lcdf and lccdf distinction computationally.

@bob-carpenter (Contributor)

The normal is symmetric around the location parameter, so we can relate

normal_lcdf(y | mu, sigma) == normal_lccdf(mu + -(y - mu) | mu, sigma).
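
A quick numerical check of that identity using reference erfc-based cdfs (not the library code):

```cpp
#include <cmath>
#include <cstdio>

// Reference (non-library) normal lcdf / lccdf via erfc, just to check the
// reflection identity numerically.
double lcdf_ref(double y, double mu, double sigma) {
  return std::log(0.5 * std::erfc(-(y - mu) / (sigma * std::sqrt(2.0))));
}
double lccdf_ref(double y, double mu, double sigma) {
  return std::log(0.5 * std::erfc((y - mu) / (sigma * std::sqrt(2.0))));
}

int main() {
  const double mu = 1.5, sigma = 2.0;
  const double ys[] = {-3.0, 0.0, 2.5, 7.0};
  for (double y : ys) {
    // normal_lcdf(y | mu, sigma) == normal_lccdf(2 * mu - y | mu, sigma)
    std::printf("%.15g  %.15g\n", lcdf_ref(y, mu, sigma),
                lccdf_ref(2 * mu - y, mu, sigma));
  }
}
```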

@lciti (Author) commented Jul 27, 2020

@bob-carpenter

> So presumably we're talking about converting normal to std_normal by (y - mu) / sigma, not the other way around.

Yes.

> That will get the right density for the normal, but the derivatives are the bigger concern. We don't want to just autodiff (y - mu) / sigma.

I see.

> We do need the complementary forms for the arguments, though. Just consider a beta(y | 0.5, 0.5) distribution. Its cdf goes to 0 or 1 as y goes to 0 or 1, but you can't represent a y very close to 1 in floating point, so you have to use the lcdf or lccdf as appropriate just for the arguments. Same with any distribution truncated at 0, which needs the cdf.

I was only referring to the Gaussian case. This is particularly useful for me because I can model an exponential distribution as a Gaussian in unconstrained space and then define my constrained variable just as y=std_normal_lcdf(-x). It works much better in my case (in terms of convergence) than using the default log transformation you get with <lower=0>.
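
In case it is useful, here is a standalone plain-C++ check of the transformation I mean (note that this sketch negates the log CDF so the result is nonnegative):

```cpp
#include <cmath>
#include <cstdio>
#include <random>

int main() {
  // If x ~ Normal(0, 1), then Phi(-x) ~ Uniform(0, 1), so
  // y = -log(Phi(-x)) ~ Exponential(1).  (Note the leading minus sign.)
  std::mt19937 rng(1234);
  std::normal_distribution<double> std_normal(0.0, 1.0);
  const int n = 1000000;
  double sum = 0.0;
  for (int i = 0; i < n; ++i) {
    double x = std_normal(rng);
    double y = -std::log(0.5 * std::erfc(x / std::sqrt(2.0)));  // -log Phi(-x)
    sum += y;
  }
  std::printf("sample mean = %g (Exponential(1) has mean 1)\n", sum / n);
}
```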

@syclik added the numerics and distributions labels on Jul 31, 2020