-
-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/issue 2966 add 7 parameter ddm cdf and ccdf #3042
base: develop
Are you sure you want to change the base?
Feature/issue 2966 add 7 parameter ddm cdf and ccdf #3042
Conversation
…-7-parameter-DDM-CDF-and-CCDF merge PDF branch
merge develop with wiener_lpdf
You can continue. I changed some aspects regarding the structure. |
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few Qs about the design of some of the functions. I'm seeing a lot of if statements on conditions. It's unclear which exist as an optimization and which exist to stop NaN or Infinite.
If it's an optimization then tbh I'd rather we remove it and just have one path. Often these branches are very expensive for the compiler to take since they can change each iteration. Having one linear path makes things simpler for the compiler even if it involves a extra computation.
If there are places where there are massive saves from a branch then I'm open to it if they have greater than or less than as those are not unlikely. But the odds of a parameter being an exact value are very low so those branches have a very low odds of being taken
inline auto wiener_prob_grad_a(const T_a& a, const T_v& v, | ||
const T_w& w) noexcept { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used once so I'd inline this as well. I think it would be fine to just comment above the section
// derivative of the wiener probability w.r.t. 'a' (on log-scale)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I inlined the three partial derivatives.
*/ | ||
template <typename T_a, typename T_w, typename T_v> | ||
inline auto wiener_prob(const T_a& a, const T_v& v_value, | ||
const T_w& w_value) noexcept { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd remove the noexcept
s on these. It's unlikely, but for any function that can hold an autodiff type we call malloc
sometimes and our code to call malloc can throw an exception if the user runs out of memory. The exception is a program terminating one, but with noexcept
here the program would just crash if it hit the exception instead of reporting the out of memory issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. For which functions shall I delete the noexcept
? I now deleted it for wiener_prob
. When do we use noexcept
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You only want to use noexcept
when the function is guranteed to not throw an exception. It just tells the compiler, "I promise I cannot throw an exception here" so then the compiler can do optimizations.
For anything that uses Eigen matrices we can possibly throw an exception. It's good for simpler functions where you can say for certain there cannot be an exception thrown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. But in this wiener_prob
function I do not see a term like Eigen
or Matrix
. And in the other functions in this file neither. So, every function can keep its noexcept
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but it can call malloc and that would cause an exception
if (exponent < 0) { | ||
return ret_t(log1m_exp(exponent) - log_diff_exp(2 * v * a * w, exponent)); | ||
} else { | ||
return ret_t(log1m_exp(-exponent) - log1m_exp(2 * v * a)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm kind of confused by these cutpoints. Is this because the derivative is ill defined at certain areas or is this a math optimization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is a math optimization and should stay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The if statement here could be more more expensive than the ops that are saved. I'd remove all of these and just keep things simple. It also just becomes really hard to read and maintain with all of these if statements in the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Results are more robust when we have this case distinction. They both shall compute the same result, but when exponent < 0
then the upper case is more robust and when exponent >=0
the lower case is more robust. We could insert a comment on this to make the case distinction clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by robust here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Numerically robust.
if (fabs(v) == 0.0) { | ||
return ret_t(0.0); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the odds v
will be exactly 0? I think this path is unlikely and would remove it.
In general, the odds a parameter takes on any given single value is very low so I would remove if statements like this. 99.9% of the time they won't be taken and forces the compiler to do a check and branch that is often unnecessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The user can call the model with a fixed value for v. In some cases the user could want to set v=0
for his tests. Then, v
will not be estimated and is constant at 0. For this case we need these if-statements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a value of 0 for v cause the rest of the code to error out? If not then remove these if statements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed all spots where it is not necessary. One spot is still needed in wiener_prob
to not cause an error.
inline auto wiener_prob_derivative_term(const T_a& a, const T_v& v_value, | ||
const T_w& w_value) noexcept { | ||
using ret_t = return_type_t<T_a, T_w, T_v>; | ||
const auto exponent_m1 = log1p(-1.1 * 1.0e-8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does this hard coded value come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This hard coded value is connected to the internal precision of this computation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it have 1e-8 precision? I'm asking where that number comes from
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, I have to correct myself.
This term serves also numerical stability. We test whether the exponents are larger than this small value. If this is the case, then they are negative and very near to 0. In the limit, when the exponents go to 0, the result is -w
. As we later have to divide by the exponents, we would have to divide by nealry zero. Therefore, we do this check and return -w
for exponents very near to zero to make compuatations more stable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to have to tag @bob-carpenter here as I'm unsure of how we usually handle things like this.
Bob the value exponent_m1
is essentially -1.1e-08
. When we do the calculations starting on line 62 we check that each of the exponents are greater than this value. If any of those values are less than -1.1e-08
then the code returns -w
.
As we later have to divide by the exponents, we would have to divide by nearly zero.
Sorry where in this code is the divide happening? Everything is on the log scale here so division is just turning into subtraction
This comment was marked as off-topic.
This comment was marked as off-topic.
@SteveBronder: Is the code better now? |
This comment was marked as off-topic.
This comment was marked as off-topic.
merge develop with new hcubature
This comment was marked as off-topic.
This comment was marked as off-topic.
Hey @SteveBronder, is the code ok now or shall I change some more things? |
Hey @SteveBronder, which changes would you like to have before continuing? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had another look today. @andrjohns can you look at this code? I'm not super familiar with why a lot of the branches here exist so it's pretty difficult for me to review
inline auto wiener_prob_derivative_term(const T_a& a, const T_v& v_value, | ||
const T_w& w_value) noexcept { | ||
using ret_t = return_type_t<T_a, T_w, T_v>; | ||
const auto exponent_m1 = log1p(-1.1 * 1.0e-8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to have to tag @bob-carpenter here as I'm unsure of how we usually handle things like this.
Bob the value exponent_m1
is essentially -1.1e-08
. When we do the calculations starting on line 62 we check that each of the exponents are greater than this value. If any of those values are less than -1.1e-08
then the code returns -w
.
As we later have to divide by the exponents, we would have to divide by nearly zero.
Sorry where in this code is the divide happening? Everything is on the log scale here so division is just turning into subtraction
template <typename T_x> | ||
inline auto rexp(T_x&& x) noexcept { | ||
return (x <= 700) ? exp(x) : exp(700); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tagging @andrjohns do we do things like this anywhere else in Stan math? tmk we just check if it would overflow and if so we normally just return +/- inf or we let it overflow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete this. If the user is going to rollover a double they can rescale their problem
inline auto log_probability_distribution(const T_a& a, const T_v& v, | ||
const T_w& w) noexcept { | ||
using ret_t = return_type_t<T_a, T_w, T_v>; | ||
auto nearly_one = ret_t(1.0 - 1.0e-6); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bob-carpenter this is another spot where I'm getting a bit confused. Here the code is checking if we are within 1.0e-6 of 1 and then branching off of that. How do we normally handle cases like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you just use std::numeric_limits<ret_t>::min()
here instead of 1e-6?
Hey @SteveBronder, I worked through your suggestions and pushed a new version. Is it good for these points? |
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
Hello, could somebody build a tarbal such that other people can easily install this version of Stan comprising the new wiener_cdf and wiener_ccdf function and signature? |
Hey, @andrjohns built a tarball for me. My stanc3 signature is in this repo. @SteveBronder, which are the next steps to be done to continue this PR? |
The math here is a difficult for me to review. @andrjohns can you look at this? |
if (v > 0) { | ||
const auto exponent = -2.0 * v * a * w; | ||
prob_grad_w | ||
= exp(LOG_TWO + exponent + log(fabs(v)) + log(a) - log1m_exp(exponent)); | ||
} else if (v < 0) { | ||
const auto exponent = 2.0 * v * a * w; | ||
prob_grad_w = exp(LOG_TWO + log(fabs(v)) + log(a) - log1m_exp(exponent)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this just be rewritten as
if (v > 0) { | |
const auto exponent = -2.0 * v * a * w; | |
prob_grad_w | |
= exp(LOG_TWO + exponent + log(fabs(v)) + log(a) - log1m_exp(exponent)); | |
} else if (v < 0) { | |
const auto exponent = 2.0 * v * a * w; | |
prob_grad_w = exp(LOG_TWO + log(fabs(v)) + log(a) - log1m_exp(exponent)); | |
} | |
const bool exp_sign = (v > 0) ? -1 : 1; | |
const auto exponent = exp_sign * 2.0 * v * a * w; | |
prob_grad_w = exp(LOG_TWO + log(fabs(v)) + log(a) - log1m_exp(exponent)); | |
if (exp_sign == -1) { | |
prob_grad_w *= exp(exponent) | |
} |
Also for places with if statements like this can you make a comment on why this split has to happen?
template <typename T_x> | ||
inline auto rexp(T_x&& x) noexcept { | ||
return (x <= 700) ? exp(x) : exp(700); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete this. If the user is going to rollover a double they can rescale their problem
if (x > 1.0e5) { | ||
return -log(x); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this cutoff point here?
inline auto log_probability_distribution(const T_a& a, const T_v& v, | ||
const T_w& w) noexcept { | ||
using ret_t = return_type_t<T_a, T_w, T_v>; | ||
auto nearly_one = ret_t(1.0 - 1.0e-6); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you just use std::numeric_limits<ret_t>::min()
here instead of 1e-6?
auto minus_two_va_one_minus_w = (-2.0 * v * a * (1.0 - w)); | ||
ret_t prob; | ||
if (minus_two_va_one_minus_w < 0) { | ||
const auto exp_arg = exp(minus_two_va_one_minus_w); | ||
if (exp_arg >= nearly_one) { | ||
return ret_t(log1p(-w)); | ||
} | ||
auto two_vaw = 2 * v * a * w; | ||
if (two_vaw > minus_two_va_one_minus_w) { | ||
prob = log1p(-exp_arg) - log_diff_exp(two_vaw, minus_two_va_one_minus_w); | ||
} else if (two_vaw < minus_two_va_one_minus_w) { | ||
prob = log1p(-exp_arg) - log_diff_exp(minus_two_va_one_minus_w, two_vaw); | ||
} else { | ||
prob = log1p(-exp_arg) - NEGATIVE_INFTY; | ||
} | ||
} else { | ||
const auto exp_arg = exp(-minus_two_va_one_minus_w); | ||
if (exp_arg >= nearly_one) | ||
return ret_t(log1p(-w)); | ||
prob = log1p(-exp_arg) - log1p(-exp(2 * v * a)); | ||
} | ||
return prob; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Early returns are preferred to show when a branch has finished
auto minus_two_va_one_minus_w = (-2.0 * v * a * (1.0 - w)); | |
ret_t prob; | |
if (minus_two_va_one_minus_w < 0) { | |
const auto exp_arg = exp(minus_two_va_one_minus_w); | |
if (exp_arg >= nearly_one) { | |
return ret_t(log1p(-w)); | |
} | |
auto two_vaw = 2 * v * a * w; | |
if (two_vaw > minus_two_va_one_minus_w) { | |
prob = log1p(-exp_arg) - log_diff_exp(two_vaw, minus_two_va_one_minus_w); | |
} else if (two_vaw < minus_two_va_one_minus_w) { | |
prob = log1p(-exp_arg) - log_diff_exp(minus_two_va_one_minus_w, two_vaw); | |
} else { | |
prob = log1p(-exp_arg) - NEGATIVE_INFTY; | |
} | |
} else { | |
const auto exp_arg = exp(-minus_two_va_one_minus_w); | |
if (exp_arg >= nearly_one) | |
return ret_t(log1p(-w)); | |
prob = log1p(-exp_arg) - log1p(-exp(2 * v * a)); | |
} | |
return prob; | |
auto minus_two_va_one_minus_w = (-2.0 * v * a * (1.0 - w)); | |
if (minus_two_va_one_minus_w < 0) { | |
const auto exp_arg = exp(minus_two_va_one_minus_w); | |
if (two_vaw > minus_two_va_one_minus_w) { | |
return ret_t(log1p(-exp_arg) - log_diff_exp(two_vaw, minus_two_va_one_minus_w)); | |
} else if (two_vaw < minus_two_va_one_minus_w) { | |
return ret_t(log1p(-exp_arg) - log_diff_exp(minus_two_va_one_minus_w, two_vaw)); | |
} else { | |
return ret_t(log1p(-exp_arg) - NEGATIVE_INFTY); | |
} | |
} else { | |
const auto exp_arg = exp(-minus_two_va_one_minus_w); | |
return ret_t(log1p(-exp_arg) - log1p(-exp(2 * v * a))); | |
} |
I also removed exp_arg >= nearly_one
. Can you show me an example where removing that causes the wrong answer?
if (fabs(v) == 0.0) { | ||
return ret_t(-w); | ||
} | ||
nearly_one = ret_t(1.0 - 1.1 * 1.0e-5); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make a test case that shows where this would fail if nearly_one
was removed
auto log_quotient = log1p(-exp_two_avw) - log1p(-exp_two_av); | ||
if (log(w) > log_quotient) { | ||
prob += log_diff_exp(log(w), log_quotient); | ||
prob = exp(prob); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Early return for all of these
prob = exp(prob); | |
return exp(prob); |
prob = exp(prob); | ||
} | ||
} | ||
return (is_inf(prob)) ? NEGATIVE_INFTY : prob; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inf is fine no need for the check here
if (is_inf(y)) { | ||
return ret_t(log_probability_distribution(a, v, w)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If y is infinity here why is it removed from the calculation?
Summary
With this PR the CDF and the CCDF of the 7-parameter diffuion model are added.
See issue #2966
Relates to issue #2822
Tests
We implemented analogous tests as for the PDF
Side Effects
no
Release notes
CDF and CCDF for the 7-parameter diffusion model. Allows modeling truncated and censored data.
Checklist
Copyright holder: Franziska Henrich, Christoph Klauer
The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
the basic tests are passing
./runTests.py test/unit
)make test-headers
)make test-math-dependencies
)make doxygen
)make cpplint
)the code is written in idiomatic C++ and changes are documented in the doxygen
the new changes are tested