RFC: Introduce complementary return value in linkinv #190
Introduce complementary return value in `linkinv` indicating when the complementary inverse link function has been evaluated instead of the inverse link function. This greatly improves the range for which the linear predictor gives nonzero variance components.
@@ -208,7 +213,7 @@ function _fit!(m::AbstractGLM, verbose::Bool, maxIter::Integer, minStepFac::Real
 linpred!(lp, p, 0)
 updateμ!(r, lp)
 end
-devold = deviance(m)
+devold = deviance(m)
Was this spacing change accidental?
I think it was accidental
Looks quite interesting. I'll try the example we were having problems with.
This looks fine to me. It is a very good idea. In my experience the vast majority of GLMs to be fit use Bernoulli, Binomial or Poisson distributions, and the only time I see non-canonical links being used is with the Bernoulli or Binomial. Very occasionally a non-canonical link is used with Poisson or Gamma, but that is often a result of misunderstanding. Switching to a complementary inverse link in the Bernoulli and Binomial cases when it is well-defined makes good sense.
Tests are showing some failures due to, I think, the assumption in `predict` that The particular case of using the complementary inverse link makes sense for cases where 0 < μ < 1 is enforced, and I think that only the Bernoulli and Binomial distributions do that. What I would suggest is creating an `inverselinkorcomplement` function that is used only for those distributions and defining a default
All the link functions that do not map to (0, 1) just define This should never end up being called if we simply define a method for I'll write this up and you can see what you think. While I am at it, I think I will remove the calls to
src/glmtools.jl
Outdated
-linkfun(::LogitLink, μ) = logit(μ)
-linkinv(::LogitLink, η) = (logistic(-abs(η)), η > 0)
+linkfun(::LogitLink, μ) = log(μ / (one(μ) - μ))
+linkinv(::LogitLink, η) = (inv(xp1(exp(-abs(η)))), η > 0)
This should be `inv(xp1(exp(abs(η))))` and is the reason why the tests fail. However, I think we should just keep using `StatsFuns` for now. It is a dependency of `Distributions`, which `GLM` depends on, so it is already loaded anyway.
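The sign issue in this comment can be checked with a small stand-alone sketch. It assumes `xp1(x)` means `x + 1` and defines `logistic` locally rather than importing it from `StatsFuns`; the names `logit_linkinv`, `handwritten_as_submitted` and `handwritten_as_suggested` are hypothetical and exist only for this illustration.

```julia
# Local stand-ins (assumptions): `logistic` as in StatsFuns, `xp1(x)` as x + 1.
logistic(x) = inv(one(x) + exp(-x))
xp1(x) = x + one(x)

# Tuple-returning form from the original line of the diff:
# the value is μ when η ≤ 0 and the complement 1 - μ when η > 0.
logit_linkinv(η) = (logistic(-abs(η)), η > 0)

# Hand-written variants: the one in the diff and the one suggested in the review.
handwritten_as_submitted(η) = inv(xp1(exp(-abs(η))))  # equals logistic(abs(η)), at least 1/2
handwritten_as_suggested(η) = inv(xp1(exp(abs(η))))   # equals logistic(-abs(η)), at most 1/2

η = 3.0
p, iscomplement = logit_linkinv(η)
```

With `η = 3.0` the flag is `true`, `p` agrees with `handwritten_as_suggested(η)`, and `handwritten_as_submitted(η)` returns `1 - p` instead, which is consistent with the test failures described above.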
Notice that all the other
We should check and see if there is a problem with I might have mentioned this before, but a thing I'd also like to look into is using a gradient-based criterion for the convergence since that would save us a lot of logarithms.
Regarding the definition of
so that Methods for I have this written now in
No, it's fine. Let's just work on this branch.
Normal()
```
generates the `ProbitLink`.
Its cdf and quantile function are defined in terms of the "complementary error function", `erfc`, and its inverse, `erfcinv`, from the `StatsFuns` package.
They are actually in `SpecialFunctions` but imported in `StatsFuns`.
### Properties of linear models

The probability model for a linear model can be written
\begin{equation}
Is that valid Markdown? Also, should probably break lines at 92 chars, as we do it e.g. in README.md.
The file extension is .jmd, but I'm not sure what that is.
AFAICT that's a Weave.jl Markdown file, but I'm not sure what this implies in terms of syntax.
"""
    loglik_obs(D::Distribution, y, μ, wt, ϕ)

Return the log-likelihood contribution for `y` under distribution `D` with mean `μ`,
Could be worth mentioning "for a single observation" or something similar.
Superseded by #192
Introduce complementary return value in `linkinv` indicating when the complementary inverse link function has been evaluated instead of the inverse link function. This greatly improves the range for which the linear predictor gives non-zero variance components.

This PR also changes the convergence criterion to use absolute difference alongside relative difference. This is useful when data is completely separated.
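A minimal sketch of why an absolute test helps, assuming a criterion that accepts either a small absolute or a small relative change in deviance. The function name `isconverged`, the keyword names `atol` and `rtol`, their defaults, and the deviance values are all hypothetical; the PR's actual criterion may combine the two differences in another way.

```julia
# Hypothetical convergence check (not the PR's code): accept when either the
# absolute or the relative change in deviance is small.
function isconverged(dev::Real, devold::Real; atol::Real = 1e-10, rtol::Real = 1e-6)
    Δ = abs(dev - devold)
    return Δ < atol || Δ < rtol * abs(dev)
end

# Illustrative deviances shrinking toward zero, as can happen under complete
# separation: the change is tiny in absolute terms but large relative to `dev`.
devold, dev = 2.0e-11, 1.0e-11
relative_only = abs(dev - devold) < 1e-6 * abs(dev)
combined = isconverged(dev, devold)
```

Here the purely relative test keeps failing even though the deviance is already negligible, while the combined test stops the iteration.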
@dmbates This will greatly improve the accuracy of the `glmvar` evaluation in the `CloglogLink` case. More generally, it also simplifies the code and avoids computing μ twice, so there should be a (slight) speedup here in the non-canonical link case. The cost is that `linkinv` returns a tuple of two values, which might be a little inconvenient, but since it is mainly an internal helper function I guess it is okay.
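The accuracy gain for the complementary log-log link can be illustrated with a stand-alone sketch, assuming the usual inverse link μ = 1 - exp(-exp(η)) and the Bernoulli variance μ(1 - μ). The helper names `cloglog_linkinv` and `cloglog_linkinv_complement` are hypothetical and are not the package's functions.

```julia
# Hypothetical helpers for the complementary log-log link.
cloglog_linkinv(η) = -expm1(-exp(η))            # μ = 1 - exp(-exp(η))
cloglog_linkinv_complement(η) = exp(-exp(η))    # 1 - μ, computed directly

η = 4.0
μ = cloglog_linkinv(η)
c = cloglog_linkinv_complement(η)

naive_var = μ * (1 - μ)   # 1 - μ cancels to zero because μ rounds to 1.0
comp_var  = μ * c         # keeps the tiny but nonzero factor 1 - μ
```

At `η = 4.0` the mean already rounds to one in `Float64`, so the variance computed from μ alone is exactly zero, whereas the version using the complement stays positive.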
I think we have discussed earlier what can be assumed about `(dμ/dη)/glmvar`, but I've now convinced myself that we can avoid clamping the values and instead assume that `0/0=0`. For the score contributions in the binomial case, the computations are something like (sorry about the ∂s; I forgot that these aren't partials)
so in practice, I'm setting `wrkresid` to zero when a `NaN` is detected to neutralize the score contribution. For the variance weight, the computation is something like
So as long as the second derivative of μ has limit zero when η goes to infinity, we should be fine. I think that should always be the case.
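For reference, the textbook IRLS quantities this argument concerns can be written as follows. This is a sketch of the standard definitions for observation $i$ with variance function $V$, not a reconstruction of the formulas that are missing from the description above.

```latex
\begin{aligned}
\text{working residual:}\quad & \frac{y_i - \mu_i}{d\mu_i / d\eta_i}, \\
\text{working weight:}\quad & \frac{(d\mu_i / d\eta_i)^2}{V(\mu_i)},
  \qquad V(\mu) = \mu (1 - \mu) \ \text{for the binomial}, \\
\text{score contribution:}\quad & \frac{(y_i - \mu_i)\, d\mu_i / d\eta_i}{V(\mu_i)}.
\end{aligned}
```

Each of these is a ratio in which numerator and denominator can both underflow to zero for extreme $\eta_i$, which is where the `0/0` cases come from.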
Similar situations can happen for the Poisson when μ goes to zero, but substituting `NaN`s with zeros should be okay under a similar condition on the second derivative (going to negative infinity).
@dmbates I'm fine with analyzing all the relevant cases but I'm not sure which non-canonical link functions are common outside the binomial model. Do you think there are relevant cases where these assumptions won't hold?