Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Introduce complementary return value in linkinv #190

Closed
wants to merge 8 commits into from

Conversation

andreasnoack
Copy link
Member

Introduce complementary return value in linkinv indicating when the complimentary
inverse link function has been evaluated instead of the inverse link function.
This greatly improves the range for which the linear predictor gives non-zero
variance components.

This PR also changes the convergence criterion to use absolute difference alongside relative difference.
This is useful when data is completely separated.

@dmbates This will greatly improve the accuracy of the glmvar evaluation in the CloglogLink
case. More generally, it also simplifies the code and avoids computing μ twice so there should be
a (slight) speedup here in the non-canonical link case. The cost is that linkinv returns a tuple of two arguments
which might be a little inconvenient but since it is mainly an internal helper function I guess it is okay.

I think we have discussed earlier what can be assumed about (dμ/dη)/glmvar but I've now convinced myself
that we can avoid clamping the values and instead assume that 0/0=0. For the score contributions in the binomial case,
the computations are something like (sorry about ∂s. I forgot that these aren't partials)

image

so in practice, I'm setting wrkresid to zero when a NaN is detected to neutralize the score contribution. For the variance
weight, the computation is something like

image

So as long as the second derivative of μ has limit zero when η goes to infinity, we should be fine. I think that should always be the case.

Similar situations can happen for the Poisson when μ goes to zero but substituting NaNs with zeros should be okay
under a similar condition on the second derivative (going to negative infinity).

@dmbates I'm fine with analyzing all the relevant cases but I'm not sure which non-canonical link functions are common outside the binomial model. Do you think there are relevant cases where these assumptions won't hold?

criterion.

Introduce complementary return value in linkinv indicating when the complimentary
inverse link function has been evaluated instead of the inverse link function.
This greatly improves the range for which the linear predictor gives nonzero
variance components.
@@ -208,7 +213,7 @@ function _fit!(m::AbstractGLM, verbose::Bool, maxIter::Integer, minStepFac::Real
linpred!(lp, p, 0)
updateμ!(r, lp)
end
devold = deviance(m)
devold = deviance(m)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this spacing change accidental?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was accidental

Copy link
Contributor

@dmbates dmbates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks quite interesting. I'll try the example we were having problems with.

@dmbates
Copy link
Contributor

dmbates commented Jul 25, 2017

This looks fine to me. It is a very good idea.

In my experience the vast majority of glm's to be fit are Bernoulli, Binomial or Poisson distributions and the only time I see non-canonical links being used is with the Bernoulli or Binomial. Very occasionally a non-canonical link is used with Poisson or Gamma but that is often a result of misunderstanding. Switching to a complementary inverse link in the Bernoulli and Binomial cases when it is well-defined makes good sense.

@dmbates
Copy link
Contributor

dmbates commented Jul 26, 2017

Tests are showing some failures due to, I think, the assumption in predict that linkinv returns a scalar.

The particular case of using the complementary inverse link makes sense for cases where 0 < μ < 1 is enforced and I think that only the Bernoulli and Binomial distributions do that. What I would suggest is creating a inverselinkorcomplement function that is used only for those distributions and defining a default

function linkinv(L::Link, η)
    μ, c = inverselinkorcomplement(η)
    c ? μ : one(μ) - μ
end

All the link functions that do not map to (0, 1) just define linkinv directly (sloppy terminology but I think you know what I mean).

This should never end up being called if we simply define a method for updateμ! that is specific to
D<:Union{Bernoulli,Binomial} in addition to the one with an explicit LogitLink. It is still worthwhile keeping an explicit method for the D<:Union{Bernoulli,Binomial},L<:LogitLink case because of the cancellation for a canonical link.

I'll write this up and you can see what you think. While I am at it, I think I will remove the calls to Threads.@threads in updateμ!. I checked on some big models and it actually slows things down to use 4 threads.

src/glmtools.jl Outdated
linkfun(::LogitLink, μ) = logit(μ)
linkinv(::LogitLink, η) = (logistic(-abs(η)), η > 0)
linkfun(::LogitLink, μ) = log(μ / (one(μ) - μ))
linkinv(::LogitLink, η) = (inv(xp1(exp(-abs(η)))), η > 0)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be inv(xp1(exp(abs(η)))) and is the reason why the tests fail. However, I think we should just keep using StatsFuns for now. It is a dependency of Distributions which GLM depends on so it is already loaded anyway.

@andreasnoack
Copy link
Member Author

The particular case of using the complementary inverse link makes sense for cases where 0 < μ < 1 is enforced and I think that only the Bernoulli and Binomial distributions do that.

Notice that all the other linkinvs just return false in the second argument so the general updateμ! should still be fine although it is a bit ugly with all the ternaries. I thought about defining linkcinv methods for the link functions for Binomial/Bernoulli but, as you mention, then we'd have to specialize updateμ! for these models just for this so I decided it would be nicer to hide behind dispatch in linkinv. It is a bit annoying, though, that the inverse link function now returns two values.

@andreasnoack
Copy link
Member Author

andreasnoack commented Jul 26, 2017

While I am at it, I think I will remove the calls to Threads.@threads in updateμ!. I checked on some big models and it actually slows things down to use 4 threads.

We should check and see if there is a problem with Boxed variable in the closure. I can take look.

I might have mentioned this before, but a thing I'd also like to look into is using a gradient based criterion for the convergence since that would save us a lot of logarithms.

@dmbates
Copy link
Contributor

dmbates commented Jul 26, 2017

Regarding the definition of linkinv. I would propose creating

@compat abstract type Link01 <: Link end

so that LogitLink, ProbitLink, CauchitLink and CloglogLink are subtypes of Link01.

Methods for linkinvcomplement are defined for these links whereas other links have methods for linkinv.

I have this written now in anj/abstol branch. Would you prefer that I create a new branch for this or can I commit to this branch?

@andreasnoack
Copy link
Member Author

Would you prefer that I create a new branch for this or can I commit to this branch?

No, it's fine. Let's just work on this branch.

Normal()
```
generates the `ProbitLink`.
Its cdf and quantile function are defined in terms of the "complementary error function", `erfc`, and its inverse, `erfcinv`, from the `StatsFuns` package.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are actually in SpecialFunctions but imported in StatsFuns

### Properties of linear models

The probability model for a linear model can be written
\begin{equation}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that valid Markdown? Also, should probably break lines at 92 chars, as we do it e.g. in README.md.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file extension is .jmd, but I'm not sure what that is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT that's a Weave.jl Markdown file, but I'm not sure what this implies in terms of syntax.

"""
loglik_obs(D::Distribution, y, μ, wt, ϕ)

Return the log-likelihood contribution for `y` under distribution `D` with mean `μ`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be worth mentioning "for a single observation" or something similar.

@andreasnoack
Copy link
Member Author

Superseded by #192

@andreasnoack andreasnoack deleted the anj/abstol branch September 22, 2017 08:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants