RFC: Introduce complementary return value in linkinv #190

andreasnoack · 2017-07-24T22:19:32Z

Introduce complementary return value in linkinv indicating when the complimentary
inverse link function has been evaluated instead of the inverse link function.
This greatly improves the range for which the linear predictor gives non-zero
variance components.

This PR also changes the convergence criterion to use absolute difference alongside relative difference.
This is useful when data is completely separated.

@dmbates This will greatly improve the accuracy of the glmvar evaluation in the CloglogLink
case. More generally, it also simplifies the code and avoids computing μ twice so there should be
a (slight) speedup here in the non-canonical link case. The cost is that linkinv returns a tuple of two arguments
which might be a little inconvenient but since it is mainly an internal helper function I guess it is okay.

I think we have discussed earlier what can be assumed about (dμ/dη)/glmvar but I've now convinced myself
that we can avoid clamping the values and instead assume that 0/0=0. For the score contributions in the binomial case,
the computations are something like (sorry about ∂s. I forgot that these aren't partials)

so in practice, I'm setting wrkresid to zero when a NaN is detected to neutralize the score contribution. For the variance
weight, the computation is something like

So as long as the second derivative of μ has limit zero when η goes to infinity, we should be fine. I think that should always be the case.

Similar situations can happen for the Poisson when μ goes to zero but substituting NaNs with zeros should be okay
under a similar condition on the second derivative (going to negative infinity).

@dmbates I'm fine with analyzing all the relevant cases but I'm not sure which non-canonical link functions are common outside the binomial model. Do you think there are relevant cases where these assumptions won't hold?

criterion. Introduce complementary return value in linkinv indicating when the complimentary inverse link function has been evaluated instead of the inverse link function. This greatly improves the range for which the linear predictor gives nonzero variance components.

dmbates · 2017-07-25T15:42:55Z

src/glmfit.jl

@@ -208,7 +213,7 @@ function _fit!(m::AbstractGLM, verbose::Bool, maxIter::Integer, minStepFac::Real
        linpred!(lp, p, 0)
        updateμ!(r, lp)
    end
-    devold = deviance(m)
+    devold  = deviance(m)


Was this spacing change accidental?

I think it was accidental

dmbates

Looks quite interesting. I'll try the example we were having problems with.

dmbates · 2017-07-25T16:04:37Z

This looks fine to me. It is a very good idea.

In my experience the vast majority of glm's to be fit are Bernoulli, Binomial or Poisson distributions and the only time I see non-canonical links being used is with the Bernoulli or Binomial. Very occasionally a non-canonical link is used with Poisson or Gamma but that is often a result of misunderstanding. Switching to a complementary inverse link in the Bernoulli and Binomial cases when it is well-defined makes good sense.

dmbates · 2017-07-26T17:06:24Z

Tests are showing some failures due to, I think, the assumption in predict that linkinv returns a scalar.

The particular case of using the complementary inverse link makes sense for cases where 0 < μ < 1 is enforced and I think that only the Bernoulli and Binomial distributions do that. What I would suggest is creating a inverselinkorcomplement function that is used only for those distributions and defining a default

function linkinv(L::Link, η)
    μ, c = inverselinkorcomplement(η)
    c ? μ : one(μ) - μ
end

All the link functions that do not map to (0, 1) just define linkinv directly (sloppy terminology but I think you know what I mean).

This should never end up being called if we simply define a method for updateμ! that is specific to
D<:Union{Bernoulli,Binomial} in addition to the one with an explicit LogitLink. It is still worthwhile keeping an explicit method for the D<:Union{Bernoulli,Binomial},L<:LogitLink case because of the cancellation for a canonical link.

I'll write this up and you can see what you think. While I am at it, I think I will remove the calls to Threads.@threads in updateμ!. I checked on some big models and it actually slows things down to use 4 threads.

andreasnoack · 2017-07-26T18:16:36Z

src/glmtools.jl

-linkfun(::LogitLink, μ) = logit(μ)
-linkinv(::LogitLink, η) = (logistic(-abs(η)), η > 0)
+linkfun(::LogitLink, μ) = log(μ / (one(μ) - μ))
+linkinv(::LogitLink, η) = (inv(xp1(exp(-abs(η)))), η > 0)


This should be inv(xp1(exp(abs(η)))) and is the reason why the tests fail. However, I think we should just keep using StatsFuns for now. It is a dependency of Distributions which GLM depends on so it is already loaded anyway.

andreasnoack · 2017-07-26T18:36:54Z

The particular case of using the complementary inverse link makes sense for cases where 0 < μ < 1 is enforced and I think that only the Bernoulli and Binomial distributions do that.

Notice that all the other linkinvs just return false in the second argument so the general updateμ! should still be fine although it is a bit ugly with all the ternaries. I thought about defining linkcinv methods for the link functions for Binomial/Bernoulli but, as you mention, then we'd have to specialize updateμ! for these models just for this so I decided it would be nicer to hide behind dispatch in linkinv. It is a bit annoying, though, that the inverse link function now returns two values.

andreasnoack · 2017-07-26T18:46:10Z

While I am at it, I think I will remove the calls to Threads.@threads in updateμ!. I checked on some big models and it actually slows things down to use 4 threads.

We should check and see if there is a problem with Boxed variable in the closure. I can take look.

I might have mentioned this before, but a thing I'd also like to look into is using a gradient based criterion for the convergence since that would save us a lot of logarithms.

dmbates · 2017-07-26T19:25:34Z

Regarding the definition of linkinv. I would propose creating

@compat abstract type Link01 <: Link end

so that LogitLink, ProbitLink, CauchitLink and CloglogLink are subtypes of Link01.

Methods for linkinvcomplement are defined for these links whereas other links have methods for linkinv.

I have this written now in anj/abstol branch. Would you prefer that I create a new branch for this or can I commit to this branch?

andreasnoack · 2017-07-27T00:19:18Z

Would you prefer that I create a new branch for this or can I commit to this branch?

No, it's fine. Let's just work on this branch.

andreasnoack · 2017-07-28T00:17:22Z

docs/src/links.jmd

+Normal()
+```
+generates the `ProbitLink`.
+Its cdf and quantile function are defined in terms of the "complementary error function", `erfc`, and its inverse, `erfcinv`, from the `StatsFuns` package.


They are actually in SpecialFunctions but imported in StatsFuns

nalimilan · 2017-07-29T15:53:09Z

docs/src/links.jmd

+### Properties of linear models
+
+The probability model for a linear model can be written
+\begin{equation}


Is that valid Markdown? Also, should probably break lines at 92 chars, as we do it e.g. in README.md.

The file extension is .jmd, but I'm not sure what that is.

AFAICT that's a Weave.jl Markdown file, but I'm not sure what this implies in terms of syntax.

nalimilan · 2017-07-30T15:35:28Z

src/glmtools.jl

+"""
+    loglik_obs(D::Distribution, y, μ, wt, ϕ)
+
+Return the log-likelihood contribution for `y` under distribution `D` with mean `μ`,


Could be worth mentioning "for a single observation" or something similar.

andreasnoack · 2017-09-22T08:44:01Z

Superseded by #192

dmbates reviewed Jul 25, 2017

View reviewed changes

dmbates approved these changes Jul 25, 2017

View reviewed changes

dmbates added 3 commits July 26, 2017 11:15

Remove StatsFuns dependency; use function for x + one(x)

140e947

Use broadcast on float

2377758

Use explicit linkinv to avoid complement cases

b8d6885

andreasnoack commented Jul 26, 2017

View reviewed changes

dmbates added 2 commits July 26, 2017 16:43

Link01 definitions; add documentation

dc8632c

Revert to logit, logistic, xlogy from StatsFuns

be76bfd

Added notes on links

ec3c4a8

andreasnoack commented Jul 28, 2017

View reviewed changes

Update docs

0a9d671

nalimilan reviewed Jul 29, 2017

View reviewed changes

nalimilan reviewed Jul 30, 2017

View reviewed changes

dmbates mentioned this pull request Aug 9, 2017

Evaluate inverse link and its derivative simultaneously #192

Merged

andreasnoack closed this Sep 22, 2017

andreasnoack deleted the anj/abstol branch September 22, 2017 08:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: Introduce complementary return value in linkinv #190

RFC: Introduce complementary return value in linkinv #190

andreasnoack commented Jul 24, 2017

dmbates Jul 25, 2017

andreasnoack Jul 26, 2017

dmbates left a comment

dmbates commented Jul 25, 2017

dmbates commented Jul 26, 2017

andreasnoack Jul 26, 2017

andreasnoack commented Jul 26, 2017

andreasnoack commented Jul 26, 2017 •

edited by ararslan

Loading

dmbates commented Jul 26, 2017

andreasnoack commented Jul 27, 2017

andreasnoack Jul 28, 2017

nalimilan Jul 29, 2017

ararslan Jul 29, 2017

nalimilan Jul 29, 2017

nalimilan Jul 30, 2017

andreasnoack commented Sep 22, 2017

RFC: Introduce complementary return value in linkinv #190

RFC: Introduce complementary return value in linkinv #190

Conversation

andreasnoack commented Jul 24, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dmbates left a comment

Choose a reason for hiding this comment

dmbates commented Jul 25, 2017

dmbates commented Jul 26, 2017

Choose a reason for hiding this comment

andreasnoack commented Jul 26, 2017

andreasnoack commented Jul 26, 2017 • edited by ararslan Loading

dmbates commented Jul 26, 2017

andreasnoack commented Jul 27, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

andreasnoack commented Sep 22, 2017

andreasnoack commented Jul 26, 2017 •

edited by ararslan

Loading