Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement nulldeviance and nullloglikelihood for GLMs #479

Merged
merged 12 commits into from
Jul 1, 2022
Merged

Conversation

nalimilan
Copy link
Member

@nalimilan nalimilan commented Apr 18, 2022

No description provided.

Models with offets are not supported for now.
src/glmfit.jl Show resolved Hide resolved
@palday
Copy link
Member

palday commented Apr 20, 2022

From McCullagh and Nelder:
image

For the Binomial, m is the number of Bernoulli trials, i.e. the first parameter in the usual formulation.

@nalimilan
Copy link
Member Author

What am I supposed to conclude from this excerpt? :-D AFAICT the call to devresid is equivalent to these formulas, right? But I'm still not sure how to handle offsets.

@palday
Copy link
Member

palday commented Apr 20, 2022

Yeah, the excerpt was also just for my own note-taking as the closest thing I found to an efficient closed-form formula for the null deviance.

I think offsets are just always awkward to handle. The offset is added to mu, right? Then we have mean(mu+offset) = mean(mu) + mean(offset) and can just use that in all our calculations, which will give the same problems you mentioned in R (#477 (comment)).

For models without an intercept, the null model is actually the same as forcing the mean to zero, right? I'm not sure if that will work numerically for some distributions, but maybe we should warn/error in those cases.

@nalimilan
Copy link
Member Author

nalimilan commented Apr 20, 2022

I think offsets are just always awkward to handle. The offset is added to mu, right? Then we have mean(mu+offset) = mean(mu) + mean(offset) and can just use that in all our calculations, which will give the same problems you mentioned in R (#477 (comment)).

Not exactly, but close: the offset is included in eta, i.e. eta = XB + offset.

After some more investigations, I've made some progress: I'm able to compute the null deviance for a Poisson model with a log link and an offset. I'm not sure why it doesn't work for a Normal model with a log link and an offset. I'll see whether I can find a solution, and if not I will at least push what I have. EDIT: the Normal distribution is probably one of the cases that the R docs refer to with "this will be incorrect if the link function depends on the data other than through the fitted mean". Maybe we should just throw an error for distributions with a dispersion parameter, as it's always better than returning incorrect results.

For models without an intercept, the null model is actually the same as forcing the mean to zero, right? I'm not sure if that will work numerically for some distributions, but maybe we should warn/error in those cases.

I wondered the same thing. The null model is defined as the one with only the intercept in the docstring currently. For models without an intercept, it could make sense to define it as the model for which all predicted values are 0. Though these things are confusing, so throwing an error would be safer, at least as long as users don't request that feature. This matters especially because people may call r2 for GLMs without realizing what null model is taken as the reference.

@codecov-commenter
Copy link

codecov-commenter commented Apr 24, 2022

Codecov Report

Merging #479 (a58f8e6) into master (f149f2b) will increase coverage by 0.28%.
The diff coverage is 92.30%.

@@            Coverage Diff             @@
##           master     #479      +/-   ##
==========================================
+ Coverage   87.16%   87.44%   +0.28%     
==========================================
  Files           7        7              
  Lines         849      900      +51     
==========================================
+ Hits          740      787      +47     
- Misses        109      113       +4     
Impacted Files Coverage Δ
src/glmfit.jl 81.25% <92.30%> (+1.95%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f149f2b...a58f8e6. Read the comment docs.

@nalimilan
Copy link
Member Author

I had code that computes the null deviance/log-likelihood for Poisson models with offsets, but I'm afraid it loses precision significantly in some cases due to the back and forth between operations on the response scale and operations on the linear predictor scale. For a Binomial model with large offsets, this gave an infinite deviance. Given that we need a fallback that does fit the null model for other distributions anyway, I figured it would be simpler to always use it when there are offsets. I've kept the intermediate commits if you want to have a look.

Comment on lines +339 to +340
maxiter=m.maxiter, minstepfac=m.minstepfac,
atol=m.atol, rtol=m.rtol)
Copy link
Member Author

@nalimilan nalimilan Apr 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether that passing the tolerance is really needed, as the null model should converge much more quickly than more complex models. Maybe we don't need to use a low tolerance just to compute the deviance.

Note that this requires storing these values in new GeneralizedLinearModel fields (which could be useful in its own right).

@nalimilan
Copy link
Member Author

Another issue is that for models fitted with negbin, the null model is taken to be the one with the same theta parameter, which will have a worse fit than calling negbin again. Maybe negbin should fill a field in the model object to indicate that theta was chosen automatically so that we throw an error from nulldeviance and nullloglikelihood instead of returning misleading results? (We can't always throw an error for NegativeBinomial as users can specify theta manually.)

src/glmfit.jl Outdated Show resolved Hide resolved
src/glmfit.jl Outdated
Comment on lines 281 to 285
nullm = fit(GeneralizedLinearModel,
fill(1.0, length(y), 1), y, d, L(), wts=wts, offset=offset,
maxiter=m.maxiter, minstepfac=m.minstepfac,
atol=m.atol, rtol=m.rtol)
dev = deviance(nullm)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we're missing an obvious optimization here, but I'm also unwilling to spend too much time optimizing a code path for something that I view as statistically a bit dubious/problematic weird.

@nalimilan
Copy link
Member Author

Given that we now have a clear definition the null model when theere's no intercept in StatsAPI (JuliaStats/StatsAPI.jl#14) and an implementation for linear models (#481), I've pushed changes to support the no intercept case.

@nalimilan nalimilan requested a review from palday June 11, 2022 14:11
src/glmfit.jl Show resolved Hide resolved
src/glmfit.jl Show resolved Hide resolved
@inbounds for i in eachindex(y)
dev += devresid(d, y[i], mu)
end
end
else
X = fill(1.0, length(y), hasint ? 1 : 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
X = fill(1.0, length(y), hasint ? 1 : 0)
X = fill(1.0, length(y), hasint)

src/glmfit.jl Show resolved Hide resolved
@nalimilan
Copy link
Member Author

@palday Good to go?

@nalimilan nalimilan merged commit ebc82cc into master Jul 1, 2022
@nalimilan nalimilan deleted the nl/null branch July 1, 2022 15:52
@nalimilan nalimilan mentioned this pull request Jul 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants