Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

About dof_residual #509

Open
gragusa opened this issue Nov 16, 2022 · 1 comment 路 May be fixed by #560
Open

About dof_residual #509

gragusa opened this issue Nov 16, 2022 · 1 comment 路 May be fixed by #560
Milestone

Comments

@gragusa
Copy link

gragusa commented Nov 16, 2022

I noticed something odd about dof_residual. In few words, when the family does not have a dispersion_parameter, dof_residual is bumped by 1which is inconsistent with other software (R, Stata, etc.) and with statistics more generally 馃槃.

using DataFrames
using StableRNGs
using RCall

rng = StableRNG(123);

x1 = rand(rng, 25);
x2 = ifelse.(randn(rng, 25) .> 0, 1, 0);
y = ifelse.(0.004 .- 0.01 .* x1 .+ 1.5 .* x2 .+ randn(rng, 25) .> 0, 1, 0);
df = DataFrame(y=y, x1=x1, x2=x2);

## Julia
l = glm(@formula(y~x1+x2), df, Binomial());

## R
@rput df;
R"rl <- glm(y~x1+x2, df, family='binomial')";
@rget rl;

println("Julia dof_residual:", dof_residual(l))
println("R dof_residual: ", rl[:df_residual])

which gives

Julia dof_residual: 23
R dof_residual: 22

The families without a dispersion_parameter are Bernoulli, Binomial, Poisson:

dispersion_parameter(D) = true
dispersion_parameter(::Union{Bernoulli, Binomial, Poisson}) = false

Fortunately, dof_residuals is not used to scale the vcov and so this bug does not have important ramifications. I also noted that we are not testing dof_residual for these families.

The fix is relatively simple: change

dof_residual(obj::LinPredModel) = nobs(obj) - dof(obj) + 1

to

dof_residual(obj::LinPredModel) = nobs(obj) - dof(obj) + dispersion_parameter(obj.rr.d)
@nalimilan
Copy link
Member

Good catch. Actually this definition seems to have been there forever, though the implementation has been changed recently by #265. We should probably add a docstring to explain more precisely what this method returns as it's not necessarily obvious for users how the dispersion parameter must be handled (R doesn't say anything).

Cc: @andreasnoack @Nosferican

@nalimilan nalimilan added this to the 2.0 milestone Jun 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants