Adjust definition of null model when there is no intercept #14
Technically this is breaking, but the current definition gives negative R² for models without the intercept.
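To illustrate the negative R² mentioned above, here is a minimal Python sketch (not the StatsAPI.jl or GLM.jl code; the function name `r2_through_origin` and the toy data are made up for illustration). It fits a least-squares line through the origin and computes R² against two candidate null models: one that predicts the mean of the response (an intercept-only null) and one that predicts 0 everywhere (a null with no intercept, assumed here to match the proposed definition).

```python
def r2_through_origin(x, y, centered_null):
    """R^2 of the least-squares fit y ~ beta * x (no intercept).

    centered_null=True  -> null model predicts mean(y) (intercept-only null)
    centered_null=False -> null model predicts 0 everywhere (no-intercept null)
    """
    # Closed-form least-squares slope for a regression through the origin.
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    # Residual sum of squares of the fitted no-intercept model.
    rss = sum((yi - beta * xi) ** 2 for xi, yi in zip(x, y))
    # Sum of squares of the null model's residuals.
    null_pred = sum(y) / len(y) if centered_null else 0.0
    null_ss = sum((yi - null_pred) ** 2 for yi in y)
    return 1.0 - rss / null_ss


# Toy data where a line through the origin fits worse than the plain mean.
x = [1.0, 2.0, 3.0]
y = [11.0, 10.0, 9.0]

r2_intercept_null = r2_through_origin(x, y, centered_null=True)
r2_empty_null = r2_through_origin(x, y, centered_null=False)

print(r2_intercept_null)  # negative: the fit loses to the mean-only null
print(r2_empty_null)      # between 0 and 1: the fit cannot lose to beta = 0
```

With the zero-prediction null, the fitted model nests the null (it corresponds to a slope of 0), so its residual sum of squares can never exceed the null's and R² stays non-negative.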
If the ping for me was regarding Survival, which defines
Tagging @cecileane and @pbastide who created the functions that use
Just to be sure: what would the predictions of a (say) logistic model without any regressor or an intercept be? 0.5 everywhere?
Yes. For GLMs, the linear predictor would always be 0, so after applying the inverse link function the predicted mean is 0.5 everywhere for the logit link.
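As a rough check of this answer, the following Python sketch (an illustration only, not the API of StatsAPI.jl or any modeling package; `inv_logit`, `bernoulli_deviance` and the response vector are hypothetical) evaluates a logistic model with no regressor and no intercept, and compares its deviance with that of an intercept-only null model.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def bernoulli_deviance(y, mu):
    """Deviance of 0/1 responses when every observation has fitted mean mu.

    For binary data the saturated log-likelihood is 0, so the deviance
    is simply -2 times the log-likelihood.
    """
    return -2.0 * sum(math.log(mu) if yi == 1 else math.log(1.0 - mu) for yi in y)


y = [1, 0, 0, 1, 1, 1, 0, 1]  # made-up binary responses

# Null model without intercept: the linear predictor is 0 for every row.
mu_empty = inv_logit(0.0)
# Null model with intercept: the fitted probability is the sample mean.
mu_intercept = sum(y) / len(y)

dev_empty = bernoulli_deviance(y, mu_empty)
dev_intercept = bernoulli_deviance(y, mu_intercept)

print(mu_empty)                  # 0.5
print(dev_empty, dev_intercept)  # the intercept-only null fits at least as well
```

Under these assumptions the deviance of the no-intercept null is `2 * n * log(2)` for `n` binary observations, whatever the observed responses are.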
@Nosferican For model families that always have an intercept, this PR is irrelevant, so I guess it's OK?
Indeed, adding
I'm not sure what the question is. This PR wouldn't change the relationship between
In terms of the within estimator... The transformed data basically zeroes out the intercept. The intercept is then set in various ways for simplicity (e.g., adjusting it so that the predictions are centered around the mean response). That is more of a heuristic/artifact than something that comes from the estimator/model... I don't recall the details, but I think some tests that use the ratio between deviances or log-likelihoods (e.g., the likelihood-ratio test) rely on the other methods for simplicity, and this would break that relation (I think that happens in MixedModels). I sort of like that the
OK. But then why does adding Anyway, that's more out of curiosity, as it seems that the within model should keep
To be clear, this PR changes the definition of both so that they still refer to the same model. Otherwise things would indeed be really weird.
will avoid negative r2. consistent with new definition in JuliaStats/StatsAPI.jl#14 Co-authored-by: Cecile Ane <cecileane@users.noreply.github.com>
Great proposal. We updated PhyloNetworks to reflect the change.
For mixed-effects models, I think this is fine and doesn't have much impact. We don't define the
tl;dr: no objections to the current proposal, and it doesn't really impact MixedModels.jl because we intentionally don't define these methods.
Codecov Report
@@            Coverage Diff            @@
##              main       #14   +/-   ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files            2         2
  Lines           36        36
=========================================
  Hits            36        36
I've realized there's another tricky point to clarify when defining what the null model is. For models with no intercept but at least one categorical predictor (full dummy coding, as in What do you think? Currently, the definition of Cc: @kleinschmidt
The latter approach makes more sense to me also, where the null model excludes the intercept for
OK, let's go with this then. Thanks everybody!
Technically this is breaking, but the current definition gives negative R² for models without the intercept (JuliaStats/GLM.jl#481).
@Nosferican @jmboehm @lbittarello @crsl4 @getzze @ararslan I saw you define `nulldeviance` and/or `nullloglikelihood` in your modeling packages. What do you think of this change? It would be problematic to change the documentation of the function if packages are not updated to reflect it.
Cc: @palday @mousum-github