Adjust definition of null model when there is no intercept #14

Merged
nalimilan merged 3 commits into main from nl/null
Jun 2, 2022

Conversation

nalimilan
Member

@nalimilan nalimilan commented May 12, 2022

Technically this is breaking, but the current definition gives negative R² for models without the intercept (JuliaStats/GLM.jl#481).

@Nosferican @jmboehm @lbittarello @crsl4 @getzze @ararslan I saw you define nulldeviance and/or nullloglikelihood in your modeling packages. What do you think of this change? It would be problematic to change the documentation of the function if packages are not updated to reflect it.

Cc: @palday @mousum-github
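
For readers following along, here is a minimal numeric sketch of the problem described above. It assumes R² is computed as 1 minus the ratio of the model's residual sum of squares to that of the null model. The data and the helper names (`fit_through_origin`, `r_squared`) are made up for illustration and are not GLM.jl or StatsAPI.jl code.

```python
# Illustrative sketch only: plain-Python OLS through the origin, not GLM.jl code.

def fit_through_origin(x, y):
    """Least-squares slope for y ~ 0 + x (no intercept)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def r_squared(y, fitted, null_fitted):
    """1 - RSS(model) / RSS(null model)."""
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    null_rss = sum((yi - ni) ** 2 for yi, ni in zip(y, null_fitted))
    return 1.0 - rss / null_rss

# Made-up data for which a line through the origin fits poorly.
x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 9.0, 8.0, 7.0]

slope = fit_through_origin(x, y)
fitted = [slope * xi for xi in x]

mean_y = sum(y) / len(y)
# Previous definition: the null model predicts the mean (intercept only).
r2_intercept_null = r_squared(y, fitted, [mean_y] * len(y))
# Proposed definition for a model without intercept: the null model predicts 0.
r2_zero_null = r_squared(y, fitted, [0.0] * len(y))
```

With the intercept-only null model the R² is negative here, because that null model is not nested in the no-intercept model. With the all-zero null model (the special case `slope = 0`), the R² stays between 0 and 1.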

@ararslan
Member

If the ping for me was regarding Survival, which defines nullloglikelihood(::CoxModel), all is well since Cox proportional hazards regression models don't have an intercept. 😄 No change would be needed to the package or its documentation. (You can do fit(CoxModel, zeros(length(events), 1), events), for which loglikelihood is the same as nullloglikelihood on a model fit to events with actual data. That's perfectly in line with the suggested changes here.)
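
A small sketch of why this works, assuming a single covariate and no tied event times. The function `cox_partial_loglik` and the data are hypothetical and are not taken from Survival.jl. When every covariate value is zero, the partial log-likelihood does not depend on the coefficient, and it equals the value at a coefficient of zero with the actual data.

```python
import math

def cox_partial_loglik(times, events, x, beta):
    """Cox partial log-likelihood for one covariate, assuming no tied event times."""
    ll = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored observations only contribute through risk sets
        risk_set = [j for j, t_j in enumerate(times) if t_j >= t_i]
        ll += x[i] * beta - math.log(sum(math.exp(x[j] * beta) for j in risk_set))
    return ll

# Made-up survival data: 1 = event observed, 0 = censored.
times = [2.0, 3.0, 5.0, 7.0, 11.0]
events = [1, 1, 0, 1, 1]
x = [0.3, -1.2, 0.8, 0.1, -0.5]

# Null log-likelihood: actual covariate, coefficient fixed at 0.
null_ll = cox_partial_loglik(times, events, x, 0.0)
# Zero covariate: the coefficient value is irrelevant (1.7 is arbitrary).
zero_x_ll = cox_partial_loglik(times, events, [0.0] * len(x), 1.7)
# Closed form: minus the sum of log risk-set sizes at each event time.
closed_form = -(math.log(5) + math.log(4) + math.log(2) + math.log(1))
```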

@ararslan ararslan changed the title Ajust definition of null model when there is no intercept Adjust definition of null model when there is no intercept May 12, 2022
@crsl4

crsl4 commented May 12, 2022

Tagging @cecileane and @pbastide who created the functions that use nulldeviance.

@lbittarello

Just to be sure: what would the predictions of a (say) logistic model without any regressor or an intercept be? 0.5 everywhere?

@nalimilan
Member Author

Yes. For GLMs, the predictor would always be 0 (before applying the link function, giving 0.5 for logit link).
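
As a rough sketch of this answer, assuming a Bernoulli response with a logit link: a linear predictor of 0 maps to a fitted probability of 0.5, so the null log-likelihood is n·log(0.5). Because the saturated log-likelihood is 0 for 0/1 data, the null deviance is −2 times the null log-likelihood. The helper names and the data below are illustrative only.

```python
import math

def inverse_logit(eta):
    """Map a linear predictor to a probability under the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def bernoulli_loglik(y, mu):
    """Log-likelihood of 0/1 responses under a constant probability mu."""
    return sum(math.log(mu) if yi == 1 else math.log(1.0 - mu) for yi in y)

y = [1, 0, 0, 1, 1, 0, 1, 1]  # made-up binary responses

# Null model without intercept: the linear predictor is always 0.
mu_no_intercept = inverse_logit(0.0)
null_ll_no_intercept = bernoulli_loglik(y, mu_no_intercept)
null_dev_no_intercept = -2.0 * null_ll_no_intercept

# Null model with only an intercept: the MLE is the sample proportion.
mu_intercept_only = sum(y) / len(y)
null_ll_intercept_only = bernoulli_loglik(y, mu_intercept_only)
null_dev_intercept_only = -2.0 * null_ll_intercept_only
```

The no-intercept null deviance is never smaller than the intercept-only one, since the constant probability 0.5 is one of the candidates the intercept-only model can choose.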

@Nosferican

  • In the case of a panel random effects (à la Swamy-Arora), the intercept is based on the error model composition (weighted average of the between and within estimator components). In that case, there isn't a model without an "intercept".
  • For the nominal (logistic) model, there is no model without an intercept for each of the potential outcomes. I noticed I had not implemented the nulldeviance for that model.
  • For the ordinal (logistic) model, there are (number of outcomes − 1) intercepts. I had not implemented the nulldeviance for those yet...
  • In terms of the within estimator (or LSDV), the intercept is a dummy for adjusting the predictions and it does not affect the null deviance as far as I can tell, may be wrong here.
    In general, it would be good to know how that affects the relationship between deviance and loglikelihood.

@nalimilan
Member Author

@Nosferican For model families that always have an intercept, this PR is irrelevant, so I guess it's OK?

In terms of the within estimator (or LSDV), the intercept is a dummy for adjusting the predictions and it does not affect the null deviance as far as I can tell, may be wrong here.

Indeed, adding 0 + to the within model from the Econometrics.jl manual gives a different deviance but nulldeviance doesn't change. The CRMRTE ~ 0 model cannot be fitted, so I'm not sure what nulldeviance could mean for a within model without an intercept. I thought models with fixed effects always include an intercept (even if implicitly), just like any model that includes a categorical predictor. What does it mean to fit a within model without an intercept? What do you think nulldeviance should return for such models?

In general, it would be good to know how that affects the relationship between deviance and loglikelihood.

I'm not sure what the question is. This PR wouldn't change the relationship between nulldeviance and nullloglikelihood; they would just refer to a different model (the one without an intercept) than previously (the one with only the intercept), at least when the passed model has no intercept.

@Nosferican

In terms of the within estimator... The transformed data basically zeroes out the intercept. The intercept is then set in various manners for simplicity (e.g., adjusting it such that the predictions are centered around the mean response). That is more of a heuristic/artifact but not really something that comes from the estimator/model...

I don't recall exactly, but I think some tests that use the ratio between deviances or log-likelihoods (e.g., the likelihood ratio test) use one or the other for simplicity, and this would make that relation break (I think that happens in MixedModels). I sort of like that the nulldeviance and nullloglikelihood refer to the same model.

@nalimilan
Member Author

In terms of the within estimator... The transformed data basically zeroes out the intercept. The intercept is then set in various manners for simplicity (e.g., adjusting it such that the predictions are centered around the mean response). That is more of a heuristic/artifact but not really something that comes from the estimator/model...

OK. But then why does adding 0 + change the return value of deviance?

Anyway, that's more out of curiosity: it seems that the within model should keep nulldeviance and nullloglikelihood as they are now, since it always includes the intercept in mathematical terms, even if the presentation to users can change.

I sort of like that the nulldeviance and nullloglikelihood refer to the same model.

To be clear, this PR changes the definition of both so that they still refer to the same model. Otherwise things would indeed be really weird.

cecileane pushed a commit to crsl4/PhyloNetworks.jl that referenced this pull request May 16, 2022
will avoid negative r2. consistent with new definition in JuliaStats/StatsAPI.jl#14 
Co-authored-by: Cecile Ane <cecileane@users.noreply.github.com>
@cecileane

Great proposal. We updated PhyloNetworks to reflect the change.

@palday
Member

palday commented May 17, 2022

For mixed effects models, I think this is fine and doesn't have much impact. We don't define the null methods. @dmbates may have a different perspective on this, but my take is that it's not exactly clear what a null model would be for a mixed model, even when we assume an intercept. Is it just the equivalent null non-mixed model? Or do you include random intercepts? More generally, I think these methods are mostly useful for defining a coefficient of determination, but there is no single candidate for linear mixed models that has all the properties of R² for classical OLS linear models. The situation is even worse for generalized linear mixed models, which combine all the difficulties of both LMM and GLM.

tl;dr no objections to the current proposal and doesn't really impact MixedModels.jl because we intentionally don't define these methods.

@codecov-commenter

codecov-commenter commented May 23, 2022

Codecov Report

Merging #14 (3bb2e9f) into main (dfacfa6) will not change coverage.
The diff coverage is n/a.

@@            Coverage Diff            @@
##              main       #14   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            2         2           
  Lines           36        36           
=========================================
  Hits            36        36           
Impacted Files Coverage Δ
src/statisticalmodel.jl 100.00% <ø> (ø)


@nalimilan
Member Author

I've realized there's another tricky point to clarify when defining what the null model is. For models with no intercept but at least one categorical predictor (full dummy coding, as in y ~ 0 + x with x categorical), should the null model include an intercept? On the one hand, that could make sense, as one wouldn't expect the R² to be defined with reference to a model without an intercept AFAICT, as full dummy coding is just used to change the way results are presented. On the other hand, if you fit both a model y ~ 0 + z (with z continuous) and y ~ 0 + x + z (with x categorical), the reference model used to compute the R² shouldn't change to give meaningful comparisons, so in both cases the null model should have no intercept.

What do you think?

Currently, the definition of hasintercept we use merely checks the presence of a column full of ones, so it adopts the latter approach.

Cc: @kleinschmidt
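
To make the "column full of ones" check concrete, here is a hypothetical sketch. The function `has_intercept` below is an illustration written for this thread, not the actual hasintercept implementation in the modeling packages, which may differ. Design matrices are represented as lists of columns.

```python
def has_intercept(columns):
    """Return True if any column of the design matrix consists only of ones."""
    return any(all(value == 1 for value in column) for column in columns)

# y ~ 1 + x, with x categorical (levels a, b) under treatment coding.
with_intercept = [
    [1, 1, 1, 1],  # intercept column
    [0, 0, 1, 1],  # indicator for level b
]

# y ~ 0 + x, with x categorical under full dummy coding.
full_dummy = [
    [1, 1, 0, 0],  # indicator for level a
    [0, 0, 1, 1],  # indicator for level b
]

# y ~ 0 + z, with z continuous.
continuous_only = [
    [0.5, 1.2, -0.3, 2.0],
]

# The dummy columns sum to a constant column, yet no single column is all ones.
dummy_row_sums = [sum(row) for row in zip(*full_dummy)]
```

Under this check, both `y ~ 0 + x` and `y ~ 0 + z` count as having no intercept, even though the full dummy columns span the constant. This corresponds to the latter approach described above.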

@cecileane

The latter approach makes more sense to me also, where the null model excludes the intercept for y ~ 0 + x. I like that the interpretation does not depend on whether x is continuous or categorical. Users should be able to test various contrasts downstream of model fitting.

@nalimilan nalimilan merged commit d058f52 into main Jun 2, 2022
@nalimilan nalimilan deleted the nl/null branch June 2, 2022 16:45
@nalimilan
Member Author

OK, let's go with this then. Thanks everybody!


8 participants