Including number of observations used to build the model #82

mschubert · 2015-10-26T15:45:56Z

I think it would be useful if some version of the data.frame that represents the result also included a column with the number of observations that were used to build the model.

An easy way to access this would be to use, e.g., nobs() for stats::lm(); I'm sure other models report it in a similar way.
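
For reference, here is a minimal sketch of what nobs() reports for a stats::lm() fit. The mtcars dataset and the formula are illustrative choices only. Because lm() drops rows with missing values under the default na.action, nobs() can be smaller than the number of rows in the input data.

```r
# Fit a linear model on a built-in dataset (illustrative choice).
fit <- stats::lm(mpg ~ wt + hp, data = mtcars)

# Number of observations actually used in the fit.
n_used <- stats::nobs(fit)

# Introduce a missing value; with the default na.action (na.omit),
# lm() drops that row before fitting.
mtcars_na <- mtcars
mtcars_na$wt[1] <- NA
fit_na <- stats::lm(mpg ~ wt + hp, data = mtcars_na)
n_used_na <- stats::nobs(fit_na)
```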

It would be even more useful if it could include the actual number of observations for logicalTRUE or factorLEVEL.
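
As a rough illustration of the per-level counts meant here, the sketch below pulls them from the model frame of a fitted lm(). This is an assumption about one way to compute them by hand, not something broom provides; the dataset and variable choices are only examples.

```r
# Convert two mtcars columns to a factor and a logical (illustrative only).
df <- transform(mtcars, cyl = factor(cyl), am = as.logical(am))
fit <- stats::lm(mpg ~ cyl + am, data = df)

# The model frame holds exactly the rows used in the fit.
mf <- stats::model.frame(fit)

# Observations per factor level, and observations where the logical is TRUE.
cyl_counts <- table(mf$cyl)
am_true <- sum(mf$am)
```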


grasshoppermouse · 2016-07-09T20:26:51Z

I also think it would be useful if glance output included nobs. It looks like glance.stanreg already does:

https://github.com/dgrtwo/broom/blob/master/R/rstanarm_tidiers.R

hughjonesd · 2017-02-10T15:03:08Z

Just adding my vote to this feature. I think in most fields, it is standard to report the N as one of the summary statistics.

randomgambit · 2018-05-04T19:09:34Z

hello there! is there a fix for that very important feature?
thanks!!

alexpghayes · 2018-07-13T22:34:29Z

I'd be willing to add this in a tryCatch to finish_glance to pick up an n column for relevant models. The question then is how many models actually implement nobs() methods.
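
A minimal sketch of the tryCatch idea, assuming a one-row data frame of glance results. The helpers safe_nobs() and add_nobs() are hypothetical names used for illustration; they are not broom functions and this is not how finish_glance is actually written.

```r
# Hypothetical helper (not part of broom): return nobs(model), or NA when
# no nobs() method works for this model class.
safe_nobs <- function(model) {
  tryCatch(
    as.integer(stats::nobs(model)),
    error = function(e) NA_integer_
  )
}

# Hypothetical stand-in for the step that appends shared columns to a
# one-row glance data frame.
add_nobs <- function(ret, model) {
  ret$nobs <- safe_nobs(model)
  ret
}

fit <- stats::lm(mpg ~ wt, data = mtcars)
glanced <- add_nobs(data.frame(r.squared = summary(fit)$r.squared), fit)

# An object with no usable nobs() method falls through to NA.
unsupported <- structure(list(), class = "no_nobs_method")
n_missing <- safe_nobs(unsupported)
```

Returning NA instead of raising an error keeps the column present for every model, at the cost of hiding which classes lack a method.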

I'm hesitant to report counts for each factor level in glance() because this is moving more into data summarizing than properties of a model. If a tidy() method doesn't already inherit this information from summary() or whatnot, I don't think it's worth the effort to try to implement this consistently across tidiers. Also, skimr::skim() is fantastic for this sort of thing.

vincentarelbundock · 2019-01-24T14:29:40Z

I'm writing a regression summary package built on broom, and users are requesting this feature. It's pretty important to me, and I would be willing to do it if you tell me about your implementation preferences.

To answer your question, a lot of models actually implement the nobs method. I went through every extract method in the texreg package. I may have missed a couple, but the models not listed below should work with nobs:

#default
stats::nobs(model)

#felm
summary(model)$N

#censReg
summary(model)$nobs

#btergm
#mbtergm
model@nobs

#betaor
#betamfx
model$fit$nobs

#averaging
#model.selection (MuMIn)
as.numeric(attr(model, 'nobs'))

#sienaFit
model$n

#zeroinfl
summary(model)$n

#fGARCH
length(model@data) 

# gel
NROW(model$gt) 

# lme4
dim(model.frame(model))[1] 

#lmrob
#systemfit
#lmRob
length(model$residuals) 

#logitmfx
#probitmfx
#negbinirr
#negbinmfx
nrow(model$fit$model) 

#lrm
model$stats[1]

#mlogit
#plm
#pmg
#rq
#summary.lm
nrow(summary(model)$residuals)

#mnlogit
summary(model)$model.size$N

#multinom
#sarlm
nrow(summary(model)$fitted.values)

#pgmm
attr(summary(model), 'pdim')$nT$N

#simex
length(model$model$residuals)

#survreg
length(model$linear.predictors)

#zelig
nrow(model$data)

#pglm
length(model$gradientObs[, 1])

vincentarelbundock · 2019-01-24T15:31:31Z

@alexpghayes One possible design that I would be willing to implement:

For each element of the list above that is not compatible with nobs, extract the value explicitly in the model-specific glance function.

Modify the finish_glance function: if 'n' is not already in names(ret), fall back to stats::nobs wrapped in tryCatch.

This is a bit more work (which I am willing to do), but it's explicit, and would avoid unexpected side effects from having a bunch of ifelse statements.

gavinsimpson · 2019-01-24T20:48:05Z

Sounds simpler to just implement nobs() methods for these internally, where they don't exist. Ideally these would be offered upstream to the respective package maintainers, but there's nothing stopping these being in broom if they are not wanted or maintainers are unresponsive. Using nobs() makes it simpler/safer to implement return of number of observations in glance.
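
To make the suggestion concrete, here is a minimal sketch of adding a nobs() method for a class that lacks one. The toy_fit class is hypothetical and stands in for a real model class; the residuals-based count mirrors the lmrob/systemfit entry in the list above.

```r
# Toy model class standing in for a package whose fits lack a nobs() method
# (hypothetical; for illustration only).
toy_fit <- function(y) {
  structure(list(residuals = y - mean(y)), class = "toy_fit")
}

# S3 method: once defined, stats::nobs() dispatches to it like any other model.
nobs.toy_fit <- function(object, ...) {
  length(object$residuals)
}

m <- toy_fit(c(2.1, 3.4, 5.0, 4.2))
n <- stats::nobs(m)
```

Inside a package, such a method would also need to be registered in NAMESPACE with S3method() so that dispatch finds it.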

vincentarelbundock · 2019-01-24T21:03:38Z

Sure. I'm happy to write a bunch of nobs methods if people find those useful. There's also a discussion about how these methods would be used in my WIP PR: #594

Basically,

Each glance function calls its own nobs()
finish_glance calls nobs() for all of them, as it currently does for AIC, BIC, etc.

vincentarelbundock · 2019-01-25T02:37:22Z

I'm in the process of checking every model object for which broom offers a glance function to see if they work with stats::nobs. I'm also writing new methods for those that don't. The results are collected in this Gist:

https://gist.github.com/vincentarelbundock/24bedac98499181790aab230cc5b74bc

alexpghayes · 2019-03-09T22:22:53Z

Closed in #597! Thanks @vincentarelbundock!

github-actions · 2021-03-11T00:20:32Z

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

alexpghayes added the feature-request label Jun 6, 2018

nutterb added a commit to nutterb/broom that referenced this issue Jun 14, 2018

Add nobs, addresses tidymodels#82

e34e018

nutterb mentioned this issue Jun 14, 2018

Vif nutterb/broom#1

Merged

jamesfeigenbaum mentioned this issue Jun 15, 2018

Speed up process jamesfeigenbaum/textablr#6

Open

alexpghayes added this to the 0.7.0 milestone Jul 13, 2018

vincentarelbundock mentioned this issue Jan 24, 2019

WIP: Add number of observations to glance output. #594

Closed

alexpghayes closed this as completed Mar 9, 2019

github-actions bot locked and limited conversation to collaborators Mar 11, 2021
