- Added support for conf.level in augment.lm() (#1191 by @zietzm).
- Added support for columns adj.r.squared and npar in glance() method for objects outputted from mgcv::gam() (#1172).
- Soft-deprecated tidiers for margins objects, as the package was archived from CRAN in April 2024. In the case that the package is back on CRAN before the next package release, broom will once again Suggest and test support for the package (#1200).
- Moved forward with deprecation of tidiers for objects from the sp package. See resources linked in tidymodels/broom#1142 for more information on migration from retiring spatial packages.
- While this broom release contains no changes to the tidy.survfit() method for objects from the survival package, the package has bumped the minimum required version for survival. Before survival 3.6-4, tidy.survfit() propagated "inconsistent" n.censor values from survival for multi-state models (#1195).
- Corrected confidence interval values for precision components in tidy.betareg() output (#1169).
- Fixed bug in tidier for car::linearHypothesis() output with long formulas (#1171).
- Corrected coefficient values in tidy.varest() output (#1174).
- tidy.coxph() will now pass its ellipses (...) to summary() internally (#1151 by @ste-tuf).
- Transitioned the deprecation of the region argument to tidy.SpatialPolygonsDataFrame from a warn- to a hard-deprecation (#1142).
- Removed maptools and rgeos as Suggested packages ahead of their retirement. sp tidiers will be removed from a future release of the package (#1142).
- Addressed bug in mlogit tidiers where augment.mlogit() would fail if supplied a model fitted with a non-default dfidx() (#1156 by @gregmacfarlane).
- Addressed bug in ANOVA tidiers where tidy.anova() would fail if passed a model with many predictors (#1159 by @jwilliman).
- Addressed warnings in ANOVA tidiers for unrecognized column names Resid..Df, Resid..Dev, and Deviance; those columns will be renamed df.residual, residual.deviance, and deviance, respectively (#1159 by @jwilliman).
- Added an intercept argument to tidy.aov(), a logical indicating whether to include information on the intercept as the first row of results (#1144 by @victor-vscn).
- Moved forward with soft-deprecation of tidiers for objects from the sp package ahead of the retirement of the rgeos and maptools packages later this year. sp tidiers will be removed from a future release of the package (#1142).
- Fixed bug in augment.glm() where the .std.resid column always contained standardized deviance residuals regardless of the value passed to the type.residuals argument (#1147).
- Addressed test failures on R-devel.
- Fixed bug in tidy.multinom() where the conf.level argument would be ignored.
- The default data argument for augment.coxph() and augment.survreg() has been transitioned from NULL to model.frame(x) (#1126 by @capnrefsmmat).
- Migrated 'ggplot2' from strong to weak dependency, i.e. moved from Imports to Suggests.
- Fixed a bug where augment() results would not include residuals when the response term included a function call (#1121, #946, #937, #124).
- Improves performance of tidy.lm() and tidy.glm() for full-rank fits (#1112 by @capnrefsmmat).
- Moves forward with deprecation of tidiers for sparse matrices outputted from the Matrix package, initially soft-deprecated in broom 0.5.0. The Matrix tidiers were light wrappers around coercion methods that will now be deprecated from Matrix itself in the upcoming 1.4-2 release. The affected methods are tidy.sparseMatrix(), tidy.dgCMatrix(), and tidy.dgTMatrix(). Note that tidy.confusionMatrix(), for relevant objects outputted from the caret package, is unaffected (#1113).
- tidy.anova() works again with anova objects from the lme4 package (broken by addition of the terms column in the previous release).
broom 1.0.0 is the first "production" release of the broom package, and includes a number of notable changes to both functionality and governance.
As of this release, the broom team will be following a set of guidelines that clarify the scope of further development on the package. Given the package's wide use and long history, these guidelines prioritize backward compatibility over internal consistency and completeness. You can read those guidelines here!
We've also made notable changes to error handling in this release:
- Adds minimal ellipsis checking to warn on commonly misspecified arguments passed through ellipses. Notably:
  - tidy() methods will now warn when supplied an exponentiate argument if it will be ignored.
  - augment() methods will now warn when supplied a newdata argument if it will be ignored.
- The warning regarding tidiers only maintained via dispatch to lm and glm is now displayed only once per session, per unique dispatch. That is, if a class_a object is tidied using a (g)lm method, broom will not warn when tidying class_a objects for the rest of the session, but if a class_b object is tidied using a (g)lm method in the same session, broom will warn again (#1101).
Other fixes and improvements:
- Add exponentiate argument to tidy.boot() (#1039).
- Update in tidy.htest() converting matrix-columns to vector-columns (#1081).
- Address failures in tidy.glht() with conf.int = TRUE (#1103).
- Address failures in tidy.zoo() when input data does not have colnames (#1080).
- Transition tidiers for bivariate linear or spline-based interpolation: using list tidiers to interface with objects from the akima package is now considered off-label. See the interp package for a FOSS alternative.
- Address failures in tidy.svyolr() when p.values = TRUE. Instead of aliasing tidy.polr() directly, tidy.svyolr() lightly wraps that method and warns if p.values is supplied (#1107).
- Adds a term column and introduces support for car::lht() output in tidy.anova() (#1106 by @grantmcdermott).
- Adds a dedicated glance.anova method (which previously dispatched to the deprecated glance.data.frame() tidier, #1106 by @grantmcdermott).
This update makes significant improvements to documentation, fixes a number of bugs, and brings the development flow of the package up to date with other packages in the tidymodels.
In the big picture, this release:
- Makes many improvements to documentation:
  - All tidiers now have example code demonstrating usage in their documentation. Tidiers for base packages as well as selected others also include sample code for visualization of results with ggplot2.
  - Code examples in the documentation now largely follow a consistent style. These changes were made mainly to reflect the tidyverse style guide, addressing spacing, object naming, and commenting, among other things.
  - Examples previously marked with \dontrun or \donttest have been workshopped to run reliably.
- Clarifies errors and warnings for deprecated and unmaintained tidiers.
- Ensures that tidiers are placed in files named according to the model-supplying package rather than the model object class for easier navigability of the source code.
- Fix glance.fixest error when model includes only fixed effects and no regressors (#1018 by @arcruz0, #1088 by @vincentarelbundock).
- Address excessive messaging from tidy.speedlm (#1084 by @cgoo4, #1087 by @vincentarelbundock).
- Add nobs column to the output of glance.svyglm (#1085 by @fschaffner, #1086 by @vincentarelbundock).
- Ensure tidy.prcomp description entries use consistent punctuation (#1072 by @PursuitOfDataScience).
- Address breaking changes in glance.fixest and tidy.btergm.
- Simplify handling of MASS::polr output in the corresponding tidy and augment methods.
- Align continuous integration with current standards in tidymodels packages.
Nearly identical source to broom 0.7.11—updates the maintainer email address to an address listed in other CRAN packages maintained by the same person.
- Addressed issue with the ordering of original observations in augment.rqs. The function now also preserves the original data.frame names when the input data.frame only has one column (#1052 by @ilapros).
- Addressed warning from tidy.rma when x$ddf has length greater than 1 (#1064 by @wviechtb).
- Fix errors in glance.lavaan in anticipation of upcoming tidyr release (#1067 by @DavisVaughan).
- Corrected the confidence interval in tidy.crr(). The tidy.crr(conf.level=) argument was previously ignored (#1068 by @ddsjoberg).
- Clarifies error when psych::mediate output is dispatched to tidy.mediate (#1037 by @LukasWallrich).
- Allows user to specify confidence level for tidy.rma (#1041 by @TarenSanders).
- Clarifies documentation related to usage of augment_columns(); most package users should use augment() in favor of augment_columns(). See ?augment_columns for more details.
- Extends support for emmeans by fixing non-standard column names in case of asymptotically derived inferential statistics (#1046 by @crsh).
- Fixes use of index columns in augment.mlogit and adds .resid column to output (#1045, #1053, #1055, and #1056 by @jamesrrae and @gregmacfarlane).
- Correct column naming of standard error measures in glance.survfit().
- Various bug fixes and improvements to documentation.
- Fixes confidence intervals in tidy.crr(), which were previously exponentiated when exponentiate = FALSE (#1023 by @leejasme).
- Deprecates Rchoice tidiers, as the newest 0.3-3 release requires R 4.0+ and does not re-export needed generics.
- Updates to ergm tidiers in anticipation of changes in later releases (#1034 by @krivit).
- Fixed bug in glance.ergm related to handling of MCMC details.
- Address breakages in unit tests for {fixest} tidiers.
- Fixed bug in tidy.epi.2by2 that resulted in errors with new version of epiR (#1028 by @nt-williams).
- Added exponentiate argument to tidy.gam() tidier applicable for parametric terms (#1013 by @ddsjoberg).
- Added exponentiate argument to tidy.negbin() tidier (#1011 by @ddsjoberg).
- Fixed failures in spdep tidiers following breaking changes in the most recent release.
- Various bug fixes and improvements to documentation.
- Fixed bug in augment tidiers resulting in .fitted and .se.fit array columns.
- Fixed bug that made column y non-numeric after tidy_xyz (#973 by @jiho).
- Added tidiers for MASS::glm.nb (#998 by @joshyam-k).
- Fixed bug in tidy.fixest that sometimes prevented arguments like se from being used (#1001 by @karldw).
- Fixed bug in tidy.fixest that resulted in errors when columns with name x are present (#1007 by @grantmcdermott).
- Moved forward with planned deprecation of gamlss tidiers in favor of those provided in broom.mixed.
- Various bug fixes and improvements to documentation.
- Fixed bug in the nnet::multinom tidier in the case that the response variable has only two levels (#993 by @vincentarelbundock and @hughjonesd).
- Various bug fixes and improvements to documentation.
broom 0.7.4 introduces tidier support for a number of new model objects and improves functionality of many existing tidiers!
- Add tidiers for Rchoice objects (#961 by @vincentarelbundock and @Nateme16).
- Add tidiers for objects produced by car::leveneTest (#968 by @vincentarelbundock and @mkirzon).
- Add tidiers for objects produced by cmprsk::crr (#971 and #552 by @vincentarelbundock and @margarethannum).
- Add an augment() method for gam objects (#975 and #645 by @vincentarelbundock).
- Add tidiers for vars objects (#979 and #161 by @vincentarelbundock and @Diego-MX).
This release also restores tidiers for felm objects from the lfe package, which was recently unarchived from CRAN.
- tidy.emmGrid can now return std.error and conf.* columns at the same time (#962 by @vincentarelbundock and @jmbarbone).
- tidy.garch can now produce confidence intervals (#964 by @vincentarelbundock and @IndrajeetPatil).
- tidy.coxph can now report confidence intervals on models utilizing penalized/clustering terms (#966 by @vincentarelbundock and @matthieu-faron).
- augment.lm now works when some regression weights are equal to zero (#965 by @vincentarelbundock and @vnijs).
- tidy.coxph can now handle models utilizing penalized/clustering terms (#966 and #969 by @vincentarelbundock, @matthieu-faron, and @KZARCA).
- Fix bug in tidy.speedglm on R 4.0.0+ (#974 by @uqzwang).
- tidy.multinom works with matrix response (#977 and #666 by @vincentarelbundock and @atyre2).
- Various bug fixes and improvements to documentation and errors.
In broom 0.7.0, we introduced an error for model objects that subclassed lm and relied on tidy.lm(), or similarly for tidy.glm(). Tidiers for these objects were supported unintentionally, and we worried that tidiers for these objects would silently report inaccurate results.
In hindsight, this change was unnecessarily abrupt. We've decided to roll back this change, instead providing the following warning before allowing such objects to fall back to the lm/glm tidier methods:
"Tidiers for objects of class {subclass} are not maintained by the broom team, and are only supported through the {dispatched_method} tidier method. Please be cautious in interpreting and reporting broom output."
In addition,
- Restores tidiers for summary.lm objects (#953 by @grantmcdermott).
- Deprecate tidiers for the lfe package, which was archived from CRAN.
- Various bug fixes and improvements to documentation and errors.
- Various bug fixes and improvements to documentation and errors.
While broom 0.7.1 is a minor release, it includes a number of exciting new features and bug fixes!
- Add tidiers for margins objects (#700 by @grantmcdermott).
- Added tidier methods for mlogit objects (#887 by @gregmacfarlane).
- Add glance.coeftest() method (#932 by @grantmcdermott).
One of the more major improvements in this release is the addition of the interval argument to some augment methods for confidence, prediction, and credible intervals. These columns will be consistently labeled .lower and .upper! (#908 by @grantmcdermott, #925 by @bwiernik)
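As a minimal sketch of the interval argument, assuming a broom version with this feature (0.7.1 or later) and using the built-in mtcars data purely for illustration, augment() on an lm fit can be asked for either kind of interval:

```r
library(broom)

# Fit a simple linear model on a built-in dataset (illustrative choice).
fit <- lm(mpg ~ wt, data = mtcars)

# Confidence intervals for the fitted values of the training data.
aug_conf <- augment(fit, interval = "confidence")

# Prediction intervals for hypothetical new observations.
new_cars <- data.frame(wt = c(2.5, 3.5))
aug_pred <- augment(fit, newdata = new_cars, interval = "prediction")

# In both cases the bounds are reported in the .lower and .upper columns.
aug_conf[, c(".fitted", ".lower", ".upper")]
aug_pred
```

Which interval types are available depends on the underlying model class; the lm method shown here accepts "none", "confidence", and "prediction".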
In addition...
- Extended the glance.aov() method to include an r.squared column!
- glance.survfit() now passes ... to summary.survfit() to allow for adjustment of RMST and other measures (#880 by @vincentarelbundock).
- Several unsupported model objects that subclass glm and lm now error more informatively.
- A number of improvements to documentation throughout the package.
- Fixed newdata warning message in augment.*() output when the newdata didn't contain the response variable; augment methods no longer expect the response variable in the supplied newdata argument (#897 by @rudeboybert).
- Fixed a bug related to tidy.geeglm() not being sensitive to the exponentiate argument (#867).
- Fixed augment.fixest() returning residuals in the .fitted column. The method also now takes a type.residuals argument and defaults to the same type.predict argument as the fixest predict() method (#877 by @karldw).
- Fix tidy.felm confidence interval bug. Replaces "robust" argument with "se.type" (#919 by @grantmcdermott; supersedes #818 by @kuriwaki).
- Fix a bug in tidy.drc() where some term labels would result in the overwriting of entries in the curve column (#914).
- Fixed bug related to univariate zoo series in tidy.zoo() (#916 by @WillemVervoort).
- Fixed a bug related to tidy.prcomp() assigning the wrong PC labels from "loadings" and "scores" matrices (#910 by @tavareshugo).
- Fixed tidy.polr() bug where p-values could only be returned if exponentiate = FALSE.
We followed through with the planned deprecation of character vector tidiers in this release. Other vector tidiers that were soft-deprecated in 0.7.0 will be fully deprecated in a later release.
broom 0.7.0 is a major release with a large number of new tidiers, soft-deprecations, and planned hard-deprecations of functions and arguments.
- We have changed how we report degrees of freedom for lm objects (#212, #273). This is especially important for instructors in statistics courses. Previously the df column in glance.lm() reported the rank of the design matrix. Now it reports degrees of freedom of the numerator for the overall F-statistic. This is equal to the rank of the model matrix minus one (unless you omit an intercept column), so the new df should be the old df minus one.
- We are moving away from supporting summary.*() objects. In particular, we have removed tidy.summary.lm() as part of a major overhaul of internals. Instead of calling tidy() on summary-like objects, please call tidy() directly on model objects moving forward.
- We have removed all support for the quick argument in tidy() methods. This is to simplify internals and is for maintainability purposes. We anticipate this will not influence many users as few people seemed to use it. If this majorly cramps your style, let us know, as we are considering a new verb to return only model parameters. In the meantime, stats::coef() together with tibble::enframe() provides most of the functionality of tidy(..., quick = TRUE).
- All conf.int arguments now default to FALSE, and all conf.level arguments now default to 0.95. This should primarily affect tidy.survreg(), which previously always returned confidence intervals, although there are some others.
- Tidiers for emmeans-objects use the arguments conf.int and conf.level instead of relying on the argument names native to the emmeans::summary()-methods (i.e., infer and level). Similarly, multcomp-tidiers now include a call to summary() as previous behavior was akin to setting the now removed argument quick = TRUE. Both families of tidiers now use the adj.p.value column name when appropriate. Finally, emmeans-, multcomp-, and TukeyHSD-tidiers now consistently use the column names contrast and null.value instead of comparison, level1 and level2, or lhs and rhs (see #692).
This release of broom soft-deprecates the following functions and tidier methods:
- Tidier methods for data frames, rowwise data frames, vectors and matrices
- bootstrap()
- confint_tidy()
- fix_data_frame()
- finish_glance()
- augment.glmRob()
- tidy.table() and tidy.ftable() have been deprecated in favor of tibble::as_tibble()
- tidy.summaryDefault() and glance.summaryDefault() have been deprecated in favor of skimr::skim()
We have also gone forward with our planned mixed model deprecations, and have removed the following methods, which now live in broom.mixed:
- tidy.brmsfit()
- tidy.merMod(), glance.merMod(), augment.merMod()
- tidy.lme(), glance.lme(), augment.lme()
- tidy.stanreg(), glance.stanreg()
- tidyMCMC(), tidy.rjags(), tidy.stanfit()
- augment.factanal() now returns a tibble with column names .fs1, .fs2, ..., instead of factor1, factor2, ... (#650)
- We have renamed the output of augment.htest(). In particular, we have renamed the .residuals column to .resid and the .stdres column to .std.resid for consistency. These changes will only affect chi-squared tests.
- tidy.ridgelm() now always returns a GCV column and never returns an xm column. (#533 by @jmuhlenkamp)
- tidy.dist() no longer supports the upper argument.
The internals of augment.*() methods have largely been overhauled.
- If you pass a dataset to augment() via the data or newdata arguments, you are now guaranteed that the augmented dataset will have exactly the same number of rows as the original dataset. This differs from previous behavior primarily when there are missing values. Previously augment() would drop rows containing NA. This should no longer be the case.
- augment.*() methods no longer accept an na.action argument.
- In previous versions, several augment.*() methods inherited the augment.lm() method, but required additions to the augment.lm() method itself. We have shifted away from this approach in favor of re-implementing many augment.*() methods as standalone methods making use of internal helper functions. As a result, augment.lm() and some related methods have deprecated (previously unused) arguments.
- augment() tries to give an informative error when data isn't the original training data.
- The .resid column in the output of augment.*() methods is now consistently defined as y - y_hat.
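The row-count guarantee and the residual definition can be checked with a small sketch. It assumes broom 0.7.0 or later and uses the built-in airquality data, which has missing values; only the newdata path is shown here:

```r
library(broom)

# airquality has missing values in both Ozone and Solar.R.
fit <- lm(Ozone ~ Solar.R + Wind, data = airquality)

# Passing the full dataset as newdata keeps every row; rows with missing
# predictors get NA in .fitted instead of being dropped.
aug_new <- augment(fit, newdata = airquality)
nrow(aug_new) == nrow(airquality)

# On the rows used for fitting, .resid is the observed response minus
# the fitted value.
aug_fit <- augment(fit)
all.equal(aug_fit$.resid, aug_fit$Ozone - aug_fit$.fitted,
          check.attributes = FALSE)
```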
- anova objects from the car package (#754)
- pam objects from the cluster package (#637 by @abbylsmith)
- drm objects from the drc package (#574 by @edild)
- summary_emm objects from the emmeans package (#691 by @crsh)
- epi.2by2 objects from the epiR package (#711)
- fixest objects from the fixest package (#785 by @karldw)
- regsubsets objects from the leaps package (#535)
- lm.beta objects from the lm.beta package (#545 by @mattle24)
- rma objects from the metafor package (#674 by @malcolmbarrett, @softloud)
- mfx, logitmfx, negbinmfx, poissonmfx, probitmfx, and betamfx objects from the mfx package (#700 by @grantmcdermott)
- lmrob and glmrob objects from the robustbase package (#205, #505)
- sarlm objects from the spatialreg package (#847 by @gregmacfarlane and @petrhrobar)
- speedglm objects from the speedglm package (#685)
- svyglm objects from the survey package (#611)
- systemfit objects from the systemfit package (by @jaspercooper)
- We have restored a simplified version of glance.aov(), which used to inherit from the glance.lm() method and now contains only the following columns: logLik, AIC, BIC, deviance, df.residual, and nobs (see #212). Note that tidy.aov() gives more complete information about degrees of freedom in an aov object.
- tidy.felm() now has a robust = TRUE/FALSE option that supports robust and cluster standard errors. (#781 by @kuriwaki)
- Make .fitted values respect type.predict argument of augment.clm(). (#617)
- Return factor rather than numeric class predictions in .fitted of augment.polr(). (#619)
- Add an option to return p.values in tidy.polr(). (#833 by @LukasWallrich)
- tidy.kmeans() now uses the names of the input variables in the output by default. Set col.names = NULL to recover the old behavior.
- Previously, F-statistics for weak instruments were returned through glance.ivreg(). F-statistics are now returned through tidy.ivreg(instruments = TRUE). Default is tidy.ivreg(instruments = FALSE). glance.ivreg() still returns Wu-Hausman and Sargan test statistics.
- glance.biglm() now returns a df.residual column.
- tidy.prcomp() argument matrix gained new options "scores", "loadings", and "eigenvalues". (#557 by @GegznaV)
- tidy_optim() now provides the standard error if the Hessian is present. (#529 by @billdenney)
- tidy.htest() column names are now run through make.names() to ensure syntactic correctness. (#549 by @karissawhiting)
- tidy.lmodel2() now returns a p.value column. (#570)
- tidy.lsmobj() gained a conf.int argument for consistency with other tidiers.
- tidy.polr() now returns p-values if p.values is set to TRUE and the model does not contain factors with more than two levels.
- tidy.zoo() no longer changes column names that have spaces or other special characters (previously they were converted to data.frame friendly column names by make.names).
- glance.lavaan() now uses lavaan extractor functions instead of subsetting the fit object manually. (#835)
- glance.lm() no longer errors when only an intercept is provided as an explanatory variable. (#865)
- Bug fix for tidy.survreg() when robust is set to TRUE in model fitting (#842, #728)
- Bug fixes in glance.lavaan(): address confidence interval error (#577) and correct reported nobs and norig (#835)
- Bug fix in muhaz tidiers to ensure output is always a tibble (#824)
- Several glance.*() methods have been refactored in order to return a one-row tibble even when the model matrix is rank-deficient (#823)
- Bug fix to return correct confidence intervals in tidy.drc() (#798)
- Added default methods for objects that subclass glm and lm in order to error more informatively. (#749, #736, #708, #186)
- Bug fix to allow augment.kmeans() to work with masked data (#609)
- Bug fix to allow augment.Mclust() to work on univariate data (#490)
- Bug fix to allow tidy.htest() to support equal variances (#608)
- Bug fix to better allow tidy.boot() to support confidence intervals (#581)
- Bug fix for tidy.polr() when passed conf.int = TRUE (#498)
- Many glance() methods now return a nobs column, which contains the number of data points used to fit the model! (#597 by @vincentarelbundock)
- tidy() no longer checks for a log or logit link when exponentiate = TRUE, and we have refactored to remove extraneous exponentiate arguments. If you set exponentiate = TRUE, we assume you know what you are doing and that you want exponentiated coefficients (and confidence intervals if conf.int = TRUE) regardless of link function.
- We now use rlang::arg_match() when possible instead of match.arg() to give more informative errors on argument mismatches.
- The package's site has moved from https://broom.tidyverse.org/ to https://broom.tidymodels.org/.
- Revised several vignettes and moved them to the tidymodels.org website. The existing vignettes will now simply link to the revised versions.
- Many improvements to consistency and clarity of documentation.
- Various warnings resulting from changes to the tidyr API in v1.0.0 have been fixed. (#870)
- Removed dependencies on reshape2 and superseded functions in dplyr.
- All documentation now links to help files rather than topics.
- Moved core tests to the modeltests package.
- Generally, after this release, the broom dev team will ask that attempts to add tidier methods supporting a model object are first directed to the model-owning package. An article describing best practices in doing so can be found on the {tidymodels} website at https://www.tidymodels.org/learn/develop/broom/, and we will continue adding resources to that article as we develop them. In the case that the maintainer is uninterested in taking on the tidier methods, please note this in your issue or PR.
- Added a new vignette discussing how to implement new tidier methods in non-broom packages.
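To illustrate the exponentiate change noted in this list, here is a minimal sketch assuming broom 0.7.0 or later. The models are illustrative; the second uses an identity link to show that coefficients are exponentiated regardless of link function:

```r
library(broom)

# Logistic regression: exponentiated coefficients are odds ratios.
fit_logit <- glm(am ~ wt, data = mtcars, family = binomial)
raw <- tidy(fit_logit)
odds <- tidy(fit_logit, exponentiate = TRUE)
all.equal(odds$estimate, exp(raw$estimate))

# Gaussian GLM with an identity link: tidy() still exponentiates when asked,
# so it is up to the caller to decide whether that is meaningful.
fit_ident <- glm(mpg ~ wt, data = mtcars)
expd <- tidy(fit_ident, exponentiate = TRUE)
all.equal(expd$estimate, exp(tidy(fit_ident)$estimate))
```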
- Fix failing CRAN checks due to tibble 3.0.0 release. Removed xergm dependency.
- Remove tidiers for robust package and drop robust dependency (temporarily)
- Fixes failing CRAN checks as the joineRML package has been removed from CRAN
- Fixes failing CRAN checks due to new matrix classing in R 4.0.0
- Fixes failing CRAN checks
- Changes to accommodate ergm 3.10 release. tidy.ergm() no longer has a quick argument. The old default of quick = FALSE is now the only option.
tidy(), glance() and augment() are now re-exported from the generics package.
Tidiers now return tibble::tibble()s. This release also includes several new tidiers, new vignettes and a large number of bug fixes. We've also begun to more rigorously define tidier specifications: we've laid part of the groundwork for stricter and more consistent tidying, but the new tidier specifications are not yet complete. These will appear in the next release.
Additionally, users should note that we are in the process of migrating tidying methods for mixed models and Bayesian models to broom.mixed. broom.mixed is not on CRAN yet, but all mixed model and Bayesian tidiers will be deprecated once broom.mixed is on CRAN. No further development of mixed model tidiers will take place in broom.
Almost all tidiers should now return tibbles rather than data.frames. Deprecated tidying methods, Bayesian and mixed model tidiers still return data.frames.
Users are most likely to experience issues when using augment in situations where tibbles are stricter than data frames. For example, specifying model covariates as a matrix object will now error:
library(broom)
library(quantreg)
fit <- rq(stack.loss ~ stack.x, tau = .5)
broom::augment(fit)
#> Error: Column `stack.x` must be a 1d atomic vector or a list
This is because the default data argument data = model.frame(fit) cannot be coerced to tibble.
Another consequence of this is that augment.survreg and augment.coxph from the survival package now require that the user explicitly passes data to either the data or newdata arguments.
These restrictions will be relaxed in an upcoming release of broom pending support for matrix-columns in tibbles.
Developers are likely to experience issues:
- subsetting tibbles with [, which returns a tibble rather than a vector.
- setting rownames on tibbles, which is deprecated.
- using matrix and vector tidiers, now deprecated.
- handling the additional tibble classes tbl_df and tbl beyond the data.frame class
- linking to defunct documentation files; broom recently moved all tidiers to a roxygen2 template based documentation system.
This version of broom includes several new vignettes:
- vignette("available-methods", package = "broom") contains a table detailing which tidying methods are available
- vignette("adding-tidiers", package = "broom") is an in-progress guide for contributors on how to add new tidiers to broom
- vignette("glossary", package = "broom") contains tables describing acceptable argument names and column names for the in-progress new specification.
Several old vignettes have also been updated:
- vignette("bootstrapping", package = "broom") now relies on the rsample package and a tidyr::nest-purrr::map-tidyr::unnest workflow. This is now the recommended workflow for working with multiple models, as opposed to the old dplyr::rowwise-dplyr::do based workflow.
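The nest-map-unnest workflow mentioned above can be sketched as follows. This is an illustration on mtcars rather than the vignette's own example, and it assumes dplyr, purrr, and tidyr 1.0.0 or later (for the nest(data = ...) syntax):

```r
library(broom)
library(dplyr)
library(tidyr)
library(purrr)

per_cyl <- mtcars %>%
  # One row per cylinder count, with the remaining columns nested in `data`.
  nest(data = -cyl) %>%
  mutate(
    # Fit one model per group, then tidy each fit.
    fit = map(data, ~ lm(mpg ~ wt, data = .x)),
    tidied = map(fit, tidy)
  ) %>%
  select(cyl, tidied) %>%
  # Expand the tidied coefficient tables back into regular columns.
  unnest(tidied)

per_cyl
```

The result has one row per group and model term, so the three cylinder groups and two coefficients give six rows.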
- Matrix and vector tidiers have been deprecated in favor of tibble::as_tibble and tibble::enframe
- Dataframe tidiers and rowwise dataframe tidiers have been deprecated
- bootstrap() has been deprecated in favor of the rsample package
- inflate has been removed from broom
- The alpha argument has been removed from quantreg tidy methods
- The separate.levels argument has been removed from tidy.TukeyHSD. To obtain the effect of separate.levels = TRUE, users may tidyr::separate after tidying. This is consistent with the multcomp tidier behavior.
- The fe.error argument was removed from tidy.felm. When fixed effects are tidied, their standard errors are now always included.
- The diag argument in tidy.dist has been renamed diagonal
Advice to help beginners make PRs (#397 by @karldw)
-
glance
support forarima
objects fit withmethod = "CSS"
(#396 by @josue-rodriguez) -
A bug fix to re-enable tidying
glmnet
objects withfamily = multinomial
(#395 by @erleholgersen) -
A bug fix to allow tidying
quantreg
intercept only models (#378 by @erleholgersen) -
A bug fix for
aovlist
objects (#377 by @mvevans89) -
Support for
glmnetUtils
objects (#352 by @Hong-Revo) -
A bug fix to allow
tidy_emmeans
to handle column names with dashes (#351 by @bmannakee) -
augment.felm
no longer returns.fe_
and.comp
columns -
Support saved formulas in
augment.felm
(#347 by @ShreyasSingh) -
confint_tidy
now drops rows of allNA
(#345 by @atyre2) -
A new tidier for
caret::confusionMatrix
objects (#344 by @mkuehn10) -
Tidiers for
Kendall::Kendall
objects (#343 by @cimentadaj) -
A new tidying method for
car::durbinWatsonTest
objects (#341 by @mkuehn10) -
glance
throws an informative error forquantreg:rq
models fit with multipletau
values (#338 by @bfgray3) -
tidy.glmnet
gains the ability to retain zero-valued coefficients with areturn_zeros
argument that defaults toFALSE
(#337 by @bfgray3) -
tidy.manova
now retains aResiduals
row (#334 by @jarvisc1) -
Tidiers for
ordinal::clm
,ordinal::clmm
,survey::svyolr
andMASS::polr
ordinal model objects (#332 by @larmarange) -
Support for
anova
objects fromcar::Anova
(#325 by @mariusbarth) -
Tidiers for
tseries::garch
models (#323 by @wilsonfreitas) -
Removed dependency on
psych
package (#313 by @nutterb) -
Improved error messages (#303 by @michaelweylandt)
-
Compatibility with new
rstanarm
andloo
packages (#298 by @jgabry) -
Support for tidying lists return by
irlba::irlba
-
A truly huge increase in unit tests (#267 by @dchiu911)
-
Bug fix for
tidy.prcomp
when missing labels (#265 by @corybrunson) -
Added a
pkgdown
site at https://broom.tidyverse.org/ (#260 by @jayhesselberth) -
Added tidiers for
AER::ivreg
models (#247 by @hughjonesd) -
Added tidiers for the
lavaan
package (#233 by @puterleat) -
Added
conf.int
argument totidy.coxph
(#220 by @larmarange) -
Added
augment
method for chi-squared tests (#138 by @larmarange) -
changed default se.type for
tidy.rq
to match that ofquantreg::summary.rq()
(#404 by @ethchr) -
Added argument
quick
fortidy.plm
andtidy.felm
(#502 and #509 by @MatthieuStigler) -
Many small improvements throughout
Many many thanks to all the following for their thoughtful comments on design, bug reports and PRs! The community of broom contributors has been kind, supportive and insightful and I look forward to working you all again!
@atyre2, @batpigandme, @bfgray3, @bmannakee, @briatte, @cawoodjm, @cimentadaj, @dan87134, @dgrtwo, @dmenne, @ekatko1, @ellessenne, @erleholgersen, @ethchr, @huftis, @IndrajeetPatil, @jacob-long, @jarvisc1, @jenzopr, @jgabry, @jimhester, @josue-rodriguez, @karldw, @kfeilich, @larmarange, @lboller, @mariusbarth, @michaelweylandt, @mine-cetinkaya-rundel, @mkuehn10, @mvevans89, @nutterb, @ShreyasSingh, @stephlocke, @strengejacke, @topepo, @willbowditch, @WillemSleegers, @wilsonfreitas, and @MatthieuStigler.
-
Fixed gam tidiers to work with "Gam" objects, due to an update in gam 1.15. This fixes failing CRAN tests
-
Improved test coverage (thanks to #267 from Derek Chiu)
-
Changed the deprecated
dplyr::failwith
to purrr::possibly
-
augment
and glance
on NULLs now return an empty data frame -
Deprecated the
inflate()
function in favor of tidyr::crossing
-
Fixed confidence intervals in the gmm tidier (thanks to #242 from David Hugh-Jones)
-
Fixed a bug in bootstrap tidiers (thanks to #167 from Jeremy Biesanz)
-
Fixed tidy.lm with
quick = TRUE
to return terms as character rather than factor (thanks to #191 from Matteo Sostero) -
Added tidiers for
ivreg
objects from the AER package (thanks to #245 from David Hugh-Jones) -
Added tidiers for
survdiff
objects from the survival package (thanks to #147 from Michał Bojanowski) -
Added tidiers for
emmeans
from the emmeans package (thanks to #252 from Matthew Kay) -
Added tidiers for
speedlm
and speedglm
from the speedglm package (#685, thanks to #248 from David Hugh-Jones) -
Added tidiers for
muhaz
objects from the muhaz package (thanks to #251 from Andreas Bender) -
Added tidiers for
decompose
and stl
objects from stats (thanks to #165 from Aaron Jacobs)
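For readers moving off the deprecated inflate() function mentioned above, the following is a small sketch of what tidyr::crossing does: it builds one row for every combination of its inputs. The column names k and replicate are illustrative assumptions, not taken from the broom documentation.

```r
library(tidyr)

# Every combination of the supplied values, one row each
grid <- crossing(k = 1:3, replicate = 1:2)
```

The resulting tibble can then be grouped or mapped over to fit one model per combination.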
-
Added tidiers for
lsmobj
and ref.grid
objects from the lsmeans package -
Added tidiers for
betareg
objects from the betareg package -
Added tidiers for
lmRob
and glmRob
objects from the robust package -
Added tidiers for
brms
objects from the brms package (thanks to #149 from Paul Buerkner) -
Fixed tidiers for orcutt 2.0
-
Changed
tidy.glmnet
to filter out rows where estimate == 0. -
Updates to
rstanarm
tidiers (thanks to #177 from Jonah Gabry) -
Fixed issue with survival package 2.40-1 (thanks to #180 from Marcus Walz)
-
Added AppVeyor, codecov.io, and code of conduct
-
Changed name of "NA's" column in summaryDefault output to "na"
-
Fixed
tidy.TukeyHSD
to include term
column. Also added separate.levels
argument, with option to separate comparison
into level1
and level2
-
Fixed
tidy.manova
to use correct column name for test (previously, always pillai
) -
Added
kde_tidiers
to tidy kernel density estimates -
Added
orcutt_tidiers
to tidy the results of cochrane.orcutt
from the orcutt package -
Added
tidy.dist
to tidy the distance matrix output of dist
from the stats package -
Added
tidy
and glance
for lmodel2
objects from the lmodel2 package -
Added tidiers for
poLCA
objects from the poLCA package -
Added tidiers for sparse matrices from the Matrix package
-
Added tidiers for
prcomp
objects -
Added tidiers for
Mclust
objects from the mclust package -
Added tidiers for
acf
objects -
Fixed to be compatible with dplyr 0.5, which is being submitted to CRAN
-
Added tidiers for geeglm, nlrq, roc, boot, bgterm, kappa, binWidth, binDesign, rcorr, stanfit, rjags, gamlss, and mle2 objects.
-
Added
tidy
methods for lists, including u, d, v lists from svd
, and x, y, z lists used by image
and persp
-
Added
quick
argument to tidy.lm
, tidy.nls
, and tidy.biglm
, to create a smaller and faster version of the output. -
Changed
rowwise_df_tidiers
to allow the original data to be saved as a list column, then provided as a column name to augment
. This required removing data
from the augment
S3 signature. Also added tests-rowwise.R
-
Fixed various issues in ANOVA output
-
Fixed various issues in lme4 output
-
Fixed issues in tests caused by dev version of ggplot2
-
Added tidiers for "plm" (panel linear model) objects from the plm package.
-
Added
tidy.coeftest
for coeftest objects from the lmtest package. -
Set up
tidy.lm
to work with "mlm" (multiple linear model) objects (those with multiple response columns). -
Added
tidy
and glance
for "biglm" and "bigglm" objects from the biglm package. -
Fixed bug in
tidy.coxph
when one-row matrices are returned -
Added
tidy.power.htest
-
Added
tidy
and glance
for summaryDefault
objects -
Added tidiers for "lme" (linear mixed effects models) from the nlme package
-
Added
tidy
and glance
for multinom
objects from the nnet package.
-
Fixed bug in
tidy.pairwise.htest
, which now can handle cases where the grouping variable is numeric. -
Added
tidy.aovlist
method. This added stringr
package to IMPORTS to trim whitespace from the beginning and end of the term
and stratum
columns. This also required adjusting tidy.aov
so that it could handle strata that are missing p-values. -
Set up
glance.lm
to work with aov
objects along with lm
objects. -
Added
tidy
and glance
for matrix objects, with tidy.matrix
converting a matrix to a data frame with rownames included, and glance.matrix
returning the same result as glance.data.frame
. -
Changed DESCRIPTION Authors@R to new format
-
Fixed small bug in
felm
where the .fitted
and .resid
columns were matrices rather than vectors. -
Added tidiers for
rlm
(robust linear model) and gam
(generalized additive model) objects, including adjustments to "lm" tidiers in order to handle them. See ?rlm_tidiers
and ?gam_tidiers
for more. -
Removed rownames from
tidy.cv.glmnet
output
-
The behavior of
augment
, particularly with regard to missing data and the na.exclude
argument, has been made consistent through the use of the augment_columns
function across the following models: -
lm
-
glm
-
nls
-
merMod
(lme4
) -
survreg
(survival
) -
coxph
(survival
)
-
Unit tests in tests/testthat/test-augment.R
were added to ensure consistency
across these models.
tidy
, augment
and glance
methods were added for rowwise_df
objects, and are set up to apply across their rows. This allows for simple patterns such as:
    regressions <- mtcars %>%
      group_by(cyl) %>%
      do(mod = lm(mpg ~ wt, .))
    regressions %>% tidy(mod)
    regressions %>% augment(mod)
See ?rowwise_df_tidiers
for more.
-
Added
tidy
and glance
methods for Arima
objects, and tidy
for pairwise.htest
objects. -
Fixes for CRAN: changed the package description to title case and removed NOTES, mostly by adding
globals.R
to declare global variables. -
This is the original version published on CRAN.
-
Tidiers have been added for S3 objects from the following packages:
-
lme4
-
glmnet
-
survival
-
zoo
-
lfe (felm objects)
-
MASS
(ridgelm
objects)
-
-
tidy
and glance
methods for data.frames have also been added, and augment.data.frame
produces an error (rather than returning the same data.frame). -
stderror
has been changed to std.error
(affects many functions) to be consistent with broom's naming conventions for columns. -
A function
bootstrap
has been added based on this example, to perform the common use case of bootstrapping models.
-
Added "augment" S3 generic and various implementations. "augment" does something different from tidy: it adds columns to the original dataset, including predictions, residuals, or cluster assignments. This was originally described as "fortify" in ggplot2.
-
Added "glance" S3 generic and various implementations. "glance" produces a one-row data frame summary, which is necessary for tidy outputs with values like R^2 or F-statistics.
-
Re-wrote intro broom vignette/README to introduce all three methods.
-
Wrote a new kmeans vignette.
-
Added tidying methods for multcomp, sp, and map objects (from fortify-multcomp, fortify-sp, and fortify-map from ggplot2).
-
Because this integrates substantial amounts of ggplot2 code (with permission), added Hadley Wickham as an author in DESCRIPTION.
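To make the distinction between the three generics described above concrete, here is a brief sketch using a linear model on the built-in mtcars data. The model formula and object names are illustrative choices, and the exact set of output columns depends on the installed broom version.

```r
library(broom)

fit <- lm(mpg ~ wt, data = mtcars)

coefs <- tidy(fit)     # one row per model term (intercept and wt)
obs   <- augment(fit)  # the data used in the fit plus columns such as .fitted and .resid
summ  <- glance(fit)   # a one-row summary with values such as r.squared
```

In short, tidy() works at the coefficient level, augment() at the observation level, and glance() at the model level.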