Fix formatting of bag imputation on factors #800

topepo · 2021-09-14T21:21:59Z

There is a bug (reported to the maintainer today) in ipred::predict.classbag() where the factor levels of the predictions may not be correct. This results in a reverse dependency breakage.

This PR:

- temporarily converts the outcome to a factor if a character column is to be imputed. The original data is untouched.
- casts the imputation predictions into a format consistent with the raw data (factor, character, or numeric)
- no longer casts the data to be a factor (to be consistent with prep(rec, strings_as_factors = FALSE))
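
As a rough illustration of the casting idea in the list above, here is a minimal base R sketch. The helper name `cast_like_original()` and its arguments are hypothetical and are not taken from R/impute_bag.R; it only assumes that predictions may arrive as a factor whose levels cannot be trusted.

```r
# Hypothetical helper for illustration only; not the code in R/impute_bag.R.
cast_like_original <- function(pred, original) {
  # Drop any factor structure on the predictions first so that
  # possibly incorrect levels cannot leak into the result.
  if (is.factor(pred)) {
    pred <- as.character(pred)
  }
  if (is.factor(original)) {
    # Rebuild the factor using the levels of the raw column.
    factor(pred, levels = levels(original), ordered = is.ordered(original))
  } else if (is.character(original)) {
    as.character(pred)
  } else {
    as.numeric(pred)
  }
}

# Predictions come back as a factor with levels in a different order.
raw_fct <- factor(c("a", "b", NA), levels = c("a", "b", "c"))
pred <- factor("b", levels = c("c", "b"))

cast_like_original(pred, raw_fct)     # factor using the raw column's levels
cast_like_original(pred, c("a", NA))  # plain character vector
```

Going through character first is a deliberate choice in this sketch: calling as.numeric() directly on a factor would return the integer codes rather than the values.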

I also made changes to a $%&# Rmd file that is always changing because of cli::rule() widths and random step ids.

juliasilge · 2021-09-15T02:07:09Z

In workflows, we handle the printing width like this:

options(cli.width = 70, width = 70, cli.unicode = FALSE)

I think this is a really good thing to have so that the printed output is more consistent. 👍

I agree that the random IDs are frustrating, but if you look at how this documentation ends up, we never explain what id does. It seems like a bad idea (confusing to folks who land here) to use it in the documentation as it exists overall now.

We could add more explanation, like what we have here, but this page is already getting really long-winded and detailed.

- Do we prefer to add this discussion here so we can use id? (I don't think I would want to add it here on its own merits TBH. Although now that I look, I can find it on TMwR but maybe not anywhere in the recipes docs?)
- Do we want to get used to not staging this chunk? This is what I typically do when I run devtools::document(); I either don't stage this whole file or if I need something from this file, I don't stage that chunk.

DavisVaughan · 2021-09-15T12:17:37Z

I fixed the random step ids in #794 by setting the seed. We shouldn't have to worry about that anymore.

topepo · 2021-09-15T15:31:09Z

Torsten fixed the ipred bug within like 15 and the new version is on CRAN

github-actions · 2021-09-30T00:14:38Z

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.

topepo added 4 commits September 14, 2021 16:56

stop with the random ids being printed

b5fe37e

re-cast the predictions due to ipred bug

3eb9b6e

fix width of Rmd cli output

4b52a60

make bagging imputation work with strings_as_factors = FALSE

845993d

topepo requested review from juliasilge and DavisVaughan September 14, 2021 21:22

DavisVaughan reviewed Sep 15, 2021

R/impute_bag.R (resolved)

DavisVaughan reviewed Sep 15, 2021

R/impute_bag.R (resolved)

topepo added 6 commits September 15, 2021 11:35

undo id specification in Rmd

4b02b3e

better cli options

27ecb68

update from master for Davis's changes

52aee40

ipred version requirement

da562f1

updated docs

9895672

update news

e68a892

topepo merged commit 66a5137 into master Sep 15, 2021

topepo deleted the bag-imp-levels branch September 15, 2021 18:03

github-actions bot locked and limited conversation to collaborators Sep 30, 2021
