Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix formatting of bag imputation on factors #800

Merged
merged 10 commits into from Sep 15, 2021
Merged

Fix formatting of bag imputation on factors #800

merged 10 commits into from Sep 15, 2021

Conversation

topepo
Copy link
Member

@topepo topepo commented Sep 14, 2021

There is a bug (reported to maintainer today) in ipred::predict.classbag() where the factor levels of the predictions may not be correct. This results in a reverse dependency breakage.

This PR

  1. temporarily converts the outcome to a factor if a character column is to be imputed. The original data is untouched.
  2. casts the imputation predictions into a format consistent with raw data (factor, character, or numeric)
  3. no longer casts the data to be a factor (to be consistent with prep(rec, strings_as_factors = FALSE))

I also made changes to a $%&# Rmd file that is always changing because of cli::rule() widths and random step id's.

@juliasilge
Copy link
Member

juliasilge commented Sep 15, 2021

In workflows, we handle the printing width like this:

options(cli.width = 70, width = 70, cli.unicode = FALSE)

I think this is a really good thing to make sure we have so this is more consistent. 👍

I agree that the random IDs are frustrating, but if you look at how this documentation ends up, we don't ever explain about what id does. It seems like a bad idea (confusing to folks who land here) to use it in the documentation as it exists overall now.

We could add more explanation, like what we have here, but this page is already getting really long-winded and detailed.

  • Do we prefer to add this discussion here so we can use id? (I don't think I would want to add it here on its own merits TBH. Although now that I look, I can find it on TMwR but maybe not anywhere in the recipes docs?)
  • Do we want to get used to not staging this chunk? This is what I typically do when I run devtools::document(); I either don't stage this whole file or if I need something from this file, I don't stage that chunk.

@DavisVaughan
Copy link
Member

I fixed the random step id's here by setting the seed. We shouldn't have to worry about that anymore
#794

@DavisVaughan

This comment has been minimized.

@topepo
Copy link
Member Author

topepo commented Sep 15, 2021

Torsten fixed the ipred bug within like 15 and the new version is on CRAN

@topepo topepo merged commit 66a5137 into master Sep 15, 2021
@topepo topepo deleted the bag-imp-levels branch September 15, 2021 18:03
@github-actions
Copy link

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Sep 30, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants