Fix formatting of bag imputation on factors (#800)
* stop with the random ids being printed

* re-cast the predictions due to ipred bug

* fix width of Rmd cli output

* make bagging imputation work with strings_as_factors = FALSE

* undo id specification in Rmd

* better cli options

* update from master for Davis's changes

* ipred version requirement

* updated docs

* update news
topepo committed Sep 15, 2021
1 parent 77fa788 commit 66a5137
Showing 6 changed files with 33 additions and 14 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -29,7 +29,7 @@ Imports:
     generics (>= 0.1.0),
     glue,
     gower,
-    ipred,
+    ipred (>= 0.9-12),
     lifecycle,
     lubridate,
     magrittr,
2 changes: 2 additions & 0 deletions NEWS.md
@@ -25,6 +25,8 @@
 
 * `step_logit()` gained an offset argument for cases where the input is either zero or one (#784)
 
+* A bug was fixed where imputed values via bagged trees would have the wrong levels.
+
 # recipes 0.1.16
 
 ## New Steps
8 changes: 6 additions & 2 deletions R/impute_bag.R
@@ -180,6 +180,10 @@ step_impute_bag_new <-
 bag_wrap <- function(vars, dat, opt, seed_val) {
   seed_val <- seed_val[1]
   dat <- as.data.frame(dat[, c(vars$y, vars$x)])
+  if (is.character(dat[[vars$y]])) {
+    dat[[vars$y]] <- factor(dat[[vars$y]])
+  }
+
   if (!is.null(seed_val) && !is.na(seed_val))
     set.seed(seed_val)
 
@@ -262,8 +266,8 @@ bake.step_impute_bag <- function(object, new_data, ...) {
       rlang::warn("All predictors are missing; cannot impute")
     } else {
       pred_vals <- predict(object$models[[imp_var]], pred_data)
-      pred_vals <- cast(pred_vals, new_data[[imp_var]])
-      new_data[[imp_var]] <- vec_cast(new_data[[imp_var]], pred_vals)
+      # For an ipred bug reported on 2021-09-14:
+      pred_vals <- cast(pred_vals, object$models[[imp_var]]$y)
       new_data[missing_rows, imp_var] <- pred_vals
     }
   }
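
Note on the R/impute_bag.R change: the two hunks appear to work together. A character outcome is converted to a factor before the bagged tree is fit, and predictions are then re-cast against the outcome stored with the fitted model rather than against the column in `new_data`. The sketch below illustrates that idea in base R only. The helper `cast_like()` and the toy vectors are hypothetical stand-ins, not the package's internal `cast()` or its data.

```r
# Toy illustration only: `cast_like()` and the data below are hypothetical,
# not the recipes internals.
cast_like <- function(x, ref) {
  if (is.factor(ref)) {
    # Re-level the predictions so they carry the reference outcome's levels.
    factor(as.character(x), levels = levels(ref), ordered = is.ordered(ref))
  } else {
    x
  }
}

# An outcome as it might arrive when strings are not converted to factors.
y_train <- c("north", "south", "south", "west")
if (is.character(y_train)) {
  y_train <- factor(y_train)
}

# Suppose a model returned a factor that only carries the levels it predicted.
pred <- factor(c("south", "north"))

fixed <- cast_like(pred, y_train)
levels(fixed)
# "north" "south" "west"
```

Casting against the training outcome means the imputed values share one set of levels regardless of which rows happen to be missing at bake time.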
20 changes: 10 additions & 10 deletions man/recipe.Rd

Generated file; diff not rendered by default.

2 changes: 1 addition & 1 deletion man/rmd/recipes.Rmd
@@ -1,5 +1,5 @@
 ```{r startup, include = FALSE}
-options(width = 70)
+options(cli.width = 70, width = 70, cli.unicode = FALSE)
 set.seed(123)
 library(dplyr)
13 changes: 13 additions & 0 deletions tests/testthat/test_impute_bag.R
@@ -100,3 +100,16 @@ test_that('tunable', {
   )
 })
 
+
+test_that('non-factor imputation', {
+  data(scat)
+  scat$Location <- as.character(scat$Location)
+  scat$Location[1] <- NA
+  rec <-
+    recipe(Species ~ ., data = scat) %>%
+    step_impute_bag(Location, impute_with = imp_vars(all_predictors())) %>%
+    prep(strings_as_factors = FALSE)
+  expect_true(is.character(bake(rec, NULL, Location)[[1]]))
+
+})
+