refreshing already prepped recipe with step_dummy doesn't work #492

Closed
TylerGrantSmith opened this issue Apr 13, 2020 · 1 comment · Fixed by #957
Labels
bug an unexpected problem or unintended behavior

Comments

@TylerGrantSmith

A recipe that contains a step_dummy step cannot be re-prepped. After the first prep, the recipe's term_info field describes the new dummy terms, and the original nominal terms have been removed from it.

When prep is called again on the step_dummy, this modified term_info is passed in, so the selector finds no nominal columns and col_names below ends up being NULL:

col_names <- terms_select(x$terms, info = info, empty_fun = passover)
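
To make the mechanism concrete, here is a minimal base-R sketch. It does not use the recipes internals: `term_info_before`, `term_info_after`, and `select_nominal()` are hypothetical stand-ins for the recipe's term_info table and for a nominal-column selector, assumed here only to illustrate why the second selection comes back empty.

```r
# Simplified stand-in for term_info before step_dummy has been trained.
# (Hypothetical structure; the real table in recipes carries more columns.)
term_info_before <- data.frame(
  variable = c("x", "y"),
  type     = c("nominal", "numeric"),
  stringsAsFactors = FALSE
)

# After training, the nominal column x has been replaced by numeric dummies.
term_info_after <- data.frame(
  variable = c("y", "x_X1", "x_X2"),
  type     = c("numeric", "numeric", "numeric"),
  stringsAsFactors = FALSE
)

# Hypothetical selector that mimics "pick all nominal columns" and
# returns NULL when nothing matches.
select_nominal <- function(info) {
  hits <- info$variable[info$type == "nominal"]
  if (length(hits) == 0) NULL else hits
}

first_prep  <- select_nominal(term_info_before)  # selects "x"
second_prep <- select_nominal(term_info_after)   # nothing left to select

print(first_prep)
print(second_prep)
```

With the updated table there is no nominal column left to find, which matches the "no columns were selected" message in the reprex that follows.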

suppressPackageStartupMessages(library(tidymodels))
#> Warning: package 'tidymodels' was built under R version 3.6.3
#> Warning: package 'broom' was built under R version 3.6.3
#> Warning: package 'ggplot2' was built under R version 3.6.3
#> Warning: package 'infer' was built under R version 3.6.3
#> Warning: package 'parsnip' was built under R version 3.6.3
#> Warning: package 'recipes' was built under R version 3.6.3
#> Warning: package 'tune' was built under R version 3.6.3
#> Warning: package 'workflows' was built under R version 3.6.3
#> Warning: package 'yardstick' was built under R version 3.6.3
test_data <- data.frame(x = factor(c(1,2)), y = c(1,2))

rec <- 
  recipe(y~., data = test_data) %>% 
  step_dummy(all_nominal(), one_hot = TRUE) %>% 
  prep()
rec
#> Data Recipe
#> 
#> Inputs:
#> 
#>       role #variables
#>    outcome          1
#>  predictor          1
#> 
#> Training data contained 2 data points and no missing data.
#> 
#> Operations:
#> 
#> Dummy variables from x [trained]
juice(rec)
#> # A tibble: 2 x 3
#>       y  x_X1  x_X2
#>   <dbl> <dbl> <dbl>
#> 1     1     1     0
#> 2     2     0     1

new_rec <- prep(rec, training = test_data, fresh = TRUE)
new_rec
#> Data Recipe
#> 
#> Inputs:
#> 
#>       role #variables
#>    outcome          1
#>  predictor          1
#> 
#> Training data contained 2 data points and no missing data.
#> 
#> Operations:
#> 
#> Dummy variables were *not* created since no columns were selected. [trained]
juice(new_rec)
#> # A tibble: 2 x 2
#>   x         y
#>   <fct> <dbl>
#> 1 1         1
#> 2 2         2
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.6.2 (2019-12-12)
#>  os       Windows 10 x64              
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  English_United States.1252  
#>  ctype    English_United States.1252  
#>  tz       America/Chicago             
#>  date     2020-04-13                  
#> 
#> - Packages -------------------------------------------------------------------
#>  package       * version    date       lib source        
#>  assertthat      0.2.1      2019-03-21 [1] CRAN (R 3.6.2)
#>  backports       1.1.5      2019-10-02 [1] CRAN (R 3.6.1)
#>  base64enc       0.1-3      2015-07-28 [1] CRAN (R 3.6.0)
#>  bayesplot       1.7.1      2019-12-01 [1] CRAN (R 3.6.3)
#>  boot            1.3-23     2019-07-05 [1] CRAN (R 3.6.2)
#>  broom         * 0.5.5      2020-02-29 [1] CRAN (R 3.6.3)
#>  callr           3.4.0      2019-12-09 [1] CRAN (R 3.6.2)
#>  class           7.3-15     2019-01-01 [1] CRAN (R 3.6.2)
#>  cli             2.0.2      2020-02-28 [1] CRAN (R 3.6.3)
#>  codetools       0.2-16     2018-12-24 [1] CRAN (R 3.6.2)
#>  colorspace      1.4-1      2019-03-18 [1] CRAN (R 3.6.1)
#>  colourpicker    1.0        2017-09-27 [1] CRAN (R 3.6.2)
#>  crayon          1.3.4      2017-09-16 [1] CRAN (R 3.6.2)
#>  crosstalk       1.0.0      2016-12-21 [1] CRAN (R 3.6.2)
#>  dials         * 0.0.5      2020-04-01 [1] CRAN (R 3.6.2)
#>  DiceDesign      1.8-1      2019-07-31 [1] CRAN (R 3.6.3)
#>  digest          0.6.25     2020-02-23 [1] CRAN (R 3.6.2)
#>  dplyr         * 0.8.3      2019-07-04 [1] CRAN (R 3.6.2)
#>  DT              0.11       2019-12-19 [1] CRAN (R 3.6.2)
#>  dygraphs        1.1.1.6    2018-07-11 [1] CRAN (R 3.6.3)
#>  evaluate        0.14       2019-05-28 [1] CRAN (R 3.6.2)
#>  fansi           0.4.1      2020-01-08 [1] CRAN (R 3.6.2)
#>  fastmap         1.0.1      2019-10-08 [1] CRAN (R 3.6.2)
#>  foreach         1.4.7      2019-07-27 [1] CRAN (R 3.6.2)
#>  furrr           0.1.0      2018-05-16 [1] CRAN (R 3.6.3)
#>  future          1.15.1     2019-11-25 [1] CRAN (R 3.6.2)
#>  generics        0.0.2      2018-11-29 [1] CRAN (R 3.6.2)
#>  ggplot2       * 3.3.0      2020-03-05 [1] CRAN (R 3.6.3)
#>  ggridges        0.5.2      2020-01-12 [1] CRAN (R 3.6.2)
#>  globals         0.12.5     2019-12-07 [1] CRAN (R 3.6.1)
#>  glue            1.3.1      2019-03-12 [1] CRAN (R 3.6.2)
#>  gower           0.2.1      2019-05-14 [1] CRAN (R 3.6.1)
#>  GPfit           1.0-8      2019-02-08 [1] CRAN (R 3.6.3)
#>  gridExtra       2.3        2017-09-09 [1] CRAN (R 3.6.2)
#>  gtable          0.3.0      2019-03-25 [1] CRAN (R 3.6.2)
#>  gtools          3.8.2      2020-03-31 [1] CRAN (R 3.6.2)
#>  highr           0.8        2019-03-20 [1] CRAN (R 3.6.2)
#>  htmltools       0.4.0      2019-10-04 [1] CRAN (R 3.6.2)
#>  htmlwidgets     1.5.1      2019-10-08 [1] CRAN (R 3.6.2)
#>  httpuv          1.5.2      2019-09-11 [1] CRAN (R 3.6.2)
#>  igraph          1.2.4.2    2019-11-27 [1] CRAN (R 3.6.2)
#>  infer         * 0.5.1      2019-11-19 [1] CRAN (R 3.6.3)
#>  inline          0.3.15     2018-05-18 [1] CRAN (R 3.6.3)
#>  ipred           0.9-9      2019-04-28 [1] CRAN (R 3.6.2)
#>  iterators       1.0.12     2019-07-26 [1] CRAN (R 3.6.2)
#>  janeaustenr     0.1.5      2017-06-10 [1] CRAN (R 3.6.2)
#>  knitr           1.26       2019-11-12 [1] CRAN (R 3.6.2)
#>  later           1.0.0      2019-10-04 [1] CRAN (R 3.6.2)
#>  lattice         0.20-38    2018-11-04 [1] CRAN (R 3.6.2)
#>  lava            1.6.6      2019-08-01 [1] CRAN (R 3.6.2)
#>  lhs             1.0.1      2019-02-03 [1] CRAN (R 3.6.3)
#>  lifecycle       0.1.0      2019-08-01 [1] CRAN (R 3.6.2)
#>  listenv         0.8.0      2019-12-05 [1] CRAN (R 3.6.2)
#>  lme4            1.1-21     2019-03-05 [1] CRAN (R 3.6.3)
#>  loo             2.2.0      2019-12-19 [1] CRAN (R 3.6.3)
#>  lubridate       1.7.4      2018-04-11 [1] CRAN (R 3.6.2)
#>  magrittr        1.5        2014-11-22 [1] CRAN (R 3.6.2)
#>  markdown        1.1        2019-08-07 [1] CRAN (R 3.6.2)
#>  MASS            7.3-51.4   2019-03-31 [1] CRAN (R 3.6.2)
#>  Matrix          1.2-18     2019-11-27 [1] CRAN (R 3.6.2)
#>  matrixStats     0.56.0     2020-03-13 [1] CRAN (R 3.6.3)
#>  mime            0.9        2020-02-04 [1] CRAN (R 3.6.2)
#>  miniUI          0.1.1.1    2018-05-18 [1] CRAN (R 3.6.2)
#>  minqa           1.2.4      2014-10-09 [1] CRAN (R 3.6.3)
#>  munsell         0.5.0      2018-06-12 [1] CRAN (R 3.6.2)
#>  nlme            3.1-142    2019-11-07 [1] CRAN (R 3.6.2)
#>  nloptr          1.2.2.1    2020-03-11 [1] CRAN (R 3.6.3)
#>  nnet            7.3-12     2016-02-02 [1] CRAN (R 3.6.2)
#>  parsnip       * 0.0.5      2020-01-07 [1] CRAN (R 3.6.3)
#>  pillar          1.4.3      2019-12-20 [1] CRAN (R 3.6.2)
#>  pkgbuild        1.0.6      2019-10-09 [1] CRAN (R 3.6.2)
#>  pkgconfig       2.0.3      2019-09-22 [1] CRAN (R 3.6.2)
#>  plyr            1.8.5      2019-12-10 [1] CRAN (R 3.6.2)
#>  prettyunits     1.0.2      2015-07-13 [1] CRAN (R 3.6.2)
#>  pROC            1.16.0     2020-01-12 [1] CRAN (R 3.6.2)
#>  processx        3.4.1      2019-07-18 [1] CRAN (R 3.6.2)
#>  prodlim         2019.11.13 2019-11-17 [1] CRAN (R 3.6.2)
#>  promises        1.1.0      2019-10-04 [1] CRAN (R 3.6.2)
#>  ps              1.3.0      2018-12-21 [1] CRAN (R 3.6.2)
#>  purrr         * 0.3.3      2019-10-18 [1] CRAN (R 3.6.2)
#>  R6              2.4.1      2019-11-12 [1] CRAN (R 3.6.2)
#>  Rcpp            1.0.3      2019-11-08 [1] CRAN (R 3.6.2)
#>  recipes       * 0.1.10     2020-03-18 [1] CRAN (R 3.6.3)
#>  reshape2        1.4.3      2017-12-11 [1] CRAN (R 3.6.2)
#>  rlang           0.4.4      2020-01-28 [1] CRAN (R 3.6.2)
#>  rmarkdown       2.0        2019-12-12 [1] CRAN (R 3.6.2)
#>  rpart           4.1-15     2019-04-12 [1] CRAN (R 3.6.2)
#>  rsample       * 0.0.6      2020-03-31 [1] CRAN (R 3.6.2)
#>  rsconnect       0.8.16     2019-12-13 [1] CRAN (R 3.6.2)
#>  rstan           2.19.3     2020-02-11 [1] CRAN (R 3.6.3)
#>  rstanarm        2.19.3     2020-02-11 [1] CRAN (R 3.6.3)
#>  rstantools      2.0.0      2019-09-15 [1] CRAN (R 3.6.3)
#>  rstudioapi      0.11       2020-02-07 [1] CRAN (R 3.6.2)
#>  scales        * 1.1.0      2019-11-18 [1] CRAN (R 3.6.2)
#>  sessioninfo     1.1.1      2018-11-05 [1] CRAN (R 3.6.2)
#>  shiny           1.4.0      2019-10-10 [1] CRAN (R 3.6.2)
#>  shinyjs         1.1        2020-01-13 [1] CRAN (R 3.6.2)
#>  shinystan       2.5.0      2018-05-01 [1] CRAN (R 3.6.3)
#>  shinythemes     1.1.2      2018-11-06 [1] CRAN (R 3.6.3)
#>  SnowballC       0.6.0      2019-01-15 [1] CRAN (R 3.6.0)
#>  StanHeaders     2.21.0-1   2020-01-19 [1] CRAN (R 3.6.2)
#>  stringi         1.4.6      2020-02-17 [1] CRAN (R 3.6.2)
#>  stringr         1.4.0      2019-02-10 [1] CRAN (R 3.6.2)
#>  survival        3.1-8      2019-12-03 [1] CRAN (R 3.6.2)
#>  threejs         0.3.3      2020-01-21 [1] CRAN (R 3.6.3)
#>  tibble        * 2.1.3      2019-06-06 [1] CRAN (R 3.6.2)
#>  tidymodels    * 0.1.0      2020-02-16 [1] CRAN (R 3.6.3)
#>  tidyposterior   0.0.2      2018-11-15 [1] CRAN (R 3.6.3)
#>  tidypredict     0.4.5      2020-02-10 [1] CRAN (R 3.6.3)
#>  tidyr           1.0.0      2019-09-11 [1] CRAN (R 3.6.2)
#>  tidyselect      1.0.0      2020-01-27 [1] CRAN (R 3.6.3)
#>  tidytext        0.2.2      2019-07-29 [1] CRAN (R 3.6.2)
#>  timeDate        3043.102   2018-02-21 [1] CRAN (R 3.6.0)
#>  tokenizers      0.2.1      2018-03-29 [1] CRAN (R 3.6.2)
#>  tune          * 0.0.1      2020-02-11 [1] CRAN (R 3.6.3)
#>  utf8            1.1.4      2018-05-24 [1] CRAN (R 3.6.2)
#>  vctrs           0.2.3      2020-02-20 [1] CRAN (R 3.6.3)
#>  withr           2.1.2      2018-03-15 [1] CRAN (R 3.6.2)
#>  workflows     * 0.1.1      2020-03-17 [1] CRAN (R 3.6.3)
#>  xfun            0.11       2019-11-12 [1] CRAN (R 3.6.2)
#>  xtable          1.8-4      2019-04-21 [1] CRAN (R 3.6.2)
#>  xts             0.12-0     2020-01-19 [1] CRAN (R 3.6.2)
#>  yaml            2.2.1      2020-02-01 [1] CRAN (R 3.6.2)
#>  yardstick     * 0.0.6      2020-03-17 [1] CRAN (R 3.6.3)
#>  zoo             1.8-7      2020-01-10 [1] CRAN (R 3.6.2)
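
A possible way to sidestep the problem, sketched under the assumption that the untrained recipe is still available: keep the unprepped specification in its own object and call prep() on that each time, rather than re-prepping the trained recipe. The name `rec_spec` is introduced here for illustration only, and this is a workaround sketch, not the fix that was merged.

```r
library(recipes)

test_data <- data.frame(x = factor(c(1, 2)), y = c(1, 2))

# Keep the untrained recipe around (rec_spec is a hypothetical name).
rec_spec <- recipe(y ~ ., data = test_data) %>%
  step_dummy(all_nominal(), one_hot = TRUE)

# Each prep() starts from the untrained specification, so the
# original nominal column x is still visible to the selector.
first  <- prep(rec_spec, training = test_data)
second <- prep(rec_spec, training = test_data)

out2 <- juice(second)
names(out2)
```

Because prep() returns a new object and leaves `rec_spec` untouched, the second call sees the same variable information as the first.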
@topepo topepo added the bug an unexpected problem or unintended behavior label May 1, 2020
github-actions bot commented May 4, 2022

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators May 4, 2022