Bug on step_num2factor() #425

@hermandr

Description

step_num2factor() ignores the levels argument: in the output below, the resulting factor is labeled 1 and 2 instead of "pos" and "neg".

Minimal, reproducible example:

library(tidymodels)
#> Registered S3 method overwritten by 'xts':
#>   method     from
#>   as.zoo.xts zoo
#> -- Attaching packages ------------------------------------------------------------ tidymodels 0.0.3 --
#> v broom     0.5.2       v purrr     0.3.2  
#> v dials     0.0.3       v recipes   0.1.7  
#> v dplyr     0.8.3       v rsample   0.0.5  
#> v ggplot2   3.2.1       v tibble    2.1.3  
#> v infer     0.5.0       v yardstick 0.0.4  
#> v parsnip   0.0.3.1
#> -- Conflicts --------------------------------------------------------------- tidymodels_conflicts() --
#> x purrr::discard()  masks scales::discard()
#> x dplyr::filter()   masks stats::filter()
#> x dplyr::lag()      masks stats::lag()
#> x ggplot2::margin() masks dials::margin()
#> x dials::offset()   masks stats::offset()
#> x recipes::step()   masks stats::step()
library(caret)
#> Loading required package: lattice
#> 
#> Attaching package: 'caret'
#> The following objects are masked from 'package:yardstick':
#> 
#>     precision, recall
#> The following object is masked from 'package:purrr':
#> 
#>     lift

data(PimaIndiansDiabetes, package="mlbench")

# Change the target variable from factor to numeric
PimaIndiansDiabetes$diabetes <- as.numeric(PimaIndiansDiabetes$diabetes)

data_split <- initial_split(PimaIndiansDiabetes)

recipe_obj <- training(data_split) %>%
  recipe(diabetes ~ .) %>%
  step_num2factor(diabetes,levels=c("pos","neg"))

recipe_obj %>% prep() %>% juice()   
#> # A tibble: 576 x 9
#>    pregnant glucose pressure triceps insulin  mass pedigree   age diabetes
#>       <dbl>   <dbl>    <dbl>   <dbl>   <dbl> <dbl>    <dbl> <dbl> <fct>   
#>  1        6     148       72      35       0  33.6    0.627    50 2       
#>  2        1      85       66      29       0  26.6    0.351    31 1       
#>  3        8     183       64       0       0  23.3    0.672    32 2       
#>  4        1      89       66      23      94  28.1    0.167    21 1       
#>  5        0     137       40      35     168  43.1    2.29     33 2       
#>  6        5     116       74       0       0  25.6    0.201    30 1       
#>  7        3      78       50      32      88  31      0.248    26 2       
#>  8       10     115        0       0       0  35.3    0.134    29 1       
#>  9        2     197       70      45     543  30.5    0.158    53 2       
#> 10        8     125       96       0       0   0      0.232    54 2       
#> # ... with 566 more rows

Created on 2019-12-08 by the reprex package (v0.3.0)
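
For comparison, here is a minimal base R sketch of the labeling the reprex appears to expect, where each integer code is replaced by the label at that position. The vector diabetes_num and the label order are assumptions made for illustration; this is not how recipes implements step_num2factor(), only a picture of the expected result.

```r
# Hypothetical stand-in for the outcome column after as.numeric():
# the values are the integer codes 1 and 2.
diabetes_num <- c(2, 1, 2, 1, 2)

# Assumed label order for illustration: code 1 -> "neg", code 2 -> "pos".
labels <- c("neg", "pos")

# Look up each code's label by position, then build a factor with
# the labels as its levels.
diabetes_fct <- factor(labels[diabetes_num], levels = labels)

print(diabetes_fct)
print(levels(diabetes_fct))
```

Under a positional mapping like this one, the order of the supplied labels decides which code receives which label, so levels=c("pos","neg") would attach "pos" to code 1.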

Minimal Reproducible Dataset:

data(PimaIndiansDiabetes, package="mlbench")

Session Info:

sessionInfo()
#> R version 3.6.1 (2019-07-05)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18362)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_Singapore.1252  LC_CTYPE=English_Singapore.1252   
#> [3] LC_MONETARY=English_Singapore.1252 LC_NUMERIC=C                      
#> [5] LC_TIME=English_Singapore.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] compiler_3.6.1  magrittr_1.5    tools_3.6.1     htmltools_0.4.0
#>  [5] yaml_2.2.0      Rcpp_1.0.2      stringi_1.4.3   rmarkdown_1.16 
#>  [9] highr_0.8       knitr_1.25      stringr_1.4.0   xfun_0.10      
#> [13] digest_0.6.21   rlang_0.4.0     evaluate_0.14

Labels: bug (an unexpected problem or unintended behavior), next release 🚀
