Bug on step_num2factor() #425

Closed
hermandr opened this issue Dec 8, 2019 · 2 comments
Labels
bug an unexpected problem or unintended behavior next release 🚀

hermandr commented Dec 8, 2019

step_num2factor() ignores the levels argument: the resulting factor keeps the labels 1 and 2 instead of "pos" and "neg".

Minimal, reproducible example:

library(tidymodels)
#> Registered S3 method overwritten by 'xts':
#>   method     from
#>   as.zoo.xts zoo
#> -- Attaching packages ------------------------------------------------------------ tidymodels 0.0.3 --
#> v broom     0.5.2       v purrr     0.3.2  
#> v dials     0.0.3       v recipes   0.1.7  
#> v dplyr     0.8.3       v rsample   0.0.5  
#> v ggplot2   3.2.1       v tibble    2.1.3  
#> v infer     0.5.0       v yardstick 0.0.4  
#> v parsnip   0.0.3.1
#> -- Conflicts --------------------------------------------------------------- tidymodels_conflicts() --
#> x purrr::discard()  masks scales::discard()
#> x dplyr::filter()   masks stats::filter()
#> x dplyr::lag()      masks stats::lag()
#> x ggplot2::margin() masks dials::margin()
#> x dials::offset()   masks stats::offset()
#> x recipes::step()   masks stats::step()
library(caret)
#> Loading required package: lattice
#> 
#> Attaching package: 'caret'
#> The following objects are masked from 'package:yardstick':
#> 
#>     precision, recall
#> The following object is masked from 'package:purrr':
#> 
#>     lift

data(PimaIndiansDiabetes, package="mlbench")

# Change the target variable from factor to numeric
PimaIndiansDiabetes$diabetes <- as.numeric(PimaIndiansDiabetes$diabetes)

data_split <- initial_split(PimaIndiansDiabetes)

recipe_obj <- training(data_split) %>%
  recipe(diabetes ~ .) %>%
  step_num2factor(diabetes,levels=c("pos","neg"))

recipe_obj %>% prep() %>% juice()   
#> # A tibble: 576 x 9
#>    pregnant glucose pressure triceps insulin  mass pedigree   age diabetes
#>       <dbl>   <dbl>    <dbl>   <dbl>   <dbl> <dbl>    <dbl> <dbl> <fct>   
#>  1        6     148       72      35       0  33.6    0.627    50 2       
#>  2        1      85       66      29       0  26.6    0.351    31 1       
#>  3        8     183       64       0       0  23.3    0.672    32 2       
#>  4        1      89       66      23      94  28.1    0.167    21 1       
#>  5        0     137       40      35     168  43.1    2.29     33 2       
#>  6        5     116       74       0       0  25.6    0.201    30 1       
#>  7        3      78       50      32      88  31      0.248    26 2       
#>  8       10     115        0       0       0  35.3    0.134    29 1       
#>  9        2     197       70      45     543  30.5    0.158    53 2       
#> 10        8     125       96       0       0   0      0.232    54 2       
#> # ... with 566 more rows

Created on 2019-12-08 by the reprex package (v0.3.0)
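
For comparison, the mapping the report seems to expect can be sketched in base R, where each integer code indexes into the supplied labels. This only illustrates the expected result and is not the recipes implementation; the codes vector is made up and stands in for the numeric diabetes column.

```r
# Hypothetical integer codes standing in for the numeric `diabetes` column
codes <- c(2, 1, 2, 1)

# Labels in the order passed to `levels` in the reprex above
lvls <- c("pos", "neg")

# Code 1 picks the first label, code 2 the second
expected <- factor(lvls[as.integer(codes)], levels = lvls)

print(expected)
#> [1] neg pos neg pos
#> Levels: pos neg
```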

Minimal Reproducible Dataset:

data(PimaIndiansDiabetes, package="mlbench")

Session Info:

sessionInfo()
#> R version 3.6.1 (2019-07-05)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18362)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_Singapore.1252  LC_CTYPE=English_Singapore.1252   
#> [3] LC_MONETARY=English_Singapore.1252 LC_NUMERIC=C                      
#> [5] LC_TIME=English_Singapore.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] compiler_3.6.1  magrittr_1.5    tools_3.6.1     htmltools_0.4.0
#>  [5] yaml_2.2.0      Rcpp_1.0.2      stringi_1.4.3   rmarkdown_1.16 
#>  [9] highr_0.8       knitr_1.25      stringr_1.4.0   xfun_0.10      
#> [13] digest_0.6.21   rlang_0.4.0     evaluate_0.14

Created on 2019-12-08 by the reprex package (v0.3.0)

@topepo topepo added bug an unexpected problem or unintended behavior next release 🚀 labels Dec 9, 2019
@topepo topepo closed this as completed in 015fcc6 Dec 17, 2019

topepo commented Dec 17, 2019

I've checked in changes that fix the issue. However, this required some surgery on the code, and there are some breaking changes:

  • If the inputs are not already integers, the transform function should convert them to integers.

  • The levels are now required.
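
Under those two changes, a call might look like the sketch below. It assumes a recipes version that includes the fix, where step_num2factor() accepts a transform function and a levels vector. The toy data frame and the label order are made up for illustration.

```r
library(recipes)

# Toy data (hypothetical): a numeric outcome coded 1/2 and stored as doubles
dat <- data.frame(
  glucose  = c(148, 85, 183, 89),
  diabetes = c(2, 1, 2, 1)
)

rec <- recipe(diabetes ~ ., data = dat)

# `levels` must be supplied; `transform` turns the doubles into integer codes
rec <- step_num2factor(
  rec,
  diabetes,
  transform = function(x) as.integer(x),
  levels = c("neg", "pos")
)

# Train the recipe on `dat` and pull out the processed training set
result <- juice(prep(rec))
levels(result$diabetes)
```

Here the labels are listed in the order of the integer codes, so code 1 maps to the first label and code 2 to the second.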

@github-actions

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Feb 22, 2021