Discretization does not work when there are NAs; na.rm = TRUE is not passed through by step_discretize #127

NBRAYKO opened this issue Feb 21, 2018 · 3 comments


NBRAYKO commented Feb 21, 2018

step_discretize fails with this message even when na.rm = T is set in options:
Error in quantile.default(x, probs = seq(0, 1, length = cuts + 1), ...) :

although discretize works successfully on the same vector.


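library(recipes)

# Copy iris and add a column whose first five values are missing
iris_na <- iris
iris_na$sepal_na <- iris_na$Sepal.Length
iris_na$sepal_na[1:5] <- NA

recipe(~ ., data = iris_na) %>%
  step_discretize(sepal_na, options = list(min.unique = 2, cuts = 2, keep_na = TRUE, na.rm = TRUE)) %>%
  prep(training = iris_na)  # assumed final step; the original snippet stops at the dangling pipe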

Error in quantile.default(x, probs = seq(0, 1, length = cuts + 1), ...) :
When I call discretize(iris_na$sepal_na, min.unique = 2, cuts = 2, keep_na = T, na.rm = T) directly, it runs fine:

Bins: 3 (includes missing category)
Breaks: -Inf, 5.8, Inf
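
The error above is raised inside quantile.default, so the difference between the two calls can be reproduced with base R alone. Below is a minimal sketch, assuming (from the error call shown, not from the recipes source) that the cut points come from quantile(x, probs = seq(0, 1, length = cuts + 1), ...) and that na.rm only helps if it actually reaches that call. The variable names are illustrative.

```r
# Rebuild the vector from the report: Sepal.Length with the first five values missing.
x <- iris$Sepal.Length
x[1:5] <- NA

# cuts = 2 means three cut points (0%, 50%, 100%).
probs <- seq(0, 1, length.out = 3)

# Without na.rm, quantile() signals an error when x contains NA.
# tryCatch() captures the message instead of stopping the script.
without_na_rm <- tryCatch(
  quantile(x, probs = probs),
  error = function(e) conditionMessage(e)
)
print(without_na_rm)

# With na.rm = TRUE the missing values are dropped before the quantiles are computed.
with_na_rm <- quantile(x, probs = probs, na.rm = TRUE)
print(with_na_rm)
```

If na.rm = TRUE is dropped somewhere between step_discretize and quantile(), the first branch is what runs, which would match the failure reported here.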
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux

Matrix products: default
BLAS: /opt/R/3.4.3/lib64/R/lib/
LAPACK: /opt/R/3.4.3/lib64/R/lib/

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bindrcpp_0.2  recipes_0.1.1 broom_0.4.2   dplyr_0.7.4  

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.14      ddalpha_1.3.1     gower_0.1.2       compiler_3.4.3    DEoptimR_1.0-8    plyr_1.8.4        bindr_0.1        
 [8] class_7.3-14      tools_3.4.3       rpart_4.1-11      ipred_0.9-6       lubridate_1.6.0   tibble_1.3.4      nlme_3.1-131     
[15] lattice_0.20-35   pkgconfig_2.0.1   rlang_0.1.4       Matrix_1.2-12     psych_1.7.5       yaml_2.1.14       parallel_3.4.3   
[22] RcppRoll_0.2.2    prodlim_1.6.1     stringr_1.2.0     tidyselect_0.2.3  nnet_7.3-12       CVST_0.2-1        grid_3.4.3       
[29] glue_1.2.0        robustbase_0.92-8 R6_2.2.2          survival_2.41-3   foreign_0.8-69    Amelia_1.7.4      lava_1.5.1       
[36] kernlab_0.9-25    tidyr_0.7.2       reshape2_1.4.2    purrr_0.2.4       magrittr_1.5      DRR_0.0.2         splines_3.4.3    
[43] MASS_7.3-47       sfsmisc_1.1-1     dichromat_2.0-0   assertthat_0.2.0  dimRed_0.1.0      mnormt_1.5-5      timeDate_3012.100
[50] stringi_1.1.6

topepo commented Feb 23, 2018

Go ahead and test and I'll reopen if there is an issue. Thanks

topepo closed this as completed Feb 23, 2018

NBRAYKO commented Feb 23, 2018

Works now, thank you! Note that you still have to specify na.rm = TRUE in options, even when keep_na = TRUE.
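
One way to see why both settings are needed is that they act at different stages: na.rm affects how the break points are computed, while keep_na affects how missing values are labelled afterwards. The following is a base-R sketch of that idea only. It is not the recipes implementation, and the bin labels (bin1, bin2, bin_missing) and the toy vector are made up for illustration.

```r
# Toy vector with two missing values (illustrative data, not from the issue).
x <- c(NA, NA, 4.6, 5.0, 5.4, 5.8, 6.3, 7.1)

# Stage 1: compute break points. This is where na.rm matters,
# because quantile() refuses NA input unless na.rm = TRUE.
breaks <- quantile(x, probs = c(0, 0.5, 1), na.rm = TRUE)
breaks[1] <- -Inf   # open-ended outer bins, like the Breaks line in the output above
breaks[3] <- Inf

# Stage 2: assign bins. NA inputs stay NA after cut().
bins <- cut(x, breaks = breaks, labels = c("bin1", "bin2"))

# Stage 3: give missing values their own category, analogous to keep_na = TRUE.
bins_with_na <- addNA(bins)
levels(bins_with_na)[is.na(levels(bins_with_na))] <- "bin_missing"

print(table(bins_with_na))
```

Under this reading, keep_na = TRUE alone cannot prevent the error, since the failure happens in the first stage, before any bin for missing values is created.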

topepo added a commit to tidymodels/tune that referenced this issue Dec 5, 2019
