discretization does not work when there's na, na.rm = TRUE does not pass to `step_discretize` #127

Closed
NBRAYKO opened this issue Feb 21, 2018 · 2 comments


NBRAYKO commented Feb 21, 2018

`step_discretize` fails with the following error even when `na.rm = T` is set in `options`:
Error in quantile.default(x, probs = seq(0, 1, length = cuts + 1), ...) :

`discretize`, however, works on the same vector.

Example:

library(recipes)

data("iris")
iris_na <- iris
iris_na$sepal_na <- iris_na$Sepal.Length
iris_na$sepal_na[1:5] <- NA

recipe(~ ., data = iris_na) %>%
  step_discretize(sepal_na, options = list(min.unique = 2, cuts = 2, keep_na = TRUE, na.rm = TRUE)) %>%
  prep()

Error in quantile.default(x, probs = seq(0, 1, length = cuts + 1), ...) :

Calling `discretize` directly on the same vector runs fine:

discretize(iris_na$sepal_na, min.unique = 2, cuts = 2, keep_na = TRUE, na.rm = TRUE)

Bins: 3 (includes missing category)
Breaks: -Inf, 5.8, Inf
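
The error above is raised from `quantile.default`, so the role of `na.rm` can be illustrated with base R alone. The following is a minimal sketch, not the recipes implementation: it assumes the break points come from a `quantile()` call with evenly spaced `probs` (as the error text suggests), and the variable names `x`, `q`, `breaks` and `bins` are made up for illustration.

```r
# Stand-in for the column in the report: Sepal.Length with five NAs.
x <- iris$Sepal.Length
x[1:5] <- NA

# Without na.rm = TRUE, quantile() refuses input that contains NA.
res_default <- tryCatch(
  quantile(x, probs = seq(0, 1, length.out = 3)),
  error = function(e) "error"
)

# With na.rm = TRUE, missing values are dropped before the cut points are computed.
q <- quantile(x, probs = seq(0, 1, length.out = 3), na.rm = TRUE)

# Two bins split at the middle quantile, open-ended on both sides.
# Rows that were NA stay NA after cut(); they are not assigned to a bin.
breaks <- c(-Inf, unname(q[2]), Inf)
bins <- cut(x, breaks = breaks, include.lowest = TRUE)
table(bins, useNA = "ifany")
```

If `na.rm = TRUE` never reaches the `quantile()` call, the first branch is the one that runs, which would be consistent with the behaviour reported here.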

Session info:

R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux

Matrix products: default
BLAS: /opt/R/3.4.3/lib64/R/lib/libRblas.so
LAPACK: /opt/R/3.4.3/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] bindrcpp_0.2  recipes_0.1.1 broom_0.4.2   dplyr_0.7.4  

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.14      ddalpha_1.3.1     gower_0.1.2       compiler_3.4.3    DEoptimR_1.0-8    plyr_1.8.4        bindr_0.1        
 [8] class_7.3-14      tools_3.4.3       rpart_4.1-11      ipred_0.9-6       lubridate_1.6.0   tibble_1.3.4      nlme_3.1-131     
[15] lattice_0.20-35   pkgconfig_2.0.1   rlang_0.1.4       Matrix_1.2-12     psych_1.7.5       yaml_2.1.14       parallel_3.4.3   
[22] RcppRoll_0.2.2    prodlim_1.6.1     stringr_1.2.0     tidyselect_0.2.3  nnet_7.3-12       CVST_0.2-1        grid_3.4.3       
[29] glue_1.2.0        robustbase_0.92-8 R6_2.2.2          survival_2.41-3   foreign_0.8-69    Amelia_1.7.4      lava_1.5.1       
[36] kernlab_0.9-25    tidyr_0.7.2       reshape2_1.4.2    purrr_0.2.4       magrittr_1.5      DRR_0.0.2         splines_3.4.3    
[43] MASS_7.3-47       sfsmisc_1.1-1     dichromat_2.0-0   assertthat_0.2.0  dimRed_0.1.0      mnormt_1.5-5      timeDate_3012.100
[50] stringi_1.1.6
topepo added a commit that referenced this issue Feb 23, 2018

topepo (Collaborator) commented Feb 23, 2018

Go ahead and test and I'll reopen if there is an issue. Thanks

topepo closed this Feb 23, 2018

NBRAYKO (Author) commented Feb 23, 2018

Works now, thank you! You still have to specify `na.rm = TRUE` in `options`, even when `keep_na = TRUE`.

topepo added a commit to tidymodels/tune that referenced this issue Dec 5, 2019