library(modeldata)
suppressPackageStartupMessages(library(recipes))
data(biomass)
biomass$carbon[1] <- NA
summary(biomass$carbon)
#> Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
#> 14.61 44.70 47.10 48.29 49.70 97.18 1
# This errors (with a not-so-clear message) because there is an NA:
discretize(biomass$carbon, cuts = 2, infs = FALSE, keep_na = TRUE)
#> Error in quantile.default(x, probs = seq(0, 1, length = cuts + 1), ...): missing values and NaN's not allowed if 'na.rm' is FALSE
# In issue #127, I found out that na.rm = TRUE must be passed,
# although there is no mention of na.rm in ?recipes::discretize
# or in the related ?recipes::step_discretize.
discretize(biomass$carbon, cuts = 2, infs = FALSE, keep_na = TRUE, na.rm = TRUE)
#> Bins: 3 (includes missing category)
#> Breaks: 14.61, 47.1, 97.18
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.2.0 (2022-04-22)
#> os Arch Linux
#> system x86_64, linux-gnu
#> ui X11
#> language en
#> collate pt_BR.UTF-8
#> ctype pt_BR.UTF-8
#> tz America/Sao_Paulo
#> date 2022-05-15
#> pandoc 2.17.1.1 @ /usr/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> assertthat 0.2.1 2019-03-21 [2] CRAN (R 4.0.0)
#> class 7.3-20 2022-01-16 [2] CRAN (R 4.2.0)
#> cli 3.3.0 2022-04-25 [1] CRAN (R 4.2.0)
#> codetools 0.2-18 2020-11-04 [2] CRAN (R 4.2.0)
#> crayon 1.5.1 2022-03-26 [1] CRAN (R 4.2.0)
#> DBI 1.1.2 2021-12-20 [1] CRAN (R 4.1.2)
#> digest 0.6.29 2021-12-01 [1] CRAN (R 4.1.2)
#> dplyr * 1.0.9 2022-04-28 [1] CRAN (R 4.2.0)
#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.2)
#> evaluate 0.15 2022-02-18 [1] CRAN (R 4.2.0)
#> fansi 1.0.3 2022-03-24 [1] CRAN (R 4.2.0)
#> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.0.3)
#> fs 1.5.2 2021-12-08 [1] CRAN (R 4.1.2)
#> future 1.25.0 2022-04-24 [1] CRAN (R 4.2.0)
#> future.apply 1.9.0 2022-04-25 [1] CRAN (R 4.2.0)
#> generics 0.1.2 2022-01-31 [1] CRAN (R 4.1.3)
#> globals 0.15.0 2022-05-09 [1] CRAN (R 4.2.0)
#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0)
#> gower 1.0.0 2022-02-03 [1] CRAN (R 4.2.0)
#> hardhat 0.2.0 2022-01-24 [1] CRAN (R 4.1.3)
#> highr 0.8 2019-03-20 [2] CRAN (R 4.0.0)
#> htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.1.2)
#> ipred 0.9-12 2021-09-15 [1] CRAN (R 4.1.2)
#> knitr 1.39 2022-04-26 [1] CRAN (R 4.2.0)
#> lattice 0.20-45 2021-09-22 [2] CRAN (R 4.2.0)
#> lava 1.6.10 2021-09-02 [1] CRAN (R 4.1.2)
#> lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.1.2)
#> listenv 0.8.0 2019-12-05 [1] CRAN (R 4.0.1)
#> lubridate 1.8.0 2021-10-07 [1] CRAN (R 4.1.2)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0)
#> MASS 7.3-56 2022-03-23 [2] CRAN (R 4.2.0)
#> Matrix 1.4-1 2022-03-23 [2] CRAN (R 4.2.0)
#> modeldata * 0.1.1 2021-07-14 [1] CRAN (R 4.1.3)
#> nnet 7.3-17 2022-01-16 [2] CRAN (R 4.2.0)
#> parallelly 1.31.1 2022-04-22 [1] CRAN (R 4.2.0)
#> pillar 1.7.0 2022-02-01 [1] CRAN (R 4.2.0)
#> pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.0.0)
#> prodlim 2019.11.13 2019-11-17 [1] CRAN (R 4.0.0)
#> purrr 0.3.4 2020-04-17 [2] CRAN (R 4.0.0)
#> R.cache 0.15.0 2021-04-30 [1] CRAN (R 4.1.2)
#> R.methodsS3 1.8.1 2020-08-26 [1] CRAN (R 4.0.3)
#> R.oo 1.24.0 2020-08-26 [1] CRAN (R 4.0.3)
#> R.utils 2.11.0 2021-09-26 [1] CRAN (R 4.1.2)
#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.2)
#> Rcpp 1.0.8.3 2022-03-17 [1] CRAN (R 4.2.0)
#> recipes * 0.2.0 2022-02-18 [1] CRAN (R 4.1.3)
#> reprex 2.0.1 2021-08-05 [1] CRAN (R 4.1.2)
#> rlang 1.0.2 2022-03-04 [1] CRAN (R 4.1.3)
#> rmarkdown 2.14 2022-04-25 [1] CRAN (R 4.2.0)
#> rpart 4.1.16 2022-01-24 [2] CRAN (R 4.2.0)
#> rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.0.5)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.1.2)
#> stringi 1.7.6 2021-11-29 [1] CRAN (R 4.1.3)
#> stringr 1.4.0 2019-02-10 [2] CRAN (R 4.0.0)
#> styler 1.7.0 2022-03-13 [1] CRAN (R 4.2.0)
#> survival 3.3-1 2022-03-03 [2] CRAN (R 4.2.0)
#> tibble 3.1.7 2022-05-03 [1] CRAN (R 4.2.0)
#> tidyselect 1.1.2 2022-02-21 [1] CRAN (R 4.1.3)
#> timeDate 3043.102 2018-02-21 [1] CRAN (R 4.0.0)
#> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.2)
#> vctrs 0.4.1 2022-04-13 [1] CRAN (R 4.2.0)
#> withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.0)
#> xfun 0.31 2022-05-10 [1] CRAN (R 4.2.0)
#> yaml 2.2.1 2020-02-01 [2] CRAN (R 4.0.0)
#>
#> [1] /home/marcelo/R/x86_64-pc-linux-gnu-library/3.5
#> [2] /usr/lib/R/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
The problem
I had trouble applying recipes::step_discretize to predictors with NA values. This problem is related to issue #127.

Reproducible example
Using the example in ?recipes::discretize (see the reprex above).

Suggested solution
I suggest documenting the effect of na.rm and keep_na meaningfully, perhaps using the previous reprex. keep_na = TRUE should also automatically set na.rm = TRUE.
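
As a rough illustration of the suggested behaviour, here is a minimal sketch of a user-side wrapper that ties na.rm to keep_na. The helper name discretize_keep_na is hypothetical and not part of recipes. It relies only on the fact, shown in the reprex, that discretize() forwards extra arguments such as na.rm to quantile(). I have only checked this pattern against recipes 0.2.0, so treat it as an assumption for other versions.

```r
library(recipes)

# Hypothetical helper (NOT part of recipes): forwards everything to
# recipes::discretize() and sets na.rm to the same value as keep_na,
# so that keeping the NA category also removes NAs before the
# quantile-based breaks are computed.
# Do not pass na.rm yourself through `...`, or it will be matched twice.
discretize_keep_na <- function(x, ..., keep_na = TRUE) {
  discretize(x, ..., keep_na = keep_na, na.rm = keep_na)
}

# Synthetic numeric vector with one missing value (assumed data,
# standing in for biomass$carbon from the reprex).
x <- c(NA, seq(1, 100, by = 1))

binned <- discretize_keep_na(x, cuts = 2, infs = FALSE)
binned
```

With keep_na = FALSE this sketch passes na.rm = FALSE, so a vector containing NA would still hit the quantile() error from the reprex. Whether that case should also drop missing values is a design decision for the maintainers.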