step_pca doesn't allow to tune threshold #534

@cimentadaj

Hi!

I've been using recipes to tune step_pca over num_comp, and it works well. However, tuning over threshold doesn't work. I expected it to, because dials does have a threshold parameter object, but when I run tunable() on a recipe that sets threshold = tune(), threshold isn't flagged as tunable; only num_comp is listed.

library(recipes)
#> Loading required package: dplyr
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
#> 
#> Attaching package: 'recipes'
#> The following object is masked from 'package:stats':
#> 
#>     step
library(tune)
library(parsnip)
library(rsample)

rcp <-
  recipe(mtcars, mpg ~ .) %>%
  step_pca(cyl, disp, hp, num_comp = tune())

tunable(rcp)
#> # A tibble: 1 x 5
#>   name     call_info        source component component_id
#>   <chr>    <list>           <chr>  <chr>     <chr>       
#> 1 num_comp <named list [3]> recipe step_pca  pca_M1I52

rcp <-
  recipe(mtcars, mpg ~ .) %>%
  step_pca(cyl, disp, hp, threshold = tune())

tunable(rcp)
#> # A tibble: 1 x 5
#>   name     call_info        source component component_id
#>   <chr>    <list>           <chr>  <chr>     <chr>       
#> 1 num_comp <named list [3]> recipe step_pca  pca_W84oz
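
For reference, the dials parameter mentioned in the report can be inspected on its own. This is a small sketch, assuming `dials::threshold()` with its default range is the object in question. It only shows that a candidate grid can be built from that parameter, not that recipes will pick it up for tuning.

```r
library(dials)

# threshold() is assumed to be the dials parameter object the report refers to.
th_param <- threshold()
th_param

# Build a small regular grid of candidate values from that parameter.
th_grid <- grid_regular(th_param, levels = 5)
th_grid
```

The parameter exists on the dials side, so the gap appears to be in what tunable() reports for step_pca rather than in dials.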

As expected, this means a grid search over threshold can't be run:

lm_mod <-
  linear_reg() %>%
  set_engine("lm") %>%
  set_mode("regression")

mt_fold <- vfold_cv(mtcars)

tune_grid(lm_mod, rcp, resamples = mt_fold)
#> Warning: No tuning parameters have been detected, performance will be evaluated
#> using the resamples with no tuning. Did you want [fit_resamples()]?
#> ! Fold01: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold02: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold03: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold04: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold05: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold06: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold07: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold08: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold09: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold10: recipe: is.na() applied to non-(list or vector) of type 'language'
#> #  10-fold cross-validation 
#> # A tibble: 10 x 4
#>    splits         id     .metrics         .notes          
#>    <list>         <chr>  <list>           <list>          
#>  1 <split [28/4]> Fold01 <tibble [2 × 3]> <tibble [1 × 1]>
#>  2 <split [28/4]> Fold02 <tibble [2 × 3]> <tibble [1 × 1]>
#>  3 <split [29/3]> Fold03 <tibble [2 × 3]> <tibble [1 × 1]>
#>  4 <split [29/3]> Fold04 <tibble [2 × 3]> <tibble [1 × 1]>
#>  5 <split [29/3]> Fold05 <tibble [2 × 3]> <tibble [1 × 1]>
#>  6 <split [29/3]> Fold06 <tibble [2 × 3]> <tibble [1 × 1]>
#>  7 <split [29/3]> Fold07 <tibble [2 × 3]> <tibble [1 × 1]>
#>  8 <split [29/3]> Fold08 <tibble [2 × 3]> <tibble [1 × 1]>
#>  9 <split [29/3]> Fold09 <tibble [2 × 3]> <tibble [1 × 1]>
#> 10 <split [29/3]> Fold10 <tibble [2 × 3]> <tibble [1 × 1]>
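
Until threshold is recognised as tunable, one possible workaround is to loop over fixed threshold values by hand and resample each recipe separately. The following is a minimal sketch, not a replacement for tune_grid(): the candidate values in `thresholds` and the names `rcp_th` and `manual_results` are illustrative assumptions, and it assumes `fit_resamples()` takes the model first and the recipe second, as the `tune_grid()` call above does.

```r
library(recipes)
library(parsnip)
library(rsample)
library(tune)

set.seed(534)  # arbitrary seed so the folds are reproducible
mt_fold <- vfold_cv(mtcars)

lm_mod <-
  linear_reg() %>%
  set_engine("lm") %>%
  set_mode("regression")

# Illustrative candidate values (assumption, not from the report).
thresholds <- c(0.75, 0.90, 0.99)

# Resample the model once per fixed threshold and collect the metrics.
manual_results <- lapply(thresholds, function(th) {
  rcp_th <-
    recipe(mpg ~ ., data = mtcars) %>%
    step_pca(cyl, disp, hp, threshold = th)

  res <- fit_resamples(lm_mod, rcp_th, resamples = mt_fold)

  metrics <- collect_metrics(res)
  metrics$threshold <- th  # record which threshold produced these metrics
  metrics
})

manual_results <- dplyr::bind_rows(manual_results)
manual_results
```

Each row of `manual_results` is a resampled metric for one threshold value, so the values can be compared much like the output of a grid search, at the cost of building the grid manually.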

Session info

devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.1 (2020-06-06)
#>  os       Ubuntu 20.04 LTS            
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       Europe/Berlin               
#>  date     2020-06-18                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source        
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 4.0.0)
#>  backports     1.1.7      2020-05-13 [1] CRAN (R 4.0.0)
#>  callr         3.4.3      2020-03-28 [1] CRAN (R 4.0.0)
#>  class         7.3-17     2020-04-26 [3] CRAN (R 4.0.0)
#>  cli           2.0.2      2020-02-28 [1] CRAN (R 4.0.0)
#>  codetools     0.2-16     2018-12-24 [3] CRAN (R 4.0.0)
#>  colorspace    1.4-1      2019-03-18 [1] CRAN (R 4.0.0)
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 4.0.0)
#>  desc          1.2.0      2018-05-01 [1] CRAN (R 4.0.0)
#>  devtools      2.3.0      2020-04-10 [1] CRAN (R 4.0.0)
#>  dials         0.0.6      2020-04-03 [1] CRAN (R 4.0.0)
#>  DiceDesign    1.8-1      2019-07-31 [1] CRAN (R 4.0.0)
#>  digest        0.6.25     2020-02-23 [1] CRAN (R 4.0.0)
#>  dplyr       * 1.0.0      2020-05-29 [1] CRAN (R 4.0.0)
#>  ellipsis      0.3.1      2020-05-15 [1] CRAN (R 4.0.0)
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 4.0.0)
#>  fansi         0.4.1      2020-01-08 [1] CRAN (R 4.0.0)
#>  foreach       1.5.0      2020-03-30 [1] CRAN (R 4.0.0)
#>  fs            1.4.1      2020-04-04 [1] CRAN (R 4.0.0)
#>  furrr         0.1.0      2018-05-16 [1] CRAN (R 4.0.0)
#>  future        1.17.0     2020-04-18 [1] CRAN (R 4.0.0)
#>  generics      0.0.2      2018-11-29 [1] CRAN (R 4.0.0)
#>  ggplot2       3.3.1      2020-05-28 [1] CRAN (R 4.0.0)
#>  globals       0.12.5     2019-12-07 [1] CRAN (R 4.0.0)
#>  glue          1.4.1      2020-05-13 [1] CRAN (R 4.0.0)
#>  gower         0.2.1      2019-05-14 [1] CRAN (R 4.0.0)
#>  GPfit         1.0-8      2019-02-08 [1] CRAN (R 4.0.0)
#>  gtable        0.3.0      2019-03-25 [1] CRAN (R 4.0.0)
#>  hardhat       0.1.3      2020-05-20 [1] CRAN (R 4.0.0)
#>  highr         0.8        2019-03-20 [1] CRAN (R 4.0.0)
#>  htmltools     0.5.0      2020-06-16 [1] CRAN (R 4.0.1)
#>  ipred         0.9-9      2019-04-28 [1] CRAN (R 4.0.0)
#>  iterators     1.0.12     2019-07-26 [1] CRAN (R 4.0.0)
#>  knitr         1.28       2020-02-06 [1] CRAN (R 4.0.0)
#>  lattice       0.20-41    2020-04-02 [3] CRAN (R 4.0.0)
#>  lava          1.6.7      2020-03-05 [1] CRAN (R 4.0.0)
#>  lhs           1.0.2      2020-04-13 [1] CRAN (R 4.0.0)
#>  lifecycle     0.2.0      2020-03-06 [1] CRAN (R 4.0.0)
#>  listenv       0.8.0      2019-12-05 [1] CRAN (R 4.0.0)
#>  lubridate     1.7.8      2020-04-06 [1] CRAN (R 4.0.0)
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 4.0.0)
#>  MASS          7.3-51.6   2020-04-26 [3] CRAN (R 4.0.0)
#>  Matrix        1.2-18     2019-11-27 [3] CRAN (R 4.0.0)
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 4.0.0)
#>  munsell       0.5.0      2018-06-12 [1] CRAN (R 4.0.0)
#>  nnet          7.3-14     2020-04-26 [3] CRAN (R 4.0.0)
#>  parsnip     * 0.1.1      2020-05-06 [1] CRAN (R 4.0.0)
#>  pillar        1.4.4      2020-05-05 [1] CRAN (R 4.0.0)
#>  pkgbuild      1.0.8      2020-05-07 [1] CRAN (R 4.0.0)
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.0.0)
#>  pkgload       1.1.0      2020-05-29 [1] CRAN (R 4.0.0)
#>  plyr          1.8.6      2020-03-03 [1] CRAN (R 4.0.0)
#>  prettyunits   1.1.1      2020-01-24 [1] CRAN (R 4.0.0)
#>  pROC          1.16.2     2020-03-19 [1] CRAN (R 4.0.0)
#>  processx      3.4.2      2020-02-09 [1] CRAN (R 4.0.0)
#>  prodlim       2019.11.13 2019-11-17 [1] CRAN (R 4.0.0)
#>  ps            1.3.3      2020-05-08 [1] CRAN (R 4.0.0)
#>  purrr         0.3.4      2020-04-17 [1] CRAN (R 4.0.0)
#>  R6            2.4.1      2019-11-12 [1] CRAN (R 4.0.0)
#>  Rcpp          1.0.4.6    2020-04-09 [1] CRAN (R 4.0.0)
#>  recipes     * 0.1.12     2020-05-01 [1] CRAN (R 4.0.0)
#>  remotes       2.1.1      2020-02-15 [1] CRAN (R 4.0.0)
#>  rlang         0.4.6      2020-05-02 [1] CRAN (R 4.0.0)
#>  rmarkdown     2.2        2020-05-31 [1] CRAN (R 4.0.0)
#>  rpart         4.1-15     2019-04-12 [3] CRAN (R 4.0.0)
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 4.0.0)
#>  rsample     * 0.0.7      2020-06-04 [1] CRAN (R 4.0.0)
#>  scales        1.1.1      2020-05-11 [1] CRAN (R 4.0.0)
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 4.0.0)
#>  stringi       1.4.6      2020-02-17 [1] CRAN (R 4.0.0)
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 4.0.0)
#>  survival      3.1-12     2020-04-10 [3] CRAN (R 4.0.0)
#>  testthat      2.3.2      2020-03-02 [1] CRAN (R 4.0.0)
#>  tibble        3.0.1      2020-04-20 [1] CRAN (R 4.0.0)
#>  tidyr         1.1.0      2020-05-20 [1] CRAN (R 4.0.0)
#>  tidyselect    1.1.0      2020-05-11 [1] CRAN (R 4.0.0)
#>  timeDate      3043.102   2018-02-21 [1] CRAN (R 4.0.0)
#>  tune        * 0.1.0      2020-04-02 [1] CRAN (R 4.0.0)
#>  usethis       1.6.1      2020-04-29 [1] CRAN (R 4.0.0)
#>  utf8          1.1.4      2018-05-24 [1] CRAN (R 4.0.0)
#>  vctrs         0.3.1      2020-06-05 [1] CRAN (R 4.0.0)
#>  withr         2.2.0      2020-04-20 [1] CRAN (R 4.0.0)
#>  workflows     0.1.1      2020-03-17 [1] CRAN (R 4.0.0)
#>  xfun          0.14       2020-05-20 [1] CRAN (R 4.0.0)
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 4.0.0)
#>  yardstick     0.0.6      2020-03-17 [1] CRAN (R 4.0.0)
#> 
#> [1] /usr/local/lib/R/site-library
#> [2] /usr/lib/R/site-library
#> [3] /usr/lib/R/library

Labels: feature (a feature request or enhancement)