step_pca doesn't allow to tune threshold #534

Closed
cimentadaj opened this issue Jun 18, 2020 · 5 comments
Labels
feature (a feature request or enhancement)

Comments

@cimentadaj

Hi!

I've been using recipes to tune num_comp in step_pca, and it works well. However, tuning threshold doesn't work. I expected it to, because dials has a threshold parameter object, but when I run tunable() on a recipe that sets threshold = tune(), threshold isn't flagged as tunable; only num_comp is listed.

library(recipes)
#> Loading required package: dplyr
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
#> 
#> Attaching package: 'recipes'
#> The following object is masked from 'package:stats':
#> 
#>     step
library(tune)
library(parsnip)
library(rsample)

rcp <-
  recipe(mtcars, mpg ~ .) %>%
  step_pca(cyl, disp, hp, num_comp = tune())

tunable(rcp)
#> # A tibble: 1 x 5
#>   name     call_info        source component component_id
#>   <chr>    <list>           <chr>  <chr>     <chr>       
#> 1 num_comp <named list [3]> recipe step_pca  pca_M1I52

rcp <-
  recipe(mtcars, mpg ~ .) %>%
  step_pca(cyl, disp, hp, threshold = tune())

tunable(rcp)
#> # A tibble: 1 x 5
#>   name     call_info        source component component_id
#>   <chr>    <list>           <chr>  <chr>     <chr>       
#> 1 num_comp <named list [3]> recipe step_pca  pca_W84oz
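
The same check can be done programmatically instead of reading the printed tibble. This is a minimal sketch; `is_flagged()` is a hypothetical helper written for this example, not part of recipes or tune, and the result for `threshold` depends on the installed version of recipes.

```r
library(recipes)
library(tune)

rcp <-
  recipe(mpg ~ ., data = mtcars) %>%
  step_pca(cyl, disp, hp, threshold = tune())

# Hypothetical helper: TRUE if the recipe advertises `arg` as tunable
is_flagged <- function(rec, arg) {
  arg %in% tunable(rec)$name
}

is_flagged(rcp, "num_comp")
is_flagged(rcp, "threshold")
```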

As expected, a grid search over threshold fails as well:

lm_mod <-
  linear_reg() %>%
  set_engine("lm") %>%
  set_mode("regression")

mt_fold <- vfold_cv(mtcars)

tune_grid(lm_mod, rcp, resamples = mt_fold)
#> Warning: No tuning parameters have been detected, performance will be evaluated
#> using the resamples with no tuning. Did you want [fit_resamples()]?
#> ! Fold01: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold02: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold03: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold04: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold05: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold06: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold07: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold08: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold09: recipe: is.na() applied to non-(list or vector) of type 'language'
#> ! Fold10: recipe: is.na() applied to non-(list or vector) of type 'language'
#> #  10-fold cross-validation 
#> # A tibble: 10 x 4
#>    splits         id     .metrics         .notes          
#>    <list>         <chr>  <list>           <list>          
#>  1 <split [28/4]> Fold01 <tibble [2 × 3]> <tibble [1 × 1]>
#>  2 <split [28/4]> Fold02 <tibble [2 × 3]> <tibble [1 × 1]>
#>  3 <split [29/3]> Fold03 <tibble [2 × 3]> <tibble [1 × 1]>
#>  4 <split [29/3]> Fold04 <tibble [2 × 3]> <tibble [1 × 1]>
#>  5 <split [29/3]> Fold05 <tibble [2 × 3]> <tibble [1 × 1]>
#>  6 <split [29/3]> Fold06 <tibble [2 × 3]> <tibble [1 × 1]>
#>  7 <split [29/3]> Fold07 <tibble [2 × 3]> <tibble [1 × 1]>
#>  8 <split [29/3]> Fold08 <tibble [2 × 3]> <tibble [1 × 1]>
#>  9 <split [29/3]> Fold09 <tibble [2 × 3]> <tibble [1 × 1]>
#> 10 <split [29/3]> Fold10 <tibble [2 × 3]> <tibble [1 × 1]>
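
Until threshold is tunable, one possible workaround is to loop over candidate values by hand and resample each fixed recipe. The sketch below assumes the workflows package is installed; the candidate values in `thresholds`, the seed, and the number of folds are arbitrary choices made for illustration.

```r
library(recipes)
library(parsnip)
library(rsample)
library(tune)
library(workflows)

set.seed(534)
mt_fold <- vfold_cv(mtcars, v = 5)

lm_mod <-
  linear_reg() %>%
  set_engine("lm")

# Candidate values for the fraction of variance to retain (arbitrary)
thresholds <- c(0.75, 0.90, 0.99)

results <- lapply(thresholds, function(th) {
  # Build a recipe with a fixed threshold instead of tune()
  rec <-
    recipe(mpg ~ ., data = mtcars) %>%
    step_pca(cyl, disp, hp, threshold = th)

  wf <-
    workflow() %>%
    add_recipe(rec) %>%
    add_model(lm_mod)

  # Resample this fixed configuration and keep the averaged metrics
  res <- fit_resamples(wf, resamples = mt_fold)
  out <- collect_metrics(res)
  out$threshold <- th
  out
})

results <- do.call(rbind, results)
results
```

Each row of `results` holds one metric for one threshold value, so the candidates can be compared directly. This loses the conveniences of tune_grid() (such as select_best()), which is why native support is preferable.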

Session info:

devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.1 (2020-06-06)
#>  os       Ubuntu 20.04 LTS            
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       Europe/Berlin               
#>  date     2020-06-18                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source        
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 4.0.0)
#>  backports     1.1.7      2020-05-13 [1] CRAN (R 4.0.0)
#>  callr         3.4.3      2020-03-28 [1] CRAN (R 4.0.0)
#>  class         7.3-17     2020-04-26 [3] CRAN (R 4.0.0)
#>  cli           2.0.2      2020-02-28 [1] CRAN (R 4.0.0)
#>  codetools     0.2-16     2018-12-24 [3] CRAN (R 4.0.0)
#>  colorspace    1.4-1      2019-03-18 [1] CRAN (R 4.0.0)
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 4.0.0)
#>  desc          1.2.0      2018-05-01 [1] CRAN (R 4.0.0)
#>  devtools      2.3.0      2020-04-10 [1] CRAN (R 4.0.0)
#>  dials         0.0.6      2020-04-03 [1] CRAN (R 4.0.0)
#>  DiceDesign    1.8-1      2019-07-31 [1] CRAN (R 4.0.0)
#>  digest        0.6.25     2020-02-23 [1] CRAN (R 4.0.0)
#>  dplyr       * 1.0.0      2020-05-29 [1] CRAN (R 4.0.0)
#>  ellipsis      0.3.1      2020-05-15 [1] CRAN (R 4.0.0)
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 4.0.0)
#>  fansi         0.4.1      2020-01-08 [1] CRAN (R 4.0.0)
#>  foreach       1.5.0      2020-03-30 [1] CRAN (R 4.0.0)
#>  fs            1.4.1      2020-04-04 [1] CRAN (R 4.0.0)
#>  furrr         0.1.0      2018-05-16 [1] CRAN (R 4.0.0)
#>  future        1.17.0     2020-04-18 [1] CRAN (R 4.0.0)
#>  generics      0.0.2      2018-11-29 [1] CRAN (R 4.0.0)
#>  ggplot2       3.3.1      2020-05-28 [1] CRAN (R 4.0.0)
#>  globals       0.12.5     2019-12-07 [1] CRAN (R 4.0.0)
#>  glue          1.4.1      2020-05-13 [1] CRAN (R 4.0.0)
#>  gower         0.2.1      2019-05-14 [1] CRAN (R 4.0.0)
#>  GPfit         1.0-8      2019-02-08 [1] CRAN (R 4.0.0)
#>  gtable        0.3.0      2019-03-25 [1] CRAN (R 4.0.0)
#>  hardhat       0.1.3      2020-05-20 [1] CRAN (R 4.0.0)
#>  highr         0.8        2019-03-20 [1] CRAN (R 4.0.0)
#>  htmltools     0.5.0      2020-06-16 [1] CRAN (R 4.0.1)
#>  ipred         0.9-9      2019-04-28 [1] CRAN (R 4.0.0)
#>  iterators     1.0.12     2019-07-26 [1] CRAN (R 4.0.0)
#>  knitr         1.28       2020-02-06 [1] CRAN (R 4.0.0)
#>  lattice       0.20-41    2020-04-02 [3] CRAN (R 4.0.0)
#>  lava          1.6.7      2020-03-05 [1] CRAN (R 4.0.0)
#>  lhs           1.0.2      2020-04-13 [1] CRAN (R 4.0.0)
#>  lifecycle     0.2.0      2020-03-06 [1] CRAN (R 4.0.0)
#>  listenv       0.8.0      2019-12-05 [1] CRAN (R 4.0.0)
#>  lubridate     1.7.8      2020-04-06 [1] CRAN (R 4.0.0)
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 4.0.0)
#>  MASS          7.3-51.6   2020-04-26 [3] CRAN (R 4.0.0)
#>  Matrix        1.2-18     2019-11-27 [3] CRAN (R 4.0.0)
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 4.0.0)
#>  munsell       0.5.0      2018-06-12 [1] CRAN (R 4.0.0)
#>  nnet          7.3-14     2020-04-26 [3] CRAN (R 4.0.0)
#>  parsnip     * 0.1.1      2020-05-06 [1] CRAN (R 4.0.0)
#>  pillar        1.4.4      2020-05-05 [1] CRAN (R 4.0.0)
#>  pkgbuild      1.0.8      2020-05-07 [1] CRAN (R 4.0.0)
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.0.0)
#>  pkgload       1.1.0      2020-05-29 [1] CRAN (R 4.0.0)
#>  plyr          1.8.6      2020-03-03 [1] CRAN (R 4.0.0)
#>  prettyunits   1.1.1      2020-01-24 [1] CRAN (R 4.0.0)
#>  pROC          1.16.2     2020-03-19 [1] CRAN (R 4.0.0)
#>  processx      3.4.2      2020-02-09 [1] CRAN (R 4.0.0)
#>  prodlim       2019.11.13 2019-11-17 [1] CRAN (R 4.0.0)
#>  ps            1.3.3      2020-05-08 [1] CRAN (R 4.0.0)
#>  purrr         0.3.4      2020-04-17 [1] CRAN (R 4.0.0)
#>  R6            2.4.1      2019-11-12 [1] CRAN (R 4.0.0)
#>  Rcpp          1.0.4.6    2020-04-09 [1] CRAN (R 4.0.0)
#>  recipes     * 0.1.12     2020-05-01 [1] CRAN (R 4.0.0)
#>  remotes       2.1.1      2020-02-15 [1] CRAN (R 4.0.0)
#>  rlang         0.4.6      2020-05-02 [1] CRAN (R 4.0.0)
#>  rmarkdown     2.2        2020-05-31 [1] CRAN (R 4.0.0)
#>  rpart         4.1-15     2019-04-12 [3] CRAN (R 4.0.0)
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 4.0.0)
#>  rsample     * 0.0.7      2020-06-04 [1] CRAN (R 4.0.0)
#>  scales        1.1.1      2020-05-11 [1] CRAN (R 4.0.0)
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 4.0.0)
#>  stringi       1.4.6      2020-02-17 [1] CRAN (R 4.0.0)
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 4.0.0)
#>  survival      3.1-12     2020-04-10 [3] CRAN (R 4.0.0)
#>  testthat      2.3.2      2020-03-02 [1] CRAN (R 4.0.0)
#>  tibble        3.0.1      2020-04-20 [1] CRAN (R 4.0.0)
#>  tidyr         1.1.0      2020-05-20 [1] CRAN (R 4.0.0)
#>  tidyselect    1.1.0      2020-05-11 [1] CRAN (R 4.0.0)
#>  timeDate      3043.102   2018-02-21 [1] CRAN (R 4.0.0)
#>  tune        * 0.1.0      2020-04-02 [1] CRAN (R 4.0.0)
#>  usethis       1.6.1      2020-04-29 [1] CRAN (R 4.0.0)
#>  utf8          1.1.4      2018-05-24 [1] CRAN (R 4.0.0)
#>  vctrs         0.3.1      2020-06-05 [1] CRAN (R 4.0.0)
#>  withr         2.2.0      2020-04-20 [1] CRAN (R 4.0.0)
#>  workflows     0.1.1      2020-03-17 [1] CRAN (R 4.0.0)
#>  xfun          0.14       2020-05-20 [1] CRAN (R 4.0.0)
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 4.0.0)
#>  yardstick     0.0.6      2020-03-17 [1] CRAN (R 4.0.0)
#> 
#> [1] /usr/local/lib/R/site-library
#> [2] /usr/lib/R/site-library
#> [3] /usr/lib/R/library
@juliasilge
Member

Looks like threshold is not currently tuneable:

recipes/R/pca.R

Line 266 in 6b891ba

tunable.step_pca <- function(x, ...) {
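
For illustration, a tunable method that also lists threshold could look roughly like the sketch below. It is written as a standalone function, `tunable_step_pca_sketch()` (a hypothetical name), so that it does not override the real method, and it is not necessarily the code that was eventually merged. The column names follow the tunable() output shown above; the parameter ranges are assumptions.

```r
library(tibble)

# Hypothetical stand-in for a tunable() method on step_pca objects.
# `x` only needs an `id` element here, mirroring a recipe step.
tunable_step_pca_sketch <- function(x, ...) {
  tibble(
    name = c("num_comp", "threshold"),
    call_info = list(
      # Each entry points at a dials parameter function; ranges are assumed
      list(pkg = "dials", fun = "num_comp", range = c(1L, 4L)),
      list(pkg = "dials", fun = "threshold", range = c(0, 1))
    ),
    source = "recipe",
    component = "step_pca",
    component_id = x$id
  )
}

# Usage with a mock step object
sketch_out <- tunable_step_pca_sketch(list(id = "pca_example"))
sketch_out
```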

@juliasilge added the feature (a feature request or enhancement) label on Jun 18, 2020
@gregdenay
Contributor

Hi, is this already being worked on? If not, I could give it a go.

@topepo
Member

topepo commented Dec 9, 2020

It's on my list but please give it a try if you like.

@juliasilge
Member

Closed in #615

Thank you so much for your contribution! 🎉

@github-actions

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org

@github-actions bot locked and limited conversation to collaborators on Feb 20, 2021
4 participants