Custom recipe steps don't work when tuning is done in parallel #192

Closed
UnclAlDeveloper opened this issue Apr 3, 2020 · 5 comments
Labels
bug an unexpected problem or unintended behavior

Comments

@UnclAlDeveloper

The problem

When a custom recipe step is created, it can be used in a tuning grid when the grid is run sequentially, i.e. control_grid(allow_par = FALSE), but not when the grid is run in parallel, i.e. allow_par = TRUE.

When run in parallel the following error occurs (in this case I am using the step_percentile example custom step):

  • "recipe: Error in UseMethod("prep"): no applicable method for 'prep' applied to an object of class "c('step_percentile', 'step')""
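One plausible reading of this error is that the `prep.step_percentile` method exists only in the main R session, while each parallel worker is a fresh R process that never receives it. The base-R sketch below reproduces the same class of failure without tidymodels. The `widget` class and `simulate.widget` method are made-up names, and `stats::simulate()` merely stands in for `prep()` as a generic with no default method. It is an analogy, not a description of how tune distributes work internally.

```r
library(parallel)

# Hypothetical S3 method defined only in the current session,
# analogous to prep.step_percentile in the reprex.
simulate.widget <- function(object, nsim = 1, seed = NULL, ...) {
  "simulated a widget"
}
obj <- structure(list(), class = "widget")

# In the main session, dispatch finds the method.
local_result <- stats::simulate(obj)

# A PSOCK worker is a fresh R process: the object arrives with its class
# attribute, but the method definition does not travel with it.
cl <- makePSOCKcluster(1)
worker_error <- tryCatch(
  clusterCall(cl, stats::simulate, obj)[[1]],
  error = function(e) conditionMessage(e)
)

# Copying the method into the worker's global environment restores dispatch.
clusterExport(cl, "simulate.widget", envir = environment())
worker_result <- clusterCall(cl, stats::simulate, obj)[[1]]

stopCluster(cl)
```

Here `worker_error` should hold a "no applicable method" message of the same form as the one reported above, while `worker_result` matches `local_result` once the method has been exported.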

Reproducible example

Below I used the example code from the recipes package for creating a custom step, and from the tune package for running a model through a tuning grid, with a few modifications that are mostly noted. You will see at the bottom of the example the "no applicable method for 'prep'" error. I should also note that I've tried adding step_percentile to a package and adding pkgs="MyPackage" to the control_grid, but this did not make a difference.

In a second version of the code, which I've placed underneath, I've modified it to run sequentially, and it runs correctly without any issues.

You can skip to the last 20 lines of the example to see the actual relevant code. Above that is just the definition of the custom step.

Failing Parallel Version

library(tidymodels)
#> -- Attaching packages ------------------------------------------------------------------------------------------------------------------------------------ tidymodels 0.1.0 --
#> v broom     0.5.5      v recipes   0.1.10
#> v dials     0.0.5      v rsample   0.0.6 
#> v dplyr     0.8.5      v tibble    3.0.0 
#> v ggplot2   3.3.0      v tune      0.0.1 
#> v infer     0.5.1      v workflows 0.1.1 
#> v parsnip   0.0.5      v yardstick 0.0.6 
#> v purrr     0.3.3
#> -- Conflicts --------------------------------------------------------------------------------------------------------------------------------------- tidymodels_conflicts() --
#> x purrr::discard()  masks scales::discard()
#> x dplyr::filter()   masks stats::filter()
#> x dplyr::lag()      masks stats::lag()
#> x ggplot2::margin() masks dials::margin()
#> x recipes::step()   masks stats::step()
library(mlbench)
library(doParallel)
#> Loading required package: foreach
#> 
#> Attaching package: 'foreach'
#> The following objects are masked from 'package:purrr':
#> 
#>     accumulate, when
#> Loading required package: iterators
#> Loading required package: parallel

### RECIPES

step_percentile <- function(
  recipe, ..., 
  role = NA, 
  trained = FALSE, 
  ref_dist = NULL,
  approx = FALSE, 
  options = list(probs = (0:100)/100, names = TRUE),
  skip = FALSE,
  id = rand_id("percentile")
) {
  
  ## The variable selectors are not immediately evaluated by using
  ##  the `quos` function in `rlang`. `ellipse_check` captures the
  ##  values and also checks to make sure that they are not empty.  
  terms <- recipes::ellipse_check(...) 
  
  recipes::add_step(
    recipe, 
    step_percentile_new(
      terms = terms, 
      trained = trained,
      role = role, 
      ref_dist = ref_dist,
      approx = approx,
      options = options,
      skip = skip,
      id = id
    )
  )
}

step_percentile_new <- 
  function(terms, role, trained, ref_dist, approx, options, skip, id) {
    step(
      subclass = "percentile", 
      terms = terms,
      role = role,
      trained = trained,
      ref_dist = ref_dist,
      approx = approx,
      options = options,
      skip = skip,
      id = id
    )
  }

get_pctl <- function(x, args) {
  args$x <- x
  do.call("quantile", args)
}

prep.step_percentile <- function(x, training, info = NULL, ...) {
  col_names <- recipes::terms_select(terms = x$terms, info = info) 
  ## You can add error trapping for non-numeric data here and so on. See the
  ## `check_type` function to do this for basic types. 
  
  ## We'll use the names later so
  if (x$options$names == FALSE)
    rlang::abort("`names` should be set to TRUE")
  
  if (!x$approx) {
    ref_dist <- training[, col_names]
  } else {
    ref_dist <- purrr::map(training[, col_names],  get_pctl, args = x$options)
  }
  
  ## Use the constructor function to return the updated object. 
  ## Note that `trained` is set to TRUE
  
  step_percentile_new(
    terms = x$terms, 
    trained = TRUE,
    role = x$role, 
    ref_dist = ref_dist,
    approx = x$approx,
    options = x$options,
    skip = x$skip,
    id = x$id
  )
}

## Two helper functions
pctl_by_mean <- function(x, ref) mean(ref <= x)

pctl_by_approx <- function(x, ref) {
  ## go from 1 column tibble to vector
  x <- getElement(x, names(x))
  ## get the percentiles values from the names (e.g. "10%")
  p_grid <- as.numeric(gsub("%$", "", names(ref))) 
  approx(x = ref, y = p_grid, xout = x)$y/100
}

bake.step_percentile <- function(object, new_data, ...) {
  require(tibble)
  ## For illustration (and not speed), we will loop through the affected variables
  ## and do the computations
  vars <- names(object$ref_dist)
  
  for (i in vars) {
    if (!object$approx) {
      ## We can use `apply` since tibbles do not drop dimensions:
      new_data[, i] <- apply(new_data[, i], 1, pctl_by_mean, 
                             ref = object$ref_dist[, i])
    } else 
      new_data[, i] <- pctl_by_approx(new_data[, i], object$ref_dist[[i]])
  }
  ## Always convert to tibbles on the way out
  as_tibble(new_data)
}


### TUNE
data(Ionosphere)

Ionosphere <- Ionosphere %>% select(-V2)

svm_mod <-
  svm_rbf(cost = tune(), rbf_sigma = tune()) %>%
  set_mode("classification") %>%
  set_engine("kernlab")

iono_rec <-
  recipe(Class ~ ., data = Ionosphere)  %>%
  # In case V1 has a single value sampled
  step_zv(recipes::all_predictors()) %>% 
  # convert it to a dummy variable
  step_dummy(V1) %>%
  # Scale it the same as the others
  step_range(matches("V1_")) %>% 
  step_percentile(recipes::all_numeric()) # custom step added to the recipe

set.seed(4943)
iono_rs <- bootstraps(Ionosphere, times = 5) # times reduced to speed up sequential model tuning

roc_vals <- metric_set(roc_auc)

cluster = makeCluster(detectCores()) # Create a cluster
registerDoParallel(cluster)

ctrl <- control_grid(verbose = FALSE, allow_par = TRUE) # Run in parallel

set.seed(35)
grid_form <-
  tune_grid(
    iono_rec,
    model = svm_mod,
    resamples = iono_rs,
    metrics = roc_vals,
    control = ctrl
  )
#> Warning: All models failed in tune_grid(). See the `.notes` column.
print(grid_form$.notes[[1]])
#> # A tibble: 1 x 1
#>   .notes                                                                        
#>   <chr>                                                                         
#> 1 "recipe: Error in UseMethod(\"prep\"): no applicable method for 'prep' applie~

Created on 2020-04-03 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.6.3 (2020-02-29)
#>  os       Windows Server x64          
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  English_United States.1252  
#>  ctype    English_United States.1252  
#>  tz       America/New_York            
#>  date     2020-04-03                  
#> 
#> - Packages -------------------------------------------------------------------
#>  package       * version    date       lib source        
#>  assertthat      0.2.1      2019-03-21 [1] CRAN (R 3.6.1)
#>  backports       1.1.5      2019-10-02 [1] CRAN (R 3.6.1)
#>  base64enc       0.1-3      2015-07-28 [1] CRAN (R 3.6.0)
#>  bayesplot       1.7.1      2019-12-01 [1] CRAN (R 3.6.3)
#>  boot            1.3-24     2019-12-20 [2] CRAN (R 3.6.3)
#>  broom         * 0.5.5      2020-02-29 [1] CRAN (R 3.6.3)
#>  callr           3.4.3      2020-03-28 [1] CRAN (R 3.6.3)
#>  class           7.3-15     2019-01-01 [2] CRAN (R 3.6.3)
#>  cli             2.0.2      2020-02-28 [1] CRAN (R 3.6.3)
#>  codetools       0.2-16     2018-12-24 [2] CRAN (R 3.6.3)
#>  colorspace      1.4-1      2019-03-18 [1] CRAN (R 3.6.1)
#>  colourpicker    1.0        2017-09-27 [1] CRAN (R 3.6.3)
#>  crayon          1.3.4      2017-09-16 [1] CRAN (R 3.6.1)
#>  crosstalk       1.1.0.1    2020-03-13 [1] CRAN (R 3.6.3)
#>  desc            1.2.0      2018-05-01 [1] CRAN (R 3.6.1)
#>  devtools        2.2.2      2020-02-17 [1] CRAN (R 3.6.3)
#>  dials         * 0.0.5      2020-04-01 [1] CRAN (R 3.6.3)
#>  DiceDesign      1.8-1      2019-07-31 [1] CRAN (R 3.6.3)
#>  digest          0.6.25     2020-02-23 [1] CRAN (R 3.6.3)
#>  doParallel    * 1.0.15     2019-08-02 [1] CRAN (R 3.6.1)
#>  dplyr         * 0.8.5      2020-03-07 [1] CRAN (R 3.6.3)
#>  DT              0.13       2020-03-23 [1] CRAN (R 3.6.3)
#>  dygraphs        1.1.1.6    2018-07-11 [1] CRAN (R 3.6.3)
#>  ellipsis        0.3.0      2019-09-20 [1] CRAN (R 3.6.1)
#>  evaluate        0.14       2019-05-28 [1] CRAN (R 3.6.0)
#>  fansi           0.4.1      2020-01-08 [1] CRAN (R 3.6.2)
#>  fastmap         1.0.1      2019-10-08 [1] CRAN (R 3.6.1)
#>  foreach       * 1.5.0      2020-03-30 [1] CRAN (R 3.6.3)
#>  fs              1.4.0      2020-03-31 [1] CRAN (R 3.6.3)
#>  furrr           0.1.0      2018-05-16 [1] CRAN (R 3.6.3)
#>  future          1.16.0     2020-01-16 [1] CRAN (R 3.6.2)
#>  generics        0.0.2      2018-11-29 [1] CRAN (R 3.6.1)
#>  ggplot2       * 3.3.0      2020-03-05 [1] CRAN (R 3.6.3)
#>  ggridges        0.5.2      2020-01-12 [1] CRAN (R 3.6.3)
#>  globals         0.12.5     2019-12-07 [1] CRAN (R 3.6.1)
#>  glue            1.3.2      2020-03-12 [1] CRAN (R 3.6.3)
#>  gower           0.2.1      2019-05-14 [1] CRAN (R 3.6.0)
#>  GPfit           1.0-8      2019-02-08 [1] CRAN (R 3.6.3)
#>  gridExtra       2.3        2017-09-09 [1] CRAN (R 3.6.1)
#>  gtable          0.3.0      2019-03-25 [1] CRAN (R 3.6.1)
#>  gtools          3.8.2      2020-03-31 [1] CRAN (R 3.6.3)
#>  hardhat         0.1.2      2020-02-28 [1] CRAN (R 3.6.3)
#>  highr           0.8        2019-03-20 [1] CRAN (R 3.6.1)
#>  htmltools       0.4.0      2019-10-04 [1] CRAN (R 3.6.1)
#>  htmlwidgets     1.5.1      2019-10-08 [1] CRAN (R 3.6.1)
#>  httpuv          1.5.2      2019-09-11 [1] CRAN (R 3.6.1)
#>  igraph          1.2.5      2020-03-19 [1] CRAN (R 3.6.3)
#>  infer         * 0.5.1      2019-11-19 [1] CRAN (R 3.6.3)
#>  inline          0.3.15     2018-05-18 [1] CRAN (R 3.6.3)
#>  ipred           0.9-9      2019-04-28 [1] CRAN (R 3.6.0)
#>  iterators     * 1.0.12     2019-07-26 [1] CRAN (R 3.6.1)
#>  janeaustenr     0.1.5      2017-06-10 [1] CRAN (R 3.6.1)
#>  kernlab         0.9-29     2019-11-12 [1] CRAN (R 3.6.1)
#>  knitr           1.28       2020-02-06 [1] CRAN (R 3.6.3)
#>  later           1.0.0      2019-10-04 [1] CRAN (R 3.6.1)
#>  lattice         0.20-40    2020-02-19 [2] CRAN (R 3.6.3)
#>  lava            1.6.7      2020-03-05 [1] CRAN (R 3.6.3)
#>  lhs             1.0.1      2019-02-03 [1] CRAN (R 3.6.3)
#>  lifecycle       0.2.0      2020-03-06 [1] CRAN (R 3.6.3)
#>  listenv         0.8.0      2019-12-05 [1] CRAN (R 3.6.1)
#>  lme4            1.1-21     2019-03-05 [1] CRAN (R 3.6.3)
#>  loo             2.2.0      2019-12-19 [1] CRAN (R 3.6.3)
#>  lubridate       1.7.4      2018-04-11 [1] CRAN (R 3.6.1)
#>  magrittr        1.5        2014-11-22 [1] CRAN (R 3.6.1)
#>  markdown        1.1        2019-08-07 [1] CRAN (R 3.6.1)
#>  MASS            7.3-51.5   2019-12-20 [1] CRAN (R 3.6.2)
#>  Matrix          1.2-18     2019-11-27 [2] CRAN (R 3.6.3)
#>  matrixStats     0.56.0     2020-03-13 [1] CRAN (R 3.6.3)
#>  memoise         1.1.0      2017-04-21 [1] CRAN (R 3.6.1)
#>  mime            0.9        2020-02-04 [1] CRAN (R 3.6.2)
#>  miniUI          0.1.1.1    2018-05-18 [1] CRAN (R 3.6.3)
#>  minqa           1.2.4      2014-10-09 [1] CRAN (R 3.6.1)
#>  mlbench       * 2.1-1      2012-07-10 [1] CRAN (R 3.6.3)
#>  munsell         0.5.0      2018-06-12 [1] CRAN (R 3.6.1)
#>  nlme            3.1-145    2020-03-04 [2] CRAN (R 3.6.3)
#>  nloptr          1.2.2.1    2020-03-11 [1] CRAN (R 3.6.3)
#>  nnet            7.3-13     2020-02-25 [2] CRAN (R 3.6.3)
#>  parsnip       * 0.0.5      2020-01-07 [1] CRAN (R 3.6.3)
#>  pillar          1.4.3      2019-12-20 [1] CRAN (R 3.6.2)
#>  pkgbuild        1.0.6      2019-10-09 [1] CRAN (R 3.6.1)
#>  pkgconfig       2.0.3      2019-09-22 [1] CRAN (R 3.6.1)
#>  pkgload         1.0.2      2018-10-29 [1] CRAN (R 3.6.1)
#>  plyr            1.8.6      2020-03-03 [1] CRAN (R 3.6.3)
#>  prettyunits     1.1.1      2020-01-24 [1] CRAN (R 3.6.2)
#>  pROC            1.16.2     2020-03-19 [1] CRAN (R 3.6.2)
#>  processx        3.4.2      2020-02-09 [1] CRAN (R 3.6.3)
#>  prodlim         2019.11.13 2019-11-17 [1] CRAN (R 3.6.1)
#>  promises        1.1.0      2019-10-04 [1] CRAN (R 3.6.1)
#>  ps              1.3.2      2020-02-13 [1] CRAN (R 3.6.3)
#>  purrr         * 0.3.3      2019-10-18 [1] CRAN (R 3.6.1)
#>  R6              2.4.1      2019-11-12 [1] CRAN (R 3.6.1)
#>  Rcpp            1.0.4      2020-03-17 [1] CRAN (R 3.6.2)
#>  recipes       * 0.1.10     2020-03-18 [1] CRAN (R 3.6.2)
#>  remotes         2.1.1      2020-02-15 [1] CRAN (R 3.6.3)
#>  reshape2        1.4.3      2017-12-11 [1] CRAN (R 3.6.1)
#>  rlang           0.4.5      2020-03-01 [1] CRAN (R 3.6.3)
#>  rmarkdown       2.1        2020-01-20 [1] CRAN (R 3.6.2)
#>  rpart           4.1-15     2019-04-12 [2] CRAN (R 3.6.3)
#>  rprojroot       1.3-2      2018-01-03 [1] CRAN (R 3.6.1)
#>  rsample       * 0.0.6      2020-03-31 [1] CRAN (R 3.6.3)
#>  rsconnect       0.8.16     2019-12-13 [1] CRAN (R 3.6.3)
#>  rstan           2.19.3     2020-02-11 [1] CRAN (R 3.6.3)
#>  rstanarm        2.19.3     2020-02-11 [1] CRAN (R 3.6.3)
#>  rstantools      2.0.0      2019-09-15 [1] CRAN (R 3.6.3)
#>  rstudioapi      0.11       2020-02-07 [1] CRAN (R 3.6.3)
#>  scales        * 1.1.0      2019-11-18 [1] CRAN (R 3.6.1)
#>  sessioninfo     1.1.1      2018-11-05 [1] CRAN (R 3.6.1)
#>  shiny           1.4.0.2    2020-03-13 [1] CRAN (R 3.6.3)
#>  shinyjs         1.1        2020-01-13 [1] CRAN (R 3.6.2)
#>  shinystan       2.5.0      2018-05-01 [1] CRAN (R 3.6.3)
#>  shinythemes     1.1.2      2018-11-06 [1] CRAN (R 3.6.3)
#>  SnowballC       0.7.0      2020-04-01 [1] CRAN (R 3.6.3)
#>  StanHeaders     2.21.0-1   2020-01-19 [1] CRAN (R 3.6.2)
#>  stringi         1.4.6      2020-02-17 [1] CRAN (R 3.6.2)
#>  stringr         1.4.0      2019-02-10 [1] CRAN (R 3.6.1)
#>  survival        3.1-11     2020-03-07 [2] CRAN (R 3.6.3)
#>  testthat        2.3.2      2020-03-02 [1] CRAN (R 3.6.3)
#>  threejs         0.3.3      2020-01-21 [1] CRAN (R 3.6.3)
#>  tibble        * 3.0.0      2020-03-30 [1] CRAN (R 3.6.3)
#>  tidymodels    * 0.1.0      2020-02-16 [1] CRAN (R 3.6.3)
#>  tidyposterior   0.0.2      2018-11-15 [1] CRAN (R 3.6.3)
#>  tidypredict     0.4.5      2020-02-10 [1] CRAN (R 3.6.3)
#>  tidyr           1.0.2      2020-01-24 [1] CRAN (R 3.6.2)
#>  tidyselect      1.0.0      2020-01-27 [1] CRAN (R 3.6.1)
#>  tidytext        0.2.3      2020-03-04 [1] CRAN (R 3.6.3)
#>  timeDate        3043.102   2018-02-21 [1] CRAN (R 3.6.0)
#>  tokenizers      0.2.1      2018-03-29 [1] CRAN (R 3.6.1)
#>  tune          * 0.0.1      2020-02-11 [1] CRAN (R 3.6.3)
#>  usethis         1.5.1      2019-07-04 [1] CRAN (R 3.6.1)
#>  utf8            1.1.4      2018-05-24 [1] CRAN (R 3.6.1)
#>  vctrs           0.2.4      2020-03-10 [1] CRAN (R 3.6.3)
#>  withr           2.1.2      2018-03-15 [1] CRAN (R 3.6.1)
#>  workflows     * 0.1.1      2020-03-17 [1] CRAN (R 3.6.2)
#>  xfun            0.12       2020-01-13 [1] CRAN (R 3.6.2)
#>  xtable          1.8-4      2019-04-21 [1] CRAN (R 3.6.0)
#>  xts             0.12-0     2020-01-19 [1] CRAN (R 3.6.3)
#>  yaml            2.2.1      2020-02-01 [1] CRAN (R 3.6.1)
#>  yardstick     * 0.0.6      2020-03-17 [1] CRAN (R 3.6.2)
#>  zoo             1.8-7      2020-01-10 [1] CRAN (R 3.6.3)
#> 
#> [1] D:/R/Library/3.6
#> [2] C:/Program Files/R/R-3.6.3/library

Working Sequential Version

library(tidymodels)
#> -- Attaching packages ------------------------------------------------------------------------------------------------------------------------------------ tidymodels 0.1.0 --
#> v broom     0.5.5      v recipes   0.1.10
#> v dials     0.0.5      v rsample   0.0.6 
#> v dplyr     0.8.5      v tibble    3.0.0 
#> v ggplot2   3.3.0      v tune      0.0.1 
#> v infer     0.5.1      v workflows 0.1.1 
#> v parsnip   0.0.5      v yardstick 0.0.6 
#> v purrr     0.3.3
#> -- Conflicts --------------------------------------------------------------------------------------------------------------------------------------- tidymodels_conflicts() --
#> x purrr::discard()  masks scales::discard()
#> x dplyr::filter()   masks stats::filter()
#> x dplyr::lag()      masks stats::lag()
#> x ggplot2::margin() masks dials::margin()
#> x recipes::step()   masks stats::step()
library(mlbench)

### RECIPES

step_percentile <- function(
  recipe, ..., 
  role = NA, 
  trained = FALSE, 
  ref_dist = NULL,
  approx = FALSE, 
  options = list(probs = (0:100)/100, names = TRUE),
  skip = FALSE,
  id = rand_id("percentile")
) {
  
  ## The variable selectors are not immediately evaluated by using
  ##  the `quos` function in `rlang`. `ellipse_check` captures the
  ##  values and also checks to make sure that they are not empty.  
  terms <- recipes::ellipse_check(...) 
  
  recipes::add_step(
    recipe, 
    step_percentile_new(
      terms = terms, 
      trained = trained,
      role = role, 
      ref_dist = ref_dist,
      approx = approx,
      options = options,
      skip = skip,
      id = id
    )
  )
}

step_percentile_new <- 
  function(terms, role, trained, ref_dist, approx, options, skip, id) {
    step(
      subclass = "percentile", 
      terms = terms,
      role = role,
      trained = trained,
      ref_dist = ref_dist,
      approx = approx,
      options = options,
      skip = skip,
      id = id
    )
  }

get_pctl <- function(x, args) {
  args$x <- x
  do.call("quantile", args)
}

prep.step_percentile <- function(x, training, info = NULL, ...) {
  col_names <- recipes::terms_select(terms = x$terms, info = info) 
  ## You can add error trapping for non-numeric data here and so on. See the
  ## `check_type` function to do this for basic types. 
  
  ## We'll use the names later so
  if (x$options$names == FALSE)
    rlang::abort("`names` should be set to TRUE")
  
  if (!x$approx) {
    ref_dist <- training[, col_names]
  } else {
    ref_dist <- purrr::map(training[, col_names],  get_pctl, args = x$options)
  }
  
  ## Use the constructor function to return the updated object. 
  ## Note that `trained` is set to TRUE
  
  step_percentile_new(
    terms = x$terms, 
    trained = TRUE,
    role = x$role, 
    ref_dist = ref_dist,
    approx = x$approx,
    options = x$options,
    skip = x$skip,
    id = x$id
  )
}

## Two helper functions
pctl_by_mean <- function(x, ref) mean(ref <= x)

pctl_by_approx <- function(x, ref) {
  ## go from 1 column tibble to vector
  x <- getElement(x, names(x))
  ## get the percentiles values from the names (e.g. "10%")
  p_grid <- as.numeric(gsub("%$", "", names(ref))) 
  approx(x = ref, y = p_grid, xout = x)$y/100
}

bake.step_percentile <- function(object, new_data, ...) {
  require(tibble)
  ## For illustration (and not speed), we will loop through the affected variables
  ## and do the computations
  vars <- names(object$ref_dist)
  
  for (i in vars) {
    if (!object$approx) {
      ## We can use `apply` since tibbles do not drop dimensions:
      new_data[, i] <- apply(new_data[, i], 1, pctl_by_mean, 
                             ref = object$ref_dist[, i])
    } else 
      new_data[, i] <- pctl_by_approx(new_data[, i], object$ref_dist[[i]])
  }
  ## Always convert to tibbles on the way out
  as_tibble(new_data)
}


### TUNE
data(Ionosphere)

Ionosphere <- Ionosphere %>% select(-V2)

svm_mod <-
  svm_rbf(cost = tune(), rbf_sigma = tune()) %>%
  set_mode("classification") %>%
  set_engine("kernlab")

iono_rec <-
  recipe(Class ~ ., data = Ionosphere)  %>%
  # In case V1 has a single value sampled
  step_zv(recipes::all_predictors()) %>% 
  # convert it to a dummy variable
  step_dummy(V1) %>%
  # Scale it the same as the others
  step_range(matches("V1_")) %>% 
  step_percentile(recipes::all_numeric()) # custom step added to the recipe

set.seed(4943)
iono_rs <- bootstraps(Ionosphere, times = 5) # times reduced to speed up sequential model tuning

roc_vals <- metric_set(roc_auc)

ctrl <- control_grid(verbose = FALSE, allow_par = FALSE)

set.seed(35)
grid_form <-
  tune_grid(
    iono_rec,
    model = svm_mod,
    resamples = iono_rs,
    metrics = roc_vals,
    control = ctrl
  )
print(grid_form)
#> # Bootstrap sampling 
#> # A tibble: 5 x 4
#>   splits            id         .metrics          .notes          
#>   <list>            <chr>      <list>            <list>          
#> 1 <split [351/120]> Bootstrap1 <tibble [10 x 5]> <tibble [0 x 1]>
#> 2 <split [351/130]> Bootstrap2 <tibble [10 x 5]> <tibble [0 x 1]>
#> 3 <split [351/137]> Bootstrap3 <tibble [10 x 5]> <tibble [0 x 1]>
#> 4 <split [351/141]> Bootstrap4 <tibble [10 x 5]> <tibble [0 x 1]>
#> 5 <split [351/131]> Bootstrap5 <tibble [10 x 5]> <tibble [0 x 1]>

Created on 2020-04-03 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.6.3 (2020-02-29)
#>  os       Windows Server x64          
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  English_United States.1252  
#>  ctype    English_United States.1252  
#>  tz       America/New_York            
#>  date     2020-04-03                  
#> 
#> - Packages -------------------------------------------------------------------
#>  package       * version    date       lib source        
#>  assertthat      0.2.1      2019-03-21 [1] CRAN (R 3.6.1)
#>  backports       1.1.5      2019-10-02 [1] CRAN (R 3.6.1)
#>  base64enc       0.1-3      2015-07-28 [1] CRAN (R 3.6.0)
#>  bayesplot       1.7.1      2019-12-01 [1] CRAN (R 3.6.3)
#>  boot            1.3-24     2019-12-20 [2] CRAN (R 3.6.3)
#>  broom         * 0.5.5      2020-02-29 [1] CRAN (R 3.6.3)
#>  callr           3.4.3      2020-03-28 [1] CRAN (R 3.6.3)
#>  class           7.3-15     2019-01-01 [2] CRAN (R 3.6.3)
#>  cli             2.0.2      2020-02-28 [1] CRAN (R 3.6.3)
#>  codetools       0.2-16     2018-12-24 [2] CRAN (R 3.6.3)
#>  colorspace      1.4-1      2019-03-18 [1] CRAN (R 3.6.1)
#>  colourpicker    1.0        2017-09-27 [1] CRAN (R 3.6.3)
#>  crayon          1.3.4      2017-09-16 [1] CRAN (R 3.6.1)
#>  crosstalk       1.1.0.1    2020-03-13 [1] CRAN (R 3.6.3)
#>  desc            1.2.0      2018-05-01 [1] CRAN (R 3.6.1)
#>  devtools        2.2.2      2020-02-17 [1] CRAN (R 3.6.3)
#>  dials         * 0.0.5      2020-04-01 [1] CRAN (R 3.6.3)
#>  DiceDesign      1.8-1      2019-07-31 [1] CRAN (R 3.6.3)
#>  digest          0.6.25     2020-02-23 [1] CRAN (R 3.6.3)
#>  dplyr         * 0.8.5      2020-03-07 [1] CRAN (R 3.6.3)
#>  DT              0.13       2020-03-23 [1] CRAN (R 3.6.3)
#>  dygraphs        1.1.1.6    2018-07-11 [1] CRAN (R 3.6.3)
#>  ellipsis        0.3.0      2019-09-20 [1] CRAN (R 3.6.1)
#>  evaluate        0.14       2019-05-28 [1] CRAN (R 3.6.0)
#>  fansi           0.4.1      2020-01-08 [1] CRAN (R 3.6.2)
#>  fastmap         1.0.1      2019-10-08 [1] CRAN (R 3.6.1)
#>  foreach         1.5.0      2020-03-30 [1] CRAN (R 3.6.3)
#>  fs              1.4.0      2020-03-31 [1] CRAN (R 3.6.3)
#>  furrr           0.1.0      2018-05-16 [1] CRAN (R 3.6.3)
#>  future          1.16.0     2020-01-16 [1] CRAN (R 3.6.2)
#>  generics        0.0.2      2018-11-29 [1] CRAN (R 3.6.1)
#>  ggplot2       * 3.3.0      2020-03-05 [1] CRAN (R 3.6.3)
#>  ggridges        0.5.2      2020-01-12 [1] CRAN (R 3.6.3)
#>  globals         0.12.5     2019-12-07 [1] CRAN (R 3.6.1)
#>  glue            1.3.2      2020-03-12 [1] CRAN (R 3.6.3)
#>  gower           0.2.1      2019-05-14 [1] CRAN (R 3.6.0)
#>  GPfit           1.0-8      2019-02-08 [1] CRAN (R 3.6.3)
#>  gridExtra       2.3        2017-09-09 [1] CRAN (R 3.6.1)
#>  gtable          0.3.0      2019-03-25 [1] CRAN (R 3.6.1)
#>  gtools          3.8.2      2020-03-31 [1] CRAN (R 3.6.3)
#>  hardhat         0.1.2      2020-02-28 [1] CRAN (R 3.6.3)
#>  highr           0.8        2019-03-20 [1] CRAN (R 3.6.1)
#>  htmltools       0.4.0      2019-10-04 [1] CRAN (R 3.6.1)
#>  htmlwidgets     1.5.1      2019-10-08 [1] CRAN (R 3.6.1)
#>  httpuv          1.5.2      2019-09-11 [1] CRAN (R 3.6.1)
#>  igraph          1.2.5      2020-03-19 [1] CRAN (R 3.6.3)
#>  infer         * 0.5.1      2019-11-19 [1] CRAN (R 3.6.3)
#>  inline          0.3.15     2018-05-18 [1] CRAN (R 3.6.3)
#>  ipred           0.9-9      2019-04-28 [1] CRAN (R 3.6.0)
#>  iterators       1.0.12     2019-07-26 [1] CRAN (R 3.6.1)
#>  janeaustenr     0.1.5      2017-06-10 [1] CRAN (R 3.6.1)
#>  kernlab         0.9-29     2019-11-12 [1] CRAN (R 3.6.1)
#>  knitr           1.28       2020-02-06 [1] CRAN (R 3.6.3)
#>  later           1.0.0      2019-10-04 [1] CRAN (R 3.6.1)
#>  lattice         0.20-40    2020-02-19 [2] CRAN (R 3.6.3)
#>  lava            1.6.7      2020-03-05 [1] CRAN (R 3.6.3)
#>  lhs             1.0.1      2019-02-03 [1] CRAN (R 3.6.3)
#>  lifecycle       0.2.0      2020-03-06 [1] CRAN (R 3.6.3)
#>  listenv         0.8.0      2019-12-05 [1] CRAN (R 3.6.1)
#>  lme4            1.1-21     2019-03-05 [1] CRAN (R 3.6.3)
#>  loo             2.2.0      2019-12-19 [1] CRAN (R 3.6.3)
#>  lubridate       1.7.4      2018-04-11 [1] CRAN (R 3.6.1)
#>  magrittr        1.5        2014-11-22 [1] CRAN (R 3.6.1)
#>  markdown        1.1        2019-08-07 [1] CRAN (R 3.6.1)
#>  MASS            7.3-51.5   2019-12-20 [1] CRAN (R 3.6.2)
#>  Matrix          1.2-18     2019-11-27 [2] CRAN (R 3.6.3)
#>  matrixStats     0.56.0     2020-03-13 [1] CRAN (R 3.6.3)
#>  memoise         1.1.0      2017-04-21 [1] CRAN (R 3.6.1)
#>  mime            0.9        2020-02-04 [1] CRAN (R 3.6.2)
#>  miniUI          0.1.1.1    2018-05-18 [1] CRAN (R 3.6.3)
#>  minqa           1.2.4      2014-10-09 [1] CRAN (R 3.6.1)
#>  mlbench       * 2.1-1      2012-07-10 [1] CRAN (R 3.6.3)
#>  munsell         0.5.0      2018-06-12 [1] CRAN (R 3.6.1)
#>  nlme            3.1-145    2020-03-04 [2] CRAN (R 3.6.3)
#>  nloptr          1.2.2.1    2020-03-11 [1] CRAN (R 3.6.3)
#>  nnet            7.3-13     2020-02-25 [2] CRAN (R 3.6.3)
#>  parsnip       * 0.0.5      2020-01-07 [1] CRAN (R 3.6.3)
#>  pillar          1.4.3      2019-12-20 [1] CRAN (R 3.6.2)
#>  pkgbuild        1.0.6      2019-10-09 [1] CRAN (R 3.6.1)
#>  pkgconfig       2.0.3      2019-09-22 [1] CRAN (R 3.6.1)
#>  pkgload         1.0.2      2018-10-29 [1] CRAN (R 3.6.1)
#>  plyr            1.8.6      2020-03-03 [1] CRAN (R 3.6.3)
#>  prettyunits     1.1.1      2020-01-24 [1] CRAN (R 3.6.2)
#>  pROC            1.16.2     2020-03-19 [1] CRAN (R 3.6.2)
#>  processx        3.4.2      2020-02-09 [1] CRAN (R 3.6.3)
#>  prodlim         2019.11.13 2019-11-17 [1] CRAN (R 3.6.1)
#>  promises        1.1.0      2019-10-04 [1] CRAN (R 3.6.1)
#>  ps              1.3.2      2020-02-13 [1] CRAN (R 3.6.3)
#>  purrr         * 0.3.3      2019-10-18 [1] CRAN (R 3.6.1)
#>  R6              2.4.1      2019-11-12 [1] CRAN (R 3.6.1)
#>  Rcpp            1.0.4      2020-03-17 [1] CRAN (R 3.6.2)
#>  recipes       * 0.1.10     2020-03-18 [1] CRAN (R 3.6.2)
#>  remotes         2.1.1      2020-02-15 [1] CRAN (R 3.6.3)
#>  reshape2        1.4.3      2017-12-11 [1] CRAN (R 3.6.1)
#>  rlang           0.4.5      2020-03-01 [1] CRAN (R 3.6.3)
#>  rmarkdown       2.1        2020-01-20 [1] CRAN (R 3.6.2)
#>  rpart           4.1-15     2019-04-12 [2] CRAN (R 3.6.3)
#>  rprojroot       1.3-2      2018-01-03 [1] CRAN (R 3.6.1)
#>  rsample       * 0.0.6      2020-03-31 [1] CRAN (R 3.6.3)
#>  rsconnect       0.8.16     2019-12-13 [1] CRAN (R 3.6.3)
#>  rstan           2.19.3     2020-02-11 [1] CRAN (R 3.6.3)
#>  rstanarm        2.19.3     2020-02-11 [1] CRAN (R 3.6.3)
#>  rstantools      2.0.0      2019-09-15 [1] CRAN (R 3.6.3)
#>  rstudioapi      0.11       2020-02-07 [1] CRAN (R 3.6.3)
#>  scales        * 1.1.0      2019-11-18 [1] CRAN (R 3.6.1)
#>  sessioninfo     1.1.1      2018-11-05 [1] CRAN (R 3.6.1)
#>  shiny           1.4.0.2    2020-03-13 [1] CRAN (R 3.6.3)
#>  shinyjs         1.1        2020-01-13 [1] CRAN (R 3.6.2)
#>  shinystan       2.5.0      2018-05-01 [1] CRAN (R 3.6.3)
#>  shinythemes     1.1.2      2018-11-06 [1] CRAN (R 3.6.3)
#>  SnowballC       0.7.0      2020-04-01 [1] CRAN (R 3.6.3)
#>  StanHeaders     2.21.0-1   2020-01-19 [1] CRAN (R 3.6.2)
#>  stringi         1.4.6      2020-02-17 [1] CRAN (R 3.6.2)
#>  stringr         1.4.0      2019-02-10 [1] CRAN (R 3.6.1)
#>  survival        3.1-11     2020-03-07 [2] CRAN (R 3.6.3)
#>  testthat        2.3.2      2020-03-02 [1] CRAN (R 3.6.3)
#>  threejs         0.3.3      2020-01-21 [1] CRAN (R 3.6.3)
#>  tibble        * 3.0.0      2020-03-30 [1] CRAN (R 3.6.3)
#>  tidymodels    * 0.1.0      2020-02-16 [1] CRAN (R 3.6.3)
#>  tidyposterior   0.0.2      2018-11-15 [1] CRAN (R 3.6.3)
#>  tidypredict     0.4.5      2020-02-10 [1] CRAN (R 3.6.3)
#>  tidyr           1.0.2      2020-01-24 [1] CRAN (R 3.6.2)
#>  tidyselect      1.0.0      2020-01-27 [1] CRAN (R 3.6.1)
#>  tidytext        0.2.3      2020-03-04 [1] CRAN (R 3.6.3)
#>  timeDate        3043.102   2018-02-21 [1] CRAN (R 3.6.0)
#>  tokenizers      0.2.1      2018-03-29 [1] CRAN (R 3.6.1)
#>  tune          * 0.0.1      2020-02-11 [1] CRAN (R 3.6.3)
#>  usethis         1.5.1      2019-07-04 [1] CRAN (R 3.6.1)
#>  utf8            1.1.4      2018-05-24 [1] CRAN (R 3.6.1)
#>  vctrs           0.2.4      2020-03-10 [1] CRAN (R 3.6.3)
#>  withr           2.1.2      2018-03-15 [1] CRAN (R 3.6.1)
#>  workflows     * 0.1.1      2020-03-17 [1] CRAN (R 3.6.2)
#>  xfun            0.12       2020-01-13 [1] CRAN (R 3.6.2)
#>  xtable          1.8-4      2019-04-21 [1] CRAN (R 3.6.0)
#>  xts             0.12-0     2020-01-19 [1] CRAN (R 3.6.3)
#>  yaml            2.2.1      2020-02-01 [1] CRAN (R 3.6.1)
#>  yardstick     * 0.0.6      2020-03-17 [1] CRAN (R 3.6.2)
#>  zoo             1.8-7      2020-01-10 [1] CRAN (R 3.6.3)
#> 
#> [1] D:/R/Library/3.6
#> [2] C:/Program Files/R/R-3.6.3/library
@topepo added the bug (an unexpected problem or unintended behavior) label on Apr 3, 2020
@topepo changed the title from "Custom recipe steps don't work when tuning is down in parallel" to "Custom recipe steps don't work when tuning is done in parallel" on May 15, 2020
@ksizaidi-h

I have the same bug when I try tuning in parallel using a step from the themis package.

@lucasc400

I experienced the same issue with themis::step_smote().
I used the following code to create a parallel cluster, as described in https://tune.tidymodels.org/articles/extras/optimizations.html:

all_cores <- parallel::detectCores(logical = FALSE)
library(doParallel)
cl <- makePSOCKcluster(all_cores)
registerDoParallel(cl)

and got the following error message:
"recipe: Error in UseMethod("prep"): no applicable method for 'prep' applied to an object of class "c('step_smote', 'step')""

The same code worked when I used themis::step_downsample() instead. I don't think I can use the !!variable method suggested in the link, because I want to pass a column name to step_smote(). I can post my code if needed.

Has this issue been fixed and I missed something, or is it still under development? I see that the pull request has been merged, but I don't see how to get the fix (sorry, I'm quite new to GitHub).
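For steps that come from a package such as themis, one thing that may be worth trying until a fixed release is installed is attaching the package on every worker before tuning, so that its S3 methods are registered in those processes. The sketch below shows only the mechanism, using base R. `splines` is a stand-in for themis so that the example runs on any installation, and whether this resolves the error with tune 0.0.1 is an assumption that this thread does not confirm.

```r
library(parallel)

cl <- makePSOCKcluster(2)

# Check whether the package is attached on each worker before loading it.
before <- clusterEvalQ(cl, "package:splines" %in% search())

# Attach the package on every worker; loading a package registers its
# S3 methods in that process. `splines` stands in for themis here.
invisible(clusterEvalQ(cl, library(splines)))

after <- clusterEvalQ(cl, "package:splines" %in% search())

stopCluster(cl)
```

With the cluster created in the comment above, the corresponding call would presumably be `clusterEvalQ(cl, library(themis))`, issued before `tune_grid()`.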

@juliasilge
Member

The fix in the PR linked here was done after the latest CRAN release. Do you have the development version of the package from GitHub? You can install via:

# if needed: install.packages("devtools")
devtools::install_github("tidymodels/tune")

We are planning a new CRAN release for tune pretty soon, so that this fix will be available more broadly. If you are able, can you try this out and see if your problem is fixed?

@juliasilge
Member

I misspoke! This is fixed in the current version of recipes:

# if needed: install.packages("devtools")
devtools::install_github("tidymodels/recipes")

Open a new issue if you continue to have a problem after updating! 🙌
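After installing, it can help to confirm which versions are actually in use, since the reprex above ran recipes 0.1.10 and tune 0.0.1. Below is a small sketch built on `utils::packageVersion()`. The helper name `is_newer_than` is made up for illustration, and the thread does not say which release first contains the fix, so the comparison only shows whether an installation has moved past the versions in the reprex.

```r
# Hypothetical helper: TRUE if the installed version of `pkg` is newer than `version`.
is_newer_than <- function(pkg, version) {
  utils::packageVersion(pkg) > package_version(version)
}

# Versions taken from the reprex in this issue.
reprex_versions <- list(c("recipes", "0.1.10"), c("tune", "0.0.1"))

for (entry in reprex_versions) {
  pkg <- entry[1]
  # Only query packages that are installed, so packageVersion() does not error.
  if (requireNamespace(pkg, quietly = TRUE)) {
    cat(pkg, "newer than", entry[2], ":", is_newer_than(pkg, entry[2]), "\n")
  }
}
```

Restarting R after installation, and recreating any parallel cluster, ensures that both the main session and the workers load the updated package.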

@github-actions

github-actions bot commented Mar 6, 2021

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions bot locked and limited conversation to collaborators on Mar 6, 2021