
Unreliable random numbers produced when using doFuture backend #377

@mbac


Hi,

I’m using doFuture as a foreach backend. This is an approximation of the code I’m trying to run:

Edit: forgot the doFuture code:

library(tidymodels)
library(doFuture)

cores <- parallelly::availableCores()
registerDoFuture()
# Only option available on Macs, as I understand it:
plan("multisession")

data(cells)
set.seed(2369)
tr_te_split <- initial_split(cells %>% select(-case), prop = 3/4)
cell_train <- training(tr_te_split)
cell_test  <- testing(tr_te_split)

set.seed(1697)
folds <- vfold_cv(cell_train, v = 10)

cell_rec <- recipe(
    class ~ .,
    data = cell_train
)

boost_forest_mod <- boost_tree(
    mtry = tune(),
    trees = tune(),
    min_n = tune(),
    learn_rate = tune(),
    tree_depth = tune(),
    loss_reduction = tune(),
    sample_size = tune(),
    stop_iter = tune()
) %>%
    set_engine("xgboost") %>%
    set_mode("classification")

workflow_cells <- workflow() %>%
    add_recipe(cell_rec) %>%
    add_model(boost_forest_mod)

workflow_cells_tuned <- workflow_cells %>%
    tune_grid(
        folds,
        grid = 20,
        metrics = metric_set(roc_auc, precision, recall)
    )

The tuning procedure seems to work, but I get the following warning, apparently once for each foreach iteration run by the doFuture backend:

UNRELIABLE VALUE: One of the foreach() iterations (‘doFuture-7’) unexpectedly generated random numbers without
declaring so. There is a risk that those random numbers are not statistically sound and the overall results might be
invalid. To fix this, use ‘%dorng%’ from the ‘doRNG’ package instead of ‘%dopar%’. This ensures that proper,
parallel-safe random numbers are produced via the L’Ecuyer-CMRG method. To disable this check, set option
‘future.rng.onMisuse’ to “ignore”.
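For context, here is a minimal sketch of what the warning is asking for in hand-written foreach code. It assumes the doRNG package is installed; the toy loop and the names `draws_a` and `draws_b` are made up for illustration and are not part of the tuning code above. Since tune_grid() runs its own foreach loop internally, I don't think I can swap the operator myself in my case, so this only shows the mechanism the message refers to.

```r
library(doFuture)
library(doRNG)   # provides %dorng% (assumed to be installed)

registerDoFuture()
plan(multisession, workers = 2)

# With %dopar%, drawing random numbers inside the loop is what triggers
# the "UNRELIABLE VALUE" warning:
#   foreach(i = 1:3) %dopar% rnorm(1)

# With %dorng%, each iteration gets its own L'Ecuyer-CMRG stream derived
# from the current seed, so the draws are declared and reproducible.
set.seed(42)
draws_a <- foreach(i = 1:3) %dorng% rnorm(1)

set.seed(42)
draws_b <- foreach(i = 1:3) %dorng% rnorm(1)

# Same seed, same per-iteration streams, same values.
same_draws <- isTRUE(all.equal(unlist(draws_a), unlist(draws_b)))

plan(sequential)  # shut down the background workers
```

The other route named in the message, `options(future.rng.onMisuse = "ignore")`, only silences the check; as far as I can tell it does nothing to make the random numbers themselves parallel-safe.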

My session info:

> sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] xgboost_1.4.1.1          rcompanion_2.4.0         doFuture_0.12.0-9000    
 [4] future_1.21.0            foreach_1.5.1            multilevelmod_0.0.0.9000
 [7] REDCapR_0.11.1.9004      dtplyr_1.1.0             readxl_1.3.1            
[10] yardstick_0.0.8          workflowsets_0.0.2       workflows_0.2.2         
[13] tune_0.1.5               rsample_0.0.9            recipes_0.1.16          
[16] parsnip_0.1.5.9002       modeldata_0.1.0          infer_0.5.4             
[19] dials_0.0.9.9000         scales_1.1.1             broom_0.7.6             
[22] tidymodels_0.1.3.9000    forcats_0.5.1            stringr_1.4.0           
[25] dplyr_1.0.5              purrr_0.3.4              readr_1.4.0             
[28] tidyr_1.1.3              tibble_3.1.1             ggplot2_3.3.3           
[31] tidyverse_1.3.1          pacman_0.5.1             devtools_2.4.0          
[34] usethis_2.0.1           

loaded via a namespace (and not attached):
  [1] backports_1.2.1    plyr_1.8.6         splines_4.0.4      listenv_0.8.0     
  [5] TH.data_1.0-10     digest_0.6.27      fansi_0.4.2        checkmate_2.0.0   
  [9] magrittr_2.0.1     memoise_2.0.0      remotes_2.3.0      globals_0.14.0    
 [13] modelr_0.1.8       gower_0.2.2        matrixStats_0.58.0 sandwich_3.0-0    
 [17] hardhat_0.1.5      prettyunits_1.1.1  colorspace_2.0-0   rvest_1.0.0       
 [21] haven_2.4.0        xfun_0.22          callr_3.7.0        crayon_1.4.1      
 [25] jsonlite_1.7.2     libcoin_1.0-8      Exact_2.1          zoo_1.8-9         
 [29] survival_3.2-10    iterators_1.0.13   glue_1.4.2         gtable_0.3.0      
 [33] ipred_0.9-11       pkgbuild_1.2.0     mvtnorm_1.1-1      DBI_1.1.1         
 [37] Rcpp_1.0.6         GPfit_1.0-8        proxy_0.4-25       stats4_4.0.4      
 [41] lava_1.6.9         prodlim_2019.11.13 httr_1.4.2         modeltools_0.2-23 
 [45] ellipsis_0.3.2     farver_2.1.0       pkgconfig_2.0.3    multcompView_0.1-8
 [49] nnet_7.3-15        dbplyr_2.1.1       utf8_1.2.1         labeling_0.4.2    
 [53] tidyselect_1.1.1   rlang_0.4.11       DiceDesign_1.9     munsell_0.5.0     
 [57] cellranger_1.1.0   tools_4.0.4        cachem_1.0.4       cli_2.5.0         
 [61] generics_0.1.0     EMT_1.1            fastmap_1.1.0      processx_3.5.2    
 [65] knitr_1.33         fs_1.5.0           coin_1.4-1         rootSolve_1.8.2.1 
 [69] tictoc_1.0         xml2_1.3.2         compiler_4.0.4     rstudioapi_0.13   
 [73] curl_4.3.1         e1071_1.7-6        testthat_3.0.2     reprex_2.0.0      
 [77] lhs_1.1.1          DescTools_0.99.41  stringi_1.5.3      ps_1.6.0          
 [81] desc_1.3.0         lattice_0.20-41    Matrix_1.3-2       conflicted_1.0.4  
 [85] vctrs_0.3.8        pillar_1.6.0       lifecycle_1.0.0    furrr_0.2.2       
 [89] lmtest_0.9-38      data.table_1.14.0  lmom_2.8           R6_2.5.0          
 [93] parallelly_1.25.0  gld_2.6.2          sessioninfo_1.1.1  codetools_0.2-18  
 [97] boot_1.3-27        MASS_7.3-53.1      assertthat_0.2.1   pkgload_1.2.1     
[101] rprojroot_2.0.2    nortest_1.0-4      withr_2.4.2        multcomp_1.4-17   
[105] expm_0.999-6       parallel_4.0.4     hms_1.0.0          grid_4.0.4        
[109] rpart_4.1-15       timeDate_3043.102  class_7.3-18       pROC_1.17.0.1     
[113] lubridate_1.7.10


Labels: upkeep (maintenance, infrastructure, and similar)
