
[doParallel/doAzureParallel] task 356 failed - "arguments imply differing number of rows: 0, 75" #869

brnleehng opened this issue Apr 10, 2018 · 10 comments

Comments

@brnleehng

We are getting 'arguments imply differing number of rows' when trying to fit a multiclass gbm model. There's not much information on why the GBM model is failing at iteration 60. Could you help us debug and fix this?

Adding @kchaitanyabandi for visibility

Based on one of the task logs:

  • Fold01.Rep1: shrinkage=0.200, interaction.depth=11, n.minobsinnode= 6, n.trees=1300
    Iter TrainDeviance ValidDeviance StepSize Improve
    1 1.0986 -nan 0.2000 0.3625
    2 0.8479 -nan 0.2000 0.1920
    3 0.7156 -nan 0.2000 0.1080
    4 0.6399 -nan 0.2000 0.0768
    5 0.5824 -nan 0.2000 0.0516
    6 0.5444 -nan 0.2000 0.0396
    7 0.5144 -nan 0.2000 0.0341
    8 0.4894 -nan 0.2000 0.0293
    9 0.4680 -nan 0.2000 0.0228
    10 0.4495 -nan 0.2000 0.0156
    20 0.3630 -nan 0.2000 0.0025
    40 0.3120 -nan 0.2000 0.0015
    60 -nan -nan 0.2000 -nan
    80 -nan -nan 0.2000 -nan
    100 -nan -nan 0.2000 -nan
    120 -nan -nan 0.2000 -nan
    140 -nan -nan 0.2000 -nan
    160 -nan -nan 0.2000 -nan
    180 -nan -nan 0.2000 -nan
    200 -nan -nan 0.2000 -nan

It appears that the GBM model is failing, and when the results are collated we receive this error:

<simpleError in data.frame(pred = x, obs = y, stringsAsFactors = FALSE): arguments imply differing number of rows: 0, 2197>

Note: We are trying to run this with doAzureParallel, but it can also be reproduced with doParallel.

Adding @kchaitanyabandi for more information

Minimal, reproducible example:

Minimal dataset:

library(caret)
X = iris[,1:3]
Y = iris$Species

The error can be reproduced with the iris data set. However, @kchaitanyabandi might be able to share a sample data set with you.

Minimal, runnable code:

library(doParallel)
cluster <- makeCluster(detectCores() - 1)
 
registerDoParallel(cluster)
 
gbmGrid <- expand.grid(interaction.depth = 10:20,
                       n.trees = c(100, 150, 200, 250, 300, 350, 400, 450, 500, 1000, 1175, 1250, 1300),
                       shrinkage = c(0.025, .05, .1, 0.2, 0.3),
                       n.minobsinnode = c(5:10, 20, 30))
 
ctrl_gbm <- trainControl(method = "repeatedcv", number = 2, repeats = 1, summaryFunction = multiClassSummary,
                         classProbs = TRUE, verboseIter = FALSE, allowParallel = TRUE)
 
tuned_fit_gbm <- train(x = X,
                       y = Y,
                       method = "gbm",
                       verbose = TRUE,
                       metric = "logLoss",
                       trControl = ctrl_gbm,
                       tuneGrid = gbmGrid)

# Shut down the worker processes when finished
stopCluster(cluster)

Session Info:

>sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.2

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] doParallel_1.0.11 iterators_1.0.9   foreach_1.4.4     caret_6.0-79      ggplot2_2.2.1     lattice_0.20-35  

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.16       lubridate_1.7.3    tidyr_0.8.0        gtools_3.5.0       class_7.3-14      
 [6] assertthat_0.2.0   glmnet_2.0-13      digest_0.6.15      ipred_0.9-6        psych_1.7.8       
[11] R6_2.2.2           plyr_1.8.4         stats4_3.4.3       e1071_1.6-8        httr_1.3.1        
[16] pillar_1.2.1       gplots_3.0.1       rlang_0.2.0        lazyeval_0.2.1     curl_3.1          
[21] gdata_2.18.0       kernlab_0.9-25     rpart_4.1-11       Matrix_1.2-12      devtools_1.13.4   
[26] splines_3.4.3      CVST_0.2-1         ddalpha_1.3.1.1    gower_0.1.2        stringr_1.3.0     
[31] foreign_0.8-69     munsell_0.4.3      broom_0.4.3        compiler_3.4.3     pkgconfig_2.0.1   
[36] mnormt_1.5-5       dimRed_0.1.0       gbm_2.1.3          nnet_7.3-12        tidyselect_0.2.4  
[41] tibble_1.4.2       prodlim_1.6.1      DRR_0.0.3          codetools_0.2-15   RcppRoll_0.2.2    
[46] dplyr_0.7.4        withr_2.1.2        bitops_1.0-6       MASS_7.3-47        recipes_0.1.2     
[51] ModelMetrics_1.1.0 grid_3.4.3         nlme_3.1-131       gtable_0.2.0       git2r_0.20.0      
[56] magrittr_1.5       scales_0.5.0       KernSmooth_2.23-15 stringi_1.1.6      ROCR_1.0-7        
[61] reshape2_1.4.3     MLmetrics_1.1.1    bindrcpp_0.2       timeDate_3043.102  robustbase_0.92-8 
[66] lava_1.6           tools_3.4.3        glue_1.2.0         DEoptimR_1.0-8     purrr_0.2.4       
[71] sfsmisc_1.1-2      survival_2.41-3    yaml_2.1.17        colorspace_1.3-2   caTools_1.17.1    
[76] memoise_1.1.0      knitr_1.19         bindr_0.1.1 

Thanks!
Brian

@topepo
Owner

topepo commented Apr 17, 2018

I can't reproduce this without the X and y parts. Also, does this have anything to do with doAzureParallel?

@kchaitanyabandi

Hi,

The minimal dataset is iris in R.

X = iris[,1:3]
Y = iris$Species

as mentioned by Brian.

@topepo
Owner

topepo commented Apr 19, 2018

Missed that! 😦

@topepo
Owner

topepo commented Apr 19, 2018

If I run it without parallelism, I get this:

Error in { : 
  task 353 failed - "arguments imply differing number of rows: 0, 75"
In addition: There were 50 or more warnings (use warnings() to see the first 50)

The warnings are:

The dataset size is too small or subsampling rate is too large: nTrain*bag.fraction <= n.minobsinnode
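For context, here is a rough back-of-the-envelope check of that condition for the grid above. This is a sketch only; the exact inequality used inside gbm is an assumption and may differ from the text of the printed message:

# With 2-fold CV on iris, each resample trains on about half of the 150 rows,
# and gbm then subsamples bag.fraction (default 0.5) of those for each tree.
n_train      <- nrow(iris) / 2           # ~75 rows per resample
bag_fraction <- 0.5                      # gbm's default subsampling rate
n_bagged     <- n_train * bag_fraction   # ~37 rows actually seen per tree

# Assumed form of gbm's internal check (may differ by version): fitting is
# refused when n_bagged <= 2 * n.minobsinnode + 1, so the larger grid values fail.
minobs <- c(5:10, 20, 30)
data.frame(n.minobsinnode = minobs,
           too_small      = n_bagged <= 2 * minobs + 1)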

@kchaitanyabandi

Hi Max,

Could you please suggest a way to avoid this warning and the error? That is, can the subsampling rate be modified while using caret, or is it fixed?

Thanks
Krishna Chaitanya

@kchaitanyabandi

Hi @topepo ,

Also, it has become really important for me to understand two issues that come up when I perform my runs. They are:

  1. Error in names(resamples) <- gsub("^.", "", names(resamples)) :
    attempt to specify an attribute in a NULL

  2. Warning message:
    In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
    There were missing values in resampled performance measures.

Could you please explain why these messages come up in the first place, or when they come up while running any algorithm for classification/regression?

@see24

see24 commented May 10, 2018

I have had similar issues, but the error messages differ depending on the tuning grid and whether it runs in parallel or not.
Not in parallel, using the default tuning grid gives this error:
Error in { :
task 3 failed - "arguments imply differing number of rows: 0, 10265"
In addition: Warning messages:
1: predictions failed for Resample11: shrinkage=0.1, interaction.depth=2, n.minobsinnode=10, n.trees=150 Error in lvl[x] : invalid subscript type 'list'

2: predictions failed for Resample23: shrinkage=0.1, interaction.depth=2, n.minobsinnode=10, n.trees=150 Error in lvl[x] : invalid subscript type 'list'

3: predictions failed for Resample25: shrinkage=0.1, interaction.depth=3, n.minobsinnode=10, n.trees=150 Error in lvl[x] : invalid subscript type 'list'

In parallel (using the doParallel package), the default tuning grid gives this error:
Error in { :
task 2 failed - "arguments imply differing number of rows: 0, 10380"

Whereas in parallel with a custom tuning grid, it gives this error:
Something is wrong; all the Accuracy metric values are missing:
    Accuracy       Kappa
 Min.   : NA   Min.   : NA
 1st Qu.: NA   1st Qu.: NA
 Median : NA   Median : NA
 Mean   :NaN   Mean   :NaN
 3rd Qu.: NA   3rd Qu.: NA
 Max.   : NA   Max.   : NA
 NA's   :6     NA's   :6
Error: Stopping
In addition: Warning message:
In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
There were missing values in resampled performance measures.

Any insight you might have as to where this error is coming from or whether there is a way to avoid it would be much appreciated!
Thanks,
Sarah

@see24

see24 commented May 10, 2018

In case it helps, this is what the confusionMatrix results look like when I run the model with the gbm package directly:
Sensitivity          0.0017781 0.7354 0.28306 0.0000 0.000000 0.000000 0.00000 0.5009 0.00000
Specificity          0.9999367 0.4962 0.96555 1.0000 1.000000 1.000000 1.00000 0.6723 1.00000
Pos Pred Value       0.7142857 0.4052 0.29544    NaN      NaN      NaN     NaN 0.3699     NaN
Neg Pred Value       0.9184225 0.8007 0.96349 0.8819 0.994014 0.993143 0.92259 0.7782 0.97251
Prevalence           0.0817062 0.3182 0.04855 0.1181 0.005986 0.006857 0.07741 0.2775 0.02749
Detection Rate       0.0001453 0.2340 0.01374 0.0000 0.000000 0.000000 0.00000 0.1390 0.00000
Detection Prevalence 0.0002034 0.5775 0.04652 0.0000 0.000000 0.000000 0.00000 0.3758 0.00000
Balanced Accuracy    0.5008574 0.6158 0.62431 0.5000 0.500000 0.500000 0.50000 0.5866 0.50000

                     Class: TBL Class: TL
Sensitivity             0.00000   0.00000
Specificity             1.00000   1.00000
Pos Pred Value              NaN       NaN
Neg Pred Value          0.97277   0.98902
Prevalence              0.02723   0.01098
Detection Rate          0.00000   0.00000
Detection Prevalence    0.00000   0.00000
Balanced Accuracy       0.50000   0.50000

As you can see, the classes are unbalanced, but up-sampling does not seem to solve the problem.

@ruizcrp

ruizcrp commented Aug 27, 2018

Hi, I'm having the same issue. It seems to me that reducing the number of parallel cores reduces the likelihood that it crashes with that error. So maybe a memory issue?

@topepo
Owner

topepo commented Aug 27, 2018

@kchaitanyabandi

The issue is related to the warning message about nTrain*bag.fraction <= n.minobsinnode. The latter two are arguments to gbm. You may need to pass in bag.fraction to make it work for your data set size; it is not tuned, so gbm's default of bag.fraction = 0.50 is used unless you supply a different value. Keep in mind that nTrain is your training set size after resampling.
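A minimal sketch of passing a larger bag.fraction through train's ..., assuming caret forwards this extra argument to gbm's fitting function and reusing the X, Y, ctrl_gbm, and gbmGrid objects from the example above (the value 1.0 is only illustrative):

# Same call as the original example, but with gbm's row subsampling disabled
# so that nTrain * bag.fraction stays above the largest n.minobsinnode value.
tuned_fit_gbm <- train(x = X,
                       y = Y,
                       method = "gbm",
                       verbose = FALSE,
                       metric = "logLoss",
                       trControl = ctrl_gbm,
                       tuneGrid = gbmGrid,
                       bag.fraction = 1.0)  # forwarded to gbm; its default is 0.50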

There were missing values in resampled performance measures.

This is usually the result of a model making predictions that are constant across all of the samples. This results in var(pred) = 0 and R^2 can't be computed.
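For instance (illustrative only), R's correlation is undefined when one of the vectors is constant, which is what happens when a model predicts the same value for every sample:

pred <- rep(2.5, 10)  # degenerate model: identical prediction for every sample
obs  <- rnorm(10)
cor(pred, obs)^2      # NA (with a zero-standard-deviation warning), so no R^2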

Error in names(resamples) <- gsub("^.", "", names(resamples)) :

This is a general error when all of the models fail.

@see24 I don't think that they are related. It's not good form to tack on "similar issues"; if you are using the same example and get the same results then please comment. Otherwise open a different issue with a small reproducible example.

@ruizcrp I don't know which issue you are referring to. Please submit a new issue with a small reproducible example.
