Negative AUC on Linux #1105

MengZou-HUST opened this issue Dec 9, 2019 · 3 comments

MengZou-HUST commented Dec 9, 2019

I ran into a strange problem. The package works fine on Windows, but on Linux the reported AUC is negative. Do you have any suggestions?

> library(caret)
> set.seed(2969)
> imbal_train <- twoClassSim(1000, intercept = -20, linearVars = 20)
> ctrl <- trainControl(method = "repeatedcv", repeats = 5,
+                      classProbs = TRUE,
+                      summaryFunction = twoClassSummary)
> set.seed(5627)
> orig_fit <- train(Class ~ ., data = imbal_train,
+                   method = "treebag",
+                   nbagg = 50,
+                   metric = "ROC",
+                   trControl = ctrl)
> orig_fit
Bagged CART

1000 samples
  25 predictor
   2 classes: 'Class1', 'Class2'

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 5 times)
Summary of sample sizes: 901, 901, 899, 899, 900, 899, ...
Resampling results:

  ROC        Sens       Spec
  -2.346135  0.9869595  0.3533333

topepo commented Jan 2, 2020

Can you send the output of sessionInfo() so that we can see whether this is related to JackStat/ModelMetrics#29? Could you also retest with the current devel version of caret?
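
For readers who only need the two version numbers relevant to this question, a minimal sketch in base R follows. The helper name `pkg_version_or_na` is hypothetical and used only for illustration; it relies on `requireNamespace()` and `packageVersion()`, and returns `NA` for a package that is not installed.

```r
# Hypothetical helper: return the installed version of a package as a string,
# or NA if the package is not available in the current library paths.
pkg_version_or_na <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

# Report the versions of the two packages discussed in this thread.
versions <- vapply(c("caret", "ModelMetrics"), pkg_version_or_na, character(1))
print(versions)
```

The full `sessionInfo()` output is still the more useful thing to post, since it also shows the platform and the BLAS/LAPACK in use.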

jodie-c commented Feb 3, 2020

Hi, I'm having the same problem: with several different models, some of the computed ROC values are negative.

Output after collating the trained models using resamples()

summary.resamples(object = resultsVAS)

Models: GLM, NB 
Number of resamples: 500 

          Min.   1st Qu.    Median      Mean   3rd Qu.      Max. NA's
GLM -0.1435044 0.6318969 0.7264200 0.6969336 0.8065648 0.8203622    0
NB  -0.4446388 0.3300259 0.4174423 0.4146185 0.5038565 0.7103623    0

Output of sessionInfo()

R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server 7.5 (Maipo)

Matrix products: default
BLAS/LAPACK: /cm/shared/apps/intel/compilers_and_libraries/2018.3.222/linux/mkl/lib/intel64_lin/

 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] doParallel_1.0.15 iterators_1.0.10  foreach_1.4.4     plyr_1.8.4       
[5] caret_6.0-80      ggplot2_3.1.0     lattice_0.20-35  

loaded via a namespace (and not attached):
 [1] tidyselect_0.2.5   purrr_0.3.3        reshape2_1.4.3     kernlab_0.9-27    
 [5] splines_3.5.1      colorspace_1.3-2   stats4_3.5.1       geometry_0.1-7    
 [9] survival_2.42-3    prodlim_2018.04.18 rlang_0.4.2        ModelMetrics_1.2.2
[13] pillar_1.4.1       glue_1.3.1         withr_2.1.2        dimRed_0.2.2      
[17] lava_1.6.3         robustbase_0.93-3  stringr_1.3.1      timeDate_3043.102 
[21] munsell_0.5.0      pls_2.7-0          gtable_0.2.0       recipes_0.1.3     
[25] codetools_0.2-15   class_7.3-14       DEoptimR_1.0-8     broom_0.5.0       
[29] Rcpp_1.0.2         backports_1.1.2    scales_1.0.0       ipred_0.9-8       
[33] CVST_0.2-2         stringi_1.2.4      naivebayes_0.9.6   dplyr_0.8.3       
[37] RcppRoll_0.3.0     ddalpha_1.3.4      grid_3.5.1         tools_3.5.1       
[41] magrittr_1.5       lazyeval_0.2.1     tibble_2.1.2       crayon_1.3.4      
[45] tidyr_0.8.2        DRR_0.0.3          pkgconfig_2.0.2    MASS_7.3-51.1     
[49] Matrix_1.2-14      data.table_1.11.8  lubridate_1.7.4    gower_0.1.2       
[53] assertthat_0.2.0   R6_2.3.0           rpart_4.1-13       sfsmisc_1.1-2     
[57] nnet_7.3-12        nlme_3.1-137       compiler_3.5.1    

topepo commented Feb 3, 2020

This was fixed in caret version 6.0-84, but please test to verify.
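
One way to test this after upgrading is to compare the reported ROC against an independent calculation. The sketch below is not caret's or ModelMetrics' implementation; it is a plain rank-based (Mann-Whitney) AUC in base R. The function name `rank_auc` and its arguments are hypothetical. Counts are converted to doubles so that the product of the class sizes cannot overflow R's integer type.

```r
# Hypothetical cross-check: rank-based (Mann-Whitney) AUC in base R.
# scores: numeric vector, higher means "more likely positive"
# is_pos: logical vector of the same length, TRUE for the positive class
rank_auc <- function(scores, is_pos) {
  stopifnot(length(scores) == length(is_pos), is.logical(is_pos))
  n_pos <- as.numeric(sum(is_pos))   # doubles, so n_pos * n_neg cannot overflow
  n_neg <- as.numeric(sum(!is_pos))
  stopifnot(n_pos > 0, n_neg > 0)
  r <- rank(scores)                  # average ranks, so ties count as 1/2
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Small worked example: 3 of the 4 positive/negative pairs are ordered correctly.
example_auc <- rank_auc(c(0.1, 0.4, 0.35, 0.8), c(FALSE, FALSE, TRUE, TRUE))
print(example_auc)
```

Any value computed this way lies in [0, 1], so a negative ROC from the installed packages would indicate that the problem is still present. Assuming the fitted object stores per-resample metrics in `orig_fit$resample`, a quick range check is `all(orig_fit$resample$ROC >= 0 & orig_fit$resample$ROC <= 1)`.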

topepo closed this as completed Feb 3, 2020
