caret training fails on mxnetAdam unless mxnet is warmed up with dummy code #887

Closed
sazari opened this issue May 16, 2018 · 5 comments

@sazari sazari commented May 16, 2018

Hi!

I tried to run the following fragment from the caret demo code for mxnetAdam:

timestamp <- Sys.time()
library(caret)
library(plyr)
library(recipes)
library(dplyr)
library(kernlab)

modelZ <- "mxnetAdam"

for (i in getModelInfo(modelZ)[[1]]$library)
  do.call("requireNamespace", list(package = i))

set.seed(2)
training <- twoClassSim(50, linearVars = 2)
testing <- twoClassSim(500, linearVars = 2)
trainX <- training[, -ncol(training)]
trainY <- training$Class

rec_cls <- recipe(Class ~ ., data = training) %>%
  step_center(all_predictors()) %>%
  step_scale(all_predictors())

cctrl1 <- trainControl(method = "cv", number = 3, returnResamp = "all",
                       classProbs = TRUE, summaryFunction = twoClassSummary)

maGrid <- expand.grid(layer1 = 3, layer2 = c(0, 5), layer3 = 0,
                      activation = "relu", learningrate = 1e-02,
                      beta1 = 0.9, beta2 = 0.9999, dropout = c(0.05, 0.20))

set.seed(849)
test_class_cv_model <- train(trainX, trainY, method = modelZ,
                             trControl = cctrl1, metric = "ROC",
                             preProc = c("center", "scale"), tuneGrid = maGrid)

print(test_class_cv_model)

which fails with the error:

Start training with 1 devices
Error in mxnet::mx.mlp(data = x, label = y, out_node = length(unique(y)), :
object 'mx.metric.accuracy' not found
In addition: There were 13 warnings (use warnings() to see them)
Timing stopped at: 0.2 0.17 0.03

To make it run, I had to insert the following dummy mxnet warm-up lines:

library(mxnet)
a <- mx.nd.ones(c(2,3), ctx = mx.cpu())
b <- mx.nd.ones(c(2,3), ctx = mx.gpu())
c1 <- a * 2 + 1
c2 <- b * 3 - 1
c1
c2

The final code is below:

timestamp <- Sys.time()
library(caret)
library(plyr)
library(recipes)
library(dplyr)
library(kernlab)

modelZ <- "mxnetAdam"

library(mxnet)
a <- mx.nd.ones(c(2,3), ctx = mx.cpu())
b <- mx.nd.ones(c(2,3), ctx = mx.gpu())
c1 <- a * 2 + 1
c2 <- b * 3 - 1
c1
c2

for (i in getModelInfo(modelZ)[[1]]$library)
  do.call("requireNamespace", list(package = i))

set.seed(2)
training <- twoClassSim(50, linearVars = 2)
testing <- twoClassSim(500, linearVars = 2)
trainX <- training[, -ncol(training)]
trainY <- training$Class

rec_cls <- recipe(Class ~ ., data = training) %>%
  step_center(all_predictors()) %>%
  step_scale(all_predictors())

cctrl1 <- trainControl(method = "cv", number = 3, returnResamp = "all",
                       classProbs = TRUE, summaryFunction = twoClassSummary)

maGrid <- expand.grid(layer1 = 3, layer2 = c(0, 5), layer3 = 0,
                      activation = "relu", learningrate = 1e-02,
                      beta1 = 0.9, beta2 = 0.9999, dropout = c(0.05, 0.20))

set.seed(849)
test_class_cv_model <- train(trainX, trainY, method = modelZ,
                             trControl = cctrl1, metric = "ROC",
                             preProc = c("center", "scale"), tuneGrid = maGrid)

print(test_class_cv_model)

Session info:

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] kernlab_0.9-25 recipes_0.1.2 broom_0.4.4 dplyr_0.7.4 plyr_1.8.4 caret_6.0-79 ggplot2_2.2.1
[8] lattice_0.20-35

loaded via a namespace (and not attached):
[1] nlme_3.1-137 lubridate_1.7.4 dimRed_0.1.0 RColorBrewer_1.1-2 tools_3.5.0 R6_2.2.2
[7] rpart_4.1-13 lazyeval_0.2.1 colorspace_1.3-2 nnet_7.3-12 withr_2.1.2 tidyselect_0.2.4
[13] gridExtra_2.3 mnormt_1.5-5 compiler_3.5.0 influenceR_0.1.0 scales_0.5.0 sfsmisc_1.1-2
[19] DEoptimR_1.0-8 psych_1.8.3.3 robustbase_0.93-0 readr_1.1.1 stringr_1.3.1 digest_0.6.15
[25] foreign_0.8-70 pkgconfig_2.0.1 htmltools_0.3.6 htmlwidgets_1.2 rlang_0.2.0 ddalpha_1.3.2
[31] rstudioapi_0.7 bindr_0.1.1 visNetwork_2.0.3 jsonlite_1.5 ModelMetrics_1.1.0 rgexf_0.15.3
[37] magrittr_1.5 Matrix_1.2-14 Rcpp_0.12.16 munsell_0.4.3 abind_1.4-5 viridis_0.5.1
[43] stringi_1.1.7 yaml_2.1.18 MASS_7.3-49 grid_3.5.0 parallel_3.5.0 splines_3.5.0
[49] hms_0.4.2 pillar_1.2.2 igraph_1.2.1 reshape2_1.4.3 codetools_0.2-15 stats4_3.5.0
[55] CVST_0.2-1 magic_1.5-8 XML_3.98-1.11 glue_1.2.0 downloader_0.4 foreach_1.4.4
[61] gtable_0.2.0 purrr_0.2.4 tidyr_0.8.0 assertthat_0.2.0 DRR_0.0.3 gower_0.1.2
[67] prodlim_2018.04.18 class_7.3-14 survival_2.41-3 viridisLite_0.3.0 geometry_0.3-6 timeDate_3043.102
[73] RcppRoll_0.2.2 mxnet_1.2.0 tibble_1.4.2 iterators_1.0.9 bindrcpp_0.2.2 Rook_1.1-1
[79] lava_1.6.1 DiagrammeR_1.0.0 brew_1.0-6 ipred_0.9-6

mxnet is installed for GPU following the new procedure for R 3.5.0: apache/incubator-mxnet#10791

Command used to install mxnet:

install.packages("https://s3.ca-central-1.amazonaws.com/jeremiedb/share/mxnet/GPU/mxnet.zip", repos = NULL)

The latest development version of caret is installed from GitHub.

Please fix caret if the bug is caret-related.

Contributor

@hadjipantelis hadjipantelis commented May 18, 2018

Hello @sazari, can you please try loading mxnet PRIOR to calling caret::train and then try again?

I suspect that mx.metric.accuracy is not loaded in the workspace when you first call mxnet. That's why your "warm-start" works. I will make a relevant PR soon. Thank you for reporting this.
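One plausible reading of the suspicion above: the reproduction script loads mxnet with requireNamespace(), which loads a package's namespace without attaching it, so an unqualified name such as mx.metric.accuracy is not on the search path until library(mxnet) is called. The sketch below illustrates only that general R behaviour. It uses the base package tools as a stand-in for mxnet and file_ext() as a stand-in for mx.metric.accuracy, so that it runs without mxnet installed; it is an assumption about the mechanism, not a description of what caret does internally.

```r
# Illustration only: 'tools' stands in for mxnet, and file_ext() stands in
# for mx.metric.accuracy. Neither substitution comes from the issue itself.
pkg <- "tools"

# Load the namespace without attaching it, as the loop in the report does.
loaded <- requireNamespace(pkg, quietly = TRUE)

# Check whether the package is on the search path and whether the
# unqualified name can be found from the current environment.
attached_before <- paste0("package:", pkg) %in% search()
visible_before  <- exists("file_ext", mode = "function")

# A namespace-qualified call works whether or not the package is attached.
qualified_result <- tools::file_ext("model.rds")

# Attaching the package puts its exports on the search path.
library(pkg, character.only = TRUE)
attached_after <- paste0("package:", pkg) %in% search()
visible_after  <- exists("file_ext", mode = "function")

print(c(attached_before = attached_before, visible_before = visible_before,
        attached_after = attached_after, visible_after = visible_after))
```

In a fresh R session the "before" flags would normally be FALSE and the "after" flags TRUE. If the same mechanism is at work here, either attaching the package first or using a qualified reference (pkg::name) in the calling code would avoid the "object not found" error.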

Author

@sazari sazari commented May 18, 2018

Thank you very much, @hadjipantelis, your suggestion worked: adding library(mxnet) immediately before the first call to caret::train fixes the issue. Please go ahead with the PR.

Contributor

@hadjipantelis hadjipantelis commented May 24, 2018

I made the relevant PR (#891); let me know if it works for you. I have seen that you raised #888 too, but that is more complicated. I strongly suspect you need to instruct MXNet to use a single core before using parallel.

Author

@sazari sazari commented May 24, 2018

I installed the latest dev version from GitHub and the issue is fixed! It works without the additional dummy code. Thank you very much! We can close #887 as resolved.

Owner

@topepo topepo commented May 25, 2018

Thanks

@topepo topepo closed this May 25, 2018