
ParBayesianOptimization suddenly fails while logging epoch results #33

Closed
hanibalSC opened this issue May 6, 2021 · 8 comments

@hanibalSC

Hi,

Firstly, thanks for sharing this package.

I am currently using it to tune parameters for ML methods. While searching for an optimal cost parameter for the svmLinear2 model (contained in caret), the optimization stopped with a sudden error after successfully completing 15 iterations.


Here is the error traceback:

  Item 2 has 9 columns, inconsistent with item 1 which has 10 columns. To fill missing columns use fill=TRUE. 
7.
rbindlist(l, use.names, fill, idcol) 
6.
rbind(deparse.level, ...) 
5.
rbind(scoreSummary, data.table(Epoch = rep(Epoch, nrow(NewResults)), 
    Iteration = 1:nrow(NewResults) + nrow(scoreSummary), inBounds = rep(TRUE, 
        nrow(NewResults)), NewResults)) 
4.
addIterations(optObj, otherHalting = otherHalting, iters.n = iters.n, 
    iters.k = iters.k, parallel = parallel, plotProgress = plotProgress, 
    errorHandling = errorHandling, saveFile = saveFile, verbose = verbose, 
    ...) 
3.
ParBayesianOptimization::bayesOpt(FUN = ...

So somehow the data tables storing the summary information for each iteration suddenly differ in the number of columns. Is this a known bug in the ParBayesianOptimization package?

Should fill be set to TRUE in that rbind call for robustness? Or is something more fundamental going wrong?
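
For what it's worth, here is a minimal standalone data.table sketch of the failure mode the traceback describes (simplified column names, not the package's actual internals), and what fill = TRUE would change:

library(data.table)

# Simplified illustration: two summary tables where the second is missing one
# column, mimicking the 10-vs-9 column mismatch from the traceback.
scoreSummary = data.table(Epoch = 1, Iteration = 1, inBounds = TRUE,
                          C = 10, sigma = 5, Score = 0.95)
NewResults   = data.table(Epoch = 2, Iteration = 2, inBounds = TRUE,
                          C = 0, sigma = 46.6)   # Score column missing

# rbind() on data.tables dispatches to rbindlist(), which errors on the
# column-count mismatch, just like the message above:
try(rbind(scoreSummary, NewResults))

# fill = TRUE would pad the missing column with NA instead of erroring,
# but that only hides whatever made the scoring function drop a column:
rbind(scoreSummary, NewResults, fill = TRUE)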

Thanks!

P.S.: I posted this to https://github.com/cran/ParBayesianOptimization/issues/1, but maybe it is more appropriate here.

@hanibalSC
Author

The issue can also be found on https://stackoverflow.com/questions/67418224/parbayesianoptimization-suddenly-fails-while-logging-epoch-results.

I was able to reproduce the issue on the iris dataset.

library(data.table)
library(caret)
library(ParBayesianOptimization)
set.seed(1234)

bayes.opt.bounds = list()
bayes.opt.bounds[["svmRadial"]] = list(C = c(0,1000),
                                       sigma = c(0,500))

svmRadScore = function(...){
  grid = data.frame(...)
  mod = caret::train(Species~., data=iris, method = "svmRadial",
                     trControl = trainControl(method = "repeatedcv",
                                              number = 7, repeats = 5),
                     tuneGrid = grid)
  return(list(Score = caret::getTrainPerf(mod)[, "TrainAccuracy"], Pred = 0))
}

bayes.create.grid.par = function(bounds, n = 10){
  grid = data.table()
  params = names(bounds)
  grid[, c(params) := lapply(bounds, FUN = function(minMax){ 
    return(runif(n, minMax[1], minMax[2]))}
  )]
  return(grid)
}

prior.grid.rad = bayes.create.grid.par(bayes.opt.bounds[["svmRadial"]])
svmRadOpt = ParBayesianOptimization::bayesOpt(FUN = svmRadScore,
                                              bounds = bayes.opt.bounds[["svmRadial"]],
                                              initGrid = prior.grid.rad,
                                              iters.n = 100,
                                              acq = "ucb", kappa = 1, parallel = FALSE, plotProgress = TRUE)

Using this example, the error occurred on the 9th epoch.

@AnotherSamWilson
Owner

Hmmm, I have seen this error pop up here a few times, but I could never get my hands on a reproducible example. It usually happens when the scoring function returns something unexpected, like an error. Thanks for providing this example; I'll look into it.
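
In the meantime, one possible user-side workaround (just a sketch, not something built into the package) is to wrap the scoring function from your example in tryCatch so every iteration returns a list of the same shape even when caret::train fails:

# Hypothetical defensive wrapper around the scoring function from the example above.
svmRadScoreSafe = function(...){
  tryCatch({
    grid = data.frame(...)
    mod = caret::train(Species ~ ., data = iris, method = "svmRadial",
                       trControl = trainControl(method = "repeatedcv",
                                                number = 7, repeats = 5),
                       tuneGrid = grid)
    list(Score = caret::getTrainPerf(mod)[, "TrainAccuracy"], Pred = 0)
  }, error = function(e){
    # Sentinel score well below any real accuracy, so the optimizer steers away
    # from this region instead of crashing when the model cannot be fit.
    list(Score = 0, Pred = 0)
  })
}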

@AnotherSamWilson
Owner

Would you mind showing me your output for sessionInfo()?

@AnotherSamWilson
Owner

AnotherSamWilson commented May 10, 2021

I think I've figured it out. I'm not sure exactly where the error occurs in caret, but on the 9th iteration, the following parameters are passed to the scoring function:

   C    sigma
1: 0 46.63388

These result in an error. The following message is spit out:

Something is wrong; all the Accuracy metric values are missing:
    Accuracy       Kappa    
 Min.   : NA   Min.   : NA  
 1st Qu.: NA   1st Qu.: NA  
 Median : NA   Median : NA  
 Mean   :NaN   Mean   :NaN  
 3rd Qu.: NA   3rd Qu.: NA  
 Max.   : NA   Max.   : NA  
 NA's   :1     NA's   :1    

However, there is built-in error handling in ParBayesianOptimization that is supposed to catch this and provide more useful error information. It is not working in this case. I'll work on fixing that now.
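
For reference, the caret failure can be reproduced in isolation by passing that grid to caret::train directly; this is just a sketch reusing the setup from the reproducible example above:

# Re-running caret with the parameters from the failing 9th iteration.
library(caret)
bad.grid = data.frame(C = 0, sigma = 46.63388)
mod = caret::train(Species ~ ., data = iris, method = "svmRadial",
                   trControl = trainControl(method = "repeatedcv",
                                            number = 7, repeats = 5),
                   tuneGrid = bad.grid)
# Fails with "Something is wrong; all the Accuracy metric values are missing",
# which is the error the scoring function passes back to bayesOpt.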

@hanibalSC
Author

That's good to know. I would not have guessed to check the scoring function. I was just very confused as to why the dimensions of the summary items would suddenly change... I wonder why that specific example results in all the Accuracy metrics being missing, though (given there is no missing data).

I suppose the cost should be positive, not 0, given the optimization problem SVMs are trying to solve (as seen when glancing over this thread: https://stats.stackexchange.com/questions/185994/the-cost-parameter-for-support-vector-machines).
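
For example, bounds like the following (the 1e-3 floor is just an illustration I have not validated) would keep C = 0 out of the search space:

# Strictly positive search bounds; the exact lower limits are arbitrary.
bayes.opt.bounds[["svmRadial"]] = list(C     = c(1e-3, 1000),
                                       sigma = c(1e-3, 500))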

Thanks for clarifying!

I suppose a more useful error message will help future users spot their mistake ;-) Thanks again for making this awesome package available!

@hanibalSC
Author

For documentation purposes:

I believe setting the verbose option to 2 may print out the parameters for each iteration. Maybe this would have been helpful in identifying the issue.
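
Something like this, if I understand the option correctly:

# Same call as in the example above, with verbose = 2 added so the tried
# parameters are printed each iteration (per the note above).
svmRadOpt = ParBayesianOptimization::bayesOpt(FUN = svmRadScore,
                                              bounds = bayes.opt.bounds[["svmRadial"]],
                                              initGrid = prior.grid.rad,
                                              iters.n = 100,
                                              acq = "ucb", kappa = 1,
                                              parallel = FALSE, plotProgress = TRUE,
                                              verbose = 2)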

@hanibalSC
Author

For reference, I also summarized your findings and my interpretation for other users at https://stackoverflow.com/questions/67418224/parbayesianoptimization-suddenly-fails-while-logging-epoch-results/67477307#67477307, where I raised the issue before reporting it here. Let me know if you have a Stack Overflow account you would like me to reference as well :-)

@AnotherSamWilson
Owner

Moving to issue #34
