Invalid seeds for boot632 train control method #349
Comments
You hit the nail on the head. It does an extra model fit on the entire data set for that method. I'll fix that and update this thread. Thanks, Max
I checked this and it seems to work if the

library(caret)
data(iris)
TrainData <- iris[,1:4]
TrainClasses <- iris[,5]
num_resamples <- 5
num_models <- 5
# specify seeds
set.seed(329)
seeds <- sample(1000, num_resamples*num_models + 1)
seeds <- lapply(seq(from = 1, by = num_models, to = length(seeds)),
                function(i_seed) seeds[i_seed:(i_seed+num_models)])
# keep a single seed in the last element (used for the final model)
seeds[[length(seeds)]] <- seeds[[length(seeds)]][1]

Working

# this one expects a list of length 'num_resamples + 1' (call it L)
knnFit1 <- train(TrainData, TrainClasses,
                 method = "knn",
                 preProcess = c("center", "scale"),
                 tuneLength = num_models,
                 trControl = trainControl(method = "boot",
                                          number = num_resamples,
                                          seeds = seeds))

Not working

# this one fails, it expects a list of length L+1
knnFit2 <- train(TrainData, TrainClasses,
                 method = "knn",
                 preProcess = c("center", "scale"),
                 tuneLength = num_models,
                 trControl = trainControl(method = "boot632",
                                          number = num_resamples,
                                          seeds = seeds))

If the second case should indeed receive a longer
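Going by the comments on the two calls above, a seeds list for "boot632" would need one more element than the list for "boot". The following is a minimal sketch in base R, not a confirmed fix: `make_seeds` is a hypothetical helper (it is not part of caret), and the extra element for "boot632" is an assumption based on the additional full-data fit reported in this thread.

```r
# Hypothetical helper, not part of caret: builds a list in the shape that
# trainControl(seeds = ...) is described as expecting in this thread.
#   n_fits   - number of elements that need one seed per candidate model
#   n_models - number of candidate models (tuning parameter combinations)
make_seeds <- function(n_fits, n_models, seed = 329) {
  set.seed(seed)
  # one integer vector of length n_models for each fit
  seeds <- lapply(seq_len(n_fits), function(i) sample.int(1000, n_models))
  # last element: a single seed, used for the final model
  seeds[[n_fits + 1]] <- sample.int(1000, 1)
  seeds
}

num_resamples <- 5
num_models <- 5

# method = "boot": one element per resample plus the final single seed
seeds_boot <- make_seeds(num_resamples, num_models)

# method = "boot632": ASSUMPTION that one extra element is needed for the
# additional fit on the entire data set
seeds_boot632 <- make_seeds(num_resamples + 1, num_models)

length(seeds_boot)     # num_resamples + 1
length(seeds_boot632)  # num_resamples + 2
```

If that assumption holds for the installed caret version, `seeds_boot632` could be passed as `seeds` in the failing `trainControl(method = "boot632", ...)` call above.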
Simple example:
Results in:
Cause:
Within the `nominalTrainWorkflow` function the nested `foreach` loops have arguments `iter = seq(along = resampleIndex)` and `parm = 1:nrow(info$loop)`. However, when `ctrl$method` is `"boot632"` the `resampleIndex` variable changes and gets the `AllData` argument appended at the beginning. This causes the `foreach` loops to do one extra iteration.

The last element of the `seeds` list is a single numeric value, but if `length(parm) > 1` then the following code:

...

attempts to access a non-existing value, therefore doing `set.seed(NA)`, which gives the error mentioned above.

Solution:

I'm not sure about this but I think that the `seeds` list must also get `M` values in the last element when `ctrl$method == "boot632"`, where `M` is the number of models as specified in the documentation.
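The mechanism described under "Cause" can be imitated in base R without caret. This is only a toy sketch: the variable names follow the description above, but the indexing expression `seeds[[iter]][parm]` is an assumption about how the seeds are looked up, not a quotation of caret's internal code.

```r
# Toy imitation in base R; caret is not loaded and its internals are only mimicked.
num_resamples <- 3
num_models <- 2

# Seeds laid out as for plain "boot": one vector of num_models seeds per
# resample, followed by a single seed for the final model
seeds <- c(lapply(seq_len(num_resamples),
                  function(i) sample.int(1000, num_models)),
           list(sample.int(1000, 1)))

# "boot632" is described above as prepending an AllData entry, so the
# resample index has one more element than the number of resamples
resampleIndex <- c(list(AllData = 1:10),
                   replicate(num_resamples, sample(10, replace = TRUE),
                             simplify = FALSE))

# Collect the seed that each (iter, parm) pair would receive
used <- unlist(lapply(seq(along = resampleIndex), function(iter) {
  sapply(seq_len(num_models), function(parm) seeds[[iter]][parm])
}))

# The extra iteration reaches the single-seed element and indexes past it
anyNA(used)

# set.seed() rejects NA; capture the error message instead of stopping
msg <- tryCatch({ set.seed(NA); "no error" },
                error = function(e) conditionMessage(e))
msg
```

In this sketch the last iteration reads the second position of a length-one vector, which yields `NA`. Either giving that element `M` values or adding one more `M`-length element ahead of it would avoid the `NA` here; which of the two caret should expect is the open question in this thread.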