Enable using fully predefined indices in resampling #2412
Use fully predefined indices via the
4 classes -> 4 folds.
Outer: 4 classes -> 4 folds
Only compatible with 1 repetition, hence only with "CV".
To use predefined indices in "RepCV", use the already existing "Blocking" implementation. It differs in that the number of classes does not also define the number of folds, so more combinations than just the number of folds can be generated.
The user needs to initialize this special resampling by setting
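To illustrate the difference, here is a minimal base R sketch of the general idea behind grouping whole levels into fewer folds. It is not mlr's actual "Blocking" code; the variable names and the choice of 3 folds for 6 levels are assumptions made for illustration only.

```r
set.seed(1)

# Hypothetical blocking factor: 6 levels, 2 observations each
blocking <- factor(rep(c("a", "b", "c", "d", "e", "f"), each = 2))
iters <- 3  # assumed fold count, smaller than the number of levels

# Randomly assign whole levels (not single observations) to folds
level_to_fold <- sample(rep(seq_len(iters), length.out = nlevels(blocking)))
names(level_to_fold) <- levels(blocking)

# Every observation inherits the fold of its level
fold_of_obs <- unname(level_to_fold[as.character(blocking)])
test_sets <- split(seq_along(blocking), fold_of_obs)
```

Because levels are assigned to folds at random, a different seed yields a different grouping, which is why repetitions are meaningful in this setting.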
As said, the approach uses the factor variable supplied in the task via the
In a nested setting, a possible workflow would look as follows:
inner = makeResampleDesc("CV", iters = 4, fixed = TRUE)
outer = makeResampleDesc("CV", iters = 5, fixed = TRUE)
tune_wrapper = makeTuneWrapper(lrn, resampling = inner, par.set = ps, control = ctrl, show.info = FALSE)
p = resample(tune_wrapper, ct, outer, show.info = FALSE, extract = getTuneResult)
So rather than doing a random sampling, we use the predefined indices specified in "blocking".
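The idea of one fold per factor level can be sketched in base R as follows. This is an illustration under assumed data (a hypothetical factor with 4 levels and 3 observations each), not the code added in this PR.

```r
# Hypothetical blocking factor: 4 levels, 3 observations each
blocking <- factor(rep(1:4, each = 3))
n <- length(blocking)

# Each level defines exactly one test fold; nothing is sampled
test_sets <- split(seq_len(n), blocking)

# The matching training set is every observation outside that level
train_sets <- lapply(test_sets, function(idx) setdiff(seq_len(n), idx))
```

Since nothing is drawn at random, repeating the procedure would reproduce the same folds, which matches the restriction to a single repetition.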
The function also handles a slight misspecification by issuing a warning:
inner = makeResampleDesc("CV", iters = 5, fixed = TRUE)
outer = makeResampleDesc("CV", iters = 5, fixed = TRUE)
"iters (5) is not equal to length of blocking levels (4)!"
If inner > outer, an error will be thrown.
Logically, the inner fold count must always be one less than the outer count, because one level is held out as the outer test set.
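A small base R sketch of why the counts differ by one, again using an assumed factor with 4 levels rather than anything from the PR itself:

```r
# Hypothetical blocking factor: 4 levels, 3 observations each
blocking <- factor(rep(c("a", "b", "c", "d"), each = 3))

# One outer iteration holds out all observations of a single level
outer_test <- which(blocking == "a")

# The inner resampling only sees the remaining observations and levels
inner_blocking <- droplevels(blocking[-outer_test])

n_outer <- nlevels(blocking)        # levels available to the outer loop
n_inner <- nlevels(inner_blocking)  # levels left for the inner loop
```

The held-out level never appears in the outer training data, so the inner loop has one level, and therefore one fold, fewer.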
Users can also combine using fixed indices in the outer and random sampling in the inner:
inner = makeResampleDesc("CV", iters = 5)
outer = makeResampleDesc("CV", iters = 5, fixed = TRUE)
tune_wrapper = makeTuneWrapper(lrn, resampling = inner, par.set = ps, control = ctrl, show.info = FALSE)
expect_success(resample(tune_wrapper, ct, outer, show.info = FALSE, extract = getTuneResult))
To explicitly avoid clashes between "fixed" and "blocking" when a "blocking" factor was given in the task, I had to add a little helper arg.
Just for clarification: This PR changes nothing in the existing "blocking" implementation besides the need to explicitly trigger it when using "CV".
What exactly does
Since the number of iterations is fixed by the number of levels, why not make this the automatic choice instead of asking the user to specify it again?
I didn't do this at first because I had problems distinguishing between inner and outer, and it was easier to get help from the user here (inner should always be one less than outer).
(Another reason was that for a long time I did not use "fixed" and so I had no flag telling me whether I am in a "blocking", "fixed" or "normal" setting.)
Vignette update added. Please review using the netlify preview: https://deploy-preview-2412--nervous-hopper-4136be.netlify.com/articles/resample.html
@larskotthoff updated help page and tutorial - please take a look.
Remember that you can use the netlify preview of the docs (https://deploy-preview-2412--nervous-hopper-4136be.netlify.com/articles/tutorial/resample.html) once the pkgdown files have been deployed by Travis.