Enable caching of filter values during tuning #2463
Conversation
set perc = 1 as default
remove nselect arg, delete getFilterValues
Could you add a test that checks that results are the same with and without caching please?
Some comments:
I tried with base R and it did not work. That's why I used the fs pkg.
Ok, fair enough.
Force-pushed from b568c34 to adeae6b
Merge branch 'master' into cache-filtering (conflicts: .travis.yml, appveyor.yml, tic.R)
@pat-s you might get angry at me saying this now:
I understand your point. IMO it is sufficient to only have caching for filtering in mlr and to implement it properly (pkg-wide) in mlr3 right from the start. I do not really see a big disadvantage in this PR. Yes, it is not pkg-wide, but does that point make it unmergeable? If you do not want to have this in mlr because of this point, I'll just leave it in the branch.
I disagree. Most of our problems regarding maintainability are caused by the long list of packages in Suggests, not the few packages in Imports.
As far as I've read the docs of
I think we can debate about a lot of packages in SUGGESTS (to possibly clean up), but isn't
Try `if (!dir.exists(path)) dir.create(path)`. If this does not work, the permissions are indeed set strangely on your machine.
Then the docs are wrong. You spawn multiple threads / processes which do the following concurrently: (1) create the cache directory if it does not exist, (2) compute the hash of the inputs, (3) look up the hash in the cache, (4) store the computed result in the cache.
So let's assume that the cache dir already existed. Then (1) is no problem. The calculated hash will be the same on all / many workers. As soon as we try to store the results in (4), all workers with the same hash will try to write to exactly the same file concurrently. Depending on the file system, different things will happen now.
You either need a thread-safe storage system (like a database), or you pre-compute the results before the parallelization.
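A third workaround (a sketch only, not part of this PR) is to make the cache write in step (4) atomic: each worker writes to a unique temporary file and then renames it into place. `file.rename()` is atomic on most local filesystems, so concurrent workers computing the same hash simply overwrite each other with identical data instead of corrupting the file. The function names here (`cache_put`, `cache_get`) are hypothetical:

``` r
# Sketch of an atomic cache write; assumes a local filesystem where
# file.rename() is atomic. NOT the implementation used in this PR.
cache_put <- function(cache_dir, hash, value) {
  dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
  final <- file.path(cache_dir, hash)
  # Write to a worker-unique temp file in the same directory first ...
  tmp <- tempfile(tmpdir = cache_dir)
  saveRDS(value, tmp)
  # ... then move it into place in one atomic step.
  file.rename(tmp, final)
  invisible(final)
}

cache_get <- function(cache_dir, hash) {
  final <- file.path(cache_dir, hash)
  if (file.exists(final)) readRDS(final) else NULL
}
```

The temp file must live in the same directory as the final file, because `file.rename()` is only guaranteed atomic within one filesystem.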
Caching would still be nice to have for sequential execution though.
Tried again. For whatever reason it works now. Strange. 🤔 😆
Looking at my tests, I indeed never checked it with parallelization; I only checked that all callers work. I tested it now (Case 2: 6 CPUs and 3 folds). I am not sure if the latter only works if there is a small delay between the writing attempts of both workers, or if they really wait on each other. Or or or... @mllg Happy to make additional checks on this or to make the implementation more robust. My parallelization knowledge ends at this point (= balancing multiple parallel write attempts). From a user perspective I do not see any drawback in using the caching method in parallel as well atm?
Merge branch 'master' into cache-filtering (conflicts: tic.R)
Merge branch 'master' into cache-filtering (conflicts: NEWS.md, R/FilterWrapper.R, R/filterFeatures.R, R/generateFilterValues.R, tic.R)
fixes mlr-org#1995

# Implementation

- caching via `memoise::memoise()`
- caching is done on the filesystem because memory caching is not supported in parallel processes (r-lib/memoise#77)
- argument `cache` can be passed to `makeFilterWrapper()`. It works with `resample()`, `tuneParams()`, `filterFeatures()`.
- `cache` accepts a logical vector (using default cache dirs determined by the [rappdirs](https://github.com/r-lib/rappdirs) pkg) or a chr vector specifying a custom directory
- new function `delete_cache()`: deletes ONLY the default cache dirs.
- new function `get_cache_dir()`: returns the default cache dirs.
- removal of argument `nselect` in `generateFilterValuesData()` and all filters because it's not used
- finally remove deprecated `getFilterValues()`

# Benchmark example

#### WITH cache

``` r
library(mlr)
#> Loading required package: ParamHelpers
library(microbenchmark)

lrn = makeFilterWrapper(learner = "regr.ksvm", fw.method = "chi.squared",
  cache = TRUE)

ps = makeParamSet(
  makeNumericParam("fw.perc", lower = 0, upper = 1),
  makeNumericParam("C", lower = -10, upper = 10, trafo = function(x) 2^x),
  makeNumericParam("sigma", lower = -10, upper = 10, trafo = function(x) 2^x)
)
rdesc = makeResampleDesc("CV", iters = 3)

y <- rnorm(100)
x <- matrix(rnorm(100 * 2000), 100, 2000)
dat <- data.frame(data.frame(y, as.data.frame(x)))
task = makeRegrTask(target = "y", data = dat)

set.seed(123)
microbenchmark(
  tuneParams(lrn, task = task, resampling = rdesc, par.set = ps,
    control = makeTuneControlRandom(maxit = 10), show.info = T),
  times = 1)
#> [Tune] Started tuning learner regr.ksvm.filtered for parameter set:
#>            Type len Def    Constr Req Tunable Trafo
#> fw.perc numeric   -   -    0 to 1   -    TRUE     -
#> C       numeric   -   - -10 to 10   -    TRUE     Y
#> sigma   numeric   -   - -10 to 10   -    TRUE     Y
#> With control class: TuneControlRandom
#> Imputation value: Inf
#> [Tune-x] 1: fw.perc=0.962; C=4.08; sigma=1.23
#> [Tune-y] 1: mse.test.mean=1.0250471; time: 0.5 min
#> [Tune-x] 2: fw.perc=0.403; C=195; sigma=0.152
#> [Tune-y] 2: mse.test.mean=1.0250471; time: 0.0 min
#> [Tune-x] 3: fw.perc=0.288; C=0.0104; sigma=0.0106
#> [Tune-y] 3: mse.test.mean=1.0222349; time: 0.0 min
#> [Tune-x] 4: fw.perc=0.482; C=0.0326; sigma=0.0196
#> [Tune-y] 4: mse.test.mean=1.0226891; time: 0.0 min
#> [Tune-x] 5: fw.perc=0.674; C=0.00189; sigma=16.2
#> [Tune-y] 5: mse.test.mean=1.0226888; time: 0.0 min
#> [Tune-x] 6: fw.perc=0.352; C=0.283; sigma=85.6
#> [Tune-y] 6: mse.test.mean=1.0273247; time: 0.0 min
#> [Tune-x] 7: fw.perc=0.919; C=0.0491; sigma=597
#> [Tune-y] 7: mse.test.mean=1.0242091; time: 0.0 min
#> [Tune-x] 8: fw.perc=0.728; C=13.2; sigma=0.00203
#> [Tune-y] 8: mse.test.mean=1.0249831; time: 0.0 min
#> [Tune-x] 9: fw.perc=0.395; C=0.736; sigma=2.31
#> [Tune-y] 9: mse.test.mean=1.0307186; time: 0.0 min
#> [Tune-x] 10: fw.perc=0.698; C=318; sigma=5.16
#> [Tune-y] 10: mse.test.mean=1.0250471; time: 0.0 min
#> [Tune] Result: fw.perc=0.288; C=0.0104; sigma=0.0106 : mse.test.mean=1.0222349
#> Unit: seconds
#>                                                                                                                          expr
#>  tuneParams(lrn, task = task, resampling = rdesc, par.set = ps, control = makeTuneControlRandom(maxit = 10), show.info = T)
#>       min       lq     mean   median       uq      max neval
#>  42.67322 42.67322 42.67322 42.67322 42.67322 42.67322     1
```

<sup>Created on 2018-11-02 by the [reprex package](https://reprex.tidyverse.org) (v0.2.1)</sup>

#### WITHOUT caching

``` r
library(mlr)
#> Loading required package: ParamHelpers
library(microbenchmark)

lrn = makeFilterWrapper(learner = "regr.ksvm", fw.method = "chi.squared",
  cache = FALSE)

ps = makeParamSet(
  makeNumericParam("fw.perc", lower = 0, upper = 1),
  makeNumericParam("C", lower = -10, upper = 10, trafo = function(x) 2^x),
  makeNumericParam("sigma", lower = -10, upper = 10, trafo = function(x) 2^x)
)
rdesc = makeResampleDesc("CV", iters = 3)

y <- rnorm(100)
x <- matrix(rnorm(100 * 2000), 100, 2000)
dat <- data.frame(data.frame(y, as.data.frame(x)))
task = makeRegrTask(target = "y", data = dat)

set.seed(123)
microbenchmark(
  tuneParams(lrn, task = task, resampling = rdesc, par.set = ps,
    control = makeTuneControlRandom(maxit = 10), show.info = T),
  times = 1)
#> [Tune] Started tuning learner regr.ksvm.filtered for parameter set:
#>            Type len Def    Constr Req Tunable Trafo
#> fw.perc numeric   -   -    0 to 1   -    TRUE     -
#> C       numeric   -   - -10 to 10   -    TRUE     Y
#> sigma   numeric   -   - -10 to 10   -    TRUE     Y
#> With control class: TuneControlRandom
#> Imputation value: Inf
#> [Tune-x] 1: fw.perc=0.962; C=4.08; sigma=1.23
#> [Tune-y] 1: mse.test.mean=0.7753929; time: 0.4 min
#> [Tune-x] 2: fw.perc=0.403; C=195; sigma=0.152
#> [Tune-y] 2: mse.test.mean=0.7753929; time: 0.3 min
#> [Tune-x] 3: fw.perc=0.288; C=0.0104; sigma=0.0106
#> [Tune-y] 3: mse.test.mean=0.7776375; time: 0.4 min
#> [Tune-x] 4: fw.perc=0.482; C=0.0326; sigma=0.0196
#> [Tune-y] 4: mse.test.mean=0.7772678; time: 0.4 min
#> [Tune-x] 5: fw.perc=0.674; C=0.00189; sigma=16.2
#> [Tune-y] 5: mse.test.mean=0.7778738; time: 0.4 min
#> [Tune-x] 6: fw.perc=0.352; C=0.283; sigma=85.6
#> [Tune-y] 6: mse.test.mean=0.7774229; time: 0.4 min
#> [Tune-x] 7: fw.perc=0.919; C=0.0491; sigma=597
#> [Tune-y] 7: mse.test.mean=0.7766936; time: 0.4 min
#> [Tune-x] 8: fw.perc=0.728; C=13.2; sigma=0.00203
#> [Tune-y] 8: mse.test.mean=0.7762520; time: 0.4 min
#> [Tune-x] 9: fw.perc=0.395; C=0.736; sigma=2.31
#> [Tune-y] 9: mse.test.mean=0.7763041; time: 0.4 min
#> [Tune-x] 10: fw.perc=0.698; C=318; sigma=5.16
#> [Tune-y] 10: mse.test.mean=0.7753929; time: 0.3 min
#> [Tune] Result: fw.perc=0.403; C=195; sigma=0.152 : mse.test.mean=0.7753929
#> Unit: seconds
#>                                                                                                                          expr
#>  tuneParams(lrn, task = task, resampling = rdesc, par.set = ps, control = makeTuneControlRandom(maxit = 10), show.info = T)
#>       min       lq     mean   median       uq      max neval
#>  224.1479 224.1479 224.1479 224.1479 224.1479 224.1479     1
```

<sup>Created on 2018-11-02 by the [reprex package](https://reprex.tidyverse.org) (v0.2.1)</sup>
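The caching mechanism described in the implementation notes boils down to wrapping an expensive computation in `memoise::memoise()` with a filesystem cache so that separate R processes can share results. A minimal standalone sketch, where `slow_filter()` is a hypothetical stand-in for a filter-value computation and not actual mlr code:

``` r
library(memoise)

# Hypothetical stand-in for an expensive filter-value computation.
slow_filter <- function(data) {
  Sys.sleep(0.5)        # simulate expensive work
  colSums(abs(data))    # some per-feature score
}

# Cache results on disk; an in-memory cache would be per-process only
# and therefore invisible to parallel workers (see r-lib/memoise#77).
cache_dir <- file.path(tempdir(), "filter-cache")
fc <- cache_filesystem(cache_dir)
cached_filter <- memoise(slow_filter, cache = fc)

x <- matrix(rnorm(20), 5, 4)
system.time(cached_filter(x))   # first call: computed and stored on disk
system.time(cached_filter(x))   # second call: read back from the cache
```

The cache key is a hash of the function and its arguments, which is why identical calls from different workers resolve to the same file on disk.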