Enable caching of filter values during tuning #2463

pat-s · 2018-10-23T16:51:21Z

fixes #1995

Implementation

caching via memoise::memoise()
caching is done on the filesystem because memory caching is not supported in parallel processes (cache_memory() vs cache_filesystem() r-lib/memoise#77)
argument cache can be passed to makeFilterWrapper(). It works with resample(), tuneParams(), filterFeatures().
cache accepts a logical vector (using default cache dirs determined by the rappdirs pkg) or a chr vector specifying a custom directory
new function: delete_cache(): Deletes ONLY the default cache dirs.
new function: get_cache_dir(): Returns the default cache dirs.
removal of argument nselect in generateFilterValuesData() and all filters because it is not used
finally remove deprecated getFilterValues()
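
A minimal sketch of the mechanism described in this list, assuming the memoise package is installed. `slow_filter()`, `run_demo()` and the cache path are hypothetical stand-ins and not code from this PR; `memoise::memoise()`, `memoise::cache_filesystem()` and `memoise::has_cache()` are the only memoise functions used.

```r
# Hypothetical stand-in for an expensive filter-value computation.
slow_filter <- function(x) {
  Sys.sleep(0.2)
  apply(x, 2, var)
}

run_demo <- function(cache_dir = file.path(tempdir(), "filter-cache-demo")) {
  if (!requireNamespace("memoise", quietly = TRUE)) {
    message("memoise is not installed; skipping the demo")
    return(invisible(NULL))
  }
  # Results are stored as files in cache_dir, so a later call (or another
  # R process pointed at the same directory) can reuse them.
  cached_filter <- memoise::memoise(
    slow_filter,
    cache = memoise::cache_filesystem(cache_dir)
  )
  x <- matrix(seq_len(20), nrow = 5)
  first  <- cached_filter(x)                    # computed, then written to disk
  hit    <- memoise::has_cache(cached_filter)(x) # is this input cached now?
  second <- cached_filter(x)                    # read back from disk
  list(same = identical(first, second), hit = hit)
}

res <- run_demo()
```

An in-memory cache would live inside a single R process, which is why the description above opts for the filesystem variant when several processes are involved.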

Benchmark example

WITH Cache

library(mlr)
#> Loading required package: ParamHelpers
library(microbenchmark)

lrn = makeFilterWrapper(learner = "regr.ksvm", fw.method = "chi.squared", 
                        cache = TRUE)
ps = makeParamSet(makeNumericParam("fw.perc", lower = 0, upper = 1),
                  makeNumericParam("C", lower = -10, upper = 10, 
                                   trafo = function(x) 2^x),
                  makeNumericParam("sigma", lower = -10, upper = 10,
                                   trafo = function(x) 2^x)
)
rdesc = makeResampleDesc("CV", iters = 3)

y <- rnorm(100)
x <- matrix(rnorm(100 * 2000), 100, 2000)
dat <- data.frame(data.frame(y, as.data.frame(x)))
task = makeRegrTask(target = "y", data = dat)


set.seed(123)
microbenchmark(tuneParams(lrn, task = task, resampling = rdesc, par.set = ps,
                          control = makeTuneControlRandom(maxit = 10), show.info = T),
               times = 1)
#> [Tune] Started tuning learner regr.ksvm.filtered for parameter set:
#>            Type len Def    Constr Req Tunable Trafo
#> fw.perc numeric   -   -    0 to 1   -    TRUE     -
#> C       numeric   -   - -10 to 10   -    TRUE     Y
#> sigma   numeric   -   - -10 to 10   -    TRUE     Y
#> With control class: TuneControlRandom
#> Imputation value: Inf
#> [Tune-x] 1: fw.perc=0.962; C=4.08; sigma=1.23
#> [Tune-y] 1: mse.test.mean=1.0250471; time: 0.5 min
#> [Tune-x] 2: fw.perc=0.403; C=195; sigma=0.152
#> [Tune-y] 2: mse.test.mean=1.0250471; time: 0.0 min
#> [Tune-x] 3: fw.perc=0.288; C=0.0104; sigma=0.0106
#> [Tune-y] 3: mse.test.mean=1.0222349; time: 0.0 min
#> [Tune-x] 4: fw.perc=0.482; C=0.0326; sigma=0.0196
#> [Tune-y] 4: mse.test.mean=1.0226891; time: 0.0 min
#> [Tune-x] 5: fw.perc=0.674; C=0.00189; sigma=16.2
#> [Tune-y] 5: mse.test.mean=1.0226888; time: 0.0 min
#> [Tune-x] 6: fw.perc=0.352; C=0.283; sigma=85.6
#> [Tune-y] 6: mse.test.mean=1.0273247; time: 0.0 min
#> [Tune-x] 7: fw.perc=0.919; C=0.0491; sigma=597
#> [Tune-y] 7: mse.test.mean=1.0242091; time: 0.0 min
#> [Tune-x] 8: fw.perc=0.728; C=13.2; sigma=0.00203
#> [Tune-y] 8: mse.test.mean=1.0249831; time: 0.0 min
#> [Tune-x] 9: fw.perc=0.395; C=0.736; sigma=2.31
#> [Tune-y] 9: mse.test.mean=1.0307186; time: 0.0 min
#> [Tune-x] 10: fw.perc=0.698; C=318; sigma=5.16
#> [Tune-y] 10: mse.test.mean=1.0250471; time: 0.0 min
#> [Tune] Result: fw.perc=0.288; C=0.0104; sigma=0.0106 : mse.test.mean=1.0222349
#> Unit: seconds
#>                                                                                                                             expr
#>  tuneParams(lrn, task = task, resampling = rdesc, par.set = ps,      control = makeTuneControlRandom(maxit = 10), show.info = T)
#>       min       lq     mean   median       uq      max neval
#>  42.67322 42.67322 42.67322 42.67322 42.67322 42.67322     1

Created on 2018-11-02 by the reprex package (v0.2.1)

WITHOUT caching

library(mlr)
#> Loading required package: ParamHelpers
library(microbenchmark)

lrn = makeFilterWrapper(learner = "regr.ksvm", fw.method = "chi.squared", 
                        cache = FALSE)
ps = makeParamSet(makeNumericParam("fw.perc", lower = 0, upper = 1),
                  makeNumericParam("C", lower = -10, upper = 10, 
                                   trafo = function(x) 2^x),
                  makeNumericParam("sigma", lower = -10, upper = 10,
                                   trafo = function(x) 2^x)
)
rdesc = makeResampleDesc("CV", iters = 3)

y <- rnorm(100)
x <- matrix(rnorm(100 * 2000), 100, 2000)
dat <- data.frame(data.frame(y, as.data.frame(x)))
task = makeRegrTask(target = "y", data = dat)


set.seed(123)
microbenchmark(tuneParams(lrn, task = task, resampling = rdesc, par.set = ps,
                          control = makeTuneControlRandom(maxit = 10), show.info = T),
               times = 1)
#> [Tune] Started tuning learner regr.ksvm.filtered for parameter set:
#>            Type len Def    Constr Req Tunable Trafo
#> fw.perc numeric   -   -    0 to 1   -    TRUE     -
#> C       numeric   -   - -10 to 10   -    TRUE     Y
#> sigma   numeric   -   - -10 to 10   -    TRUE     Y
#> With control class: TuneControlRandom
#> Imputation value: Inf
#> [Tune-x] 1: fw.perc=0.962; C=4.08; sigma=1.23
#> [Tune-y] 1: mse.test.mean=0.7753929; time: 0.4 min
#> [Tune-x] 2: fw.perc=0.403; C=195; sigma=0.152
#> [Tune-y] 2: mse.test.mean=0.7753929; time: 0.3 min
#> [Tune-x] 3: fw.perc=0.288; C=0.0104; sigma=0.0106
#> [Tune-y] 3: mse.test.mean=0.7776375; time: 0.4 min
#> [Tune-x] 4: fw.perc=0.482; C=0.0326; sigma=0.0196
#> [Tune-y] 4: mse.test.mean=0.7772678; time: 0.4 min
#> [Tune-x] 5: fw.perc=0.674; C=0.00189; sigma=16.2
#> [Tune-y] 5: mse.test.mean=0.7778738; time: 0.4 min
#> [Tune-x] 6: fw.perc=0.352; C=0.283; sigma=85.6
#> [Tune-y] 6: mse.test.mean=0.7774229; time: 0.4 min
#> [Tune-x] 7: fw.perc=0.919; C=0.0491; sigma=597
#> [Tune-y] 7: mse.test.mean=0.7766936; time: 0.4 min
#> [Tune-x] 8: fw.perc=0.728; C=13.2; sigma=0.00203
#> [Tune-y] 8: mse.test.mean=0.7762520; time: 0.4 min
#> [Tune-x] 9: fw.perc=0.395; C=0.736; sigma=2.31
#> [Tune-y] 9: mse.test.mean=0.7763041; time: 0.4 min
#> [Tune-x] 10: fw.perc=0.698; C=318; sigma=5.16
#> [Tune-y] 10: mse.test.mean=0.7753929; time: 0.3 min
#> [Tune] Result: fw.perc=0.403; C=195; sigma=0.152 : mse.test.mean=0.7753929
#> Unit: seconds
#>                                                                                                                             expr
#>  tuneParams(lrn, task = task, resampling = rdesc, par.set = ps,      control = makeTuneControlRandom(maxit = 10), show.info = T)
#>       min       lq     mean   median       uq      max neval
#>  224.1479 224.1479 224.1479 224.1479 224.1479 224.1479     1

Created on 2018-11-02 by the reprex package (v0.2.1)
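
The two timings can be put side by side with a little arithmetic. The numbers are copied from the microbenchmark output above; the variable names are only illustrative.

```r
with_cache    <- 42.67322   # seconds, cache = TRUE
without_cache <- 224.1479   # seconds, cache = FALSE

speedup <- without_cache / with_cache   # ratio of the two wall-clock times
saved   <- without_cache - with_cache   # absolute difference in seconds

round(speedup, 2)  # about 5.25
round(saved, 1)    # about 181.5
```

Both figures come from a single run each (times = 1), so they are indicative rather than precise. Note also that y and x are drawn before set.seed(123) in both scripts, so the two runs presumably tuned on different random data; that would explain why the mse.test.mean values differ between the runs while the sampled hyperparameters are identical.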


larskotthoff · 2018-10-23T17:17:55Z

Could you add a test that checks that results are the same with and without caching please?
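
One way such a test could look, as a base-R sketch rather than the test that was eventually added to the PR. `compute_scores()` and `cached_scores()` are hypothetical helpers, and the in-memory environment merely stands in for the on-disk cache. The seed is set before the data are drawn so that both code paths see identical input.

```r
# Hypothetical filter: absolute correlation of every feature with the target.
compute_scores <- function(x, y) {
  abs(apply(x, 2, cor, y = y))
}

# Toy cache keyed by a caller-supplied string (assumption: one key per fold).
cache_env <- new.env()
cached_scores <- function(key, x, y) {
  if (is.null(cache_env[[key]])) {
    cache_env[[key]] <- compute_scores(x, y)
  }
  cache_env[[key]]
}

set.seed(123)                              # seed first, then draw the data
y <- rnorm(50)
x <- matrix(rnorm(50 * 10), nrow = 50)

plain <- compute_scores(x, y)
miss  <- cached_scores("fold-1", x, y)     # fills the cache
hit   <- cached_scores("fold-1", x, y)     # served from the cache

# The cached and uncached paths must agree exactly.
stopifnot(identical(plain, miss), identical(plain, hit))
```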


mllg · 2018-11-03T19:39:08Z

Some comments:

I don't see that we need the extra dependency fs for mlr.
Although nselect is not used by any mlr filter currently, I had some custom filters which used it. I would prefer to keep the few extra lines.

pat-s · 2018-11-04T09:59:15Z

I don't see that we need the extra dependency fs for mlr.

I tried with base dir.create() and it did silently fail to create the directory (if a custom cache directory was given). Instead, using fs::dir_create() worked.

That's why I used the fs pkg.
I think that's a valid reason. Also it is in SUGGESTS and should not have that much impact then.

Although nselect is not used by any mlr filter currently, I had some custom filters which used it. I would prefer to keep the few extra lines.

Ok, fair enough.


berndbischl · 2018-11-04T19:07:38Z

@pat-s you might get angry at me saying this now:
but i am REALLY unsure whether we should "shoehorn" such a mechanism into mlr now.
caching just for filter values. that seems not that reasonable?
what are other thoughts here? i want this at least discussed before we do this, this is a major change to the base system
@mllg @larskotthoff

pat-s · 2018-11-06T09:49:48Z

but i am REALLY unsure whether we should "shoehorn" such a mechanism into mlr now.
caching just for filter values. that seems not that reasonable?

I understand your point.
However, filtering is the code part that will profit from caching most (since it is the only part that is recalled with the same settings every time).
Enabling caching for all mlr parts would involve a lot more work with much less impact than for the filter stuff.

Imo it is sufficient to only have caching for filtering in mlr and implement it properly (pkg wide) in mlr3 right from the start.

I do not really see a big disadvantage of this PR. Yes, it is not pkg wide, but does this point make it not mergeable?

But if you do not want to have this in mlr because of this point I'll just leave it in the branch.

mllg · 2018-11-06T13:02:04Z

I tried with base dir.create() and it did silently fail to create the directory (if a custom cache directory was given). Instead, using fs::dir_create() worked.

If dir.create() fails, the alarm bells should start to ring. I assume this is because of race conditions during parallelization. libuv used by fs might have some fallbacks (e.g., timeouts, retries, ...) to solve this. Nevertheless, you still will encounter race conditions for the files. Memoization will not work reliably in parallel.

That's why I used the fs pkg.
I think that's a valid reason. Also it is in SUGGESTS and should not have that much impact then.

I disagree. Most of our problems regarding maintainability are because of the long list of packages in Suggests, not the few packages in Imports.

pat-s · 2018-11-06T13:20:43Z

If dir.create() fails, the alarm bells should start to ring. I assume this is because of race conditions during parallelization. libuv used by fs might have some fallbacks (e.g., timeouts, retries, ...) to solve this. Nevertheless, you still will encounter race conditions for the files. Memoization will not work reliably in parallel.

dir.create() is just doing a one-time call creating the directory for caching. Even when executing it "manually" it failed to create the dir. I assume some permission problems here? But I did not debug further.
To be more explicit, dir.create(rappdirs::use_cache_dir()) failed while fs::dir_create(rappdirs::use_cache_dir()) worked on my machine.

Memoization will not work reliably in parallel.

As far as I've read the docs of memoise, parallelization is not a problem as long as caching is done on the filesystem. Parallelization and caching do not work when caching in memory as the processes do not share the same memory.

I disagree. Most of our problems regarding maintainability are because of the long list of packages in Suggests, not the few packages in Imports.

I think we can debate about a lot of packages in SUGGESTS (to possibly clean up) but isn't fs one of those that really do a better (safer) job than the base R implementation?
If you insist on not using fs, it will take quite some time to figure out why dir.create() failed silently (!) :/.

mllg · 2018-11-06T13:57:21Z

If dir.create() fails, the alarm bells should start to ring. I assume this is because of race conditions during parallelization. libuv used by fs might have some fallbacks (e.g., timeouts, retries, ...) to solve this. Nevertheless, you still will encounter race conditions for the files. Memoization will not work reliably in parallel.

dir.create() is just doing a one-time call creating the directory for caching. Even when executing it "manually" it failed to create the dir. I assume some permission problems here? But I did not debug further.
To be more explicit, dir.create(rappdirs::use_cache_dir()) failed while fs::dir_create(rappdirs::use_cache_dir()) worked on my machine.

Try

if (!dir.exists(path)) dir.create(path)

If this does not work, the permissions are indeed set strangely on your machine.
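
For reference, a small base-R sketch of one way dir.create() can fail without raising an error: it returns FALSE (with only a warning) when a parent directory is missing and recursive is left at FALSE. Whether this was the cause on the machine in question is not established in this thread; the paths and the `ensure_dir()` helper are made up for the demo.

```r
base   <- file.path(tempdir(), "cache-demo-parent")
nested <- file.path(base, "mlr", "filter-cache")
unlink(base, recursive = TRUE)   # start from a clean slate

# Parent directories are missing: returns FALSE and only warns.
ok_flat <- suppressWarnings(dir.create(nested))

# recursive = TRUE creates the whole chain of directories.
ok_recursive <- dir.create(nested, recursive = TRUE)

# Defensive variant: fail loudly instead of continuing without a cache dir.
ensure_dir <- function(path) {
  if (!dir.exists(path) && !dir.create(path, recursive = TRUE)) {
    stop("could not create cache directory: ", path)
  }
  invisible(path)
}
ensure_dir(nested)
```

Checking the return value, or stopping as `ensure_dir()` does, turns a silent failure into a visible one.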

Memoization will not work reliably in parallel.

As far as I've read the docs of memoise, parallelization is not a problem as long as caching is done on the filesystem. Parallelization and caching do not work when caching in memory as the processes do not share the same memory.

Then the docs are wrong. You spawn multiple threads / processes which do the following concurrently:

1. Check if cache dir exists and create it, if necessary
2. Hash the inputs to the function call
3. Run the filtering
4. Store result as cache_dir/[hash].rds
5. Next computation will load the stored file
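
The steps above can be sketched in base R as follows. This illustrates the sequence only and is not memoise's actual implementation: `hash_args()` and `cached_call()` are hypothetical helpers, and the hash is taken with `tools::md5sum()` over a serialized copy of the inputs.

```r
# Hash arbitrary R objects by serializing them to a temp file (base R only).
hash_args <- function(args) {
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeBin(serialize(args, connection = NULL), tmp)
  unname(tools::md5sum(tmp))
}

cached_call <- function(fun, args, cache_dir) {
  # (1) check if the cache dir exists and create it if necessary
  if (!dir.exists(cache_dir)) dir.create(cache_dir, recursive = TRUE)
  # (2) hash the inputs
  path <- file.path(cache_dir, paste0(hash_args(args), ".rds"))
  # (5) a later call with the same inputs loads the stored file
  if (file.exists(path)) return(readRDS(path))
  # (3) run the computation
  result <- do.call(fun, args)
  # (4) store the result as cache_dir/[hash].rds
  #     (nothing here protects against several writers at once)
  saveRDS(result, path)
  result
}

cache_dir <- file.path(tempdir(), "five-step-cache")
unlink(cache_dir, recursive = TRUE)
a <- cached_call(colMeans, list(x = matrix(1:6, nrow = 2)), cache_dir)  # miss
b <- cached_call(colMeans, list(x = matrix(1:6, nrow = 2)), cache_dir)  # hit
```

Run sequentially, the second call finds the file written by the first and the cache directory ends up with a single .rds file.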

So let's assume that the cache dir already existed. Then (1) is no problem. The calculated hash will be the same on all / many workers. As soon as we try to store the results in (4), all workers with the same hash will try to write to exactly the same file concurrently. Depending on the file system different things will happen now.

On a local file system with working file locking, the workers will write the file, one after another. Other workers in the meantime might try to load the file for (5), and now have to wait (which can lead to timeouts).
On a network file system without proper locking, you will corrupt your files, and system admins will get very angry because you overburden the file system and make the system unresponsive for everyone.

You either need a thread-safe storage system (like a database), or pre-compute the results before the parallelization.
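
A common mitigation for the concurrent write in step (4), offered here as an assumption and not something proposed in this thread, is to write to a unique temporary file in the same directory and then rename it into place. On POSIX file systems a rename within one directory is atomic, so a reader sees either no file or a complete one; network file systems may not give the same guarantee. `save_rds_atomic()` is a hypothetical helper.

```r
save_rds_atomic <- function(object, path) {
  # The temp file lives next to the target so the rename stays on one file system.
  tmp <- tempfile(pattern = "partial-", tmpdir = dirname(path), fileext = ".tmp")
  saveRDS(object, tmp)
  if (!file.rename(tmp, path)) {
    unlink(tmp)
    stop("could not move cache file into place: ", path)
  }
  invisible(path)
}

cache_dir <- file.path(tempdir(), "atomic-cache")
dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
target <- file.path(cache_dir, "abc123.rds")

# Two "workers" that computed the same hash: the second rename simply
# replaces the first file with identical content.
save_rds_atomic(c(a = 1, b = 2), target)
save_rds_atomic(c(a = 1, b = 2), target)
readRDS(target)
```

This does not stop several workers from redoing the same computation; it only keeps a reader from loading a half-written file. Pre-computing the values before the parallel section, as suggested above, avoids both issues.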

mllg · 2018-11-06T15:00:48Z

Caching would still be nice to have for sequential execution though.

pat-s · 2018-11-06T15:09:09Z

Try

if (!dir.exists(path)) dir.create(path)
If this does not work, the permissions are indeed set strangely on your machine.

Tried again. For whatever reason it works now. Strange. 🤔 😆

You either need a thread-safe storage system (like a database), or pre-compute the results before the parallelization.

Looking at my tests I indeed never checked it with parallelization. I only checked that all callers work (resample(), tuneParams(), filterFeatures()).
I see the point of the potential writing conflict.

I tested it now:
Case 1: 3 cpus and 3 folds
In this case we always get different hashes since the tasks are different, and there are no problems during writing.

Case 2: 6 cpus and 3 folds
When using cpus = 6 and folds = 3, two workers each are running on the exact same setup and would write the same hash file (potentially at the same time). This also works without trouble for me. In the end I have 3 hash files in the cache dir (I deleted the cache dir before the run).

I am not sure if the latter only works if there is a small delay between the writing attempts of both workers. Or if they really wait on each other. Or or or...
But in the end it worked for me in the same way as doing it sequentially (no additional files, no conflicts).

@mllg Happy to make additional checks on this or make the implementation more robust. My parallelization knowledge ends at this point (= balancing multiple parallel write attempts). From a user perspective I do not see any drawback in using the caching method in parallel as well atm?


pat-s added 6 commits October 23, 2018 16:03

remove nselect argument from filters

dc3db1e

add caching option via memoise

d72a70c

set perc = 1 as default

hide documentation

8caa588

remove nselect arg delete getFilterValues

indent and add cache argument

fa5eefa

add cache argument

edc8354

add cache argument

b2f6eaa

pat-s added type-enhancement pr-work in progress - not done project - base labels Oct 23, 2018

pat-s added 20 commits October 23, 2018 22:38

.learner$par.vals$cache -> .learner$cache

6fd95fb

export generateFilterValuesData

38e747c

adjust cache

b108d28

add test

c6a7810

populate cache through resample()

6a65de8

call memoise explicitly

84568eb

don't import memoise

b186bf1

memoise in suggests

9c0458f

document cache argument

17a891d

revert perc = 1

df8bcf7

set a default for cache in tuneParams()

e536b1e

remove nselect from tests

1da7765

Merge branch 'master' into cache-filtering

9cc7ca0

add NEWS

bd9261d

update NEWS.md

88d9b3a

Merge branch 'cache-filtering' of github.com:mlr-org/mlr into cache-f…

5f3eb9f

…iltering

mrmr.classis needs a feature_count arg -> set to getTaskNFeats(task)

ea8cb56

remove getFilterValues() test

8440ff6

apply NAMESPACE changes in pkgdown call

1dd19eb

Merge branch 'master' into cache-filtering

a97af43

restore arg nselect in filters

adeae6b

pat-s force-pushed the cache-filtering branch from b568c34 to adeae6b Compare November 4, 2018 10:11

merge master

2ecf2f8

Merge branch 'master' into cache-filtering # Conflicts: # .travis.yml # appveyor.yml # tic.R

pat-s mentioned this pull request Nov 4, 2018

Caching mlr-org/mlr3pipelines#16

Open

pat-s and others added 14 commits November 6, 2018 16:12

remove fs functions

5b80ecb

Merge branch 'master' into cache-filtering

df0f9f3

merge master

b21e70f

Merge branch 'master' into cache-filtering # Conflicts: # tic.R

update caching documentation

748a4f9

fix tut

161a9a1

Merge branch 'master' into cache-filtering

b434f30

Merge branch 'master' into cache-filtering

cd252e6

cleanups

319841a

changed default of cache disable tests to write to the user's home directory create paths with `recursive = TRUE`

added note on thread safety

9a86495

Merge branch 'cache-filtering' of github.com:mlr-org/mlr into cache-f…

9495c08

…iltering

re-add nselect arg doc

101e88c

Merge branch 'master' into cache-filtering

47e4003

merge master

632c995

Merge branch 'master' into cache-filtering # Conflicts: # NEWS.md # R/FilterWrapper.R # R/filterFeatures.R # R/generateFilterValues.R # tic.R

Deploy from Travis build 13371 [ci skip]

20d5ecc

Build URL: https://travis-ci.org/mlr-org/mlr/builds/487534341 Commit: 632c995

pat-s merged commit a18312b into master Feb 1, 2019

pat-s deleted the cache-filtering branch February 1, 2019 18:06
