Aggregation of measure timetrain is made on test set #1286

MariaErdmann · 2016-10-13T13:02:04Z

Hi,
while debugging #1284 I stumbled over this and this might be a bug:
If one wants to assess only the training time of a learner and uses a resampling strategy like 'cv', one will most likely set predict = "train" in makeResampleDesc. But then the resampling won't work, because the aggregation function is set to test.mean. Shouldn't it be set to train.mean?

lrn = makeLearner("classif.rpart")
rdesc = makeResampleDesc("CV", predict = "train")
resample(lrn, binaryclass.task, rdesc, measures = list(timetrain))

#  Error in FUN(X[[i]], ...) : 
#  Aggregation 'test.mean' not compatible with resampling! You have to set arg 
# 'predict' to 'test' or 'both' in your resample object, instead it is 'train'! 

timetrain$aggr
# Aggregation function: test.mean

Taking a closer look at the measure, I saw that the default setting aggr = test.mean is passed.
If you agree that this is a bug, then I would fix it. Additionally, I think it would be great if the note also stated that the time is measured in seconds (right now it only says: "Time of fitting the model.")
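Until the default is settled, one possible workaround is to override the aggregation of the measure by hand with setAggregation. This is only a sketch: sonar.task is used in place of binaryclass.task (which is assumed to exist only in the mlr test setup), and iters = 3 and the name timetrain.train are arbitrary choices for illustration.

```r
library(mlr)

# sonar.task is a binary classification task shipped with mlr;
# it stands in for binaryclass.task here (assumption).
lrn = makeLearner("classif.rpart")
rdesc = makeResampleDesc("CV", iters = 3, predict = "train")

# Replace the default aggregation (test.mean) with train.mean so that
# it is compatible with predict = "train".
timetrain.train = setAggregation(timetrain, train.mean)

res = resample(lrn, sonar.task, rdesc,
  measures = list(timetrain.train), show.info = FALSE)

# Aggregated training time over the CV iterations
res$aggr
```

With this change the compatibility check should pass, since train.mean only needs the training predictions.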


larskotthoff · 2016-10-13T13:16:02Z

Agreed on both points. Could you make a PR for this please (including tests)?

MariaErdmann · 2016-10-13T13:27:33Z

Yes, I will! Shall the test be in test_base_measures.R?
The aggregation for timeboth is also done on the test sets. Is there something like aggr = both.mean?

larskotthoff · 2016-10-13T13:30:17Z

We don't have both.mean (and it wouldn't really make sense), so maybe put out an error or a warning in this case?

I would put the tests in test_base_resample.R.

MariaErdmann · 2016-10-24T08:13:59Z

We discussed this issue last week in our mlr meeting and @ja-thomas (ping :-)) raised some arguments against my idea of changing aggr to train.

ja-thomas · 2016-10-24T10:29:39Z

Just some thoughts on why I think setting it to train.mean is a bad idea.

I would argue that in most cases we use predict = "test" (that's why it is the default). Generally we will have more than just the one measure timetrain, because measuring the time without the performance doesn't make too much sense (except for a certain benchmark thesis ;) ).

So a "normal" thing to do is something like:

lrn = makeLearner("classif.rpart")
rdesc = makeResampleDesc("CV")
resample(lrn, binaryclass.task, rdesc, measures = list(acc, timetrain))

If we change the aggregation of timetrain to train.mean, we would have to set the aggregation by hand for that "standard" case, which imo should work out of the box.
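To make the trade-off concrete, here is a rough sketch of what the "standard" call would have to look like if the default were changed to train.mean: the aggregation of timetrain would need to be reset to test.mean explicitly. With the current default this override is a no-op, so the code runs as-is; sonar.task and iters = 3 are stand-ins chosen for illustration.

```r
library(mlr)

lrn = makeLearner("classif.rpart")
# Default predict = "test", as in the "standard" case
rdesc = makeResampleDesc("CV", iters = 3)

# Under a hypothetical train.mean default, timetrain would have to be
# set back to test.mean by hand to be usable next to acc.
res = resample(lrn, sonar.task, rdesc,
  measures = list(acc, setAggregation(timetrain, test.mean)),
  show.info = FALSE)

res$aggr
```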

berndbischl · 2016-11-21T10:07:52Z

Please don't do anything here. I am not sure what the best solution is and have assigned myself. I also don't see this as too urgent.

stale · 2019-12-19T00:28:07Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

larskotthoff added type-bug prio-medium labels Oct 14, 2016

MariaErdmann self-assigned this Oct 15, 2016

berndbischl assigned berndbischl and unassigned MariaErdmann Nov 21, 2016

Coorsaa added the project - mlrWorkshop2017 label Mar 9, 2017

stale bot added the stale label Dec 19, 2019

stale bot closed this as completed Dec 26, 2019
