Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#747 mergeBenchmarkResults #914

Merged
merged 13 commits into from
Nov 30, 2016
Merged

#747 mergeBenchmarkResults #914

merged 13 commits into from
Nov 30, 2016

Conversation

florianfendt
Copy link
Contributor

Not totally ready to merge yet.
We need to agree on how we want to treat the following situation:
All bmrs have been optimized on the same measure but for some bmrs additional measures were calculated.
Right now all the measures would need to be equal and we stop with an error otherwise.

#' @export
mergeBenchmarkResults = function(...) {
set = list(...)
for (i in 1:length(set)) {
Copy link
Contributor

@giuseppec giuseppec Jul 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use seq_len(set)

@berndbischl
Copy link
Sponsor Member

@florianfendt
have you looked at the comments by @giuseppec ? status?

@florianfendt
Copy link
Contributor Author

@berndbischl
Yes, just did it. See my notes above.

@berndbischl
Copy link
Sponsor Member

@giuseppec

  1. the code basically has to be completely rewritten. too many problems.
    refactor and make nicer.

  2. these checks have be performed.

a) all task types equal
b) all measures must be equal, for each task
c) after the merge, the bmr must be a full cross product for tasks x learners
d) for each task, the resampling objects (over learners) must the same
e) the ... args should be a list "bmrs". better to code on.

@giuseppec
Copy link
Contributor

Maybe we should allow merging bmr objects, even if the full cross product for tasks x learners is incomplete and simply add NA values to the missing task-learner combinations.
Reason: If a single learner fails for a specific task, the batchmark function from @ja-thomas currently throws away all results from all other learners on this specific task. In that case it is maybe better to just print NA for the failed task-learner combination.
@zmjones do you know what happens with the BMR plots when the BMR object contains a task-learner combination with NA?

@larskotthoff
Copy link
Sponsor Member

Should we have a separate issue/PR for supporting NAs in BenchmarkResults? This is probably just testing that everything works as expected in the presence of NAs.

@larskotthoff
Copy link
Sponsor Member

Are there any other separate PRs required for this or can it go ahead now?

@giuseppec
Copy link
Contributor

No more separate PR here, I think we are not finished here but you can go ahead and give some general feedback for this PR. There will be several merge conflicts due to function renamings.

@florianfendt florianfendt self-assigned this Nov 7, 2016
@giuseppec
Copy link
Contributor

@florianfendt what happend here? It seems that you did something wrong while rebasing, I see 204 files changed? In the worst case we have to open up a new PR and throw this away.

@florianfendt
Copy link
Contributor Author

@giuseppec
Yes, I'm on it at the moment. Sth went wrong while merging. I'll figure it out I hope,
otherwise I'll open a new PR

@larskotthoff
Copy link
Sponsor Member

Is this ready to be reviewed/merged now?

@florianfendt
Copy link
Contributor Author

@giuseppec
Did you have a look? Are we finished here?

@giuseppec
Copy link
Contributor

I did some cleanup and extensions. @larskotthoff you could now review.

#' @return \code{\link{BenchmarkResult}}
#' @details Note that if you want to merge several \code{\link{BenchmarkResult}}
#' objects you must ensure that all possible learner and task combinations will be
#' contained in the return object.\cr
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not clear what this means. Do I have to postprocess the returned object to contain all learners and tasks?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.


# check for duplicated or missing task-learner combinations
all.combos = expand.grid(task.id = task.ids, learner.id = learner.ids)
all.combos = paste(all.combos$task.id, all.combos$learner.id, sep = " - ")
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use stringi functions for string operations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

existing.combos = rbindlist(lapply(bmrs, function(bmr) {
getBMRAggrPerformances(bmr, as.df = TRUE)[, c("task.id", "learner.id")]
}))
existing.combos = paste(existing.combos$task.id, existing.combos$learner.id, sep = " - ")
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stri_paste

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@larskotthoff
Copy link
Sponsor Member

Thanks for changing, merging.

@larskotthoff larskotthoff merged commit 0e2950b into master Nov 30, 2016
@larskotthoff larskotthoff deleted the mergeBenchmarkResults branch November 30, 2016 17:30
larskotthoff pushed a commit that referenced this pull request Nov 30, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants