
Ensembles parallel #1116

Merged
merged 26 commits into from
Mar 10, 2017

Conversation

zmjones
Contributor

@zmjones zmjones commented Aug 11, 2016

this is a clone of #615 in a branch in this repo so that @PhilippPro can help me fix the multilabel breakage

@zmjones
Contributor Author

zmjones commented Aug 11, 2016

thanks for the help @PhilippPro

@berndbischl
Sponsor Member

  1. please remove the stacking stuff. we have that completely refactored in another PR; this will create too many problems. store the diff, keep the branch, and document for now that it is not parallel.

  2. make sure that users understand, in the tutorial, what the current parallelization level now does. list the ensemble learners there.

  3. remove the partial dependence stuff, but do a new PR for this. it seems nearly finished.

  4. remove the ensemble prediction stuff. we will do general parallel prediction.

 - skipped stacking due to incoming refactoring
 - skipped prediction (left in branch) due to proposed
   parallelization of all prediction functions
@zmjones
Contributor Author

zmjones commented Aug 11, 2016

all that should be done now (minus tutorials and new pr)

@jakob-r
Sponsor Member

jakob-r commented Aug 11, 2016

there might still be some cleanup necessary. but i am out for the moment.

@zmjones
Contributor Author

zmjones commented Aug 12, 2016

@PhilippPro looks like the remaining errors are related to the multilabel stuff. help?

@PhilippPro
Member

ok. first have to look on my own stuff but will try it later.


@zmjones
Contributor Author

zmjones commented Aug 12, 2016

thank you!

@zmjones
Contributor Author

zmjones commented Aug 17, 2016

if someone else wants to finish this up i would be grateful. i don't know the multilabel stuff (which is what i think is broken).

@PhilippPro
Member

PhilippPro commented Aug 17, 2016

I think there are more things that do not work. As you can see in the current Travis build, the error also occurs in generatePartialDependence and MulticlassWrapper (and MultilabelDBRWrapper). I just solved the problem with the multilabel example in the previous commit.

@zmjones
Contributor Author

zmjones commented Aug 17, 2016

ah ok i will take a look at it again then. thanks

@PhilippPro
Member

I can look at the multilabel part if everything else is ok. It's just unpleasant to look at these errors, because you have to step deeply into the infrastructure of predictLearner, predictLearner2, etc. to find the problems, and it takes a long time.

@jakob-r
Sponsor Member

jakob-r commented Sep 6, 2016

I fixed many misnamed variables and other stuff. But I cannot figure out why the "MultilabelDBRWrapper" fails if it wasn't even touched!

@PhilippPro
Member

It depends on MultilabelBinaryRelevanceWrapper and this was touched.

@jakob-r
Sponsor Member

jakob-r commented Sep 6, 2016

Oh, I indeed overlooked this. I guess I know the issue then.

@jakob-r
Sponsor Member

jakob-r commented Feb 6, 2017

Will auto merge in 24 hours if no objections are raised.

@berndbischl
Sponsor Member

Will auto merge in 24 hours if no objections are raised.

no, you can raise concerns that we are wasting your time. shout at me.
but this i really don't want.

i will try to review now.

@berndbischl
Sponsor Member

the indentation is wrong in several places.

more importantly:
where are the docs for what is parallelized now?

@jakob-r
Sponsor Member

jakob-r commented Feb 7, 2017

True, documentation is totally missing. Any hint of where it would make sense to add it? Just add a note for each wrapper that it can be parallelized with the specific tag?

indentation... 🙄 I will take care of it

@jakob-r jakob-r self-assigned this Feb 7, 2017
@berndbischl
Sponsor Member

berndbischl commented Feb 7, 2017

True, documentation is totally missing. Any hint of where it would make sense to add it? Just add a note for each wrapper that it can be parallelized with the specific tag?

well, i think it needs to go here:
https://mlr-org.github.io/mlr-tutorial/release/html/parallelization/index.html

BUT:

I think we should also have this in R. makes it much easier to review stuff like this, whether it is "complete". and complete documentation about behavior of options should always be available in R.

but we don't want to copy-paste docs to 2 places. I would suggest:
we create a mini doc page ?mlrParallel, copy the tutorial LEVEL definition info there, and you add your stuff.
EDIT: and we remove the parallel levels table from the tutorial and simply link to it

for completeness we can add a sentence and a link to the wrapper that is affected by your change here.

what do you think?
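For the tutorial side, a minimal sketch of what the user-facing usage might look like. This is illustrative, not taken from the PR: the level name "mlr.ensemble" and the bagging example are assumptions about how the new level could be registered and exercised.

```r
library(parallelMap)
library(mlr)

# Restrict parallelization to the (assumed) ensemble level, so only the
# ensemble members are fitted in parallel, not e.g. resampling iterations.
parallelStartSocket(2, level = "mlr.ensemble")

lrn = makeBaggingWrapper(makeLearner("classif.rpart"), bw.iters = 10)
mod = train(lrn, iris.task)

parallelStop()
```

Passing `level` to `parallelStart*` is the standard parallelMap mechanism for scoping parallelism, so a `?mlrParallel` page listing the registered level names would make sketches like this discoverable from within R.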

@jakob-r
Sponsor Member

jakob-r commented Feb 24, 2017

So. Documentation is done, tests are running. Should be merged soon.

@jakob-r jakob-r removed their assignment Feb 24, 2017
@jakob-r
Sponsor Member

jakob-r commented Feb 27, 2017

@berndbischl ping


@larskotthoff larskotthoff left a comment


It would be good to extend the tests to check the predictions as well.

@jakob-r jakob-r merged commit a49bf1f into master Mar 10, 2017
@jakob-r
Sponsor Member

jakob-r commented Mar 10, 2017

We agreed that we do not need to check the result of the predictions, as the parallelMap code is also run in the normal tests of the ensembles.
Actually it can be argued that the parallel test can be seen as superfluous, as it just tests the parallelMap behavior.

@mllg mllg deleted the ensembles_parallel branch March 10, 2017 14:37
@berndbischl
Sponsor Member

Actually it can be argued that the parallel test can be seen as superfluous, as it just tests the parallelMap behavior.

not really. i detected many bugs with this test that were not parallelMap bugs but problems in mlr.
most problematic thing: exporting options to the slaves and so on.
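The export issue mentioned above can be made concrete with a small parallelMap sketch (illustrative, not from the PR): with a socket backend, objects that exist only on the master are not visible on the workers and must be shipped explicitly, which is exactly the kind of bug a dedicated parallel test catches.

```r
library(parallelMap)

parallelStartSocket(2)

threshold = 0.5               # exists only in the master session so far
parallelExport("threshold")   # without this, the workers cannot see it

res = parallelMap(function(x) x > threshold, 1:4)

parallelStop()
```

The same mechanism applies to mlr's global options: anything the ensemble code reads on a worker has to be exported to the slaves first, so a test that only passes sequentially can still fail in parallel.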

7 participants