[MRG+2] MultiOutputClassifier #6127
Conversation
Thanks a lot for the PR. Could you remove the main in the test file, as all tests are run by nose?
    def test_multi_target_init_with_random_forest():
        ''' test if multi_target initilizes correctly
Could you change the docstrings of all test* functions to comments please?
@rvraghav93, I just had a chat with @hugobowne about doing these changes and fixing the unit test failures. Might submit a PR soon if that's ok...?
You mean a PR to @hugobowne's branch, right? Raising another PR to scikit-learn is not necessary :)
Also a few points - the file MultiOneRest is empty? This PR seems to have only the tests. And I think the preferred filename would be
Please see if this approach could be followed instead?
I suppose the file MultiOneRest.py was committed as an executable and hence shows in the diff as empty. Had faced this issue before :)
Ah, that's new to me!
Wacky issue! @rvraghav93, I think the preferred filename should be multi_one_vs_rest.py or something along these lines, as this PR deals specifically with 'one-versus-all classification models' -- in particular, it doesn't deal with regressors at all. Moreover, it should probably be generalized to deal with all classification models (this will be an easy extension). @rvraghav93, we had completed it before you suggested your approach. After fixing all necessary issues, I suggest we i) generalize to deal with all classification models and ii) leave regressors for a different PR. Thoughts?
Yes, like @mblondel suggests here, we should have
Indeed. A regression meta-estimator could be done in a separate PR. And thanks for your patience!
@@ -0,0 +1 @@
"""MultiOneVsRestClassifier===========================This module includes several classes that extend base estimators to multi-target estimators. Most sklearn estimators use a response matrix to train a target functionwith a single output variable. I.e. typical estimators use the training set X to estimate a target function f(X) that predicts a single Y. The purpose of this class is to extend estimatorsto be able to estimate a series of target functions (f1,f2,f3...,fn)that are trained on a single X predictor matrix to predict a seriesof reponses (y1,y2,y3...,yn)."""#Author: Hugo Bowne-Anderson <hugobowne@gmail.com>#Author: Chris Rivera <chris.richard.rivera@gmail.com>#Author: Michael Williamson#License: BSD 3 clauseimport arrayimport numpy as npimport warningsimport scipy.sparse as spfrom sklearn.base import BaseEstimator, ClassifierMixinfrom sklearn.base import clone, is_classifierfrom sklearn.base import MetaEstimatorMixin, is_regressorfrom sklearn.preprocessing import LabelBinarizerfrom sklearn.metrics.pairwise import euclidean_distancesfrom sklearn.utils import check_random_statefrom sklearn.utils.validation import _num_samplesfrom sklearn.utils.validation import check_consistent_lengthfrom sklearn.utils.validation import check_is_fittedfrom sklearn.externals.joblib import Parallelfrom sklearn.externals.joblib import delayedfrom sklearn.multiclass import OneVsRestClassifierclass MultiOneVsRestClassifier(): """ Converts any classifer estimator into a multi-target classifier estimator. This class fits and predicts a series of one-versus-all models to response matrix Y, which has n_samples and p_target variables, on the predictor Matrix X with n_samples and m_feature variables. This allows for multiple target variable classifications. For each target variable (column in Y), a separate OneVsRestClassifier is fit. See the base OneVsRestClassifier Class in sklearn.multiclass for more details. 
Parameters ---------- estimator : estimator object An estimator object implementing `fit` & `predict_proba`. n_jobs : int, optional, default: 1 The number of jobs to use for the computation. If -1 all CPUs are used. If 1 is given, no parallel computing code is used at all, which is useful for debugging. For n_jobs below -1, (n_cpus + 1 + n_jobs) are used. Thus for n_jobs = -2, all CPUs but one are used. Note that parallel processing only occurs if there is multiple classes within each target variable. It does each target variable in y in series. Attributes __________ estimator: Sklearn estimator: The base estimator used to constructe the model. """ def __init__(self, estimator=None, n_jobs=1): self.estimator = estimator self.n_jobs = n_jobs def fit(self, X, y): """ Fit the model to data. Creates a seperate model for each Response column. Parameters ---------- X : (sparse) array-like, shape = [n_samples, n_features] Data. y : (sparse) array-like, shape = [n_samples, p_targets] Multi-class targets. An indicator matrix turns on multilabel classification. Returns ------- self """ # check to see that the data is numeric # check to see that the X and y have the same number of rows. # Calculate the number of classifiers self._num_y = y.shape[1] ## create a dictionary to hold the estimators. self.estimators_ ={} for i in range(self._num_y): # init a new classifer for each and fit it. estimator = clone(self.estimator) #make a fresh clone ovr = OneVsRestClassifier(estimator,self.n_jobs) self.estimators_[i] = ovr.fit(X,y[:, i]) return self def predict(self, X): """Predict multi-class multiple target variable using a model trained for each target variable. Parameters ---------- X : (sparse) array-like, shape = [n_samples, n_features] Data. Returns ------- y : dict of [sparse array-like], shape = {predictors: n_samples} or {predictors: [n_samples, n_classes], n_predictors}. Predicted multi-class targets across multiple predictors. 
Note: entirely separate models are generated for each predictor. """ # check to see if the fit has been performed check_is_fitted(self, 'estimators_') results = {} for label, model_ in self.estimators_.iteritems(): results[label] = model_.predict( X) return(results) def predict_proba(self, X): """Probability estimates. This returns prediction probabilites for each class for each label in the form of a dictionary. Parameters ---------- X : array-like, shape = [n_samples, n_features] Returns ------- prob_dict (dict) A dictionary containing n_label sparse arrays with shape = [n_samples, n_classes]. Each row in the array contains the the probability of the sample for each class in the model, where classes are ordered as they are in `self.classes_`. """ # check to see whether the fit has occured. check_is_fitted(self, 'estimators_') results ={} for label, model_ in self.estimators_.iteritems(): results[label] = model_.predict_proba(X) return(results) def score(self, X, Y): """"Returns the mean accuracy on the given test data and labels. Parameters ---------- X : array-like, shape = [n_samples, n_features] Y : (sparse) array-like, shape = [n_samples, p_targets] Returns ------- scores (np.array) Array of p_target floats of the mean accuracy of each estimator_.predict wrt. y. 
""" check_is_fitted(self, 'estimators_') # Score the results for each function results =[] for i in range(self._num_y): estimator = self.estimators_[i] results.append(estimator.score(X,Y[:,i])) return results def get_params(self): '''returns the parameters of the estimator.''' return self.estimator.get_params() def set_params(self, params): """sets the params for the estimator.""" self.estimator.set_params(params) def __repr__(self): return 'MultiOneVsRestClassifier( %s )' %self.estimator.__repr__() @property def multilabel_(self): """returns a vector of whether each classifer is a multilabel classifier in tuple for """ return [(label, model_.multilabel_) for label, model_ in self.estimators_.iteritems()] @property def classes_(self): return [(label, model_.label_binarizer_) for label, model_ in self.estimators_.iteritems()] |
I assume you did not mean to commit an empty file :P
@MechCoder i definitely didn't ! the diff is empty for some wacky reason but the file is not! can you confirm this?
Try:
- chmod -x the file
- Add the file again and commit
- git config core.fileMode false
- Add the file again and commit, if needed (I think you won't need to)
- Squash all the commits
- Force push
Or you could just copy over the code to multioutput.py, remove this file and force push, because that is what you are ultimately going to do anyway.
Indeed :P
my thinking exactly @MechCoder
@hugobowne I've changed the title to WIP. Let us know once you finish up the
@hugobowne Thanks a lot for letting me know about this. I did pull the code from your branch. I would be happy to help in any way. One doubt - even if I add a commit to this code, to continue working on this PR, I need to push to this branch, for which I won't have the access rights. Any help is appreciated! Thanks.
I have just done simple modifications and refactored the code a little. It is at this branch. Please do look at it, though I haven't added any new functionality. Thanks.
Thanks for your patience, all. A quick note on workflow: perhaps @MechCoder or @rvraghav93 could suggest best practice given the following: I won't have much time to contribute in the upcoming weeks and @maniteja123 is going to work on the MultiOutputClassifier -- in this case, is it i) best for him to issue PRs to my branch, or ii) should I give him collaborator access to my branch so that I don't need to merge etc. (in which case this may all move more quickly)? Is there a common practice for this?
The common way to do this is as a PR to your branch, as you had suggested. But if you don't mind giving him access to your repository, you can go ahead, as it would indeed speed things up :)
I agree
Hi all. I have just now merged @james-nichols' PR into my branch. I then tried to squash commits but think I may have completely bungled it -- I used this as a guide: http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html. Thoughts? @rvraghav93 @MechCoder @maniteja123: I have given you collaborator rights to my sklearn fork, so please feel free to work on the branch -- I would suggest that you shoot me an email when working on it and I will do the same. Collaborator on code @MrChristophRivera can also field questions when I'm unable to.
(force-pushed from ba0db84 to bae109a)
Ok, I just attempted to squash again. Let me know how it's looking. Apologies for rookie errors!
    predicts a single Y. The purpose of this class is to extend estimators
    to be able to estimate a series of target functions (f1,f2,f3...,fn)
    that are trained on a single X predictor matrix to predict a series
    of reponses (y1,y2,y3...,yn).
this paragraph looks great but it should belong to an example and not here, I think
Left it here for now. Will make a note of it.
    forest_.fit(X, y[:, i])
    assert_equal(list(forest_.predict(X)), list(predictions[:, i]))
    assert_almost_equal(list(forest_.predict_proba(X)),
                        list(predict_proba[:, :, i]), decimal=1)
decimal=1 is small; can't you go further? Can you get the exact result with an appropriate random_state?
I have changed it to assert_array_equal now and the test succeeds.
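The reviewer's suggestion rests on a general property: a RandomForestClassifier with a fixed random_state is fully deterministic, so exact assertions can replace loose tolerances like decimal=1. A hedged sketch with made-up toy data (not the PR's test):

```python
import numpy as np
from numpy.testing import assert_array_equal
from sklearn.ensemble import RandomForestClassifier

# Made-up toy data; the label depends only on the first feature
rng = np.random.RandomState(0)
X = rng.rand(30, 3)
y = (X[:, 0] > 0.5).astype(int)

# Two identically seeded forests produce bit-for-bit identical outputs,
# so assert_array_equal holds with no decimal tolerance at all
a = RandomForestClassifier(n_estimators=10, random_state=42).fit(X, y)
b = RandomForestClassifier(n_estimators=10, random_state=42).fit(X, y)
assert_array_equal(a.predict(X), b.predict(X))
assert_array_equal(a.predict_proba(X), b.predict_proba(X))
```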
@TomDLT I have done all the changes. I also did go through the whole code for any errors in documentation or tests. Hopefully, I have addressed all the comments. |
@@ -45,18 +45,18 @@ def __init__(self, estimator, n_jobs=1):

        def fit(self, X, y, sample_weight=None):
            """ Fit the model to data.
-           Fits a seperate model for each output variable.
+           Fit a seperate model for each output variable.
separate
Can you squash into 2 commits?
Yeah, I should do something like
Yes, at the end you should have only two commits: hugobowne's work and yours.
Just a doubt: when I rebase with
Just squash your last 12 into one: git rebase -i HEAD~12
I have one local commit also, so it should be 13, right?
(force-pushed from 2ea42ca to 2c8dd4e)
Yes.
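The squash workflow discussed above (git rebase -i HEAD~N, then squash all but the first commit) can be demonstrated end to end in a throwaway repository. This is an illustrative sketch, not the PR's actual history; it uses GIT_SEQUENCE_EDITOR to rewrite the rebase todo list non-interactively, and assumes GNU sed.

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email demo@example.com
git config user.name demo

# Make four throwaway commits c1..c4
for i in 1 2 3 4; do echo "$i" > f; git add f; git commit -qm "c$i"; done

# Squash the last 3 commits into one: keep the first 'pick' in the todo
# list, turn the remaining two into 'squash'. GIT_EDITOR=true accepts the
# default combined commit message.
GIT_SEQUENCE_EDITOR="sed -i '2,\$s/^pick/squash/'" GIT_EDITOR=true \
    git rebase -i HEAD~3

git rev-list --count HEAD   # prints 2: the base commit plus the squashed one
```

In the interactive version you would simply edit the pick/squash lines by hand, which is what the gitready guide linked earlier walks through.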
@TomDLT Please merge if you are happy! Thanks!
This looks really good to me! Just one detail:
Actually not for META_ESTIMATORS, but I am not sure if we should add it in common tests or in test_multioutput.py |
@maniteja123 Could you just add a test to check for NotFittedError?
(force-pushed from 2c8dd4e to 29ee54a)
@MechCoder I added a simple test for NotFittedError when predict, predict_proba and score are called.
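The behaviour those tests pin down can be sketched against the estimator this PR eventually became, MultiOutputClassifier in current scikit-learn. A hedged illustration with made-up data, not the PR's actual test code:

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

clf = MultiOutputClassifier(LogisticRegression())

# Calling predict before fit must raise NotFittedError
raised = False
try:
    clf.predict(np.zeros((2, 3)))
except NotFittedError:
    raised = True
assert raised
```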
Merging with master. Thanks for your perseverance! 🍷 🍷
Thanks @maniteja123 and @hugobowne
We forgot to update whatsnew.rst for this. Could you do that?
Yeah, sure. Shall I push it to this branch itself?
And thank you so much @MechCoder @rvraghav93 @TomDLT and everyone else for all the help and for bearing patiently with my doubts, and sincere thanks to @hugobowne for letting me work on this. I am again sorry for taking so much of your time in reviewing this multiple times.
Yes, please push it here. I'll cherry-pick it.
@MechCoder Sorry for the delay. This is the commit
TODO for this PR