Skip to content

Conversation

hugobowne
Copy link
Contributor

TODO for this PR

  • Check if meta of meta works (a few combinations)
  • To check the shapes of outputs of all methods
  • Check if meta of meta works (a few combinations)
  • Compare with multiple runs of single o/p estimators
  • Check if the various errors are raised properly (when the n_outputs during predict time is > (or !=?) during fit time).
  • Check if not fitted error is raised? (or could we make the general tests take care of that? I think there is a test for meta estimators.)
  • See if this passes all the meta estimators' tests
  • Make the tests pass ;)
  • Add documentation

@raghavrv
Copy link
Member

raghavrv commented Jan 7, 2016

Thanks a lot for the PR, could you remove the main in test file, as all tests are run by nose

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you change the docstrings of all test* functions to comments please?

@james-nichols
Copy link

@rvraghav93, I just had a chat with @hugobowne about doing these changes and fixing the unit test failures. Might submit a PR soon if that's ok...?

@raghavrv
Copy link
Member

raghavrv commented Jan 7, 2016

You mean a PR to @hugobowne's branch right? Raising another PR to scikit learn is not necessary :)

@raghavrv
Copy link
Member

raghavrv commented Jan 7, 2016

Also a few points -

The file MultiOneRest is empty? This PR seems to have only the tests.

And I think the prefered filename would be multioutput.py

Please see if this approach could be followed instead?

@maniteja123
Copy link
Contributor

I suppose the file MultiOneRest.py was committed as a executable and hence showing in the diff as empty. Had faced this issue before :)

@raghavrv
Copy link
Member

raghavrv commented Jan 7, 2016

Ah that's new to me!

@hugobowne
Copy link
Contributor Author

wacky issue!

@rvraghav93, i think the preferred filename should be multi_one_vs_rest.py or something along these lines, as this PR deals specifically with '[one-versus-all] classification models' --

In particular, it doesn't deal with regressors at all. Moreover, it should probably be generalized to deal with all classification models (this will be an easy extension). @rvraghav93, we had completed it before you suggested your approach.

after fixing all necessary issues, i suggest we i) generalize to deal with all classification models & leave regressors for a different PR.

thoughts?

@raghavrv
Copy link
Member

raghavrv commented Jan 7, 2016

Moreover, it should probably be generalized to deal with all classification models (this will be an easy extension).

Yes, like @mblondel suggests here, we should have MutiOutputClassifier and MultiOutputRegressor put inside the multioutput.py. The MultiOutput* meta-estimators should support any estimator including but not limited to the OneVsRestClassifier, which by itself is a meta-estimator!

leave regressors for a different PR.

Indeed. regression meta-estimator could be done in a separate PR. And partial_fit support in a subsequent one after that.

Thanks for your patience!

@MechCoder MechCoder changed the title added MultiOneVsRest classifier & testing suite (Meta-estimators for multi-output learning #5824) [MRG] added MultiOneVsRest classifier & testing suite (Meta-estimators for multi-output learning #5824) Jan 7, 2016
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you did not mean to commit an empty file :P

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MechCoder i definitely didn't ! the diff is empty for some wacky reason but the file is not! can you confirm this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try

  1. chmod -x
  2. Add the file again and commit
  3. git config core.fileMode false
  4. Add the file again and commit, if needed (I think you won't need to)
  5. Squash all the commits
  6. Force push

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or you could just copy over the code to multioutput.py, remove this file and force push because that is what you are ultimately going to do anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed :P

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my thinking exactly @MechCoder

@MechCoder MechCoder changed the title [MRG] added MultiOneVsRest classifier & testing suite (Meta-estimators for multi-output learning #5824) [WIP] added MultiOneVsRest classifier & testing suite (Meta-estimators for multi-output learning #5824) Jan 7, 2016
@MechCoder
Copy link
Member

@hugobowne I've changed the title to WIP. Let us know once you finish up the MultiOutputClassifier (and if you need any help).

@maniteja123
Copy link
Contributor

@hugobowne Thanks a lot for letting me know about this. I did pull the code from your branch. I would be happy if I can help in any way.

One doubt - Even if I add a commit to this code, to continue working on this PR, I need to push them to this branch, for which I won't have the access rights. Any help is appreciated !

Thanks.

@maniteja123
Copy link
Contributor

I have just done simple modifications and refactored the code a little. It is at this branch. Please do look at it though I haven't added any new functionality. Thanks.

@hugobowne
Copy link
Contributor Author

thanks for patience, all. this is a quick note:
@maniteja123 thanks for jumping in & getting involved. if you're going to work on the code now, it is best to currently make a PR to my branch, I think. @james-nichols has just done so & I hope to merge, commit here & squash when i get a moment early this coming week.

but for workflow here, generally, perhaps @MechCoder or @rvraghav93 could suggest best practice given the following: I won't have much time to contribute in the upcoming weeks & @maniteja123 is going to work on the MultiOutputClassifier -- in this case, is it i) best for him to issue PRs to my branch OR ii) should I give him collaborator access to my branch so that I don't need to merge etc... (in which case this all may move more quickly).

is there a common practice for this?

@raghavrv
Copy link
Member

The common way to do this is as a PR to your branch as you had suggested. But if you don't mind giving him access to your repository, you can go ahead as it would indeed speed things up :)

@MechCoder
Copy link
Member

I agree

On Sat, Jan 9, 2016 at 10:57 PM, Raghav R V notifications@github.com
wrote:

The common way to do this is as a PR to your branch as you had suggested.
But if you don't mind giving him access to your repository, you can go
ahead as it would indeed speed things up :)


Reply to this email directly or view it on GitHub
#6127 (comment)
.

Manoj,
http://github.com/MechCoder

@hugobowne
Copy link
Contributor Author

hi all. I have just now merged @james-nichols PR into my branch. I then tried to squash commits but think I may have completely bungled it -- i used this as a guide: http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html

thoughts? @rvraghav93 @MechCoder

@maniteja123 : I have given you collaborator rights to my sklearn fork so please feel free to work on the branch -- I would suggest that you shoot me an email when working on it & i will do the same.

collaborator on code @MrChristophRivera can also field questions when I'm unable to.

@hugobowne hugobowne force-pushed the MultiOneVsRestClassifier branch from ba0db84 to bae109a Compare January 12, 2016 00:09
@hugobowne
Copy link
Contributor Author

ok I just attempted to squash again. let me know how it's looking. apologies for rookie errors!

@MechCoder MechCoder changed the title [WIP] added MultiOneVsRest classifier & testing suite (Meta-estimators for multi-output learning #5824) [MRG] added MultiOneVsRest classifier & testing suite (Meta-estimators for multi-output learning #5824) Jan 13, 2016
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this paragraph looks great but it should belong to an example and not here, I think

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left it for here as of now. Will make a point.

forest_.fit(X, y[:, i])
assert_equal(list(forest_.predict(X)), list(predictions[:, i]))
assert_almost_equal(list(forest_.predict_proba(X)),
list(predict_proba[:, :, i]), decimal=1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decimal=1 is small, can't you go further?
Can you have the exact result with appropriate random_state ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed it to assert_array_equal now and the test succeeds.

@maniteja123
Copy link
Contributor

@TomDLT I have done all the changes. I also did go through the whole code for any errors in documentation or tests. Hopefully, I have addressed all the comments.

def fit(self, X, y, sample_weight=None):
""" Fit the model to data.
Fits a seperate model for each output variable.
Fit a seperate model for each output variable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separate

@TomDLT
Copy link
Member

TomDLT commented Mar 31, 2016

Can you squash into 2 commits?

@maniteja123
Copy link
Contributor

Yeah, I should do something like git rebase -i HEAD~14 after the new commit right ?

@TomDLT
Copy link
Member

TomDLT commented Mar 31, 2016

Yes, at the end you should have only two commits, hugobown's work and yours

@maniteja123
Copy link
Contributor

Just a doubt. when I rebase with git rebase -i HEAD~14, I get all the commits in the PR. Shall I pick the first two (one by hugobowne and other is my first one) and squash the remaining ? Sorry to bother you but I have never worked in collaboration and don't want to do any mistake here.

@MechCoder
Copy link
Member

Just squash you last 12 into one.

git rebase -i HEAD~12

@maniteja123
Copy link
Contributor

I have one local commit also, so it should be 13, right ?

@maniteja123 maniteja123 force-pushed the MultiOneVsRestClassifier branch from 2ea42ca to 2c8dd4e Compare March 31, 2016 13:17
@TomDLT
Copy link
Member

TomDLT commented Mar 31, 2016

Shall I pick the first two (one by hugobowne and other is my first one) and squash the remaining

yes

@MechCoder
Copy link
Member

@TomDLT Please merge if you are happy! Thanks!

@TomDLT
Copy link
Member

TomDLT commented Mar 31, 2016

This looks really good to me!


Just one detail:

NotFittedError ( handled by common tests ?)

Actually not for META_ESTIMATORS, but I am not sure if we should add it in common tests or in test_multioutput.py

@TomDLT TomDLT changed the title [MRG] MultiOutputClassifier [MRG+2] MultiOutputClassifier Mar 31, 2016
@MechCoder
Copy link
Member

@maniteja123 Could you just add a test to check for NonFittedError?

@maniteja123 maniteja123 force-pushed the MultiOneVsRestClassifier branch from 2c8dd4e to 29ee54a Compare April 1, 2016 01:49
@maniteja123
Copy link
Contributor

@MechCoder I added a simple test for NotFittedError when predict, predict_proba and score are called. Do I need to add it for all the other MetaEstimators too ? Oh sorry just woke up. I just added it to this meta estimator.

@MechCoder
Copy link
Member

Merging with master. Thanks for your perseverance! 🍷 🍷

@MechCoder MechCoder merged commit 3078d7d into scikit-learn:master Apr 1, 2016
@raghavrv
Copy link
Member

raghavrv commented Apr 1, 2016

Thanks @maniteja123 and @hugobowne

@MechCoder
Copy link
Member

We forgot to update whatsnew.rst for this. Could you do that?

@maniteja123
Copy link
Contributor

Yeah sure. Shall I push it to this branch itself ?

@maniteja123
Copy link
Contributor

And thank you so so much @MechCoder @rvraghav93 @TomDLT and everyone else for all the help and bearing patiently with my doubts and sincere thanks to @hugobowne for letting me work on this. I am again sorry for taking so much of your time in reviewing this multiple times.

@MechCoder
Copy link
Member

Yes please push it here. I'll cherry-pick it

@maniteja123
Copy link
Contributor

@MechCoder sorry for the delay, This is the commit

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants