Skip to content

[MRG] FIX OvR with constant label for non-predict methods #3308

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 25, 2014

Conversation

jnothman
Copy link
Member

bug mentioned on ML

@jnothman
Copy link
Member Author

This is still broken for a constantly absent label.

@abhishekkrthakur
Copy link
Contributor

This fixes the problem in my case.

@jnothman
Copy link
Member Author

Until the second commit, it may have fixed the error, but not given correct output. The second commit should resolve that.

@jnothman jnothman changed the title FIX OvR with constant label for non-predict methods [MRG] FIX OvR with constant label for non-predict methods Jun 23, 2014
@coveralls
Copy link

Coverage Status

Coverage increased (+0.0%) when pulling 51d4f75 on jnothman:ovr_constant_predict_proba into 30347e9 on scikit-learn:master.

@@ -63,10 +63,23 @@ def test_ovr_always_present():
X[:5, :] = 0
y = [[int(i >= 5), 2, 3] for i in range(10)]
with warnings.catch_warnings(record=True):
ovr = OneVsRestClassifier(DecisionTreeClassifier())
ovr = OneVsRestClassifier(LogisticRegression())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to modify this line?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here. Maybe this is because the predict_proba of a DecisionTreeClassifier is pretty meaningless as it's binary unless the depth is bound. That should still work though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I first tried, it seemed there was no predict_proba on the tree, but I thought that was a bit weird. I was doing something else at the same time....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. With DTC, I get Base estimator doesn't have a decision_function attribute.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then it was some other class I threw in that refused to give me a predict_proba

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have spoken too fast.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

svc?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If SVC, why not LogisticRegression?

On 24 June 2014 09:37, Arnaud Joly notifications@github.com wrote:

In sklearn/tests/test_multiclass.py:

@@ -63,10 +63,23 @@ def test_ovr_always_present():
X[:5, :] = 0
y = [[int(i >= 5), 2, 3] for i in range(10)]
with warnings.catch_warnings(record=True):

  •    ovr = OneVsRestClassifier(DecisionTreeClassifier())
    
  •    ovr = OneVsRestClassifier(LogisticRegression())
    

svc?


Reply to this email directly or view it on GitHub
https://github.com/scikit-learn/scikit-learn/pull/3308/files#r14128965.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for LogisticRegression as the simplest classifier that models class assignment probabilities. Tied with MultinomialNB for event count data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The decision_function of MultinomialNB is kind of a hack (a linear interpretation of the model) so +2 for LogisticRegression.

@jnothman
Copy link
Member Author

So, @ogrisel, @arjoly, are you fine for merge?

@arjoly
Copy link
Member

arjoly commented Jun 24, 2014

Could you use assert_warns or ignore_warnings instead of the context manager?

@ogrisel
Copy link
Member

ogrisel commented Jun 25, 2014

+1 for merging.

@arjoly
Copy link
Member

arjoly commented Jun 25, 2014

Thanks @jnothman merging!

arjoly added a commit that referenced this pull request Jun 25, 2014
[MRG] FIX OvR with constant label for non-predict methods
@arjoly arjoly merged commit 7aee742 into scikit-learn:master Jun 25, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants