New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Voting classifier flatten transform (#7230) #7794

Closed
wants to merge 4 commits into
base: master
from

Conversation

Projects
None yet
6 participants
@olologin
Contributor

olologin commented Oct 30, 2016

Reference Issue

Fixes #7230

What does this implement/fix? Explain your changes.

It adds flatten_transform parameter to VotingClassifier, which changes shape of transform method's output to [n_samples, n_classifiers * n_classes] instead of [n_classifiers, n_samples, n_classes],
With this parameter turned on you can use VotingClassifier as a transformer, and feed its output to other estimators/transformers in Pipeline.

Any other comments?

None, make suggestions. I Summon @amueller into this PR :)

@olologin olologin changed the title from Voting classifier flatten transform to [RFC] Voting classifier flatten transform (#7230) Oct 30, 2016

@olologin

This comment has been minimized.

Show comment
Hide comment
@olologin

olologin Oct 30, 2016

Contributor

Also, I'm not sure whether regression test is needed. Because added functionality is pretty simple.

Contributor

olologin commented Oct 30, 2016

Also, I'm not sure whether regression test is needed. Because added functionality is pretty simple.

olologin added some commits Oct 30, 2016

@jnothman

This comment has been minimized.

Show comment
Hide comment
@jnothman

jnothman Oct 30, 2016

Member

I'd be tempted to deprecate the current behaviour.

Member

jnothman commented Oct 30, 2016

I'd be tempted to deprecate the current behaviour.

@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller Nov 2, 2016

Member

tests are always needed!
@jnothman change the default or deprecate entirely?

Member

amueller commented Nov 2, 2016

tests are always needed!
@jnothman change the default or deprecate entirely?

@jnothman

This comment has been minimized.

Show comment
Hide comment
@jnothman

jnothman Nov 3, 2016

Member

Do you see harm in deprecating entirely? I'd rather that. The current transform output has indeterminate semantics within the scikit-learn API.

Member

jnothman commented Nov 3, 2016

Do you see harm in deprecating entirely? I'd rather that. The current transform output has indeterminate semantics within the scikit-learn API.

@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller Nov 17, 2016

Member

I'm happy with deprecating entirely. That needs two steps, though: introducing the parameter and removing it again. Or what did you want to do?

Member

amueller commented Nov 17, 2016

I'm happy with deprecating entirely. That needs two steps, though: introducing the parameter and removing it again. Or what did you want to do?

@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller Feb 20, 2017

Member

Also, it might be helpful to special case for binary classification and only retain probabilities of one of the classes, say the positive one? That would result in smaller and more interpretable downstream models.

Member

amueller commented Feb 20, 2017

Also, it might be helpful to special case for binary classification and only retain probabilities of one of the classes, say the positive one? That would result in smaller and more interpretable downstream models.

@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller Feb 20, 2017

Member

bonus points for implementing get_feature_names() ;)

Member

amueller commented Feb 20, 2017

bonus points for implementing get_feature_names() ;)

def test_transform():
"""Check trqansform method of VotingClassifier on toy dataset."""

This comment has been minimized.

@agramfort

agramfort Jun 8, 2017

Member

trqansform -> transform

@agramfort

agramfort Jun 8, 2017

Member

trqansform -> transform

flatten_transform=True).fit(X, y)
assert_array_equal(eclf1.transform(X).shape, (3, 4, 2))
assert_array_equal(eclf2.transform(X).shape, (4, 6))

This comment has been minimized.

@agramfort

agramfort Jun 8, 2017

Member

can you check more than shapes with an assert_equal on the values after a proper reshape?

@agramfort

agramfort Jun 8, 2017

Member

can you check more than shapes with an assert_equal on the values after a proper reshape?

flatten_transform : bool, optional (default=False)
Affects shape of transform output only when voting='soft'
If voting='soft' and flatten_transform=True, transform method returns
matrix with shape [n_samples, n_classifiers * n_classes] instead of

This comment has been minimized.

@agramfort

agramfort Jun 8, 2017

Member

[n_samples, n_classifiers * n_classes]
->
(n_samples, n_classifiers * n_classes)

shapes are tuples

@agramfort

agramfort Jun 8, 2017

Member

[n_samples, n_classifiers * n_classes]
->
(n_samples, n_classifiers * n_classes)

shapes are tuples

Affects shape of transform output only when voting='soft'
If voting='soft' and flatten_transform=True, transform method returns
matrix with shape [n_samples, n_classifiers * n_classes] instead of
[n_classifiers, n_samples, n_classes].

This comment has been minimized.

@agramfort

agramfort Jun 8, 2017

Member

same here

@agramfort

agramfort Jun 8, 2017

Member

same here

@@ -238,16 +247,25 @@ def transform(self, X):
Returns
-------
If `voting='soft'`:
If `voting='soft'` and `flatten_transform=False`:
array-like = [n_classifiers, n_samples, n_classes]

This comment has been minimized.

@agramfort

agramfort Jun 8, 2017

Member

shape as tuple

@agramfort

agramfort Jun 8, 2017

Member

shape as tuple

- Added ``flatten_transform`` parameter to :class:`ensemble.VotingClassifier`
to change output shape of `transform` method to 2 dimensional.
(`#7794 <https://github.com/scikit-learn/scikit-learn/pull/7794>`_)
by `Ibraim Ganiev`_.

This comment has been minimized.

@agramfort

agramfort Jun 8, 2017

Member

should be written as

:issue:7794 by Ibraim Ganiev_.

@agramfort

agramfort Jun 8, 2017

Member

should be written as

:issue:7794 by Ibraim Ganiev_.

@jnothman

This comment has been minimized.

Show comment
Hide comment
@jnothman

jnothman Jun 8, 2017

Member

How does this relate to our concurrent ambitions towards stacking (e.g. #8960)

Member

jnothman commented Jun 8, 2017

How does this relate to our concurrent ambitions towards stacking (e.g. #8960)

@GaelVaroquaux

This comment has been minimized.

Show comment
Hide comment
@GaelVaroquaux

GaelVaroquaux Jun 8, 2017

Member

You'll also need to rebase on master, to get the tests running.

Member

GaelVaroquaux commented Jun 8, 2017

You'll also need to rebase on master, to get the tests running.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment