EasyEnsemble should be a meta-estimator #252

Closed
amueller opened this issue Mar 20, 2017 · 6 comments · Fixed by #315
Labels: Status: Blocker · Type: Enhancement

Comments

@amueller (Member) commented Mar 20, 2017

I'm not entirely sure how EasyEnsemble should be used, but I feel like it might be easier if it were a meta-estimator.
Let's say I want to implement a random forest using EasyEnsemble as an estimator. I have no idea how to do that easily. If it were a meta-estimator, I could just do EasyEnsemble(DecisionTreeClassifier(max_features="auto")), which would be nice. Ideally this would set the random seeds, too.

I ended up with this solution, which works, but is verbose and probably inefficient:

from sklearn.ensemble import VotingClassifier
from sklearn.tree import DecisionTreeClassifier
from imblearn.pipeline import make_pipeline as make_imb_pipeline
from imblearn.under_sampling import RandomUnderSampler

def make_resample_tree(random_state=0):
    # Under-sample the majority class (with replacement), then fit a tree
    tree = make_imb_pipeline(RandomUnderSampler(random_state=random_state, replacement=True),
                             DecisionTreeClassifier(max_features='auto', random_state=random_state))
    return "tree_{}".format(random_state), tree

classifiers = [make_resample_tree(i) for i in range(100)]
resampled_rf = VotingClassifier(classifiers, voting='soft')

[Though it is kinda neat that I can write it down like that ;]
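
For context, here is how the snippet above could be exercised end to end. The synthetic dataset and split below are illustrative assumptions, not part of the original comment:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Illustrative imbalanced data (95%/5% class split); any imbalanced dataset works
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

resampled_rf.fit(X_train, y_train)          # fits all 100 resampled trees
proba = resampled_rf.predict_proba(X_test)  # soft vote: averaged tree probabilities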

@amueller (Member, Author)

A stupid implementation of the general scheme is this:

from sklearn.base import clone
from sklearn.ensemble import VotingClassifier
from imblearn.pipeline import make_pipeline as make_imb_pipeline
from imblearn.under_sampling import RandomUnderSampler

def make_resampled_ensemble(estimator, n_estimators=100):
    estimators = []
    for i in range(n_estimators):
        est = clone(estimator)
        # Give each clone its own seed so the ensemble members differ
        if hasattr(est, "random_state"):
            est.random_state = i
        pipe = make_imb_pipeline(RandomUnderSampler(random_state=i, replacement=True),
                                 est)
        estimators.append(("est_{}".format(i), pipe))
    return VotingClassifier(estimators, voting="soft")

Clearly you could do that more elegantly by actually writing a meta-estimator class...
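
For illustration, a minimal sketch of what such a meta-estimator class could look like, wrapping the same scheme as above. The class name and defaults are assumptions, not the API imbalanced-learn eventually shipped:

from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.ensemble import VotingClassifier
from imblearn.pipeline import make_pipeline as make_imb_pipeline
from imblearn.under_sampling import RandomUnderSampler

class ResampledEnsembleClassifier(BaseEstimator, ClassifierMixin):
    # Hypothetical meta-estimator: fit n_estimators clones of `estimator`,
    # each on an independently under-sampled view of the data.

    def __init__(self, estimator, n_estimators=100):
        self.estimator = estimator
        self.n_estimators = n_estimators

    def fit(self, X, y):
        estimators = []
        for i in range(self.n_estimators):
            est = clone(self.estimator)
            if hasattr(est, "random_state"):
                est.random_state = i
            pipe = make_imb_pipeline(RandomUnderSampler(random_state=i, replacement=True),
                                     est)
            estimators.append(("est_{}".format(i), pipe))
        self.ensemble_ = VotingClassifier(estimators, voting="soft")
        self.ensemble_.fit(X, y)
        return self

    def predict(self, X):
        return self.ensemble_.predict(X)

    def predict_proba(self, X):
        return self.ensemble_.predict_proba(X)

With this, the example from the first comment would read ResampledEnsembleClassifier(DecisionTreeClassifier(max_features="auto")).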

@glemaitre (Member)

You are right. I think this is also the solution proposed in #149.

@glemaitre added the Type: Enhancement label on Mar 20, 2017
@chkoar (Member) commented Mar 20, 2017

@amueller is right. On the other hand, we could preserve the current functionality by renaming this class EasyEnsembleSampler and building an EasyEnsembleClassifier on top of it. I am OK either way. I could do it. PS: both the sampler and the classifier could easily be parallelized.
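
On the parallelization point: each resampled pipeline is fit independently, so joblib can dispatch the fits across cores. A rough sketch, with the dataset again an illustrative assumption:

from joblib import Parallel, delayed
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from imblearn.pipeline import make_pipeline as make_imb_pipeline
from imblearn.under_sampling import RandomUnderSampler

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)

def make_pipe(seed):
    # One under-sample-then-classify pipeline per seed
    return make_imb_pipeline(RandomUnderSampler(random_state=seed, replacement=True),
                             DecisionTreeClassifier(random_state=seed))

# The fits share no state, so they parallelize trivially
fitted = Parallel(n_jobs=-1)(delayed(make_pipe(i).fit)(X, y) for i in range(100))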

@glemaitre (Member) commented Mar 20, 2017 via email

@amueller (Member, Author)

FWIW, I don't have a strong opinion on whether to also keep the current version; people might rely on it.

@glemaitre (Member)

@chkoar I was thinking about that. I think the resampling strategy should accept any under-sampling method, and the classifier could be any classifier as well. You should probably keep this in mind while designing the meta-estimator.
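
To make that concrete, a hypothetical design in which both pieces are injectable. This is a sketch of the idea only, not the signature imbalanced-learn ended up with:

import numpy as np
from scipy.stats import mode
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.tree import DecisionTreeClassifier
from imblearn.under_sampling import RandomUnderSampler

class ParametrizedEasyEnsemble(BaseEstimator, ClassifierMixin):
    # Hypothetical: accepts any imblearn sampler and any sklearn classifier

    def __init__(self, base_estimator=None, sampler=None, n_estimators=10):
        self.base_estimator = base_estimator
        self.sampler = sampler
        self.n_estimators = n_estimators

    def fit(self, X, y):
        base = self.base_estimator if self.base_estimator is not None else DecisionTreeClassifier()
        sampler = self.sampler if self.sampler is not None else RandomUnderSampler()
        self.estimators_ = []
        for i in range(self.n_estimators):
            samp = clone(sampler)
            if hasattr(samp, "random_state"):
                samp.random_state = i
            X_res, y_res = samp.fit_resample(X, y)  # fit_sample in 2017-era imblearn
            est = clone(base)
            if hasattr(est, "random_state"):
                est.random_state = i
            self.estimators_.append(est.fit(X_res, y_res))
        return self

    def predict(self, X):
        # Majority vote across ensemble members
        votes = np.asarray([est.predict(X) for est in self.estimators_])
        return mode(votes, axis=0).mode.ravel()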
