TST factorize some common test of ensemble of heterogeneous es… #15387
Conversation
Looking at the voting tests, ping @jnothman @amueller @NicolasHug @thomasjpfan @qinhanmin2014: could you confirm that we have to change the Stacking?
By "expected to report dropped estimators" do you mean
I have a preference for the interface in
In the case of
I did not consider the
So we should be able to map
For
So we need:
This PR could focus on 1. WDYT?
Reading the code, that's not true.
return Bunch(**{name: trans for name, trans, _ in self.transformers_})
@NicolasHug Are you sure?
In [5]: import numpy as np
...: from sklearn.compose import ColumnTransformer
...: from sklearn.preprocessing import Normalizer
...: ct = ColumnTransformer(
...: [("norm1", Normalizer(norm='l1'), [0, 1]),
...: ("norm2", Normalizer(norm='l1'), slice(2, 4))])
...: X = np.array([[0., 1., 2., 2.],
...: [1., 1., 0., 1.]])
...: # Normalizer scales each row of X to unit norm. A separate scaling
...: # is applied for the two first and two last elements of each
...: # row independently.
...: ct.fit_transform(X)
...:
Out[5]:
array([[0. , 1. , 0.5, 0.5],
[0.5, 0.5, 0. , 1. ]])
In [6]: ct.set_params(norm1='drop')
Out[6]:
ColumnTransformer(n_jobs=None, remainder='drop', sparse_threshold=0.3,
transformer_weights=None,
transformers=[('norm1', 'drop', [0, 1]),
('norm2', Normalizer(copy=True, norm='l1'),
slice(2, 4, None))],
verbose=False)
In [7]: ct.fit_transform(X)
Out[7]:
array([[0.5, 0.5],
[0. , 1. ]])
In [8]: ct.transformers_
Out[8]:
[('norm1', 'drop', [0, 1]),
('norm2', Normalizer(copy=True, norm='l1'), slice(2, 4, None))]
In [9]: ct.named_transformers_
Out[9]: {'norm1': 'drop', 'norm2': Normalizer(copy=True, norm='l1')}
We also have another issue with
Sorry, on my phone rn so can't double check, but both the code comments and the docstrings suggest otherwise. Docstrings say "fitted estimator", which implies the dropped estimators aren't there.
I am confused now. If you refer to the
If you refer to the
Sorry, read too fast. I was confused that a 'drop' qualifies as a "fitted estimator". Ignore my comments ^^
I am okay with changing
So I added the dropped estimator in
We can have further discussion regarding what to do with
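As a minimal sketch of the behavior this comment describes (assuming a scikit-learn version that includes the change from this PR; the data and estimator names are illustrative, not the PR's actual test), a dropped Stacking sub-estimator should stay visible as a 'drop' entry in named_estimators_, while estimators_ only holds the fitted ones:

```python
# Sketch: how StackingClassifier exposes a dropped sub-estimator
# (assumes scikit-learn >= 0.22 behavior, where 'drop' is supported).
from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
stack = StackingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("tree", DecisionTreeClassifier(random_state=0))]
)
# Replace one sub-estimator by the 'drop' sentinel before fitting.
stack.set_params(lr="drop").fit(X, y)

# estimators_ holds only the fitted sub-estimators...
print(len(stack.estimators_))
# ...while named_estimators_ keeps the dropped one visible by name.
print(stack.named_estimators_["lr"])
```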
No need to update the changelog for this PR.
I like this, thank you.
Co-Authored-By: Joel Nothman <joel.nothman@gmail.com>
LGTM
Merged with master to make sure the docs are okay (the lint error comes from the renaming). Will merge when tests pass.
Thanks for merging
Thank you for the PR, @glemaitre!
Reference Issues/PRs
A follow-up to #15375.
Includes a test to illustrate the inconsistency between the Stacking and Voting estimators.
TODO:
- named_estimators_

What does this implement/fix? Explain your changes.
We merged #15375, but it introduced a regression/inconsistency between the Stacking and Voting estimators. The documentation is not straightforward, but estimators_ does not include the dropped estimators. For Stacking estimators, named_estimators_ exposes the same semantics, while Voting estimators report the dropped estimators with the 'drop' value.

Any other comments?
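To make the Voting side concrete, here is a hedged sketch (toy data and names are illustrative, not the PR's actual test; it assumes scikit-learn >= 0.22, where the 'drop' sentinel is accepted) showing that the dropped estimator is reported by name while estimators_ contains only the fitted ones:

```python
# Sketch: how VotingClassifier reports a dropped sub-estimator.
import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X = np.array([[0., 1.], [1., 0.], [0., 0.], [1., 1.]])
y = np.array([0, 0, 1, 1])

vote = VotingClassifier(
    estimators=[("lr", LogisticRegression()),
                ("tree", DecisionTreeClassifier(random_state=0))]
)
# Drop one sub-estimator, then fit on the toy data.
vote.set_params(lr="drop").fit(X, y)

# Only the non-dropped estimator is fitted...
print(len(vote.estimators_))
# ...but the name-to-estimator mapping still records the 'drop' entry.
print(vote.named_estimators_["lr"])
```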