[MRG+1] Change named_steps to Bunch object #8586
Codecov Report
@@ Coverage Diff @@
## master #8586 +/- ##
==========================================
+ Coverage 95.49% 95.49% +<.01%
==========================================
Files 342 342
Lines 61072 61088 +16
==========================================
+ Hits 58318 58334 +16
Misses 2754 2754
You should also test what happens when step names conflict with attributes of |
@@ -20,6 +20,7 @@ | |||
from .externals import six | |||
from .utils import tosequence | |||
from .utils.metaestimators import if_delegate_has_method | |||
from .datasets.base import Bunch |
jnothman
Mar 20, 2017
Member
I don't think we want a dependency on datasets. Move Bunch to utils.
@@ -122,7 +123,7 @@ class Pipeline(_BasePipeline): | |||
Attributes | |||
---------- | |||
named_steps : dict | |||
named_steps : bunch object | |||
Read-only attribute to access any step parameter by user given name. | |||
Keys are step names and values are steps parameters. |
jnothman
Mar 20, 2017
Member
Comment on what a bunch is: a dict with attribute access.
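To illustrate the "dict with attribute access" description, here is a minimal sketch of such a class. It is an assumption for illustration only and not necessarily the code in this PR; the step values are placeholder strings.

```python
class Bunch(dict):
    """Minimal sketch of a dict that also exposes its keys as attributes."""

    def __init__(self, **kwargs):
        super().__init__(kwargs)

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails, so real dict
        # methods (keys, values, items, ...) are never shadowed.
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)

    def __setattr__(self, key, value):
        # Attribute assignment is stored as a regular dict item.
        self[key] = value

    def __dir__(self):
        # Listing the keys is what makes tab completion possible.
        return list(self.keys())


# Placeholder values stand in for real estimators.
steps = Bunch(scaler="a scaler object", clf="a classifier object")
```

Both `steps.clf` and `steps["clf"]` then refer to the same object, and `dir(steps)` lists the step names.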
Travis is failing because you have some PEP8 violations. Please invest some time in configuring on-the-fly flake8 checks in your editor of choice. This lets you spot these kinds of problems while editing.
To be sure, correct behaviour would be to provide dict's keys method, even
if a step is named 'keys'. We can't just break the existing dict
behaviour. Nor can we accept an error if a user chooses keys for a step
name; they just don't get the new functionality.
On 23 Mar 2017 7:20 am, "RAKOTOARISON Herilalaina" wrote:
Hi @jnothman @lesteve, thank you for your comments.
Actually, the Bunch object didn't work when step names conflict with
dict attributes such as values, items, etc.
So I decided to override __getattribute__ on the Bunch class.
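The behaviour requested in the reply above (dict methods stay intact, conflicting step names simply lose attribute access) can be sketched as follows. The `Bunch` class here is a hypothetical minimal stand-in without any `__getattribute__` override, and the step values are placeholder strings.

```python
class Bunch(dict):
    # Hypothetical minimal Bunch: __getattr__ is only a fallback, so it
    # never takes precedence over real dict attributes.
    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)


named_steps = Bunch(values="step named values", mult="step named mult")

# A non-conflicting name is reachable as an attribute.
via_attribute = named_steps.mult

# A conflicting name still resolves to the dict method...
dict_method = named_steps.values

# ...but the step itself stays reachable through item access.
via_item = named_steps["values"]
```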
To confirm what you said. I will remove the pipeline = Pipeline([('values', transf), ("items", mult2)])
assert_true(pipeline.named_steps.values is not transf)
assert_true(pipeline.named_steps.items is not mult2) |
I still think this deserves a brief mention in
Otherwise LGTM. Please add an entry to what's new. |
assert_true(pipeline.named_steps.mult is mult2) | ||
|
||
# Test bunch with conflict attribute of dict | ||
pipeline = Pipeline([('values', transf), ("items", mult2)]) |
jnothman
Mar 23, 2017
Member
Seeing as I could imagine a bad implementation where a Bunch was only used if the step names did not conflict, I would rather a test that included conflicting and non-conflicting names together. That is, include these steps all in the above pipeline.
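A test along these lines might look like the sketch below. It is self-contained and uses a hypothetical `named_steps` helper and stand-in `Bunch` class rather than the real `Pipeline`, so all names here are assumptions for illustration.

```python
class Bunch(dict):
    """Hypothetical stand-in: a dict with attribute access to its keys."""

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)


def named_steps(steps):
    # Assumed helper mirroring what a pipeline could do with its
    # (name, estimator) pairs; not the PR's actual code.
    return Bunch(**dict(steps))


def test_named_steps_mixed_names():
    transf, mult2 = object(), object()
    # One non-conflicting name and one that clashes with dict.values.
    bunch = named_steps([("mock", transf), ("values", mult2)])

    # Attribute access works for the non-conflicting name...
    assert bunch.mock is transf
    # ...while the conflicting name keeps returning the dict method.
    assert bunch.values is not mult2
    # Item access works for both.
    assert bunch["mock"] is transf
    assert bunch["values"] is mult2


test_named_steps_mixed_names()
```

Putting both kinds of names in one container guards against an implementation that falls back to a plain dict whenever any name conflicts.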
@@ -247,6 +247,12 @@ API changes summary | |||
needed for the perplexity calculation. :issue:`7954` by | |||
:user:`Gary Foreman <garyForeman>`. | |||
|
jnothman
Mar 24, 2017
Member
I think this should mention pipeline
herilalaina
Mar 24, 2017
Author
Contributor
Hi, I tried to organize the entries in the API changes summary into sections, which caused a merge conflict. I resolved it, but the AppVeyor tests still fail (without merge). Should I remove my last commit?
jnothman
Mar 25, 2017
Member
Don't worry about sections. We'll do that at release
herilalaina
Mar 26, 2017
Author
Contributor
Ok, thank you. I cleaned it.
4fb4dad to 7b25de2
Awesome, thanks! |
@@ -270,6 +270,12 @@ API changes summary | |||
needed for the perplexity calculation. :issue:`7954` by | |||
:user:`Gary Foreman <garyForeman>`. | |||
|
|||
- Replace attribute ``named_steps`` ``dict`` to :class:`sklearn.utils.Bunch` | |||
in :class:`sklearn.pipeline.Pipeline` to enable tab completion in interactive | |||
environment. In the case conflict value on ``named_steps`` and ``dict`` |
amueller
Mar 30, 2017
Member
I'm not sure I understand this sentence.
* Change named_steps to Bunch object
* Update named_steps attribute documentation
* Add test for named steps bunch object
* Delete whitespace in test_pipeline
* Update test_pipeline.py
* Add comment for named_steps usage
* Move dataset/Bunch to utils
* Fix to PEP8 format
* Add __getattribute method to Bunch class, Fix pep8 bug
* Remove __getattribute__, update test_pipeline
* Update test with conflict and non-conflict named_steps
* Add reference to class Pipeline
Reference Issue
Fix #8481
What does this implement/fix? Explain your changes.
This PR changes pipeline.named_steps into a Bunch object instead of a dictionary.
Any other comments?
Usage example
To do