New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Change named_steps to Bunch object #8586

Merged
merged 13 commits into from Mar 30, 2017

Conversation

Projects
None yet
5 participants
@herilalaina
Contributor

herilalaina commented Mar 14, 2017

Reference Issue

Fix #8481

What does this implement/fix? Explain your changes.

This PR changes pipeline.named_steps into bunch object instead of dictionary.

Any other comments?

Usage example

> clf = svm.SVC(kernel='linear')
> anova_svm = Pipeline([('anova', anova_filter), ('svc', clf)])
> anova_svm.named_steps.anova
SelectKBest(k=5, score_func=<function f_regression at 0x7f1d984d8620>)

To do

  • Update source code
  • Test
  • Documentation
@codecov

This comment has been minimized.

Show comment
Hide comment
@codecov

codecov bot Mar 14, 2017

Codecov Report

Merging #8586 into master will increase coverage by <.01%.
The diff coverage is 99.56%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8586      +/-   ##
==========================================
+ Coverage   95.49%   95.49%   +<.01%     
==========================================
  Files         342      342              
  Lines       61072    61088      +16     
==========================================
+ Hits        58318    58334      +16     
  Misses       2754     2754
Impacted Files Coverage Δ
sklearn/feature_selection/univariate_selection.py 99.46% <ø> (ø) ⬆️
sklearn/linear_model/coordinate_descent.py 96.94% <ø> (ø) ⬆️
sklearn/ensemble/forest.py 98.16% <ø> (ø) ⬆️
sklearn/metrics/classification.py 97.77% <ø> (ø) ⬆️
sklearn/mixture/tests/test_gaussian_mixture.py 99.63% <ø> (ø) ⬆️
sklearn/model_selection/_validation.py 98.71% <ø> (ø) ⬆️
sklearn/tree/tree.py 98.41% <ø> (ø) ⬆️
sklearn/calibration.py 98.87% <ø> (ø) ⬆️
sklearn/neighbors/approximate.py 98.95% <ø> (ø) ⬆️
sklearn/mixture/gaussian_mixture.py 100% <ø> (ø) ⬆️
... and 62 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 134c9b7...00e67c9. Read the comment docs.

codecov bot commented Mar 14, 2017

Codecov Report

Merging #8586 into master will increase coverage by <.01%.
The diff coverage is 99.56%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8586      +/-   ##
==========================================
+ Coverage   95.49%   95.49%   +<.01%     
==========================================
  Files         342      342              
  Lines       61072    61088      +16     
==========================================
+ Hits        58318    58334      +16     
  Misses       2754     2754
Impacted Files Coverage Δ
sklearn/feature_selection/univariate_selection.py 99.46% <ø> (ø) ⬆️
sklearn/linear_model/coordinate_descent.py 96.94% <ø> (ø) ⬆️
sklearn/ensemble/forest.py 98.16% <ø> (ø) ⬆️
sklearn/metrics/classification.py 97.77% <ø> (ø) ⬆️
sklearn/mixture/tests/test_gaussian_mixture.py 99.63% <ø> (ø) ⬆️
sklearn/model_selection/_validation.py 98.71% <ø> (ø) ⬆️
sklearn/tree/tree.py 98.41% <ø> (ø) ⬆️
sklearn/calibration.py 98.87% <ø> (ø) ⬆️
sklearn/neighbors/approximate.py 98.95% <ø> (ø) ⬆️
sklearn/mixture/gaussian_mixture.py 100% <ø> (ø) ⬆️
... and 62 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 134c9b7...00e67c9. Read the comment docs.

@jnothman

This comment has been minimized.

Show comment
Hide comment
@jnothman

jnothman Mar 14, 2017

Member
Member

jnothman commented Mar 14, 2017

@herilalaina herilalaina changed the title from Change named_steps to Bunch object to [WIP] Change named_steps to Bunch object Mar 15, 2017

@herilalaina herilalaina changed the title from [WIP] Change named_steps to Bunch object to [MRG] Change named_steps to Bunch object Mar 17, 2017

@jnothman

You should also test what happens when step names conflict with attributes of dict such as values.

Show outdated Hide outdated sklearn/pipeline.py
Show outdated Hide outdated sklearn/pipeline.py
@lesteve

This comment has been minimized.

Show comment
Hide comment
@lesteve

lesteve Mar 22, 2017

Member

Travis is failing because you have some PEP8 violations, please invest some time to configure on-the-fly flake8 checks inside your editor of choice. This allows you to spot these kind of problems while editing.

Member

lesteve commented Mar 22, 2017

Travis is failing because you have some PEP8 violations, please invest some time to configure on-the-fly flake8 checks inside your editor of choice. This allows you to spot these kind of problems while editing.

@herilalaina

This comment has been minimized.

Show comment
Hide comment
@herilalaina

herilalaina Mar 22, 2017

Contributor

Hi @jnothman @lesteve , thank you for your comment.
Actually, the Bunch object didn't work when step names conflict with attribute of dict like values, items ..
So I decide to override __getattribute__ on Bunch class.

Contributor

herilalaina commented Mar 22, 2017

Hi @jnothman @lesteve , thank you for your comment.
Actually, the Bunch object didn't work when step names conflict with attribute of dict like values, items ..
So I decide to override __getattribute__ on Bunch class.

@jnothman

This comment has been minimized.

Show comment
Hide comment
@jnothman

jnothman Mar 22, 2017

Member
Member

jnothman commented Mar 22, 2017

@herilalaina

This comment has been minimized.

Show comment
Hide comment
@herilalaina

herilalaina Mar 22, 2017

Contributor

To confirm what you said. I will remove the __getattribute__ in the Bunch class then change test_pipeline.py (line 524) with

pipeline = Pipeline([('values', transf), ("items", mult2)])
assert_true(pipeline.named_steps.values is not transf)
assert_true(pipeline.named_steps.items is not mult2)
Contributor

herilalaina commented Mar 22, 2017

To confirm what you said. I will remove the __getattribute__ in the Bunch class then change test_pipeline.py (line 524) with

pipeline = Pipeline([('values', transf), ("items", mult2)])
assert_true(pipeline.named_steps.values is not transf)
assert_true(pipeline.named_steps.items is not mult2)
@jnothman

I still think this deserves a brief mention in doc/modules/pipeline.rst, where it says "and as a dict in named_steps", with something like:

Attributes of named_steps map to keys, enabling tab completion in interactive environments:
    >>> pipe.named_steps.reduce_dim is pipe.named_steps['reduce_dim']
    True

Otherwise LGTM. Please add an entry to what's new.

Show outdated Hide outdated sklearn/tests/test_pipeline.py
@@ -247,6 +247,12 @@ API changes summary
needed for the perplexity calculation. :issue:`7954` by
:user:`Gary Foreman <garyForeman>`.

This comment has been minimized.

@jnothman

jnothman Mar 24, 2017

Member

I think this should mention pipeline

@jnothman

jnothman Mar 24, 2017

Member

I think this should mention pipeline

This comment has been minimized.

@herilalaina

herilalaina Mar 24, 2017

Contributor

Hi, I tried to organize into section each entry in API change summary. This resulted the conflict. I resolved it but appveyor test still failed (without merge). Should I remove my last commit ?

@herilalaina

herilalaina Mar 24, 2017

Contributor

Hi, I tried to organize into section each entry in API change summary. This resulted the conflict. I resolved it but appveyor test still failed (without merge). Should I remove my last commit ?

This comment has been minimized.

@jnothman

jnothman Mar 25, 2017

Member

Don't worry about sections. We'll do that at release

@jnothman

jnothman Mar 25, 2017

Member

Don't worry about sections. We'll do that at release

This comment has been minimized.

@herilalaina

herilalaina Mar 26, 2017

Contributor

Ok, thank you. I cleaned it.

@herilalaina

herilalaina Mar 26, 2017

Contributor

Ok, thank you. I cleaned it.

@jnothman jnothman changed the title from [MRG] Change named_steps to Bunch object to [MRG+1] Change named_steps to Bunch object Mar 24, 2017

@TomDLT

TomDLT approved these changes Mar 30, 2017

@TomDLT TomDLT merged commit fb5a498 into scikit-learn:master Mar 30, 2017

5 checks passed

ci/circleci Your tests passed on CircleCI!
Details
codecov/patch 100% of diff hit (target 95.49%)
Details
codecov/project 95.49% (+<.01%) compared to 134c9b7
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@amueller

This comment has been minimized.

Show comment
Hide comment
@amueller

amueller Mar 30, 2017

Member

Awesome, thanks!

Member

amueller commented Mar 30, 2017

Awesome, thanks!

@@ -270,6 +270,12 @@ API changes summary
needed for the perplexity calculation. :issue:`7954` by
:user:`Gary Foreman <garyForeman>`.
- Replace attribute ``named_steps`` ``dict`` to :class:`sklearn.utils.Bunch`
in :class:`sklearn.pipeline.Pipeline` to enable tab completion in interactive
environment. In the case conflict value on ``named_steps`` and ``dict``

This comment has been minimized.

@amueller

amueller Mar 30, 2017

Member

I'm not sure I understand this sentence.

@amueller

amueller Mar 30, 2017

Member

I'm not sure I understand this sentence.

massich added a commit to massich/scikit-learn that referenced this pull request Apr 26, 2017

[MRG+1] Change named_steps to Bunch object (#8586)
* Change named_steps to Bunch object

* Update named_steps attribute documentation

* Add test for named steps bunch object

* Delete whitespace in test_pipeline

* Update test_pipeline.py

* Add comment for named_steps usage

* Move dataset/Bunch to utils

* Fix to PEP8 format

* Add __getattribute method to Bunch class, Fix pep8 bug

* Remove __getattribute__, update test_pipeline

* Update test with conflict and non-conflict named_steps

* Add reference to class Pipeline

@herilalaina herilalaina deleted the herilalaina:autocomplete-ipython branch Apr 30, 2017

Sundrique added a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017

[MRG+1] Change named_steps to Bunch object (#8586)
* Change named_steps to Bunch object

* Update named_steps attribute documentation

* Add test for named steps bunch object

* Delete whitespace in test_pipeline

* Update test_pipeline.py

* Add comment for named_steps usage

* Move dataset/Bunch to utils

* Fix to PEP8 format

* Add __getattribute method to Bunch class, Fix pep8 bug

* Remove __getattribute__, update test_pipeline

* Update test with conflict and non-conflict named_steps

* Add reference to class Pipeline

NelleV added a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017

[MRG+1] Change named_steps to Bunch object (#8586)
* Change named_steps to Bunch object

* Update named_steps attribute documentation

* Add test for named steps bunch object

* Delete whitespace in test_pipeline

* Update test_pipeline.py

* Add comment for named_steps usage

* Move dataset/Bunch to utils

* Fix to PEP8 format

* Add __getattribute method to Bunch class, Fix pep8 bug

* Remove __getattribute__, update test_pipeline

* Update test with conflict and non-conflict named_steps

* Add reference to class Pipeline

paulha added a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017

[MRG+1] Change named_steps to Bunch object (#8586)
* Change named_steps to Bunch object

* Update named_steps attribute documentation

* Add test for named steps bunch object

* Delete whitespace in test_pipeline

* Update test_pipeline.py

* Add comment for named_steps usage

* Move dataset/Bunch to utils

* Fix to PEP8 format

* Add __getattribute method to Bunch class, Fix pep8 bug

* Remove __getattribute__, update test_pipeline

* Update test with conflict and non-conflict named_steps

* Add reference to class Pipeline

maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

[MRG+1] Change named_steps to Bunch object (#8586)
* Change named_steps to Bunch object

* Update named_steps attribute documentation

* Add test for named steps bunch object

* Delete whitespace in test_pipeline

* Update test_pipeline.py

* Add comment for named_steps usage

* Move dataset/Bunch to utils

* Fix to PEP8 format

* Add __getattribute method to Bunch class, Fix pep8 bug

* Remove __getattribute__, update test_pipeline

* Update test with conflict and non-conflict named_steps

* Add reference to class Pipeline

jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017

[MRG+1] Change named_steps to Bunch object (#8586)
* Change named_steps to Bunch object

* Update named_steps attribute documentation

* Add test for named steps bunch object

* Delete whitespace in test_pipeline

* Update test_pipeline.py

* Add comment for named_steps usage

* Move dataset/Bunch to utils

* Fix to PEP8 format

* Add __getattribute method to Bunch class, Fix pep8 bug

* Remove __getattribute__, update test_pipeline

* Update test with conflict and non-conflict named_steps

* Add reference to class Pipeline
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment