Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+2] Minimize the validation of X in adaboost #13174

Merged
merged 25 commits into from Feb 28, 2019

Conversation

Projects
None yet
6 participants
@chkoar
Copy link
Contributor

chkoar commented Feb 15, 2019

Reference Issues/PRs

Fixes #7768. Takes over #8304.

What does this implement/fix? Explain your changes.

This PR (almost) transfers the responsibility of the validation of X and y from AdaBoost to the base estimator.

Any other comments?

I retained all the tests. If we want to relax the check about sparsity we should remove the test_sparse_classification and test_sparse_regression tests.

@jnothman
Copy link
Member

jnothman left a comment

Looks nice

Please add an entry to the change log at doc/whats_new/v0.21.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors if applicable) with :user:

Show resolved Hide resolved sklearn/ensemble/weight_boosting.py Outdated
Show resolved Hide resolved sklearn/ensemble/weight_boosting.py Outdated
Show resolved Hide resolved sklearn/ensemble/weight_boosting.py Outdated

boost = AdaBoostRegressor(DummyEstimator(), n_estimators=3)
boost.fit(X, y_regr)
assert_equal(len(boost.estimator_weights_), len(boost.estimator_errors_))

This comment has been minimized.

@jnothman

jnothman Feb 17, 2019

Member

With the adoption of pytest, we are phasing out use of test helpers assert_equal, assert_true, etc. Please use bare assert statements, e.g. assert x == y, assert not x, etc.

This comment has been minimized.

@chkoar

chkoar Feb 18, 2019

Author Contributor

I did not touch that test. It seems that I did because I removed and then readd the test_sparse_classification and test_sparse_regression tests. Do you want me to fix that even it is unrelated to the PR?

@NicolasHug
Copy link
Contributor

NicolasHug left a comment

some comments

Show resolved Hide resolved sklearn/ensemble/tests/test_weight_boosting.py Outdated
Show resolved Hide resolved sklearn/ensemble/tests/test_weight_boosting.py Outdated
Show resolved Hide resolved sklearn/ensemble/tests/test_weight_boosting.py Outdated
@chkoar

This comment has been minimized.

Copy link
Contributor Author

chkoar commented Feb 18, 2019

@jnothman @NicolasHug do you think that we should retain the test_sparse_classification and test_sparse_regression tests?

@jnothman
Copy link
Member

jnothman left a comment

I don't understand. Why should we not retain those tests?

Show resolved Hide resolved sklearn/ensemble/tests/test_weight_boosting.py Outdated
@chkoar

This comment has been minimized.

Copy link
Contributor Author

chkoar commented Feb 18, 2019

I don't understand. Why should we not retain those tests?

@jnothman in the process of the minimization of the validation I was thinking that these checks/tests should be in the responsibility of the base estimator. That's why I am asking.

@chkoar chkoar force-pushed the chkoar:minimize_validation_in_meta_estimators branch from 88cebc6 to 2e9a045 Feb 21, 2019

chkoar added some commits Feb 21, 2019

@chkoar

This comment has been minimized.

Copy link
Contributor Author

chkoar commented Feb 21, 2019

@jnothman _num_samples might be unnecessary since we use check_array and check_X_y

chkoar added some commits Feb 23, 2019

chkoar added some commits Feb 25, 2019

@jnothman
Copy link
Member

jnothman left a comment

Otherwise LGTM

@@ -118,6 +118,9 @@ Support for Python 3.4 and below has been officially dropped.
value of ``learning_rate`` in ``update_terminal_regions`` is not consistent
with the document and the caller functions.
:issue:`6463` by :user:`movelikeriver <movelikeriver>`.

- |Enhancement| Minimized the validation of X in :class:`ensemble.AdaBoostClassifier`
and :class:`ensemble.AdaBoostRegressor` :issue:`13174` by :user:`Christos Aridas <chkoar>`.

This comment has been minimized.

@jnothman

jnothman Feb 26, 2019

Member

please keep lines under 80 characters

@@ -70,16 +69,32 @@ def __init__(self,
self.learning_rate = learning_rate
self.random_state = random_state

def _validate_data(self, X, y=None):

accept_sparse = ['csr', 'csc']

This comment has been minimized.

@jnothman

jnothman Feb 26, 2019

Member

It took me a while to understand why we don't just accept_sparse=True. Maybe add a comment that these are required for safe_indexing support???

This comment has been minimized.

@chkoar

chkoar Feb 26, 2019

Author Contributor

test_sparse_regression and test_sparse_classification check that sparse matrices will be converted in one of the two formats. Check here for instance. We could minimize more the validation by turn of these assertions and use accept_sparse=True but safe_indexing does not support all sparse matrices. What do you propose?

This comment has been minimized.

@GaelVaroquaux

GaelVaroquaux Feb 27, 2019

Member

What @jnothman is asking for is simply a comment on the line above saying that these are required for safe_indexing support.

@agramfort agramfort changed the title Minimize the validation of X in adaboost [MRG+1] Minimize the validation of X in adaboost Feb 27, 2019

@GaelVaroquaux
Copy link
Member

GaelVaroquaux left a comment

LGTM, but the comment by @jnothman should be addressed (it's just a question of adding a comment line)

@@ -70,16 +69,32 @@ def __init__(self,
self.learning_rate = learning_rate
self.random_state = random_state

def _validate_data(self, X, y=None):

accept_sparse = ['csr', 'csc']

This comment has been minimized.

@GaelVaroquaux

GaelVaroquaux Feb 27, 2019

Member

What @jnothman is asking for is simply a comment on the line above saying that these are required for safe_indexing support.

@GaelVaroquaux GaelVaroquaux changed the title [MRG+1] Minimize the validation of X in adaboost [MRG+2] Minimize the validation of X in adaboost Feb 27, 2019

@glemaitre

This comment has been minimized.

Copy link
Contributor

glemaitre commented Feb 27, 2019

Some PEP8 I think

@chkoar

This comment has been minimized.

Copy link
Contributor Author

chkoar commented Feb 27, 2019

@glemaitre good catch. I think it is ok.

@jnothman jnothman merged commit 9d21197 into scikit-learn:master Feb 28, 2019

10 checks passed

LGTM analysis: C/C++ No code changes detected
Details
LGTM analysis: JavaScript No code changes detected
Details
LGTM analysis: Python No new or fixed alerts
Details
ci/circleci: deploy Your tests passed on CircleCI!
Details
ci/circleci: doc Your tests passed on CircleCI!
Details
ci/circleci: doc-min-dependencies Your tests passed on CircleCI!
Details
ci/circleci: lint Your tests passed on CircleCI!
Details
codecov/patch 100% of diff hit (target 92.47%)
Details
codecov/project 92.55% (+0.08%) compared to 09bc276
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@jnothman

This comment has been minimized.

Copy link
Member

jnothman commented Feb 28, 2019

Thanks @chkoar!

Kiku-git added a commit to Kiku-git/scikit-learn that referenced this pull request Mar 4, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.