[MRG+2] Minimize the validation of X in adaboost #13174

Merged
3 changes: 3 additions & 0 deletions doc/whats_new/v0.21.rst
@@ -118,6 +118,9 @@ Support for Python 3.4 and below has been officially dropped.
  value of ``learning_rate`` in ``update_terminal_regions`` is not consistent
  with the document and the caller functions.
  :issue:`6463` by :user:`movelikeriver <movelikeriver>`.

- |Enhancement| Minimized the validation of X in :class:`ensemble.AdaBoostClassifier`
  and :class:`ensemble.AdaBoostRegressor` :issue:`13174` by :user:`Christos Aridas <chkoar>`.
Review comment (Member): Please keep lines under 80 characters.


:mod:`sklearn.externals`
........................
47 changes: 46 additions & 1 deletion sklearn/ensemble/tests/test_weight_boosting.py
@@ -471,7 +471,6 @@ def fit(self, X, y, sample_weight=None):
def test_sample_weight_adaboost_regressor():
    """
    AdaBoostRegressor should work without sample_weights in the base estimator

    The random weighted sampling is done internally in the _boost method in
    AdaBoostRegressor.
    """
@@ -486,3 +485,49 @@ def predict(self, X):
    boost = AdaBoostRegressor(DummyEstimator(), n_estimators=3)
    boost.fit(X, y_regr)
    assert_equal(len(boost.estimator_weights_), len(boost.estimator_errors_))


def test_multidimensional_X():
    """
    Check that the AdaBoost estimators can work with n-dimensional
    data matrix
    """
    class DummyClassifier(BaseEstimator):

        def fit(self, X, y, sample_weight=None):
            self.classes_ = np.unique(y)
            return self

        def predict(self, X):
            n_samples = len(X)
            predictions = np.random.choice(self.classes_, n_samples)
            return predictions

        def predict_proba(self, X):
            n_samples = len(X)
            n_classes = len(self.classes_)
            probas = np.random.randn(n_samples, n_classes)
            return probas

    class DummyRegressor(BaseEstimator):

        def fit(self, X, y):
            return self

        def predict(self, X):
            n_samples = len(X)
            predictions = np.random.randn(n_samples)
            return predictions

    X = np.random.randn(50, 3, 3)
    yc = np.random.choice([0, 1], 50)
    yr = np.random.randn(50)

    boost = AdaBoostClassifier(DummyClassifier())
    boost.fit(X, yc)
    boost.predict(X)
    boost.predict_proba(X)

    boost = AdaBoostRegressor(DummyRegressor())
    boost.fit(X, yr)
    boost.predict(X)
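
The test above exercises the idea behind this change: a meta-estimator only needs to know how many samples it was given, while the shape of each individual sample is the base estimator's concern. The following is a rough pure-Python sketch of that idea, not scikit-learn's implementation. It uses a plain averaging ensemble rather than AdaBoost, and `check_n_samples`, `SumRegressor`, and `AveragingWrapper` are hypothetical names introduced only for illustration.

```python
def check_n_samples(X, y):
    """Validate only what the wrapper relies on: matching sample counts."""
    if len(X) != len(y):
        raise ValueError(
            "X and y have inconsistent numbers of samples: "
            "%d != %d" % (len(X), len(y))
        )
    return X, y


def flatten(sample):
    """Yield the scalar values of an arbitrarily nested list or tuple."""
    if isinstance(sample, (list, tuple)):
        for item in sample:
            yield from flatten(item)
    else:
        yield sample


class SumRegressor:
    """Hypothetical base estimator that handles samples of any nesting depth."""

    def fit(self, X, y):
        # Nothing to learn in this toy example.
        return self

    def predict(self, X):
        # The base estimator, not the wrapper, decides how to read a sample.
        return [sum(flatten(sample)) for sample in X]


class AveragingWrapper:
    """Hypothetical meta-estimator that does not inspect the shape of X."""

    def __init__(self, make_estimator, n_estimators=3):
        self.make_estimator = make_estimator
        self.n_estimators = n_estimators

    def fit(self, X, y):
        # Minimal validation: only the number of samples is checked.
        X, y = check_n_samples(X, y)
        self.estimators_ = [
            self.make_estimator().fit(X, y) for _ in range(self.n_estimators)
        ]
        return self

    def predict(self, X):
        # Average the per-sample predictions of all fitted estimators.
        all_preds = [est.predict(X) for est in self.estimators_]
        return [sum(col) / len(col) for col in zip(*all_preds)]


# Two samples, each a 2x2 nested list (the analogue of a 3-D array).
X = [[[1, 2], [3, 4]], [[0, 0], [0, 1]]]
y = [10.0, 1.0]
model = AveragingWrapper(SumRegressor, n_estimators=3).fit(X, y)
predictions = model.predict(X)
```

Because the wrapper never asks whether each sample is flat, any base estimator that understands nested input can be plugged in unchanged, while a length mismatch between `X` and `y` is still rejected up front.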