Skip to content

Conversation

ArturoAmorQ
Copy link
Member

Reference Issues/PRs

What does this implement/fix? Explain your changes.

DecisionBoundaryDisplay expects the input data X to be of shape (n_samples, 2), but this is not ensured by an internal validation. Because of this, the following code will output an unintuitive ValueError message:

import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.inspection import DecisionBoundaryDisplay

rng = np.random.RandomState(0)
X = pd.DataFrame(np.random.randn(5, 4))
y = pd.Series(rng.randint(0, 2, 5))

model = KNeighborsClassifier()
model.fit(X, y)

DecisionBoundaryDisplay.from_estimator(estimator=model, X=X)

Even worse, a further

from sklearn.ensemble import HistGradientBoostingClassifier

model = HistGradientBoostingClassifier()
model.fit(X, y)

DecisionBoundaryDisplay.from_estimator(estimator=model, X=X)

makes a plot!

This PR is a simple fix to solve the issue.

Any other comments?

A meta comment: I am wondering if we could add a parameter column_values (or similar) that accepts dictionaries to fix values for n_features - 2 of the variables, allowing then for surface curves of the decision boundary display.

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM.

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @ArturoAmorQ !

@glemaitre glemaitre changed the title MAINT Array validation for DecisionBoundaryDisplay FIX Array validation for DecisionBoundaryDisplay Dec 1, 2022
@glemaitre
Copy link
Member

Actually, I would consider this as a fix and we should also add an entry in the changelog.
We can make it for the 1.2.0 release.

@glemaitre glemaitre added this to the 1.2 milestone Dec 1, 2022
@glemaitre glemaitre added the To backport PR merged in master that need a backport to a release branch defined based on the milestone. label Dec 1, 2022
Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @ArturoAmorQ

@ArturoAmorQ
Copy link
Member Author

Any idea of the reason for the Windows install fail?

@jeremiedbb
Copy link
Member

It's failing everywhere, but we did not find out why yet

@jeremiedbb jeremiedbb merged commit c047241 into scikit-learn:main Dec 2, 2022
@ArturoAmorQ ArturoAmorQ deleted the DecisionBoundaryDisplay branch December 2, 2022 12:44
jeremiedbb pushed a commit to jeremiedbb/scikit-learn that referenced this pull request Dec 6, 2022
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module:inspection To backport PR merged in master that need a backport to a release branch defined based on the milestone.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants