Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+2] MultiOutputClassifier classes_ attribute #14629

Merged
merged 10 commits into from Aug 14, 2019

Conversation

agamemnonc
Copy link
Contributor

@agamemnonc agamemnonc commented Aug 12, 2019

Reference Issues/PRs

Fixes #14615.

What does this implement/fix? Explain your changes.

MultiOutputClassifier class now has attribute classes_ (same as ClassifierChain)

Any other comments?

@agamemnonc
Copy link
Contributor Author

@agamemnonc agamemnonc commented Aug 12, 2019

Shall we include a non-regression test checking the classes_ of a fitted MultiOutputClassifier?

Copy link
Member

@jnothman jnothman left a comment

Please do extend a test to check this.

@agamemnonc
Copy link
Contributor Author

@agamemnonc agamemnonc commented Aug 12, 2019

Please do extend a test to check this.

OK done.

Please let me know if anything else is needed. Otherwise please review and change PR title accordingly.

@agamemnonc agamemnonc changed the title Multioutput classes fix MultiOutputClassifier classes_ attribute fix Aug 12, 2019
@agamemnonc agamemnonc changed the title MultiOutputClassifier classes_ attribute fix MultiOutputClassifier classes_ attribute Aug 12, 2019
sklearn/tests/test_multioutput.py Outdated Show resolved Hide resolved
@agamemnonc
Copy link
Contributor Author

@agamemnonc agamemnonc commented Aug 13, 2019

On a side note, the dataset used in most of the tests is created globally, would it not be better practice to have that wrapped in a function definition, then call that function from within all tests that use the particular dataset?

# Import the data
iris = datasets.load_iris()
# create a multiple targets by randomized shuffling and concatenating y.
X = iris.data
y1 = iris.target
y2 = shuffle(y1, random_state=1)
y3 = shuffle(y1, random_state=2)
y = np.column_stack((y1, y2, y3))
n_samples, n_features = X.shape
n_outputs = y.shape[1]
n_classes = len(np.unique(y1))
classes = list(map(np.unique, (y1, y2, y3)))

@glemaitre
Copy link
Contributor

@glemaitre glemaitre commented Aug 13, 2019

On a side note, the dataset used in most of the tests is created globally, would it not be better practice to have that wrapped in a function definition, then call that function from within all tests that use the particular dataset?

We could use the pytest.fixture instead. But I would say that this is out of the scope for this PR. We might need to have something consistent across the source code then such that this is easy to know where the fixtures are defined etc. However, this is a good remark

@agamemnonc
Copy link
Contributor Author

@agamemnonc agamemnonc commented Aug 13, 2019

On a side note, the dataset used in most of the tests is created globally, would it not be better practice to have that wrapped in a function definition, then call that function from within all tests that use the particular dataset?

We could use the pytest.fixture instead. But I would say that this is out of the scope for this PR. We might need to have something consistent across the source code then such that this is easy to know where the fixtures are defined etc. However, this is a good remark

Agreed. Feel free to open a separate issue then if you wish.

Copy link
Contributor

@glemaitre glemaitre left a comment

Couple of comments but it looks good

sklearn/multioutput.py Outdated Show resolved Hide resolved
sklearn/multioutput.py Outdated Show resolved Hide resolved
sklearn/multioutput.py Outdated Show resolved Hide resolved
sklearn/multioutput.py Outdated Show resolved Hide resolved
sklearn/tests/test_multioutput.py Outdated Show resolved Hide resolved
Copy link
Member

@NicolasHug NicolasHug left a comment

Nit but LGTM, thanks @agamemnonc

doc/whats_new/v0.22.rst Outdated Show resolved Hide resolved
Copy link
Contributor

@glemaitre glemaitre left a comment

LGTM

sklearn/tests/test_multioutput.py Outdated Show resolved Hide resolved
@glemaitre
Copy link
Contributor

@glemaitre glemaitre commented Aug 13, 2019

Could you also merge master into your branch such that we check that the CI for the doc is passing.
It seems just an issue that we earlier fixed

@agamemnonc
Copy link
Contributor Author

@agamemnonc agamemnonc commented Aug 14, 2019

Thanks @glemaitre @NicolasHug and @jnothman for feedback.

I think this is now ready.

@agamemnonc agamemnonc changed the title MultiOutputClassifier classes_ attribute [MRG+2] MultiOutputClassifier classes_ attribute Aug 14, 2019
@glemaitre glemaitre merged commit 06dc44c into scikit-learn:master Aug 14, 2019
16 of 17 checks passed
@glemaitre
Copy link
Contributor

@glemaitre glemaitre commented Aug 14, 2019

@agamemnonc Thanks for your contribution

@agamemnonc agamemnonc deleted the multioutput_classes_fix branch Aug 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants