Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Fix KernelCenterer requires square input but doesn't error on non-square #14336

Merged
merged 27 commits into from Jul 30, 2019

Conversation

@gdex1
Copy link
Contributor

gdex1 commented Jul 13, 2019

Reference Issues/PRs

Fix #14240

What does this implement/fix? Explain your changes.

KernelCenterer.fit will now raise a ValueError when called on a non-square kernel matrix.

Added pairwise_estimator_convert_X to check_fit2d_1sample and check_transformer_data_not_an_array as described by #14240 (comment).

Added logical check X.T != X to assert_raises statement in _check_transformer. Previously, _check_transformer assumed that X.T had a different number of features than X. However, if pairwise_estimator_convert_X is called where transformer is pairwise, then X.T == X therefore no error ValueError will be raised so the check is invalid.

  • Adding pairwise_estimator_convert_X to check_transformer_general caused two tests to fail through _check_transformer assert_raises statement that "The transformer {} does not raise an error when the number of features in transform is different from the number of features in fit."

Any other comments?

Fitting KernelCenterer on an nxn matrix and transforming a kxk matrix where n!=k is not currently tested by any unit test. Doing this manually raises the errorValueError: operands could not be broadcast together with shapes (k,k) (n,) (k,k). _check_transformer could be modified to transpose X before pairwise_estimator_convert_X is called to make the test work on pairwise estimators. KernelCenterer would need to be modified to give a more useful error message.

@gdex1 gdex1 closed this Jul 13, 2019
@gdex1 gdex1 reopened this Jul 13, 2019
@gdex1

This comment has been minimized.

Copy link
Contributor Author

gdex1 commented Jul 13, 2019

Do the proposed changes in the additional comments section seem like the appropriate ways to handle these issues?

I'm new to open source so any feedback on my solutions/code/workflow/PR is welcome. Thanks.

@amueller

This comment has been minimized.

Copy link
Member

amueller commented Jul 13, 2019

Thanks. Your proposed solution makes sense but leaves that part of the code untested as I mentioned above so I suggested an alternative.

@gdex1 gdex1 force-pushed the gdex1:Fix-issue-14240 branch from 2e474c9 to 37862ac Jul 17, 2019
gdex1 added 9 commits Jul 17, 2019
Update
@gdex1

This comment has been minimized.

Copy link
Contributor Author

gdex1 commented Jul 18, 2019

I've made the change mentioned #14240 and added a test to check the change. I am not sure if I structured the test correctly so could someone check it?

I think that possibly some other tests and errors need to be added to KernelCenterer. Am I correct that a new issue would be a better place to discuss it?

@gdex1 gdex1 changed the title [WIP] Fix KernelCenterer requires square input but doesn't error on non-square [MRG] Fix KernelCenterer requires square input but doesn't error on non-square Jul 18, 2019
@adrinjalali

This comment has been minimized.

Copy link
Member

adrinjalali commented Jul 19, 2019

It's kinda expected than when you add a test to the common tests, it detects issues with a couple of estimators that we were not detecting before. For the ones which are easy to fix, you can fix them, and for the rest, apply the changes you have done here, and we'll see together why the other tests are failing.

@gdex1

This comment has been minimized.

Copy link
Contributor Author

gdex1 commented Jul 19, 2019

I think everything is complete now. In total I added:
-Value error when KernelCenterer gets non-square input
-Unit test to check pairwise estimators raise an error on non-square input
-Value error when KNeighborsRegressor(metric='precomputed') get non square input

and modified some tests that gave non-square input to pairwise estimators

@gdex1 gdex1 force-pushed the gdex1:Fix-issue-14240 branch from a1fea8b to d869bbd Jul 19, 2019
@adrinjalali

This comment has been minimized.

Copy link
Member

adrinjalali commented Jul 22, 2019

Otherwise LGTM!

Copy link
Member

adrinjalali left a comment

LGTM, thanks @gdex1

Copy link
Member

jnothman left a comment

This is looking good. The nearest neighbour changes might be duplicated in #10482 but that's fine.

sklearn/utils/tests/test_estimator_checks.py Outdated Show resolved Hide resolved
sklearn/utils/tests/test_estimator_checks.py Outdated Show resolved Hide resolved
sklearn/utils/estimator_checks.py Outdated Show resolved Hide resolved
gdex1 and others added 3 commits Jul 24, 2019
Co-Authored-By: Joel Nothman <joel.nothman@gmail.com>
… into Fix-issue-14240
@jnothman

This comment has been minimized.

Copy link
Member

jnothman commented Jul 24, 2019

Tests failing. I've not checked why

@gdex1

This comment has been minimized.

Copy link
Contributor Author

gdex1 commented Jul 24, 2019

@jnothman Hi. If I'm interpreting the failed tests correctly, the problem is that estimator_checks.py imports pytest for pytest.raises as described in #14336 (comment). I am not exactly sure how to fix this as other places in the repository seem to import pytest without causing an error.

Edit: Also I was wondering what the best way to test my code is before I push?

@jnothman

This comment has been minimized.

Copy link
Member

jnothman commented Jul 25, 2019

Ahh right. For good or bad, we can't use pytest.raises in estimator_checks.py as it is considered public sklearn API and sklearn does not have a formal dependency on pytest (only its test suite does). So please revert to using assert_raise*

@gdex1 gdex1 changed the title [MRG] Fix KernelCenterer requires square input but doesn't error on non-square [MRG+1] Fix KernelCenterer requires square input but doesn't error on non-square Jul 26, 2019
Copy link
Member

jnothman left a comment

Please add an entry to the change log at doc/whats_new/v0.22.rst under the Changes to Estimator Checks section. Like the other entries there, please reference this pull request with :pr: and credit yourself (and other contributors if applicable) with :user:

@gdex1

This comment has been minimized.

Copy link
Contributor Author

gdex1 commented Jul 28, 2019

@jnothman I've added document changes, but I am not completely sure if it is correct. I also noticed that the modules are not in alphabetical order.

@amueller

This comment has been minimized.

Copy link
Member

amueller commented Jul 30, 2019

awesome, thanks!

@amueller amueller merged commit 86c5998 into scikit-learn:master Jul 30, 2019
17 checks passed
17 checks passed
LGTM analysis: C/C++ No code changes detected
Details
LGTM analysis: JavaScript No code changes detected
Details
LGTM analysis: Python No new or fixed alerts
Details
ci/circleci: deploy Your tests passed on CircleCI!
Details
ci/circleci: doc Your tests passed on CircleCI!
Details
ci/circleci: doc-min-dependencies Your tests passed on CircleCI!
Details
ci/circleci: lint Your tests passed on CircleCI!
Details
codecov/patch 100% of diff hit (target 96.68%)
Details
codecov/project 96.79% (+0.11%) compared to 3896d49
Details
scikit-learn.scikit-learn Build #20190728.25 succeeded
Details
scikit-learn.scikit-learn (Linux py35_conda_openblas) Linux py35_conda_openblas succeeded
Details
scikit-learn.scikit-learn (Linux py35_ubuntu_atlas) Linux py35_ubuntu_atlas succeeded
Details
scikit-learn.scikit-learn (Linux pylatest_conda_mkl_pandas) Linux pylatest_conda_mkl_pandas succeeded
Details
scikit-learn.scikit-learn (Linux32 py35_ubuntu_atlas_32bit) Linux32 py35_ubuntu_atlas_32bit succeeded
Details
scikit-learn.scikit-learn (Windows py35_pip_openblas_32bit) Windows py35_pip_openblas_32bit succeeded
Details
scikit-learn.scikit-learn (Windows py37_conda_mkl) Windows py37_conda_mkl succeeded
Details
scikit-learn.scikit-learn (macOS pylatest_conda_mkl) macOS pylatest_conda_mkl succeeded
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.