New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] Arbitrary SVC kernels #11296
[MRG] Arbitrary SVC kernels #11296
Conversation
And yet it seems you already have test failures :( Let us know if you need help |
8e5bf21
to
f324b1f
Compare
@@ -125,7 +126,7 @@ def test_precomputed(): | |||
|
|||
kfunc = lambda x, y: np.dot(x, y.T) | |||
clf = svm.SVC(kernel=kfunc) | |||
clf.fit(X, Y) | |||
clf.fit(np.array(X), Y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
basically here it's an API change. Before for a custom kernel we would convert X to array. I propose to delegate the input checking to kernel function.
@@ -980,7 +981,7 @@ def test_svc_bad_kernel(): | |||
def test_timeout(): | |||
a = svm.SVC(kernel=lambda x, y: np.dot(x, y.T), probability=True, | |||
random_state=0, max_iter=1) | |||
assert_warns(ConvergenceWarning, a.fit, X, Y) | |||
assert_warns(ConvergenceWarning, a.fit, np.array(X), Y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here. Due to API change.
sklearn/svm/tests/test_svm.py
Outdated
assert svc1.score(data, y) == svc3.score(K, y) | ||
assert svc1.score(data, y) == svc2.score(X, y) | ||
if hasattr(svc1, 'decision_function'): # classifier | ||
assert_array_almost_equal(svc1.decision_function(data), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use assert_allclose
instead of assert_array_almost_equal
sklearn/svm/_base.py
Outdated
|
||
if self.kernel == "precomputed": | ||
if X.shape[1] != self.shape_fit_[0]: | ||
raise ValueError("X.shape[1] = %d should be equal to %d, " | ||
"the number of samples at training time" % | ||
(X.shape[1], self.shape_fit_[0])) | ||
elif n_features != self.shape_fit_[1]: | ||
elif not callable(self.kernel) and n_features != self.shape_fit_[1]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If n_features
is only used here, let's be explicit:
elif not callable(self.kernel) and n_features != self.shape_fit_[1]: | |
elif not callable(self.kernel) and X.shape[1] != self.shape_fit_[1]: |
sklearn/svm/_base.py
Outdated
|
||
if self.kernel == "precomputed": | ||
if X.shape[1] != self.shape_fit_[0]: | ||
raise ValueError("X.shape[1] = %d should be equal to %d, " | ||
"the number of samples at training time" % | ||
(X.shape[1], self.shape_fit_[0])) | ||
elif n_features != self.shape_fit_[1]: | ||
elif not callable(self.kernel) and n_features != self.shape_fit_[1]: | ||
raise ValueError("X.shape[1] = %d should be equal to %d, " | ||
"the number of features at training time" % | ||
(n_features, self.shape_fit_[1])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(n_features, self.shape_fit_[1])) | |
(X.shape[1], self.shape_fit_[1])) |
sklearn/svm/_base.py
Outdated
@@ -454,14 +467,15 @@ def _validate_for_predict(self, X): | |||
raise ValueError( | |||
"cannot use sparse input in %r trained on dense data" | |||
% type(self).__name__) | |||
n_samples, n_features = X.shape | |||
if not callable(self.kernel): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove this assignment then
sklearn/svm/_base.py
Outdated
raise ValueError("sample_weight and X have incompatible shapes: " | ||
"%r vs %r\n" | ||
"Note: Sparse matrices cannot be indexed w/" | ||
"boolean masks (use `indices=True` in CV)." | ||
% (sample_weight.shape, X.shape)) | ||
|
||
if isinstance(self.gamma, str): | ||
kernel = self.kernel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kernel = self.kernel | |
kernel = 'precomputed' if callable(self.kernel) else self.kernel |
doc/whats_new/v0.23.rst
Outdated
kernels in :class:`svm.SVC` and :class:`svm.SVR`. | ||
:pr:`11296` by `Alexandre Gramfort`_ and :user:`Georgi Peev <georgipeev>`. | ||
|
||
- |API| Do not enforce float entries in X when using custom kernel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could add this change in Change of behaviour
. I think this is fine since it could be considered as bug fix but better to document it as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I thought I posted this yesterday. Hopefully not out of date
doc/whats_new/v0.23.rst
Outdated
kernels in :class:`svm.SVC` and :class:`svm.SVR`. | ||
:pr:`11296` by `Alexandre Gramfort`_ and :user:`Georgi Peev <georgipeev>`. | ||
|
||
- |API| Do not enforce float entries in X when using custom kernel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand "API" to mean "there's a new way to do something I could do before". I'm not sure it quite fits here
@@ -185,6 +185,14 @@ Changelog | |||
`probB_`, are now deprecated as they were not useful. :pr:`15558` by | |||
`Thomas Fan`_. | |||
|
|||
- |Fix| Fix use of custom kernel not taking float entries such as string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be regarded as an enhancement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well it was support to work but it was not tested. I prefer to see this as a bug fix.
order='C', accept_sparse='csr', | ||
accept_large_sparse=False) | ||
if callable(self.kernel): | ||
check_consistent_length(X, y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now do we have a backward incompatibility since before a kernel would have been passed a validated float array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with that level of incompatibility though perhaps it deserves a note in the change log
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is mentioned in section Changed models
e649ab3
to
638b989
Compare
@jnothman feel free to merge. I tried to address your comments. |
Thanks @georgipeev |
Reference Issues/PRs
Fixes #10713
What does this implement/fix? Explain your changes.
Removed non-applicable input validation when a custom kernel is used. Allowed specifying X as a list when using custom kernels.
Any other comments?
WIP until I add a meaningful unit test aside from the current one that merely checks that no errors arise when using custom kernels.