ENH scipy blas for svm kernel function #16530
Conversation
@rth @jeremiedbb The failing test is probably caused by the precision of ATLAS, which I'm trying to reproduce. Since my code passes the failing test when scipy uses OpenBLAS, could you give me some suggestions by reviewing my patch first? Could you also tell me which ATLAS version the Linux32 py36_ubuntu_atlas_32bit job uses?
Thanks for doing this @jim0421! I am not able to review this PR in detail today, but the general approach sounds good. For the failing test, we could try to reproduce the failure by installing the dependencies in a 32-bit Docker image.
Hi @rth, I've reproduced the problem on i386/ubuntu:18.04. However, the problem does not appear on the x86-64 ubuntu:bionic platform, so it narrows down to a platform-dependent issue.
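For reference, a quick check (purely illustrative, not part of the patch) of the Python build bitness and of the BLAS/LAPACK libraries scipy is built against, which is what distinguishes the ATLAS 32-bit CI job from an OpenBLAS x86-64 environment:

```python
# Illustrative check: Python build bitness and scipy's BLAS/LAPACK build info.
import platform
import scipy

print(platform.architecture()[0])  # '32bit' for an i386 Python build, '64bit' otherwise
scipy.show_config()                # lists the BLAS/LAPACK libraries scipy was built against
```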
Looking forward to your advice, @rth.
Sorry @jim0421, I don't have much availability to investigate the failure in detail at the moment. This addition is great though, and personally I would be very happy to see it merged. It could take some time until someone takes a more detailed look. Maybe @jeremiedbb would have some availability, not sure.
Really glad to get your reply, @rth @jeremiedbb. Setting my patch aside, I found a simple bug in svm.cpp which can only be reproduced on 32-bit Ubuntu with those dependencies. Here is the code (only 3 lines).
And the bug report is as follows.
It really confuses me and I hope for your help.
Hi @rth, are you free these days? Could you try this simple change on 32-bit Ubuntu?
@jim0421 we still need to make the CI green in order to merge. Would increasing the tolerances a bit (specifically for the failing case and 32-bit Linux) fix it, say 4-5 digits instead of 6? That could be a possibility. I can't see the log for the failing CI job for some reason. I don't really understand why the diff in #16530 (comment) could lead to a test failure.
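For illustration, relaxing the comparison to 4 decimal digits with numpy's assert_array_almost_equal would look roughly like this (the arrays below are made-up placeholders, not values from the failing test):

```python
# Placeholder arrays that agree to ~1e-4 but not to 1e-6.
import numpy as np
from numpy.testing import assert_array_almost_equal

a = np.array([0.123456, 1.000001])
b = np.array([0.123449, 1.000049])

# Passes with decimal=4 (tolerance ~1.5e-4) but would fail with the default decimal=6.
assert_array_almost_equal(a, b, decimal=4)
```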
Also please add an entry to the change log at doc/whats_new/v0.23.rst.
@rth If you increase the tolerance to 4 digits after the decimal point for the failing test, the precision problem is eliminated. I've also dug deeper into the precision problem; the details are below.
In my experiments, adding any system or library call (sleep, mmap, printf, fopen, or the scipy BLAS API) inside the dot function changes the results, and the precision actually improves with the call added, as shown in my earlier reply. From the assembly, I can see that without such a call the dot function is inlined and optimized into the surrounding calculation, which is what produces the different result.
I suppose this is a precision issue in kernel_rbf in svm.cpp with the patch applied on a 32-bit Ubuntu system, and it can be reproduced reliably.
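As a rough numpy sketch (not the svm.cpp code itself, and with made-up magnitudes): a one-ulp change in the dot product, like the one between the inlined and non-inlined code paths described above, becomes a relative change of roughly 2 * gamma ulps in the RBF kernel value, and such deviations can accumulate further downstream in the solver:

```python
# Two dot-product values one ulp apart, mimicking the deviation between code paths.
import numpy as np

d = 12.3456789
d_eps = np.nextafter(d, np.inf)       # next representable double above d

xx, yy, gamma = 20.0, 20.0, 10.0      # made-up magnitudes for illustration
k1 = np.exp(-gamma * (xx - 2 * d + yy))      # exp(-gamma * ||x - y||^2) form of the RBF kernel
k2 = np.exp(-gamma * (xx - 2 * d_eps + yy))

print(d_eps - d)        # ~1.8e-15: a single ulp at this magnitude
print((k2 - k1) / k1)   # ~2 * gamma * (d_eps - d): the ulp is scaled by 2 * gamma
```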
Ahh, so the failing test is in test_bagging.py. Independently, we still need to make sure we have tests that check that the sparse and dense code paths give consistent results. BTW, as far as I can tell we don't actually check this in the common tests.
Wow, thanks! I don't think we often have contributors digging that deep :). Overall, the fact that the optimized version produces slightly different output from the non-optimized case doesn't seem too unexpected to me. It looks like this deviation just gets amplified when used with the RBF kernel.
doc/whats_new/v0.23.rst
- |Enhancement| invoke scipy blas api for svm kernel function in ``fit``,
  ``predict``, ``predict_proba``, ``decision_function``, ``cross_validation``,
  ``libsvm_sparse_train``, ``libsvm_sparse_predict``,
  ``libsvm_sparse_predict_proba`` and ``libsvm_sparse_decision_function``
The libsvm_* functions are not part of the public API, so that only leaves the predict, predict_proba and decision_function methods I think. Also, wouldn't nuSVR and nuSVC (and possibly OneClassSVM) also be affected?
I've checked the code in _classes.py and svm.cpp. nuSVR, nuSVC and OneClassSVM are also affected. Besides, I think fit is also a public API; I would like you to confirm.
Great. Yes sure fit as well.
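For reference, a small illustrative sketch (not a new test) of the public entry points discussed above, i.e. the estimators and methods whose kernel computations go through the new code path; it assumes scaled iris data, as suggested later in this thread:

```python
# Public API calls whose kernel evaluations are affected by this change.
from sklearn.datasets import load_iris
from sklearn.preprocessing import scale
from sklearn.svm import SVC, NuSVC, NuSVR, OneClassSVM

X, y = load_iris(return_X_y=True)
X = scale(X)  # SVMs expect scaled inputs

clf = SVC(kernel="rbf", probability=True).fit(X, y)  # fit
clf.predict(X)            # predict
clf.predict_proba(X)      # predict_proba
clf.decision_function(X)  # decision_function

NuSVC().fit(X, y).predict(X)                 # nu-variants are affected as well
NuSVR().fit(X, y.astype(float)).predict(X)
OneClassSVM().fit(X).decision_function(X)
```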
@rth It's a nice suggestion to use that approach. Here is the result.
@jim0421 How about LinearSVC? It's from liblinear instead of libsvm...
@rth Applying that modification eliminates the precision problem on 32-bit Ubuntu. However, some data in the loop seems to be problematic, as shown at the end of this comment.
This modification brings a convergence warning.
Strange data.
OK, one problem is that the data is not scaled, while it should be for SVMs. Once it's scaled, the linear kernel works fine.

diff --git a/sklearn/ensemble/tests/test_bagging.py b/sklearn/ensemble/tests/test_bagging.py
index 883f0067f..92f50adde 100644
--- a/sklearn/ensemble/tests/test_bagging.py
+++ b/sklearn/ensemble/tests/test_bagging.py
@@ -4,9 +4,11 @@ Testing for the bagging ensemble module (sklearn.ensemble.bagging).
# Author: Gilles Louppe
# License: BSD 3 clause
+from itertools import product
import numpy as np
import joblib
+import pytest
from sklearn.base import BaseEstimator
@@ -31,7 +33,7 @@ from sklearn.feature_selection import SelectKBest
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_boston, load_iris, make_hastie_10_2
from sklearn.utils import check_random_state
-from sklearn.preprocessing import FunctionTransformer
+from sklearn.preprocessing import FunctionTransformer, scale
from scipy.sparse import csc_matrix, csr_matrix
@@ -76,8 +78,31 @@ def test_classification():
random_state=rng,
**params).fit(X_train, y_train).predict(X_test)
-
-def test_sparse_classification():
+@pytest.mark.parametrize(
+ 'sparse_format, params, method',
+ product(
+ [csc_matrix, csr_matrix],
+ [{
+ "max_samples": 0.5,
+ "max_features": 2,
+ "bootstrap": True,
+ "bootstrap_features": True
+ }, {
+ "max_samples": 1.0,
+ "max_features": 4,
+ "bootstrap": True,
+ "bootstrap_features": True
+ }, {
+ "max_features": 2,
+ "bootstrap": False,
+ "bootstrap_features": True
+ }, {
+ "max_samples": 0.5,
+ "bootstrap": True,
+ "bootstrap_features": False
+ }],
+ ['predict', 'predict_proba', 'predict_log_proba', 'decision_function']))
+def test_sparse_classification(sparse_format, params, method):
# Check classification for various parameter settings on sparse input.
class CustomSVC(SVC):
@@ -89,52 +114,33 @@ def test_sparse_classification():
return self
rng = check_random_state(0)
- X_train, X_test, y_train, y_test = train_test_split(iris.data,
+ X_train, X_test, y_train, y_test = train_test_split(scale(iris.data),
iris.target,
random_state=rng)
- parameter_sets = [
- {"max_samples": 0.5,
- "max_features": 2,
- "bootstrap": True,
- "bootstrap_features": True},
- {"max_samples": 1.0,
- "max_features": 4,
- "bootstrap": True,
- "bootstrap_features": True},
- {"max_features": 2,
- "bootstrap": False,
- "bootstrap_features": True},
- {"max_samples": 0.5,
- "bootstrap": True,
- "bootstrap_features": False},
- ]
- for sparse_format in [csc_matrix, csr_matrix]:
- X_train_sparse = sparse_format(X_train)
- X_test_sparse = sparse_format(X_test)
- for params in parameter_sets:
- for f in ['predict', 'predict_proba', 'predict_log_proba', 'decision_function']:
- # Trained on sparse format
- sparse_classifier = BaggingClassifier(
- base_estimator=CustomSVC(decision_function_shape='ovr'),
- random_state=1,
- **params
- ).fit(X_train_sparse, y_train)
- sparse_results = getattr(sparse_classifier, f)(X_test_sparse)
-
- # Trained on dense format
- dense_classifier = BaggingClassifier(
- base_estimator=CustomSVC(decision_function_shape='ovr'),
- random_state=1,
- **params
- ).fit(X_train, y_train)
- dense_results = getattr(dense_classifier, f)(X_test)
- assert_array_almost_equal(sparse_results, dense_results)
-
- sparse_type = type(X_train_sparse)
- types = [i.data_type_ for i in sparse_classifier.estimators_]
-
- assert all([t == sparse_type for t in types])
+ X_train_sparse = sparse_format(X_train)
+ X_test_sparse = sparse_format(X_test)
+ # Trained on sparse format
+ sparse_classifier = BaggingClassifier(
+ base_estimator=CustomSVC(kernel="linear", decision_function_shape='ovr'),
+ random_state=1,
+ **params
+ ).fit(X_train_sparse, y_train)
+ sparse_results = getattr(sparse_classifier, method)(X_test_sparse)
+
+ # Trained on dense format
+ dense_classifier = BaggingClassifier(
+ base_estimator=CustomSVC(kernel="linear", decision_function_shape='ovr'),
+ random_state=1,
+ **params
+ ).fit(X_train, y_train)
+ dense_results = getattr(dense_classifier, method)(X_test)
+ assert_array_almost_equal(sparse_results, dense_results)
+
+ sparse_type = type(X_train_sparse)
+ types = [i.data_type_ for i in sparse_classifier.estimators_]
+
+ assert all([t == sparse_type for t in types])

I also parametrized that test with pytest to make it easier to see which case fails. Previously, the check on the decision function ran for all values of the bagging parameters inside a single test.
@rth All related tests have passed after applying your patch for test_bagging.py.
Great, can you commit it?
…cation in test_bagging.py
Thanks, and just do it whenever it's convenient for you; there's no rush.
I like how m_blas is now a normal class attribute. Thank you @jim0421!
LGTM
Thank you for working on this @jim0421!
I really appreciate working with you, and thanks for your suggestions and guidance. @thomasjpfan @rth @jeremiedbb
Reference Issues/PRs
This PR is a follow-up to #15962.
What does this implement/fix? Explain your changes.
In the old PR above, I proposed an AVX512 version of the SVM kernel functions dot
and k_function, which gives around a 40% improvement on our CLX machine on the MLpack benchmark.
However, @rth mentioned that writing AVX512 code without run-time detection
is of limited use since most users don't build packages from source
(with custom compile flags). Accordingly, I chose to replace my AVX512 implementation
of the SVM kernel functions with the scipy BLAS API, as suggested by @jeremiedbb. My implementation
is similar to that in liblinear, which is to pass a pointer to the BLAS function
used by the kernel::dot method. Please help review the patch, and thanks for your great advice.
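For illustration only, here is the Python-level face of the BLAS routine in question; the actual patch stays at the C/Cython level, presumably obtaining the corresponding function pointer from scipy (e.g. via scipy.linalg.cython_blas) and handing it to svm.cpp, but the numerical result is the same ddot:

```python
# Calling scipy's BLAS ddot from Python, compared against a plain numpy dot.
import numpy as np
from scipy.linalg.blas import ddot

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

print(ddot(x, y))    # 32.0, computed by the BLAS library scipy is linked against
print(np.dot(x, y))  # reference value
```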
Any other comments?
The test data is attached.
profile final.xlsx
The precision and recall file is also attached.
training accuracy.xlsx