
ENH scipy blas for svm kernel function #16530

Merged

Conversation

@jim0421 (Contributor) commented Feb 24, 2020

Reference Issues/PRs

This PR is follow-up work to #15962.

What does this implement/fix? Explain your changes.

In the old PR above, I proposed an AVX512 version of the SVM kernel functions dot
and k_function, which gave around a 40% speedup on our CLX machine on the MLpack benchmark.
However, @rth mentioned that writing AVX512 code without run-time detection
is of limited use, since most users don't build packages from source
(with custom compile flags). Accordingly, I chose to replace my AVX512 implementation
of the SVM kernel functions with the scipy BLAS API, as suggested by @jeremiedbb. My implementation
is similar to the one in liblinear: a pointer to the BLAS function is passed to the kernel::dot
method. Please help review the patch, and thanks for your great advice.
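
For readers unfamiliar with the approach, here is a minimal Python-level sketch of the idea (my own illustration, not code from this PR): the hand-written dot loop in libsvm is replaced by a call to the double-precision BLAS dot routine that scipy ships; in the actual patch, a pointer to the BLAS function is passed into the C++ code as described above.

import numpy as np
from scipy.linalg.blas import ddot  # double-precision BLAS dot product

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# The BLAS routine computes the same quantity as the hand-written loop,
# but through the optimized kernels of whatever BLAS scipy links against.
assert ddot(x, y) == np.dot(x, y)  # both give 32.0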

Any other comments?

The test data is attached: profile final.xlsx
The precision and recall results are also attached: training accuracy.xlsx

@jim0421 (Contributor, Author) commented Feb 25, 2020

@rth @jeremiedbb The failing test is probably caused by the precision of ATLAS, which I'm trying to reproduce. Since my code passes the failing test when scipy uses OpenBLAS, could you give me some suggestions by reviewing my patch first? Could you also tell me which ATLAS version the Linux32 py36_ubuntu_atlas_32bit job uses?
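
As a side note for anyone comparing environments: a generic way (not specific to this CI job) to check which BLAS a given scipy build links against is

import scipy

# Prints the BLAS/LAPACK build information for this scipy install,
# e.g. OpenBLAS vs ATLAS, which is exactly what differs between the CI jobs.
scipy.show_config()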

@rth (Member) commented Feb 25, 2020

Thanks for doing this @jim0421! I am not able to review this PR in detail today, but the general approach sounds good.

For the failing test, we could try to reproduce the failure by installing the dependencies in an i386/ubuntu:18.04 docker image. But it might also be acceptable to increase the tolerance from 6 significant digits to 5 for 32-bit (if the test only passes with 4 significant digits, that's a bit more problematic).
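
For concreteness, the tolerance in question is the decimal argument of numpy's assert_array_almost_equal; a generic illustration with made-up numbers:

import numpy as np
from numpy.testing import assert_array_almost_equal

a = np.array([1.000004, 2.0])
b = np.array([1.000006, 2.0])

# Passes: the arrays differ by 2e-6, which is below the decimal=5
# threshold of 1.5 * 10**-5.
assert_array_almost_equal(a, b, decimal=5)

# Would raise AssertionError: 2e-6 exceeds the decimal=6 threshold
# of 1.5 * 10**-6.
# assert_array_almost_equal(a, b, decimal=6)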

@jim0421 (Contributor, Author) commented Feb 28, 2020

Hi @rth, I've reproduced the problem on i386/ubuntu:18.04. On an x86-64 ubuntu:bionic platform, however, the problem does not exist, so we can narrow this down to a platform-dependent problem.
Moreover, I tried to log all the inputs and outputs to a file, but the logging itself also causes the precision problem, even without my scipy BLAS patch, which is quite strange. You can check the code below: I've commented out all the fprintf calls in write_file and pointed open_file at /dev/null to avoid any IO influence.
Could you help debug this strange problem, since it's not directly related to my scipy BLAS patch?
Alternatively, I suggest skipping the ubuntu-32 testing for this patch, since 32-bit Ubuntu is not common these days.

Subject: [PATCH] simply add logging to file functionality

---
 svm.cpp | 43 +++++++++++++++++++++++++++++++++++++------
 1 file changed, 37 insertions(+), 6 deletions(-)

diff --git a/svm.cpp b/svm.cpp
index 9321340..cce1e76 100644
--- a/svm.cpp
+++ b/svm.cpp
@@ -294,11 +294,15 @@ public:
 		swap(x[i],x[j]);
 		if(x_square) swap(x_square[i],x_square[j]);
 	}
+	static void open_file();
+        static void close_file();
+        static void write_file(int type, double* arr1, double* arr2, int len, double sum);
 protected:
 
 	double (Kernel::*kernel_function)(int i, int j) const;
 
 private:
+	static FILE* m_file;
 #ifdef _DENSE_REP
 	PREFIX(node) *x;
 #else
@@ -396,6 +400,7 @@ double Kernel::dot(const PREFIX(node) *px, const PREFIX(node) *py)
 	int dim = min(px->dim, py->dim);
 	for (int i = 0; i < dim; i++)
 		sum += (px->values)[i] * (py->values)[i];
+	write_file(1, px->values, py->values, dim, sum);
 	return sum;
 }
 
@@ -406,6 +411,7 @@ double Kernel::dot(const PREFIX(node) &px, const PREFIX(node) &py)
 	int dim = min(px.dim, py.dim);
 	for (int i = 0; i < dim; i++)
 		sum += px.values[i] * py.values[i];
+	write_file(2, px.values, py.values, dim, sum);
 	return sum;
 }
 #else
@@ -431,6 +437,25 @@ double Kernel::dot(const PREFIX(node) *px, const PREFIX(node) *py)
 	return sum;
 }
 #endif
+void Kernel::open_file()
+{
+	/* Simply open and close file won't produce precision problem.
+	 * But the precision problem will appear if we add write_file,
+	 * though I've commented out everything. */
+        Kernel::m_file = fopen("/dev/null", "a");
+}
+
+void Kernel::close_file()
+{
+        fclose(Kernel::m_file);
+}
+
+void Kernel::write_file(int type, double* arr1, double* arr2, int len, double sum)
+{
+/*
+	I've commented out everything.
+*/
+}
 
 double Kernel::k_function(const PREFIX(node) *x, const PREFIX(node) *y,
 			  const svm_parameter& param)
@@ -446,11 +471,13 @@ double Kernel::k_function(const PREFIX(node) *x, const PREFIX(node) *y,
 			double sum = 0;
 #ifdef _DENSE_REP
 			int dim = min(x->dim, y->dim), i;
-			for (i = 0; i < dim; i++)
-			{
-				double d = x->values[i] - y->values[i];
-				sum += d*d;
-			}
+                        for (i = 0; i < dim; i++)
+                        {
+                                double d = x->values[i] - y->values[i];
+                                sum += d*d;
+                        }
+			write_file(3, x->values, y->values, dim, sum);
+
 			for (; i < x->dim; i++)
 				sum += x->values[i] * x->values[i];
 			for (; i < y->dim; i++)
@@ -508,7 +535,7 @@ double Kernel::k_function(const PREFIX(node) *x, const PREFIX(node) *y,
 			return 0;  // Unreachable 
 	}
 }
-
+FILE* NAMESPACE::Kernel::m_file=0;
 // An SMO algorithm in Fan et al., JMLR 6(2005), p. 1889--1918
 // Solves:
 //
@@ -2338,6 +2365,7 @@ static void remove_zero_weight(PREFIX(problem) *newprob, const PREFIX(problem) *
 PREFIX(model) *PREFIX(train)(const PREFIX(problem) *prob, const svm_parameter *param,
         int *status)
 {
+	NAMESPACE::Kernel::open_file();
 	PREFIX(problem) newprob;
 	remove_zero_weight(&newprob, prob);
 	prob = &newprob;
@@ -2615,6 +2643,7 @@ PREFIX(model) *PREFIX(train)(const PREFIX(problem) *prob, const svm_parameter *p
 	free(newprob.x);
 	free(newprob.y);
 	free(newprob.W);
+	NAMESPACE::Kernel::close_file();
 	return model;
 }
 
@@ -2787,6 +2816,7 @@ double PREFIX(get_svr_probability)(const PREFIX(model) *model)
 
 double PREFIX(predict_values)(const PREFIX(model) *model, const PREFIX(node) *x, double* dec_values)
 {
+	NAMESPACE::Kernel::open_file();
 	int i;
 	if(model->param.svm_type == ONE_CLASS ||
 	   model->param.svm_type == EPSILON_SVR ||
@@ -2868,6 +2898,7 @@ double PREFIX(predict_values)(const PREFIX(model) *model, const PREFIX(node) *x,
 		free(vote);
 		return model->label[vote_max_idx];
 	}
+	NAMESPACE::Kernel::close_file();
 }
 
 double PREFIX(predict)(const PREFIX(model) *model, const PREFIX(node) *x)
-- 

@jim0421 (Contributor, Author) commented Mar 3, 2020

Looking forward to your advice, @rth.

@rth (Member) commented Mar 5, 2020

Sorry @jim0421, I don't have much availability to investigate the failure in detail at the moment. This addition is great, though, and personally I would be very happy to see it merged. It could take some time until someone takes a more detailed look; maybe @jeremiedbb will have some availability, not sure.

@jim0421 (Contributor, Author) commented Mar 6, 2020

Really pleased to get your reply. @rth @jeremiedbb Setting my patch aside, I've found what looks like a simple bug in svm.cpp, which can only be reproduced on 32-bit Ubuntu with the project's dependencies installed. Here is the change (only 3 lines added).

From 3c903b79aba97f13225255f4b395066d3c90bc9f Mon Sep 17 00:00:00 2001
From: Shuahua Fan <mc_george123@hotmail.com>
Date: Fri, 6 Mar 2020 11:26:09 +0800
Subject: [PATCH] simple addition for logging to file

---
 sklearn/svm/src/libsvm/svm.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sklearn/svm/src/libsvm/svm.cpp b/sklearn/svm/src/libsvm/svm.cpp
index 9321340..09fa935 100644
--- a/sklearn/svm/src/libsvm/svm.cpp
+++ b/sklearn/svm/src/libsvm/svm.cpp
@@ -406,6 +406,9 @@ double Kernel::dot(const PREFIX(node) &px, const PREFIX(node) &py)
 	int dim = min(px.dim, py.dim);
 	for (int i = 0; i < dim; i++)
 		sum += px.values[i] * py.values[i];
+	FILE* m_file = fopen("/dev/null", "a");
+	fprintf(m_file, "test\n");
+	fclose(m_file);
 	return sum;
 }
 #else
-- 
2.17.1

And the bug report is as follows.

>>> tt.test_sparse_classification()
/usr/local/lib/python3.6/dist-packages/sklearn/ensemble/_bagging.py:783: RuntimeWarning: divide by zero encountered in log
  return np.log(self.predict_proba(X))
/usr/local/lib/python3.6/dist-packages/sklearn/ensemble/_bagging.py:783: RuntimeWarning: divide by zero encountered in log
  return np.log(self.predict_proba(X))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/mnt/ubuntu1804/scikit-learn-0.22.1-simple/sklearn/ensemble/tests/test_bagging.py", line 132, in test_sparse_classification
    assert_array_almost_equal(sparse_results, dense_results)
  File "/usr/lib/python3/dist-packages/numpy/testing/utils.py", line 962, in assert_array_almost_equal
    precision=decimal)
  File "/usr/lib/python3/dist-packages/numpy/testing/utils.py", line 778, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Arrays are not almost equal to 6 decimals

(mismatch 66.66666666666666%)
 x: array([[-0.219957,  0.984658,  2.232507],
       [-0.218276,  1.671745,  1.573845],
       [-0.223031,  1.780723,  1.525337],...
 y: array([[-0.219949,  0.984647,  2.232505],
       [-0.218277,  1.671742,  1.573849],
       [-0.223032,  1.780719,  1.52534 ],...

It really confuses me, and I hope for your help.

@jim0421 (Contributor, Author) commented Apr 1, 2020

@rth Hi, are you free these days? Could you try this simple change on ubuntu-32?

@jim0421 (Contributor, Author) commented Apr 3, 2020

@rth According to this article, mainstream Ubuntu no longer supports 32-bit installs. So can we just skip the 32-bit Ubuntu test for this patch?

@rth rth added this to the 0.23 milestone Apr 3, 2020
@rth rth added the Performance label Apr 3, 2020
@rth (Member) commented Apr 3, 2020

@jim0421 we still need to make CI green in order to merge. Would increasing the tolerance a bit (specifically for the failing case on 32-bit Linux) fix it, say 5 or 4 digits instead of 6? That could be a possibility. I can't see the log for the failing CI job for some reason.

I don't really understand why the diff in #16530 (comment) could lead to a test failure.

@rth (Member) commented Apr 3, 2020

Also please add an entry to the change log at doc/whats_new/v0.23.rst. Like the other entries there, please reference this pull request with :pr: and credit yourself with :user:.

@jim0421 (Contributor, Author) commented Apr 3, 2020

@rth If you increase the tolerance for the failing test to 4 digits after the decimal point, the precision problem is eliminated. I've also dug deep into the precision problem; the details are below.

file:svm.cpp
 335         double kernel_rbf(int i, int j) const
 336         {
 337                 return exp(-gamma*(x_square[i]+x_square[j]-2*dot(x[i],x[j])));
 338         }
In the version with the precision problem:
i = j
gamma is 0x3FEA45952FB3BC98
x_square[i] is 0x4048028F5C28F5C4
dot(x[i], x[j]) is 0x4048028F5C28F5C4
x_square[i]+x_square[j]-2*dot(x[i],x[j]) is 0
the function returns 0x3FF0000000000000 // this is the value 1

In the original version without the precision problem:
i = j
gamma is 0x3FEA45952FB3BC98
x_square[i] is 0x4048028F5C28F5C4
dot(x[i], x[j]) is 0x4048028F5C28F5C4
x_square[i]+x_square[j]-2*dot(x[i],x[j]) is 0x3CED700000000000 // a value almost equal to zero
the function returns 0x3FEFFFFFFFFFFFD0 // a value almost equal to one
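
To make these bit patterns concrete, they can be decoded as IEEE-754 doubles (a small helper of my own, not part of the patch):

import struct

def hex_to_double(h):
    # Interpret a big-endian hex bit pattern as an IEEE-754 double.
    return struct.unpack('>d', bytes.fromhex(h))[0]

print(hex_to_double('4048028F5C28F5C4'))  # x_square[i]  ~ 48.02
print(hex_to_double('3CED700000000000'))  # the residual ~ 3.3e-15, not exactly 0
print(hex_to_double('3FEFFFFFFFFFFFD0'))  # exp(...)     just below 1.0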

According to my experiments, adding any system call (sleep, mmap, printf, fopen, or the scipy BLAS API call) inside dot causes the precision difference, and the precision is actually improved by the added call, as you can see in my older reply. From the assembly, I can see that the dot function is not inlined into the whole calculation. The computation uses the x87 FPU (fldl/fstpl), so my reading (an assumption on my part, not verified) is that the call forces intermediate values to be spilled from the 80-bit x87 registers and rounded to 64-bit doubles, which would make dot(x[i], x[j]) bit-identical to the stored x_square[i] and the residual exactly zero.

       │     return exp(-gamma*(x_square[i]+x_square[j]-2*dot(x[i],x[j])));                         
 30.02 │       fldl   0x14(%esi)                                                                    
       │       mov    0x10(%esi),%eax                                                               
       │       fchs                                                                                 
       │       fstpl  0x20(%esp)                                                                   
 15.36 │       fldl   (%eax,%edi,8)                                                                 
       │       mov    0x58(%esp),%edi                                                               
  5.80 │       fldl   (%eax,%edi,8)                                                                 
       │       mov    0x4(%esi),%eax                                                                
       │       faddp  %st,%st(1)                                                                    
       │       add    %eax,%ebp                                                                     
       │       fstpl  0x28(%esp)                                                                    
       │       pop    %esi                                                                          
       │       pop    %edi                                                                          
       │       push   %ebp                                                                          
       │       add    0x18(%esp),%eax                                                               
       │       push   %eax                                                                          
       │     → call   svm::Kernel::dot                                                              
       │       fadd   %st(0),%st                                                                    
       │       fsubrl 0x28(%esp)                                                                    
       │       fmull  0x20(%esp)                                                                    
       │       fstpl  (%esp)                                                                        
       │     → call   exp@plt                                                                       
       │     }  

I suppose this is a precision bug in kernel_rbf in svm.cpp with the patch applied, on a 32-bit Ubuntu system, and it can be reproduced stably.
Waiting for your reply.

@rth (Member) commented Apr 3, 2020

Ahh, so the failing test is in ensemble/tests/test_bagging.py; it's meant to test BaggingClassifier, not SVC. SVC just happens to be used there. So I would try SVC(kernel='linear') or LinearSVC there and see if it passes. libsvm-based estimators are generally not the most reliable base estimators for meta-estimators, and I remember similar workarounds had to be applied in other meta-estimator tests.

Independently, we still need to make sure we have tests checking that SVC() produces consistent output on dense and sparse input. That's done in sklearn/svm/tests/test_sparse.py:test_svc, so we should be fine.

BTW, as far as I can tell we don't actually check this in common tests (e.g. in check_estimator_sparse_data); maybe we should, but that's a separate concern. cc @thomasjpfan @jnothman

> From the asm code, I find the dot function is not optimized into the whole calculation.

Wow, thanks! I don't think we often have contributors digging that deep :). Overall, the fact that the optimized version produces slightly different output from the non-optimized one doesn't seem too unexpected to me. It looks like this deviation just gets amplified when used with BaggingClassifier.
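
As a generic illustration of that last point (this shows accumulation order, not the x87 extended-precision effect from this thread, but it is the same class of phenomenon): two mathematically identical floating-point sums can differ by a few ULPs.

import numpy as np

rng = np.random.RandomState(0)
x = rng.rand(1000)

s1 = np.add.reduce(x)  # numpy uses pairwise summation for float arrays
s2 = float(sum(x))     # built-in sum accumulates strictly left to right

print(s1 - s2)  # typically a tiny nonzero difference, e.g. ~1e-13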

Review thread on the changelog entry in doc/whats_new/v0.23.rst:

- |Enhancement| invoke scipy blas api for svm kernel function in ``fit``,
  ``predict``, ``predict_proba``, ``decision_function``, ``cross_validation``,
  ``libsvm_sparse_train``, ``libsvm_sparse_predict``,
  ``libsvm_sparse_predict_proba`` and ``libsvm_sparse_decision_function``
A reviewer (Member) commented:

The libsvm_* functions are not part of the public API, so that only leaves the predict, predict_proba and decision_function methods, I think.

Also, wouldn't NuSVR and NuSVC (and possibly OneClassSVM) also be affected?

@jim0421 (Contributor, Author) replied:

I've checked the code in _classes.py and svm.cpp. NuSVR, NuSVC and OneClassSVM are also affected. Besides, I think fit is also a public API; I would like you to confirm that.

The reviewer (Member) replied:

Great. Yes, sure, fit as well.

@jim0421 (Contributor, Author) commented Apr 7, 2020

@rth It's a nice suggestion to use SVC(kernel='linear'). However, the result turned out to be unsatisfactory: an unmodified upstream copy of the SVM code produces an even bigger precision mismatch with the linear kernel. Here is the change; the test is on 32-bit Ubuntu.

file: scikit-learn/sklearn/ensemble/tests/test_bagging.py
def test_sparse_classification():
-            for f in ['predict', 'predict_proba', 'predict_log_proba', 'decision_function']:
+            for f in ['decision_function']:
                # Trained on sparse format
                sparse_classifier = BaggingClassifier(
-                    base_estimator=CustomSVC(decision_function_shape='ovr'),
+                    base_estimator=CustomSVC(kernel='linear',decision_function_shape='ovr'),
                    random_state=1,
                    **params
                ).fit(X_train_sparse, y_train)

Here is the result.

Traceback (most recent call last):
  File "main.py", line 5, in <module>
    tt.test_sparse_classification()
  File "/home/pnp/scikit-learn/sklearn/ensemble/tests/test_bagging.py", line 122, in test_sparse_classification
    assert_array_almost_equal(sparse_results, dense_results)
  File "/usr/lib/python3/dist-packages/numpy/testing/utils.py", line 962, in assert_array_almost_equal
    precision=decimal)
  File "/usr/lib/python3/dist-packages/numpy/testing/utils.py", line 778, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Arrays are not almost equal to 6 decimals

(mismatch 100.0%)
 x: array([[-0.294842,  1.219134,  2.280216],
       [-0.250532,  1.635373,  1.699024],
       [-0.252775,  1.729176,  1.64705 ],...
 y: array([[-0.219957,  0.984658,  2.232507],
       [-0.218276,  1.671745,  1.573845],
       [-0.223031,  1.780723,  1.525337],...

@rth (Member) commented Apr 7, 2020

@jim0421 How about LinearSVC? It's from liblinear instead of libsvm...

@jim0421 (Contributor, Author) commented Apr 7, 2020

@rth Applying this modification eliminates the precision problem on 32-bit Ubuntu. However, some of the data in the loop looks problematic, as shown at the end of this comment.

file: scikit-learn/sklearn/ensemble/tests/test_bagging.py
- from sklearn.svm import SVC, SVR
+ from sklearn.svm import SVC, SVR, LinearSVC

def test_sparse_classification():
-    class CustomSVC(SVC):
+    class CustomSVC(LinearSVC):
        """SVC variant that records the nature of the training set"""
        def fit(self, X, y):
            super().fit(X, y)
            self.data_type_ = type(X)
            return self

    for sparse_format in [csc_matrix, csr_matrix]:
        X_train_sparse = sparse_format(X_train)
        X_test_sparse = sparse_format(X_test)
        for params in parameter_sets:
            for f in ['predict', 'predict_proba', 'predict_log_proba', 'decision_function']:
                # Trained on sparse format
                sparse_classifier = BaggingClassifier(
-                    base_estimator=CustomSVC(decision_function_shape='ovr'),
+                    base_estimator=CustomSVC(),
                    random_state=1,
                    **params
                ).fit(X_train_sparse, y_train)
                sparse_results = getattr(sparse_classifier, f)(X_test_sparse)

                # Trained on dense format
                dense_classifier = BaggingClassifier(
-                    base_estimator=CustomSVC(decision_function_shape='ovr'),
+                    base_estimator=CustomSVC(),
                    random_state=1,
                    **params
                ).fit(X_train, y_train)
                dense_results = getattr(dense_classifier, f)(X_test)

+                print(sparse_results)
+                print(dense_results)

This modification triggers a convergence warning:

/usr/local/lib/python3.6/dist-packages/sklearn/svm/_base.py:961: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
  "the number of iterations.", ConvergenceWarning)

And here is the strange data:

# sparse result
[[       -inf        -inf  0.        ]
 [       -inf        -inf  0.        ]
 [       -inf        -inf  0.        ]
 [ 0.                -inf        -inf]
 [       -inf  0.                -inf]
 [ 0.                -inf        -inf]
 [       -inf        -inf  0.        ]
 [       -inf        -inf  0.        ]
 [       -inf        -inf  0.        ]
 [       -inf  0.                -inf]
 [ 0.                -inf        -inf]
 [       -inf        -inf  0.        ]
 [ 0.                -inf        -inf]
 [       -inf -1.60943791 -0.22314355]
 [       -inf  0.                -inf]
 [       -inf        -inf  0.        ]
 [ 0.                -inf        -inf]
 [       -inf        -inf  0.        ]
 [       -inf  0.                -inf]
 [       -inf  0.                -inf]
 [       -inf        -inf  0.        ]
 [       -inf  0.                -inf]
 [       -inf -1.60943791 -0.22314355]
 [       -inf  0.                -inf]
 [       -inf        -inf  0.        ]
 [ 0.                -inf        -inf]
 [ 0.                -inf        -inf]
 [ 0.                -inf        -inf]
 [ 0.                -inf        -inf]
 [ 0.                -inf        -inf]
 [       -inf        -inf  0.        ]
 [       -inf        -inf  0.        ]
 [       -inf  0.                -inf]
 [       -inf  0.                -inf]
 [       -inf -0.10536052 -2.30258509]
 [       -inf        -inf  0.        ]
 [ 0.                -inf        -inf]
 [       -inf        -inf  0.        ]]
# dense result
[[       -inf        -inf  0.        ]
 [       -inf        -inf  0.        ]
 [       -inf        -inf  0.        ]
 [ 0.                -inf        -inf]
 [       -inf  0.                -inf]
 [ 0.                -inf        -inf]
 [       -inf        -inf  0.        ]
 [       -inf        -inf  0.        ]
 [       -inf        -inf  0.        ]
 [       -inf  0.                -inf]
 [ 0.                -inf        -inf]
 [       -inf        -inf  0.        ]
 [ 0.                -inf        -inf]
 [       -inf -1.60943791 -0.22314355]
 [       -inf  0.                -inf]
 [       -inf        -inf  0.        ]
 [ 0.                -inf        -inf]
 [       -inf        -inf  0.        ]
 [       -inf  0.                -inf]
 [       -inf  0.                -inf]
 [       -inf        -inf  0.        ]
 [       -inf  0.                -inf]
 [       -inf -1.60943791 -0.22314355]
 [       -inf  0.                -inf]
 [       -inf        -inf  0.        ]
 [ 0.                -inf        -inf]
 [ 0.                -inf        -inf]
 [ 0.                -inf        -inf]
 [ 0.                -inf        -inf]
 [ 0.                -inf        -inf]
 [       -inf        -inf  0.        ]
 [       -inf        -inf  0.        ]
 [       -inf  0.                -inf]
 [       -inf  0.                -inf]
 [       -inf -0.10536052 -2.30258509]
 [       -inf        -inf  0.        ]
 [ 0.                -inf        -inf]
 [       -inf        -inf  0.        ]]

@rth (Member) commented Apr 7, 2020

OK, one problem is that the data is not scaled, while it should be for SVMs. Once it's scaled, the linear kernel works fine.
Could you try the diff below (it might need linting improvements)? It passes for me on 32-bit Ubuntu locally in Docker.

diff --git a/sklearn/ensemble/tests/test_bagging.py b/sklearn/ensemble/tests/test_bagging.py
index 883f0067f..92f50adde 100644
--- a/sklearn/ensemble/tests/test_bagging.py
+++ b/sklearn/ensemble/tests/test_bagging.py
@@ -4,9 +4,11 @@ Testing for the bagging ensemble module (sklearn.ensemble.bagging).
 
 # Author: Gilles Louppe
 # License: BSD 3 clause
+from itertools import product
 
 import numpy as np
 import joblib
+import pytest
 
 from sklearn.base import BaseEstimator
 
@@ -31,7 +33,7 @@ from sklearn.feature_selection import SelectKBest
 from sklearn.model_selection import train_test_split
 from sklearn.datasets import load_boston, load_iris, make_hastie_10_2
 from sklearn.utils import check_random_state
-from sklearn.preprocessing import FunctionTransformer
+from sklearn.preprocessing import FunctionTransformer, scale
 
 from scipy.sparse import csc_matrix, csr_matrix
 
@@ -76,8 +78,31 @@ def test_classification():
                               random_state=rng,
                               **params).fit(X_train, y_train).predict(X_test)
 
-
-def test_sparse_classification():
+@pytest.mark.parametrize(
+    'sparse_format, params, method',
+    product(
+     [csc_matrix, csr_matrix],
+     [{
+         "max_samples": 0.5,
+         "max_features": 2,
+         "bootstrap": True,
+         "bootstrap_features": True
+      }, {
+         "max_samples": 1.0,
+         "max_features": 4,
+         "bootstrap": True,
+         "bootstrap_features": True
+      }, {
+         "max_features": 2,
+         "bootstrap": False,
+         "bootstrap_features": True
+      }, {
+         "max_samples": 0.5,
+         "bootstrap": True,
+         "bootstrap_features": False
+      }],
+      ['predict', 'predict_proba', 'predict_log_proba', 'decision_function']))
+def test_sparse_classification(sparse_format, params, method):
     # Check classification for various parameter settings on sparse input.
 
     class CustomSVC(SVC):
@@ -89,52 +114,33 @@ def test_sparse_classification():
             return self
 
     rng = check_random_state(0)
-    X_train, X_test, y_train, y_test = train_test_split(iris.data,
+    X_train, X_test, y_train, y_test = train_test_split(scale(iris.data),
                                                         iris.target,
                                                         random_state=rng)
-    parameter_sets = [
-        {"max_samples": 0.5,
-         "max_features": 2,
-         "bootstrap": True,
-         "bootstrap_features": True},
-        {"max_samples": 1.0,
-         "max_features": 4,
-         "bootstrap": True,
-         "bootstrap_features": True},
-        {"max_features": 2,
-         "bootstrap": False,
-         "bootstrap_features": True},
-        {"max_samples": 0.5,
-         "bootstrap": True,
-         "bootstrap_features": False},
-    ]
 
-    for sparse_format in [csc_matrix, csr_matrix]:
-        X_train_sparse = sparse_format(X_train)
-        X_test_sparse = sparse_format(X_test)
-        for params in parameter_sets:
-            for f in ['predict', 'predict_proba', 'predict_log_proba', 'decision_function']:
-                # Trained on sparse format
-                sparse_classifier = BaggingClassifier(
-                    base_estimator=CustomSVC(decision_function_shape='ovr'),
-                    random_state=1,
-                    **params
-                ).fit(X_train_sparse, y_train)
-                sparse_results = getattr(sparse_classifier, f)(X_test_sparse)
-
-                # Trained on dense format
-                dense_classifier = BaggingClassifier(
-                    base_estimator=CustomSVC(decision_function_shape='ovr'),
-                    random_state=1,
-                    **params
-                ).fit(X_train, y_train)
-                dense_results = getattr(dense_classifier, f)(X_test)
-                assert_array_almost_equal(sparse_results, dense_results)
-
-            sparse_type = type(X_train_sparse)
-            types = [i.data_type_ for i in sparse_classifier.estimators_]
-
-            assert all([t == sparse_type for t in types])
+    X_train_sparse = sparse_format(X_train)
+    X_test_sparse = sparse_format(X_test)
+    # Trained on sparse format
+    sparse_classifier = BaggingClassifier(
+        base_estimator=CustomSVC(kernel="linear", decision_function_shape='ovr'),
+        random_state=1,
+        **params
+    ).fit(X_train_sparse, y_train)
+    sparse_results = getattr(sparse_classifier, method)(X_test_sparse)
+
+    # Trained on dense format
+    dense_classifier = BaggingClassifier(
+        base_estimator=CustomSVC(kernel="linear", decision_function_shape='ovr'),
+        random_state=1,
+        **params
+    ).fit(X_train, y_train)
+    dense_results = getattr(dense_classifier, method)(X_test)
+    assert_array_almost_equal(sparse_results, dense_results)
+
+    sparse_type = type(X_train_sparse)
+    types = [i.data_type_ for i in sparse_classifier.estimators_]
+
+    assert all([t == sparse_type for t in types])

I also parametrized the test with pytest to make it easier to see what fails. Previously, it was the decision_function check that failed, for every set of bagging parameters:

$ pytest sklearn/ensemble/tests/test_bagging.py -k test_sparse_classification -v
[...]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params0-predict] PASSED                           [  3%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params1-predict_proba] PASSED                     [  6%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params2-predict_log_proba] PASSED                 [  9%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params3-decision_function] FAILED                 [ 12%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params4-predict] PASSED                           [ 15%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params5-predict_proba] PASSED                     [ 18%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params6-predict_log_proba] PASSED                 [ 21%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params7-decision_function] FAILED                 [ 25%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params8-predict] PASSED                           [ 28%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params9-predict_proba] PASSED                     [ 31%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params10-predict_log_proba] PASSED                [ 34%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params11-decision_function] FAILED                [ 37%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params12-predict] PASSED                          [ 40%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params13-predict_proba] PASSED                    [ 43%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params14-predict_log_proba] PASSED                [ 46%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params15-decision_function] FAILED                [ 50%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params16-predict] PASSED                          [ 53%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params17-predict_proba] PASSED                    [ 56%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params18-predict_log_proba] PASSED                [ 59%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params19-decision_function] FAILED                [ 62%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params20-predict] PASSED                          [ 65%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params21-predict_proba] PASSED                    [ 68%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params22-predict_log_proba] PASSED                [ 71%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params23-decision_function] FAILED                [ 75%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params24-predict] PASSED                          [ 78%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params25-predict_proba] PASSED                    [ 81%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params26-predict_log_proba] PASSED                [ 84%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params27-decision_function] FAILED                [ 87%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params28-predict] PASSED                          [ 90%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params29-predict_proba] PASSED                    [ 93%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params30-predict_log_proba] PASSED                [ 96%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params31-decision_function] FAILED                [100%]

@jim0421 (Contributor, Author) commented Apr 8, 2020

@rth All the related tests pass after applying your patch with SVC(kernel='linear'), both for the simple fprintf patch and for my scipy BLAS API patch.

pytest sklearn/ensemble/tests/test_bagging.py -k test_sparse_classification -v

sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params0-predict] PASSED               [  3%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params1-predict_proba] PASSED         [  6%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params2-predict_log_proba] PASSED     [  9%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params3-decision_function] PASSED     [ 12%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params4-predict] PASSED               [ 15%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params5-predict_proba] PASSED         [ 18%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params6-predict_log_proba] PASSED     [ 21%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params7-decision_function] PASSED     [ 25%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params8-predict] PASSED               [ 28%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params9-predict_proba] PASSED         [ 31%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params10-predict_log_proba] PASSED    [ 34%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params11-decision_function] PASSED    [ 37%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params12-predict] PASSED              [ 40%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params13-predict_proba] PASSED        [ 43%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params14-predict_log_proba] PASSED    [ 46%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csc_matrix-params15-decision_function] PASSED    [ 50%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params16-predict] PASSED              [ 53%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params17-predict_proba] PASSED        [ 56%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params18-predict_log_proba] PASSED    [ 59%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params19-decision_function] PASSED    [ 62%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params20-predict] PASSED              [ 65%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params21-predict_proba] PASSED        [ 68%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params22-predict_log_proba] PASSED    [ 71%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params23-decision_function] PASSED    [ 75%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params24-predict] PASSED              [ 78%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params25-predict_proba] PASSED        [ 81%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params26-predict_log_proba] PASSED    [ 84%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params27-decision_function] PASSED    [ 87%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params28-predict] PASSED              [ 90%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params29-predict_proba] PASSED        [ 93%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params30-predict_log_proba] PASSED    [ 96%]
sklearn/ensemble/tests/test_bagging.py::test_sparse_classification[csr_matrix-params31-decision_function] PASSED    [100%]

=================================================== 31 tests deselected ===================================================
================================== 32 passed, 31 deselected, 16 warnings in 7.18 seconds ==================================

@rth (Member) commented Apr 8, 2020

Great, can you commit it?

@jim0421 (Contributor, Author) commented May 8, 2020

I did that in interactive mode, but I'll reproduce them and post them.

Thanks, and just do it at your convenience. Not a request.

@thomasjpfan (Member) left a review:

I like how m_blas is now a normal class attribute. Thank you @jim0421 !

LGTM

@thomasjpfan thomasjpfan changed the title scipy blas for svm kernel function ENH scipy blas for svm kernel function May 24, 2020
@thomasjpfan thomasjpfan merged commit d5de894 into scikit-learn:master May 24, 2020
@thomasjpfan (Member) commented:
Thank you for working on this @jim0421 !

@jim0421 (Contributor, Author) commented May 25, 2020

> Thank you for working on this @jim0421 !

Really appreciate co-working with you, and thanks for your suggestions and guidance. @thomasjpfan @rth @jeremiedbb

ivannz added a commit to ivannz/scikit-learn that referenced this pull request May 30, 2020
viclafargue pushed a commit to viclafargue/scikit-learn that referenced this pull request Jun 26, 2020
jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020