Make model parameters for SVCs with linear kernels accessible in SKLL #443
For SVCs with linear kernels, we want to print out the primal weights - that is, the weights for each feature for each one-vs-one binary classifier. These are the weights contained in the `coef_` attribute of the underlying scikit-learn model. This is a matrix with the shape `[n_classes * (n_classes - 1) / 2, n_features]`, since there are C(n_classes, 2) = n_classes * (n_classes - 1) / 2 one-vs-one classifiers and each one has a weight for each of the features. According to the scikit-learn user guide and the code for the function `_one_vs_one_coef()` in `svm/base.py`, the order of the rows is "0 vs 1", "0 vs 2", ..., "0 vs n", "1 vs 2", "1 vs 3", ..., "1 vs n", ..., "n-1 vs n".

I have implemented this in the `Learner.model_params()` method. To doubly ensure that taking the `coef_` values and assigning them to these class pairs is correct, I wanted to check that LIBSVM (which actually underlies the SVC classifier in scikit-learn) itself also gets the same weights. To do this, I first trained an SVC model with a linear kernel on our "Iris" example and ran `print_model_weights` (which uses `model_params()`). The output of `print_model_weights` is as follows:

Note that in this case label 0 corresponds to `setosa`, label 1 corresponds to `versicolor`, and label 2 corresponds to `virginica`.

Next, I downloaded LIBSVM, compiled it, and ran the following commands that use LIBSVM to train an equivalent SVC model with the same hyperparameters.
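(As an aside, the `coef_` shape and pair ordering described above can be sanity-checked directly. This is a quick sketch against scikit-learn's bundled copy of the iris data, not the SKLL "Iris" example itself:)

```python
from itertools import combinations

from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
clf = SVC(kernel="linear").fit(X, y)

# 3 classes -> 3 * (3 - 1) / 2 = 3 one-vs-one classifiers, 4 features each
print(clf.coef_.shape)  # (3, 4)

# The rows of coef_ follow the pair ordering "0 vs 1", "0 vs 2", "1 vs 2"
print(list(combinations(range(len(clf.classes_)), 2)))  # [(0, 1), (0, 2), (1, 2)]
```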
Next, I looked at this entry in the LIBSVM FAQ that explains how to get the primal coefficients from the dual ones. To do this, I compiled the Python interface that comes with LIBSVM and then ran the following Python commands:
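The recipe in that FAQ entry boils down to the standard identity for a linear SVM's primal weight vector: for the one-vs-one classifier separating classes i and j, the weights are a linear combination of that pair's support vectors,

```latex
w_{ij} = \sum_{k \in SV_i \cup SV_j} y_k \alpha_k x_k
```

where the products y_k * alpha_k are exactly the dual coefficients that LIBSVM stores in `sv_coef`.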
Now, as the entry says, in order to compute the primal coefficients for, say, label 1 (`versicolor`) vs. label 2 (`virginica`), we first need to compute the `y_i * alpha_i` values for the two classes. `sv_coefs` is a 22x2 array: the first 10 rows represent the support vector coefficients for class 1 (label `1`), the next 9 represent the same for class 2 (label `2`), and the last 3 represent the same for class 3 (label `0`). Each coefficient has two entries, each representing the coefficient for the classifier trained to classify between the main class and one of the other classes. For example, each of the first 10 support vector coefficients has 2 entries - the first representing those for the classifier for class 1 (label `1`) vs. class 2 (label `2`), and the second representing those for the classifier for class 1 (label `1`) vs. class 3 (label `0`). The next 9 represent those for class 2 (label `2`) vs. class 1 (label `1`) and class 3 (label `0`), respectively. And, finally, the last 3 represent those for class 3 (label `0`) vs. class 1 (label `1`) and class 2 (label `2`), respectively (note that the classes in the two columns are in strict increasing order). Given this setup, the coefficients we are interested in (class 1 vs. class 2) can be computed as follows:

Next, let's get the actual support vectors from the `svs` list of dictionaries for the same corresponding indices (the first 19):

And, finally, let's take the dot product of the support vector coefficients (1x19) with the support vectors (19x4). This will give us the feature weight vector (1x4) for the class 1 vs. class 2 binary classifier.
The four `versicolor`-vs-`virginica` weights from our `print_model_weights` output - sorted by feature name - are as follows:

The two vectors match to a satisfactory number of significant digits, which confirms that the implementation is accurate.