
add svm in last layer #2588

Closed
mundher opened this issue May 3, 2016 · 13 comments
mundher commented May 3, 2016

I want to add an SVM as the last layer of my model:

model = Sequential()
model.add(Convolution2D(nb_filters, nb_conv, nb_conv,
                        border_mode='valid',
                        input_shape=(1, img_rows, img_cols)))
model.add(Flatten())
model.add(Dense(1024))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])

I tried to change the loss to hinge

model = Sequential()
model.add(Convolution2D(nb_filters, nb_conv, nb_conv,
                            border_mode='valid',
                            input_shape=(1, img_rows, img_cols)))
model.add(Flatten())
model.add(Dense(1024))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('linear'))

model.compile(loss='hinge',
              optimizer='adadelta',
              metrics=['accuracy'])

but the training accuracy doesn't change between iterations.


erlendd commented Jun 18, 2016

I'm not sure linear is the correct activation for your output: I think it should be tanh so that you get outputs in [-1, +1] to match the labels.

@mundher mundher closed this as completed Jun 18, 2016

alyato commented Aug 3, 2016

@mundher Did you solve your problem? When I want to use an SVM, I always call scikit-learn's svm. How did you implement it? Thanks.


mundher commented Aug 3, 2016

@alyato I didn't solve the problem. It looks like the hinge loss only works for binary class output.


fish128 commented Sep 29, 2016

I tried a few different activations at the last layer, and found that the softmax activation works best with hinge loss. Can anyone explain this?

@huanglianghua

You need to regularize the weights.

The hinge loss plus an L2 regularization term forms the complete SVM loss function. Try:

from keras.regularizers import l2

model = Sequential()
model.add(Convolution2D(nb_filters, nb_conv, nb_conv,
                            border_mode='valid',
                            input_shape=(1, img_rows, img_cols)))
model.add(Flatten())
model.add(Dense(1024))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes, W_regularizer=l2(0.01)))
model.add(Activation('linear'))

model.compile(loss='hinge',
              optimizer='adadelta',
              metrics=['accuracy'])

instead.
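For reference, for a single output with labels y_i in {-1, +1}, the hinge loss plus the L2 weight penalty is exactly the standard linear SVM objective:

L(w) = (1/N) * sum_i max(0, 1 - y_i * (w . x_i + b)) + lambda * ||w||^2

With several output units, Keras's 'hinge' applies this per class (one-vs-rest), which is why the targets need to be encoded in {-1, +1} rather than as 0/1 one-hot vectors.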

@McLawrence

In the code, the hinge loss is defined as:
K.mean(K.maximum(1. - y_true * y_pred, 0.), axis=-1)

However, as far as I know, the loss for a (multiclass) SVM should be this:

L_i = sum_{j != y_i} max(0, s_j - s_{y_i} + 1)

where s_i is the score of the i-th output unit. As I understand it, the hinge loss above ignores the additive s_j term and only uses the s_{y_i} term in the formula. Where is my error?


aniket03 commented Aug 11, 2017

@McLawrence the hinge loss implemented in Keras is for the specific case of binary classification [A vs ~A]. If it is used, the labels must be in the format {-1, 1}. You might refer to #2830 for reference. There is now also categorical_hinge in losses.py; you can refer to that as well.
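To make that concrete, here is a small sketch (my own addition; y is an assumed array of 0/1 class labels). Binary 'hinge' expects targets in {-1, +1}, while categorical_hinge works on one-hot targets and penalizes only the most violating wrong class:

import numpy as np
from keras import backend as K

y = np.array([0, 1, 1, 0])   # example 0/1 labels (assumption)
y_svm = 2 * y - 1            # convert to {-1, +1} for loss='hinge'

# categorical_hinge in losses.py is essentially:
def categorical_hinge(y_true, y_pred):
    pos = K.sum(y_true * y_pred, axis=-1)          # score of the true class
    neg = K.max((1. - y_true) * y_pred, axis=-1)   # highest score among the other classes
    return K.maximum(0., neg - pos + 1.)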

@bit-scientist

@huanglianghua, @aniket03 Is it really enough to change the loss in compile() to hinge and regularize the last Dense layer with an L2 penalty (like this: kernel_regularizer=regularizers.l2(0.01))? I can't quite see why that is sufficient. Could you point to some links or posts? Thanks.


statcom commented May 29, 2020

> @huanglianghua, @aniket03 Is it really enough to change the loss in compile() to hinge and regularize the last Dense layer with an L2 penalty (like this: kernel_regularizer=regularizers.l2(0.01))? I can't quite see why that is sufficient. Could you point to some links or posts? Thanks.

Here are a couple of good references:
https://cs231n.github.io/linear-classify/
https://github.com/nfmcclure/tensorflow_cookbook#ch-4-support-vector-machines

I tested the following code with real data without any problem. Of course, the results will differ from those of a real SVM implementation (e.g., sklearn's SVM). Interestingly, this Keras implementation sometimes produced much better results than the proper SVM on certain data sets. Another benefit is that you can use a GPU to train the SVM.

# SVM-style head appended to an existing feature-extractor model (fine_model_st)
fine_model_st.add(Dense(nb_classes, kernel_regularizer=regularizers.l2(0.0001)))
fine_model_st.add(Activation('linear'))
fine_model_st.compile(loss='squared_hinge',
                      optimizer='adadelta', metrics=['accuracy'])
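For anyone reading this with a current Keras, here is a self-contained sketch of the same idea in tf.keras (the input shape, layer sizes, and data names are hypothetical):

from tensorflow import keras
from tensorflow.keras import layers, regularizers

nb_classes = 10                                   # hypothetical
model = keras.Sequential([
    keras.Input(shape=(64,)),                     # hypothetical feature dimension
    layers.Dense(128, activation='relu'),
    # linear "SVM" layer: L2-regularized weights, no softmax
    layers.Dense(nb_classes, kernel_regularizer=regularizers.l2(1e-4)),
])
model.compile(loss='squared_hinge', optimizer='adadelta', metrics=['accuracy'])

# squared_hinge expects targets in {-1, +1}, so convert one-hot labels first:
# y_svm = 2 * keras.utils.to_categorical(y, nb_classes) - 1
# model.fit(X, y_svm, epochs=10, batch_size=32)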


momja commented Oct 26, 2020

How does this align with the use of RandomFourierFeatures for SVM approximation found here? Is RandomFourierFeatures a better, more modern approach?
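For context, a rough sketch of how RandomFourierFeatures is typically combined with a hinge loss to approximate a kernel SVM (output_dim, scale, and the input shape below are arbitrary example choices, not anything specific to this thread):

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.layers.experimental import RandomFourierFeatures

model = keras.Sequential([
    keras.Input(shape=(784,)),                    # hypothetical input size
    # random Fourier features approximate an RBF kernel feature map...
    RandomFourierFeatures(output_dim=4096, scale=10.0, kernel_initializer='gaussian'),
    # ...so a linear layer trained with hinge loss on top approximates a kernel SVM
    layers.Dense(units=10),
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              loss=keras.losses.hinge,
              metrics=[keras.metrics.CategoricalAccuracy(name='acc')])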


Apidcloud commented May 14, 2021

The suggested code from @huanglianghua and @statcom seems to work, but I wonder if there is a way of outputting probabilities instead. I tried to use softmax, but then the model doesn't improve at all (with 3 classes it gets stuck at 33%; with 2 classes it gets stuck at 50%). My goal is to replicate an SVM model 'exactly', so that I can convert it to .onnx and reuse it.

The only settings I got to work are the following:

  # create model
  model = Sequential()
  # not sure whether the number of units in the first layer needs to match the input shape or not
  model.add(Dense(30, input_shape=(30,), activation='relu', kernel_initializer='he_uniform'))
  model.add(Dense(count_classes, kernel_regularizer=regularizers.l2(0.1)))
  #model.add(Activation('softmax')) # linear by default; softmax doesn't seem to work. Any ideas?
  model.compile(loss=keras.losses.CategoricalHinge(), optimizer=keras.optimizers.Adam(lr=1e-3), metrics=['accuracy'])

Any ideas on how to get a probability at the end? Using argmax seems to work just fine, but how do I interpret the output? I would like to set some sort of threshold to decide whether a prediction is good or not. Say I have 3 classes but feed the model something else entirely: I will still get a maximum value from argmax, even though it's wrong. A probability would let me avoid this by setting a threshold. Any ideas on how to approach this?


erlendd commented May 17, 2021

Getting probabilities out of an SVM is usually done by adding a logistic regression after the linear output of the base SVM model. Use K.stop_gradient to prevent the logistic layer from affecting the weights in the base model.
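A minimal sketch of that two-head idea (my own interpretation, in tf.keras; count_classes and the layer sizes follow the earlier snippets): the hinge loss trains the linear SVM output, while a softmax head trained with cross-entropy sits behind stop_gradient so it cannot affect the SVM weights.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, regularizers

count_classes = 3                                    # as in the example above

inputs = keras.Input(shape=(30,))
x = layers.Dense(30, activation='relu', kernel_initializer='he_uniform')(inputs)
svm_out = layers.Dense(count_classes, kernel_regularizer=regularizers.l2(0.1),
                       name='svm')(x)                # linear SVM scores
frozen = layers.Lambda(lambda t: tf.stop_gradient(t))(svm_out)
prob_out = layers.Dense(count_classes, activation='softmax', name='prob')(frozen)

model = keras.Model(inputs, [svm_out, prob_out])
model.compile(optimizer=keras.optimizers.Adam(1e-3),
              loss={'svm': keras.losses.CategoricalHinge(),
                    'prob': keras.losses.CategoricalCrossentropy()},
              metrics={'prob': 'accuracy'})
# model.fit(X, {'svm': y_onehot, 'prob': y_onehot}, ...)  # y_onehot: one-hot targets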


Apidcloud commented May 17, 2021

Thanks, @erlendd

I tried the following approach:

  input = Input(shape=(30,))
  dense = Dense(30, activation='relu', kernel_initializer='he_uniform', name='mul')
  x = dense(input)
  x = Dense(count_classes, kernel_regularizer=regularizers.l2(0.1), name='regul')(x)
  stop_grad = Lambda(lambda x: K.stop_gradient(x))(x)
  x = Activation('linear')(stop_grad)
  # anything I add after linear activation, gets the model stuck at 33% (or 25% if 4 classes; 50% if 2; etc.)
  output = Dense(count_classes, activation="softmax", name='out')(x)

  model = Model(inputs=input, outputs=output)
  model.compile(loss=keras.losses.CategoricalHinge(), optimizer=keras.optimizers.Adam(lr=1e-3), metrics=['accuracy'])

But it still gets stuck at 33% for some reason. I guess the input data would have to be normalised in order to use softmax here, which it is not. The input data are raw vectors, and that works totally fine as long as I don't add anything after the linear output.

Edit:
As I suspected, I got softmax to work (as per the first example; no stop gradient or anything) by scaling the input features. I was using really large feature values which, despite training well, were preventing softmax (logistic regression) from working properly. The scaling of the features can be done with the following code:

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_std = scaler.fit_transform(X)
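One small follow-up (an assumption about the setup rather than something stated above): the same fitted scaler has to be applied to any validation, test, or production inputs, otherwise the exported model sees unscaled features again:

X_test_std = scaler.transform(X_test)   # X_test is a hypothetical held-out set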
