
Some questions about the code of the svm prediction part #27

Closed

anbo1024 opened this issue Jun 23, 2019 · 8 comments

Comments

@anbo1024

This is the predicted-output part of the SVM network:
output = tf.identity(tf.sign(output), name='prediction')
correct_prediction = tf.equal(tf.argmax(output, 1), tf.argmax(y_input, 1))

After the output is processed by tf.sign(), it becomes a binary label of (1, -1). That does not seem suitable for predicting a multi-class handwritten digit dataset.
I would like to ask you about this.

@AFAgarap
Owner

@anbo1024 it can be used in a one-versus-all manner, i.e. in a one-hot encoded vector, instead of using 0 for the classes that a sample does not belong to, we use -1: [0, 1, 0, 0] --> [-1, 1, -1, -1]
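A minimal sketch of that relabeling (my own illustration, not code from the repository; runnable in eager TensorFlow 2):

import tensorflow as tf

# One-hot labels: 1 for the true class, 0 elsewhere.
y_one_hot = tf.constant([[0., 1., 0., 0.],
                         [1., 0., 0., 0.]])

# Map {0, 1} -> {-1, 1} for the one-versus-all SVM targets.
y_svm = 2. * y_one_hot - 1.
print(y_svm.numpy())  # [0, 1, 0, 0] becomes [-1, 1, -1, -1]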

@anbo1024
Author

@AFAgarap I still have some doubts.
1. You map [0, 1, 0, 0] --> [-1, 1, -1, -1], and Softmax takes the index of the maximum value as the final classification result. How do you determine the final classification result with this encoding scheme?
2. I think that in a CNN, whether we use SVM or Softmax, the network output already determines the classification; SVM and Softmax only differ in how they compute the loss.

output = tf.identity(tf.sign(output), name='prediction')  # original
output = tf.identity(output, name='prediction')  # proposed: tf.sign removed
correct_prediction = tf.equal(tf.argmax(output, 1), tf.argmax(y_input, 1))

I think the tf.sign and tf.nn.softmax calls can be removed. The correct_prediction line that follows already performs the maximum (argmax) operation.
I hope I can get your advice. Thank you very much.
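To illustrate, here is a minimal self-contained version of the simplified accuracy computation (the scores and labels are made-up stand-ins for the network output and y_input, runnable in eager TensorFlow 2):

import tensorflow as tf

# Made-up raw SVM scores for two samples over four classes.
output = tf.constant([[-0.3, 1.2, -0.8, -0.5],
                      [0.9, -1.1, -0.2, -0.7]])
# One-versus-all targets with -1 in place of 0.
y_input = tf.constant([[-1., 1., -1., -1.],
                       [-1., 1., -1., -1.]])

# argmax already selects the predicted class, so tf.sign is not needed here.
correct_prediction = tf.equal(tf.argmax(output, 1), tf.argmax(y_input, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(accuracy.numpy())  # 0.5 -- the second sample is misclassified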

@AFAgarap
Owner

Well, in terms of having a one-hot encoded vector with -1 in place of 0, e.g. [-1, 1, -1, -1], which one is greater, -1 or 1? 1, yes? So tf.sign is still fine to leave in. But you're right, it can be discarded for the purpose of getting the training accuracy.
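To see this concretely (a toy check, not repository code):

import tensorflow as tf

signed = tf.constant([[-1., 1., -1., -1.]])
print(tf.argmax(signed, 1).numpy())  # [1] -- the lone positive entry is still the max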

@anbo1024
Author

@AFAgarap I understand what you mean. Now I am using SVM as the loss function on my own dataset, but the performance is very poor, while Softmax performs very well. Why is this? Does the data still need to be preprocessed?
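To be concrete, here is a minimal sketch of the two training objectives I am comparing (illustrative logits, not the repository's code; the squared hinge form follows Tang's L2-SVM):

import tensorflow as tf

# Made-up raw scores for two samples over four classes.
logits = tf.constant([[-0.3, 1.2, -0.8, -0.5],
                      [0.9, -1.1, -0.2, -0.7]])
labels = tf.constant([[0., 1., 0., 0.],
                      [0., 1., 0., 0.]])

# Softmax: cross-entropy on the raw scores.
softmax_loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits))

# L2-SVM: squared hinge loss on {-1, 1} targets.
y_svm = 2. * labels - 1.
hinge_loss = tf.reduce_mean(
    tf.reduce_sum(tf.square(tf.maximum(0., 1. - y_svm * logits)), axis=1))

print(softmax_loss.numpy(), hinge_loss.numpy())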

@AFAgarap
Owner

In Yichuan Tang's paper, Deep Learning using Linear Support Vector Machines, he used PCA and added Gaussian noise for MNIST. But that was on a feed-forward neural network with 2 layers having 512 units each.
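Roughly, that preprocessing could look like the sketch below (assuming scikit-learn's PCA; the component count and noise scale are placeholders, not the paper's exact settings):

import numpy as np
from sklearn.decomposition import PCA

# Placeholder data: rows are flattened 28x28 MNIST images.
x_train = np.random.rand(1000, 784).astype(np.float32)

# Reduce dimensionality with PCA (n_components is a placeholder value).
pca = PCA(n_components=70)
x_reduced = pca.fit_transform(x_train)

# Add Gaussian noise as augmentation/regularization (scale is a placeholder).
x_noisy = x_reduced + np.random.normal(scale=0.1, size=x_reduced.shape)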

@AFAgarap
Owner

The paper is here: https://arxiv.org/abs/1306.0239

@anbo1024
Author

Yes, I have read that paper. Your code achieves good results. Did you do any preprocessing?

@AFAgarap
Owner

No, I didn't
