
Accuracies lower than 50% if the random seed is unlucky #207

Closed
giulio-datamind opened this issue Oct 31, 2023 · 8 comments

Comments

@giulio-datamind

giulio-datamind commented Oct 31, 2023

In the context of binary SVM classification problems, while running experiments on this input data (consisting of 18 samples, divided into two classes of equal cardinality) I ended up with a model with 0% accuracy. The parameters I used for training are: -g 16 -c 32 -b 1.

While trying to understand the reason for this 0% accuracy, I concluded that it is a consequence of the choice of the random seed.

Hence, I ran multiple tests with different seeds. Using the first 1000 integers as seeds, I obtained results that can be summarized as follows:

  • for 35/1000 seeds the accuracy of the corresponding model is 0%;
  • for 960/1000 seeds the accuracy is 100%;
  • for the remaining 5/1000 seeds the accuracy lies strictly between 0% and 100%.

I noticed that the following statement holds for all 1000 seeds:

the accuracy of the trained model is greater than 50% if and only if probA is lower than 0.

In fact, the condition probA < 0 corresponds to the sigmoid function used to model the decision-value response being increasing.

Given this long premise, my questions are:

  1. am I somehow mistaken about this, or is the current version of libSVM expected to give accuracies below 50% when the combination of input data and random seed is unlucky?
  2. can we do something to avoid falling into these cases, for example by constraining the sigmoid modelling function to always be increasing?

Thank you very much to anyone who contributes.

PS: while looking for similar questions I didn't find an answer to mine, but I suspect the questions here are related to issues #152, #153 and #155.

@cjlin1
Owner

cjlin1 commented Nov 1, 2023

The main issue is that you have too few data points. If -b 1 is used, which enables probabilistic outputs, we internally conduct a cross-validation process, so there is some randomness. To have deterministic results, either fix the seed or, if probability outputs are not needed, remove -b 1.

@cjlin1 cjlin1 closed this as completed Nov 1, 2023
@giulio-datamind
Author

giulio-datamind commented Nov 2, 2023

@cjlin1 thank you very much for your reply.

In my case I need probabilistic results, so option -b 1 is mandatory.

Yes, I could change the seed to make this particular example work, but in general I cannot consider this a solution, since I could run into the same low-accuracy problem with other input data.

Why do you think that constraining the sigmoid function to always be increasing (given that we expect high probabilities for samples labeled +1 and low probabilities for those labeled -1) could not be a solution? Are there any drawbacks in somehow constraining probA < 0 inside the sigmoid_train function?

@cjlin1
Owner

cjlin1 commented Nov 2, 2023

I think we do have that sigmoid is always increasing. So I don't understand your question

@giulio-datamind
Author

I'm sorry: I probably got confused about the sign of probA (I edited the messages above to fix them). Let me try to explain better in other words.

Consider the input data attached to my first message. Working on these data, I found that after setting the random generator seed (at startup) to some integer, the resulting trained probabilistic model normally (i.e., for about 96% of these seeds) has 100% accuracy. However, for some seeds (only about 3.5%; an example is srand(42) on my machine) the trained model has 0% accuracy.

I noticed that 0% accuracy models have a positive value of probA, while 100% accuracy models have a negative one. The sigmoid function is defined as SF(x) = 1/(1+exp(probA*x+probB)), where x is the decision value. I stated that 0% accuracy models are associated with a decreasing sigmoid function because, as x tends to +infinity, SF(x) tends to 1 if probA < 0 and to 0 if probA > 0.

Considering that we expect high probabilities (i.e., high values of SF(x)) for samples labeled +1 and low probabilities for those labeled -1, I suspect there is room for improvement if we constrain probA to always be lower than 0.

I attach the adaptation of svm-train.c that I used for the experiments, hoping it can help.

@cjlin1
Owner

cjlin1 commented Nov 28, 2023 via email

@giulio-datamind
Author

Yes, you are correct.

By simply replicating the input data 10 times, the input file becomes like this; with this input, all 1000 tested seeds lead to a 100% accuracy model.

I think, however, that there is no reason not to try to improve the algorithm directly so that it also works well for smaller datasets, which are a common case.

@giulio-datamind
Author

I tried to impose the constraint probA < 0 by adding the line

newA = newA > -eps ? -2 * eps - newA : newA;

immediately after the line

newA = A + stepsize * dA;

in the backtracking loop of the sigmoid_train function. Furthermore, I set the initial value to A = 1 instead of A = 0.

In practice, I implemented the constraint by reflecting, at every iteration, the point (A, B) of the parameter search space around the line A = -eps.

With these changes, even if the random seed choice is unfortunate, the accuracy of the trained models never falls below 50%. This happens because in the worst case the samples are all classified into the same class with a very flat sigmoid (when A is near 0); but, unlike before, the classification can no longer be opposite to the labeling.

Are there any disadvantages I didn't foresee in these modifications?

@cjlin1
Owner

cjlin1 commented Jan 5, 2024 via email
