New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix platt scaling #4166
Fix platt scaling #4166
Conversation
The SIGSEV on SGObject seems to be due to some code in upstream develop. |
|
||
using namespace shogun; | ||
|
||
CCalibration::CCalibration() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems to be an abstract class, as nothing is implemented.
why dont you just drop the implementation and specify all the virtual methods pure virtual?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking of adding common fit
and calibrate
methods that would call fit_binary
, fit_multiclass
etc as required. Yes, there doesn't seem to be any point in the implementation. I'll just add those in the header file itself.
|
||
private: | ||
/** Array to store sigmoid parameters for each class. In case of binary labels, only one pair of parameters are stored. */ | ||
CStatistics::SigmoidParamters* m_sigmoid_parameters; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't storing this in an std::vector would be better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or in worst case DynamicObjectArray, for assuring ref counting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will replace this with std::vector
... DynamicObjectArray seems to support only CSGObjects, it can't hold structs like CStatistics::SigmoidParamters
. Please correct me if I'm wrong about that :)
index_t num_classes = | ||
predictions->get_num_classes(); | ||
|
||
SG_FREE(m_sigmoid_parameters); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as said i would rather use something more high level than this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood
int32_t prior1=0; | ||
SG_SDEBUG("counting number of positive and negative labels\n") | ||
{ | ||
for (index_t i=0; i<scores.vlen; ++i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't it enough to count the one of them as the other can be simply deducted?
i would use something like std::count_if
or std::transform
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, counting one of them would be enough. I using std::count_if
would indeed be neat.
|
||
/* parameter setting */ | ||
/* maximum number of iterations */ | ||
index_t maxiter=100; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all these parameters should be set somewhere else... as you want to be able to control them without actually changing the code... i.e. there should be a getter/setter for them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not the one who originally implemented fit_sigmoid
, I merely made a few corrections in it. However, the code can certainly be improved. I'm on it.
index_t length=prior1+prior0; | ||
|
||
SGVector<float64_t> t(length); | ||
for (index_t i=0; i<length; ++i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above... too vanilla implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, can't this be done a bit more modern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can either use std::transform
or at least use the fact that SGVector has iterator so it can actually do a range-loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do, thanks for the suggestion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vigsterkr I opted to use std::transform
Thanks for the review! I will try and make all the requested changes. |
@vigsterkr While making all the methods pure virtual would be nice, wouldn't we lose the informative messages generated by |
/** Set maximum number of iterations | ||
* @param maxiter maximum number of iterations | ||
*/ | ||
virtual void set_maxiter(index_t maxiter); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are these all virtual? you will have a calibration method that is based on Sigmoid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not. Indeed there is no need to make them virtual.
@@ -42,28 +42,75 @@ | |||
|
|||
using namespace shogun; | |||
|
|||
CSigmoidCalibration::CSigmoidCalibration() : CCalibration() | |||
CSigmoidCalibration::CSigmoidCalibration() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should call the ctor of CCalibration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, my bad
src/shogun/evaluation/Calibration.h
Outdated
@@ -64,26 +70,44 @@ namespace shogun | |||
* @param targets The true labels corresponding to the predictions | |||
* @return boolean indicating whether the calibration was succesful | |||
**/ | |||
virtual bool fit_binary(CBinaryLabels* predictions, CBinaryLabels* targets); | |||
virtual bool fit_binary(CBinaryLabels* predictions, CBinaryLabels* targets) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i still dont get why do we need default implementation with error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about making all methods pure virtual... suppose a calibration method does not support multiclass calibration, then the implementation will have to derive fit_multiclass
and give error message manually. Would that be fine?
m_minstep = 1E-10; | ||
m_sigma = 1E-12; | ||
m_epsilon = 1E-5; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m_sigmoid_parameters
is not registered... that'll make it impossible to clone this class. since currently registering std::vector is not possible you would need to use DynArray :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for telling me about DynArray
, I was looking for the standard shogun way of storing arrays of non-CSGObjects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have opted to use two SGVectors for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think that should be OK as a workaround for now.
src/shogun/evaluation/Calibration.h
Outdated
* ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL | ||
* DAMAGES | ||
* (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; | ||
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you use the new (shorter) bsd header? see some other files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one? /*
- This software is distributed under BSD 3-clause license (see LICENSE file).
- Authors:
*/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
6f3b546
to
8e0e97b
Compare
* Written (w) 2012 - 2013 Heiko Strathmann | ||
* Written (w) 2018 Dhruv Arya | ||
* All rights reserved. | ||
* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can also use the small license here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay
src/shogun/mathematics/Statistics.h
Outdated
index_t maxiter = 100, float64_t minstep = 1E-10, | ||
float64_t sigma = 1E-12, float64_t eps = 1E-5); | ||
|
||
/** TODO: Remove this method after fixing MuticlassCalibration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls no todo in patches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood
|
||
if (it>=maxiter-1) | ||
{ | ||
SG_SWARNING("CStatistics::fit_sigmoid(): reaching maximal iterations," |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we dont do this ClassName::MethodName(): anymore. Just the message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is mostly taken from the previous fit_sigmoid
. Do you recommend changing all the messages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah let's modernize a bit :)
/* initial Point and Initial Fun Value */ | ||
/* result parameters of sigmoid */ | ||
float64_t a=0; | ||
float64_t b=CMath::log((prior0+1.0)/(prior1+1.0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is this code coming from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's all from the old fit_sigmoid
|
||
for (index_t i=0; i<length; ++i) | ||
{ | ||
float64_t fApB=scores[i]*a+b; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this code is copy pasted, fine. Otherwise, can we rename this variable? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is mostly copy-pasted. Should I rename it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe leave it so it is clear where the code is coming from
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the state of this? :)
648c371
to
c769d80
Compare
@karlnapf I have added a test for multiclass calibration. |
src/shogun/evaluation/Calibration.h
Outdated
|
||
namespace shogun | ||
{ | ||
/** @brief Base class for all calibration methods. Call fit to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you tell doxygen that "fit" is a method here?
src/shogun/evaluation/Calibration.h
Outdated
/** Fit calibration parameters for binary labels. | ||
* @param predictions The predictions outputted by the machine | ||
* @param targets The true labels corresponding to the predictions | ||
* @return boolean indicating whether the calibration was succesful |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: no need to state the type in return, c++ tell you that already
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood.
{ | ||
REQUIRE((m_multiclass_confidences.num_rows != 0) && | ||
(m_multiclass_confidences.num_cols != 0), | ||
"Call allocate_confidences_for() before fetching confidences"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing newline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I woudl also suggest to change the error msg
"Empty confidences, which need to be allocated before fetching.\n")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, that looks much better :)
// [0.31003506, 0.60319518, 0.08676976], | ||
// [0.26630134, 0.01476002, 0.71893864], | ||
// [0.72470815, 0.20387599, 0.07141586], | ||
// [0.074445 , 0.80897488, 0.11658012], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you remove those?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
0.75674645, 0.20144443, 0.04180912, | ||
0.82689269, 0.05965391, 0.11345341, | ||
0.81409222, 0.12715657, 0.05875121, | ||
0.48765218, 0.38500051, 0.12734731, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where do the numbers in the tests come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I trained a linear SVC in sklearn, this is the decision function of that SVC. The expected output is the probabilities of the calibrated SVC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thats good! Could maybe be added as a comment in there. But nevermind for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Getting there!
I think one more iteration and we are good :)
@karlnapf I have made all the requested changes. Code style fix should be done after this has been approved right? |
@@ -73,16 +75,19 @@ int main(int argc, char** argv) | |||
using the method described in Lin, H., Lin, C., and Weng, R. (2007). A note | |||
on Platt's probabilistic outputs for support vector machines. | |||
See BinaryLabels documentation for details*/ | |||
out_labels->scores_to_probabilities(); | |||
CSigmoidCalibration* sigmoid_calibration = new CSigmoidCalibration(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this deleted anywhere? SG_REF?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed this, sorry for that.
|
||
using namespace shogun; | ||
|
||
void test_sigmoid_fitting() | ||
{ | ||
CBinaryLabels* labels=new CBinaryLabels(10); | ||
CBinaryLabels* predictions=new CBinaryLabels(10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleted somewhere?
CSigmoidCalibration* sigmoid_calibration = new CSigmoidCalibration(); | ||
sigmoid_calibration->fit_binary(predictions, labels); | ||
CBinaryLabels* calibrated_labels = sigmoid_calibration->calibrate_binary(predictions); | ||
calibrated_labels->get_values().display_vector("probabilities"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why display?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure myself. It was in the original code
labels->get_values().display_vector("probabilities"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Not your fault of course. Would still be better removed :)
In fact, what would be best was to remove the example altogether (another PR) and have a meta example for that
params.b = m_sigmoid_bs[0]; | ||
|
||
/** Convert predictions to probabilties. */ | ||
auto values = calibrate_values(predictions->get_values(), params); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than the comment, I would name the variable probabilities
and then you dont need it. Minor of course
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would look much better indeed
m_sigmoid_as.resize_vector(num_classes); | ||
m_sigmoid_bs.resize_vector(num_classes); | ||
|
||
/** Fit and store parameters for for each class seperately. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need for this comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay
index_t num_classes = predictions->get_num_classes(); | ||
index_t num_samples = predictions->get_num_labels(); | ||
|
||
/** Matrix to temporarily store the probabilities. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would never write the type of a variable in a comment, but minor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it seems does seem very redundant. Will change this.
float64_t sum = SGVector<float64_t>::sum(values); | ||
|
||
/** All classes have equal probability when sum is zero */ | ||
if (sum == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch!
namespace shogun | ||
{ | ||
/** @brief Calibrates labels based on Platt Scaling [1]. Note that first calibration parameters need to be fitted by | ||
* calling fit_binary() or fit_multiclass(). Then call calibrate on calibrate_binary() or calibrate_multiclass() on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what will happen if a user forgets to call fit first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The length of the parameter vector is checked in calibrate now and an appropriate message is displayed.
SGVector<float64_t> calibrate_values(SGVector<float64_t> values, CStatistics::SigmoidParamters params); | ||
|
||
private: | ||
/** Vector to store parameter A of sigmoid for each class. In case of binary labels, only one pair of parameters are stored. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no type in the comments or doxygen, it is clear from the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it the first line be something like: "Stores parameter A of sigmoid for each class"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but that is really me nitpicking. Just for the future
{ | ||
SG_SDEBUG("entering CStatistics::fit_sigmoid()\n") | ||
|
||
REQUIRE(scores.vector, "Specify scores vector!\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Provided scores are empty.\n"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood.
index_t maxiter, float64_t minstep, | ||
float64_t sigma, float64_t epsilon) | ||
{ | ||
SG_SDEBUG("entering CStatistics::fit_sigmoid()\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay
EXPECT_EQ(calibrated, true); | ||
|
||
auto calibrated_labels = sigmoid_calibration->calibrate_binary(predictions); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why all the empty lines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not required? Will remove them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is good to go very soon
I put some minor comments in if you wanna do another iteration, but happy to merge (modulo CI) otherwise
@vigsterkr what do you think?
index_t length=prior1+prior0; | ||
|
||
SGVector<float64_t> t(length); | ||
for (index_t i=0; i<length; ++i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can either use std::transform
or at least use the fact that SGVector has iterator so it can actually do a range-loop
@vigsterkr @karlnapf I have made the requested changes. Should I fix the style, rebase and push? |
Yes! style fix and let's merge. Nice work! |
64c86ac
to
56cde39
Compare
56cde39
to
d679103
Compare
🎉 |
This PR fixes platt scaling and adds new classes SigmoidCalibration and Calibration. Previously,
fit_sigmoid
was only using the predictions to fit parameters which is incorrect. SigmoidCalibration has been added so thatscore_to_probabilites
fromCBinaryLabels
can be removed. This will make the x-validated calibration API cleaner (#4009 ). The oldfit_sigmoid
andscores_to_probabilities
have still not be removed since they are still needed in multiclass calibration. It would probably be safe to remove them after fixing #4164