Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix platt scaling #4166

Merged
merged 1 commit into from Feb 26, 2018
Merged

Fix platt scaling #4166

merged 1 commit into from Feb 26, 2018

Conversation

durovo
Copy link
Contributor

@durovo durovo commented Feb 9, 2018

This PR fixes platt scaling and adds new classes SigmoidCalibration and Calibration. Previously, fit_sigmoid was only using the predictions to fit parameters which is incorrect. SigmoidCalibration has been added so that score_to_probabilites from CBinaryLabels can be removed. This will make the x-validated calibration API cleaner (#4009 ). The old fit_sigmoid and scores_to_probabilities have still not be removed since they are still needed in multiclass calibration. It would probably be safe to remove them after fixing #4164

@durovo
Copy link
Contributor Author

durovo commented Feb 10, 2018

The SIGSEV on SGObject seems to be due to some code in upstream develop.


using namespace shogun;

CCalibration::CCalibration()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems to be an abstract class, as nothing is implemented.
why dont you just drop the implementation and specify all the virtual methods pure virtual?

Copy link
Contributor Author

@durovo durovo Feb 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of adding common fit and calibrate methods that would call fit_binary, fit_multiclass etc as required. Yes, there doesn't seem to be any point in the implementation. I'll just add those in the header file itself.


private:
/** Array to store sigmoid parameters for each class. In case of binary labels, only one pair of parameters are stored. */
CStatistics::SigmoidParamters* m_sigmoid_parameters;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't storing this in an std::vector would be better?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or in worst case DynamicObjectArray, for assuring ref counting

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will replace this with std::vector... DynamicObjectArray seems to support only CSGObjects, it can't hold structs like CStatistics::SigmoidParamters. Please correct me if I'm wrong about that :)

index_t num_classes =
predictions->get_num_classes();

SG_FREE(m_sigmoid_parameters);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as said i would rather use something more high level than this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood

int32_t prior1=0;
SG_SDEBUG("counting number of positive and negative labels\n")
{
for (index_t i=0; i<scores.vlen; ++i)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't it enough to count the one of them as the other can be simply deducted?
i would use something like std::count_if or std::transform

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, counting one of them would be enough. I using std::count_if would indeed be neat.


/* parameter setting */
/* maximum number of iterations */
index_t maxiter=100;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all these parameters should be set somewhere else... as you want to be able to control them without actually changing the code... i.e. there should be a getter/setter for them

Copy link
Contributor Author

@durovo durovo Feb 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not the one who originally implemented fit_sigmoid, I merely made a few corrections in it. However, the code can certainly be improved. I'm on it.

index_t length=prior1+prior0;

SGVector<float64_t> t(length);
for (index_t i=0; i<length; ++i)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above... too vanilla implementation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, can't this be done a bit more modern?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can either use std::transform or at least use the fact that SGVector has iterator so it can actually do a range-loop

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do, thanks for the suggestion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vigsterkr I opted to use std::transform

@durovo
Copy link
Contributor Author

durovo commented Feb 10, 2018

Thanks for the review! I will try and make all the requested changes.

@durovo
Copy link
Contributor Author

durovo commented Feb 10, 2018

@vigsterkr While making all the methods pure virtual would be nice, wouldn't we lose the informative messages generated by SG_NOTIMPLEMENTED? Suppose, the implementation class only want to use fit_binary wouldn't it be useful to have the message generated by SG_NOTIMPLEMENTED when fit_multiclass is called?

/** Set maximum number of iterations
* @param maxiter maximum number of iterations
*/
virtual void set_maxiter(index_t maxiter);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are these all virtual? you will have a calibration method that is based on Sigmoid?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not. Indeed there is no need to make them virtual.

@@ -42,28 +42,75 @@

using namespace shogun;

CSigmoidCalibration::CSigmoidCalibration() : CCalibration()
CSigmoidCalibration::CSigmoidCalibration()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should call the ctor of CCalibration

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my bad

@@ -64,26 +70,44 @@ namespace shogun
* @param targets The true labels corresponding to the predictions
* @return boolean indicating whether the calibration was succesful
**/
virtual bool fit_binary(CBinaryLabels* predictions, CBinaryLabels* targets);
virtual bool fit_binary(CBinaryLabels* predictions, CBinaryLabels* targets)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i still dont get why do we need default implementation with error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about making all methods pure virtual... suppose a calibration method does not support multiclass calibration, then the implementation will have to derive fit_multiclass and give error message manually. Would that be fine?

m_minstep = 1E-10;
m_sigma = 1E-12;
m_epsilon = 1E-5;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m_sigmoid_parameters is not registered... that'll make it impossible to clone this class. since currently registering std::vector is not possible you would need to use DynArray :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for telling me about DynArray, I was looking for the standard shogun way of storing arrays of non-CSGObjects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have opted to use two SGVectors for now

Copy link
Member

@karlnapf karlnapf Feb 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think that should be OK as a workaround for now.

* ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES
* (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you use the new (shorter) bsd header? see some other files

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one? /*

  • This software is distributed under BSD 3-clause license (see LICENSE file).
  • Authors:
    */

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

@durovo durovo force-pushed the feature/platt branch 2 times, most recently from 6f3b546 to 8e0e97b Compare February 18, 2018 11:31
* Written (w) 2012 - 2013 Heiko Strathmann
* Written (w) 2018 Dhruv Arya
* All rights reserved.
*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can also use the small license here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay

index_t maxiter = 100, float64_t minstep = 1E-10,
float64_t sigma = 1E-12, float64_t eps = 1E-5);

/** TODO: Remove this method after fixing MuticlassCalibration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls no todo in patches.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood


if (it>=maxiter-1)
{
SG_SWARNING("CStatistics::fit_sigmoid(): reaching maximal iterations,"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we dont do this ClassName::MethodName(): anymore. Just the message

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is mostly taken from the previous fit_sigmoid. Do you recommend changing all the messages?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah let's modernize a bit :)

/* initial Point and Initial Fun Value */
/* result parameters of sigmoid */
float64_t a=0;
float64_t b=CMath::log((prior0+1.0)/(prior1+1.0));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is this code coming from?

Copy link
Contributor Author

@durovo durovo Feb 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's all from the old fit_sigmoid


for (index_t i=0; i<length; ++i)
{
float64_t fApB=scores[i]*a+b;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this code is copy pasted, fine. Otherwise, can we rename this variable? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is mostly copy-pasted. Should I rename it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe leave it so it is clear where the code is coming from

Copy link
Member

@karlnapf karlnapf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the state of this? :)

@durovo durovo force-pushed the feature/platt branch 2 times, most recently from 648c371 to c769d80 Compare February 20, 2018 13:55
@durovo
Copy link
Contributor Author

durovo commented Feb 20, 2018

@karlnapf I have added a test for multiclass calibration.


namespace shogun
{
/** @brief Base class for all calibration methods. Call fit to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you tell doxygen that "fit" is a method here?

/** Fit calibration parameters for binary labels.
* @param predictions The predictions outputted by the machine
* @param targets The true labels corresponding to the predictions
* @return boolean indicating whether the calibration was succesful
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: no need to state the type in return, c++ tell you that already

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood.

{
REQUIRE((m_multiclass_confidences.num_rows != 0) &&
(m_multiclass_confidences.num_cols != 0),
"Call allocate_confidences_for() before fetching confidences");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing newline

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I woudl also suggest to change the error msg
"Empty confidences, which need to be allocated before fetching.\n")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, that looks much better :)

// [0.31003506, 0.60319518, 0.08676976],
// [0.26630134, 0.01476002, 0.71893864],
// [0.72470815, 0.20387599, 0.07141586],
// [0.074445 , 0.80897488, 0.11658012],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you remove those?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

0.75674645, 0.20144443, 0.04180912,
0.82689269, 0.05965391, 0.11345341,
0.81409222, 0.12715657, 0.05875121,
0.48765218, 0.38500051, 0.12734731,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where do the numbers in the tests come from?

Copy link
Contributor Author

@durovo durovo Feb 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I trained a linear SVC in sklearn, this is the decision function of that SVC. The expected output is the probabilities of the calibrated SVC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thats good! Could maybe be added as a comment in there. But nevermind for now

Copy link
Member

@karlnapf karlnapf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting there!
I think one more iteration and we are good :)

@durovo
Copy link
Contributor Author

durovo commented Feb 22, 2018

@karlnapf I have made all the requested changes. Code style fix should be done after this has been approved right?

@@ -73,16 +75,19 @@ int main(int argc, char** argv)
using the method described in Lin, H., Lin, C., and Weng, R. (2007). A note
on Platt's probabilistic outputs for support vector machines.
See BinaryLabels documentation for details*/
out_labels->scores_to_probabilities();
CSigmoidCalibration* sigmoid_calibration = new CSigmoidCalibration();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this deleted anywhere? SG_REF?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed this, sorry for that.


using namespace shogun;

void test_sigmoid_fitting()
{
CBinaryLabels* labels=new CBinaryLabels(10);
CBinaryLabels* predictions=new CBinaryLabels(10);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted somewhere?

CSigmoidCalibration* sigmoid_calibration = new CSigmoidCalibration();
sigmoid_calibration->fit_binary(predictions, labels);
CBinaryLabels* calibrated_labels = sigmoid_calibration->calibrate_binary(predictions);
calibrated_labels->get_values().display_vector("probabilities");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why display?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure myself. It was in the original code

labels->get_values().display_vector("probabilities");

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Not your fault of course. Would still be better removed :)
In fact, what would be best was to remove the example altogether (another PR) and have a meta example for that

params.b = m_sigmoid_bs[0];

/** Convert predictions to probabilties. */
auto values = calibrate_values(predictions->get_values(), params);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than the comment, I would name the variable probabilities and then you dont need it. Minor of course

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would look much better indeed

m_sigmoid_as.resize_vector(num_classes);
m_sigmoid_bs.resize_vector(num_classes);

/** Fit and store parameters for for each class seperately. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for this comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay

index_t num_classes = predictions->get_num_classes();
index_t num_samples = predictions->get_num_labels();

/** Matrix to temporarily store the probabilities. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would never write the type of a variable in a comment, but minor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it seems does seem very redundant. Will change this.

float64_t sum = SGVector<float64_t>::sum(values);

/** All classes have equal probability when sum is zero */
if (sum == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch!

namespace shogun
{
/** @brief Calibrates labels based on Platt Scaling [1]. Note that first calibration parameters need to be fitted by
* calling fit_binary() or fit_multiclass(). Then call calibrate on calibrate_binary() or calibrate_multiclass() on
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what will happen if a user forgets to call fit first?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The length of the parameter vector is checked in calibrate now and an appropriate message is displayed.

SGVector<float64_t> calibrate_values(SGVector<float64_t> values, CStatistics::SigmoidParamters params);

private:
/** Vector to store parameter A of sigmoid for each class. In case of binary labels, only one pair of parameters are stored. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no type in the comments or doxygen, it is clear from the code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it the first line be something like: "Stores parameter A of sigmoid for each class"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but that is really me nitpicking. Just for the future

{
SG_SDEBUG("entering CStatistics::fit_sigmoid()\n")

REQUIRE(scores.vector, "Specify scores vector!\n");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Provided scores are empty.\n"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood.

index_t maxiter, float64_t minstep,
float64_t sigma, float64_t epsilon)
{
SG_SDEBUG("entering CStatistics::fit_sigmoid()\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay

EXPECT_EQ(calibrated, true);

auto calibrated_labels = sigmoid_calibration->calibrate_binary(predictions);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why all the empty lines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not required? Will remove them.

Copy link
Member

@karlnapf karlnapf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good to go very soon
I put some minor comments in if you wanna do another iteration, but happy to merge (modulo CI) otherwise
@vigsterkr what do you think?

index_t length=prior1+prior0;

SGVector<float64_t> t(length);
for (index_t i=0; i<length; ++i)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can either use std::transform or at least use the fact that SGVector has iterator so it can actually do a range-loop

@durovo
Copy link
Contributor Author

durovo commented Feb 23, 2018

@vigsterkr @karlnapf I have made the requested changes. Should I fix the style, rebase and push?

@karlnapf
Copy link
Member

Yes! style fix and let's merge. Nice work!

@vigsterkr vigsterkr changed the title WIP: Fix platt scaling Fix platt scaling Feb 26, 2018
@vigsterkr vigsterkr merged commit 0a2c468 into shogun-toolbox:develop Feb 26, 2018
@karlnapf
Copy link
Member

🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants