
Fix huber loss #178

Merged: 6 commits merged into microsoft:master from henry0312:fix_huber_loss on Jan 9, 2017
Conversation

@henry0312 (Contributor) commented Jan 9, 2017

TODO:

@msftclas commented Jan 9, 2017

Hi @henry0312, I'm your friendly neighborhood Microsoft Pull Request Bot (You can call me MSBOT). Thanks for your contribution!
You've already signed the contribution license agreement. Thanks!

The agreement was validated by Microsoft and real humans are currently evaluating your PR.

TTYL, MSBOT;

    return new RegressionL2loss(config);
  } else if (type == std::string("regression_l2") || type == std::string("mean_squared_error") || type == std::string("mse")) {
    return new RegressionL2loss(config);
Collaborator:

merge these two ifs?

Contributor Author:

I'm sorry, I don't understand what you mean 😢
This PR will be merged after #175

Collaborator:

change

  if ()
    ...
  else if ()
    ...

to

  if ()
    ...

They are all regression objectives, right?

Contributor Author:

yes, they are.

you mean,

  if (type == std::string("regression") || type == std::string("regression_l2") || type == std::string("mean_squared_error") || type == std::string("mse")) {
    return new RegressionL2loss(config);
  }

right?

Collaborator:

yes. And you can break the if() across two lines:

  if (type == std::string("regression") || type == std::string("regression_l2") 
      || type == std::string("mean_squared_error") || type == std::string("mse")) {
    return new RegressionL2loss(config);
  }

const double a = 2.0 * weights_[i]; // difference of the two first derivatives, (zero to inf) and (zero to -inf)
const double b = 0.0;
const double c = (std::fabs(score[i]) + std::fabs(label_[i])) / 1.0e3;
// x is the residual, score[i] - label_[i]
hessians[i] = std::exp(-(x - b) * (x - b) / (2.0 * c * c)) / a;
Collaborator:

I think you can use an inline function to calculate the hessian and avoid this repeated code.

const double a = 2.0; // difference of the two first derivatives, (zero to inf) and (zero to -inf)
const double b = 0.0;
const double c = (std::fabs(score[i]) + std::fabs(label_[i])) / 1.0e3;
// x is the residual, score[i] - label_[i]
hessians[i] = std::exp(-(x - b) * (x - b) / (2.0 * c * c)) / a;
Collaborator:

Same as above, and you can put this function into utils/common.h.
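
A minimal sketch of what that helper could look like (illustrative only, reusing the a, b, c names from the snippets above; not necessarily the code that was merged):

  #include <cmath>

  // e.g. in utils/common.h: Gaussian kernel shared by the weighted and
  // unweighted paths, with b the center, c the width, and a the scale.
  inline static double GaussianKernel(double x, double b, double c, double a) {
    return std::exp(-(x - b) * (x - b) / (2.0 * c * c)) / a;
  }

Both call sites would then reduce to hessians[i] = GaussianKernel(x, b, c, a);.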

Contributor Author:

ok!

@henry0312 (Contributor Author) commented:

rebased master.
However, there may be something wrong with ApproximateHessianWithGaussian.
I'll check.

@Laurae2 (Contributor) commented Jan 9, 2017

@henry0312 This topic has a bunch of gradient/hessian functions that help convergence for an L1 loss when a hessian is mandatory for convergence.

They were all tested in xgboost and all converge fast. Without a hessian, convergence is slow (which seems to be the case in LightGBM too, as @guolinke mentioned in the previous PR for L1 loss). It would be good to have an example to check the convergence speed of the L1 loss.

The short "explanation" is that when a hessian is mandatory for good convergence and we want an L1 loss function, we should avoid a constant hessian at all costs. The fix is to smooth the linear function with a curve so the hessian never zeroes out (better and faster convergence). For instance, the Fair objective (from Microsoft research) is used to approximate such an L1 function when a hessian is required.

The R versions with their gradient/hessian formulas are below:

# log-cosh loss: grad = tanh(x), hess = sech^2(x) = 1 - tanh^2(x)
ln_cosh_obj <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  grad <- tanh(preds - labels)
  hess <- 1 - grad * grad
  return(list(grad = grad, hess = hess))
}

# the same loss written with exponentials: grad = tanh(x), hess = sech^2(x)
ln_expexp_obj <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  x <- preds - labels
  grad <- (exp(2 * x) - 1) / (exp(2 * x) + 1)
  hess <- (4 * exp(2 * x)) / (exp(2 * x) + 1)^2
  return(list(grad = grad, hess = hess))
}

# Cauchy loss (note: hess goes negative when |x| > c)
cauchyobj <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  c <- 2  # the lower, the "slower/smoother" the loss
  x <- preds - labels
  grad <- x / (x^2 / c^2 + 1)
  hess <- -c^2 * (x^2 - c^2) / (x^2 + c^2)^2
  return(list(grad = grad, hess = hess))
}

# Fair loss
fairobj <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  c <- 2  # the lower, the "slower/smoother" the loss
  x <- preds - labels
  grad <- c * x / (abs(x) + c)
  hess <- c^2 / (abs(x) + c)^2
  return(list(grad = grad, hess = hess))
}

@henry0312 (Contributor Author) commented:

@Laurae2 The hessian of MAE is a delta function; therefore, if we can approximate it with something appropriate (e.g. a Gaussian function), convergence and performance will be good, I guess.
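
In symbols, the identity behind this (a standard fact, independent of this PR; c is the width of the approximating Gaussian):

  \frac{d^2}{dx^2}\,|x| = 2\,\delta(x) \approx \frac{2}{c\sqrt{2\pi}} \exp\!\left(-\frac{x^2}{2c^2}\right)

The smaller c is, the closer the Gaussian is to the delta function, but the sooner the hessian underflows far from the minimum.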

@henry0312 changed the title from "Fix huber loss" to "[WIP] Fix huber loss" on Jan 9, 2017
  * w means weights.
  */
 -inline static double ApproximateHessianWithGaussian(double y, double t, double w=1.0f) {
 +inline static double ApproximateHessianWithGaussian(const double y, const double t, const double g, const double w=1.0f) {
    const double diff = y - t;
    const double pi = M_PI;
Collaborator:
Where is M_PI defined? I cannot compile this on Windows.

Contributor Author:

Could you add #include <cmath>?
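
For reference, <cmath> alone may not be enough on Windows: M_PI is a POSIX extension, not standard C++, and MSVC only exposes it behind a macro. A sketch of the usual fix (not necessarily what the commit below does):

  // MSVC defines M_PI only if _USE_MATH_DEFINES appears before the
  // first include of <cmath> (or <math.h>).
  #define _USE_MATH_DEFINES
  #include <cmath>

  // A portable alternative is to define the constant directly:
  // const double kPi = 3.14159265358979323846;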

Contributor Author:

probably fixed with 05e97f1

@henry0312 (Contributor Author) commented:

I finished fixing ApproximateHessianWithGaussian (and the Huber loss).
The L1 loss is approximated with a Gaussian function, as in http://www.wolframalpha.com/input/?i=Integrate%5BErf%5Bx%5D%5D,+Integrate%5BErf%5Bx%2F10%5D%5D.
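
A sketch of that idea (illustrative names and a hypothetical smoothing width c, not the merged implementation): the integral of erf is a smooth |x|, so its first derivative is a smooth sign and its second derivative is a Gaussian that never vanishes.

  #include <cmath>

  // Smoothed L1 built from Integrate[Erf[x / (c * sqrt(2))]]:
  // gradient = erf(diff / (c * sqrt(2)))                 (smooth sign)
  // hessian  = sqrt(2 / pi) / c * exp(-diff^2 / (2 c^2)) (Gaussian, never zero)
  inline double SmoothL1Gradient(double diff, double c) {
    return std::erf(diff / (c * std::sqrt(2.0)));
  }

  inline double SmoothL1Hessian(double diff, double c) {
    const double pi = 3.14159265358979323846;  // avoids the M_PI portability issue above
    return std::sqrt(2.0 / pi) / c * std::exp(-diff * diff / (2.0 * c * c));
  }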

@henry0312 changed the title from "[WIP] Fix huber loss" to "Fix huber loss" on Jan 9, 2017
@henry0312 (Contributor Author) commented Jan 9, 2017

Next, I will add the Fair loss in another PR, which @Laurae2 told me about.

@henry0312 mentioned this pull request on Jan 9, 2017
@guolinke merged commit 27d3eb3 into microsoft:master on Jan 9, 2017
@henry0312 deleted the fix_huber_loss branch on January 9, 2017 16:15
@Laurae2 mentioned this pull request on Apr 6, 2017
@lock locked this conversation as resolved and limited it to collaborators on Mar 17, 2020