Fix huber loss #178
Conversation
Hi @henry0312, I'm your friendly neighborhood Microsoft Pull Request Bot (You can call me MSBOT). Thanks for your contribution! The agreement was validated by Microsoft and real humans are currently evaluating your PR. TTYL, MSBOT
    return new RegressionL2loss(config);
  } else if (type == std::string("regression_l2") || type == std::string("mean_squared_error") || type == std::string("mse")) {
    return new RegressionL2loss(config);
merge these two if branches?
I'm sorry, I don't understand what you mean 😢
This PR will be merged after #175
change

if ()
...
else if ()
...

to

if ()
...

They are all regression objectives, right?
yes, they are.
you mean,
if (type == std::string("regression") || type == std::string("regression_l2") || type == std::string("mean_squared_error") || type == std::string("mse")) {
return new RegressionL2loss(config);
}
right?
yes, and you can break the if() condition across two lines:
if (type == std::string("regression") || type == std::string("regression_l2")
|| type == std::string("mean_squared_error") || type == std::string("mse")) {
return new RegressionL2loss(config);
}
const double a = 2.0 * weights_[i];  // difference of the two first derivatives, (zero to inf) and (zero to -inf)
const double b = 0.0;
const double c = (std::fabs(score[i]) + std::fabs(label_[i])) / 1.0e3;
hessians[i] = std::exp(-(x - b) * (x - b) / (2.0 * c * c)) / a;
I think you can use an inline function to calculate the hessian and avoid this repeated code.
const double a = 2.0;  // difference of the two first derivatives, (zero to inf) and (zero to -inf)
const double b = 0.0;
const double c = (std::fabs(score[i]) + std::fabs(label_[i])) / 1.0e3;
hessians[i] = std::exp(-(x - b) * (x - b) / (2.0 * c * c)) / a;
same as above, and you can put this function into utils/common.h
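Such a shared helper might look like the following sketch (names and the parameter list are illustrative, not the PR's actual code; the `a`/`b`/`c` constants mirror the diff above, with the exponent written as a proper Gaussian, dividing by 2c²):

```cpp
#include <cmath>

// Hypothetical inline helper for utils/common.h: a Gaussian-shaped
// approximation of the hessian. a scales the peak height, b shifts the
// center, and c (derived from the magnitudes of score and label) sets
// the width of the bump.
inline static double GaussianApproxHessian(double score, double label,
                                           double a, double b = 0.0) {
  const double diff = score - label;
  const double c = (std::fabs(score) + std::fabs(label)) / 1.0e3;
  return std::exp(-(diff - b) * (diff - b) / (2.0 * c * c)) / a;
}
```

The two call sites would then collapse to something like `hessians[i] = GaussianApproxHessian(score[i], label_[i], 2.0 * weights_[i]);` and `... , 2.0);` respectively.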
ok!
Force-pushed from 743e33c to 88983dc.
rebased master.
@henry0312 This topic has a bunch of gradient/hessian functions that help convergence for an L1 loss function when a hessian is mandatory for convergence. They were all tested in xgboost and all converge fast. Without a hessian, convergence is slow (which seems to be the case in LightGBM too, as @guolinke mentioned in the previous PR for L1 loss). It would be good to have an example to check the convergence speed of the L1 loss.

The short "explanation" is that when a hessian is mandatory for good convergence and we want to use an L1 loss function, we should avoid a constant hessian at all costs. To fix this issue, we smooth the linear function using a curve so the hessian is never zeroed out (better and faster convergence). For instance, the Fair objective (from Microsoft Research) is used to approximate such an L1 function requiring a hessian. The R versions with their gradient/hessian formulas are below:

```r
ln_cosh_obj <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  grad <- tanh(preds - labels)
  hess <- 1 - grad * grad
  return(list(grad = grad, hess = hess))
}

ln_expexp_obj <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  x <- preds - labels
  grad <- (exp(2 * x) - 1) / (exp(2 * x) + 1)
  hess <- (4 * exp(2 * x)) / (exp(2 * x) + 1)^2
  return(list(grad = grad, hess = hess))
}

cauchyobj <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  c <- 2  # the lower, the "slower/smoother" the loss is
  x <- preds - labels
  grad <- x / (x^2 / c^2 + 1)
  hess <- -c^2 * (x^2 - c^2) / (x^2 + c^2)^2
  return(list(grad = grad, hess = hess))
}

fairobj <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  c <- 2  # the lower, the "slower/smoother" the loss is
  x <- preds - labels
  grad <- c * x / (abs(x) + c)
  hess <- c^2 / (abs(x) + c)^2
  return(list(grad = grad, hess = hess))
}
```
@Laurae2 The hessian of MAE is a delta function; therefore, if we can approximate it with something appropriate (e.g. a Gaussian function), convergence and performance will be good, I guess.
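To make that idea concrete, here is a minimal sketch (function names and the width parameter `sigma` are illustrative, not from this PR): the MAE gradient is the sign of the residual, and the delta-function hessian is replaced by a Gaussian bump of width `sigma` centered at residual zero.

```cpp
#include <cmath>

// Gradient of |score - label|: the sign of the residual.
inline double MaeGradient(double score, double label) {
  const double diff = score - label;
  return static_cast<double>((diff > 0.0) - (diff < 0.0));  // sign(diff)
}

// The true hessian of MAE is a delta function (zero almost everywhere),
// which stalls Newton-style boosting. Substitute a Gaussian density of
// width sigma, which peaks where the delta function lives (diff == 0).
inline double MaeHessianGaussian(double score, double label, double sigma) {
  const double kPi = 3.14159265358979323846;  // avoids the M_PI portability issue
  const double diff = score - label;
  return std::exp(-diff * diff / (2.0 * sigma * sigma))
         / (sigma * std::sqrt(2.0 * kPi));
}
```

The smaller `sigma`, the closer this is to the exact delta function, at the cost of near-zero hessians away from the minimum.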
 * w means weights.
 */
inline static double ApproximateHessianWithGaussian(double y, double t, double w=1.0f) {
inline static double ApproximateHessianWithGaussian(const double y, const double t, const double g, const double w=1.0f) {
  const double diff = y - t;
  const double pi = M_PI;
Where is M_PI defined? I cannot compile it on Windows.
Could you add #include <cmath>?
It seems we need to add #define _USE_MATH_DEFINES on Windows:
http://stackoverflow.com/questions/6563810/m-pi-works-with-math-h-but-not-with-cmath-in-visual-studio
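For reference, a minimal sketch of the fix (on MSVC the define must appear before the header is included; on other toolchains it is harmless):

```cpp
// _USE_MATH_DEFINES must be defined before <cmath> (or <math.h>) for
// MSVC to expose M_PI and the other math constants.
#define _USE_MATH_DEFINES
#include <cmath>

double TwoPi() { return 2.0 * M_PI; }
```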
Probably fixed with 05e97f1.
I finished fixing.
Next, I will create the Fair loss in another PR, as @Laurae2 suggested.
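For reference, a direct C++ translation of the `fairobj` gradient/hessian from @Laurae2's comment might look like the sketch below (function names and the default constant `c` are illustrative, not the future PR's code):

```cpp
#include <cmath>

// Sketch of the Fair loss derivatives, translated from the R fairobj
// above; c controls smoothness (the lower, the slower/smoother the loss).
inline double FairGradient(double score, double label, double c = 2.0) {
  const double x = score - label;
  return c * x / (std::fabs(x) + c);
}

// The hessian stays strictly positive everywhere, which is exactly the
// property the discussion above asks for in place of MAE's delta function.
inline double FairHessian(double score, double label, double c = 2.0) {
  const double x = score - label;
  return c * c / ((std::fabs(x) + c) * (std::fabs(x) + c));
}
```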
TODO: