Adding weighted Euclidean loss layer #5894
base: master
Conversation
virtual inline const char* type() const { return "WeightedEuclideanLoss"; }
/**
 * Unlike most loss layers, in the EuclideanLossLayer we can backpropagate
Indeed your layer can propagate to `bottom[0]` and `bottom[1]`, but what about `bottom[2]`?
If this function was copied from the `"EuclideanLoss"` layer, you need to consider whether it is still appropriate here as well...
From a mathematical point of view it is doable, but from a machine-learning point of view, what is the point of differentiating the loss function with respect to the sample weights?
Hi @shaibagon. I have implemented backpropagation for the weights (`bottom[2]`). Could you take a look? Thanks
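For reference, if the forward loss follows Caffe's usual EuclideanLoss normalization (an assumption; the PR may normalize differently), with `a = bottom[0]`, `b = bottom[1]`, and weights `w = bottom[2]`, the gradient with respect to each weight is just the normalized squared difference:

```latex
L = \frac{1}{2N}\sum_{i} w_i\,(a_i - b_i)^2
\quad\Longrightarrow\quad
\frac{\partial L}{\partial w_i} = \frac{(a_i - b_i)^2}{2N}
```

Note the gradient is non-negative everywhere, which is why a regularizer or fixed weights are typically needed if the weights themselves are learned.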
diff_.cpu_data(), // a
Dtype(0), // beta
bottom[i]->mutable_cpu_diff()); // b
for (int j = 0; j < bottom[i]->count(); ++j) {
This loop of element-wise multiplication by the weights can be replaced with `void caffe_mul(const int N, const Dtype* a, const Dtype* b, Dtype* y);`
diff_.mutable_cpu_data());
Dtype wdot(0.0);
for (int i = 0; i < count; ++i) {
You can compute the loss in a slightly different way (assuming your weights are non-negative):
- compute the difference: diff <- (x1 - x2)
- scale the diff by the square root of the weights: sdiff <- sqrt(w) .* diff
- compute the dot product <sdiff, sdiff>
These three steps can be easily carried out using off-the-shelf math functions that can be found in math_functions.cpp.
Once your forward/backward functions are implemented using Caffe's "math functions", it should be fairly easy to implement a GPU version of the layer as well; this can be really important for successful usage of the layer.
Hi,
I propose a weighted Euclidean loss layer. I think it could be very useful; for example, it can naturally handle missing labels when training multivariate regression models: missing labels can be assigned a low weight, while information from the existing labels is leveraged by assigning them a higher weight.
This layer has an additional input ("bottom") that should correspond one-to-one with the labels; the values in this input represent each label's "weight".
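As a usage sketch, the layer might be wired into a network definition like this (the blob names are illustrative and not taken from the PR; only the `type` string comes from the patch):

```
layer {
  name: "loss"
  type: "WeightedEuclideanLoss"
  bottom: "predictions"    # bottom[0]: network output
  bottom: "labels"         # bottom[1]: regression targets
  bottom: "label_weights"  # bottom[2]: same shape as "labels"; 0 = missing
  top: "loss"
}
```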