Upgrading SoftmaxWithLoss layer to also accept spatially varying weights #5828
base: master
Conversation
…to accept spatially varying non-negative weights for the loss; making the WeightedSoftmaxWithLoss layer inherit from the SoftmaxWithLoss layer
@shelhamer please see this updated PR. I made the weighted loss layer inherit from SoftmaxWithLoss. Thanks!
@@ -13,6 +13,8 @@ void SoftmaxWithLossLayer<Dtype>::LayerSetUp(
   LossLayer<Dtype>::LayerSetUp(bottom, top);
   LayerParameter softmax_param(this->layer_param_);
   softmax_param.set_type("Softmax");
+  // no loss weight for the Softmax internal layer.
+  softmax_param.clear_loss_weight();
This change is supposed to resolve issue #2968. Note that with one of the changes proposed in this PR, loss_weight should be of length 2; the second entry is ignored.
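For context, a two-entry loss_weight in a prototxt would look roughly like this — a sketch with made-up layer and blob names, where the second entry would be ignored per the note above:

```
layer {
  name: "loss"          # hypothetical name
  type: "SoftmaxWithLoss"
  bottom: "score"       # hypothetical bottoms: predictions and labels
  bottom: "label"
  top: "loss"
  top: "prob"           # optional second top (softmax probabilities)
  loss_weight: 1        # weight applied to the loss output
  loss_weight: 0        # second entry, ignored per this PR
}
```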
    }
  }
  top[0]->mutable_cpu_data()[0] = loss
      / this->get_normalizer(this->normalization_, agg_weight);
Have you tried normalizing by count instead of by agg_weight? For example, this weighted softmax loss implementation normalizes by count. That said, I think yours makes a bit more sense; otherwise one might need to tune the loss weight a bit.
This is a good point, and it seems like the right way to normalize. However, this is not set in stone.
I tried it a bit on my dataset, setting different weights for different classes, and it looks like normalizing by count is better than by agg_weight. Maybe I would need to tune the learning rate or loss_weight a bit to achieve good results if I choose to use agg_weight. Have you tried the two different ways of normalization?
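To make the two options concrete, here is a rough sketch — not this PR's actual code — of a get_normalizer that divides either by a valid count or by the aggregated weight, with the argument type changed to Dtype as this PR proposes; the -1 sentinel convention is borrowed from upstream Caffe:

```cpp
// Sketch only: get_normalizer with a Dtype argument, so it can receive
// either a valid count or the aggregated spatial weight.
template <typename Dtype>
Dtype SoftmaxWithLossLayer<Dtype>::get_normalizer(
    LossParameter_NormalizationMode normalization_mode, Dtype agg_weight) {
  Dtype normalizer;
  switch (normalization_mode) {
    case LossParameter_NormalizationMode_FULL:
      normalizer = Dtype(outer_num_ * inner_num_);
      break;
    case LossParameter_NormalizationMode_VALID:
      // Divide by the aggregated weight (or valid count); fall back to
      // the FULL count when no aggregate was computed.
      normalizer = (agg_weight == Dtype(-1)) ?
          Dtype(outer_num_ * inner_num_) : agg_weight;
      break;
    case LossParameter_NormalizationMode_BATCH_SIZE:
      normalizer = Dtype(outer_num_);
      break;
    case LossParameter_NormalizationMode_NONE:
      normalizer = Dtype(1);
      break;
    default:
      LOG(FATAL) << "Unknown normalization mode: "
          << LossParameter_NormalizationMode_Name(normalization_mode);
  }
  // Guard against division by zero when all weights are zero.
  return std::max(Dtype(1.0), normalizer);
}
```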
@weiliu89 I use it mainly for semantic segmentation, where each sample has several labels. How about changing caffe.proto and get_normalizer to support an additional normalization method? @shelhamer what do you think about it?
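For illustration, such an extension might look like this in caffe.proto's LossParameter; the WEIGHT_SUM name and semantics are made up here and are not part of this PR:

```protobuf
// Sketch of the NormalizationMode enum in LossParameter;
// WEIGHT_SUM is hypothetical.
enum NormalizationMode {
  FULL = 0;        // divide by the total number of output locations
  VALID = 1;       // divide by the number of non-ignored outputs
  BATCH_SIZE = 2;  // divide by the batch size
  NONE = 3;        // no normalization
  WEIGHT_SUM = 4;  // hypothetical: divide by the sum of spatial weights
}
```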
@shelhamer thank you for "focus"ing on this PR. Is there anything I can do to ease the process of accepting this PR?
@shaibagon Thank you for the post. I want to use this layer in my network. How can I add this layer to an existing Caffe installation? I installed Caffe with Visual Studio 2013 and my operating system is Windows.
adding "WeightedSoftmaxWithloss" layer.
This new loss layer upgrades "SoftmaxWithLoss" layer (and is derived from it) to accept spatially varying non-negative weights for the loss.
(replaces PR #5801 - due to change in implementation)
Usage:
This PR includes GPU implementation and tests.
This PR
"SoftmaxWithLoss"
layer(a) changed type of
get_normalizer
argument otDtype
(b) ignore
loss_weight
for the internal"Softmax"
layercaffe.proto
(!)
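Not from the original description, but as an illustration of the intended usage: a minimal prototxt sketch, assuming the weighted layer takes the per-location weights as a third bottom (all names made up):

```
layer {
  name: "loss"                    # hypothetical name
  type: "WeightedSoftmaxWithLoss"
  bottom: "score"                 # predictions (hypothetical blob names)
  bottom: "label"                 # ground-truth labels
  bottom: "pixel_weights"         # spatially varying non-negative weights
  top: "loss"
}
```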