
Upgrading SoftmaxWithLoss layer to also accept spatially varying weights #5828

Open · wants to merge 1 commit into master
Conversation

@shaibagon (Member) commented Aug 7, 2017

adding "WeightedSoftmaxWithloss" layer.
This new loss layer upgrades "SoftmaxWithLoss" layer (and is derived from it) to accept spatially varying non-negative weights for the loss.

(replaces PR #5801 - due to change in implementation)

Usage:

layer {
  name: "weighted_loss"
  type: "WeightedSoftmaxWithLoss"
  bottom: "predictions"  # raw predictions, e.g., B-C-H-W
  bottom: "labels"  # per-pixel label, e.g., B-1-H-W
  bottom: "weights"  # per pixel loss weight, e.g., B-1-H-W
  top: "loss"
  softmax_param { axis: 1 }
  loss_param { ignore_label: -1 normalization: VALID } # normalize by the SUM of the valid weights
}

This PR includes GPU implementation and tests.
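For intuition about what the implementation computes, here is a minimal, self-contained CPU sketch of the weighted loss. It is illustrative only, not the PR's code: the function name weighted_softmax_loss is hypothetical, prob is assumed to be the flattened B-C-H-W softmax output, and label/weight are flattened B-1-H-W blobs as in the usage example above.

#include <algorithm>
#include <cfloat>
#include <cmath>
#include <vector>

float weighted_softmax_loss(const std::vector<float>& prob,
                            const std::vector<int>& label,
                            const std::vector<float>& weight,
                            int num, int channels, int spatial,
                            int ignore_label) {
  float loss = 0.f, agg_weight = 0.f;
  for (int i = 0; i < num; ++i) {          // batch
    for (int j = 0; j < spatial; ++j) {    // H*W pixels
      const int l = label[i * spatial + j];
      if (l == ignore_label) continue;     // skip ignored pixels
      const float w = weight[i * spatial + j];        // per-pixel weight
      const float p = prob[(i * channels + l) * spatial + j];
      loss -= w * std::log(std::max(p, FLT_MIN));     // weighted NLL
      agg_weight += w;                                // sum of valid weights
    }
  }
  // normalization: VALID divides by the sum of the valid weights;
  // the max() guards against division by zero.
  return loss / std::max(agg_weight, 1.f);
}

Note that setting all weights to 1 recovers the standard SoftmaxWithLoss behavior, since the weight sum then equals the valid-pixel count.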

This PR

  • adds a new layer
  • makes minor changes to the "SoftmaxWithLoss" layer:
    (a) changed the type of the get_normalizer argument to Dtype (see the signature sketch after this list)
    (b) ignores loss_weight for the internal "Softmax" layer
  • makes no changes to caffe.proto(!)
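Regarding change (a): once per-pixel weights are introduced, the quantity passed to get_normalizer is a sum of real-valued weights rather than an integer count of valid pixels, hence the type change. A before/after sketch of the signature (the "before" matches upstream Caffe at the time; the "after" is inferred from this PR's description):

// Before: the normalizer argument is an integer count of valid,
// i.e. non-ignored, samples.
virtual Dtype get_normalizer(
    LossParameter_NormalizationMode normalization_mode, int valid_count);

// After (this PR): a Dtype, so a sum of real-valued per-pixel
// weights can be passed instead of an integer count.
virtual Dtype get_normalizer(
    LossParameter_NormalizationMode normalization_mode, Dtype valid_count);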

…o accept spatially varying non-negative weights for the loss.

making WeightedSoftmaxWithLoss layer inherit from SoftmaxWithLoss layer
@shaibagon (Member, Author):

@shelhamer please see this updated PR. I made the weighted loss layer inherit from "SoftmaxWithLoss" as you recommended.

Thanks!

@@ -13,6 +13,8 @@ void SoftmaxWithLossLayer<Dtype>::LayerSetUp(
LossLayer<Dtype>::LayerSetUp(bottom, top);
LayerParameter softmax_param(this->layer_param_);
softmax_param.set_type("Softmax");
// no loss weight for the Softmax internal layer.
softmax_param.clear_loss_weight();
@shaibagon (Member, Author) commented on the diff, Aug 9, 2017:

This change is supposed to resolve issue #2968:
loss_weight should be of length 2, and the second entry is ignored.

@shaibagon (Member, Author):

Note that one of the changes proposed in this PR to the "SoftmaxWithLoss" layer resolves issue #2968.

}
}
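// agg_weight is the accumulated sum of the weights of all non-ignored
// pixels; with normalization: VALID the loss is divided by this sum.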
top[0]->mutable_cpu_data()[0] = loss
    / this->get_normalizer(this->normalization_, agg_weight);

@weiliu89 commented:

Have you tried normalizing by count instead of agg_weight? For example, this weighted softmax loss implementation normalizes by count. That said, I think yours makes a bit more sense; otherwise one might need to tune the loss weight a bit.

@shaibagon (Member, Author) replied:

This is a good point. It seems like the right way to normalize; however, this is not set in stone.

@weiliu89 replied:

I tried it a bit on my dataset, setting different weights for different classes, and it looks like normalizing by count works better than by agg_weight. Maybe I would need to tune the learning rate or loss_weight a bit to get good results if I choose to use 'agg_weight'. Have you tried the two different ways of normalizing?

@shaibagon (Member, Author) replied:

@weiliu89 I use it mainly for semantic segmentation, where each sample has several labels.
How about changing caffe.proto and get_normalizer to support an additional normalization method? @shelhamer what do you think about it?
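For concreteness, a hypothetical sketch of what such an extra mode could look like inside get_normalizer. The WEIGHT_SUM name does not exist in caffe.proto; it only illustrates the proposal of exposing both count-based and weight-sum-based normalization:

template <typename Dtype>
Dtype SoftmaxWithLossLayer<Dtype>::get_normalizer(
    LossParameter_NormalizationMode normalization_mode, Dtype valid_count) {
  Dtype normalizer;
  switch (normalization_mode) {
    case LossParameter_NormalizationMode_FULL:
      normalizer = Dtype(outer_num_ * inner_num_);  // all pixels
      break;
    case LossParameter_NormalizationMode_VALID:
      normalizer = valid_count;  // count of non-ignored pixels
      break;
    // Hypothetical new mode (NOT in caffe.proto): normalize by the
    // sum of the per-pixel weights instead of the valid count.
    // case LossParameter_NormalizationMode_WEIGHT_SUM:
    //   normalizer = agg_weight_;
    //   break;
    case LossParameter_NormalizationMode_BATCH_SIZE:
      normalizer = Dtype(outer_num_);
      break;
    case LossParameter_NormalizationMode_NONE:
      normalizer = Dtype(1);
      break;
    default:
      LOG(FATAL) << "Unknown normalization mode.";
  }
  // Never divide by zero, e.g. when every pixel carries ignore_label.
  return std::max(Dtype(1.0), normalizer);
}

The count-vs-weight-sum trade-off discussed above would then be a per-layer configuration choice instead of a fixed behavior.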

@shaibagon (Member, Author):

@shelhamer thank you for "focus"ing on this PR. Is there anything I can do to ease the process of accepting this PR?

@giihyun commented Feb 21, 2018

@shaibagon Thank you for the post. I want to use this layer in my network. How can I add this layer to an existing Caffe build? I installed Caffe with Visual Studio 2013 and my operating system is Windows.
