
Questions about hardcoded values in the class-balanced cross entropy loss function #3

Closed
philferriere opened this issue Feb 22, 2018 · 4 comments


@philferriere

First, thank you for sharing your code with us. This is interesting work, and I can't wait to try reproducing some of your results.

I noticed that the way the class-balanced cross entropy loss is computed in this repo differs slightly from the "base" OSVOS implementation, where it is coded as follows:

import tensorflow as tf  # TF 1.x API (tf.log was renamed tf.math.log in TF 2.x)

def class_balanced_cross_entropy_loss(output, label):
    """Define the class balanced cross entropy loss to train the network
    Args:
        output: Output of the network (logits, before the sigmoid)
        label: Ground truth label
    Returns:
        Tensor that evaluates the loss
    """
    # Binarize the ground truth and count the pixels in each class.
    labels = tf.cast(tf.greater(label, 0.5), tf.float32)

    num_labels_pos = tf.reduce_sum(labels)
    num_labels_neg = tf.reduce_sum(1.0 - labels)
    num_total = num_labels_pos + num_labels_neg

    # Numerically stable per-pixel log-likelihood of the sigmoid prediction:
    # labels * log(sigmoid(output)) + (1 - labels) * log(1 - sigmoid(output)).
    output_gt_zero = tf.cast(tf.greater_equal(output, 0), tf.float32)
    loss_val = tf.multiply(output, (labels - output_gt_zero)) - tf.log(
        1 + tf.exp(output - 2 * tf.multiply(output, output_gt_zero)))

    loss_pos = tf.reduce_sum(-tf.multiply(labels, loss_val))
    loss_neg = tf.reduce_sum(-tf.multiply(1.0 - labels, loss_val))

    # Per-image balancing: each class is weighted by the proportion of the
    # OTHER class, so the rarer class contributes more to the total loss.
    final_loss = num_labels_neg / num_total * loss_pos + num_labels_pos / num_total * loss_neg

    return final_loss

However, for the lesion segmenter, the final loss is computed as shown here:

final_loss = 0.1018*loss_neg + 0.8982*loss_pos

For the liver segmenter, it is computed using the following formula:

final_loss = 0.931 * loss_pos + 0.069 * loss_neg
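If I read the code correctly, the fixed-weight version amounts to something like this (my own reconstruction for illustration; the function name and keyword arguments are mine, and loss_pos / loss_neg are assumed to be computed exactly as in the OSVOS function above):

def class_balanced_cross_entropy_loss_fixed(output, label,
                                            weight_pos=0.8982, weight_neg=0.1018):
    """Same per-pixel loss as above, but with fixed, dataset-wide class weights."""
    labels = tf.cast(tf.greater(label, 0.5), tf.float32)

    # Numerically stable sigmoid cross entropy, identical to the original.
    output_gt_zero = tf.cast(tf.greater_equal(output, 0), tf.float32)
    loss_val = tf.multiply(output, (labels - output_gt_zero)) - tf.log(
        1 + tf.exp(output - 2 * tf.multiply(output, output_gt_zero)))

    loss_pos = tf.reduce_sum(-tf.multiply(labels, loss_val))
    loss_neg = tf.reduce_sum(-tf.multiply(1.0 - labels, loss_val))

    # Hardcoded, dataset-wide weights replace the per-image proportions.
    return weight_pos * loss_pos + weight_neg * loss_neg

Note that the per-image counts num_labels_pos / num_labels_neg are no longer needed here, which is what prompts my questions below.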

My questions are the following:

1/ Why use hardcoded constants instead of calculating the actual foreground/background proportions, as in the original implementation?
2/ What procedure did you use to come up with the hardcoded constants? Are those the average foreground/background proportions over the entire training set (a sketch of what I mean follows this list)? A portion of the training set? The training + validation set?
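
For concreteness, the following is roughly what I mean by "average proportions over the entire training set" (my own sketch, not code from this repo):

import numpy as np

def dataset_class_proportions(masks):
    """Compute global foreground/background pixel fractions over binary masks.
    Args:
        masks: iterable of numpy arrays, foreground where value > 0.5
    Returns:
        (p_fg, p_bg) tuple of global pixel fractions
    """
    num_fg = sum(int((m > 0.5).sum()) for m in masks)
    num_total = sum(m.size for m in masks)
    return num_fg / num_total, 1.0 - num_fg / num_total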

Thank you for your help with this!

@miriambellver
Collaborator

Thanks for your interest! These values were obtained with the balancing strategy explained in the article https://arxiv.org/pdf/1711.11069.pdf in the section Loss objective. We tried several balancing strategies and this is the one that worked best for us.

@philferriere
Author

Thank you for taking the time to answer my questions, Miriam.

May I suggest that you perhaps make it more obvious in the paper over which dataset (or portion of the dataset) the class weighting terms were computed?

Re: "we tried several balancing strategies and this is the one that worked best for us." Could you quantify how much your dice score improved by using dataset-wide class weights instead of using the original per-image actual proportions, as used in the original implementation?

@miriambellver
Collaborator

If you want the exact numbers of the comparison, they are in my Master's thesis: the Loss objective section has a subsection on the different balancing techniques we used, and the results section reports the exact numbers (Table 4.2). We used a validation set that we selected from the whole LiTS training set; that information is also in the thesis document. We chose the first 80% of the volumes for training and the remaining 20% for validation.
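
In pseudocode, the split was along these lines (a rough sketch; it assumes the 131 LiTS training volumes keep their original ordering):

# 80/20 split of the LiTS training volumes, as described above.
volume_ids = list(range(131))        # assumes volumes indexed 0..130
split = int(0.8 * len(volume_ids))   # first 80% -> training
train_ids = volume_ids[:split]       # volumes 0..103
val_ids = volume_ids[split:]         # volumes 104..130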

@philferriere
Author

Thank you for the link and taking the time to address my questions, Miriam.
