### 1. dice loss

#### 1.1 Implementation with [TensorFlow / Keras](image_segmentation.ipynb)
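The Dice coefficient is a metric that measures the overlap between a predicted mask and the ground-truth mask; the dice loss is one minus that coefficient. More info on optimizing for the Dice coefficient (our dice loss) can be found in the V-Net [paper](http://campar.in.tum.de/pub/milletari2016Vnet/milletari2016Vnet.pdf), where it was introduced as a training objective.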

We use dice loss here because, by design, it performs better on class-imbalanced problems. In addition, maximizing the Dice coefficient and IoU metrics is the actual objective of our segmentation task. Cross entropy is more of a proxy that is easier to optimize. With dice loss, we optimize our objective directly.
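
To make the class-imbalance argument concrete, here is a minimal pure-Python sketch (not the notebook's code). The toy mask, the 0.5 threshold, and the helper `pixel_accuracy` are assumptions made for illustration. It compares pixel accuracy with the smoothed Dice coefficient for a degenerate prediction that labels every pixel as background.

```python
def dice_coeff(y_true, y_pred, smooth=1.0):
    """Smoothed Dice coefficient for two flat sequences of values in [0, 1]."""
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    return (2.0 * intersection + smooth) / (sum(y_true) + sum(y_pred) + smooth)


def pixel_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of pixels whose thresholded prediction matches the label."""
    hits = sum(
        1 for t, p in zip(y_true, y_pred)
        if t == (1.0 if p >= threshold else 0.0)
    )
    return hits / len(y_true)


# Hypothetical toy mask: 100 pixels, only 5 of them foreground.
y_true = [1.0] * 5 + [0.0] * 95
# A degenerate prediction that labels every pixel as background.
all_background = [0.0] * 100

accuracy = pixel_accuracy(y_true, all_background)
dice = dice_coeff(y_true, all_background)
print(f"accuracy = {accuracy:.2f}, dice = {dice:.3f}")
# prints: accuracy = 0.95, dice = 0.167
```

The useless prediction still scores 95% accuracy because background dominates, while its Dice coefficient stays low because no foreground pixel is recovered.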


In [1]:
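import tensorflow as tf

def dice_coeff(y_true, y_pred):
    # Smoothing term: keeps the ratio defined when both masks are empty
    smooth = 1.
    # Flatten both tensors to 1-D
    y_true_f = tf.reshape(y_true, [-1])
    y_pred_f = tf.reshape(y_pred, [-1])
    # Soft intersection: sum of element-wise products
    intersection = tf.reduce_sum(y_true_f * y_pred_f)
    score = (2. * intersection + smooth) / (tf.reduce_sum(y_true_f) + tf.reduce_sum(y_pred_f) + smooth)
    return score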

def dice_loss(y_true, y_pred):
    loss = 1 - dice_coeff(y_true, y_pred)
    return loss
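
For reference, the quantity computed by `dice_coeff` and `dice_loss` above can be written as a formula. The symbols are notation assumed here, not taken from the notebook: $y_i$ and $\hat{y}_i$ are the flattened ground-truth and predicted values, and $s$ is the `smooth` constant.

$$
D(y, \hat{y}) = \frac{2 \sum_i y_i \hat{y}_i + s}{\sum_i y_i + \sum_i \hat{y}_i + s},
\qquad
L_{\text{dice}}(y, \hat{y}) = 1 - D(y, \hat{y})
$$

With $s = 0$ and binary masks this reduces to the usual Dice coefficient $2|A \cap B| / (|A| + |B|)$.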

Here, we'll use a specialized loss function that combines binary cross entropy and our dice loss. This choice is based on [participants in this competition reporting better empirical results](https://www.kaggle.com/c/carvana-image-masking-challenge/discussion/40199). Try out your own custom losses and compare their performance (e.g. bce + log(dice_loss), only bce, etc.)!

In [2]:
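from tensorflow.keras import losses

def bce_dice_loss(y_true, y_pred):
    # binary_crossentropy averages over the last axis; the scalar dice loss is broadcast onto it
    loss = losses.binary_crossentropy(y_true, y_pred) + dice_loss(y_true, y_pred)
    return loss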
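
As a rough illustration of how the two terms combine, the following pure-Python sketch evaluates both on a hypothetical four-pixel example. The clipping constant `eps`, the toy predictions `good` and `bad`, and the scalar (fully averaged) cross entropy are assumptions of this sketch, not the Keras implementation.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross entropy over flat sequences; eps clips away log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(y_true)


def dice_loss(y_true, y_pred, smooth=1.0):
    """One minus the smoothed Dice coefficient."""
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    return 1.0 - (2.0 * intersection + smooth) / (sum(y_true) + sum(y_pred) + smooth)


def bce_dice_loss(y_true, y_pred):
    """Sum of the per-pixel term (BCE) and the overlap term (dice loss)."""
    return binary_crossentropy(y_true, y_pred) + dice_loss(y_true, y_pred)


# Hypothetical toy data: two foreground pixels, two background pixels.
y_true = [1.0, 1.0, 0.0, 0.0]
good = [0.9, 0.8, 0.1, 0.2]  # mostly right
bad = [0.4, 0.3, 0.6, 0.7]   # mostly wrong

for name, pred in (("good", good), ("bad", bad)):
    print(name, binary_crossentropy(y_true, pred), dice_loss(y_true, pred),
          bce_dice_loss(y_true, pred))
```

Both terms shrink as the prediction improves, so their sum keeps the per-pixel signal of cross entropy while also rewarding overlap directly.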

#### 1.2 [Implementation with PyTorch](https://github.com/pytorch/pytorch/issues/1249#issuecomment-305088398)

In [3]:
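def dice_loss(input, target):
    # Smoothing term: keeps the ratio defined when both masks are empty
    smooth = 1.

    # Flatten to 1-D; reshape (unlike view) also accepts non-contiguous tensors
    iflat = input.reshape(-1)
    tflat = target.reshape(-1)
    # Soft intersection: sum of element-wise products
    intersection = (iflat * tflat).sum()

    return 1 - ((2. * intersection + smooth) /
              (iflat.sum() + tflat.sum() + smooth))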

**Q:**

Hi @IssamLaradji
I've a few questions about the code.

1. Is `smooth` similar to eps, which avoids division by zero?
2. Like the cross entropy loss, the result should be a positive value, so I'm wondering whether this is correct:
   `return 1 - ((2. * intersection + smooth) / (iflat.sum() + tflat.sum() + smooth))`

Thanks

**A:**

@tommy-qichang

1. `smooth` does more than that. You can set `smooth` to zero and add eps to the denominator to prevent division by zero. However, a larger `smooth` value (also known as Laplace smoothing or additive smoothing) can be used to avoid overfitting. The larger the `smooth` value, the closer the following term is to 1 (if everything else is fixed): `((2. * intersection + smooth) / (iflat.sum() + tflat.sum() + smooth))`. This decreases the penalty incurred when `2 * intersection` differs from `iflat.sum() + tflat.sum()`. A similar approach is commonly used in Naive Bayes; see equation (119) in these [notes](https://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html).

2. Yah that should be the case, good catch!
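
The effect of `smooth` described in point 1 can be checked numerically. This is a small pure-Python sketch with made-up counts (10 predicted pixels, 10 target pixels, 2 overlapping); the helper `dice_ratio` is a name introduced here for illustration only.

```python
def dice_ratio(intersection, pred_sum, target_sum, smooth):
    """The smoothed ratio from the answer above, on plain numbers."""
    return (2.0 * intersection + smooth) / (pred_sum + target_sum + smooth)


# Hypothetical poor prediction: 2 overlapping pixels out of 10 + 10.
intersection, pred_sum, target_sum = 2.0, 10.0, 10.0
smooths = (0.0, 1.0, 10.0, 100.0)
ratios = [dice_ratio(intersection, pred_sum, target_sum, s) for s in smooths]

for s, ratio in zip(smooths, ratios):
    print(f"smooth={s:>5}: ratio={ratio:.3f}, loss={1.0 - ratio:.3f}")
# smooth=  0.0: ratio=0.200, loss=0.800
# smooth=  1.0: ratio=0.238, loss=0.762
# smooth= 10.0: ratio=0.467, loss=0.533
# smooth=100.0: ratio=0.867, loss=0.133

# With both masks empty, smooth keeps the ratio defined (1.0, so loss 0.0).
empty_ratio = dice_ratio(0.0, 0.0, 0.0, smooth=1.0)
```

The same bad prediction is penalized less and less as `smooth` grows, which matches the regularizing role described in the answer. The loss `1 - ratio` also stays between 0 and 1, as raised in point 2. With `smooth=0.0` and two empty masks, this plain-Python version raises `ZeroDivisionError`, which is the case the eps remark in the answer addresses.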


### 2. [lovasz losses](LovaszSoftmax)