
In [1]:
import torch
import torch.nn as nn

# Mean Absolute Error (L1) Loss

In [3]:
loss = nn.L1Loss()

In [4]:
input = torch.randn(3, 5, requires_grad=True)
target = torch.randn(3, 5)
mae_loss = nn.L1Loss()
output = mae_loss(input, target)

In [5]:
input

tensor([[-1.0067,  0.7007, -0.5554,  1.9660, -0.2172],
        [ 0.2010,  0.5133, -1.0753,  0.1957, -0.2220],
        [ 1.2197,  0.3226,  2.3190,  0.4182,  0.2190]], requires_grad=True)

In [6]:
target

tensor([[-0.6690,  1.7592, -0.3545,  0.8823,  1.2551],
        [-1.7853, -0.3510,  1.8416, -2.1949,  2.0010],
        [ 1.0608, -1.4516, -2.9097,  0.4383, -0.6025]])

In [7]:
output

tensor(1.5025, grad_fn=<MeanBackward0>)

In [13]:
# numel refers to NUMber of ELements
torch.abs(input-target).sum()/input.numel()

tensor(1.5025, grad_fn=<DivBackward0>)
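The manual check above matches the default `reduction="mean"`. As a small sketch of how the three reduction modes relate (the names `pred`, `per_element`, `total` and `mean` are illustrative, not from the notebook):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the comparison is reproducible
pred = torch.randn(3, 5)
target = torch.randn(3, 5)

per_element = nn.L1Loss(reduction="none")(pred, target)  # |pred - target|, shape (3, 5)
total = nn.L1Loss(reduction="sum")(pred, target)         # sum of absolute errors
mean = nn.L1Loss(reduction="mean")(pred, target)         # the default

# "mean" is the sum of absolute errors divided by the number of elements
assert torch.allclose(total, per_element.sum())
assert torch.allclose(mean, total / pred.numel())
```

`reduction="none"` is handy when you want to weight or mask individual errors before averaging them yourself.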

# MSE Loss

In [104]:
y = torch.randn(10,1)
y_hat = torch.randn(10,1)

In [107]:
torch.nn.MSELoss()(y_hat,y)

tensor(2.5226)

In [111]:
(y_hat-y).pow(2).sum()/y.numel()

tensor(2.5226)
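Since the loss is the mean of squared differences, its gradient with respect to each prediction should be 2 * (y_hat - y) / N. A minimal sketch that checks this against autograd, assuming fresh random tensors rather than the ones from the cells above:

```python
import torch

torch.manual_seed(0)
y = torch.randn(10, 1)
y_hat = torch.randn(10, 1, requires_grad=True)

loss = torch.nn.MSELoss()(y_hat, y)
loss.backward()  # fills y_hat.grad with d(loss)/d(y_hat)

# Analytic gradient of mean((y_hat - y)^2) with respect to y_hat
expected_grad = 2 * (y_hat.detach() - y) / y.numel()
assert torch.allclose(y_hat.grad, expected_grad)
```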

# BCELoss and BCEWithLogitsLoss

Both of these losses are for binary classification, and both calculate binary cross-entropy. Both expect the model to produce a single value per sample: the probability (BCELoss) or the logit (BCEWithLogitsLoss) of class 1.

1) BCELoss - expects the output of the forward pass to be a probability, i.e. the last layer of the model must have a sigmoid activation.

2) BCEWithLogitsLoss - expects the output of the model to be logits (i.e. log-odds, log(p/(1-p))), so this loss first applies sigmoid and then calculates binary cross-entropy.

The two give the same result, as long as you pick the one that matches your forward pass: BCELoss if the output already has a sigmoid activation, BCEWithLogitsLoss if it does not.

In [27]:
labels = torch.randint(0,2,(10,))

## tensor.to(float) converts to float64, tensor.float() converts to float32
## Both losses expect float labels (float32 here, to match the model output)
## ==================================
labels = labels.float()
labels

tensor([1., 0., 0., 0., 0., 1., 1., 1., 1., 0.])
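The dtype difference between the two conversions is easy to check directly. A short sketch (the variable name `int_labels` is illustrative):

```python
import torch

int_labels = torch.randint(0, 2, (10,))

# torch.randint returns int64 by default
assert int_labels.dtype == torch.int64

# .float() gives float32; the Python type `float` maps to float64
assert int_labels.float().dtype == torch.float32
assert int_labels.to(float).dtype == torch.float64

# Passing the torch dtype explicitly avoids the ambiguity
assert int_labels.to(torch.float32).dtype == torch.float32
```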

In [22]:
# logits are log-odds: log(p / (1 - p))
logits = torch.randn(10,)
logits

tensor([-1.6967,  2.6871, -0.3870,  0.2841, -1.9320,  0.7431,  0.5321, -1.0695,
         0.0770,  0.3902])

In [24]:
probs = torch.sigmoid(logits)
probs

tensor([0.1549, 0.9363, 0.4044, 0.5706, 0.1265, 0.6777, 0.6300, 0.2555, 0.5192,
        0.5963])

In [31]:
torch.nn.BCELoss(reduction="mean")(probs, labels)

tensor(0.9895)

In [30]:
torch.nn.BCEWithLogitsLoss(reduction="mean")(logits, labels)

tensor(0.9895)
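Both values agree because they evaluate the same formula, -[y * log(p) + (1 - y) * log(1 - p)], averaged over samples. A minimal sketch that writes the formula out by hand, assuming fresh random inputs (so the loss value will differ from the one above):

```python
import torch

torch.manual_seed(0)
logits = torch.randn(10)
labels = torch.randint(0, 2, (10,)).float()
probs = torch.sigmoid(logits)

# Binary cross-entropy written out and averaged over the batch
manual = -(labels * probs.log() + (1 - labels) * (1 - probs).log()).mean()

from_probs = torch.nn.BCELoss()(probs, labels)
from_logits = torch.nn.BCEWithLogitsLoss()(logits, labels)

assert torch.allclose(manual, from_probs, atol=1e-6)
assert torch.allclose(manual, from_logits, atol=1e-6)
```

In practice BCEWithLogitsLoss is usually the safer choice, because it combines the sigmoid and the log in one numerically stable step instead of taking the log of a probability that may have rounded to 0 or 1.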

# NLLLoss and CrossEntropyLoss

Both of these losses expect the output of the model to have n values per sample, where 'n' is the number of classes, i.e. shape (batch_size, n).

The labels have to be one-dimensional. Each element is the index of the true class, starting from 0.

Note: both of these losses expect labels to be of integer dtype (int64, which torch.randint returns by default), as compared to BCELoss and BCEWithLogitsLoss, which expect float labels.

In [36]:
## 3 classes example

model_output = torch.randn(10,3)  ## one score (logit) for each of the 3 classes, per sample
model_output

tensor([[ 0.4278, -0.1769, -0.1021],
        [-3.7264, -0.8210,  1.2180],
        [ 0.9756,  0.5951, -0.0402],
        [-1.1516, -0.0206,  1.2694],
        [-0.3395, -0.4806, -0.3055],
        [-0.8345,  0.4869,  0.0430],
        [ 1.6530,  0.0384, -0.4284],
        [ 0.6374,  0.3935, -0.3398],
        [-0.9016, -0.9888,  0.9727],
        [-0.6929,  0.4684,  0.2244]])

In [73]:
# labels = torch.randn(10,3).softmax(-1)
# labels = torch.argsort(labels).float()
# labels

In [65]:
labels = torch.randint(0,3,(10,))
labels

tensor([1, 1, 1, 0, 0, 2, 1, 2, 1, 2])

In [62]:
probs = torch.softmax(model_output,-1)
probs

tensor([[0.4684, 0.2559, 0.2757],
        [0.0063, 0.1145, 0.8793],
        [0.4889, 0.3341, 0.1770],
        [0.0651, 0.2018, 0.7331],
        [0.3445, 0.2991, 0.3564],
        [0.1398, 0.5240, 0.3362],
        [0.7554, 0.1503, 0.0942],
        [0.4630, 0.3628, 0.1742],
        [0.1186, 0.1087, 0.7727],
        [0.1493, 0.4770, 0.3737]])

In [71]:
torch.nn.NLLLoss(reduction = "mean")(probs.log(),labels)

tensor(1.6360)

In [72]:
torch.nn.CrossEntropyLoss(reduction = "mean")(model_output,labels)

tensor(1.6360)
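The two results match because CrossEntropyLoss is log-softmax followed by NLLLoss, and NLLLoss simply picks the log-probability of the true class for each sample and averages the negatives. A sketch of that by hand, assuming fresh random inputs (`scores`, `log_probs` and `picked` are illustrative names):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
scores = torch.randn(10, 3)              # raw model outputs, one row per sample
labels = torch.randint(0, 3, (10,))      # int64 class indices

log_probs = F.log_softmax(scores, dim=-1)

# Pick the log-probability of the true class in each row
picked = log_probs[torch.arange(len(labels)), labels]
manual = -picked.mean()

assert torch.allclose(manual, F.nll_loss(log_probs, labels))
assert torch.allclose(manual, F.cross_entropy(scores, labels))
```

Using `log_softmax` rather than `softmax(...).log()` computes the same quantity but is more stable when some probabilities are very small.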

#### For 2 Classes

In [75]:
model_output = torch.randn(10,2)  ## one score (logit) for each of the 2 classes, per sample
model_output

tensor([[-0.3272, -0.9772],
        [-1.0323,  0.3656],
        [ 0.7022, -0.3994],
        [ 1.0495, -0.6664],
        [ 0.3569, -0.5211],
        [-0.7566,  0.7727],
        [-0.1840,  0.3146],
        [-1.0753,  0.9329],
        [-0.6972, -0.3679],
        [ 1.1080,  0.2262]])

In [88]:
labels = torch.randint(0,2,(10,))
labels

tensor([0, 1, 1, 0, 1, 0, 0, 1, 1, 0])

In [89]:
probs = torch.softmax(model_output,-1)
probs

tensor([[0.6570, 0.3430],
        [0.1981, 0.8019],
        [0.7506, 0.2494],
        [0.8476, 0.1524],
        [0.7064, 0.2936],
        [0.1781, 0.8219],
        [0.3779, 0.6221],
        [0.1183, 0.8817],
        [0.4184, 0.5816],
        [0.7072, 0.2928]])

In [102]:
import numpy as np
# exp of the ratio p1/p0 for the first row; the log-odds of class 1 would be np.log(0.3430/0.6570)
np.exp(0.3430/0.6570)

1.685513078736403

In [90]:
probs1 = probs[:,1]
probs1

tensor([0.3430, 0.8019, 0.2494, 0.1524, 0.2936, 0.8219, 0.6221, 0.8817, 0.5816,
        0.2928])

In [91]:
torch.nn.NLLLoss(reduction = "mean")(probs.log(),labels)

tensor(0.7133)

In [92]:
torch.nn.CrossEntropyLoss(reduction = "mean")(model_output,labels)

tensor(0.7133)

In [93]:
torch.nn.BCELoss(reduction = "mean")(probs1,labels.float())

tensor(0.7133)

In [103]:
# for 2 classes, the logit of class 1 is the difference between the two scores
torch.nn.BCEWithLogitsLoss(reduction = "mean")(model_output[:,1]-model_output[:,0],labels.float())

tensor(0.7133)
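All four losses agree because a two-class softmax reduces to a sigmoid of the score difference: p1 = exp(z1) / (exp(z0) + exp(z1)) = 1 / (1 + exp(-(z1 - z0))) = sigmoid(z1 - z0). A minimal sketch checking that identity, assuming fresh random scores rather than the ones printed above:

```python
import torch

torch.manual_seed(0)
scores = torch.randn(10, 2)  # columns are the scores z0 and z1

p1_softmax = torch.softmax(scores, dim=-1)[:, 1]
p1_sigmoid = torch.sigmoid(scores[:, 1] - scores[:, 0])

# Probability of class 1 is the same either way
assert torch.allclose(p1_softmax, p1_sigmoid, atol=1e-6)
```

This is why a binary classifier can use either one output with BCEWithLogitsLoss or two outputs with CrossEntropyLoss.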