# Analysis of the loss function with bootstrapping
https://arxiv.org/pdf/1412.6596

The setting is a multiclass classification task with noisy labels. The paper proposes two modifications of the cross-entropy loss:

a) Soft version:

$$
L_{soft}(\mathbf{q}, \mathbf{t}) = - \sum_{k=1}^{L} (\beta t_{k} + (1 - \beta) q_{k}) \log(q_{k}),
$$
where $\mathbf{q}$ is the vector of predicted class probabilities for a single image, $\mathbf{t}$ is the (possibly noisy) one-hot ground-truth label, and $L$ is the number of classes. The parameter $\beta$ is chosen between $0$ and $1$.
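
Since the weight in front of $\log(q_{k})$ is a sum of two parts, the loss can be split into two familiar quantities, which is a convenient way to compute it:

$$
L_{soft}(\mathbf{q}, \mathbf{t}) = \beta \left( - \sum_{k=1}^{L} t_{k} \log(q_{k}) \right) + (1 - \beta) \left( - \sum_{k=1}^{L} q_{k} \log(q_{k}) \right)
$$

The first bracket is the usual cross entropy between $\mathbf{t}$ and $\mathbf{q}$, and the second is the entropy of $\mathbf{q}$. For $\beta = 1$ the loss reduces to plain cross entropy.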

b) Hard version:

$$
L_{hard}(\mathbf{q}, \mathbf{t}) = - \sum_{k=1}^{L} (\beta t_{k} + (1 - \beta) z_{k}) \log(q_{k}),
$$
where $\mathbf{z}$ is the one-hot encoding of the predicted class: $z_{k} = 1$ if $k = \arg\max_{i} q_{i}$ and $z_{k} = 0$ otherwise (the same form as $\mathbf{t}$).
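
Splitting the weight $\beta t_{k} + (1 - \beta) z_{k}$ into its two parts gives the cross entropy plus a term that depends only on the predicted class. Because $\mathbf{z}$ is one-hot, its sum keeps a single term. Writing $k^{*} = \arg\max_{k} q_{k}$ (a symbol introduced here only for convenience):

$$
L_{hard}(\mathbf{q}, \mathbf{t}) = \beta \left( - \sum_{k=1}^{L} t_{k} \log(q_{k}) \right) - (1 - \beta) \log(q_{k^{*}})
$$

When the predicted class coincides with the one-hot label, both parts equal $-\log(q_{k^{*}})$, so the loss equals the plain cross entropy for any $\beta$.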

In [67]:
import torch
from torch.nn import functional as F

L = 3

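# Ground-truth labels (class indices) t_* and sample network outputs q_*.
# F.cross_entropy applies log-softmax internally, so the q_* tensors are
# treated as unnormalized scores (logits) here, not as probabilities.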
t_1 = torch.LongTensor([0])
t_2 = torch.LongTensor([1])
q_1a = torch.FloatTensor([[0.9, 0.05, 0.05], ])
q_1b = torch.FloatTensor([[0.2, 0.5, 0.3], ])
q_2a = torch.FloatTensor([[0.33, 0.33, 0.33], ])
q_2b = torch.FloatTensor([[0.15, 0.7, 0.15], ])

#### Soft version

Let's first compute the cross-entropy term: $-\sum_{k} t_{k} \log(q_{k})$

In [68]:
F.cross_entropy(q_1a, t_1), F.cross_entropy(q_1b, t_1), F.cross_entropy(q_2a, t_2), F.cross_entropy(q_2b, t_2)

(tensor(0.6178), tensor(1.2398), tensor(1.0986), tensor(0.7673))

Now let's compute the second term (soft bootstrapping), which is the entropy of $\mathbf{q}$: $-\sum_{k} q_{k} \log(q_{k})$

In [69]:
def soft_boostrapping(q):
    # Entropy of softmax(q), one value per sample (q holds logits, shape [N, L])
    return - torch.sum(F.softmax(q, dim=1) * F.log_softmax(q, dim=1), dim=1)

In [70]:
soft_boostrapping(q_1a), soft_boostrapping(q_1b), soft_boostrapping(q_2a), soft_boostrapping(q_2b)

(tensor([ 1.0095]), tensor([ 1.0906]), tensor([ 1.0986]), tensor([ 1.0619]))

In [71]:
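def soft_bootstrapping_loss(q, t, beta):
    # F.cross_entropy averages over the batch, while soft_boostrapping is per sample;
    # the sum relies on broadcasting, which is exact for the one-sample batches used here
    return F.cross_entropy(q, t) * beta + (1.0 - beta) * soft_boostrapping(q)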

In [72]:
soft_bootstrapping_loss(q_1a, t_1, beta=0.95), soft_bootstrapping_loss(q_1b, t_1, beta=0.95)

(tensor([ 0.6374]), tensor([ 1.2324]))

In [73]:
soft_bootstrapping_loss(q_2a, t_2, beta=0.95), soft_bootstrapping_loss(q_2b, t_2, beta=0.95)

(tensor([ 1.0986]), tensor([ 0.7820]))
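
As a sanity check, the soft loss can also be evaluated straight from its definition, without PyTorch. The sketch below is not part of the original notebook: the helper names `softmax` and `soft_loss_direct` are made up for illustration, and it assumes, as the cells above do, that the inputs are unnormalized scores that pass through a softmax first.

```python
import math

def softmax(scores):
    """Turn a list of unnormalized scores into probabilities."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_loss_direct(scores, target, beta):
    """Soft bootstrapping loss for one sample, summed term by term.

    scores: unnormalized class scores, target: class index, beta in [0, 1].
    Hypothetical helper, written only to mirror the formula.
    """
    q = softmax(scores)
    loss = 0.0
    for k, q_k in enumerate(q):
        t_k = 1.0 if k == target else 0.0          # one-hot ground truth
        weight = beta * t_k + (1.0 - beta) * q_k   # bootstrapped target
        loss -= weight * math.log(q_k)
    return loss

# Same inputs as q_1a / t_1 and q_2b / t_2 above
loss_1a = soft_loss_direct([0.9, 0.05, 0.05], target=0, beta=0.95)
loss_2b = soft_loss_direct([0.15, 0.7, 0.15], target=1, beta=0.95)
print(round(loss_1a, 4), round(loss_2b, 4))
```

Up to rounding, the two printed values should agree with the 0.6374 and 0.7820 obtained from `soft_bootstrapping_loss` above.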

#### Hard version

Let's first compute the cross-entropy term again: $-\sum_{k} t_{k} \log(q_{k})$

In [74]:
F.cross_entropy(q_1a, t_1), F.cross_entropy(q_1b, t_1), F.cross_entropy(q_2a, t_2), F.cross_entropy(q_2b, t_2)

(tensor(0.6178), tensor(1.2398), tensor(1.0986), tensor(0.7673))

Now let's compute the second term (hard bootstrapping): $-\sum_{k} z_{k} \log(q_{k})$

In [75]:
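def hard_boostrapping(q):
    # z: index of the predicted (argmax) class for each sample
    _, z = torch.max(F.softmax(q, dim=1), dim=1)
    z = z.view(-1, 1)
    # Negative log-probability of the predicted class, one value per sample
    return - F.log_softmax(q, dim=1).gather(1, z).view(-1)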

In [76]:
hard_boostrapping(q_1a), hard_boostrapping(q_1b), hard_boostrapping(q_2a), hard_boostrapping(q_2b)

(tensor([ 0.6178]), tensor([ 0.9398]), tensor([ 1.0986]), tensor([ 0.7673]))

In [77]:
def hard_bootstrapping_loss(q, t, beta):
    return F.cross_entropy(q, t) * beta + (1.0 - beta) * hard_boostrapping(q)

In [78]:
hard_bootstrapping_loss(q_1a, t_1, beta=0.8), hard_bootstrapping_loss(q_1b, t_1, beta=0.8)

(tensor([ 0.6178]), tensor([ 1.1798]))

In [79]:
hard_bootstrapping_loss(q_2a, t_2, beta=0.8), hard_bootstrapping_loss(q_2b, t_2, beta=0.8)

(tensor([ 1.0986]), tensor([ 0.7673]))
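
The hard loss can be cross-checked the same way, directly from its definition and without PyTorch. This is a minimal sketch rather than part of the original notebook: `hard_loss_direct` is a hypothetical helper, and it assumes the inputs are unnormalized scores, as in the cells above.

```python
import math

def hard_loss_direct(scores, target, beta):
    """Hard bootstrapping loss for one sample (hypothetical helper)."""
    # Log-softmax of the scores via the log-sum-exp trick
    m = max(scores)
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
    log_q = [s - log_norm for s in scores]

    # Predicted class; takes the first index on ties
    # (torch.max may break ties differently)
    predicted = max(range(len(scores)), key=lambda k: scores[k])

    loss = 0.0
    for k, log_q_k in enumerate(log_q):
        t_k = 1.0 if k == target else 0.0      # one-hot ground truth
        z_k = 1.0 if k == predicted else 0.0   # one-hot prediction
        loss -= (beta * t_k + (1.0 - beta) * z_k) * log_q_k
    return loss

# q_1a agrees with its label t_1, q_1b does not
loss_agree = hard_loss_direct([0.9, 0.05, 0.05], target=0, beta=0.8)
loss_disagree = hard_loss_direct([0.2, 0.5, 0.3], target=0, beta=0.8)
print(round(loss_agree, 4), round(loss_disagree, 4))
```

The first value should match the plain cross entropy (0.6178 above), since prediction and label coincide and $\beta$ then has no effect. The second should match the 1.1798 computed by `hard_bootstrapping_loss`, which is lower than the cross entropy of 1.2398 because part of the target mass is moved to the predicted class.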