# Class imbalance and weighted classification

Consider a classification problem with 10 patients, of whom 2 are classified as stroke and 8 as non-stroke. The target is skewed toward the majority class, so a learning algorithm trained on it will tend to favor predicting the majority class. **A weighted classification cost alters the behavior of the classifier so that it weights points in the smaller class more and points in the larger class less.**

In [184]:
import torch
import torch.nn as nn
import math
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print(device)

cpu


# Weighted classification definition 


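Class weights:

**w0** = $\frac{10}{8\times 2}=0.625$, where 10 is the total number of patients, 8 is the number of non-stroke patients, and 2 is the number of classes (0 and 1)

**w1** = $\frac{10}{2\times 2}=2.5$, where the first 2 is the number of stroke patients and the second 2 is the number of classes (0 and 1)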
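Both weights follow the same pattern, $w_j = \frac{N}{K \times n_j}$, with $N$ samples, $K$ classes and $n_j$ samples in class $j$ (the same formula scikit-learn uses for `class_weight='balanced'`). A minimal sketch in plain Python, assuming the counts from this example; the helper name `balanced_class_weights` is illustrative and not part of any library:

```python
def balanced_class_weights(counts):
    """Return N / (K * n_j) for each class count n_j."""
    total = sum(counts)        # N: total number of samples
    n_classes = len(counts)    # K: number of classes
    return [total / (n_classes * n) for n in counts]

# Class 0 = non-stroke (8 patients), class 1 = stroke (2 patients)
weights = balanced_class_weights([8, 2])
print(weights)  # [0.625, 2.5]
```

With these weights each class carries the same total weight: $8 \times 0.625 = 2 \times 2.5 = 5$.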


The loss function without weights is given by: 

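$\text{Log loss} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(\hat{y}_i) + (1 - y_i) \log(1-\hat{y}_i) \right]$

and with the weights by:

$\text{Log loss} = -\frac{1}{N} \sum_{i=1}^{N} \left[ w_1 \times y_i \log(\hat{y}_i) + w_0 \times (1 - y_i) \log(1-\hat{y}_i) \right]$

where $y_i$ is the true label of sample $i$ and $\hat{y}_i$ is the predicted probability of class 1 (stroke).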
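To see the effect of the weights on a concrete case, the two formulas can be evaluated by hand. The sketch below uses only the standard library; the predictions are hypothetical (a model that outputs a stroke probability of 0.2 for every patient) and the function name `log_loss` is chosen for illustration:

```python
import math

def log_loss(y_true, y_prob, w0=1.0, w1=1.0):
    """Mean binary log loss; w0 = w1 = 1 gives the unweighted version."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # w1 scales the positive-class term, w0 the negative-class term
        total += w1 * y * math.log(p) + w0 * (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# Hypothetical data: 8 non-stroke (0) and 2 stroke (1) patients
y_true = [0] * 8 + [1] * 2
y_prob = [0.2] * 10  # the model always predicts "probably not stroke"

unweighted = log_loss(y_true, y_prob)
weighted = log_loss(y_true, y_prob, w0=0.625, w1=2.5)
print(round(unweighted, 4), round(weighted, 4))  # 0.5004 0.9163
```

Under these assumptions the weighted loss is noticeably larger, because the two misclassified stroke patients now count as much as the eight non-stroke patients combined.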



In [185]:
weights = [10/(2*8), 10/(2*2)]  # [w0, w1]

In [186]:
class_weights = torch.FloatTensor(weights).to(device)
class_weights

tensor([0.6250, 2.5000])

# PyTorch built-in function issue with weighted classification

In [187]:
Loss = nn.CrossEntropyLoss(weight=class_weights)
Loss2 = nn.BCELoss(weight=class_weights)

In [188]:
inputt = torch.tensor([0.6,0.4]).to(device)
y_true = torch.tensor([1.,0.]).to(device)

In [189]:
print(f'loss given by Cross entropy {Loss(inputt,y_true)}, by BCELoss {Loss2(inputt,y_true)}')

loss given by Cross entropy 0.37383678555488586, by BCELoss 0.7981650233268738


In [190]:
Y_true =0

In [191]:
# Positive-class term of the weighted log loss (zero here because Y_true = 0)
-class_weights[Y_true]*Y_true*torch.log(inputt[Y_true])

tensor(0.)

In [192]:
# Negative-class term, averaged over both entries of inputt and scaled by w0
-class_weights[Y_true]*(1.-Y_true)*torch.log(1.-inputt).mean()

tensor(0.4460)

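The built-in results (0.3738 and 0.7982) do not match the manual computation above (0.4460). `nn.CrossEntropyLoss` expects unnormalized logits rather than probabilities, and the `weight` argument of `nn.BCELoss` rescales each element of the batch rather than each class, so neither call implements the weighted log loss defined above. We therefore write our own loss function.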
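To make the mismatch concrete, the two built-in values can be reproduced by hand. This is a minimal sketch using only the `math` module, assuming that `nn.CrossEntropyLoss` applies log-softmax to its input and reads the float target `[1., 0.]` as class probabilities, and that `nn.BCELoss` multiplies the loss of element `i` by `weight[i]`:

```python
import math

x = [0.6, 0.4]    # values passed as `inputt`
y = [1.0, 0.0]    # values passed as `y_true`
w = [0.625, 2.5]  # values in `class_weights`

# CrossEntropyLoss: log-softmax over x, then -sum_c w[c] * y[c] * log_softmax[c]
log_norm = math.log(sum(math.exp(v) for v in x))
log_softmax = [v - log_norm for v in x]
ce = -sum(wc * yc * ls for wc, yc, ls in zip(w, y, log_softmax))

# BCELoss: x is read as probabilities; w[i] scales element i, not class i
bce_terms = [-(yi * math.log(xi) + (1 - yi) * math.log(1 - xi))
             for xi, yi in zip(x, y)]
bce = sum(wi * t for wi, t in zip(w, bce_terms)) / len(x)

print(round(ce, 4), round(bce, 4))  # 0.3738 0.7982
```

Both numbers agree with the notebook output, which suggests the built-in functions are behaving as documented but answering a different question from the one posed by the weighted log loss.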

# Custom weighted loss function

In [193]:
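def weighted_binary_cross_entropy(output, target, weights=None):
    """Binary cross-entropy with optional class weights.

    output:  predicted probabilities of class 1, strictly between 0 and 1
    target:  true labels (0 or 1)
    weights: optional pair [w0, w1] of weights for class 0 and class 1
    """
    if weights is not None:
        assert len(weights) == 2
        # w1 scales the positive-class term, w0 the negative-class term
        loss = weights[1] * (target * torch.log(output)) + \
               weights[0] * ((1 - target) * torch.log(1 - output))
    else:
        loss = target * torch.log(output) + (1 - target) * torch.log(1 - output)

    # Average over all elements and flip the sign
    return torch.neg(torch.mean(loss))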

In [194]:
weights = [10/(2*8), 10/(2*2)]  # [w0, w1]
class_weights = torch.FloatTensor(weights).to(device)
inputt = torch.tensor([0.6, 0.4]).to(device)  # predicted probabilities
y_true = torch.tensor(1).to(device)           # true label: stroke

In [195]:
weighted_binary_cross_entropy(inputt[1], y_true, weights=class_weights)

tensor(2.2907)

In [196]:
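weights = [10/(2*8), 10/(2*2)]  # [w0, w1]
class_weights = torch.FloatTensor(weights).to(device)
# Create the tensor directly on the device so it stays a leaf that tracks gradients
inputt = torch.tensor([0.6, 0.4], requires_grad=True, device=device)
y_true = torch.tensor(1).to(device)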

In [197]:
weighted_binary_cross_entropy(inputt[1], y_true, weights=class_weights)

tensor(2.2907, grad_fn=<NegBackward0>)
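Because the result carries a `grad_fn`, the custom loss can be backpropagated like any other PyTorch loss. The sketch below is an illustration under stated assumptions (the two predictions and labels are hypothetical): it evaluates the same weighted formula on a batch of two samples, calls `backward()`, and checks the value against `torch.nn.functional.binary_cross_entropy` when that function is given one weight per sample instead of one weight per class.

```python
import torch
import torch.nn.functional as F

w0, w1 = 0.625, 2.5  # class weights for non-stroke and stroke

# Hypothetical batch: predicted P(stroke) and true labels
output = torch.tensor([0.4, 0.3], requires_grad=True)
target = torch.tensor([1.0, 0.0])

# Same formula as weighted_binary_cross_entropy, written inline
custom = -(w1 * target * torch.log(output)
           + w0 * (1 - target) * torch.log(1 - output)).mean()

# Built-in route: pick each sample's weight from its own label
sample_weights = w1 * target + w0 * (1 - target)
builtin = F.binary_cross_entropy(output, target, weight=sample_weights)

custom.backward()  # fills output.grad with d(loss)/d(output)
print(custom.item(), builtin.item())
print(output.grad)
```

The two values should agree, which suggests the built-in loss can be used for class weighting as long as the weight tensor is built per sample from the labels.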