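Loss function: measures the difference between the model output and the ground-truth label.  
Three related terms are commonly distinguished: the loss function (computed on a single sample), the cost function (the average loss over the dataset), and the objective function (the cost plus any regularization term).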

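nn.CrossEntropyLoss  
Purpose: combines nn.LogSoftmax() and nn.NLLLoss() to compute the cross-entropy loss from raw scores (logits)  
Main parameters:  
* weight: per-class weights applied to the loss  
* ignore_index: a target class index to ignore; samples with this target contribute no loss  
* reduction: how the per-sample losses are combined; one of none/sum/mean  
none: return the loss of each sample separately  
sum: add up all losses and return a scalar  
mean: weighted average, returns a scalar (the sum is divided by the total weight of the non-ignored targets)  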
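
To make `ignore_index` concrete, here is a minimal sketch that reuses the logits and targets from the cells that follow. The variable names `logits`, `per_sample`, and `mean_loss` are chosen only for this illustration.

```python
import torch
import torch.nn as nn

# Same scores and labels as the notebook cells that follow.
logits = torch.tensor([[1, 2], [1, 3], [1, 3]], dtype=torch.float)
target = torch.tensor([0, 1, 1], dtype=torch.long)

# ignore_index=0: samples whose target is class 0 contribute no loss.
per_sample = nn.CrossEntropyLoss(ignore_index=0, reduction='none')(logits, target)
mean_loss = nn.CrossEntropyLoss(ignore_index=0, reduction='mean')(logits, target)

print(per_sample)  # the first entry is 0 because its target is the ignored class
print(mean_loss)   # averaged over the two non-ignored samples only
```

Note that with `reduction='mean'` the ignored sample is also left out of the denominator, so the mean is taken over two samples rather than three.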

Cross-entropy = information entropy + relative entropy (KL divergence): H(P, Q) = H(P) + D_KL(P || Q)  
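
The identity can be checked numerically. The sketch below uses two small distributions, `p` and `q`, that are made up purely for illustration, with natural logarithms throughout.

```python
import math

# Hypothetical distributions, chosen only for illustration.
p = [0.7, 0.2, 0.1]  # "true" distribution P
q = [0.5, 0.3, 0.2]  # model distribution Q

entropy = -sum(pi * math.log(pi) for pi in p)                    # H(P)
kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))         # D_KL(P || Q)
cross_entropy = -sum(pi * math.log(qi) for pi, qi in zip(p, q))  # H(P, Q)

print(cross_entropy, entropy + kl)  # the two values agree up to rounding
```

When the labels are one-hot, H(P) is 0, so minimizing the cross-entropy is the same as minimizing the KL divergence between the labels and the model's predictions.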

In [14]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np

In [15]:
inputs = torch.tensor([[1, 2], [1, 3], [1, 3]], dtype=torch.float)
target = torch.tensor([0, 1, 1], dtype=torch.long)

In [8]:
loss_f_none = nn.CrossEntropyLoss(weight=None, reduction='none')
loss_f_sum = nn.CrossEntropyLoss(weight=None, reduction='sum')
loss_f_mean = nn.CrossEntropyLoss(weight=None, reduction='mean')

loss_none = loss_f_none(inputs, target)
loss_sum = loss_f_sum(inputs, target)
loss_mean = loss_f_mean(inputs, target)
loss_none, loss_sum, loss_mean

(tensor([1.3133, 0.1269, 0.1269]), tensor(1.5671), tensor(0.5224))

In [11]:
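# Manual check of the first sample: loss = -x[class] + log(sum_j exp(x_j))
idx = 0
input_1 = inputs.detach().numpy()[idx]            # scores of sample idx
target_1 = target.numpy()[idx]                    # its class label
x_class = input_1[target_1]                       # score of the true class
sigma_exp_x = np.sum(list(map(np.exp, input_1)))  # sum_j exp(x_j)
log_sigma_exp_x = np.log(sigma_exp_x)
loss_1 = -x_class + log_sigma_exp_x
loss_1, x_class, input_1, target_1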

(1.3132617, 1.0, array([1., 2.], dtype=float32), 0)

weight

In [12]:
weights = torch.tensor([1, 2], dtype=torch.float)
loss_f_none_w = nn.CrossEntropyLoss(weight=weights, reduction='none')
loss_f_sum = nn.CrossEntropyLoss(weight=weights, reduction='sum')
loss_f_mean = nn.CrossEntropyLoss(weight=weights, reduction='mean')
loss_none_w = loss_f_none_w(inputs, target)
loss_sum = loss_f_sum(inputs, target)
loss_mean = loss_f_mean(inputs, target)
weights, loss_none_w, loss_sum, loss_mean

(tensor([1., 2.]),
 tensor([1.3133, 0.2539, 0.2539]),
 tensor(1.8210),
 tensor(0.3642))

In [17]:
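# Manual check of reduction='mean' with class weights
weights = torch.tensor([1, 2], dtype=torch.float)
# Denominator: sum of the weights of the target classes (1 + 2 + 2 = 5)
weights_all = np.sum(list(map(lambda x: weights.numpy()[x], target.numpy())))
mean = 0
loss_sep = loss_none.detach().numpy()  # unweighted per-sample losses from the earlier cell
for i in range(target.shape[0]):
    x_class = target.numpy()[i]
    # weight each sample's loss by its class weight, normalized by the total weight
    tmp = loss_sep[i] * (weights.numpy()[x_class] / weights_all)
    mean += tmp
mean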

0.3641947731375694

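nn.NLLLoss  
Purpose: implements the negation step of the negative log-likelihood; it expects log-probabilities as input (for example the output of nn.LogSoftmax) and returns the negated value at the target index  
Main parameters:  
* weight: per-class weights applied to the loss  
* ignore_index: a target class index to ignore; samples with this target contribute no loss  
* reduction: how the per-sample losses are combined; one of none/sum/mean  
none: return the loss of each sample separately  
sum: add up all losses and return a scalar  
mean: weighted average, returns a scalar  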

In [18]:
weights = torch.tensor([1, 1], dtype=torch.float)
loss_f_none_w = nn.NLLLoss(weight=weights, reduction='none')
loss_f_sum = nn.NLLLoss(weight=weights, reduction='sum')
loss_f_mean = nn.NLLLoss(weight=weights, reduction='mean')
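# inputs are raw scores here (not log-probabilities), so each loss is just the negated score of the target class
loss_none_w = loss_f_none_w(inputs, target)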
loss_sum = loss_f_sum(inputs, target)
loss_mean = loss_f_mean(inputs, target)
weights, loss_none_w, loss_sum, loss_mean

(tensor([1., 1.]), tensor([-1., -3., -3.]), tensor(-7.), tensor(-2.3333))
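
The cell above feeds raw scores into nn.NLLLoss, which is why the results are simply the negated scores (-1, -3, -3). A minimal sketch of the intended usage, assuming the same scores and labels, applies nn.LogSoftmax first and compares the result with nn.CrossEntropyLoss. The names `logits`, `log_probs`, `nll`, and `ce` are illustrative.

```python
import torch
import torch.nn as nn

logits = torch.tensor([[1, 2], [1, 3], [1, 3]], dtype=torch.float)
target = torch.tensor([0, 1, 1], dtype=torch.long)

# Step 1: turn raw scores into log-probabilities along the class dimension.
log_probs = nn.LogSoftmax(dim=1)(logits)

# Step 2: NLLLoss picks the target entry of each row and negates it.
nll = nn.NLLLoss(reduction='none')(log_probs, target)

# One-step equivalent on the raw scores.
ce = nn.CrossEntropyLoss(reduction='none')(logits, target)

print(nll)
print(ce)
```

Both prints should show the same per-sample losses as the nn.CrossEntropyLoss cell with `reduction='none'` earlier in the notebook.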

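nn.BCELoss  
Purpose: binary cross-entropy  
Note: input values must lie in [0, 1], i.e. they are probabilities (for example the output of a sigmoid)  
Main parameters:  
* weight: rescaling weight applied to the loss of each element; it must be broadcastable to the input shape  
* reduction: how the element-wise losses are combined; one of none/sum/mean  
none: return the loss of each element separately  
sum: add up all losses and return a scalar  
mean: average over all elements, returns a scalar  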
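
Unlike nn.CrossEntropyLoss, the `weight` of nn.BCELoss scales individual elements, and `reduction='mean'` divides by the number of elements rather than by the sum of the weights. The sketch below illustrates this with a small made-up batch; `probs`, `unweighted`, and `manual_mean` are names introduced only for this example.

```python
import torch
import torch.nn as nn

# Probabilities in [0, 1], obtained by applying a sigmoid to raw scores.
probs = torch.sigmoid(torch.tensor([[1, 2], [2, 2]], dtype=torch.float))
target = torch.tensor([[1, 0], [1, 0]], dtype=torch.float)
weights = torch.tensor([1, 2], dtype=torch.float)  # broadcast across the rows

unweighted = nn.BCELoss(reduction='none')(probs, target)
weighted_mean = nn.BCELoss(weight=weights, reduction='mean')(probs, target)

# Manual version: scale each element, then take a plain mean over all elements.
manual_mean = (unweighted * weights).mean()

print(weighted_mean, manual_mean)
```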

In [19]:
inputs = torch.tensor([[1, 2], [2, 2], [3, 4], [4, 5]], dtype=torch.float)
target = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
target_bce = target
inputs = torch.sigmoid(inputs)
weights = torch.tensor([1, 1], dtype=torch.float)
loss_f_none_w = nn.BCELoss(weight=weights, reduction='none')
loss_f_sum = nn.BCELoss(weight=weights, reduction='sum')
loss_f_mean = nn.BCELoss(weight=weights, reduction='mean')
loss_none_w = loss_f_none_w(inputs, target_bce)
loss_sum = loss_f_sum(inputs, target_bce)
loss_mean = loss_f_mean(inputs, target_bce)
weights, loss_none_w, loss_sum, loss_mean

(tensor([1., 1.]),
 tensor([[0.3133, 2.1269],
         [0.1269, 2.1269],
         [3.0486, 0.0181],
         [4.0181, 0.0067]]),
 tensor(11.7856),
 tensor(1.4732))

In [21]:
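# Manual check of element [0, 0]: l = -[y * log(x) + (1 - y) * log(1 - x)]
idx = 0
x_i = inputs.detach().numpy()[idx, idx]  # predicted probability at row idx, column idx
y_i = target.numpy()[idx, idx]           # its 0/1 label
l_i = -y_i * np.log(x_i) if y_i else -(1-y_i) * np.log(1-x_i)
inputs, l_i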

(tensor([[0.7311, 0.8808],
         [0.8808, 0.8808],
         [0.9526, 0.9820],
         [0.9820, 0.9933]]),
 0.31326166)

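nn.BCEWithLogitsLoss  
Purpose: combines a sigmoid with binary cross-entropy  
Note: do not add a sigmoid at the end of the network; the loss applies it internally  
Main parameters:  
* pos_weight: weight applied to the positive examples (targets equal to 1)  
* weight: rescaling weight applied to the loss of each element; it must be broadcastable to the input shape  
* reduction: how the element-wise losses are combined; one of none/sum/mean  
none: return the loss of each element separately  
sum: add up all losses and return a scalar  
mean: average over all elements, returns a scalar  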

In [23]:
inputs = torch.tensor([[1, 2], [2, 2], [3, 4], [4, 5]], dtype=torch.float)
target = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
target_bce = target
weights = torch.tensor([1, 1], dtype=torch.float)
loss_f_none_w = nn.BCEWithLogitsLoss(weight=weights, reduction='none')
loss_f_sum = nn.BCEWithLogitsLoss(weight=weights, reduction='sum')
loss_f_mean = nn.BCEWithLogitsLoss(weight=weights, reduction='mean')
loss_none_w = loss_f_none_w(inputs, target_bce)
loss_sum = loss_f_sum(inputs, target_bce)
loss_mean = loss_f_mean(inputs, target_bce)
weights, loss_none_w, loss_sum, loss_mean


(tensor([1., 1.]),
 tensor([[0.3133, 2.1269],
         [0.1269, 2.1269],
         [3.0486, 0.0181],
         [4.0181, 0.0067]]),
 tensor(11.7856),
 tensor(1.4732))
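
These values match the nn.BCELoss results obtained after an explicit sigmoid. The sketch below checks that equivalence and then shows why the fused version is preferable for extreme scores. The logit of -200 is an arbitrary illustrative value, and the behavior described in the comments assumes float32 tensors on CPU.

```python
import torch
import torch.nn as nn

logits = torch.tensor([[1, 2], [2, 2], [3, 4], [4, 5]], dtype=torch.float)
target = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)

fused = nn.BCEWithLogitsLoss(reduction='none')(logits, target)
two_step = nn.BCELoss(reduction='none')(torch.sigmoid(logits), target)
print(torch.allclose(fused, two_step, atol=1e-5))  # equal up to float32 rounding

# Extreme score: sigmoid(-200) underflows to 0 in float32, so the two-step
# route loses the information about how wrong the prediction is.
extreme_logit = torch.tensor([-200.0])
positive = torch.tensor([1.0])
fused_extreme = nn.BCEWithLogitsLoss()(extreme_logit, positive)
two_step_extreme = nn.BCELoss()(torch.sigmoid(extreme_logit), positive)
print(fused_extreme, two_step_extreme)
```

The fused loss stays close to the exact value of 200, while the two-step route is capped because nn.BCELoss clamps its log outputs to avoid infinities.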

In [25]:
inputs = torch.tensor([[1, 2], [2, 2], [3, 4], [4, 5]], dtype=torch.float)
target = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
target_bce = target
weights = torch.tensor([1, 1], dtype=torch.float)
pos_weight = torch.tensor([3], dtype=torch.float)
loss_f_none_w = nn.BCEWithLogitsLoss(weight=weights, reduction='none', pos_weight=pos_weight)
loss_f_sum = nn.BCEWithLogitsLoss(weight=weights, reduction='sum', pos_weight=pos_weight)
loss_f_mean = nn.BCEWithLogitsLoss(weight=weights, reduction='mean', pos_weight=pos_weight)
loss_none_w = loss_f_none_w(inputs, target_bce)
loss_sum = loss_f_sum(inputs, target_bce)
loss_mean = loss_f_mean(inputs, target_bce)
weights, loss_none_w, loss_sum, loss_mean

(tensor([1., 1.]),
 tensor([[0.9398, 2.1269],
         [0.3808, 2.1269],
         [3.0486, 0.0544],
         [4.0181, 0.0201]]),
 tensor(12.7158),
 tensor(1.5895))
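
Only the entries whose target is 1 are multiplied by 3 compared with the previous output. As a rough manual check, the sketch below assumes the element-wise form l = -[pos_weight * y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))] and uses the identity log(1 - sigmoid(x)) = logsigmoid(-x). The names `builtin` and `manual` are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.tensor([[1, 2], [2, 2], [3, 4], [4, 5]], dtype=torch.float)
target = torch.tensor([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=torch.float)
pos_weight = torch.tensor([3], dtype=torch.float)

builtin = nn.BCEWithLogitsLoss(pos_weight=pos_weight, reduction='none')(logits, target)

# Manual version: the positive term is scaled by pos_weight, the negative term is not.
manual = -(pos_weight * target * F.logsigmoid(logits)
           + (1 - target) * F.logsigmoid(-logits))

print(builtin)
print(manual)
```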