## How Cross Entropy Loss Is Computed

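**`torch.nn.CrossEntropyLoss` is computed as follows, for one sample with score vector $x$ and target class index $class$:**
$$ loss(x, class) = -\log\left(\frac{\exp(x[class])}{\sum_{j}\exp(x[j])}\right) = -x[class] + \log\left(\sum_{j}\exp(x[j])\right)$$

With the default `reduction='mean'`, the per-sample losses are averaged over all samples.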
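As a quick sanity check of the identity above, the following minimal sketch evaluates both forms of the formula for one hand-picked score vector using only the standard library. The helper names and the example scores are illustrative assumptions, not part of the original notebook.

```python
import math

def cross_entropy_softmax_form(x, cls):
    """Left-hand form: -log(exp(x[cls]) / sum_j exp(x[j]))."""
    denom = sum(math.exp(v) for v in x)
    return -math.log(math.exp(x[cls]) / denom)

def cross_entropy_logsumexp_form(x, cls):
    """Right-hand form: -x[cls] + log(sum_j exp(x[j]))."""
    return -x[cls] + math.log(sum(math.exp(v) for v in x))

scores = [1.0, 2.0, 3.0]  # illustrative scores for 3 classes
label = 2                 # index of the target class

a = cross_entropy_softmax_form(scores, label)
b = cross_entropy_logsumexp_form(scores, label)
print(a, b)  # the two forms agree up to floating-point rounding
```

The second form is the one the loops in this notebook implement: a "picked score" term plus a log-sum-exp term.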

In [1]:
import torch
import torch.nn as nn
import numpy as np

In [2]:
loss = nn.CrossEntropyLoss()

### Computing and Understanding the Loss on 2-D Data

In [3]:
# input: 3 records, 5 classes
input = torch.randn(3, 5, requires_grad=True)

# target: shape (3,), one class label for each of the 3 records
target = torch.empty(3, dtype=torch.long).random_(5)

In [4]:
input_array = input.detach().numpy()
target_array = target.detach().numpy()

In [5]:
loss_list = []

## There are 3 samples x in total
for i in np.arange(3):

    ## The -x[class] term: i indexes the sample, target_array[i] is the index of its class
    first = -input_array[i, target_array[i]]
    second = 0
    for j in np.arange(5):

        ## The sum of exponentials in the formula
        ## input_array[i, j] is the unnormalized score of sample i for class j
        second = second + np.exp(input_array[i, j])
    second = np.log(second)
    one_loss = first + second
    
    loss_list.append(one_loss)
my_loss = np.mean(loss_list)
pytorch_loss = loss(input, target)
print("pytorch tools calculation:", pytorch_loss)
print("My calculation:", my_loss)

pytorch tools calculation: tensor(2.1582, grad_fn=<NllLossBackward>)
My calculation: 2.158160789301556
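The explicit loops above can also be written in vectorized NumPy. The sketch below is one possible formulation, not the way PyTorch implements it internally. The function name `cross_entropy_2d` and the random test data are assumptions for illustration. It also subtracts the per-row maximum before exponentiating, a standard trick that leaves the result unchanged but avoids overflow for large scores.

```python
import numpy as np

def cross_entropy_2d(logits, targets):
    """Mean cross entropy for logits of shape (N, C) and integer targets of shape (N,)."""
    # Shifting each row by its max does not change the loss:
    # the shift cancels between -x[class] and log(sum(exp(x))).
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_sum_exp = np.log(np.exp(shifted).sum(axis=1))          # shape (N,)
    picked = shifted[np.arange(logits.shape[0]), targets]      # x[class] per sample
    return float(np.mean(-picked + log_sum_exp))

rng = np.random.default_rng(0)
logits = rng.standard_normal((3, 5))     # 3 records, 5 classes
targets = rng.integers(0, 5, size=3)     # one label per record
print(cross_entropy_2d(logits, targets))
```

When every class has the same score, the loss reduces to `log(C)`, which is a convenient check for any implementation.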


### Computation and Principle for Higher-Dimensional Data

In [6]:
# batch size 6, 3 classes, 5 (width) x 5 (height) image
input = torch.randn(6, 3, 5, 5, requires_grad=True)

# batch size 6, one class label for every pixel of each 5 x 5 image
target = torch.empty(6, 1, 5, 5, dtype=torch.long).random_(3)

# input.shape, target.shape, input, target

In [7]:
input_array = input.detach().numpy()
target_array = target.detach().numpy()

In [8]:
loss_list = []

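## There are 6 * 5 * 5 samples x in total (batch size * rows * columns); every pixel is one x.
## Note: `epoch` here indexes the sample within the batch, not a training epoch.
for epoch in np.arange(6):
    for r_i in np.arange(5):
        for c_i in np.arange(5):
            ## The -x[class] term; the class axis is dimension 1 of the input
            first = -input_array[epoch, target_array[epoch, 0, r_i, c_i], r_i, c_i]
            
            ## The log(sum(exp(x[j]))) term, where j runs over the classes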
            second = 0
            for class_i in np.arange(3):
                second = second + np.exp(input_array[epoch, class_i, r_i, c_i])
            second = np.log(second)

            one_loss = first + second
            loss_list.append(one_loss)
my_loss = np.mean(loss_list)

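# CrossEntropyLoss expects a target of shape (N, H, W) for an input of shape (N, C, H, W),
# so the singleton dimension 1 of the target is squeezed away.
pytorch_loss = loss(input, target.squeeze(1))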

print("pytorch tools calculation:", pytorch_loss)
print("My calculation:", my_loss)

pytorch tools calculation: tensor(1.5038, grad_fn=<NllLoss2DBackward>)
My calculation: 1.5038311872782133
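The per-pixel loops above treat dimension 1 as the class axis and average over every remaining position. A vectorized NumPy sketch of the same idea follows. The function name `cross_entropy_nd` and the random data are illustrative assumptions, and this is not a description of PyTorch's internal implementation.

```python
import numpy as np

def cross_entropy_nd(logits, targets):
    """Mean cross entropy for logits (N, C, H, W) and integer targets (N, H, W)."""
    # Stabilize by subtracting the max over the class axis (axis 1).
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_sum_exp = np.log(np.exp(shifted).sum(axis=1))            # shape (N, H, W)
    # Pick x[class] at every pixel: indices need a singleton class axis.
    picked = np.take_along_axis(shifted, targets[:, None, :, :], axis=1)[:, 0]
    return float(np.mean(-picked + log_sum_exp))

rng = np.random.default_rng(0)
logits = rng.standard_normal((6, 3, 5, 5))      # batch 6, 3 classes, 5 x 5 image
targets = rng.integers(0, 3, size=(6, 5, 5))    # one label per pixel
print(cross_entropy_nd(logits, targets))
```

Equivalently, one can move the class axis to the end, flatten to shape `(N*H*W, C)`, and apply the 2-D computation, since each pixel is an independent sample.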
