## 1. CrossEntropyLoss
### class torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')

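Purpose:

- For single-label classification, this criterion combines nn.LogSoftmax() and nn.NLLLoss() into one loss. It is used to train a classification problem with C classes.

    - The `weight` argument is a 1D Tensor that assigns a weight to each class. It is useful when the training set is class-imbalanced.

    - `input` contains a raw, unnormalized score (logit) for each class, not a probability.

    - The `input` Tensor has size (minibatch, C), or (minibatch, C, d1, d2, ..., dK) with K ≥ 1 for the K-dimensional case.

    - `target` holds a class index in the range [0, C−1], where C is the total number of classes.

    $loss(score,target)=-\log\left(\frac{\exp(score[target])}{\sum_{j=0}^{C-1}\exp(score[j])}\right)=-score[target]+\log\left(\sum_{j=0}^{C-1}\exp(score[j])\right)$
    - With `weight`:
    
    $loss(score,target)=weight[target]\left(-score[target]+\log\left(\sum_{j=0}^{C-1}\exp(score[j])\right)\right)$
    - By default, the losses are averaged over the minibatch.

- Higher-dimensional inputs are also supported, for example 2D images, in which case the NLL loss is computed per element (per pixel).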
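
The two formulas above can be checked numerically. The following is a minimal sketch, assuming a toy batch of 4 samples and 3 classes; the tensor values, the class weights, and the variable names are illustrative and do not come from the original notes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy batch (illustrative values): 4 samples, 3 classes.
score = torch.randn(4, 3)
target = torch.tensor([0, 2, 1, 2])
weight = torch.tensor([1.0, 2.0, 0.5])  # hypothetical per-class weights

# 1) CrossEntropyLoss vs. LogSoftmax followed by NLLLoss.
ce = nn.CrossEntropyLoss()(score, target)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(score), target)

# 2) Closed form per sample: -score[target] + log(sum_j exp(score[j])).
picked = score[torch.arange(4), target]
per_sample = -picked + torch.logsumexp(score, dim=1)
manual = per_sample.mean()

# 3) Weighted form: each sample is scaled by weight[target]; with the default
#    'mean' reduction the sum is divided by the sum of the applied weights.
w = weight[target]
manual_weighted = (w * per_sample).sum() / w.sum()
ce_weighted = nn.CrossEntropyLoss(weight=weight)(score, target)

print(torch.allclose(ce, nll, atol=1e-6))
print(torch.allclose(ce, manual, atol=1e-6))
print(torch.allclose(ce_weighted, manual_weighted, atol=1e-6))
```

Note that with `weight` set, the default average is a weighted mean (divided by the sum of `weight[target]`), not a division by the batch size.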

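Parameters:
- weight (Tensor, optional) - A 1D Tensor of size C with one weight per class. If omitted, every class has weight 1.

- size_average (bool, optional) - Deprecated (see `reduction`). Behaves as True by default.

    - size_average=True: the losses are averaged over the minibatch, taking `weight` into account.

    - size_average=False: the losses are summed over the minibatch.

    - Ignored when reduce=False.

- ignore_index (int, optional) - A target value that is ignored and does not contribute to the input gradient.

    - When size_average=True, the loss is averaged over the non-ignored targets only.

- reduce (bool, optional) - Deprecated (see `reduction`). Behaves as True by default.

    - reduce=True: the losses are averaged or summed over the minibatch, depending on size_average.

    - reduce=False: one loss is returned per target element, and size_average is ignored.
- reduction (string, optional) - The reduction applied to the output: 'none', 'mean', or 'sum'.
    - reduction='none': the loss is computed per element (per pixel), and the output has the same size as target.
    - reduction='mean': the default. The summed loss is divided by the number of non-ignored target elements (or by the sum of the applied class weights when `weight` is given), not simply by batch_size.
    - reduction='sum': the per-element losses are summed.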
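
To make the three `reduction` modes and `ignore_index` concrete, here is a small sketch. The batch of 6 samples, the choice of which targets to ignore, and the variable names are assumptions made for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

score = torch.randn(6, 3)  # 6 samples, 3 classes (illustrative)
# -100 is the default ignore_index, so samples 3 and 5 are skipped.
target = torch.tensor([0, 1, 2, -100, 1, -100])

loss_none = nn.CrossEntropyLoss(reduction='none')(score, target)
loss_sum = nn.CrossEntropyLoss(reduction='sum')(score, target)
loss_mean = nn.CrossEntropyLoss(reduction='mean')(score, target)

kept = target != -100  # mask of targets that count toward the loss

print(loss_none.shape)   # one loss value per target element
print(loss_none[~kept])  # ignored positions contribute 0

# 'sum' adds the per-element losses; 'mean' divides by the number of
# non-ignored targets (4 here), not by the batch size (6).
print(torch.allclose(loss_sum, loss_none.sum(), atol=1e-6))
print(torch.allclose(loss_mean, loss_none.sum() / kept.sum(), atol=1e-6))
```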

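Input: x of shape (N, C), where C = num_classes is the total number of classes. Target: y of shape (N), where each value satisfies 0 ≤ targets[i] ≤ C−1.

Output: a scalar if reduce=True (reduction='mean' or 'sum'). If reduce=False (reduction='none'), the output has the same shape as target, (N).

Input: x of shape (N, C, d1, d2, ..., dK), with K ≥ 1 for the K-dimensional case. Target: y of shape (N, d1, d2, ..., dK), with K ≥ 1 for the K-dimensional case.

Output: a scalar if reduce=True (reduction='mean' or 'sum'). If reduce=False (reduction='none'), the output has the same shape as target, (N, d1, d2, ..., dK), with K ≥ 1 for the K-dimensional case.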
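
The shape rules above can be confirmed with a short check. This is only a sketch: the sizes N=5, C=3, H=4, W=6 and the variable names are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

N, C = 5, 3
criterion_mean = nn.CrossEntropyLoss()                  # reduction='mean'
criterion_none = nn.CrossEntropyLoss(reduction='none')

# Plain classification: input (N, C), target (N).
x = torch.randn(N, C)
y = torch.randint(0, C, (N,))
print(criterion_mean(x, y).shape)    # torch.Size([])
print(criterion_none(x, y).shape)    # torch.Size([5])

# K-dim case with K=2 (per-pixel labels): input (N, C, H, W), target (N, H, W).
H, W = 4, 6
x2 = torch.randn(N, C, H, W)
y2 = torch.randint(0, C, (N, H, W))
print(criterion_mean(x2, y2).shape)  # torch.Size([])
print(criterion_none(x2, y2).shape)  # torch.Size([5, 4, 6])
```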

**Note:** `size_average` and `reduce` are being deprecated. In the meantime, specifying either of them overrides `reduction`. The default value of `reduction` is 'mean'.

### Example

In [21]:
import torch 
import torch.nn as nn

loss1 = nn.CrossEntropyLoss()
# Compute the loss per pixel; the output has the same size as target
loss2=nn.CrossEntropyLoss(reduction='none')

In [16]:
# input: [batch_size=5, num_classes=2, H, W]
input = torch.randn(5,2,3,4,requires_grad=True)

# target: [batch_size=5, H, W]
target = torch.ones(5,3,4, dtype=torch.long)
losses = loss1(input, target)

# The output is a 0-dim (scalar) Tensor
print(losses.size())
print(losses)


# 0-dim Tensor ==> 1-D Tensor of shape [1]
losses=torch.unsqueeze(losses,0)
print(losses)

# Mean of the 1-D Tensor ==> 0-dim Tensor again
print(losses.mean())


torch.Size([])
tensor(0.8765, grad_fn=<NllLoss2DBackward>)
tensor([0.8765], grad_fn=<UnsqueezeBackward0>)
tensor(0.8765, grad_fn=<MeanBackward1>)


In [24]:
# input: [batch_size=5, num_classes=2, H, W]
input = torch.randn(5,2,3,4,requires_grad=True)

# target: [batch_size=5, H, W]
target = torch.ones(5,3,4, dtype=torch.long)
losses = loss2(input, target)

# The output is a Tensor with the same size as target
print(losses.size())
print(losses)


# Add a leading dimension: [5, 3, 4] ==> [1, 5, 3, 4]
losses=torch.unsqueeze(losses,0)
print(losses.size())
print(losses)


# Mean over all elements ==> 0-dim (scalar) Tensor
losses=losses.mean()
print(losses.size())
print(losses)


torch.Size([5, 3, 4])
tensor([[[2.7048, 0.1123, 0.5359, 0.3774],
         [1.6952, 2.1477, 0.6256, 0.0575],
         [0.0589, 1.3138, 1.3982, 0.1222]],

        [[0.3015, 0.8584, 0.5813, 0.4322],
         [1.0937, 0.8110, 0.6381, 0.1653],
         [1.3317, 0.7927, 1.2190, 2.3843]],

        [[1.9092, 1.0830, 0.4308, 0.4853],
         [0.0863, 0.1021, 0.1793, 0.2180],
         [0.3895, 0.3439, 0.1253, 1.0049]],

        [[0.4686, 1.0624, 0.2343, 0.2983],
         [0.5107, 0.8927, 0.3691, 0.0320],
         [1.2497, 0.8411, 1.6227, 0.1318]],

        [[0.2675, 0.8722, 0.4760, 0.3298],
         [0.3781, 0.4217, 2.3875, 1.9081],
         [1.4623, 1.5759, 0.3132, 2.7622]]], grad_fn=<NllLoss2DBackward>)
torch.Size([1, 5, 3, 4])
tensor([[[[2.7048, 0.1123, 0.5359, 0.3774],
          [1.6952, 2.1477, 0.6256, 0.0575],
          [0.0589, 1.3138, 1.3982, 0.1222]],

         [[0.3015, 0.8584, 0.5813, 0.4322],
          [1.0937, 0.8110, 0.6381, 0.1653],
          [1.3317, 0.7927, 1.2190, 2.3843]],

 

In [None]:
import torch.nn.functional as F  # needed for F.interpolate below


class CrossEntropy(nn.Module):
    def __init__(self, ignore_label=-1, weight=None):
        super(CrossEntropy, self).__init__()
        self.ignore_label = ignore_label  # e.g. 255
        self.criterion = nn.CrossEntropyLoss(weight=weight,  # per-class weights
                                             ignore_index=ignore_label)  # target value to skip (e.g. 255); excluded from the loss

    def forward(self, score, target):
        '''

        :param score: model output Tensor: [bs, num_classes, 128, 256]
        :param target: label Tensor: [bs, 512, 1024]
        :return: the cross-entropy loss
        '''
        ph, pw = score.size(2), score.size(3)  # 128, 256
        h, w = target.size(1), target.size(2)  # 512, 1024
        # If the spatial size of score differs from the label's, resize score to the label size.
        # F.upsample is deprecated; F.interpolate is its replacement.
        if ph != h or pw != w:
            score = F.interpolate(
                    input=score, size=(h, w), mode='bilinear', align_corners=False)

        loss = self.criterion(score, target)

        return loss
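
The resize-then-loss step in `forward` can be tried on small tensors. The sketch below is a functional restatement under assumed sizes (2 images, 4 classes, scores at 1/4 of the label resolution) with 255 as the ignore value. It uses `F.interpolate`, the current replacement for the deprecated `F.upsample`, and `F.cross_entropy`, the functional form of `nn.CrossEntropyLoss`. The tensor sizes and the band of ignored pixels are made up for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes: 2 images, 4 classes, scores at 1/4 label resolution.
score = torch.randn(2, 4, 8, 16, requires_grad=True)
target = torch.randint(0, 4, (2, 32, 64))
target[:, :4, :] = 255  # mark a band of pixels as "ignore"

# Resize the scores to the label's spatial size before computing the loss.
h, w = target.size(1), target.size(2)
score_up = F.interpolate(score, size=(h, w), mode='bilinear',
                         align_corners=False)

# Pixels labeled 255 are excluded from the loss and from the gradient.
loss = F.cross_entropy(score_up, target, ignore_index=255)
loss.backward()

print(score_up.shape)    # torch.Size([2, 4, 32, 64])
print(loss.shape)        # torch.Size([])
print(score.grad.shape)  # torch.Size([2, 4, 8, 16])
```

Upsampling the scores rather than downsampling the labels keeps the integer class indices intact, since interpolating label maps would produce meaningless in-between values.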