Loss function
=============
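1. A loss function measures how far a model's prediction f(x) deviates from the true value Y.
2. It is a non-negative real-valued function, usually written L(Y, f(x)); the smaller the loss, the closer the model's predictions are to the true values.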

In [6]:
import torch

x = torch.rand(2, 2) 
y = torch.rand(2, 2)

print(x)
print(y)

tensor([[0.6480, 0.6302],
        [0.1023, 0.6085]])
tensor([[0.0649, 0.0987],
        [0.4243, 0.3479]])


torch.nn.L1Loss(size_average=True)
-----------------------------------------------------
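* loss(x,y) = 1/n∑|xi−yi|, the mean of the absolute differences between x and y over all n elements
* When size_average is True the sum is divided by n; when it is False the 1/n factor is dropped and the plain sum is returned. Recent PyTorch releases deprecate size_average in favour of reduction='mean' or reduction='sum'.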

In [8]:
lossFunc = torch.nn.L1Loss(reduction='mean')  # same result as the deprecated size_average=True
loss = lossFunc(x, y)

print(loss)

tensor(0.4243)
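
The printed value can be reproduced by hand. What follows is a minimal plain-Python sketch of the L1 formula, using the rounded values printed for x and y earlier. The helper name `l1_loss` and its `average` flag are illustrative and not part of PyTorch.

```python
# Rounded values copied from the printed tensors x and y.
x = [[0.6480, 0.6302], [0.1023, 0.6085]]
y = [[0.0649, 0.0987], [0.4243, 0.3479]]

def l1_loss(x, y, average=True):
    """Sum of |xi - yi| over all elements, divided by n when average is True."""
    diffs = [abs(a - b) for row_x, row_y in zip(x, y) for a, b in zip(row_x, row_y)]
    total = sum(diffs)
    return total / len(diffs) if average else total

print(f"{l1_loss(x, y):.4f}")                 # 0.4243, agreeing with tensor(0.4243)
print(f"{l1_loss(x, y, average=False):.4f}")  # 1.6972, the sum without the 1/n factor
```

The `average` flag plays the role that size_average plays in the text: it only decides whether the sum is divided by the element count.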


torch.nn.MSELoss(size_average=True)
---------------------------------------------------------
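* loss(x,y) = 1/n∑(xi−yi)^2, the mean of the squared differences between x and y over all n elements
* When size_average is True the sum is divided by n; when it is False the 1/n factor is dropped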

In [9]:
lossFunc = torch.nn.MSELoss(reduction='mean')  # same result as the deprecated size_average=True
loss = lossFunc(x, y)

print(loss)

tensor(0.1985)
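
As a cross-check of the squared-error formula, here is a small plain-Python sketch that recomputes the result from the rounded x and y values printed earlier. The function name `mse_loss` and the `average` flag are assumptions made for illustration only.

```python
# Rounded values copied from the printed tensors x and y.
x = [[0.6480, 0.6302], [0.1023, 0.6085]]
y = [[0.0649, 0.0987], [0.4243, 0.3479]]

def mse_loss(x, y, average=True):
    """Sum of (xi - yi)^2 over all elements, divided by n when average is True."""
    squares = [(a - b) ** 2 for row_x, row_y in zip(x, y) for a, b in zip(row_x, row_y)]
    total = sum(squares)
    return total / len(squares) if average else total

print(f"{mse_loss(x, y):.4f}")                 # 0.1985, agreeing with tensor(0.1985)
print(f"{mse_loss(x, y, average=False):.4f}")  # 0.7941, the sum without the 1/n factor
```

Because the differences are squared, the largest gap (0.5831 in the first element) contributes more than 40% of the total, whereas it contributes about a third of the L1 total.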


torch.nn.NLLLoss(weight=None, size_average=True)
--------------
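* Suited to training a multi-class classifier
* weight: a 1-D tensor with n elements, one weight per class in an n-class problem
* The target (y or class) for this loss is a 1-D tensor of class labels; the input x is expected to hold per-class log-probabilities (the example below feeds raw random values only to illustrate the arithmetic)
* Without weight: loss(x,class) = −x[class]
* With weight: loss(x,class) = −weights[class]∗x[class]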

In [12]:
x = torch.randn(3, 3) 
cls = torch.tensor([0, 2, 1]) 

print(x)
print(cls)

tensor([[-0.5926, -0.2742,  0.4403],
        [-0.5851,  1.4676,  0.0196],
        [-0.4039,  0.2867, -0.3413]])
tensor([0, 2, 1])


In [16]:
lossFunc = torch.nn.NLLLoss()
loss = lossFunc(x, cls)

print(loss)  # (0.5926 - 0.0196 - 0.2867) / 3 

tensor(0.0955)
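
The weighted and unweighted formulas can be sketched in plain Python as follows. The function name `nll_loss` is illustrative, and the sample weights `[1.0, 2.0, 3.0]` are made up for the example. One assumption worth flagging: when weights are given, this sketch averages by dividing by the sum of the selected weights, which is the convention PyTorch documents for its mean reduction.

```python
# Rounded values copied from the printed x and cls in the NLLLoss example.
x = [[-0.5926, -0.2742, 0.4403],
     [-0.5851, 1.4676, 0.0196],
     [-0.4039, 0.2867, -0.3413]]
cls = [0, 2, 1]

def nll_loss(x, cls, weights=None):
    """Average of -weights[c] * x[i][c], normalised by the sum of the selected weights."""
    if weights is None:
        weights = [1.0] * len(x[0])  # unweighted case: every class counts equally
    total = sum(-weights[c] * row[c] for row, c in zip(x, cls))
    return total / sum(weights[c] for c in cls)

# 0.0954; the notebook's 0.0955 presumably comes from the unrounded tensor values
print(f"{nll_loss(x, cls):.4f}")
# Hypothetical class weights, chosen only to show the weighted formula
print(f"{nll_loss(x, cls, weights=[1.0, 2.0, 3.0]):.4f}")  # -0.0066
```

The negative weighted result is possible only because x here is raw random data; with genuine log-probabilities every term −x[class] is non-negative.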


torch.nn.CrossEntropyLoss(weight=None, size_average=True)
------------------------------------------------------------
* Combines LogSoftmax and NLLLoss in a single loss
* Suited to training a multi-class classifier
* weight: a 1-D tensor with n elements, one weight per class in an n-class problem
* The target (y) for this loss is a 1-D tensor of class labels
* Without weight: loss(x,class) = −x[class] + log(∑exp(x[i]))
* With weight: loss(x,class) = weights[class]∗(−x[class] + log(∑exp(x[i])))

In [19]:
# Method 1: use CrossEntropyLoss directly
lossFunc = torch.nn.CrossEntropyLoss()
loss = lossFunc(x, cls)

print(loss)

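# Method 2: Softmax over the class dimension, then log, then NLLLoss
sm = torch.nn.Softmax(dim=1)  # explicit dim avoids the implicit-dimension deprecation warning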
lossFunc = torch.nn.NLLLoss()
loss = lossFunc(torch.log(sm(x)), cls)

print(sm(x))
print(loss)

tensor(1.3714)


tensor([[0.1929, 0.2652, 0.5419],
        [0.0942, 0.7334, 0.1724],
        [0.2463, 0.4914, 0.2623]])
tensor(1.3714)
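
The equivalence of the two methods can also be checked without PyTorch. The sketch below is a plain-Python illustration using the rounded x values printed earlier; the helper names `log_softmax`, `nll` and `cross_entropy` are chosen for this example and are not the library's functions. It does not guard against overflow for large inputs, which a production implementation would handle by subtracting the row maximum first.

```python
import math

# Rounded values copied from the printed x and cls in the NLLLoss example.
x = [[-0.5926, -0.2742, 0.4403],
     [-0.5851, 1.4676, 0.0196],
     [-0.4039, 0.2867, -0.3413]]
cls = [0, 2, 1]

def log_softmax(row):
    """log(exp(v) / sum(exp(row))) for every v in one row of scores."""
    log_sum_exp = math.log(sum(math.exp(v) for v in row))
    return [v - log_sum_exp for v in row]

def nll(log_probs, cls):
    """Unweighted mean of -log_probs[i][cls[i]]."""
    return sum(-row[c] for row, c in zip(log_probs, cls)) / len(cls)

def cross_entropy(x, cls):
    """Unweighted mean of -x[class] + log(sum(exp(x[i])))."""
    terms = [-row[c] + math.log(sum(math.exp(v) for v in row)) for row, c in zip(x, cls)]
    return sum(terms) / len(terms)

direct = cross_entropy(x, cls)                       # the closed-form formula
two_step = nll([log_softmax(r) for r in x], cls)     # LogSoftmax followed by NLL

print(f"{direct:.3f}")    # 1.371, agreeing with tensor(1.3714)
print(f"{two_step:.3f}")  # 1.371
```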


torch.nn.NLLLoss2d(weight=None, size_average=True)
---------------------------------------------------
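* NLLLoss for images: computes the NLL loss for every pixel. Recent PyTorch releases deprecate NLLLoss2d; NLLLoss accepts the same nBatch x nClasses x height x width input.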


In [24]:
m = torch.nn.Conv2d(16, 32, (3, 3)).float()

loss = torch.nn.NLLLoss2d()

# input to the convolution is of size nBatch x nChannels x height x width
# (no Variable wrapper needed since PyTorch 0.4: tensors track gradients directly)
input = torch.randn(3, 16, 10, 10)
print(input)

# each element in target has to have 0 <= value < nClasses,
# where nClasses = 32 is the number of output channels of m
target = torch.LongTensor(3, 8, 8).random_(0, 4)
print(target)

# m(input) is of size nBatch x nClasses x 8 x 8, matching the 3 x 8 x 8 target
output = loss(m(input), target)
print(m(input))
print(output)

output.backward()

[Printed tensors truncated in the source: input (3 x 16 x 10 x 10), target (3 x 8 x 8, values 0 to 3), and m(input) (3 x 32 x 8 x 8, grad_fn=<ThnnConv2DBackward>)]
tensor(0.0246, grad_fn=<NllLoss2DBackward>)
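
To make the per-pixel computation concrete, here is a minimal plain-Python sketch on a tiny hand-written example. The function name `nll_loss_2d`, the nested-list layout and the score values are all assumptions made for illustration; the sketch mirrors the unweighted, averaged case only.

```python
def nll_loss_2d(scores, target):
    """scores: nested list indexed [batch][class][row][col];
    target: nested list indexed [batch][row][col] holding class indices.
    Returns the mean of -scores[n][target[n][h][w]][h][w] over every pixel."""
    total, count = 0.0, 0
    for n, sample in enumerate(target):
        for h, row in enumerate(sample):
            for w, c in enumerate(row):
                total += -scores[n][c][h][w]
                count += 1
    return total / count

# One image, two classes, a 1 x 2 pixel grid (made-up scores).
scores = [[[[-0.25, -2.0]],     # class 0 score at each of the two pixels
           [[-1.5, -0.75]]]]    # class 1 score at each of the two pixels
target = [[[0, 1]]]             # left pixel is class 0, right pixel is class 1

print(nll_loss_2d(scores, target))  # (0.25 + 0.75) / 2 = 0.5
```

Each pixel is treated as an independent classification sample, so the result equals the ordinary NLL loss applied to the flattened list of pixels.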


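torch.nn.KLDivLoss(size_average=True)
-------------------------------------
* KL divergence loss; KL divergence is commonly used to measure how much one probability distribution differs from another (it is not symmetric, so it is not a true distance)
* loss(x, target) = 1/n∑(target_i ∗ (log(target_i) − x_i)), where x holds log-probabilities and target holds probabilities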
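
The formula can be sketched in plain Python as below. The function name `kl_div_loss`, the `average` flag and the two example distributions are illustrative. The sketch assumes x already contains log-probabilities and uses the usual convention that a target entry of 0 contributes 0 to the sum.

```python
import math

def kl_div_loss(x, target, average=True):
    """Sum of target_i * (log(target_i) - x_i), divided by n when average is True.
    x: log-probabilities; target: probabilities."""
    terms = [t * (math.log(t) - xi) if t > 0 else 0.0 for xi, t in zip(x, target)]
    total = sum(terms)
    return total / len(terms) if average else total

p = [0.5, 0.5]                      # target distribution (made-up)
q = [0.25, 0.75]                    # model distribution (made-up)
x = [math.log(v) for v in q]        # the loss expects log-probabilities as input

print(f"{kl_div_loss(x, p, average=False):.4f}")  # 0.1438, i.e. 0.5 * log(4/3)
print(kl_div_loss(x, q))                          # 0.0 when both distributions match
```

Without averaging, the result is the KL divergence from q to p in the textbook sense; the 1/n factor in the formula divides by the number of elements rather than by the batch size.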

torch.nn.MarginRankingLoss(margin=0, size_average=True)
----------------------------------------------------
* loss(x,y) = max(0, −y∗(x1 − x2) + margin), where x1 and x2 are 1-D mini-batch tensors and y is a 1-D mini-batch tensor of -1 or 1 values. y = 1 means x1 should rank higher than x2; y = -1 means the reverse.
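
A short plain-Python sketch of this formula, with made-up scores, may help show which pairs are penalised. The function name `margin_ranking_loss` and the `average` flag are assumptions for illustration.

```python
def margin_ranking_loss(x1, x2, y, margin=0.0, average=True):
    """Per pair: max(0, -y * (x1 - x2) + margin); averaged over pairs when average is True."""
    terms = [max(0.0, -yi * (a - b) + margin) for a, b, yi in zip(x1, x2, y)]
    total = sum(terms)
    return total / len(terms) if average else total

x1 = [2.0, 0.5, 1.0]
x2 = [1.0, 1.5, 3.0]
y = [1, 1, -1]   # pairs 1 and 2 want x1 > x2; pair 3 wants x2 > x1

# Only the second pair is ranked the wrong way round, so only it is penalised.
print(margin_ranking_loss(x1, x2, y, average=False))  # 1.0
print(margin_ranking_loss(x1, x2, y, margin=0.5))     # (0 + 1.5 + 0) / 3 = 0.5
```

Raising the margin also penalises correctly ordered pairs whose scores are closer together than the margin.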

torch.nn.HingeEmbeddingLoss(size_average=True)
------------------------------------------------
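* Given an input x (a 2-D mini-batch tensor) and the corresponding labels y (a 1-D tensor of 1 or -1 values), this function computes the loss between them.
* loss(x,y) = 1/n ∑ l_i, where l_i = xi if yi == 1, and l_i = max(0, margin − xi) if yi == −1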
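
The piecewise formula can be sketched in plain Python as follows. The function name `hinge_embedding_loss` and the sample values are illustrative; the inputs are flattened to 1-D lists for brevity, and the default margin of 1.0 is chosen to mirror PyTorch's default.

```python
def hinge_embedding_loss(x, y, margin=1.0, average=True):
    """Per element: x_i when y_i == 1, max(0, margin - x_i) when y_i == -1."""
    terms = [xi if yi == 1 else max(0.0, margin - xi) for xi, yi in zip(x, y)]
    total = sum(terms)
    return total / len(terms) if average else total

x = [0.5, 0.25, 2.0, 0.75]   # made-up values, e.g. distances between pairs
y = [1, -1, -1, 1]           # 1: similar pair, -1: dissimilar pair

# Terms: 0.5, max(0, 1 - 0.25) = 0.75, max(0, 1 - 2.0) = 0, 0.75
print(hinge_embedding_loss(x, y))  # (0.5 + 0.75 + 0 + 0.75) / 4 = 0.5
```

A dissimilar pair stops contributing once its value exceeds the margin, as the third element shows.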