[![](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/itmorn/AI.handbook/blob/main/DL/torch/nn/LossFunction/KLDivLoss.ipynb)

# KLDivLoss
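Kullback-Leibler divergence loss.
$$D_{KL}(P \,\|\, Q)=\sum_{i=1}^{n} P(x_i)\log\frac{P(x_i)}{Q(x_i)}$$
In machine learning, P usually denotes the true distribution of the samples and Q the distribution predicted by the model. The KL divergence measures how far Q is from P, so it can be used directly as a loss value.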
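
As a small worked example of the formula, the sketch below evaluates it in plain Python for two hand-picked distributions. The helper name `kl_divergence` and the values of `p` and `q` are illustrative assumptions, not part of PyTorch.

```python
import math

def kl_divergence(p, q):
    """Return D_KL(p || q) for two discrete distributions given as lists of probabilities."""
    # Terms with p_i == 0 are skipped, following the convention 0 * log(0) = 0.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]    # hypothetical "true" distribution P
q = [0.25, 0.75]  # hypothetical predicted distribution Q

kl_pq = kl_divergence(p, q)  # divergence of Q from P
kl_qp = kl_divergence(q, p)  # arguments swapped
kl_pp = kl_divergence(p, p)  # a distribution compared with itself

print(kl_pq, kl_qp, kl_pp)
```

The divergence is zero when the two distributions are identical, positive otherwise, and not symmetric: swapping the arguments generally gives a different value.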

**Definition**:  
torch.nn.KLDivLoss(size_average=None, reduce=None, reduction='mean', log_target=False)

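**Parameters**:  
- reduction (str, optional) – Specifies the reduction to apply to the output: 'none' | 'batchmean' | 'sum' | 'mean'. Default: 'mean'. Note that 'mean' divides the summed loss by the total number of elements, while 'batchmean' divides it by the batch size only, which matches the mathematical definition of KL divergence.

- log_target (bool, optional) – Specifies whether the target is passed in log space. Default: False.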
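
To make the `reduction` options concrete, here is a minimal sketch that applies each mode to the same random batch. The tensor names `log_q` and `p`, the seed, and the shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
log_q = F.log_softmax(torch.randn(3, 5), dim=1)  # model output, in log space
p = F.softmax(torch.rand(3, 5), dim=1)           # target probabilities

# 'none' keeps the pointwise terms p * (log p - log q), shape (3, 5)
pointwise = nn.KLDivLoss(reduction="none")(log_q, p)
# 'sum' adds up every pointwise term
total = nn.KLDivLoss(reduction="sum")(log_q, p)
# 'batchmean' divides the sum by the batch size (3 here)
batchmean = nn.KLDivLoss(reduction="batchmean")(log_q, p)
# 'mean' divides the sum by the number of elements (3 * 5 here);
# PyTorch may emit a warning recommending 'batchmean' instead
mean = nn.KLDivLoss(reduction="mean")(log_q, p)

print(pointwise.shape, total.item(), batchmean.item(), mean.item())
```

With a batch of 3 rows and 5 classes, `batchmean` is the sum divided by 3 and `mean` is the sum divided by 15, so only `batchmean` gives the average KL divergence per sample.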

**Example**:  
import torch
import torch.nn as nn
torch.manual_seed(666)

import torch.nn.functional as F
log_target = False
kl_loss = nn.KLDivLoss(reduction="batchmean", log_target=log_target)
# input must be a distribution in log space (log-probabilities)
input = F.log_softmax(torch.randn(3, 5, requires_grad=True), dim=1)
print("input:\n", input, "\n")

# Sample a batch of distributions. Usually this would come from the dataset
target = F.softmax(torch.rand(3, 5), dim=1)
print("target:\n", target, "\n")

loss = kl_loss(input, target)
print("loss:\n", loss, "\n")

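# Recompute the loss by hand: sum over classes, then average over the batch
if log_target:
    # target holds log-probabilities
    print((target.exp() * (target - input)).sum(dim=1).mean())
else:
    # target holds probabilities; xlogy treats 0 * log(0) as 0
    print((torch.xlogy(target, target) - target * input).sum(dim=1).mean())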


input:
 tensor([[-3.3015, -1.1192, -2.6382, -1.1952, -1.3375],
        [-2.9087, -0.2243, -2.3617, -3.5050, -3.8121],
        [-1.7003, -1.7383, -1.0231, -1.6480, -2.4119]],
       grad_fn=<LogSoftmaxBackward0>) 

target:
 tensor([[0.1605, 0.1574, 0.2878, 0.2175, 0.1767],
        [0.1900, 0.1799, 0.1236, 0.3104, 0.1961],
        [0.1795, 0.1527, 0.1553, 0.2535, 0.2590]]) 

loss:
 tensor(0.5752, grad_fn=<DivBackward0>) 

tensor(0.5752, grad_fn=<MeanBackward0>)
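
The run above only exercises the `log_target=False` branch. As a rough sketch of the other branch, the snippet below passes the same target once as probabilities and once as log-probabilities and checks that both settings agree. The names `log_q` and `log_p` and the seed are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
log_q = F.log_softmax(torch.randn(3, 5), dim=1)  # model output, in log space
log_p = F.log_softmax(torch.rand(3, 5), dim=1)   # target, in log space

# log_target=False expects the target as probabilities
loss_prob = nn.KLDivLoss(reduction="batchmean", log_target=False)(log_q, log_p.exp())
# log_target=True expects the target as log-probabilities
loss_log = nn.KLDivLoss(reduction="batchmean", log_target=True)(log_q, log_p)

# Hand-written version of the log-space formula: exp(log p) * (log p - log q)
manual = (log_p.exp() * (log_p - log_q)).sum(dim=1).mean()

print(loss_prob.item(), loss_log.item(), manual.item())
```

Passing the target in log space can be convenient when it already comes out of a `log_softmax`, since it avoids exponentiating and then taking the logarithm again.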
