# KL Divergence in PyTorch


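The KL divergence is defined as follows, as an integral for continuous densities $p(x)$ and $q(x)$ and as a sum for discrete distributions over $N$ outcomes:
$$
D_{KL}(P \| Q) = \int_{-\infty}^{+\infty} p(x) \log\frac{p(x)}{q(x)}\,\mathrm{d}x, \qquad D_{KL}(P \| Q) = \sum_{i=1}^{N} p_i \cdot (\log p_i - \log q_i)
$$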
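
As a minimal sketch of the discrete form above, the sum can be written directly with tensor operations. The helper name `kl_divergence` and the two example distributions are illustrative assumptions, not part of PyTorch or of the original notebook.

```python
import torch

def kl_divergence(p, q):
    """Discrete D_KL(P || Q) = sum_i p_i * (log p_i - log q_i)."""
    return torch.sum(p * (torch.log(p) - torch.log(q)))

# Two illustrative distributions over two outcomes (each sums to 1).
p = torch.tensor([0.5, 0.5])
q = torch.tensor([0.9, 0.1])

kl_pq = kl_divergence(p, q)  # divergence of Q from P
kl_qp = kl_divergence(q, p)  # generally differs: KL is not symmetric
kl_pp = kl_divergence(p, p)  # zero when both distributions are identical

print(kl_pq.item(), kl_qp.item(), kl_pp.item())
```

Swapping the arguments changes the result, which is why the order of input and target matters when calling the library function.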


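With its default reduction, which averages over all $n$ elements, the KL divergence in PyTorch is calculated by the following equation:
$$
\mathrm{loss}(x, \mathrm{target}) = \frac{1}{n} \sum_{i=1}^{n} \mathrm{target}_i \cdot (\log \mathrm{target}_i - x_i)
$$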

So, when we call `kl_div` in PyTorch, the input (the $x_i$ in the equation above) should be the log of the predicted probability and the target should be the ground-truth probability. The result is averaged over the number of elements $n$, rather than summed as in the definition.
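
The averaging described above can be made explicit through the `reduction` argument of `F.kl_div` in recent PyTorch releases. The sketch below assumes an illustrative batch of 4 samples over 5 classes and checks how the `"sum"`, `"mean"`, and `"batchmean"` reductions relate to each other; the tensor names are hypothetical.

```python
import warnings

import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Illustrative batch: 4 samples, each a distribution over 5 classes.
log_q = F.log_softmax(torch.randn(4, 5), dim=1)  # input: log-probabilities
p = F.softmax(torch.randn(4, 5), dim=1)          # target: probabilities

kl_sum = F.kl_div(log_q, p, reduction="sum")
with warnings.catch_warnings():
    # Recent versions warn that "mean" divides by the element count.
    warnings.simplefilter("ignore")
    kl_mean = F.kl_div(log_q, p, reduction="mean")
kl_batchmean = F.kl_div(log_q, p, reduction="batchmean")

# "mean" divides the sum by the number of elements (4 * 5 here).
print(torch.allclose(kl_mean, kl_sum / log_q.numel()))
# "batchmean" divides the sum by the batch size (4 here).
print(torch.allclose(kl_batchmean, kl_sum / log_q.size(0)))
```

If the per-sample divergence from the mathematical definition is what you need, `"batchmean"` is likely the closer match, since `"mean"` also divides by the number of classes.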

In the code below, we first call the `kl_div` function to calculate the KL divergence directly, then we calculate it by hand.


In [1]:
import torch
import torch.nn.functional as F

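# Random values in [0, 1); not normalized, but the formula applies element-wise.
input_ = torch.rand(10,1)
target_ = torch.rand(10,1)

# kl_div expects the input as log-probabilities.
log_input = torch.log(input_)

KLD = F.kl_div(log_input, target_)
print(KLD)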

Variable containing:
 0.3291
[torch.FloatTensor of size 1]



Now we calculate the same quantity by hand:

In [2]:
# Sum target * (log(target) - log(input)), then average over all elements.
kld = torch.sum(target_*(torch.log(target_)-log_input))/input_.numel()
print(kld)

0.329109419137


The outputs are consistent: both approaches give 0.3291.
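
Comparing printed numbers by eye works for a single run. A repeatable variant of the same check might look like the sketch below, which assumes a fixed seed, normalizes both tensors so they are valid probability distributions, and uses `reduction="sum"` so the result matches the definition rather than the element-wise average. The variable names are illustrative.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed so the comparison is repeatable

# Normalize random positive values so each tensor sums to 1.
pred = torch.rand(10, 1)
pred = pred / pred.sum()
target = torch.rand(10, 1)
target = target / target.sum()

# Built-in: input is log(pred), target is a plain probability tensor.
kl_builtin = F.kl_div(torch.log(pred), target, reduction="sum")
# Manual: the discrete definition written out term by term.
kl_manual = torch.sum(target * (torch.log(target) - torch.log(pred)))

print(torch.allclose(kl_builtin, kl_manual))
```

With normalized inputs the result is a true KL divergence and is therefore non-negative, which does not hold in general for unnormalized tensors.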