- [How to Calculate KL Divergence in Python](https://www.statology.org/kl-divergence-python/)

In [1]:
from scipy.special import rel_entr
import numpy as np

In [2]:
def natstobits(n):
    """
    Convert nats to bits
    """
    return np.log2(np.e) * n

In [3]:
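# define two probability distributions over the same 8 outcomes
P = [.05, .1, .2, .05, .15, .25, .08, .12]
Q = [.3, .1, .2, .1, .1, .02, .08, .1]


# calculate KL(P || Q); rel_entr returns the elementwise terms
# P * ln(P / Q), so their sum is the divergence in nats
x = sum(rel_entr(P, Q))
print("{} nats = {} bits".format(x, natstobits(x)))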

# 0.589885181619163 nats

0.589885181619163 nats = 0.8510244262158521 bits
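
As a cross-check on the nats-to-bits conversion, the same divergence can be obtained from `scipy.stats.entropy`, which returns the KL divergence when given two distributions and accepts a `base` argument. This is only a sketch; the variable names `kl_nats` and `kl_bits` are illustrative and not part of the original notebook.

```python
import numpy as np
from scipy.stats import entropy

P = [.05, .1, .2, .05, .15, .25, .08, .12]
Q = [.3, .1, .2, .1, .1, .02, .08, .1]

# With two arguments, entropy(pk, qk) computes KL(pk || qk).
kl_nats = entropy(P, Q)          # natural log by default -> nats
kl_bits = entropy(P, Q, base=2)  # log base 2 -> bits

print("{} nats = {} bits".format(kl_nats, kl_bits))

# Dividing nats by ln(2) is the same as multiplying by log2(e).
print(np.isclose(kl_nats / np.log(2), kl_bits))
```

Note that `entropy` normalizes its inputs to sum to 1, so it should agree with `rel_entr` only when `P` and `Q` are already valid distributions, as they are here.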


Note that KL divergence is not symmetric: KL(P || Q) and KL(Q || P) generally differ, so it is not a true distance metric.

In [4]:
# calculate KL(Q || P) by swapping the arguments
x = sum(rel_entr(Q, P))
print("{} nats = {} bits".format(x, natstobits(x)))

# 0.497549319448034 nats

0.497549319448034 nats = 0.7178119357653573 bits
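
To see where the asymmetry comes from, it can help to look at the per-outcome terms rather than only the sums. The sketch below uses plain NumPy; `terms_pq` and `terms_qp` are illustrative names that do not appear in the original notebook.

```python
import numpy as np

P = np.array([.05, .1, .2, .05, .15, .25, .08, .12])
Q = np.array([.3, .1, .2, .1, .1, .02, .08, .1])

# Elementwise contributions to each direction of the divergence (nats).
terms_pq = P * np.log(P / Q)  # terms of KL(P || Q)
terms_qp = Q * np.log(Q / P)  # terms of KL(Q || P)

print(np.round(terms_pq, 4))
print(np.round(terms_qp, 4))

# Index of the outcome contributing most to each direction.
print(np.argmax(terms_pq), np.argmax(terms_qp))
print(terms_pq.sum(), terms_qp.sum())
```

Each direction weights the log-ratio by a different distribution, so a different outcome dominates each sum: KL(P || Q) is driven by the outcome where P is large and Q is small, and KL(Q || P) by the outcome where Q is large and P is small. Individual terms can be negative even though each total is non-negative.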


In [5]:
import torch
import torch.nn.functional as F

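Pt = torch.tensor(P)
Qt = torch.tensor(Q)
# direct formula: KL(P || Q) = sum(P * log(P / Q))
out0 = (Pt * (Pt / Qt).log()).sum()
# F.kl_div(input, target) expects input as log-probabilities and
# computes KL(target || input), so the argument order looks reversed
out1 = F.kl_div(Qt.log(), Pt, reduction='sum')  # KL(P || Q)
out2 = F.kl_div(Pt.log(), Qt, reduction='sum')  # KL(Q || P)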

print(out0, out1, out2)

tensor(0.5899) tensor(0.5899) tensor(0.4975)


In [6]:
Pr = np.array(P)
Qr = np.array(Q)
(Pr * np.log(Pr / Qr)).sum()

0.589885181619163

In [7]:
# correlation between P and Q
torch.corrcoef(torch.stack([Pt, Qt]))

tensor([[ 1.0000, -0.3855],
        [-0.3855,  1.0000]])

- Kolmogorov-Smirnov test

In [14]:
from scipy import stats

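# given two arrays, kstest runs a two-sample test that treats the
# probability values themselves as samples; [0] is the KS statistic
stats.kstest(Pr, Qr)[0], stats.kstest(Qr, Pr)[0]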

(0.25, 0.25)
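
The two-sample call above compares the empirical distributions of the probability values themselves. If instead the 8 outcomes are assumed to be ordered categories (an assumption not stated in the original), a KS-style statistic between P and Q can be sketched as the largest gap between their cumulative distributions. The names `cdf_p`, `cdf_q`, and `ks_stat` are illustrative.

```python
import numpy as np

P = np.array([.05, .1, .2, .05, .15, .25, .08, .12])
Q = np.array([.3, .1, .2, .1, .1, .02, .08, .1])

# Cumulative distribution of each pmf over the (assumed ordered) outcomes.
cdf_p = np.cumsum(P)
cdf_q = np.cumsum(Q)

# KS-style statistic: maximum absolute difference between the two CDFs.
ks_stat = np.max(np.abs(cdf_p - cdf_q))
print(ks_stat)
```

Unlike KL divergence, this quantity is symmetric in P and Q, but it depends on the ordering of the outcomes, so it is only meaningful when that ordering is.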