Kullback-Leibler Divergence

The Kullback-Leibler divergence, sometimes also called the relative entropy, of a distribution p from a distribution q is defined as:

$$\DKL{p || q} = \sum_{x \in \mathcal{X}} p(x) \log_2 \frac{p(x)}{q(x)}$$

The Kullback-Leibler divergence quantifies the average number of extra bits required to represent samples from a distribution p when using a code optimized for an arbitrary distribution q. This can be seen through the following identity:

$$\DKL{p || q} = \xH{p || q} - \H{p}$$

where the cross entropy quantifies the total cost of encoding p using q, and the entropy quantifies the true, minimum cost of encoding p. For example, let's consider the cost of representing a biased coin with a fair one:

In [1]: import dit

In [2]: from dit.divergences import kullback_leibler_divergence

In [3]: p = dit.Distribution(['0', '1'], [3/4, 1/4])

In [4]: q = dit.Distribution(['0', '1'], [1/2, 1/2])

In [5]: kullback_leibler_divergence(p, q)
Out[5]: 0.18872187554086717

That is, encoding the biased coin with a code matched to the fair one wastes roughly 0.1887 bits of overhead per outcome.
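Expanding the definition for this pair of distributions shows where that number comes from:

$$\DKL{p || q} = \frac{3}{4} \log_2 \frac{3/4}{1/2} + \frac{1}{4} \log_2 \frac{1/4}{1/2} = \frac{3}{4} \log_2 \frac{3}{2} - \frac{1}{4} \approx 0.1887$$

The identity relating the divergence to the cross entropy and the entropy can also be checked numerically. The following is a sketch that assumes dit exposes cross_entropy in dit.divergences and entropy in dit.multivariate, applied to the p and q defined above:

In [6]: from dit.divergences import cross_entropy

In [7]: from dit.multivariate import entropy

In [8]: cross_entropy(p, q) - entropy(p)  # should agree with kullback_leibler_divergence(p, q), approximately 0.1887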

Not a Metric

Although the Kullback-Leibler divergence is often used to see how "different" two distributions are, it is not a metric. Importantly, it is not symmetric, nor does it obey the triangle inequality. It does, however, have the following property:

$$\DKL{p || q} \ge 0$$

with equality if and only if p = q. This makes it a premetric.
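For instance, with the coin distributions defined above, swapping the arguments gives a different value. This is a sketch continuing the session from the example:

In [9]: kullback_leibler_divergence(p, q)  # approximately 0.1887

In [10]: kullback_leibler_divergence(q, p)  # approximately 0.2075, so the divergence is not symmetric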

API

dit.divergences.kullback_leibler_divergence