```python
%run ../common/import_all.py

from common.setup_notebook import set_css_style, setup_matplotlib, config_ipython
config_ipython()
setup_matplotlib()
set_css_style()
```

# Cross-concepts

Cross-concepts are mathematical concepts that are shared across multiple fields and are also used in Data Science.

## Entropy

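In *thermodynamics*, the change in entropy along a reversible process is defined as

$$
\Delta S = \int \frac{\delta q_{\mathrm{rev}}}{T}
$$

where $\delta q_{\mathrm{rev}}$ is the heat exchanged reversibly and $T$ is the absolute temperature.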

In *statistical mechanics*, Boltzmann defined the entropy as a measure of uncertainty and showed that this definition is equivalent to the thermodynamic one:

*the entropy quantifies the degree to which the probability of the system is spread over different microstates and is proportional to the logarithm of the number of possible microconfigurations which give rise to the macrostate.*

Written as a formula, this is

$$
S = -k_B \sum_i p_i \log \, p_i
$$

where the sum runs over all possible microstates, $p_i$ is the probability that microstate $i$ is occupied and $k_B$ is the Boltzmann constant. The postulate is that every microstate is equally likely to be occupied.
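As a short worked step, the equiprobability postulate can be inserted into the formula above to recover the "logarithm of the number of microconfigurations" statement. The symbol $W$ for the number of accessible microstates is introduced here for illustration and does not appear in the original text. With $p_i = 1/W$ for every $i$:

$$
\begin{aligned}
S &= -k_B \sum_{i=1}^{W} \frac{1}{W} \log \frac{1}{W} \\
  &= -k_B \log \frac{1}{W} \\
  &= k_B \log W
\end{aligned}
$$

so, under this assumption, the entropy grows with the logarithm of the number of microstates compatible with the macrostate.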

In *Information Theory*, Shannon defined the entropy as a measure of the missing information before the reception of a message:

$$
H = -\sum_i p(x_i) \, \log \, p(x_i)
$$

where $p(x_i)$ is the probability that a character of type $x_i$ appears in the string of interest. With the logarithm taken in base 2, this entropy measures the minimum average number of binary (YES/NO) questions per character needed to determine the content of the message. Whether the statistical mechanics and the information theory concepts are linked is debated.
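The Shannon formula can be tried on a short string. The following is a minimal sketch, assuming that $p(x_i)$ is estimated from the relative frequency of each character in the message and that the logarithm is in base 2, so the result is in bits per character. The function name `shannon_entropy` is a hypothetical helper, not something from the notebook's own modules.

```python
import math
from collections import Counter


def shannon_entropy(message: str) -> float:
    """Entropy in bits per character, from empirical character frequencies."""
    if not message:
        # Assumption: an empty message carries no information.
        return 0.0
    counts = Counter(message)
    total = len(message)
    # H = -sum_i p(x_i) * log2 p(x_i), with p(x_i) = count_i / total
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A string made of a single repeated character has no uncertainty,
# while four equally frequent characters need two YES/NO questions each.
h_constant = shannon_entropy("aaaa")
h_uniform = shannon_entropy("abcd")
```

Strings with skewed character frequencies fall between these two extremes.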

In *Ecology*, the diversity index $D$ is defined as the effective number of different types (species) among which the individuals in a dataset are distributed, so that it is maximised when all types are equally abundant:

$$
D = e^H
$$

where $H$ is the uncertainty in predicting the species of an individual taken at random from the dataset, computed with the natural logarithm:

$$
\begin{aligned}
H &= -\sum_i p_i \log \, p_i \\
  &= - \sum_i \log \, p_i^{p_i} \\
  &= - \log\left(\prod_i p_i^{p_i}\right) \\
  &= \log\left(\frac{1}{\prod_i p_i^{p_i}}\right)
\end{aligned}
$$

The denominator in the last line is the weighted geometric mean of the $p_i$, each weighted by itself, so that $D = e^H$ is its reciprocal.

* If all $k$ types are equally abundant, $p_i = 1/k \ \forall i$, then $H = \log(k)$ (the maximum of $H$) and $D = k$
* If only one type is present, $p_i = 0 \ \forall i \in \{1, \ldots, k-1\}$ and $p_k = 1$, then $H = 0$ and $D = 1$ (with the convention $0 \log 0 = 0$)
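The two limiting cases above can be checked numerically. This is a minimal sketch, assuming the dataset is given as a list of individual counts per species; the names `shannon_index` and `diversity_index` are hypothetical helpers introduced for illustration. Species with zero individuals are skipped, following the convention $0 \log 0 = 0$.

```python
import math


def shannon_index(counts):
    """H = -sum_i p_i * ln(p_i), with p_i the relative abundance of type i."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one individual")
    # Skip empty types: they contribute 0 under the convention 0 * log 0 = 0.
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def diversity_index(counts):
    """D = exp(H), the effective number of types."""
    return math.exp(shannon_index(counts))


# Four equally abundant species versus a dataset dominated by one species.
d_even = diversity_index([5, 5, 5, 5])
d_skewed = diversity_index([9, 1])
d_single = diversity_index([10, 0, 0])
```

In this sketch `d_even` equals the number of species, `d_single` equals 1, and `d_skewed` lies between 1 and 2, which matches the reading of $D$ as an effective number of types.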