In [1]:
import math
import numpy as np
from typing import Tuple

## Shannon Entropy

The Shannon entropy is a statistical quantifier extensively used for the characterization of complex systems. It can be interpreted as:

- __Measure of Uncertainty:__ It quantifies the unpredictability of information content. Higher entropy indicates greater uncertainty or variability in the outcomes of a random variable.
- __Information Content:__ It represents the average number of bits needed to encode messages from a source. A source with uniform probability distribution (where all outcomes are equally likely) has maximum entropy, while a deterministic source (where one outcome is certain) has zero entropy.

$$H(s) = -\sum_{i=1}^{n} P_i \log_2 P_i$$

where $P_i$ is the probability of the $i$-th of the $n$ possible symbols of the source $s$.
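
The two limiting cases mentioned above (uniform and deterministic sources) can be checked numerically. The snippet below is a minimal sketch; the helper name `entropy_bits` and the two example distributions are illustrative assumptions, not part of the notebook.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    # p * log2(1 / p) is the same term as -p * log2(p).
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

# Hypothetical example distributions over four outcomes.
uniform = [0.25, 0.25, 0.25, 0.25]      # all outcomes equally likely
deterministic = [1.0, 0.0, 0.0, 0.0]    # one outcome is certain

h_uniform = entropy_bits(uniform)
h_deterministic = entropy_bits(deterministic)
print(f"uniform: {h_uniform} bits, deterministic: {h_deterministic} bits")
```

For four equally likely outcomes the entropy reaches its maximum, $\log_2 4 = 2$ bits, while the deterministic source yields zero.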

When tracked over time, entropy is frequently used for anomaly detection. Large variations in the entropy level $H(s)$ of a system can indicate a significant change in the system itself.
$$\Delta H(s) = H(s)_{t+1} - H(s)_{t}$$
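
One possible way to apply $\Delta H(s)$ is to split a stream into consecutive windows, compute the entropy of each window, and flag large jumps. The sketch below assumes non-overlapping windows and a hypothetical threshold of 1 bit; the names `window_entropy`, `entropy_deltas`, and `threshold` are illustrative and not taken from the notebook.

```python
import math
from collections import Counter

def window_entropy(window):
    """Shannon entropy (bits) of the symbols in one window."""
    counts = Counter(window)
    n = len(window)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

def entropy_deltas(stream, window_size):
    """Entropy of consecutive non-overlapping windows and their differences."""
    starts = range(0, len(stream) - window_size + 1, window_size)
    entropies = [window_entropy(stream[i:i + window_size]) for i in starts]
    # deltas[t] = H_{t+1} - H_t
    deltas = [nxt - cur for cur, nxt in zip(entropies, entropies[1:])]
    return entropies, deltas

# Toy stream: two constant windows followed by a more varied one.
stream = "aaaaaaaa" + "aaaaaaaa" + "abcdabcd"
entropies, deltas = entropy_deltas(stream, window_size=8)

threshold = 1.0  # hypothetical cut-off, to be tuned per application
# Index of each window whose entropy jumped relative to the previous one.
anomalies = [t + 1 for t, d in enumerate(deltas) if abs(d) > threshold]
print(entropies, deltas, anomalies)
```

The window size and threshold strongly affect what gets flagged, so both would need tuning against real data.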

In [2]:
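def message_entropy(msg: str, base: int = 2) -> Tuple[float, dict]:
    """Calculates the Shannon entropy of a string message.

    Returns the entropy in the given logarithm base and a dict mapping
    each symbol to its relative frequency.
    """
    n = len(msg)
    if n == 0:
        # An empty message has no symbols; avoid dividing by zero.
        return 0.0, {}
    entropy = 0.0
    symbols = {}
    for char in set(msg):
        # Relative frequency of the symbol, used as its probability.
        proba = msg.count(char) / n
        entropy -= proba * math.log(proba, base)
        symbols[char] = proba
    return entropy, symbols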

In [3]:
h, symbols = message_entropy(msg="successful", base=2)
print(f"Entropy: {h:.4f}\nSymbols: {symbols}")

Entropy: 2.4464
Symbols: {'e': 0.1, 'f': 0.1, 's': 0.3, 'c': 0.2, 'u': 0.2, 'l': 0.1}


In [4]:
h, symbols = message_entropy(msg="successful", base=6)
print(f"Entropy: {h:.4f}\nSymbols: {symbols}")

Entropy: 0.9464
Symbols: {'e': 0.1, 'f': 0.1, 's': 0.3, 'c': 0.2, 'u': 0.2, 'l': 0.1}
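
The base only rescales the result: $H_b = H_2 / \log_2 b$. The sketch below recomputes the two values above for "successful" and checks that relation. It also shows that, when the base equals the number of distinct symbols (6 here), the maximum possible entropy is 1, so the base-6 value can be read as a fraction of the maximum. The helper `entropy` and the variable names are assumptions made for this illustration.

```python
import math

msg = "successful"
# Relative frequency of each distinct symbol in the message.
probs = [msg.count(ch) / len(msg) for ch in sorted(set(msg))]

def entropy(probs, base):
    """Shannon entropy of a probability list in the given logarithm base."""
    return -sum(p * math.log(p, base) for p in probs)

h_bits = entropy(probs, 2)               # entropy in bits
h_base6 = entropy(probs, 6)              # entropy in base 6
h_converted = h_bits / math.log2(6)      # change of base applied to h_bits
h_max_base6 = math.log(len(probs), 6)    # uniform distribution over 6 symbols

print(f"{h_bits:.4f} {h_base6:.4f} {h_converted:.4f} {h_max_base6:.4f}")
```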


In [5]:
h, symbols = message_entropy(msg="HELLO", base=2)
print(f"Entropy: {h:.4f}\nSymbols: {symbols}")

Entropy: 1.9219
Symbols: {'E': 0.2, 'L': 0.4, 'O': 0.2, 'H': 0.2}


- About 1.92 bits per symbol is the lower bound on the average code length for this message. Rounding up, a fixed-length code needs 2 bits to encode each symbol.

| Symbol | Code |
|--------|------|
| H      | 00   |
| E      | 01   |
| L      | 10   |
| O      | 11   |
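
As a quick check, the code table can be applied to the message and its average length compared with the entropy. This is a minimal sketch; the variable names are illustrative, and the code table is copied from the table above.

```python
import math

msg = "HELLO"
# Fixed-length code from the table above.
code = {"H": "00", "E": "01", "L": "10", "O": "11"}

encoded = "".join(code[ch] for ch in msg)
avg_len = len(encoded) / len(msg)  # average bits per symbol actually used

# Entropy of the message in bits per symbol.
probs = [msg.count(ch) / len(msg) for ch in set(msg)]
h = sum(p * math.log2(1 / p) for p in probs)

print(f"encoded: {encoded}")
print(f"average length: {avg_len} bits/symbol, entropy: {h:.4f} bits/symbol")
```

The average code length (2 bits per symbol) cannot fall below the entropy (about 1.92 bits per symbol). The small gap arises because 'L' is more frequent than the other symbols, while the fixed-length code spends the same number of bits on every symbol.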

## Spatial Entropy

Moreover, as proposed by von Neumann, the Shannon entropy can be used to describe the spatial entropy, and can thus serve as a criterion for choosing spaces.

To estimate the entropy of a data space, the eigenvalues from Principal Component Analysis (PCA) are normalized to sum to one and used as probabilities in the Shannon formula. This technique leverages the relationship between eigenvalues, variance, and information content in datasets: each eigenvalue is the variance captured along its principal component.
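
The procedure can be sketched with NumPy as follows. This is an illustrative implementation under stated assumptions, not the notebook's own: the function name `spatial_entropy` is hypothetical, the eigenvalues are taken from the sample covariance matrix of the centered data, and the test data are synthetic.

```python
import numpy as np

def spatial_entropy(X, base=2.0):
    """Entropy of the normalized PCA eigenvalue spectrum of X (rows = samples)."""
    X = np.asarray(X, dtype=float)
    # Step 1: covariance matrix of the features (np.cov centers the data).
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    # Step 2: eigenvalues of the symmetric covariance matrix (PCA variances).
    eigvals = np.linalg.eigvalsh(cov)
    eigvals = np.clip(eigvals, 0.0, None)  # remove tiny negative round-off
    # Step 3: normalize the eigenvalues so they sum to one.
    p = eigvals / eigvals.sum()
    p = p[p > 0]  # zero-variance directions contribute no entropy
    # Step 4: Shannon entropy of the normalized spectrum.
    return float(-(p * np.log(p)).sum() / np.log(base))

rng = np.random.default_rng(0)
# Synthetic data: variance spread evenly over three directions.
isotropic = rng.normal(size=(500, 3))
# Synthetic data: all variance along a single direction.
line = np.outer(rng.normal(size=500), [1.0, 2.0, 3.0])

h_isotropic = spatial_entropy(isotropic)
h_line = spatial_entropy(line)
print(f"isotropic: {h_isotropic:.3f}, line: {h_line:.3f}, max: {np.log2(3):.3f}")
```

With three features the entropy is bounded by $\log_2 3$ bits. Data whose variance is spread evenly across directions comes close to that bound, while data concentrated along one direction gives a value near zero.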

The spatial entropy value provides insight into the complexity or disorder within the dataset. A higher entropy indicates a more complex structure with less predictability, while a lower entropy suggests a more ordered and predictable structure.