Entropy, as introduced by Shannon in “A Mathematical Theory of Communication”, measures the average uncertainty or randomness of the symbols in a message. It is defined as the negative sum, over all symbols, of each symbol's probability multiplied by the logarithm of that probability.
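
Written out, the definition takes the following form. The symbols $p_i$ (probability of the $i$-th symbol) and $n$ (number of distinct symbols) are notation assumed here, and the base-2 logarithm is chosen so that the result is in bits, matching the `log2` used in the code cell of this notebook. The second line works the formula through for a fair coin as a sanity check.

```latex
H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i

% Worked example: a fair coin, p_1 = p_2 = 1/2
H = -\left( \tfrac{1}{2}\log_2\tfrac{1}{2} + \tfrac{1}{2}\log_2\tfrac{1}{2} \right)
  = -\left( -\tfrac{1}{2} - \tfrac{1}{2} \right)
  = 1 \text{ bit}
```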

Shannon’s entropy sets a theoretical limit on the best possible lossless data compression: on average, a source cannot be encoded in fewer bits per symbol than its entropy. Entropy-based quantities also determine how much information can be transmitted reliably over a channel.
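
One way to see the compression limit in practice is to compare the entropy of a text with the average code length of a Huffman code built from the same symbol frequencies. Huffman coding is brought in here purely as an illustration and is not part of the original notebook; the function names are hypothetical. For a symbol-by-symbol prefix code the average length L should satisfy H <= L < H + 1, where H is the entropy in bits per symbol. A minimal sketch, assuming a non-empty text:

```python
import heapq
from collections import Counter
from math import log2


def symbol_entropy(text):
    """Shannon entropy in bits per symbol, from observed symbol frequencies."""
    counts = Counter(text)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def huffman_code_lengths(text):
    """Return {symbol: code length in bits} for a Huffman code built from text."""
    counts = Counter(text)
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tie breaker, {symbol: depth so far}).
    heap = [(c, i, {s: 0}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol in them gets one level deeper.
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def average_code_length(text):
    """Average Huffman code length in bits per symbol (text must be non-empty)."""
    counts = Counter(text)
    total = sum(counts.values())
    lengths = huffman_code_lengths(text)
    return sum(counts[s] * lengths[s] for s in counts) / total


text = "abracadabra"
print("entropy (bits/symbol):    ", symbol_entropy(text))
print("Huffman average (bits/symbol):", average_code_length(text))
```

The Huffman average never drops below the entropy, which is the sense in which entropy bounds compression for a code that treats each symbol independently.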

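In general, the more distinct letters a message uses, and the more evenly they are distributed, the less predictable each letter is and the more information it carries. This is because a greater variety of symbols means more possible outcomes at each position. Length matters in a different way: the entropy computed here is an average per letter, so a longer message carries more information in total, but its per-letter entropy rises only if the extra text uses the letters more variedly or more evenly.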

On the other hand, a message that uses few distinct letters is more predictable and contains less information per letter, as there are fewer possible outcomes. Very short messages fall into this category, since they cannot contain many distinct letters. The entropy reflects this: a message with a greater variety of symbols has a higher entropy, and a message with fewer distinct symbols has a lower entropy.
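
The effect of variety, as opposed to length alone, can be checked on a few tiny strings. The sketch below uses a hypothetical helper, `entropy_bits`, that counts every character (unlike the notebook's code cell, it does not filter non-letters); repeating a message leaves its per-symbol entropy unchanged, while adding distinct symbols raises it.

```python
from collections import Counter
from math import log2


def entropy_bits(message):
    """Per-symbol entropy in bits, from the character frequencies of message."""
    counts = Counter(message)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


print(entropy_bits("ab"))        # 1.0: two equally likely symbols
print(entropy_bits("abababab"))  # 1.0: longer, but the same distribution
print(entropy_bits("abcd"))      # 2.0: four equally likely symbols
print(entropy_bits("aaab"))      # about 0.811: two symbols, unevenly used
```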

Recommended: plot the entropy for different texts, message lengths, and languages.
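
As a starting point for such a plot, the following sketch computes the letter entropy of progressively longer prefixes of a text and returns (length, entropy) pairs. The function names, the `step` parameter, and the sample sentence are all assumptions made for illustration; the pairs could then be passed to any plotting library.

```python
from collections import Counter
from math import log2


def letter_entropy(text):
    """Entropy in bits per letter; non-alphabetic characters are ignored."""
    counts = Counter(ch for ch in text if ch.isalpha())
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no letters at all: treat the entropy as zero
    return -sum((c / total) * log2(c / total) for c in counts.values())


def entropy_by_length(text, step=10):
    """Return (prefix length, entropy) pairs for prefixes growing by step characters."""
    lengths = range(step, len(text) + 1, step)
    return [(n, letter_entropy(text[:n])) for n in lengths]


# Hypothetical sample text; replace it with texts in different languages to compare.
sample = "the quick brown fox jumps over the lazy dog " * 3
for n, h in entropy_by_length(sample, step=20):
    print(n, round(h, 3))
```

Running the same function over several texts gives one curve per text, which makes differences between lengths and languages easy to compare.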

Analysis: If the entropy of a dissipative system tends to decrease, the emergent patterns may be related to the minimisation of the entropy function that best represents the system.

In [17]:
from math import log2

def entropy(message):
    # Count how often each letter occurs (non-letters are ignored; case is kept)
    letter_freq = {}
    for letter in message:
        if letter.isalpha():
            letter_freq[letter] = letter_freq.get(letter, 0) + 1

    # Convert the counts to probabilities
    total_letters = sum(letter_freq.values())
    prob = {letter: freq / total_letters for letter, freq in letter_freq.items()}

    # Sum p * log2(p) over all letters; the entropy is the negative of this sum
    weighted_log_sum = 0
    for p in prob.values():
        weighted_log_sum += p * log2(p)

    return -weighted_log_sum

message2 = "On the other hand, a message with fewer letters and shorter length is more predictable and contains less information"
message1 = "Unfortunately Unfortunately Unfortunately Unfortunately Unfortunately "
message = "uuabcuuuuuuuuuu"

print("The entropy of the message is: ", entropy(message2))


The entropy of the message is:  3.8954201887252182
