### [Grey entropy and its application in weighting analysis, Wen (1998)](https://ieeexplore.ieee.org/abstract/document/728163)

Past research in weighting analysis has mostly discussed the continuous type of entropy. Practical systems, however, are discrete: if we use the continuous type of entropy, the problems can be solved, but some vagueness remains. This paper tries to remove that vagueness. Starting from the definition of entropy, we derive a discrete type of entropy, called __grey entropy__. After deriving the grey entropy, we give an example that compares it with the traditional entropy. The results show that grey entropy is a suitable method for weighting analysis.

We define a new entropy function $f_i$, based on Shannon entropy, that satisfies the following conditions:

1. $f_i(0) = 0$
2. $f_i(x) = f_i(1-x)$
3. $f_i(x)$ is monotonically increasing for $x \in (0, 0.5)$.

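__Definition 1:__ The entropy function is defined as:
$$W(A) = \frac{\sum_{i=1}^{m} W_e(X_i)}{(e^{1/2}-1)\,m}$$
where $W_e(x) = x e^{1-x} + (1-x) e^{x} - 1$. The factor $e^{1/2}-1 = W_e(1/2)$ is the maximum of $W_e$ on $[0, 1]$, so $W(A)$ lies in $[0, 1]$ when every $X_i$ does.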
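As a quick numerical check, the sketch below evaluates $W_e$ and the normalized $W(A)$ from Definition 1. The function names `w_e` and `grey_weight_entropy` are illustrative and not taken from the paper, and the inputs are assumed to lie in $[0, 1]$.

```python
import math

def w_e(x):
    """W_e(x) = x*e^(1-x) + (1-x)*e^x - 1, for x in [0, 1]."""
    return x * math.exp(1 - x) + (1 - x) * math.exp(x) - 1

def grey_weight_entropy(xs):
    """Normalized W(A): sum of W_e over the m inputs, divided by (e^(1/2) - 1) * m."""
    m = len(xs)
    if m == 0:
        raise ValueError("xs must not be empty")
    return sum(w_e(x) for x in xs) / ((math.exp(0.5) - 1) * m)

# The three conditions: zero at x = 0, symmetric about 1/2, increasing on (0, 0.5).
assert abs(w_e(0.0)) < 1e-12
assert abs(w_e(0.3) - w_e(0.7)) < 1e-12
assert w_e(0.1) < w_e(0.2) < w_e(0.4)

print(grey_weight_entropy([0.5, 0.5, 0.5]))  # every term attains the maximum e^(1/2) - 1
print(grey_weight_entropy([0.1, 0.5, 0.9]))  # values away from 1/2 lower the result
```

Dividing by $(e^{1/2}-1)m$ rescales the sum so that the result is close to 1 when all inputs equal $1/2$ and 0 when all inputs are 0 or 1.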



The entropy developed above is still of the continuous type, but practical systems are discrete. Therefore, we apply the “least information” concept of grey system theory to create the grey entropy we need.

Grey entropy is a suitable method for weighting analysis in practical applications.

### [Shannon Entropy](http://bearcave.com/misl/misl_tech/wavelets/compression/shannon.html)

https://en.wiktionary.org/wiki/Shannon_entropy

https://en.wikipedia.org/wiki/Entropy_(information_theory)

https://arxiv.org/pdf/1405.2061.pdf

http://www.ueltschi.org/teaching/chapShannon.pdf

#### The intuition behind Shannon’s Entropy

We define the self-information of an event $X = x$ to be

$$I(x) = -\log P(x)$$

With the natural logarithm, $I(x)$ is measured in nats. One nat is the amount of information gained by observing an event of probability $1/e$.
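A minimal sketch of self-information in Python follows. The helper name `self_information` and its `base` parameter are assumptions made for illustration; the default base $e$ gives nats.

```python
import math

def self_information(p, base=math.e):
    """Return I(x) = -log_base(p) for an event of probability p in (0, 1]."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -math.log(p, base)

print(self_information(1 / math.e))   # about 1 nat
print(self_information(0.5, base=2))  # about 1 bit
print(self_information(0.01))         # rare event: a lot of information
print(self_information(0.99))         # near-certain event: almost none
```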

$$H(X) = -\sum_x P(x)\log P(x) = \sum_x P(x)\log\frac{1}{P(x)}$$

$H(X)$ is the expected amount of information in an event drawn from the whole probability distribution. Each outcome $x$ contributes its self-information $\log\frac{1}{P(x)}$, weighted by its probability $P(x)$, so an unlikely event carries more information when it occurs.

If $b$ is the base of the logarithm, the corresponding units of entropy are bits for $b = 2$, nats for $b = e$, and bans for $b = 10$.
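To illustrate the effect of the base, the sketch below computes the entropy of a fair coin and a biased coin in all three units. The function `entropy` and the two example distributions are assumptions chosen for illustration.

```python
import math

def entropy(probs, base=2):
    """Shannon entropy of a discrete distribution given as a list of probabilities."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Terms with p == 0 contribute nothing (the limit of p*log(p) as p -> 0 is 0).
    return -sum(p * math.log(p, base) for p in probs if p > 0)

fair_coin = [0.5, 0.5]
biased_coin = [0.9, 0.1]

for unit, b in [("bits", 2), ("nats", math.e), ("bans", 10)]:
    print(unit, entropy(fair_coin, b), entropy(biased_coin, b))
```

Changing the base only rescales the value by a constant factor; in every unit the fair coin has higher entropy than the biased one.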

```python
import math
import random

def H(sentence):
    """
    Equation 3.49 (Shannon's Entropy) is implemented.

    Returns the entropy in bits per character.
    """
    entropy = 0
    # Loop over the 256 possible 8-bit character codes (ASCII itself defines only 128)
    for character_i in range(256):
        Px = sentence.count(chr(character_i)) / len(sentence)
        if Px > 0:
            entropy += -Px * math.log(Px, 2)
    return entropy
```

```python
# The telegrapher creates the "encoded message" with length 10000.
# random.randint is inclusive on both ends, so he uses only 33 distinct characters (codes 0-32).
simple_message = "".join([chr(random.randint(0, 32)) for i in range(10000)])
H(simple_message)
```

```
5.041086039235804
```

```python
# When he uses all 256 characters (codes 0-255), the entropy increases
# as the uncertainty of which character will be sent increases.
complex_message = "".join([chr(random.randint(0, 255)) for i in range(10000)])
H(complex_message)
```

```
7.982323353868584
```
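To put the two values above in context: a source that picks uniformly among $k$ symbols has $\log_2 k$ bits of entropy per symbol, and an estimate from observed frequencies cannot exceed that. The sketch below uses a hypothetical `Counter`-based helper, `empirical_entropy`, rather than the `H` function above, and an arbitrary seed for repeatability.

```python
import math
import random
from collections import Counter

def empirical_entropy(message):
    """Entropy in bits per symbol, estimated from symbol frequencies."""
    n = len(message)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(message).values())

random.seed(0)  # arbitrary seed, only so that repeated runs agree
for k in (33, 256):
    message = "".join(chr(random.randrange(k)) for _ in range(10000))
    # Print the alphabet size, the estimate, and the upper bound log2(k).
    print(k, empirical_entropy(message), math.log2(k))
```

Counting symbols once with `Counter` avoids scanning the message 256 times and also handles characters outside the 0-255 range.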