# Bayes' theorem and information theory

Officially, Bayes' theorem lets us update our beliefs when we are given new evidence. Personally, I see Bayes as the key to flipping the direction of a dependence. In this example we start from the weather and how it conditions the clothes we are about to pick, then work that relation backwards: how clothes *determine* the weather, i.e., how well we can guess the weather given that we know the clothes.
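
For reference, with $W$ the weather and $C$ the clothes, the flip can be written out as follows. The numbers plugged in are the ones defined in the code further down, so this is a worked instance rather than anything new:

$$
P(W \mid C) = \frac{P(C \mid W)\,P(W)}{P(C)}, \qquad P(C) = \sum_{w} P(C \mid W = w)\,P(W = w)
$$

$$
P(\text{sun} \mid \text{T-shirt}) = \frac{\tfrac{3}{4}\cdot\tfrac{3}{4}}{\tfrac{3}{4}\cdot\tfrac{3}{4} + \tfrac{1}{4}\cdot\tfrac{1}{4}} = \frac{9/16}{10/16} = 0.9
$$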

**Mutual information**
This implies that the weather carries some information about the clothes one wears, and the other way around. The information that both variables share is called mutual information.
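
In symbols, and assuming the usual Shannon entropy $H$ measured in bits, the shared information can be written in two equivalent ways:

$$
I(W; C) = H(W) + H(C) - H(W, C) = \sum_{w, c} P(w, c)\,\log_2 \frac{P(w, c)}{P(w)\,P(c)}
$$

The first form is the one the `info` helper below evaluates. $I(W; C)$ is zero exactly when weather and clothes are independent, and it is symmetric in its two arguments, which matches the remark that the information flows both ways.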

Nice video visualizing Bayes' theorem [(3b1b)](https://www.youtube.com/watch?v=HZGCoVF3YvM)  
Nice explanation of conditional probability and information theory [(Colah's blog)](http://colah.github.io/posts/2015-09-Visual-Information)

In [None]:
import numpy as np

In [None]:
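def entropy(arr):
    """Compute the Shannon entropy (in bits) of a probability array."""
    arr = np.asarray(arr, dtype=float)
    arr = arr[arr > 0]  # by convention 0 * log2(0) counts as 0
    return -(arr * np.log2(arr)).sum()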

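def joint(P, Q1, Q2):
    """Compute a flattened joint distribution from a prior and two conditionals.

    The products are element-wise, so Q1 and Q2 must be indexed like P:
    Q1[i] = P(first outcome | state i), Q2[i] = P(second outcome | state i).
    """
    return np.concatenate(((P * Q1), (P * Q2)))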

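def info(arr):
    """Print H(W), H(C), H(W, C) and the mutual information I(W; C).

    arr holds the three entropies in that order.
    """
    # I(W; C) = H(W) + H(C) - H(W, C)
    I = np.absolute(arr[:2].sum() - arr[2]).round(3)
    print('H(W):               {}'.format(arr[0]))
    print('H(C):               {}'.format(arr[1]))
    print('H(W, C):            {}'.format(arr[2]))
    print('Mutual information: {}'.format(I))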

In [None]:
# Probability of rain/sun P(W)
W = np.array([1/4, 3/4])

# Probability of clothes given rainy weather P(C|W=R)
C_RW = np.array([1/4, 3/4])  # T-shirt, coat

# Probability of clothes given sunny weather P(C|W=S)
C_SW = np.array([3/4, 1/4])  # T-shirt, coat

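# Joint probability P(W, C), flattened as [RC, SC, RT, ST].
# joint() multiplies element-wise over weather, so each argument is
# P(C=c | W) indexed as [rain, sun].
P_coat_W = np.array([C_RW[1], C_SW[1]])    # P(coat | rain), P(coat | sun)
P_tshirt_W = np.array([C_RW[0], C_SW[0]])  # P(T-shirt | rain), P(T-shirt | sun)
WC = joint(W, P_coat_W, P_tshirt_W)

# Unpack the joint so we can apply Bayes' theorem:
#   RC: rain + coat
#   SC: sun + coat
#   RT: rain + T-shirt
#   ST: sun + T-shirt
RC, SC, RT, ST = WC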

# Probability of clothes P(C), ordered [coat, T-shirt]
C = np.array([RC+SC, RT+ST])

# Probability of weather given T-shirt P(W|C=T-shirt), ordered [sun, rain]
W_CT = np.array([ST/(ST+RT), RT/(ST+RT)])  # [.9, .1]

# Probability of weather given coat P(W|C=coat), ordered [sun, rain]
W_CC = np.array([SC/(SC+RC), RC/(SC+RC)])  # [.5, .5]

# Finally, compute the entropies and the mutual information
e0 = np.array([entropy(p) for p in (W, C, WC)]).round(3)
info(e0)
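
As a cross-check, the same quantities can be recomputed from a 2x2 table instead of flattened arrays. This is only a sketch under stated assumptions: the names `p_w`, `p_c_given_w`, `p_wc`, `p_c`, `p_w_given_c` and `mi` are new and not part of the cells above, and the mutual information is taken straight from its definition rather than from the three entropies.

```python
import numpy as np

# Same numbers as in the cells above; the variable names are new (assumed).
p_w = np.array([1/4, 3/4])            # P(W): [rain, sun]
p_c_given_w = np.array([[1/4, 3/4],   # P(C | W=rain): [T-shirt, coat]
                        [3/4, 1/4]])  # P(C | W=sun):  [T-shirt, coat]

# Joint P(W, C): scale each row of the conditional table by P(W).
p_wc = p_w[:, None] * p_c_given_w

# Marginal P(C), then the posterior P(W | C) via Bayes' theorem.
p_c = p_wc.sum(axis=0)
p_w_given_c = p_wc / p_c  # column j holds P(W | C=j), rows are [rain, sun]

# Mutual information from its definition, in bits.
mi = (p_wc * np.log2(p_wc / np.outer(p_w, p_c))).sum()

print('P(W | T-shirt):', p_w_given_c[:, 0])
print('P(W | coat):   ', p_w_given_c[:, 1])
print('I(W; C):       ', mi.round(3))
```

Keeping the joint as a table makes the row and column meaning explicit, so the result does not depend on the two conditionals happening to mirror each other. The value printed here may differ in the last digit from the one printed by `info`, because there the entropies are rounded before they are combined.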