In [25]:
import math
import numpy as np
from numpy import where
import warnings
warnings.filterwarnings('ignore')

# Worksheet 2

## Useful formulae

I have written out the following formulae to help with the work in this worksheet.

In [18]:
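def entropy(X: np.ndarray) -> float:
    # H(X) in bits; where() applies the convention 0 * log2(0) = 0
    return - (X * where(X != 0, np.log2(X), 0)).sum()

def joint_entropy(D: np.ndarray) -> float:
    # H(X,Y) in bits, summed over every cell of the joint table D
    return - (D * where(D != 0, np.log2(D), 0)).sum()

def conditional_entropy(D: np.ndarray, Y: np.ndarray) -> float:
    # Chain rule: H(X|Y) = H(X,Y) - H(Y), where Y is the marginal being conditioned on
    return joint_entropy(D) - entropy(Y)

def mutual_information(D: np.ndarray) -> float:
    # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return entropy(D.sum(0)) + entropy(D.sum(1)) - joint_entropy(D)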
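
As a quick sanity check on the entropy formula, here is a minimal sketch of a variant that drops zero entries before taking the logarithm, so no warnings need to be suppressed. The name `entropy_bits` is hypothetical and not part of the worksheet; it is only meant to illustrate the convention $0 \log_2 0 = 0$ on distributions whose entropy is known.

```python
import numpy as np

def entropy_bits(p) -> float:
    """Shannon entropy in bits, using the convention 0 * log2(0) = 0.

    Hypothetical helper for illustration; accepts a 1-D distribution
    or a joint table, which is flattened first.
    """
    p = np.asarray(p, dtype=float).ravel()
    nz = p[p > 0]  # drop zero-probability entries before taking the log
    return float(-(nz * np.log2(nz)).sum())

fair_coin = entropy_bits([0.5, 0.5])             # a fair coin carries 1 bit
certain = entropy_bits([1.0, 0.0])               # a certain outcome carries 0 bits
uniform4 = entropy_bits(np.full((2, 2), 0.25))   # uniform 2x2 joint table: 2 bits

print(fair_coin, certain, uniform4)
```

Because the table is flattened, the same function can stand in for both `entropy` and `joint_entropy`.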

## Q1: marginal and conditional probabilities

Work out the marginal probability distributions and the conditional probability distribution $P(Y\,|\,X=a)$ for:

| Y \ X | a | b |
| ---- | ---- | ---- |
| 1 | $\frac{1}{3}$ | $\frac{1}{6}$ |
| 2 | $0$ | $\frac{1}{4}$ |
| 3 | $\frac{1}{8}$ | $\frac{1}{8}$ |

In [19]:
# Define the distribution

D = np.array([
    [1/3, 1/6],
    [0, 1/4],
    [1/8, 1/8]
])

# Marginal distributions in the order [1,2,3,'a','b']

np.concatenate((D.sum(1), D.sum(0)))

array([0.5       , 0.25      , 0.25      , 0.45833333, 0.54166667])

In [20]:
# Conditional probability, given X=a

given_X = D[:,0]

given_X / given_X.sum()

array([0.72727273, 0.        , 0.27272727])
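
The same normalisation can be applied to every column at once. The sketch below assumes the table layout used above (rows are $Y$, columns are $X$) and uses NumPy broadcasting to build the whole table $P(Y\,|\,X)$; the variable names `p_x` and `cond_y_given_x` are illustrative only.

```python
import numpy as np

# Joint distribution from Q1: rows are Y = 1, 2, 3; columns are X = a, b
D = np.array([
    [1/3, 1/6],
    [0,   1/4],
    [1/8, 1/8],
])

p_x = D.sum(axis=0)        # marginal P(X), shape (2,)
cond_y_given_x = D / p_x   # broadcasting divides each column by its own total

print(cond_y_given_x[:, 0])         # P(Y | X=a)
print(cond_y_given_x[:, 1])         # P(Y | X=b)
print(cond_y_given_x.sum(axis=0))   # each column should sum to 1
```

The first column reproduces the $X=a$ result above, i.e. $\left(\frac{8}{11}, 0, \frac{3}{11}\right)$.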

## Q2: working out entropy

Each throw has a $\displaystyle\frac{1}{2}$ chance of being heads, so the probability that the first head appears on throw $n$ is $\frac{1}{2^n}$. This gives the table:

| throws | probability |
| ---- | ---- |
| 1 | $\frac{1}{2}$ |
| 2 | $\frac{1}{4}$ |
| 3 | $\frac{1}{8}$ |
| 4 | $\frac{1}{16}$ |
| 5 | $\frac{1}{32}$ |
| ... | ... |

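As the code below shows, $H(X) \rightarrow 2$ bits as $\text{throws} \rightarrow \infty$. This agrees with the exact sum: $H(X) = \sum_{n=1}^{\infty} 2^{-n}\log_2 2^{n} = \sum_{n=1}^{\infty} n\,2^{-n} = 2$.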

In [37]:
# Truncate the infinite sum at 499 terms; the remaining tail is negligible

D = np.array([1/2**x for x in range(1, 500)])

entropy(D)

2.0
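
To see how quickly the sum approaches 2, the partial sums can be printed for a few truncation points. This is a sketch only: `truncated_entropy` is a hypothetical helper, and it uses $\log_2 2^{-n} = -n$ so that each term is simply $n\,2^{-n}$.

```python
import numpy as np

def truncated_entropy(n_terms: int) -> float:
    """Sum of the first n_terms entropy terms for P(n) = 2**-n, in bits.

    The tail beyond n_terms is dropped, so this is a lower bound on H(X).
    """
    n = np.arange(1, n_terms + 1)
    p = 0.5 ** n
    # -p * log2(p) = n * 2**-n, because log2(2**-n) = -n
    return float((p * n).sum())

for n_terms in (1, 2, 5, 10, 20, 50):
    print(n_terms, truncated_entropy(n_terms))
```

The partial sums follow $2 - \frac{N+2}{2^N}$, so the gap to 2 shrinks roughly geometrically with the number of terms $N$.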

## Q3: A puzzle which lends itself to information-type reasoning

In [None]:
# TODO:

## Q4: Working out entropy and information

Let $p(x,y)$ be given by $p(0,0)=p(0,1)=p(1,1)=\displaystyle\frac{1}{3}$ and $p(1,0)=0$:

| Y \ X | 0 | 1 |
| ---- | ---- | ---- |
| 0 | $\frac{1}{3}$ | 0 |
| 1 | $\frac{1}{3}$ | $\frac{1}{3}$ |

Find $H(X), H(Y), H(X\,|\,Y), H(Y\,|\,X), H(X, Y), H(Y) - H(Y\,|\,X)$ and $I(X;Y)$.

In [21]:
# Define the distribution

P = np.array([
    [1/3, 0],
    [1/3, 1/3]
])

In [22]:
# H(X)

entropy(P.sum(0))

0.9182958340544896

In [23]:
# H(Y)

entropy(P.sum(1))

0.9182958340544896

In [26]:
# H(X|Y)

conditional_entropy(P, P.sum(1))

0.6666666666666665

In [27]:
# H(Y|X)

conditional_entropy(P, P.sum(0))

0.6666666666666665
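
The two conditional entropies above come from the chain rule. As a cross-check, they can also be computed directly from the definition $H(Y\,|\,X) = \sum_x P(x)\,H(Y\,|\,X=x)$. The sketch below is self-contained; the helper `H` and the variable names are assumptions made for illustration, not part of the worksheet.

```python
import numpy as np

def H(p) -> float:
    """Entropy in bits of a probability vector, skipping zero entries."""
    p = np.asarray(p, dtype=float).ravel()
    nz = p[p > 0]
    return float(-(nz * np.log2(nz)).sum())

# Joint distribution from Q4: rows are Y = 0, 1; columns are X = 0, 1
P = np.array([
    [1/3, 0],
    [1/3, 1/3],
])

p_x = P.sum(axis=0)  # marginal of X
p_y = P.sum(axis=1)  # marginal of Y

# Weighted average of the entropy of each conditional distribution;
# every row and column has nonzero mass here, so the divisions are safe
h_y_given_x = sum(p_x[j] * H(P[:, j] / p_x[j]) for j in range(P.shape[1]))
h_x_given_y = sum(p_y[i] * H(P[i, :] / p_y[i]) for i in range(P.shape[0]))

print(h_x_given_y, h_y_given_x)
```

Both come out as $\frac{2}{3}$ bit: in each case one conditional distribution is deterministic (0 bits) and the other is a fair coin (1 bit) occurring with probability $\frac{2}{3}$.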

In [30]:
# H(X,Y)

joint_entropy(P)

1.584962500721156

In [31]:
# H(Y) - H(Y|X)

entropy(P.sum(1)) - conditional_entropy(P, P.sum(0))

0.25162916738782304
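
The question also asks for $I(X;Y)$. Since $I(X;Y) = H(X) + H(Y) - H(X,Y)$, which is the identity encoded by the `mutual_information` helper defined earlier, it should equal $H(Y) - H(Y\,|\,X)$. The following self-contained sketch checks that; the helper `H` and the variable names are illustrative assumptions.

```python
import numpy as np

def H(p) -> float:
    """Entropy in bits of any probability table, skipping zero entries."""
    p = np.asarray(p, dtype=float).ravel()
    nz = p[p > 0]
    return float(-(nz * np.log2(nz)).sum())

# Joint distribution from Q4: rows are Y = 0, 1; columns are X = 0, 1
P = np.array([
    [1/3, 0],
    [1/3, 1/3],
])

h_x, h_y, h_xy = H(P.sum(axis=0)), H(P.sum(axis=1)), H(P)

mi_from_sum = h_x + h_y - h_xy      # I(X;Y) = H(X) + H(Y) - H(X,Y)
mi_from_cond = h_y - (h_xy - h_x)   # H(Y) - H(Y|X), using H(Y|X) = H(X,Y) - H(X)

print(mi_from_sum, mi_from_cond)
```

The two expressions agree up to floating-point rounding, at roughly 0.2516 bits.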

## Q5: A question about information in the brain

The original idea of estimating neural information by binning spike trains was spread across several papers, but one of the main references is ... One aspect of this paper we didn't discuss is the use of extrapolation to estimate the information as the number of samples becomes large based on the behaviour for smaller numbers of samples.

Can you give a short summary, of up to five lines, of what this involves?