In [2]:
import numpy as np
from helpers import print_table

# 2. Pairwise vs. Mutual Independence (6 points)

**Definition**: We say that two random variables $X_m$ and $X_n$ are *pairwise independent* if $$p(X_n \mid X_m) = p(X_n)$$ and hence $$p(X_m, X_n) = p(X_n \mid X_m)\, p(X_m) = p(X_n)\, p(X_m)$$

**Definition**: We say that $n$ random variables $X_1, \dots, X_n$ are *mutually independent* if $$p(X_i \mid X_{S}) = p(X_i)\;\; \forall i \in \{1, \dots, n\},\; \forall S \subseteq \{1, \dots, n\} \setminus \{ i \}$$ and hence $$p(X_{1:n}) = \prod_{i=1}^n p(X_i)$$
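
To make the difference between the two definitions concrete, here is the mutual-independence condition written out for $n = 3$ and $i = 1$. This is only a worked instance of the definition above, listing the non-empty choices of $S$; it adds no new assumption:

$$p(X_1 \mid X_2) = p(X_1), \qquad p(X_1 \mid X_3) = p(X_1), \qquad p(X_1 \mid X_2, X_3) = p(X_1)$$

The first two equalities are exactly what pairwise independence requires. The third, which conditions on both remaining variables at once, is the extra requirement that a counterexample has to violate.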




<div class="alert alert-warning">
Show that pairwise independence between all pairs of variables does not necessarily imply mutual independence. Come up with a minimal counterexample that has exactly three binary random variables.
</div>

Specify this counterexample via its full joint distribution table (FJDT). **Briefly** outline your thought process in the text field below (use $\LaTeX$ and markdown) and store the model's full joint distribution table in the `XYZ` variable. It is sufficient to show pairwise independence and the absence of mutual independence by comparing products of marginals with the corresponding joint distributions.

**Hint**: Copy your implementation of `inference_by_enumeration` from Problem 1. You can use `print_table` to visualize your distribution tables such as the FJDT, products of marginals, and joint distributions.

In [35]:
help(print_table)

Help on function print_table in module helpers:

print_table(probability_table: numpy.ndarray, variable_names: str) -> None
    Prints a probability distribution table.
    
    Parameters
    ----------
    probability_table : np.ndarray
        The probability distribution table
    variable_names : str
        A string containing the variable names, e.g., 'CDE'.
    
    Returns
    -------
    None



**XOR construction**: Given $\{X_1, X_2, X_3\}$, let $X_1$ and $X_2$ be two independent fair coin flips, and let $X_3$ be a deterministic function of the realized values of both, namely $X_3 = X_1 \oplus X_2$. Then $X_3$ is independent of $X_1$ alone and of $X_2$ alone (pairwise independence holds), but not of the pair $(X_1, X_2)$ taken together (mutual independence does NOT hold).
More precisely:

\begin{align*}
X_1 = 0,\; X_2 = 0 &\;\Rightarrow\; X_3 = 0\\
X_1 = 1,\; X_2 = 0 &\;\Rightarrow\; X_3 = 1\\
X_1 = 0,\; X_2 = 1 &\;\Rightarrow\; X_3 = 1\\
X_1 = 1,\; X_2 = 1 &\;\Rightarrow\; X_3 = 0
\end{align*}
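
The truth table above can be turned into a joint distribution table mechanically. The following is a minimal sketch, assuming $X_1$ and $X_2$ are fair and independent (each outcome has probability $1/2$); the function name `xor_joint_table` is a hypothetical helper introduced here for illustration and is not part of the `helpers` module.

```python
import itertools

import numpy as np


def xor_joint_table() -> np.ndarray:
    """Joint table indexed as [x1, x2, x3], with x3 = x1 XOR x2.

    Assumes x1 and x2 are independent fair bits (an assumption of this sketch).
    """
    table = np.zeros((2, 2, 2))
    for x1, x2 in itertools.product((0, 1), repeat=2):
        x3 = x1 ^ x2               # deterministic function of x1 and x2
        table[x1, x2, x3] = 0.25   # P(x1) * P(x2) = 1/2 * 1/2
    return table


table = xor_joint_table()
print(table)
```

Only the four entries consistent with the truth table receive probability mass; the remaining four stay at zero.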

In [19]:
# Again: array shape A x B x C, where A is the number of 2-D arrays, B the rows, C the columns.

# Two coin tosses: X is coin 1, Y is coin 2, and Z is the XOR of X and Y. Once X and Y are
# known, Z is fully determined, e.g. P(Z=0 | X=0, Y=0) = 1. It follows that
# P(Z=0, X=0, Y=0) = P(X=0, Y=0) = 1/2 * 1/2 = 1/4 and
# P(Z=1, X=0, Y=0) = 0. The four nonzero entries therefore sum to 1.

XYZ = np.zeros((2,2,2))
XYZ[0,0,0]=1/4
XYZ[0,1,1]=1/4
XYZ[1,0,1]=1/4
XYZ[1,1,0]=1/4

print_table(XYZ, 'XYZ')


|       | $y_0$ | $y_0$ | $y_1$ | $y_1$ |
|-------|-------|-------|-------|-------|
|       | $z_0$ | $z_1$ | $z_0$ | $z_1$ |
| $x_0$ | 0.250 | 0.000 | 0.000 | 0.250 |
| $x_1$ | 0.000 | 0.250 | 0.250 | 0.000 |


In [69]:
# Copy inference_by_enumeration from Problem 1, then print and compare the probability tables here.
def inference_by_enumeration(
    FJDT: np.ndarray, 
    query_variable_indices: tuple, 
    evidence_variable_indices: tuple=tuple()
) -> np.ndarray:
    '''
    Computes the answer to a probabilistic query exactly from the full joint distribution table.
    :param FJDT: The full joint distribution table as a np.ndarray.
    :param query_variable_indices: A tuple containing the indices of the query variables in the FJDT.
    :param evidence_variable_indices: A tuple containing the indices of the evidence variables in the FJDT.
    :returns: The answer to the probabilistic query; a `np.ndarray`.
    ''' 
    assert isinstance(FJDT, np.ndarray), "FJDT must be a np.ndarray"
    assert isinstance(query_variable_indices, tuple), "query_variable_indices must be a tuple"
    assert isinstance(evidence_variable_indices, tuple), "evidence_variable_indices must be a tuple"
        
    # Lecture 2, slide 40:
    
    # 1. P(X, y): sum out the hidden variables, i.e. those that are
    #    neither query nor evidence variables.
    set_merged = set(query_variable_indices + evidence_variable_indices)
    set_full = set(range(FJDT.ndim))
    hidden_variable_indices = tuple(set_full - set_merged)
    
    # keepdims=True keeps every axis (summed axes get length 1), so axis indices stay valid
    PXAndy = np.sum(FJDT, axis=hidden_variable_indices, keepdims=True)

    # 2. Normalization constant Z = P(y): sum out the query variables.
    #    The evidence axes keep their full size.
    Py = np.sum(PXAndy, axis=query_variable_indices, keepdims=True)
    
    # 3. P(X | y) = P(X, y) / P(y)
    PXgiveny = PXAndy / Py

    return PXgiveny


# Joints:
PXY = inference_by_enumeration(XYZ, (0,1), ()).squeeze()
PYZ = inference_by_enumeration(XYZ, (1,2), ()).squeeze()
PXZ = inference_by_enumeration(XYZ, (0,2), ()).squeeze()
print('Table joint P(X,Y):')
print_table(PXY, 'XY')
print('Table joint P(Y,Z):')
print_table(PYZ, 'YZ')
print('Table joint P(X,Z):')
print_table(PXZ, 'XZ')


# Marginals:
PX = inference_by_enumeration(XYZ, (0,), ()).squeeze()
PY = inference_by_enumeration(XYZ, (1,), ()).squeeze()
PZ = inference_by_enumeration(XYZ, (2,), ()).squeeze()

PXPY = np.outer(PX, PY)  # entry [i, j] = P(X=i) * P(Y=j)
print('Product table joint P(X)*P(Y):')
print_table(PXPY, 'XY')

PYPZ = np.outer(PY, PZ)  # entry [j, k] = P(Y=j) * P(Z=k)
print('Product table joint P(Y)*P(Z):')
print_table(PYPZ, 'YZ')

PXPZ = np.outer(PX, PZ)  # entry [i, k] = P(X=i) * P(Z=k)
print('Product table joint P(X)*P(Z):')
print_table(PXPZ, 'XZ')


# Outer product of all three marginals: entry [i, j, k] = P(X=i) * P(Y=j) * P(Z=k)
product_ar = np.einsum('i,j,k->ijk', PX, PY, PZ)

print('Product table P(X)*P(Y)*P(Z):')
print_table(product_ar, 'XYZ')


PXYZ = inference_by_enumeration(XYZ, (0,1,2), ()).squeeze()
print('Table full joint distribution of X, Y, and Z:')
print_table(PXYZ, 'XYZ')

print('''
This confirms the reasoning above. P(X,Y,Z) != P(X)*P(Y)*P(Z), so the variables are NOT mutually independent.
Pairwise independence does hold, however: P(X,Y) = P(X)*P(Y), P(Y,Z) = P(Y)*P(Z), and P(X,Z) = P(X)*P(Z);
the corresponding tables contain identical values.
Hence pairwise independence between all pairs of variables does NOT imply mutual independence.
''')

Table joint P(X,Y):

|       | $y_0$ | $y_1$ |
|-------|-------|-------|
| $x_0$ | 0.250 | 0.250 |
| $x_1$ | 0.250 | 0.250 |

Table joint P(Y,Z):

|       | $z_0$ | $z_1$ |
|-------|-------|-------|
| $y_0$ | 0.250 | 0.250 |
| $y_1$ | 0.250 | 0.250 |

Table joint P(X,Z):

|       | $z_0$ | $z_1$ |
|-------|-------|-------|
| $x_0$ | 0.250 | 0.250 |
| $x_1$ | 0.250 | 0.250 |

Product table joint P(X)*P(Y):

|       | $y_0$ | $y_1$ |
|-------|-------|-------|
| $x_0$ | 0.250 | 0.250 |
| $x_1$ | 0.250 | 0.250 |

Product table joint P(Y)*P(Z):

|       | $z_0$ | $z_1$ |
|-------|-------|-------|
| $y_0$ | 0.250 | 0.250 |
| $y_1$ | 0.250 | 0.250 |

Product table joint P(X)*P(Z):

|       | $z_0$ | $z_1$ |
|-------|-------|-------|
| $x_0$ | 0.250 | 0.250 |
| $x_1$ | 0.250 | 0.250 |

Product table P(X)*P(Y)*P(Z):

|       | $y_0$ | $y_0$ | $y_1$ | $y_1$ |
|-------|-------|-------|-------|-------|
|       | $z_0$ | $z_1$ | $z_0$ | $z_1$ |
| $x_0$ | 0.125 | 0.125 | 0.125 | 0.125 |
| $x_1$ | 0.125 | 0.125 | 0.125 | 0.125 |

Table full joint distribution of X, Y, and Z:

|       | $y_0$ | $y_0$ | $y_1$ | $y_1$ |
|-------|-------|-------|-------|-------|
|       | $z_0$ | $z_1$ | $z_0$ | $z_1$ |
| $x_0$ | 0.250 | 0.000 | 0.000 | 0.250 |
| $x_1$ | 0.000 | 0.250 | 0.250 | 0.000 |


This confirms the reasoning above. P(X,Y,Z) != P(X)*P(Y)*P(Z), so the variables are NOT mutually independent.
Pairwise independence does hold, however: P(X,Y) = P(X)*P(Y), P(Y,Z) = P(Y)*P(Z), and P(X,Z) = P(X)*P(Z);
the corresponding tables contain identical values.
Hence pairwise independence between all pairs of variables does NOT imply mutual independence.



In [70]:
assert XYZ.shape == (2, 2, 2), 'FJDT must have shape (2,2,2)'
assert np.isclose(XYZ.sum(), 1), 'Probabilities in FJDT must sum to one'
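
The visual comparison of tables can also be done numerically. The sketch below rebuilds the XOR table under the name `joint` (a name chosen here for illustration, holding the same values as `XYZ`) and compares marginals with plain NumPy reductions instead of `inference_by_enumeration`. It is one possible way to run the check, not the required solution.

```python
import numpy as np

# XOR counterexample, indexed as joint[x, y, z]; same values as XYZ above.
joint = np.zeros((2, 2, 2))
for x in (0, 1):
    for y in (0, 1):
        joint[x, y, x ^ y] = 0.25

# Single-variable marginals: sum out the other two axes.
p_x = joint.sum(axis=(1, 2))
p_y = joint.sum(axis=(0, 2))
p_z = joint.sum(axis=(0, 1))

# Pairwise independence: every two-variable marginal equals
# the outer product of the corresponding single-variable marginals.
pairwise_ok = (
    np.allclose(joint.sum(axis=2), np.outer(p_x, p_y))      # P(X,Y) vs P(X)P(Y)
    and np.allclose(joint.sum(axis=1), np.outer(p_x, p_z))  # P(X,Z) vs P(X)P(Z)
    and np.allclose(joint.sum(axis=0), np.outer(p_y, p_z))  # P(Y,Z) vs P(Y)P(Z)
)

# Mutual independence: the full joint equals the product of all three marginals.
product = np.einsum('i,j,k->ijk', p_x, p_y, p_z)
mutual_ok = np.allclose(joint, product)

print(pairwise_ok, mutual_ok)  # True False
```

Using `np.allclose` rather than `==` avoids spurious mismatches from floating-point rounding when the probabilities are not exact binary fractions.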
