# Homework 03 - Sebastiano Zagatti

## Exercise 1:
1. Draw the Bayesian Network representing the joint distribution

$$P(A,B,C,D,E,F,G)=P(A)P(B|A)P(F|B)P(C|A)P(D|B)P(E|D,F)P(G).$$

2. Indicate whether the following statements on (conditional) independence are True or False and motivate your answer.

 a. $A\perp \!\!\! \perp  D$
 
 b. $F \perp \!\!\! \perp  D$
 
 c. $A\perp \!\!\! \perp  B | C$
 
 d. $A\perp \!\!\! \perp  D | B$
 
 e. $D\perp \!\!\! \perp  F | E$

 f. $B\perp \!\!\! \perp F| E$
 
 g. $A\perp \!\!\! \perp  D | \{B, F\}$

#### Solution
1.

![](graph.png)

2.

a. $A\perp \!\!\! \perp  D$: False
$$ p(A,D) = \text{ Head to Tail } = \sum_B p(D|B)p(B|A)p(A) = p(D|A)p(A) \neq p(D)p(A)$$
$$\Longrightarrow{} \text{ $A$ and $D$ are not independent}$$

b. $F \perp \!\!\! \perp  D$: False
$$ p(D,F)=\text{ Tail to Tail }=\sum_{B} p(B)p(D|B)p(F|B) \neq p(D)p(F)$$
$$\Longrightarrow{} \text{ $D$ and $F$ are not independent: they share the unobserved parent $B$}$$

c. $A\perp \!\!\! \perp  B | C$: False
$$ p(A,B|C) = \frac{p(A,B,C)}{p(C)} = \frac{p(A)p(B|A)p(C|A)}{p(C)}=\text{ Bayes' Theorem }$$
$$=p(B|A)p(A|C) \neq p(A|C)p(B|C)$$
$$\Longrightarrow{} \text{ $A$ and $B$ are not conditionally independent given $C$}$$

d. $A\perp \!\!\! \perp  D | B$: True
$$ p(A,D|B) = \frac{p(A,B,D)}{p(B)} =\text{ Head to Tail }= \frac{p(A)p(B|A)p(D|B)}{p(B)}=\text{ Bayes' Theorem }$$
$$= p(A|B)p(D|B) \Longrightarrow{} \text{ $A$ and $D$ are conditionally independent given $B$}$$

e. $D\perp \!\!\! \perp  F | E$: False
$$ p(D,F|E) = \frac{p(D,F,E)}{p(E)} = \text{ Head to Head } = \frac{p(E|D,F)\sum_{B} p(B)p(D|B)p(F|B)}{p(E)} \neq p(D|E) p(F|E)$$
$$\Longrightarrow{}\text{ $D$ and $F$ are not conditionally independent given $E$}$$

f. $B\perp \!\!\! \perp F| E$: False
$$ p(B,F|E) = \frac{p(B,F,E)}{p(E)} = \frac{p(B)p(F|B)\sum_{D} p(D|B)p(E|D,F)}{p(E)} \neq p(B|E)p(F|E)$$
$$\text{since the direct edge $B \to F$ (the factor $p(F|B)$) cannot be blocked by any conditioning set}$$
$$\Longrightarrow{}\text{ $B$ and $F$ are not conditionally independent given $E$}$$

g. $A\perp \!\!\! \perp  D | \{B, F\}$: True
$$ p(A,D|\{B,F\}) = \frac{p(A,D,B,F)}{p(B,F)} = \frac{p(A)p(B|A)p(F|B)p(D|B)}{p(B)p(F|B)} = \text{ Bayes' Theorem }$$
$$= p(A|B)p(D|B) = p(A|\{B,F\})\,p(D|\{B,F\}) \Longrightarrow{}\text{ $A$ and $D$ are conditionally independent given $B$ and $F$}$$
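
The seven statements can also be cross-checked mechanically. The sketch below is only an illustration, not part of the requested solution: it encodes the parent sets read off the factorization in Exercise 1 and tests d-separation with the moralized ancestral graph criterion. The names `parents`, `ancestors` and `d_separated` are hypothetical helpers introduced here.

```python
from itertools import combinations

# Parent sets read off P(A)P(B|A)P(F|B)P(C|A)P(D|B)P(E|D,F)P(G).
parents = {
    "A": [], "B": ["A"], "C": ["A"], "D": ["B"],
    "E": ["D", "F"], "F": ["B"], "G": [],
}

def ancestors(nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents[n])
    return seen

def d_separated(x, y, given=()):
    """True if x and y are d-separated by `given` (moralized ancestral graph test)."""
    keep = ancestors({x, y, *given})
    # Undirected graph: parent-child links plus links between co-parents.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = parents[child]
        for p in ps:
            adj[p].add(child)
            adj[child].add(p)
        for p, q in combinations(ps, 2):
            adj[p].add(q)
            adj[q].add(p)
    # Search from x without ever entering a conditioning node.
    blocked = set(given)
    seen, stack = {x}, [x]
    while stack:
        n = stack.pop()
        for m in adj[n] - seen - blocked:
            seen.add(m)
            stack.append(m)
    return y not in seen

queries = {
    "a": ("A", "D", ()), "b": ("F", "D", ()), "c": ("A", "B", ("C",)),
    "d": ("A", "D", ("B",)), "e": ("D", "F", ("E",)),
    "f": ("B", "F", ("E",)), "g": ("A", "D", ("B", "F")),
}
for label, (x, y, given) in queries.items():
    print(label, d_separated(x, y, given))
```

d-separation in the graph guarantees the corresponding independence for every distribution that factorizes this way, so a `True` here confirms an independence statement, while a `False` means the independence fails for generic choices of the conditional tables.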

## Exercise 2:

1. Write the generative model represented by the following directed graph, knowing that:
    - $p$ and $\pi_j$ are sampled from Beta distributions;
    - $r_i$ is sampled from a Bernoulli distribution;
    - $u_{ij}$ is sampled from a Bernoulli distribution with parameter $r_i (1 - \pi_j) + (1 - r_i)\pi_j$.

![](image.png)

2. Implement the generative model using `pyro`. 

    Set the hyperparameters to $\alpha_p=1,\beta_p=1,\alpha_\pi=1,\beta_\pi=5$ and evaluate your model on the observations `data = dist.Bernoulli(0.6).sample((12,6))`.

    Remember to use plate notation and to condition on the observed data!

#### Solution

1. The generative model is:
    * $p\sim Beta(\alpha_p, \beta_p)$
    * $\pi_j \sim Beta(\alpha_\pi, \beta_\pi)$
    * $r_i \sim Bernoulli(p)$
    * $u_{ij} \sim Bernoulli \left(r_i (1 - \pi_j) + (1 - r_i)\pi_j\right)$

with $i = 1, \dots, S$ and $j=1, \dots, N$.
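
Before moving to `pyro`, the sampling order implied by the model can be sketched with the standard library alone. This is an illustrative forward (ancestral) sampler, not the requested solution; the function name `sample_generative_model` and the `seed` argument are assumptions introduced here, and the default hyperparameters are the ones given in the exercise.

```python
import random

def sample_generative_model(S, N, alpha_p=1.0, beta_p=1.0,
                            alpha_pi=1.0, beta_pi=5.0, seed=0):
    """Draw one joint sample (p, pi, r, u) by following the graph top-down."""
    rng = random.Random(seed)
    # Global parameter p ~ Beta(alpha_p, beta_p)
    p = rng.betavariate(alpha_p, beta_p)
    # One pi_j per column, pi_j ~ Beta(alpha_pi, beta_pi)
    pi = [rng.betavariate(alpha_pi, beta_pi) for _ in range(N)]
    # One r_i per row, r_i ~ Bernoulli(p)
    r = [int(rng.random() < p) for _ in range(S)]
    # u_ij ~ Bernoulli(r_i (1 - pi_j) + (1 - r_i) pi_j)
    u = [
        [int(rng.random() < r[i] * (1 - pi[j]) + (1 - r[i]) * pi[j])
         for j in range(N)]
        for i in range(S)
    ]
    return p, pi, r, u

p, pi, r, u = sample_generative_model(S=12, N=6)
print(len(u), "rows x", len(u[0]), "columns")
```

Each variable is drawn only after its parents, which is the same order the `pyro.sample` statements must follow in the model.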

2. Implementation with an explicit loop over the observations:

In [3]:
import pyro
import torch
import pyro.distributions as dist
pyro.set_rng_seed(1)

# hyperparameters
alpha_p = 1.0
beta_p = 1.0
alpha_pi = 1.0
beta_pi = 5.0

def model(data):
    S = len(data)
    N = len(data[0])
    
    # Global variables
    p = pyro.sample('p', dist.Beta(alpha_p, beta_p))
    
    with pyro.plate('i', S):
        r = pyro.sample('r', dist.Bernoulli(p))
    
    with pyro.plate('j', N):
        pi = pyro.sample('pi', dist.Beta(alpha_pi, beta_pi))
    
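    # One observed site per matrix entry: every site needs a unique name
    # and must be conditioned on its own observation data[i][j]
    for i in range(S):
        for j in range(N):
            pyro.sample(f'u_{i}_{j}', dist.Bernoulli(r[i]*(1-pi[j])+(1-r[i])*pi[j]), obs=data[i][j])

    # Observed sites return the values they are conditioned on, so u is the data itself
    print("p =", p, "\nr =", r, "\npi =", pi, "\nu =",  data)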
    
model(data = dist.Bernoulli(0.6).sample((12,6)))

p = tensor(0.7678) 
r = tensor([1., 1., 1., 1., 0., 1., 1., 0., 1., 0., 1., 1.]) 
pi = tensor([0.2320, 0.1038, 0.1691, 0.2006, 0.1639, 0.0195]) 
u = tensor([[0., 1., 1., 0., 1., 0.],
        [1., 0., 1., 1., 0., 1.],
        [0., 1., 1., 1., 1., 1.],
        [0., 0., 1., 1., 1., 1.],
        [1., 1., 0., 0., 1., 0.],
        [0., 0., 0., 1., 1., 1.],
        [0., 1., 0., 1., 0., 1.],
        [0., 1., 1., 0., 1., 0.],
        [1., 1., 1., 1., 0., 1.],
        [0., 0., 1., 1., 0., 0.],
        [1., 0., 0., 0., 1., 1.],
        [1., 1., 1., 1., 0., 1.]])


Vectorized implementation with two nested plates:

In [4]:
import pyro
import torch
import pyro.distributions as dist
pyro.set_rng_seed(1)

# hyperparameters
alpha_p = 1.0
beta_p = 1.0
alpha_pi = 1.0
beta_pi = 5.0

def model(data):
    S = len(data)
    N = len(data[0])
    
    # Global variables
    p = pyro.sample('p', dist.Beta(alpha_p, beta_p))
    
    # Rows (index i) live on dim -2 and columns (index j) on dim -1,
    # so that the batch shape of u matches data of shape (S, N)
    x_axis = pyro.plate("x_axis", S, dim=-2)
    y_axis = pyro.plate("y_axis", N, dim=-1)
    with x_axis:
        r = pyro.sample('r', dist.Bernoulli(p))
    with y_axis:
        pi = pyro.sample('pi', dist.Beta(alpha_pi, beta_pi))
    with x_axis, y_axis:
        u = pyro.sample('u', dist.Bernoulli(r*(1-pi)+(1-r)*pi), obs=data)
            
    print("p =", p, "\nr =", r, "\npi =", pi, "\nu =",  u)
    
model(data = dist.Bernoulli(0.6).sample((12,6)))

p = tensor(0.7678) 
r = tensor([[1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.]]) 
pi = tensor([0.2320, 0.1038, 0.1691, 0.2006, 0.1639, 0.0195]) 
u = tensor([[0., 1., 1., 0., 1., 0.],
        [1., 0., 1., 1., 0., 1.],
        [0., 1., 1., 1., 1., 1.],
        [0., 0., 1., 1., 1., 1.],
        [1., 1., 0., 0., 1., 0.],
        [0., 0., 0., 1., 1., 1.],
        [0., 1., 0., 1., 0., 1.],
        [0., 1., 1., 0., 1., 0.],
        [1., 1., 1., 1., 0., 1.],
        [0., 0., 1., 1., 0., 0.],
        [1., 0., 0., 0., 1., 1.],
        [1., 1., 1., 1., 0., 1.]])
