# Homework 2

## Directed Graphical Models

In the previous homework, you computed the posterior probabilities of the cook (C), the butler (B), or both being the murderer, given the choice of weapon (K = knife, P = poison). In this exercise you will construct a directed Bayesian graphical model, or belief network, for the available evidence.

As Inspector Markov has continued her investigation, she has discovered an unexplained set of footprints, evidence that a third person may have been involved in the crime. Given that there is no evidence of a break-in, she realizes that if a third person was involved, they must have had assistance from the cook, the butler, or both. In other words, the cook and the butler may be guilty even if they did not commit the actual killing with the knife.

The inspector also learns that further tests on the body have confirmed Dr Turing's conclusion that the knife was the sole cause of death.

Given this evidence, Inspector Markov must update her beliefs. 

As a first step in creating the belief network, import the packages you will need for this analysis.

In [1]:
from pgmpy.models import BayesianModel
from pgmpy.factors.discrete import TabularCPD

The joint probability distribution is:

$$p(B,C,W,BK,CK,M)$$   
where the letters indicate the following variables:   
$B = $ butler committed the crime,   
$C = $ cook committed the crime,   
$W = $ weapon K = knife, P = poison,   
$BK = $ butler committed the crime with a knife conditional on butler and weapon,   
$CK = $ cook committed the crime with a knife, conditional on cook and weapon,   
$M = $ butler, cook or both committed the crime, conditional on BK, CK.    

Notice that, because of the evidence provided by Dr Turing, we can neglect any conditional distribution where the weapon was poison. Also, it is possible that the cook and butler are guilty without having actually used the knife: $p(BK_0,CK_0) \ne 0$.

Given the independencies, this distribution can be factorized in the following manner:

$$p(B,C,W,BK,CK,M) = p(B)\ p(C)\ p(W)\ p(BK\ |\ B,K)\ p(CK\ |\ C,K)\ p(M\ |\ BK,CK)$$

Now you will need to define the skeleton of the graph. Given the independence relationships of the factorized probability distribution, define the skeleton of the model (`m_model`) using the `BayesianModel` class.

>**Hint:** Using paper and pencil make a sketch of the graph before you commit your skeleton structure to code. 

In [2]:
m_model = BayesianModel([('B', 'BK'), ('C', 'CK'), ('W', 'BK'), ("W", 'CK'),
                        ('BK', 'M'), ('CK', 'M')])
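
As a check on the skeleton, the parents of each node can be read straight off the edge list and compared with the factorization above. The following is a minimal plain-Python sketch that does not use pgmpy; the helper names `parents_of` and `factor_terms` are hypothetical, and the edge list is copied from the cell above.

```python
# Edge list copied from the skeleton above, as (parent, child) pairs.
edges = [('B', 'BK'), ('C', 'CK'), ('W', 'BK'), ('W', 'CK'),
         ('BK', 'M'), ('CK', 'M')]

def parents_of(edges):
    """Map every node to the sorted list of its parents (hypothetical helper)."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(parent, [])
        parents.setdefault(child, []).append(parent)
    return {node: sorted(pa) for node, pa in parents.items()}

def factor_terms(edges):
    """Build one factor per node: p(X) for roots, p(X | parents) otherwise."""
    terms = []
    for node, pa in parents_of(edges).items():
        terms.append(f"p({node} | {', '.join(pa)})" if pa else f"p({node})")
    return terms

terms = factor_terms(edges)
print(' '.join(terms))
```

Each node contributes exactly one term, so the printed product should contain the same six factors as the factorization, with the weapon node written as $W$ rather than $K$.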

Your next step in creating your model is to define the CPD for each independent variable using the `TabularCPD` class. The tables for these variables are:    


$p(B)$   

| Case | p |
|---|---|
|$B_0$ | 0.4 |
|$B_1$ | 0.6 |    

$p(C)$   

| Case | p |
|---|---|
|$C_0$ | 0.7 |
|$C_1$ | 0.3 |

$p(W)$   

| Case | p |
|---|---|
|$W_0$ | 1.0 |

Notice that since the Inspector is sure the weapon was a knife, the cardinality of $W = 1$, $p(K) = 1.0$. This fact reduces the cardinality of other variables as you will see. 

Using the above tables, define the CPDs. Make sure you use variable names consistent with your model.

In [3]:
CDP_B = TabularCPD(variable='B', variable_card=2, values=[[0.4, 0.6]])
CDP_C = TabularCPD(variable='C', variable_card=2, values=[[0.7, 0.3]])
CDP_W = TabularCPD(variable='W', variable_card=1, values=[[1.0]])
print(CDP_B)
print(CDP_C)
print(CDP_W)

╒═════╤═════╕
│ B_0 │ 0.4 │
├─────┼─────┤
│ B_1 │ 0.6 │
╘═════╧═════╛
╒═════╤═════╕
│ C_0 │ 0.7 │
├─────┼─────┤
│ C_1 │ 0.3 │
╘═════╧═════╛
╒═════╤═══╕
│ W_0 │ 1 │
╘═════╧═══╛


Next, define the variables $BK$ and $CK$, the conditional probabilities that the butler or the cook are guilty, given the murder weapon. You need not consider cases of poison, $P$, as the probabilities are zero, reducing the number of states with non-zero probabilities. Thus, the probability tables for these variables are:

$$p(BK)$$

| | p | p |
|---|---|---|
| | $B_0$ | $B_1$|
| | $K_1$ | $K_1$ |
|$BK_0$ | 1.0 | 1.0 |

$$p(CK)$$

| | p | p |
|---|---|---|
| | $C_0$ | $C_1$|
| | $K_1$ | $K_1$ |
|$CK_0$ | 1.0 | 1.0 |

There are two odd aspects of these tables. First, convention is broken by having the positive case of a knife labeled as 0. Second, probabilities are all 1.0 since a knife was used and this fact is independent of the perpetrator. 

Given the above tables, define the CPDs.

In [4]:

CDP_BK = TabularCPD(variable='BK', variable_card=1, values=[[1.0, 1.0]],
                    evidence=['W', 'B'],evidence_card=[1, 2])
print(CDP_BK)
CDP_CK = TabularCPD(variable='CK', variable_card=1, values=[[1.0,1.0]],
                   evidence= ['W','C'], evidence_card=[1,2])
print(CDP_CK)


╒══════╤═════╤═════╕
│ W    │ W_0 │ W_0 │
├──────┼─────┼─────┤
│ B    │ B_0 │ B_1 │
├──────┼─────┼─────┤
│ BK_0 │ 1.0 │ 1.0 │
╘══════╧═════╧═════╛
╒══════╤═════╤═════╕
│ W    │ W_0 │ W_0 │
├──────┼─────┼─────┤
│ C    │ C_0 │ C_1 │
├──────┼─────┼─────┤
│ CK_0 │ 1.0 │ 1.0 │
╘══════╧═════╧═════╛


**Question:** If $p(Poison) \ne 0$, how many possible states would each of these CPDs have?  

ANS: 4 

Because $N_{B} * N_{W(K,P)} = 2 * 2 = 4$
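
The count in this answer is the number of parent configurations, that is, the number of columns in the CPD table. A minimal sketch of that arithmetic follows; the helper name `n_parent_configs` is hypothetical and is not part of pgmpy.

```python
from math import prod

def n_parent_configs(evidence_card):
    """Number of columns in a CPD table: the product of the parents' cardinalities."""
    return prod(evidence_card)

# BK has parents W and B.
knife_only = n_parent_configs([1, 2])    # W has 1 state (knife), B has 2
with_poison = n_parent_configs([2, 2])   # W has 2 states (knife, poison), B has 2

print(knife_only, with_poison)
```

The same helper applies to $CK$, whose parents $W$ and $C$ have the same cardinalities.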

Finally, you must define the probability of the murder, $M$, with states butler = 0, cook = 1, and butler and cook = 2. This CPD is conditional on both $BK$ and $CK$. There are 12 possible states, $N_{BK} * N_{CK} * N_{M} = 2 * 2 * 3 = 12$, as shown here:

| | p | p | p | p |
|---|---|---|---|---|
| | $CK_0$ | $CK_0$ | $CK_1$ | $CK_1$ |
| | $BK_0$ | $BK_1$ | $BK_0$ | $BK_1$ |
|$M_0$| 0.4 | 0.7 | 0.1 | 0.3 |
|$M_1$| 0.4 | 0.1 | 0.7 | 0.3 |
|$M_2$| 0.2 | 0.2 | 0.2 | 0.4 |

In [12]:
CDP_M = TabularCPD(variable='M', variable_card=3,
                   values=[[0.4, 0.7, 0.1, 0.3],
                           [0.4, 0.1, 0.7, 0.3],
                           [0.2, 0.2, 0.2, 0.4]],
                   evidence=['CK', 'BK'], evidence_card=[2, 2])
print(CDP_M)

╒═════╤══════╤══════╤══════╤══════╕
│ CK  │ CK_0 │ CK_0 │ CK_1 │ CK_1 │
├─────┼──────┼──────┼──────┼──────┤
│ BK  │ BK_0 │ BK_1 │ BK_0 │ BK_1 │
├─────┼──────┼──────┼──────┼──────┤
│ M_0 │ 0.4  │ 0.7  │ 0.1  │ 0.3  │
├─────┼──────┼──────┼──────┼──────┤
│ M_1 │ 0.4  │ 0.1  │ 0.7  │ 0.3  │
├─────┼──────┼──────┼──────┼──────┤
│ M_2 │ 0.2  │ 0.2  │ 0.2  │ 0.4  │
╘═════╧══════╧══════╧══════╧══════╛
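
Each column of a conditional probability table is a distribution over $M$ for one combination of $CK$ and $BK$, so it should sum to 1. The sketch below checks this in plain Python, independently of pgmpy; the values are copied from the code cell above and the helper name `column_sums` is hypothetical.

```python
import math

# Rows are M_0, M_1, M_2; columns follow the printed table above.
values = [[0.4, 0.7, 0.1, 0.3],
          [0.4, 0.1, 0.7, 0.3],
          [0.2, 0.2, 0.2, 0.4]]

def column_sums(table):
    """Sum each column of a row-major table."""
    return [sum(column) for column in zip(*table)]

sums = column_sums(values)
# Compare with a tolerance, since floating-point sums are rarely exact.
all_normalized = all(math.isclose(total, 1.0) for total in sums)
print(all_normalized)
```

A check like this is a quick way to catch a mistyped entry before the CPD is attached to the model.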


**Question:** There are 12 possible states of the variable $M$. If $p(Poison) \ne 0$ how many possible states would there be?

ANS: It will be 24.

$N_{BK} * N_{CK} * N_{M} * N_{W} = 2 * 2 * 3 * 2 = 24$ 

To complete your belief network, use the `add_cpds` method. 

> **Hint:** Before going any further make sure you apply the `check_model` method to your complete model. 

In [13]:
m_model.add_cpds(CDP_B,CDP_BK,CDP_C,CDP_CK,CDP_M,CDP_W)
m_model.get_cpds()
m_model.check_model()



True

Next investigate the independencies of all the variables in your model using the `local_independencies` method. 

In [14]:
m_model.local_independencies(['W','B','C','BK','CK','M'])

(W _|_ C, B)
(B _|_ CK, C, W)
(C _|_ BK, W, B)
(BK _|_ CK, C | W, B)
(CK _|_ BK, B | C, W)
(M _|_ C, W, B | BK, CK)
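
These statements follow the local Markov property: each variable is independent of its non-descendants given its parents. The sketch below reproduces them from the edge list in plain Python. It is not pgmpy's implementation, and the helper names `descendants` and `local_independency` are hypothetical.

```python
# Edge list copied from the model skeleton, as (parent, child) pairs.
edges = [('B', 'BK'), ('C', 'CK'), ('W', 'BK'), ('W', 'CK'),
         ('BK', 'M'), ('CK', 'M')]

def descendants(node, edges):
    """All nodes reachable from `node` by following directed edges."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent, child in edges:
            if parent == current and child not in found:
                found.add(child)
                stack.append(child)
    return found

def local_independency(node, edges):
    """Return (non-descendants excluding parents, parents) for `node`."""
    nodes = {n for edge in edges for n in edge}
    parents = {p for p, c in edges if c == node}
    independent_of = nodes - descendants(node, edges) - parents - {node}
    return independent_of, parents

for node in ['W', 'B', 'C', 'BK', 'CK', 'M']:
    independent_of, parents = local_independency(node, edges)
    given = f" | {', '.join(sorted(parents))}" if parents else ''
    print(f"({node} _|_ {', '.join(sorted(independent_of))}{given})")
```

The variables inside each statement are printed in sorted order, so the ordering may differ from the pgmpy output even though the sets are the same.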

**Question:** Is this graphical model an I-map of the distribution discussed at the start of this lab and why?

ANS: The model differs slightly from the distribution we expected, which factorizes as:

$$p(B,C,W,BK,CK,M) = p(B)\ p(C)\ p(W)\ p(BK\ |\ B,K)\ p(CK\ |\ C,K)\ p(M\ |\ BK,CK)$$

A graph and a distribution need not be identical even when they share the same independencies. 

The graph therefore does not represent our distribution completely.

However, the model can serve as an I-map when we use it for calculation or inference, because an I-map only requires that every independency implied by the graph also holds in the distribution.

Find the trails that are not active among the following pairs of variables:

- B and C
- B and W
- C and W
- C and CK
- B and BK
- BK and CK
- B and M
- C and M

Create and execute the code using the `is_active_trail` method on the model object. 

In [18]:
def test_active(start, end):
    # Report whether any trail between start and end is active (no evidence observed).
    print('Active trail between ' + start + ' and ' + end + ' -> '
          + str(m_model.is_active_trail(start, end)))

starts = ['B', 'B', 'C', 'C', 'B', 'BK', 'B', 'C']
ends = ['C', 'W', 'W', 'CK', 'BK', 'CK', 'M', 'M']

for s, e in zip(starts, ends):
    test_active(s, e)


Active trail between B and C -> False
Active trail between B and W -> False
Active trail between C and W -> False
Active trail between C and CK -> True
Active trail between B and BK -> True
Active trail between BK and CK -> True
Active trail between B and M -> True
Active trail between C and M -> True
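
The results above can be cross-checked without pgmpy. The sketch below uses the moralized ancestral graph criterion for d-separation: keep only the two query nodes, the observed nodes and their ancestors, connect parents that share a child, drop the edge directions, remove the observed nodes, and test whether the two query nodes are still connected. This is a swapped-in textbook technique, not necessarily how `is_active_trail` works internally, and the function names are hypothetical.

```python
from collections import deque

edges = [('B', 'BK'), ('C', 'CK'), ('W', 'BK'), ('W', 'CK'),
         ('BK', 'M'), ('CK', 'M')]

def ancestors_of(nodes, edges):
    """Return the given nodes together with all of their ancestors."""
    result, frontier = set(nodes), list(nodes)
    while frontier:
        child = frontier.pop()
        for p, c in edges:
            if c == child and p not in result:
                result.add(p)
                frontier.append(p)
    return result

def is_active(start, end, observed=(), edges=edges):
    """True if start and end are d-connected given the observed nodes.

    Assumes start and end are not themselves observed.
    """
    observed = set(observed)
    keep = ancestors_of({start, end} | observed, edges)
    # Undirected adjacency of the ancestral subgraph.
    adj = {n: set() for n in keep}
    for p, c in edges:
        if p in keep and c in keep:
            adj[p].add(c)
            adj[c].add(p)
    # Moralize: connect every pair of parents that share a child.
    for child in keep:
        pa = [p for p, c in edges if c == child]
        for i, a in enumerate(pa):
            for b in pa[i + 1:]:
                adj[a].add(b)
                adj[b].add(a)
    # Breadth-first search from start that never enters an observed node.
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            return True
        for nxt in adj[node]:
            if nxt not in seen and nxt not in observed:
                seen.add(nxt)
                queue.append(nxt)
    return False

pairs = [('B', 'C'), ('B', 'W'), ('C', 'W'), ('C', 'CK'),
         ('B', 'BK'), ('BK', 'CK'), ('B', 'M'), ('C', 'M')]
results = {pair: is_active(*pair) for pair in pairs}
for (s, e), active in results.items():
    print(f'Active trail between {s} and {e} -> {active}')
```

Passing evidence changes the picture: for example, `is_active('B', 'C', observed={'M'})` returns `True` in this sketch, because observing the common effect $M$ opens the trail through $BK$ and $CK$.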


**Question:** How can you best explain the blocked trails given the evidence variables and V-structures in the graph? What are the trails with V-structures which are blocked? 

ANS: When a trail is active (True), the two variables are not independent.

Also, once evidence variables are observed, the Bayes ball rules for passing through a node change.

A common cause (a shared parent, which is not a V-structure) is active when the middle variable is not observed.

A common effect (a V-structure) is active only when the middle variable, or one of its descendants, is observed. 

In this case, no variable in the model is observed. 

So the trails among B, C and W are not active, because each of them passes through an unobserved common effect: BK, CK or M.
