# The Objectives

1. Graphical model concepts
    - Conditional independence.
    - Bayes' theorem.
    - Joint distributions.
2. Bayesian Network
3. Examples
4. Pomegranate for Bayesian Network in Python


# [Graphical Model (GM)][1]


In [probability theory][3], two random events $A$ and $B$ are conditionally independent given a third event $C$ precisely if the occurrence of $A$ and the occurrence of $B$ are independent events in their conditional probability distribution given $C$.
![image.png](attachment:image.png)



 
A graphical model, or probabilistic graphical model (PGM), is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. [Probabilistic graphical models][2] are graphs in which nodes represent random variables, and the (lack of) arcs represent conditional independence assumptions. Hence they provide a compact representation of joint probability distributions.

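An example of a <b><u> directed graphical model </u></b>. Each arrow indicates a dependency. In this example, D depends on A, B, and C; C depends on B and D; and A and B are each independent (see the figure). Because C and D depend on each other, this particular graph contains a cycle, so it is not a directed acyclic graph (DAG).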

<img style="float:center" src="./GM.png" alt="drawing" height="200" width="200"/>
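
As a quick sanity check on the dependencies listed above, the sketch below encodes them as a predecessor map and asks Python's standard-library `graphlib` (Python 3.9+) for a topological order. The names `parents`, `dag`, and `is_acyclic` are illustrative and not part of any library discussed here. Since C depends on D and D depends on C, no topological order exists for this graph.

```python
from graphlib import CycleError, TopologicalSorter

# Map each node to the set of nodes it depends on (its parents),
# following the description above: D <- A, B, C and C <- B, D.
parents = {
    "A": set(),
    "B": set(),
    "C": {"B", "D"},
    "D": {"A", "B", "C"},
}


def is_acyclic(graph):
    """Return True if the predecessor map `graph` describes a DAG."""
    try:
        # Iterating static_order() raises CycleError when a cycle exists.
        list(TopologicalSorter(graph).static_order())
        return True
    except CycleError:
        return False


print(is_acyclic(parents))  # False

# Dropping the dependency of C on D leaves a valid DAG.
dag = {**parents, "C": {"B"}}
print(is_acyclic(dag))  # True
```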



[1]:https://en.wikipedia.org/wiki/Graphical_model
[2]:https://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html#repr
[3]:https://en.wikipedia.org/wiki/Conditional_independence

# [Bayesian Network][1] 

Let us go through [this example][2].

A Bayesian network consists of:
1. A directed acyclic graph (DAG) in which each node represents a random variable (RV) $X_{i}$.
2. For each node, a conditional probability distribution $p(X_{i}|parents(X_{i}))$.
3. In the simplest case (discrete RVs), a conditional probability table (CPT) that represents each conditional distribution.
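
A CPT for a discrete node can be sketched as a plain dictionary that maps each assignment of the parents to a distribution over the node's own values. The table below, for a hypothetical node `B` with a single parent `A`, uses made-up probabilities and is only an illustration; the helper `rows_are_normalized` (also hypothetical) confirms that every row is a valid distribution.

```python
import math

# Hypothetical CPT for p(B | A); the numbers are made up.
# Keys are assignments of the parent A, values are distributions over B.
cpt_b_given_a = {
    ("T",): {"T": 0.7, "F": 0.3},
    ("F",): {"T": 0.1, "F": 0.9},
}


def rows_are_normalized(cpt):
    """Check that each conditional distribution in the CPT sums to 1."""
    return all(math.isclose(sum(row.values()), 1.0) for row in cpt.values())


print(rows_are_normalized(cpt_b_given_a))  # True
```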

<br/>

## [Bayes Nets & conditional independence][3]


<b>Independence</b>: $p(X,Y)=p(X)p(Y)$ <br/>
<b>Conditional independence</b>: $p(X,Y|Z)=p(X|Z)p(Y|Z)$ 


## Bayes Nets & joint probabilities & chain rule:

Recall that by the chain rule, we can write any joint probability $p$ as: <br/><br/>

<center> $\large p(x_{1},x_{2},\dots,x_{n})=p(x_{1})p(x_{2}|x_{1})p(x_{3}|x_{1},x_{2})\dots p(x_{n}|x_{1},x_{2},\dots,x_{n-1})$</center>
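
In a Bayesian network, assuming the variables are ordered so that parents come before their children, each variable is conditionally independent of its other predecessors given its parents. Every factor in the chain rule then only needs to condition on the node's parents:

<center> $\large p(x_{1},x_{2},\dots,x_{n})=\prod_{i=1}^{n}p(x_{i}|parents(x_{i}))$</center>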



## Bayes Nets & Bayes theorem:

Given two events $X$ and $Y$ (class and observation):

<center> $\large p(X|Y)=\large \frac{p(X,Y)}{p(Y)}$  </center>

$p(X|Y)$ is a conditional probability called the posterior probability.

$p(X,Y)=p(Y|X)p(X)$ is the joint probability.

$p(Y|X)$ is a conditional probability called the likelihood.

$p(Y)$ and $p(X)$ are marginal probabilities: $p(Y)$ is the evidence and $p(X)$ is the prior probability.
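
As a small numeric illustration of these four quantities, the sketch below computes a posterior from a prior and a likelihood for a binary $X$. The numbers and the function name `posterior` are hypothetical, chosen only for the example; the evidence $p(Y)$ is obtained by summing the joint over both values of $X$.

```python
def posterior(prior_x, lik_y_given_x, lik_y_given_not_x):
    """Return p(X | Y) for a binary X via Bayes' theorem."""
    joint_x = lik_y_given_x * prior_x                  # p(X, Y) = p(Y | X) p(X)
    joint_not_x = lik_y_given_not_x * (1.0 - prior_x)  # p(not X, Y)
    evidence = joint_x + joint_not_x                   # p(Y)
    return joint_x / evidence


# Hypothetical numbers: p(X) = 0.3, p(Y | X) = 0.8, p(Y | not X) = 0.1.
print(round(posterior(0.3, 0.8, 0.1), 4))  # 0.7742
```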



## Possible dependencies:


<img style="float:center" src="./IndependenceXYZ.png" alt="drawing" height="200" width="400"/>


Head-to-head: Indep(X, Y)      
$P(X,Y,Z) = P(X)P(Y) P(Z|X,Y)$   
$P(X, Y) = P(X)P(Y)\sum_{z}{P(Z=z|X,Y)} = P(X)P(Y)$.       



Tail-to-tail: Indep(X, Y |Z)    
$P(X,Y,Z)= P(Z)P(X|Z)P(Y|Z)$       
$P(X,Y|Z)=\frac{P(X,Y,Z)}{P(Z)}=P(X|Z)P(Y|Z)$      




Head-to-tail: Indep(X, Y |Z)    
$P(X,Y,Z) = P(X)P(Z|X)P(Y|Z)$    
$P(X,Y|Z) = \frac{P(X,Y,Z)}{P(Z)}=\frac{P(X,Z)P(Y|Z)}{P(Z)}=P(X|Z)P(Y|Z)$   
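
The head-to-head case can be checked numerically. The sketch below builds the joint $P(X,Y,Z)=P(X)P(Y)P(Z|X,Y)$ from made-up tables (all numbers and names are hypothetical) and confirms that $X$ and $Y$ are independent marginally but become dependent once $Z$ is observed.

```python
from itertools import product

# Hypothetical tables for a head-to-head structure X -> Z <- Y.
p_x = {0: 0.7, 1: 0.3}
p_y = {0: 0.6, 1: 0.4}
# p_z1[(x, y)] = P(Z=1 | X=x, Y=y)
p_z1 = {(0, 0): 0.05, (0, 1): 0.8, (1, 0): 0.8, (1, 1): 0.95}

# Joint P(X, Y, Z) = P(X) P(Y) P(Z | X, Y)
joint = {}
for x, y, z in product((0, 1), repeat=3):
    pz = p_z1[(x, y)] if z == 1 else 1.0 - p_z1[(x, y)]
    joint[(x, y, z)] = p_x[x] * p_y[y] * pz


def prob(**fixed):
    """Sum the joint over all entries consistent with the fixed values."""
    return sum(
        p for (x, y, z), p in joint.items()
        if all({"x": x, "y": y, "z": z}[k] == v for k, v in fixed.items())
    )


# Marginally: P(X=1, Y=1) equals P(X=1) P(Y=1).
marginal_gap = abs(prob(x=1, y=1) - prob(x=1) * prob(y=1))

# Given Z=1: P(X=1, Y=1 | Z=1) differs from P(X=1 | Z=1) P(Y=1 | Z=1).
pz1 = prob(z=1)
conditional_gap = abs(
    prob(x=1, y=1, z=1) / pz1
    - (prob(x=1, z=1) / pz1) * (prob(y=1, z=1) / pz1)
)

print(marginal_gap < 1e-12)    # True
print(conditional_gap > 1e-3)  # True
```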



[1]:https://ermongroup.github.io/cs228-notes/representation/directed/
[2]:https://www.youtube.com/watch?v=4fcqyzVJwHM
[3]:https://ipvs.informatik.uni-stuttgart.de/mlr/marc/teaching/15-MachineLearning/07-graphicalModels.pdf


## [Example of Bayes Nets and joint probability:][5]


In the following figure, obtain $p(A,B,C,D,E,F)$ by the chain rule:

<img style="float:center" src="./BNEx.png" alt="drawing" height="100" width="200"/>


### Solution:

<img style="float:center" src="./jointprob.png" alt="drawing" height="300" width="500"/>

[5]:http://www.cse.unsw.edu.au/~cs9417ml/Bayes/Pages/Bayesian_Networks_Definition.html

# [Example (Sprinkler/Grass Wet/Rain)][1]

1. Two events can cause the grass (G) to be wet: an active sprinkler (S) or rain (R).
2. Rain has a direct effect on the use of the sprinkler (namely that when it rains, the sprinkler usually is not active).


## [In the training dataset:][2]

1. Identify the random variables; here we have three.
2. Order the variables and build the Bayesian network.
3. Build the conditional probability tables (CPTs) shown in the figure below, either:
 - from a domain expert, or
 - from the data.

<img style="float:center" src="./SRG.png" alt="drawing" height="100" width="500"/>


## In the test dataset:

We may answer the following queries:
1. What is P(G,S,R)?
 - Answer: we need to calculate $P(G,S,R)=P(G|S,R)P(S|R)P(R)$ for every $G,S,R\in \{T,F\}$. <br/>
    - Example: $P(G=T,S=T,R=T)=P(G=T|S=T,R=T)P(S=T|R=T)P(R=T)=0.99\times 0.01\times 0.2=0.00198$
2. Given $G=T$, what is the probability that $R$ is $T$, i.e. $P(R=T|G=T)$?
    - Answer: <br/>
<center>$P(R=T|G=T)=\large \frac{P(G=T,R=T)}{P(G=T)}=\frac{\sum_{S\in\{T,F\}}P(G=T,S,R=T)}{\sum_{S,R\in\{T,F\}}P(G=T,S,R)}$ </center> <br/>
    - To complete the answer, go to [this link][1].
3. More complicated queries, which rely on variable elimination algorithms or approximate (non-exact) inference.
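
The first two queries can be answered by brute-force enumeration of the joint. The sketch below is plain Python rather than any library's API. The CPT values are the ones used for the sprinkler network in the pomegranate cell of this notebook (they agree with the $0.99\times 0.01\times 0.2$ example above), and the helper names `joint` and `p_rain_given_wet` are illustrative.

```python
from itertools import product

# CPTs of the sprinkler network.
p_rain = {"T": 0.2, "F": 0.8}            # P(R)
p_sprinkler_on = {"T": 0.01, "F": 0.4}   # P(S=T | R), keyed by R
p_wet = {                                # P(G=T | S, R), keyed by (S, R)
    ("T", "T"): 0.99, ("T", "F"): 0.9,
    ("F", "T"): 0.8, ("F", "F"): 0.0,
}


def joint(g, s, r):
    """P(G=g, S=s, R=r) = P(G | S, R) P(S | R) P(R)."""
    pg = p_wet[(s, r)] if g == "T" else 1.0 - p_wet[(s, r)]
    ps = p_sprinkler_on[r] if s == "T" else 1.0 - p_sprinkler_on[r]
    return pg * ps * p_rain[r]


def p_rain_given_wet():
    """P(R=T | G=T), summing the joint over the unobserved variables."""
    numerator = sum(joint("T", s, "T") for s in "TF")
    evidence = sum(joint("T", s, r) for s, r in product("TF", repeat=2))
    return numerator / evidence


print(round(joint("T", "T", "T"), 5))  # 0.00198
print(round(p_rain_given_wet(), 4))    # 0.3577
```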

[1]:https://en.wikipedia.org/wiki/Bayesian_network
[2]:https://ipvs.informatik.uni-stuttgart.de/mlr/marc/teaching/15-MachineLearning/07-graphicalModels.pdf

## pomegranate for Bayesian Network in Python

In [2]:
# Build the Bayesian network (written against the pre-1.0 pomegranate API)

from pomegranate import *

rain = DiscreteDistribution({'T': 0.2, 'F': 0.8})
sprinkler=ConditionalProbabilityTable([['F','T',0.4],
                                      ['F','F',0.6],
                                      ['T','T',0.01],
                                      ['T','F',0.99]
                                     ],[rain])
grasswater=ConditionalProbabilityTable([['F','F','T',0.0],
                                        ['F','F','F',1.0],
                                        ['F','T','T',0.8],
                                        ['F','T','F',0.2],
                                        ['T','F','T',0.9],
                                        ['T','F','F',0.1],
                                        ['T','T','T',0.99],
                                        ['T','T','F',0.01]
                                     ],[sprinkler,rain])

s1 = Node(rain, name="rain")
s2 = Node(sprinkler, name="sprinkler")
s3 = Node(grasswater, name="grasswater")

model = BayesianNetwork("RSG")
model.add_states(s1, s2, s3)
model.add_edge(s1, s2)
model.add_edge(s1, s3)
model.add_edge(s2, s3)
model.bake()

In [3]:
# Calculate the joint probability P(rain=T, sprinkler=T, grasswater=T)
print(model.probability([['T', 'T', 'T']]))

0.0019800000000000013


In [4]:
# Predict the value of grasswater given that rain and sprinkler are both false
print(model.predict([['F','F',None]]))

[array(['F', 'F', 'F'], dtype=object)]


In [5]:
# Predict the probability distribution of sprinkler given that rain and grasswater are both true
observations = { 'rain' : 'T','grasswater' : 'T' }
beliefs = map( str, model.predict_proba( observations ) )
print("\n".join( "{}\t{}".format( state.name, belief ) for state, belief in zip( model.states, beliefs ) ))

rain	T
sprinkler	{
    "class" : "Distribution",
    "dtype" : "str",
    "name" : "DiscreteDistribution",
    "parameters" : [
        {
            "T" : 0.012345679012346221,
            "F" : 0.9876543209876538
        }
    ],
    "frozen" : false
}
grasswater	T
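
The sprinkler belief reported above can be checked by hand from the CPTs, since $P(R=T)$ cancels between the numerator and the denominator:

<center>$P(S=T|R=T,G=T)=\large \frac{P(G=T|S=T,R=T)P(S=T|R=T)}{\sum_{S\in\{T,F\}}P(G=T|S,R=T)P(S|R=T)}=\frac{0.99\times 0.01}{0.99\times 0.01+0.8\times 0.99}\approx 0.0123$</center>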


# [Other BN package](https://docs.pymc.io/en/v3/)

# [Another Example][1]


Watch [this video][2]

[1]:http://staff.utia.cas.cz/vomlel/mh-puzzle.html
[2]:https://www.youtube.com/watch?v=SkC8S3wuIfg