# Bayesian Networks

![Cricket_model](https://drive.google.com/file/d/1HlgBjBUDtbCKgyWZ0rGwNdlb50K_MHay/preview)

## Installing pgmpy

In [0]:
!pip install pgmpy
!pip install wrapt

Collecting pgmpy
  Downloading https://files.pythonhosted.org/packages/96/9c/4b1e07564d8160838d0472728746f3ea3725ced41e43ac05486a328ee78e/pgmpy-0.1.6.tar.gz (218kB)
Building wheels for collected packages: pgmpy
  Running setup.py bdist_wheel for pgmpy ... done
  Stored in directory: /content/.cache/pip/wheels/5e/5d/c5/81dd9fc173c4b56cc6f38b943d3d73b81f1096d67c52ae278a
Successfully built pgmpy
Installing collected packages: pgmpy
Successfully installed pgmpy-0.1.6
Collecting wrapt
  Downloading https://files.pythonhosted.org/packages/a0/47/66897906448185fcb77fc3c2b1bc20ed0ecca81a0f2f88eda3fc5a34fc3d/wrapt-1.10.11.tar.gz
Building wheels for collected packages: wrapt
  Running setup.py bdist_wheel for wrapt ... done
  Stored in directory: /content/.cache/pip/wheels/48/5d/04/22361a593e70d23b1f7746d932802efe1f0e523376a74f321e
Successfully built wrapt
Installing collected packages: wra

## Creating the Bayesian Model Object  
### Import pgmpy to create the Bayesian model object

In [0]:
from pgmpy.models import BayesianModel

### Specifying the structure of our model
To create a Bayesian model, specify the nodes (random variables) as directed edges, each written as a (parent, child) pair in the direction of influence.

In this playing_model example, **weather_outlook** ('WO') is a child node of both **humidity** ('H') and **wind** ('W'), and **playing_cricket** ('PC') is a child node of **weather_outlook**.

In [0]:
playing_model = BayesianModel([('H', 'WO'),('W', 'WO'), ('WO', 'PC')])

This creates the nodes and directed edges of the Bayesian network.
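
To make the structure concrete, here is a minimal plain-Python sketch, independent of pgmpy, that derives each node's parents from the same edge list. The helper name `parents_of` is hypothetical and exists only for this illustration.

```python
# Same edge list as passed to BayesianModel: (parent, child) pairs.
edges = [('H', 'WO'), ('W', 'WO'), ('WO', 'PC')]

def parents_of(edges):
    """Map every node to the sorted list of its parents (hypothetical helper)."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(parent, [])
        parents.setdefault(child, []).append(parent)
    return {node: sorted(ps) for node, ps in parents.items()}

parents = parents_of(edges)
# Root nodes are the ones without parents.
roots = sorted(node for node, ps in parents.items() if not ps)

print(parents)  # {'H': [], 'WO': ['H', 'W'], 'W': [], 'PC': ['WO']}
print(roots)    # ['H', 'W']
```

The root nodes 'H' and 'W' are the ones whose CPDs need no evidence; 'WO' and 'PC' need one CPD column per combination of parent states.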

### Specifying the conditional probabilities of each node (random variable)  
CPDs (conditional probability distributions) are specified with pgmpy's TabularCPD class.

In [0]:
from pgmpy.factors.discrete import TabularCPD

#### For root nodes
These nodes have no parents, so their CPDs take no evidence.


In [0]:
humidity_cpd = TabularCPD(variable='H',
                       variable_card=2,
                       values=[[.75, .25]])

This code specifies the CPD for node 'H' (Humidity).

Similarly, create the CPD for node 'W' (Wind).

In [0]:
## Exercise: create the CPD for wind
#wind_cpd = ?

## Solution, included so that the later cells can run
wind_cpd = TabularCPD(variable='W',
                      variable_card=2,
                      values=[[.4, .6]])

#### For nodes that are influenced by other nodes

These nodes take their parent nodes as evidence.

In [0]:
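# Rows are the states of 'WO'; columns are the (H, W) combinations
# in the order (H_0, W_0), (H_0, W_1), (H_1, W_0), (H_1, W_1).
weather_outlook_cpd = TabularCPD(variable='WO',
                                 variable_card=3,
                                 values=[[.8, .6, .3, .1],
                                         [.16, .25, .4, .2],
                                         [.04, .15, .3, .7]],
                                 evidence=['H', 'W'],
                                 evidence_card=[2, 2])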

Here, **evidence** specifies the nodes that influence the 'WO' node, and **evidence_card** specifies the number of states of each of those nodes. Each column of **values** is a distribution over 'WO' and must sum to 1.
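
Because every column of a tabular CPD is a probability distribution, a quick sanity check is to confirm that each column sums to 1. The following is a small plain-Python sketch of that check; `columns_sum_to_one` is a hypothetical helper, not part of pgmpy.

```python
import math

# Values copied from weather_outlook_cpd: rows are WO states,
# columns are (H, W) combinations in the order (0,0), (0,1), (1,0), (1,1).
wo_values = [[.8, .6, .3, .1],
             [.16, .25, .4, .2],
             [.04, .15, .3, .7]]

def columns_sum_to_one(values):
    """Return True if every column sums to 1 (hypothetical helper)."""
    return all(math.isclose(sum(col), 1.0) for col in zip(*values))

print(columns_sum_to_one(wo_values))  # True
```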

Similarly, create the CPD for node 'PC'.

In [0]:
## Exercise: create the CPD for 'PC', playing cricket
#playing_cricket_cpd = 

## Solution, included so that the later cells can run
playing_cricket_cpd = TabularCPD(variable='PC',
                                 variable_card=2,
                                 values=[[.2, .6, .95],
                                         [.8, .4, .05]],
                                 evidence=['WO'],
                                 evidence_card=[3])

Add the pre-defined CPDs using BayesianModel's add_cpds() method

In [0]:
playing_model.add_cpds(humidity_cpd, wind_cpd, weather_outlook_cpd, playing_cricket_cpd)

### To verify the CPD of any random variable, use the get_cpds() method


In [0]:
print(playing_model.get_cpds())


[<TabularCPD representing P(H:2) at 0x7f1bd1e46a10>, <TabularCPD representing P(W:2) at 0x7f1bd1e46fd0>, <TabularCPD representing P(WO:3 | H:2, W:2) at 0x7f1bd1e46f90>, <TabularCPD representing P(PC:2 | WO:3) at 0x7f1bd2074190>]


In [0]:
# Iterate over playing_model.get_cpds()
for cpd in playing_model.get_cpds():
    print("CPD of {variable}:".format(variable=cpd.variable))
    print(cpd)


CPD of H:
+-----+------+
| H_0 | 0.75 |
+-----+------+
| H_1 | 0.25 |
+-----+------+
CPD of W:
+-----+-----+
| W_0 | 0.4 |
+-----+-----+
| W_1 | 0.6 |
+-----+-----+
CPD of WO:
+------+------+------+-----+-----+
| H    | H_0  | H_0  | H_1 | H_1 |
+------+------+------+-----+-----+
| W    | W_0  | W_1  | W_0 | W_1 |
+------+------+------+-----+-----+
| WO_0 | 0.8  | 0.6  | 0.3 | 0.1 |
+------+------+------+-----+-----+
| WO_1 | 0.16 | 0.25 | 0.4 | 0.2 |
+------+------+------+-----+-----+
| WO_2 | 0.04 | 0.15 | 0.3 | 0.7 |
+------+------+------+-----+-----+
CPD of PC:
+------+------+------+------+
| WO   | WO_0 | WO_1 | WO_2 |
+------+------+------+------+
| PC_0 | 0.2  | 0.6  | 0.95 |
+------+------+------+------+
| PC_1 | 0.8  | 0.4  | 0.05 |
+------+------+------+------+


### Computation of Probabilities using evidence



The next step is to compute the probabilities of various nodes in the Bayesian model given some evidence. This lets us infer the distribution of one variable from the observed states of others.

#### Here we import the VariableElimination class to compute probabilities

In [0]:
from pgmpy.inference import VariableElimination

In [0]:
# our inference object
playing_cricket_infer = VariableElimination(playing_model)

The *query()* method of the inference object performs inference: pass the nodes to infer in **variables** and the observed nodes, along with their states, in **evidence**.
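
To see what such a query computes, here is a brute-force sketch in plain Python that sums the joint distribution over every assignment consistent with the evidence and then normalises. It is not how VariableElimination is implemented; on a network this small it simply yields the same posterior. The function `query` below and the dictionary layout are assumptions made for this illustration.

```python
from itertools import product

# CPDs copied from this notebook (state indices as in pgmpy).
P_H = {0: .75, 1: .25}
P_W = {0: .4, 1: .6}
P_WO = {  # P_WO[(h, w)][wo]
    (0, 0): [.8, .16, .04], (0, 1): [.6, .25, .15],
    (1, 0): [.3, .4, .3],   (1, 1): [.1, .2, .7],
}
P_PC = {0: [.2, .8], 1: [.6, .4], 2: [.95, .05]}  # P_PC[wo][pc]

def query(target, evidence=None):
    """Posterior over `target` by enumerating the joint (hypothetical helper)."""
    evidence = evidence or {}
    cards = {'H': 2, 'W': 2, 'WO': 3, 'PC': 2}
    totals = [0.0] * cards[target]
    for h, w, wo, pc in product(range(2), range(2), range(3), range(2)):
        world = {'H': h, 'W': w, 'WO': wo, 'PC': pc}
        if any(world[k] != v for k, v in evidence.items()):
            continue  # skip assignments that contradict the evidence
        totals[world[target]] += P_H[h] * P_W[w] * P_WO[(h, w)][wo] * P_PC[wo][pc]
    z = sum(totals)  # probability of the evidence
    return [t / z for t in totals]

print([round(p, 4) for p in query('PC', {'H': 0})])  # [0.3651, 0.6349]
print([round(p, 4) for p in query('PC')])            # [0.4531, 0.5469]
```

The printed values can be compared with pgmpy's output for the same queries.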

### Find the probability of playing Cricket given that Humidity is low

In [0]:
# Exercise: compute the probability of playing cricket when wind speed is low
#prob_playing_cricket_W0 = ?

# Example: probability of playing cricket when humidity is low (H = 0)
prob_playing_cricket_h0 = playing_cricket_infer.query(variables=['PC'], evidence={'H': 0})
print(prob_playing_cricket_h0['PC'])

+------+-----------+
| PC   |   phi(PC) |
|------+-----------|
| PC_0 |    0.3651 |
| PC_1 |    0.6349 |
+------+-----------+


### Without evidence

We can also run inference without any evidence. Let us try this to get the overall (marginal) probability of playing cricket in our model.

In [0]:
prob_playing_cricket = playing_cricket_infer.query(variables=['PC'])
print(prob_playing_cricket['PC'])

+------+-----------+
| PC   |   phi(PC) |
|------+-----------|
| PC_0 |    0.4531 |
| PC_1 |    0.5469 |
+------+-----------+


Voilà! PC_1 ≈ 0.55, so we can play cricket!

The inference we have done so far is called **causal reasoning**: we query the model top-down, from causes to effects.

### Evidential Reasoning
pgmpy also allows us to query the model bottom-up, from effects to causes, which is called *evidential reasoning*. For example:
#### What is the probability of the day being windy given that we played cricket?

In [0]:
prob_wind_pc1 = playing_cricket_infer.query(variables=['W'],evidence={'PC':1})
print(prob_wind_pc1['W'])

+-----+----------+
| W   |   phi(W) |
|-----+----------|
| W_0 |   0.4631 |
| W_1 |   0.5369 |
+-----+----------+


Therefore, given that we played cricket, the day is more likely to have been windy (W_1, 0.5369) than not (W_0, 0.4631), although this is below the prior P(W_1) = 0.6.
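
Evidential reasoning is an application of Bayes' rule: the posterior over the cause is proportional to the likelihood of the observed effect times the prior. A plain-Python sketch using the CPD values from this notebook follows; the helper name `likelihood_pc1_given_w` is hypothetical.

```python
# CPDs copied from this notebook.
P_H = [.75, .25]
P_W = [.4, .6]
P_WO = {(0, 0): [.8, .16, .04], (0, 1): [.6, .25, .15],
        (1, 0): [.3, .4, .3],   (1, 1): [.1, .2, .7]}  # P_WO[(h, w)][wo]
P_PC1 = [.8, .4, .05]                                   # P(PC=1 | WO=wo)

def likelihood_pc1_given_w(w):
    """P(PC=1 | W=w), summing out H and WO (hypothetical helper)."""
    return sum(P_H[h] * P_WO[(h, w)][wo] * P_PC1[wo]
               for h in range(2) for wo in range(3))

# Bayes' rule: posterior is proportional to likelihood times prior.
unnormalised = [likelihood_pc1_given_w(w) * P_W[w] for w in range(2)]
evidence_prob = sum(unnormalised)                 # P(PC=1)
posterior = [u / evidence_prob for u in unnormalised]

print(round(evidence_prob, 4))                    # 0.5469
print([round(p, 4) for p in posterior])           # [0.4631, 0.5369]
```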

#### What is the probability of low humidity given that it was a sunny day (*WO_1*)?

In [0]:
#prob_humidity_WO1 = ?
prob_humidity_wo1 = playing_cricket_infer.query(variables=['H'], evidence={'WO': 1})
print(prob_humidity_wo1['H'])

+-----+----------+
| H   |   phi(H) |
|-----+----------|
| H_0 |   0.6963 |
| H_1 |   0.3037 |
+-----+----------+


Hence, on a sunny day (WO_1), low humidity (H_0) is the more probable state at 0.6963, though that is below its prior of 0.75.

### Conditional Independence

Two events A and B are independent if:

$$P(A \cap B) = P(A)\,P(B)$$

and by extension

$$P(A \mid B) = P(A)$$

We can extend this to conditional independence. Two events A and B are conditionally independent given an event C with P(C)>0 if

$$P(A \cap B \mid C) = P(A \mid C)\,P(B \mid C)$$
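
Both conditional forms follow from the definition of conditional probability. A short derivation, assuming P(B) > 0 in the first line and P(B ∩ C) > 0 in the second:

```latex
\begin{aligned}
P(A \mid B) &= \frac{P(A \cap B)}{P(B)} = \frac{P(A)\,P(B)}{P(B)} = P(A) \\
P(A \mid B, C) &= \frac{P(A \cap B \mid C)}{P(B \mid C)}
                = \frac{P(A \mid C)\,P(B \mid C)}{P(B \mid C)} = P(A \mid C)
\end{aligned}
```

The second line is the form that can be checked by inference: once C is known, additionally conditioning on B leaves the distribution of A unchanged.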
 
 


#### Conditional Independence in Bayesian structure

Let us say we have 3 random variables X, Y and Z.

By definition, X and Y are conditionally independent given Z if, once Z is known, learning the value of X gives no additional information about Y, and vice versa.

In our Bayesian model, **Humidity** is independent of **Playing Cricket** given **Weather Outlook**, that is, 'H'⊥'PC'|'WO'. We shall verify this by inference.

First we infer the probability of playing cricket **'PC'** given Weather Outlook **'WO'** and Humidity **'H'**.


In [0]:
prob_playing_cricket_given_weather0_humidity0 = playing_cricket_infer.query(variables=['PC'], evidence={'H': 0, 'WO': 0})
print(prob_playing_cricket_given_weather0_humidity0['PC'])

+------+-----------+
| PC   |   phi(PC) |
|------+-----------|
| PC_0 |    0.2000 |
| PC_1 |    0.8000 |
+------+-----------+


Observe the results for PC_0 and PC_1 given the evidence 'H'=0 and 'WO'=0.

In [0]:
prob_playing_cricket_given_weather0_humidity1 = playing_cricket_infer.query(variables=['PC'], evidence={'H': 1, 'WO': 0})
print(prob_playing_cricket_given_weather0_humidity1['PC'])

+------+-----------+
| PC   |   phi(PC) |
|------+-----------|
| PC_0 |    0.2000 |
| PC_1 |    0.8000 |
+------+-----------+


Now look at the results for PC_0 and PC_1: they are the same for the evidence 'H'=1 and 'WO'=0, which means that **'PC'** is not influenced by **'H'** when **'WO'** is observed. In other words, 'PC' is independent of 'H' given 'WO'; mathematically, 'H'⊥'PC'|'WO'.
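
The check above used only WO = 0. As a minimal sketch, the same comparison can be repeated for every state of 'WO' by enumerating the joint distribution in plain Python; the helper `prob` is hypothetical, and the CPD values are copied from this notebook.

```python
from itertools import product

P_H = [.75, .25]
P_W = [.4, .6]
P_WO = {(0, 0): [.8, .16, .04], (0, 1): [.6, .25, .15],
        (1, 0): [.3, .4, .3],   (1, 1): [.1, .2, .7]}  # P_WO[(h, w)][wo]
P_PC = [[.2, .8], [.6, .4], [.95, .05]]                 # P_PC[wo][pc]

def prob(**fixed):
    """Probability that the variables in `fixed` take the given states (hypothetical helper)."""
    total = 0.0
    for h, w, wo, pc in product(range(2), range(2), range(3), range(2)):
        world = {'H': h, 'W': w, 'WO': wo, 'PC': pc}
        if all(world[k] == v for k, v in fixed.items()):
            total += P_H[h] * P_W[w] * P_WO[(h, w)][wo] * P_PC[wo][pc]
    return total

# P(PC=1 | H=h, WO=wo) for every combination of h and wo.
table = {(h, wo): prob(PC=1, H=h, WO=wo) / prob(H=h, WO=wo)
         for h in range(2) for wo in range(3)}

for wo in range(3):
    # Columns: WO state, value for H=0, value for H=1.
    print(wo, round(table[(0, wo)], 4), round(table[(1, wo)], 4))
```

Each printed row shows the same value for H = 0 and H = 1 (0.8, 0.4 and 0.05 for WO = 0, 1 and 2), which is exactly the column of the 'PC' CPD for that 'WO' state.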

### To get all independencies of our model

In [0]:
print("Independencies")
print(playing_model.get_independencies())

Independencies
(H _|_ W)
(H _|_ PC | WO)
(H _|_ PC | WO, W)
(W _|_ H)
(W _|_ PC | WO)
(W _|_ PC | WO, H)
(PC _|_ H, W | WO)
(PC _|_ W | WO, H)
(PC _|_ H | WO, W)


Now use playing_cricket_infer.query() to verify any one of these independencies yourself.

In [0]:
# Hint: run inference with different evidence; if the result does not change when you change the evidence, the two variables are independent


### Active Trails
An active trail is the opposite of an independence: a trail X1 --- X2 is active if influence can flow from X1 to X2 (i.e., they are dependent).  
To check whether a trail is active, use the **is_active_trail(start, end, observed=None)** function.

In [0]:
playing_model.is_active_trail('H', 'PC')

True

Check other trails by giving evidence in *observed*.

In [0]:
playing_model.is_active_trail('H', 'PC', observed='WO')

False

#### To find all active trails from a node
Use **active_trail_nodes(variables, observed=None)** to get a dictionary with the given variables as keys and, as values, all the nodes reachable from the respective variable.

In [0]:
## Use active_trail_nodes(variables, observed=None) with some nodes


## V-Structures
### Let us see how the conditional probabilities for V-structures work

In [0]:
# Let us observe whether Wind influences Humidity

prob_Humidity_given_Wind = playing_cricket_infer.query(variables=['H'],evidence={'W':0})
print(prob_Humidity_given_Wind['H'])

prob_Humidity_given_Wind = playing_cricket_infer.query(variables=['H'],evidence={'W':1})
print(prob_Humidity_given_Wind['H'])



+-----+----------+
| H   |   phi(H) |
|-----+----------|
| H_0 |   0.7500 |
| H_1 |   0.2500 |
+-----+----------+
+-----+----------+
| H   |   phi(H) |
|-----+----------|
| H_0 |   0.7500 |
| H_1 |   0.2500 |
+-----+----------+


We can see that the conditional probability of Humidity 'H' remains the same whatever the value of Wind 'W',
i.e., Humidity is independent of Wind (H⊥W).

### Let us see what happens when Weather Outlook 'WO' is observed

In [0]:
prob_Humidity_given_Wind = playing_cricket_infer.query(variables=['H'],evidence={'W':0,'WO':0})
print(prob_Humidity_given_Wind['H'])

prob_Humidity_given_Wind = playing_cricket_infer.query(variables=['H'],evidence={'W':1,'WO':0})
print(prob_Humidity_given_Wind['H'])

+-----+----------+
| H   |   phi(H) |
|-----+----------|
| H_0 |   0.8889 |
| H_1 |   0.1111 |
+-----+----------+
+-----+----------+
| H   |   phi(H) |
|-----+----------|
| H_0 |   0.9474 |
| H_1 |   0.0526 |
+-----+----------+


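We can see that once Weather Outlook 'WO' is observed (here WO = 0, the bleak state), Wind 'W' does influence Humidity 'H': P(H_0) is 0.8889 for W = 0 but 0.9474 for W = 1. Observing the common child of the V-structure H → WO ← W activates the trail between its two parents.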
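
The two posteriors above can be reproduced by hand with Bayes' rule. In this sketch the prior P(W = w) cancels out of the ratio, because 'H' and 'W' are independent before 'WO' is observed. The helper name `posterior_h0` is hypothetical; the numbers are copied from the CPDs in this notebook.

```python
P_H = [.75, .25]
# P(WO=0 | H=h, W=w), the first row of the 'WO' CPD.
P_WO0 = {(0, 0): .8, (0, 1): .6, (1, 0): .3, (1, 1): .1}

def posterior_h0(w):
    """P(H=0 | W=w, WO=0) via Bayes' rule (hypothetical helper)."""
    numer = P_H[0] * P_WO0[(0, w)]
    denom = sum(P_H[h] * P_WO0[(h, w)] for h in range(2))
    return numer / denom

print(round(posterior_h0(0), 4))  # 0.8889
print(round(posterior_h0(1), 4))  # 0.9474
```

The posterior depends on `w` only through the 'WO' CPD column that gets selected, which is why the dependence appears only once 'WO' is observed.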

**Question to solve:**

In [0]:
# Similarly, find what happens when Weather Outlook is favourable
#prob_Humidity_given_Wind=


## D-Separation

### Two nodes X and Y in a graph are said to be D-separated given Z if there is no active trail between X and Y when Z is observed, i.e. X ⊥ Y | Z
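
D-separation can also be decided from the graph alone, without any probabilities. The sketch below uses the standard moralised ancestral graph test in plain Python: keep the queried and observed nodes plus their ancestors, connect co-parents, drop edge directions, delete the observed nodes, and test reachability. The function `d_separated` is a hypothetical helper written for this illustration, not a pgmpy API.

```python
from itertools import combinations

edges = [('H', 'WO'), ('W', 'WO'), ('WO', 'PC')]

def d_separated(edges, x, y, given=()):
    """True if x and y are d-separated given `given` (hypothetical helper)."""
    parents = {}
    for p, c in edges:
        parents.setdefault(p, set())
        parents.setdefault(c, set()).add(p)
    # 1. Keep only x, y, the observed nodes and all of their ancestors.
    keep, stack = set(), [x, y, *given]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents[node])
    # 2. Moralise: undirected parent-child links plus links between co-parents.
    neighbours = {n: set() for n in keep}
    for child in keep:
        for p in parents[child]:
            neighbours[child].add(p)
            neighbours[p].add(child)
        for a, b in combinations(parents[child], 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    # 3. Remove observed nodes and check whether y is reachable from x.
    blocked, seen, stack = set(given), set(), [x]
    while stack:
        node = stack.pop()
        if node in seen or node in blocked:
            continue
        seen.add(node)
        stack.extend(neighbours[node])
    return y not in seen

print(d_separated(edges, 'H', 'PC'))          # False
print(d_separated(edges, 'H', 'PC', ['WO']))  # True
print(d_separated(edges, 'H', 'W'))           # True
print(d_separated(edges, 'H', 'W', ['WO']))   # False
```

A result of `True` here corresponds to `is_active_trail` returning `False` for the same nodes and observations.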

For example, let us check how Playing Cricket varies for different conditions of Humidity.

**a) When Wind is given**

In [0]:
# Let us check whether Humidity 'H' and Playing_Cricket 'PC' are D-separated given Wind 'W'
prob_Playing_Cricket_given_Wind = playing_cricket_infer.query(variables=['PC'],evidence={'H':0,'W':0})
print(prob_Playing_Cricket_given_Wind['PC'])

prob_Playing_Cricket_given_Wind = playing_cricket_infer.query(variables=['PC'],evidence={'H':1,'W':0})
print(prob_Playing_Cricket_given_Wind['PC'])

+------+-----------+
| PC   |   phi(PC) |
|------+-----------|
| PC_0 |    0.2940 |
| PC_1 |    0.7060 |
+------+-----------+
+------+-----------+
| PC   |   phi(PC) |
|------+-----------|
| PC_0 |    0.5850 |
| PC_1 |    0.4150 |
+------+-----------+


We can see that the **probability of playing cricket changes with Humidity**, i.e. **there is an active trail**; hence 'PC' and 'H' are not D-separated given 'W'.

**b) When Weather Outlook is given**

In [0]:
# Let us check whether Humidity 'H' and Playing_Cricket 'PC' are D-separated given Weather Outlook 'WO'
prob_Playing_Cricket_given_Wind = playing_cricket_infer.query(variables=['PC'],evidence={'H':0,'WO':0})
print(prob_Playing_Cricket_given_Wind['PC'])

prob_Playing_Cricket_given_Wind = playing_cricket_infer.query(variables=['PC'],evidence={'H':1,'WO':0})
print(prob_Playing_Cricket_given_Wind['PC'])

+------+-----------+
| PC   |   phi(PC) |
|------+-----------|
| PC_0 |    0.2000 |
| PC_1 |    0.8000 |
+------+-----------+
+------+-----------+
| PC   |   phi(PC) |
|------+-----------|
| PC_0 |    0.2000 |
| PC_1 |    0.8000 |
+------+-----------+


We can see that the **probability of playing cricket doesn't change with Humidity**, i.e. there is **no active trail**; hence the random variables 'PC' and 'H' are D-separated given Weather Outlook 'WO'.

**c) How Wind and Playing Cricket are related when Humidity is known**

In [0]:
# Find whether Wind 'W' and Playing Cricket 'PC' are D-separated given Humidity 'H'
#prob_Playing_Cricket_given_Humidity = ?

**Questions:**
 
 1) Is an active trail found in the above case?
 
 2) Does the probability of Playing Cricket change with Wind when Humidity is known?
 
 3) Does the probability of Playing Cricket depend on Wind when Weather Outlook is given?