# Bayesian Network

- Structured, graphical representation of probabilistic relationships between features

- Features are treated as random variables (RVs)

- We want to answer questions such as: `P(lung cancer=yes | smoking=no, positive X-ray=yes) = ?`

<img src="Bayesian_Network.png" width="600" height="600">

## Generally:
```
P(S, C, B, X, D) = P(D | S, C, B, X) P(S, C, B, X)

                 = P(D | S, C, B, X) P(X | S, C, B) P(S, C, B)

                 = P(D | S, C, B, X) P(X | S, C, B) P(B | S, C) P(S, C)

                 = P(D | S, C, B, X) P(X | S, C, B) P(B | S, C) P(C | S) P(S)
```
                    


```
P(S, C, B, X, D) = P(S) P(C | S) P(B | S, C) P(X | S, C, B) P(D | S, C, B, X)

```

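In a Bayesian network, each factor conditions only on the node's parents, so many of these terms simplify. For example, if B depends on C only through S, then `P(B | S, C) = P(B | S)`, which is written $B \perp C \mid S$ (B is conditionally independent of C given S).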
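The conditional-independence statement can be checked numerically on a toy joint distribution. The sketch below uses made-up probabilities for three binary variables S, C, B (they are illustrative assumptions, not values read from the figure) and builds the joint so that B depends on S only. It then confirms that `P(B | S, C)` and `P(B | S)` agree for every assignment.

```python
from itertools import product

# Hypothetical probabilities, chosen only for illustration.
p_s = {0: 0.7, 1: 0.3}                                   # P(S)
p_c_given_s = {0: {0: 0.99, 1: 0.01},                    # P(C | S=0)
               1: {0: 0.90, 1: 0.10}}                    # P(C | S=1)
p_b_given_s = {0: {0: 0.8, 1: 0.2},                      # P(B | S=0)
               1: {0: 0.5, 1: 0.5}}                      # P(B | S=1)

# Joint built so that B depends on S only: P(S, C, B) = P(S) P(C | S) P(B | S)
joint = {(s, c, b): p_s[s] * p_c_given_s[s][c] * p_b_given_s[s][b]
         for s, c, b in product((0, 1), repeat=3)}

NAMES = ("s", "c", "b")

def prob(**fixed):
    """Probability of a partial assignment, e.g. prob(s=1, b=0)."""
    return sum(p for assignment, p in joint.items()
               if all(assignment[NAMES.index(k)] == v for k, v in fixed.items()))

def b_given_s_c(b, s, c):
    return prob(s=s, c=c, b=b) / prob(s=s, c=c)

def b_given_s(b, s):
    return prob(s=s, b=b) / prob(s=s)

# Largest difference between P(B | S, C) and P(B | S) over all assignments
max_gap = max(abs(b_given_s_c(b, s, c) - b_given_s(b, s))
              for s, c, b in product((0, 1), repeat=3))
print(f"largest |P(B|S,C) - P(B|S)| = {max_gap:.2e}")
```

The gap is zero up to floating-point rounding, which is exactly what $B \perp C \mid S$ asserts. If `p_b_given_s` were replaced by a table that also depends on C, the gap would become visibly non-zero.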



## Bayesian Network for Student Data

- Probabilistic graphical models come in two major types: Bayesian networks and Markov networks

<img src="Student_Bayesian_Network.png" width="600" height="600">

## Notation:

- $g^0$: Good Grade
- $g^1$: Normal Grade
- $g^2$: Bad Grade

- $l^0$: Not Recommended
- $l^1$: Recommended

## Questions:

1- What is the probability that a student gets a good grade in this class? -> $P(G=g^0)$

Hint:

$$
\begin{aligned}
P(G=g^0) ={} & P(G=g^0 \mid D=d^0, I=i^0)\,P(D=d^0)\,P(I=i^0) \\
 & + P(G=g^0 \mid D=d^1, I=i^0)\,P(D=d^1)\,P(I=i^0) \\
 & + P(G=g^0 \mid D=d^0, I=i^1)\,P(D=d^0)\,P(I=i^1) \\
 & + P(G=g^0 \mid D=d^1, I=i^1)\,P(D=d^1)\,P(I=i^1)
\end{aligned}
$$
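As a worked instance of the hint, the sum can be evaluated with the CPD values used in the pgmpy code of this notebook. This assumes that state index 0/1 in the code corresponds to superscript 0/1 here, so that $P(D=d^0)=0.6$, $P(I=i^0)=0.7$, and the first row of the grade table is $P(G=g^0 \mid \cdot)$:

$$
\begin{aligned}
P(G=g^0) &= (0.3)(0.6)(0.7) + (0.05)(0.4)(0.7) + (0.9)(0.6)(0.3) + (0.5)(0.4)(0.3) \\
         &= 0.126 + 0.014 + 0.162 + 0.060 \\
         &= 0.362
\end{aligned}
$$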

2- $P(G | D= d^0, I = i^1)$

3- $P(G = g^0| I = i^1)$
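Questions 2 and 3 can also be answered directly from the conditional probability tables, without an inference library. The sketch below copies the CPD values from the pgmpy code in this notebook and assumes that index 0/1 matches superscript 0/1. Question 2 is a table lookup. Question 3 sums out D, using the fact that D and I have no connecting edge in this network, so $P(D \mid I) = P(D)$.

```python
# CPD values copied from the pgmpy cell of this notebook.
p_d = [0.6, 0.4]  # P(D = d^0), P(D = d^1)

# p_g[(i, d)] is the distribution over (g^0, g^1, g^2) given I = i, D = d
p_g = {
    (0, 0): [0.3, 0.4, 0.3],
    (0, 1): [0.05, 0.25, 0.7],
    (1, 0): [0.9, 0.08, 0.02],
    (1, 1): [0.5, 0.3, 0.2],
}

# Question 2: P(G | D = d^0, I = i^1) is one column of the grade CPD
q2 = p_g[(1, 0)]

# Question 3: P(G = g^0 | I = i^1) = sum_d P(G = g^0 | d, i^1) P(d)
q3 = sum(p_g[(1, d)][0] * p_d[d] for d in (0, 1))

print("P(G | d^0, i^1) =", q2)
print(f"P(g^0 | i^1) = {q3:.3f}")
```

Working from the tables keeps each step visible; for larger networks the same sums are what variable elimination automates.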

In [None]:
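# The network class has been renamed across pgmpy releases, so try each known name.
try:
    from pgmpy.models import DiscreteBayesianNetwork as BayesianModel
except ImportError:
    try:
        from pgmpy.models import BayesianNetwork as BayesianModel
    except ImportError:
        from pgmpy.models import BayesianModel
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# Edges: D -> G, I -> G, G -> L, I -> S
student_model = BayesianModel([('D', 'G'), ('I', 'G'), ('G', 'L'), ('I', 'S')])

# P(G | I, D): rows are g^0, g^1, g^2;
# columns are (i^0, d^0), (i^0, d^1), (i^1, d^0), (i^1, d^1)
grade_cpd = TabularCPD(variable='G', variable_card=3,
                       values=[[0.3, 0.05, 0.9, 0.5],
                               [0.4, 0.25, 0.08, 0.3],
                               [0.3, 0.7, 0.02, 0.2]],
                       evidence=['I', 'D'],
                       evidence_card=[2, 2])
# P(D) and P(I): one row per state, so the values form a column vector
difficulty_cpd = TabularCPD(variable='D', variable_card=2,
                            values=[[0.6], [0.4]])
intel_cpd = TabularCPD(variable='I', variable_card=2,
                       values=[[0.7], [0.3]])
# P(L | G): rows are l^0, l^1; columns are g^0, g^1, g^2
letter_cpd = TabularCPD(variable='L', variable_card=2,
                        values=[[0.1, 0.4, 0.99],
                                [0.9, 0.6, 0.01]],
                        evidence=['G'],
                        evidence_card=[3])
# P(S | I): rows are the two states of S; columns are i^0, i^1
sat_cpd = TabularCPD(variable='S', variable_card=2,
                     values=[[0.95, 0.2],
                             [0.05, 0.8]],
                     evidence=['I'],
                     evidence_card=[2])
student_model.add_cpds(grade_cpd, difficulty_cpd, intel_cpd, letter_cpd, sat_cpd)
student_model.check_model()  # raises an error if the CPDs are inconsistent with the graph

# Conditional independencies implied by the graph structure
print(student_model.get_independencies())

student_infer = VariableElimination(student_model)
# Recent pgmpy versions return the factor itself; very old ones returned a dict keyed by variable.
# Question 1: P(G)
prob_G = student_infer.query(variables=['G'])
print(prob_G)
# Question 2: P(G | D = d^0, I = i^1)
prob_G_given_i1_d0 = student_infer.query(variables=['G'], evidence={'I': 1, 'D': 0})
print(prob_G_given_i1_d0)
# Question 3: P(G | I = i^1)
prob_G_given_i1 = student_infer.query(variables=['G'], evidence={'I': 1})
print(prob_G_given_i1)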
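The independencies reported by `get_independencies()` can be sanity-checked by brute force on a network this small. The sketch below is a plain-Python check (not part of pgmpy) that rebuilds the joint over D, I, and G from the CPD values above, again assuming index 0/1 matches superscript 0/1. It shows that D and I are independent when nothing is observed, but become dependent once their common child G is observed.

```python
from itertools import product

p_d = [0.6, 0.4]  # P(D)
p_i = [0.7, 0.3]  # P(I)
# p_g[(i, d)] is the distribution over (g^0, g^1, g^2) given I = i, D = d
p_g = {
    (0, 0): [0.3, 0.4, 0.3],
    (0, 1): [0.05, 0.25, 0.7],
    (1, 0): [0.9, 0.08, 0.02],
    (1, 1): [0.5, 0.3, 0.2],
}

# Joint P(D, I, G) = P(D) P(I) P(G | I, D); L and S are summed out implicitly
joint = {(d, i, g): p_d[d] * p_i[i] * p_g[(i, d)][g]
         for d, i, g in product(range(2), range(2), range(3))}

def prob(d=None, i=None, g=None):
    """Sum the joint over every entry consistent with the given values."""
    return sum(p for (dd, ii, gg), p in joint.items()
               if d in (None, dd) and i in (None, ii) and g in (None, gg))

# With nothing observed, learning I does not change the belief about D
p_d1 = prob(d=1)
p_d1_given_i1 = prob(d=1, i=1) / prob(i=1)

# With G observed, learning I does change the belief about D
p_d1_given_g0 = prob(d=1, g=0) / prob(g=0)
p_d1_given_g0_i1 = prob(d=1, i=1, g=0) / prob(i=1, g=0)

print(f"P(d^1) = {p_d1:.4f},  P(d^1 | i^1) = {p_d1_given_i1:.4f}")
print(f"P(d^1 | g^0) = {p_d1_given_g0:.4f},  P(d^1 | g^0, i^1) = {p_d1_given_g0_i1:.4f}")
```

The first pair of numbers matches and the second does not, which is the "explaining away" pattern at a node with two parents.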

## Applications:

