# Bayesian Network

<div style="display: flex; align-items: center;">
    <img src="../imgs/BN.jpg" alt="Your Image" width="400" style="margin-right: 20px;">
    <div>
        <p>Before introducing Bayesian networks, we briefly introduce probabilistic graphical models (PGMs), which represent random variables and the probabilistic dependencies among them. PGMs fall into two main types:</p>
        <p>Directed graphical models, also known as Bayesian networks, where nodes (variables) are connected by directed edges that represent conditional dependencies between variables (often interpreted as causal relationships).</p>
        <p>Undirected graphical models, also known as Markov random fields, where nodes are connected by undirected edges that represent correlations between variables.</p>
        <p>This chapter focuses on Bayesian networks: how to learn their parameters and structure from a given dataset, and how to perform inference with a known model.</p>
    </div>
</div>
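
The directed structure is what keeps a Bayesian network compact: the joint distribution factorizes into one conditional distribution per node given its parents. As a reminder of the standard definition (the notation $\mathrm{Pa}(X_i)$ for the parents of node $X_i$ is introduced here for illustration and does not appear elsewhere in this chapter):

$$
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
$$

Parameter learning estimates each factor on the right-hand side, while structure learning chooses the parent sets.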

## Parameter Learning
- **Input**: a sample dataset and the Bayesian network structure

- **Output**: the (conditional) probability table for each node in the network

Two common methods for parameter learning are `Maximum Likelihood Estimation` and `Bayesian Estimation`.

In [39]:
import pandas as pd

# Toy dataset with four categorical variables, one row per sample.
data = pd.DataFrame(data={
    'course': ["math", "math", "math", "math", "stat", "stat", "stat", "stat", "comp", "comp", "comp", "comp"],
    'school': ["science", "science", "science", "engineering", "science", "science", "science", "engineering", "science", "science", "science", "engineering"],
    'pass': ["No", "No", "No", "Yes", "No", "No", "Yes", "Yes", "No", "Yes", "Yes", "Yes"],
    'letter': ["No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes", "No", "Yes", "No", "Yes"]
})

print(data)

   course       school pass letter
0    math      science   No     No
1    math      science   No     No
2    math      science   No    Yes
3    math  engineering  Yes    Yes
4    stat      science   No     No
5    stat      science   No     No
6    stat      science  Yes    Yes
7    stat  engineering  Yes    Yes
8    comp      science   No     No
9    comp      science  Yes    Yes
10   comp      science  Yes     No
11   comp  engineering  Yes    Yes


In [40]:
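from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.models import BayesianNetwork

# Network structure: course and school are the parents of pass; pass is the parent of letter.
model = BayesianNetwork([('course', 'pass'), ('school', 'pass'), ('pass', 'letter')])
mle = MaximumLikelihoodEstimator(model, data)
# course has no parents, so its CPD is the relative frequency of each state.
print(mle.estimate_cpd('course'))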

+--------------+----------+
| course(comp) | 0.333333 |
+--------------+----------+
| course(math) | 0.333333 |
+--------------+----------+
| course(stat) | 0.333333 |
+--------------+----------+


In [41]:
print(mle.estimate_cpd('school'))

+---------------------+------+
| school(engineering) | 0.25 |
+---------------------+------+
| school(science)     | 0.75 |
+---------------------+------+


In [56]:
print(mle.estimate_cpd('pass'))

+-----------+---------------------+--------------------+-----+---------------------+--------------------+
| course    | course(comp)        | course(comp)       | ... | course(stat)        | course(stat)       |
+-----------+---------------------+--------------------+-----+---------------------+--------------------+
| school    | school(engineering) | school(science)    | ... | school(engineering) | school(science)    |
+-----------+---------------------+--------------------+-----+---------------------+--------------------+
| pass(No)  | 0.0                 | 0.3333333333333333 | ... | 0.0                 | 0.6666666666666666 |
+-----------+---------------------+--------------------+-----+---------------------+--------------------+
| pass(Yes) | 1.0                 | 0.6666666666666666 | ... | 1.0                 | 0.3333333333333333 |
+-----------+---------------------+--------------------+-----+---------------------+--------------------+


In [43]:
print(mle.estimate_cpd('letter'))

+-------------+---------------------+---------------------+
| pass        | pass(No)            | pass(Yes)           |
+-------------+---------------------+---------------------+
| letter(No)  | 0.8333333333333334  | 0.16666666666666666 |
+-------------+---------------------+---------------------+
| letter(Yes) | 0.16666666666666666 | 0.8333333333333334  |
+-------------+---------------------+---------------------+
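
For discrete variables, maximum likelihood estimation reduces to counting: each entry of the table is the number of rows with a given child and parent state divided by the number of rows with that parent state. The following is a minimal standard-library sketch that reproduces the `letter` table above; the helper name `mle_cpd` is hypothetical and is not part of pgmpy.

```python
from collections import Counter
from fractions import Fraction

# The 'pass' and 'letter' columns of the dataset above, row by row.
pass_col = ["No", "No", "No", "Yes", "No", "No", "Yes", "Yes", "No", "Yes", "Yes", "Yes"]
letter_col = ["No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes", "No", "Yes", "No", "Yes"]


def mle_cpd(child, parent):
    """Return {(child_state, parent_state): P(child_state | parent_state)} by counting."""
    joint = Counter(zip(child, parent))  # N(child state, parent state)
    marginal = Counter(parent)           # N(parent state)
    return {(c, p): Fraction(n, marginal[p]) for (c, p), n in joint.items()}


cpd_letter = mle_cpd(letter_col, pass_col)
for (letter, passed), prob in sorted(cpd_letter.items()):
    print(f"P(letter={letter} | pass={passed}) = {prob} = {float(prob):.4f}")
```

Exact fractions make it easy to see that the printed 0.8333 and 0.1667 are 5/6 and 1/6.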


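The `Maximum Likelihood Estimation` method can overfit when the dataset is small: a parent configuration that is observed only a few times gets extreme estimates, such as the probabilities of 0.0 and 1.0 in the `pass` table above. `Bayesian Estimation` mitigates this by combining the observed counts with a prior.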

In [44]:
from pgmpy.estimators import BayesianEstimator

bayes = BayesianEstimator(model, data)
# BDeu prior: equivalent_sample_size pseudo-counts are spread uniformly over all cells of the CPD.
print(bayes.estimate_cpd('pass', prior_type='BDeu', equivalent_sample_size=10))

+-----------+---------------------+---------------------+-----+---------------------+---------------------+
| course    | course(comp)        | course(comp)        | ... | course(stat)        | course(stat)        |
+-----------+---------------------+---------------------+-----+---------------------+---------------------+
| school    | school(engineering) | school(science)     | ... | school(engineering) | school(science)     |
+-----------+---------------------+---------------------+-----+---------------------+---------------------+
| pass(No)  | 0.3125              | 0.39285714285714285 | ... | 0.3125              | 0.6071428571428571  |
+-----------+---------------------+---------------------+-----+---------------------+---------------------+
| pass(Yes) | 0.6875              | 0.6071428571428571  | ... | 0.6875              | 0.39285714285714285 |
+-----------+---------------------+---------------------+-----+---------------------+---------------------+
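
The smoothed values above can be checked by hand. Under a BDeu prior, each cell of the table receives the same pseudo-count, equal to the equivalent sample size divided by the number of cells, and the estimate is the smoothed relative frequency. The sketch below assumes this standard formula; the function name `bdeu_estimate` and its parameters are hypothetical and are not pgmpy's API.

```python
from fractions import Fraction


def bdeu_estimate(count, parent_count, ess, n_parent_configs, n_states):
    """Smoothed estimate of P(state | parent configuration) under a BDeu prior.

    count        -- rows with this state and this parent configuration
    parent_count -- rows with this parent configuration
    ess          -- equivalent sample size (total number of pseudo-counts)
    """
    alpha = Fraction(ess, n_parent_configs * n_states)  # pseudo-count per cell
    return (count + alpha) / (parent_count + n_states * alpha)


# 'pass' has 2 states and 3 * 2 = 6 parent configurations (course x school).
# course=comp, school=engineering: 1 row, 0 of them with pass=No.
p_no_comp_eng = bdeu_estimate(0, 1, ess=10, n_parent_configs=6, n_states=2)
# course=comp, school=science: 3 rows, 1 of them with pass=No.
p_no_comp_sci = bdeu_estimate(1, 3, ess=10, n_parent_configs=6, n_states=2)

print(p_no_comp_eng, float(p_no_comp_eng))  # 5/16 0.3125
print(p_no_comp_sci, float(p_no_comp_sci))  # 11/28, about 0.3929
```

With only one observed row, the estimate for `pass(No)` moves from 0.0 to 0.3125, which shows how strongly the prior dominates when data are scarce.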


## Structure Learning
- **Input**: a sample dataset

- **Output**: the Bayesian network structure

Structure learning here is score-based. We first define a scoring function that measures how well a candidate Bayesian network fits the training data. We then search for the network structure that achieves the best score.
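
To make the idea of a scoring function concrete, here is a minimal standard-library sketch of a BIC-style score: the maximized log-likelihood of the data minus a penalty that grows with the number of free parameters. It assumes the textbook form of BIC; pgmpy's `BicScore` may differ in detail, and the helper names `local_bic` and `bic` are hypothetical.

```python
import math
from collections import Counter

columns = {
    "course": ["math"] * 4 + ["stat"] * 4 + ["comp"] * 4,
    "school": ["science", "science", "science", "engineering"] * 3,
    "pass": ["No", "No", "No", "Yes", "No", "No", "Yes", "Yes", "No", "Yes", "Yes", "Yes"],
    "letter": ["No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes", "No", "Yes", "No", "Yes"],
}
n_rows = len(columns["course"])


def local_bic(child, parents):
    """BIC contribution of one node: log-likelihood minus a complexity penalty."""
    # One tuple of parent states per row; the empty tuple if there are no parents.
    parent_rows = list(zip(*(columns[p] for p in parents))) if parents else [()] * n_rows
    joint = Counter(zip(parent_rows, columns[child]))
    parent_counts = Counter(parent_rows)
    log_lik = sum(n * math.log(n / parent_counts[pa]) for (pa, _), n in joint.items())
    # Free parameters: (child states - 1) for every parent configuration.
    n_child_states = len(set(columns[child]))
    n_parent_configs = math.prod(len(set(columns[p])) for p in parents)
    penalty = 0.5 * math.log(n_rows) * n_parent_configs * (n_child_states - 1)
    return log_lik - penalty


def bic(structure):
    """Score a whole network given as {child: [parents]}; the score is a sum over nodes."""
    return sum(local_bic(child, parents) for child, parents in structure.items())


empty = {"course": [], "school": [], "pass": [], "letter": []}
with_edge = {"course": [], "school": [], "pass": [], "letter": ["pass"]}
print(bic(empty), bic(with_edge))
```

Because the score is a sum of per-node terms, adding the edge `pass -> letter` only changes the term for `letter`; on this dataset the gain in log-likelihood outweighs the extra penalty, so the structure with the edge scores higher.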

In [45]:
import pandas as pd
from pgmpy.estimators import HillClimbSearch
from pgmpy.estimators import BicScore


data = pd.DataFrame(data={
    'course': ["math", "math", "math", "math", "stat", "stat", "stat", "stat", "comp", "comp", "comp", "comp"],
    'school': ["science", "science", "science", "engineering", "science", "science", "science", "engineering", "science", "science", "science", "engineering"],
    'pass': ["No", "No", "No", "Yes", "No", "No", "Yes", "Yes", "No", "Yes", "Yes", "Yes"],
    'letter': ["No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes", "No", "Yes", "No", "Yes"]
})

hc = HillClimbSearch(data)
# Greedy hill-climbing search over structures, scored with BIC.
best_model = hc.estimate(scoring_method=BicScore(data))
# NOTE: this second call re-runs the search with the library's default scoring
# method and overwrites the BIC result, so the edges printed below come from this run.
best_model = hc.estimate()
print(best_model.edges())

[('school', 'pass'), ('school', 'course'), ('pass', 'letter'), ('pass', 'course'), ('letter', 'course')]


## Inference

In [74]:
from pgmpy.models import BayesianNetwork
from pgmpy.inference import VariableElimination
from pgmpy.factors.discrete import TabularCPD

model = BayesianNetwork([('course', 'pass'), ('school', 'pass'), ('pass', 'letter')])

# States are referred to by index. The values below (taken from the MLE tables
# above) imply the following encoding:
#   course: 0 = math, 1 = stat, 2 = comp
#   school: 0 = science, 1 = engineering
#   pass:   0 = Yes, 1 = No
#   letter: 0 = Yes, 1 = No
cpd_course = TabularCPD(variable='course', variable_card=3, values=[[1/3], [1/3], [1/3]])
cpd_school = TabularCPD(variable='school', variable_card=2, values=[[0.75], [0.25]])
# Columns run over (course, school) with school varying fastest:
# (0,0), (0,1), (1,0), (1,1), (2,0), (2,1)
cpd_pass = TabularCPD(variable='pass', variable_card=2,
                      values=[[0.0, 1.0, 1/3, 1.0, 2/3, 1.0],   # pass = Yes
                              [1.0, 0.0, 2/3, 0.0, 1/3, 0.0]],  # pass = No
                      evidence=['course', 'school'], evidence_card=[3, 2])
# Columns: pass = Yes, pass = No
cpd_letter = TabularCPD(variable='letter', variable_card=2,
                        values=[[5/6, 1/6],   # letter = Yes
                                [1/6, 5/6]],  # letter = No
                        evidence=['pass'], evidence_card=[2])

model.add_cpds(cpd_course, cpd_school, cpd_pass, cpd_letter)

infer = VariableElimination(model)
# Posterior over course and over school, given the evidence pass = Yes (state 0).
query_1 = infer.query(variables=['course'], evidence={'pass': 0})
query_2 = infer.query(variables=['school'], evidence={'pass': 0})
print(query_1)
print(query_2)

+-----------+---------------+
| course    |   phi(course) |
+===========+===============+
| course(0) |        0.1667 |
+-----------+---------------+
| course(1) |        0.3333 |
+-----------+---------------+
| course(2) |        0.5000 |
+-----------+---------------+
+-----------+---------------+
| school    |   phi(school) |
+===========+===============+
| school(0) |        0.5000 |
+-----------+---------------+
| school(1) |        0.5000 |
+-----------+---------------+
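
The two posteriors above can be cross-checked with Bayes' rule by enumerating the joint distribution directly. The sketch below is brute-force enumeration, not variable elimination itself (which computes the same quantities more efficiently on larger networks). It uses only the standard library, and the helper name `posterior` is hypothetical.

```python
from itertools import product

# Same numbers as the TabularCPD definitions above, indexed by state number.
p_course = [1/3, 1/3, 1/3]
p_school = [0.75, 0.25]
# p_pass[pass_state][course][school]
p_pass = [
    [[0.0, 1.0], [1/3, 1.0], [2/3, 1.0]],  # pass = 0
    [[1.0, 0.0], [2/3, 0.0], [1/3, 0.0]],  # pass = 1
]
# 'letter' is a descendant of 'pass' and is not observed, so it sums out to 1
# and can be left out of the computation.


def posterior(query, pass_state):
    """Return P(query | pass = pass_state), where query is 'course' or 'school'."""
    size = len(p_course) if query == "course" else len(p_school)
    unnormalized = [0.0] * size
    for c, s in product(range(len(p_course)), range(len(p_school))):
        joint = p_course[c] * p_school[s] * p_pass[pass_state][c][s]
        unnormalized[c if query == "course" else s] += joint
    total = sum(unnormalized)  # equals P(pass = pass_state)
    return [u / total for u in unnormalized]


print(posterior("course", 0))  # approximately [0.1667, 0.3333, 0.5]
print(posterior("school", 0))  # approximately [0.5, 0.5]
```

The results agree with the `VariableElimination` output: although only a quarter of the samples belong to school state 1, that state always leads to `pass = 0`, so after observing `pass = 0` both schools are equally likely.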
