# Bayesian Networks in Python

I will build a Bayesian (Belief) Network for the Alarm example in the textbook using the Python library [pgmpy](https://pgmpy.org/).


In [83]:
! pip install -q pgmpy

## Defining the Bayesian Network

![The Alarm Bayes Network](Alarm_BN.png)

In [84]:
# pgmpy currently uses a pandas feature that will be deprecated in the future.
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

import pandas as pd

from pgmpy.models import BayesianNetwork
from pgmpy.inference import VariableElimination

model = BayesianNetwork(
    [
        ("Burglary", "Alarm"),
        ("Earthquake", "Alarm"),
        ("Alarm", "JohnCalls"),
        ("Alarm", "MaryCalls"),
    ]
)

# Defining the parameters using CPT
from pgmpy.factors.discrete import TabularCPD

cpd_burglary = TabularCPD(
    variable="Burglary", variable_card=2, values=[[0.999], [0.001]]
)
cpd_earthquake = TabularCPD(
    variable="Earthquake", variable_card=2, values=[[0.998], [0.002]]
)
cpd_alarm = TabularCPD(
    variable="Alarm",
    variable_card=2,
    values=[[0.999, 0.71, 0.06, 0.05], [0.001, 0.29, 0.94, 0.95]],
    evidence=["Burglary", "Earthquake"],
    evidence_card=[2, 2],
)
cpd_johncalls = TabularCPD(
    variable="JohnCalls",
    variable_card=2,
    values=[[0.95, 0.1], [0.05, 0.9]],
    evidence=["Alarm"],
    evidence_card=[2],
)
cpd_marycalls = TabularCPD(
    variable="MaryCalls",
    variable_card=2,
    values=[[0.99, 0.3], [0.01, 0.7]],
    evidence=["Alarm"],
    evidence_card=[2],
)

# Associating the parameters with the model structure
model.add_cpds(
    cpd_burglary, cpd_earthquake, cpd_alarm, cpd_johncalls, cpd_marycalls
)

In [85]:
model.get_independencies()

(Burglary ⟂ Earthquake)
(Burglary ⟂ MaryCalls, JohnCalls | Alarm)
(Burglary ⟂ MaryCalls, JohnCalls | Alarm, Earthquake)
(Burglary ⟂ JohnCalls | Alarm, MaryCalls)
(Burglary ⟂ MaryCalls | Alarm, JohnCalls)
(Burglary ⟂ JohnCalls | Alarm, Earthquake, MaryCalls)
(Burglary ⟂ MaryCalls | Alarm, Earthquake, JohnCalls)
(Earthquake ⟂ Burglary)
(Earthquake ⟂ MaryCalls, JohnCalls | Alarm)
(Earthquake ⟂ JohnCalls | Alarm, MaryCalls)
(Earthquake ⟂ MaryCalls, JohnCalls | Alarm, Burglary)
(Earthquake ⟂ MaryCalls | Alarm, JohnCalls)
(Earthquake ⟂ JohnCalls | Alarm, MaryCalls, Burglary)
(Earthquake ⟂ MaryCalls | Alarm, Burglary, JohnCalls)
(MaryCalls ⟂ Earthquake, Burglary, JohnCalls | Alarm)
(MaryCalls ⟂ Burglary, JohnCalls | Alarm, Earthquake)
(MaryCalls ⟂ Earthquake, JohnCalls | Alarm, Burglary)
(MaryCalls ⟂ Earthquake, Burglary | Alarm, JohnCalls)
(MaryCalls ⟂ JohnCalls | Alarm, Earthquake, Burglary)
(MaryCalls ⟂ Burglary | Alarm, Earthquake, JohnCalls)
(MaryCalls ⟂ Earthquake | Alarm, Burglary, Joh

See: [pgmpy: Bayesian Networks](https://pgmpy.org/models/bayesiannetwork.html)
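The independence statements listed above can be checked numerically. The following is a minimal from-scratch sketch that does not use `pgmpy`: it copies the CPT values from the model definition, builds the joint distribution with the chain rule, and compares both sides of two statements. The helper names (`bern`, `joint`, `prob`) are my own and purely illustrative.

```python
from itertools import product

# P(X=1 | parents), copied from the TabularCPD definitions in this notebook.
P_B, P_E = 0.001, 0.002
P_A = {(0, 0): 0.001, (0, 1): 0.29, (1, 0): 0.94, (1, 1): 0.95}  # keyed by (B, E)
P_J = {0: 0.05, 1: 0.90}  # keyed by Alarm
P_M = {0: 0.01, 1: 0.70}  # keyed by Alarm

NAMES = ("Burglary", "Earthquake", "Alarm", "JohnCalls", "MaryCalls")


def bern(p_true, x):
    """Probability of a binary outcome x when P(x=1) = p_true."""
    return p_true if x == 1 else 1.0 - p_true


def joint(w):
    """Chain-rule joint probability of one complete assignment w (a dict)."""
    return (bern(P_B, w["Burglary"]) * bern(P_E, w["Earthquake"])
            * bern(P_A[(w["Burglary"], w["Earthquake"])], w["Alarm"])
            * bern(P_J[w["Alarm"]], w["JohnCalls"])
            * bern(P_M[w["Alarm"]], w["MaryCalls"]))


def prob(fixed):
    """Sum the joint over all assignments consistent with `fixed`."""
    total = 0.0
    for values in product((0, 1), repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if all(world[var] == val for var, val in fixed.items()):
            total += joint(world)
    return total


# (Burglary ⟂ Earthquake): P(B, E) should equal P(B) P(E).
marginal_gap = abs(prob({"Burglary": 1, "Earthquake": 1})
                   - prob({"Burglary": 1}) * prob({"Earthquake": 1}))

# (Burglary ⟂ JohnCalls | Alarm): P(B, J | A) should equal P(B | A) P(J | A).
p_a = prob({"Alarm": 1})
cond_gap = abs(prob({"Burglary": 1, "JohnCalls": 1, "Alarm": 1}) / p_a
               - (prob({"Burglary": 1, "Alarm": 1}) / p_a)
               * (prob({"JohnCalls": 1, "Alarm": 1}) / p_a))

# Without conditioning on Alarm, Burglary and JohnCalls are dependent.
uncond_gap = abs(prob({"Burglary": 1, "JohnCalls": 1})
                 - prob({"Burglary": 1}) * prob({"JohnCalls": 1}))

print(marginal_gap, cond_gap, uncond_gap)
```

The first two gaps should vanish up to floating-point error, while the third is clearly non-zero: `JohnCalls` carries information about `Burglary` unless `Alarm` is already known.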

# Approximate Inference: Sample from the Network

See: [pgmpy Approximate Inference Using Sampling](https://pgmpy.org/approx_infer/approx_infer.html)

Here we call the sampling methods directly. A more convenient interface is provided by `model.simulate()`, which automatically chooses the correct sampling method.

In [86]:
from pgmpy.factors.discrete import State
from pgmpy.sampling import BayesianModelSampling

inference = BayesianModelSampling(model)

## Prior sampling

Prior sampling draws a complete event from the network.
An event is an assignment of a value to every variable in the network. Prior sampling is called `forward_sample` in `pgmpy`.

In [87]:
inference.forward_sample(size=10)


Generating for node: MaryCalls: 100%|██████████| 5/5 [00:00<00:00, 465.20it/s]


Unnamed: 0,Burglary,Alarm,Earthquake,JohnCalls,MaryCalls
0,0,0,0,0,0
1,0,0,0,0,0
2,0,0,0,0,0
3,0,0,0,0,0
4,0,0,0,0,0
5,0,0,0,0,0
6,0,0,0,0,0
7,0,0,0,0,0
8,0,0,0,0,0
9,0,0,0,0,0


Convenient version:
`model.simulate(n_samples=10)`
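To make the idea concrete, here is a minimal from-scratch sketch of prior sampling that does not use `pgmpy`. It visits the variables in an order where parents come before children and draws each one from its CPT entry. The CPT values are copied from the model definition; the function name `forward_sample` mirrors the `pgmpy` method name but is my own implementation, not the library's.

```python
import random

# P(X=1 | parents), copied from the TabularCPD definitions in this notebook.
P_B, P_E = 0.001, 0.002
P_A = {(0, 0): 0.001, (0, 1): 0.29, (1, 0): 0.94, (1, 1): 0.95}  # keyed by (B, E)
P_J = {0: 0.05, 1: 0.90}  # keyed by Alarm
P_M = {0: 0.01, 1: 0.70}  # keyed by Alarm


def forward_sample(rng):
    """Draw one complete event, sampling parents before their children."""
    b = int(rng.random() < P_B)
    e = int(rng.random() < P_E)
    a = int(rng.random() < P_A[(b, e)])
    j = int(rng.random() < P_J[a])
    m = int(rng.random() < P_M[a])
    return {"Burglary": b, "Earthquake": e, "Alarm": a,
            "JohnCalls": j, "MaryCalls": m}


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [forward_sample(rng) for _ in range(100_000)]
alarm_rate = sum(s["Alarm"] for s in samples) / len(samples)
print(f"Estimated P(Alarm=1) = {alarm_rate:.4f}")
```

The exact value of $P(A=true)$ for these CPTs is about 0.0025, which is why a batch of only 10 prior samples typically contains no alarm at all.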

## Sampling with evidence

### Rejection sampling
Rejection sampling draws samples from the network and discards those that are not consistent with the evidence.

Fixing `Burglary` is easy since it is an unconditional node.

In [88]:

evidence = [State(var='Burglary', state=1)]
inference.rejection_sample(evidence = evidence, size = 10)


100%|██████████| 10/10 [00:00<00:00, 119.99it/s]


Unnamed: 0,Burglary,Alarm,Earthquake,JohnCalls,MaryCalls
0,1,1,0,1,0
1,1,1,0,1,1
2,1,1,0,1,1
3,1,1,0,1,1
4,1,1,0,1,1
5,1,1,0,1,0
6,1,1,0,1,1
7,1,1,0,0,1
8,1,1,0,1,1
9,1,1,0,1,0


Convenient version `model.simulate(n_samples = 10, evidence = {'Burglary': 1})`
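The following from-scratch sketch shows the basic rejection scheme without `pgmpy`: draw prior samples and keep only those that agree with the evidence. It is a naive illustration under my own naming (`rejection_sample`, `n_drawn`), not a description of how the library implements it.

```python
import random

# P(X=1 | parents), copied from the TabularCPD definitions in this notebook.
P_B, P_E = 0.001, 0.002
P_A = {(0, 0): 0.001, (0, 1): 0.29, (1, 0): 0.94, (1, 1): 0.95}  # keyed by (B, E)
P_J = {0: 0.05, 1: 0.90}  # keyed by Alarm
P_M = {0: 0.01, 1: 0.70}  # keyed by Alarm


def forward_sample(rng):
    """Draw one complete event from the prior."""
    b = int(rng.random() < P_B)
    e = int(rng.random() < P_E)
    a = int(rng.random() < P_A[(b, e)])
    j = int(rng.random() < P_J[a])
    m = int(rng.random() < P_M[a])
    return {"Burglary": b, "Earthquake": e, "Alarm": a,
            "JohnCalls": j, "MaryCalls": m}


def rejection_sample(evidence, size, rng):
    """Keep drawing prior samples until `size` of them agree with `evidence`."""
    accepted, n_drawn = [], 0
    while len(accepted) < size:
        n_drawn += 1
        sample = forward_sample(rng)
        if all(sample[var] == val for var, val in evidence.items()):
            accepted.append(sample)
    return accepted, n_drawn


rng = random.Random(0)  # fixed seed so the run is repeatable
accepted, n_drawn = rejection_sample({"Burglary": 1}, size=100, rng=rng)
alarm_given_burglary = sum(s["Alarm"] for s in accepted) / len(accepted)
print(f"{n_drawn} prior samples drawn for {len(accepted)} accepted")
print(f"Estimated P(Alarm=1 | Burglary=1) = {alarm_given_burglary:.2f}")
```

Because $P(B=true) = 0.001$, this naive version needs on the order of a thousand prior samples per accepted one, which illustrates the main cost of rejection sampling when the evidence is rare.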

### Importance sampling
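Sampling with a given value for `Alarm` is more difficult since it depends on `Burglary` and `Earthquake`. We use likelihood weighting here, a form of importance sampling: the evidence variable is fixed to its observed value and each sample is weighted by the probability of that evidence given the sampled parents (the `_weight` column in the output).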

In [89]:
evidence = [State(var='Alarm', state=1)]
inference.likelihood_weighted_sample(evidence = evidence, size = 10)


Generating for node: MaryCalls: 100%|██████████| 5/5 [00:00<00:00, 338.41it/s]


Unnamed: 0,Burglary,Alarm,Earthquake,JohnCalls,MaryCalls,_weight
0,0,1,0,1,1,0.001
1,0,1,0,1,1,0.001
2,0,1,0,1,1,0.001
3,0,1,0,1,1,0.001
4,0,1,0,1,1,0.001
5,0,1,0,1,1,0.001
6,0,1,0,0,1,0.001
7,0,1,0,1,1,0.001
8,0,1,0,1,1,0.001
9,0,1,0,1,1,0.001
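The weights in the output can be understood with a small from-scratch sketch of likelihood weighting that does not use `pgmpy`. Non-evidence variables are sampled as in prior sampling, `Alarm` is clamped to the evidence value, and the sample weight is the CPT entry for that evidence given the sampled parents. The function name `weighted_sample` and the estimate at the end are my own illustration, not the library's code.

```python
import random

# P(X=1 | parents), copied from the TabularCPD definitions in this notebook.
P_B, P_E = 0.001, 0.002
P_A = {(0, 0): 0.001, (0, 1): 0.29, (1, 0): 0.94, (1, 1): 0.95}  # keyed by (B, E)
P_J = {0: 0.05, 1: 0.90}  # keyed by Alarm
P_M = {0: 0.01, 1: 0.70}  # keyed by Alarm


def weighted_sample(rng, alarm=1):
    """Draw one event with Alarm clamped to `alarm`; return (event, weight)."""
    b = int(rng.random() < P_B)
    e = int(rng.random() < P_E)
    # Alarm is evidence: it is not sampled, it only contributes to the weight.
    p_alarm_true = P_A[(b, e)]
    weight = p_alarm_true if alarm == 1 else 1.0 - p_alarm_true
    j = int(rng.random() < P_J[alarm])
    m = int(rng.random() < P_M[alarm])
    event = {"Burglary": b, "Earthquake": e, "Alarm": alarm,
             "JohnCalls": j, "MaryCalls": m}
    return event, weight


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [weighted_sample(rng) for _ in range(200_000)]
total_weight = sum(w for _, w in draws)
burglary_given_alarm = (
    sum(w for event, w in draws if event["Burglary"] == 1) / total_weight
)
print(f"Estimated P(Burglary=1 | Alarm=1) = {burglary_given_alarm:.3f}")
```

Most draws have `Burglary=0` and `Earthquake=0`, so they carry the weight $P(A=true \mid B=false, E=false) = 0.001$ seen in the table above. The rare draws with a burglary or an earthquake carry much larger weights, which is why many samples are needed before the weighted estimate settles near the exact value of about 0.37.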


## Gibbs Sampling

It looks like `pgmpy` does not implement Gibbs sampling with evidence.

In [90]:
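from pgmpy.sampling import GibbsSampling

# The evidence is defined for comparison with the cells above but is not used:
# the sampler below is run without evidence.
evidence = [State(var='Alarm', state=1)]
gibbs_chain = GibbsSampling(model)
gibbs_chain.sample(size=10)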


100%|██████████| 9/9 [00:00<00:00, 2024.82it/s]


Unnamed: 0,Burglary,Alarm,Earthquake,JohnCalls,MaryCalls
0,1,0,1,0,0
1,0,0,0,0,0
2,0,0,0,0,0
3,0,0,0,0,0
4,0,0,0,0,0
5,0,0,0,0,0
6,0,0,0,0,0
7,0,0,0,0,0
8,0,0,0,1,0
9,0,0,0,0,0
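For comparison, Gibbs sampling with evidence can be sketched from scratch in a few lines, without `pgmpy`. Evidence variables stay clamped, and every other variable is resampled in turn from its distribution given the current values of all the others, which is proportional to the joint probability. The names `gibbs_sample` and `burn_in` are my own choices, and the burn-in length is an arbitrary assumption.

```python
import random

# P(X=1 | parents), copied from the TabularCPD definitions in this notebook.
P_B, P_E = 0.001, 0.002
P_A = {(0, 0): 0.001, (0, 1): 0.29, (1, 0): 0.94, (1, 1): 0.95}  # keyed by (B, E)
P_J = {0: 0.05, 1: 0.90}  # keyed by Alarm
P_M = {0: 0.01, 1: 0.70}  # keyed by Alarm

NAMES = ("Burglary", "Earthquake", "Alarm", "JohnCalls", "MaryCalls")


def bern(p_true, x):
    """Probability of a binary outcome x when P(x=1) = p_true."""
    return p_true if x == 1 else 1.0 - p_true


def joint(s):
    """Chain-rule joint probability of one complete assignment s (a dict)."""
    return (bern(P_B, s["Burglary"]) * bern(P_E, s["Earthquake"])
            * bern(P_A[(s["Burglary"], s["Earthquake"])], s["Alarm"])
            * bern(P_J[s["Alarm"]], s["JohnCalls"])
            * bern(P_M[s["Alarm"]], s["MaryCalls"]))


def gibbs_sample(evidence, size, rng, burn_in=1_000):
    """Return `size` states from a Gibbs chain with `evidence` held fixed."""
    state = {name: evidence.get(name, 0) for name in NAMES}
    free = [name for name in NAMES if name not in evidence]
    samples = []
    for sweep in range(burn_in + size):
        for name in free:
            # P(name=1 | all other variables) is proportional to the joint.
            state[name] = 1
            p1 = joint(state)
            state[name] = 0
            p0 = joint(state)
            state[name] = int(rng.random() < p1 / (p0 + p1))
        if sweep >= burn_in:
            samples.append(dict(state))
    return samples


rng = random.Random(0)  # fixed seed so the run is repeatable
gibbs_samples = gibbs_sample({"Alarm": 1}, size=20_000, rng=rng)
burglary_given_alarm = sum(s["Burglary"] for s in gibbs_samples) / len(gibbs_samples)
print(f"Estimated P(Burglary=1 | Alarm=1) = {burglary_given_alarm:.3f}")
```

Evaluating the full joint for each update is wasteful, since only the factors in the variable's Markov blanket change, but it keeps the sketch short and is cheap for a five-node network.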


# Estimating Probabilities

The library provides functions to calculate/estimate probabilities.

## Joint probability

The following calculates $P(B=false, E=false, A=true, J=true, M=false)$,
$P(B=true)$, $P(E=true)$, and $P(B=true, E=true)$.

In [91]:
[model.get_state_probability({'Burglary': 0,
                              'Earthquake': 0,
                              "Alarm": 1,
                              "JohnCalls": 1,
                              "MaryCalls": 0}
                             ),
 model.get_state_probability({'Burglary': 1}),
 model.get_state_probability({'Earthquake': 1}),
 model.get_state_probability({'Burglary': 1, 'Earthquake': 1})]

[0.00026919053999999995, 0.001, 0.002, 2e-06]
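The first value can be reproduced by hand with the chain rule for Bayesian networks, where each variable is conditioned only on its parents. The factors are read directly from the CPTs defined at the top of the notebook:

$$
\begin{aligned}
P(B{=}false, E{=}false, A{=}true, J{=}true, M{=}false)
&= P(B{=}false)\,P(E{=}false)\,P(A{=}true \mid B{=}false, E{=}false) \\
&\quad \times P(J{=}true \mid A{=}true)\,P(M{=}false \mid A{=}true) \\
&= 0.999 \times 0.998 \times 0.001 \times 0.9 \times 0.3 \\
&\approx 0.000269
\end{aligned}
$$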

In [92]:
# B and E are independent, so P(B=true AND E=true) = P(B=true) * P(E=true)
0.001 * 0.002

2e-06

## Conditional probabilities given evidence

What is the chance of an `Earthquake` if the `Alarm` goes off, $P(E | A)$? The result below for `E=true` is `Earthquake_1`: 0.23.

In [93]:
model.predict_probability(pd.DataFrame([{'Alarm': 1}]))

Unnamed: 0,Earthquake_0,Earthquake_1,MaryCalls_0,MaryCalls_1,Burglary_0,Burglary_1,JohnCalls_0,JohnCalls_1
0,0.768991,0.231009,0.3,0.7,0.626449,0.373551,0.1,0.9


What is the chance of an ongoing `Burglary` if both neighbors call, $P(B | J, M)$? The result below for `B=true` is `Burglary_1`: 0.28.

In [94]:
model.predict_probability(pd.DataFrame([{'JohnCalls': 1, 'MaryCalls': 1}]))

Unnamed: 0,Alarm_0,Alarm_1,Earthquake_0,Earthquake_1,Burglary_0,Burglary_1
0,0.239308,0.760692,0.823933,0.176067,0.715828,0.284172
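Both conditional probabilities can be cross-checked with exact inference by enumeration. The sketch below does not use `pgmpy`: it sums the chain-rule joint over all assignments consistent with the evidence and normalizes. The CPT values are copied from the model definition, and the helper names (`joint`, `query`) are my own.

```python
from itertools import product

# P(X=1 | parents), copied from the TabularCPD definitions in this notebook.
P_B, P_E = 0.001, 0.002
P_A = {(0, 0): 0.001, (0, 1): 0.29, (1, 0): 0.94, (1, 1): 0.95}  # keyed by (B, E)
P_J = {0: 0.05, 1: 0.90}  # keyed by Alarm
P_M = {0: 0.01, 1: 0.70}  # keyed by Alarm

NAMES = ("Burglary", "Earthquake", "Alarm", "JohnCalls", "MaryCalls")


def bern(p_true, x):
    """Probability of a binary outcome x when P(x=1) = p_true."""
    return p_true if x == 1 else 1.0 - p_true


def joint(w):
    """Chain-rule joint probability of one complete assignment w (a dict)."""
    return (bern(P_B, w["Burglary"]) * bern(P_E, w["Earthquake"])
            * bern(P_A[(w["Burglary"], w["Earthquake"])], w["Alarm"])
            * bern(P_J[w["Alarm"]], w["JohnCalls"])
            * bern(P_M[w["Alarm"]], w["MaryCalls"]))


def query(target, evidence):
    """Return [P(target=0 | evidence), P(target=1 | evidence)]."""
    totals = [0.0, 0.0]
    for values in product((0, 1), repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if all(world[var] == val for var, val in evidence.items()):
            totals[world[target]] += joint(world)
    norm = sum(totals)  # equals P(evidence)
    return [t / norm for t in totals]


p_earthquake = query("Earthquake", {"Alarm": 1})
p_burglary = query("Burglary", {"JohnCalls": 1, "MaryCalls": 1})
print(f"P(E=true | A=true)         = {p_earthquake[1]:.6f}")
print(f"P(B=true | J=true, M=true) = {p_burglary[1]:.6f}")
```

Enumeration is exponential in the number of variables, so it is only practical for small networks like this one, but it gives an independent check on the `predict_probability` results above.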


## Learning Bayes Networks from Data

`pgmpy` provides a `fit` function to learn the model parameters (the CPTs) from data for a given network structure.
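As a rough illustration of what parameter learning involves, the sketch below estimates one CPT by maximum likelihood, which for fully observed discrete data reduces to relative frequencies. It does not use `pgmpy` and is not a description of how `fit` is implemented. The data set is a hypothetical toy example made up for this illustration, and `estimate_cpt` is my own helper name.

```python
from collections import Counter

# Hypothetical toy observations of (Alarm, JohnCalls), made up for illustration.
records = [
    (1, 1), (1, 1), (1, 1), (1, 0),
    (0, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 1),
]


def estimate_cpt(records):
    """Return {parent value: estimated P(child=1 | parent)} by counting."""
    parent_counts = Counter(parent for parent, _ in records)
    pair_counts = Counter(records)
    return {
        parent: pair_counts[(parent, 1)] / n
        for parent, n in parent_counts.items()
    }


cpt_johncalls = estimate_cpt(records)
print(cpt_johncalls)
```

With so few records the estimates are far from the CPT used in this notebook, and a parent value that never occurs in the data gets no estimate at all, which is one reason smoothed or Bayesian estimators are often preferred in practice.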