# Bayesian Networks in Python

This tutorial shows how to build and query Bayesian networks in Python with the `pgmpy` library.

See documentation at:
- https://pgmpy.org/
- https://pgmpy.org/models/bayesiannetwork.html
- https://pgmpy.org/factors/discrete.html
- https://pgmpy.org/exact_infer/ve.html

## Implementation in Python using `pgmpy`

Let's start by installing the `pgmpy` library.

In [1]:
!pip install -q pgmpy networkx matplotlib plotly daft


Import the required libraries.

In [2]:
import numpy as np
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
import networkx as nx
import matplotlib.pyplot as plt

Create the Bayesian network:

In [3]:
# Define the network structure
model = BayesianNetwork([
    ('TrainStrike', 'JohannaLate'),
    ('TrainStrike', 'PeterLate'),
    ('PetersAlarmFails', 'PeterLate'),
])

statenames = {
    'TrainStrike': ['True', 'False'],
    'PeterLate': ['True', 'False'],
    'JohannaLate': ['True', 'False'],
    'PetersAlarmFails': ['True', 'False']
}

# Define the probability distribution for 'TrainStrike' (P(TrainStrike))
cpd_trainStrike = TabularCPD(variable='TrainStrike', variable_card=2,
                             state_names=statenames,
                             values=[[0.05], [0.95]])

# Define the conditional probability distribution for 'JohannaLate' (P(JohannaLate | TrainStrike))
cpd_johannaLate = TabularCPD(variable='JohannaLate', variable_card=2,
                             state_names=statenames,
                             evidence=['TrainStrike'], evidence_card=[2],
                             values=[[0.5, 0.02],
                                     [0.5, 0.98]])

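# Define the conditional probability distribution for 'PeterLate' (P(PeterLate | TrainStrike, PetersAlarmFails))
# Columns follow the evidence order, with the last evidence variable changing fastest:
# (TrainStrike, PetersAlarmFails) = (True, True), (True, False), (False, True), (False, False)
cpd_peterLate = TabularCPD(variable='PeterLate', variable_card=2,
                           state_names=statenames,
                           evidence=['TrainStrike', 'PetersAlarmFails'], evidence_card=[2, 2],
                           values=[[0.7, 0.3, 0.4, 0.01],
                                   [0.3, 0.7, 0.6, 0.99]])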

# Define the conditional probability distribution for 'PetersAlarmFails' (P(PetersAlarmFails))
cpd_petersAlarmFails = TabularCPD(variable='PetersAlarmFails', variable_card=2,
                                  state_names=statenames,
                                  values=[[0.1],[0.9]])

# Add the probability distributions to the model
model.add_cpds(cpd_trainStrike, cpd_johannaLate, cpd_peterLate, cpd_petersAlarmFails)
# Validate that the CPDs are consistent with the network structure
assert model.check_model()

print(cpd_trainStrike)
print(cpd_johannaLate)
print(cpd_peterLate)
print(cpd_petersAlarmFails)




+--------------------+------+
| TrainStrike(True)  | 0.05 |
+--------------------+------+
| TrainStrike(False) | 0.95 |
+--------------------+------+
+--------------------+-------------------+--------------------+
| TrainStrike        | TrainStrike(True) | TrainStrike(False) |
+--------------------+-------------------+--------------------+
| JohannaLate(True)  | 0.5               | 0.02               |
+--------------------+-------------------+--------------------+
| JohannaLate(False) | 0.5               | 0.98               |
+--------------------+-------------------+--------------------+
+------------------+-----+-------------------------+
| TrainStrike      | ... | TrainStrike(False)      |
+------------------+-----+-------------------------+
| PetersAlarmFails | ... | PetersAlarmFails(False) |
+------------------+-----+-------------------------+
| PeterLate(True)  | ... | 0.01                    |
+------------------+-----+-------------------------+
| PeterLate(False) | ... | 0.99                    |
+------------------+-----+-------------------------+
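
Because the printed table for `PeterLate` is abbreviated with `...`, it can help to spell out which parent combination each column of `values` belongs to. The following is a minimal sketch in plain Python (no `pgmpy` needed); the names `states`, `evidence`, `p_late` and `columns` are illustrative and not part of the notebook. It assumes the column layout in which the last evidence variable changes fastest, which is consistent with the inference results in this notebook.

```python
from itertools import product

# Parent variables in the order passed as `evidence` to the PeterLate CPD.
evidence = ["TrainStrike", "PetersAlarmFails"]
states = ["True", "False"]

# First row of the `values` matrix: P(PeterLate=True | parents).
p_late = [0.7, 0.3, 0.4, 0.01]

# product() varies the last variable fastest, mirroring the assumed column layout.
columns = list(product(states, repeat=len(evidence)))

for (strike, alarm_fails), p in zip(columns, p_late):
    print(f"TrainStrike={strike:<5}  PetersAlarmFails={alarm_fails:<5}  "
          f"P(PeterLate=True)={p}")
```

Read this way, the value `0.3` is the probability that Peter is late during a train strike when his alarm works.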

A) Probability that Johanna arrives late to work

In [4]:
from pgmpy.inference import VariableElimination
inference = VariableElimination(model)

print(inference.query(variables=['JohannaLate']))

+--------------------+--------------------+
| JohannaLate        |   phi(JohannaLate) |
+====================+====================+
| JohannaLate(True)  |             0.0440 |
+--------------------+--------------------+
| JohannaLate(False) |             0.9560 |
+--------------------+--------------------+
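
As a hand check, this marginal follows from the law of total probability over `TrainStrike`. The sketch below uses plain Python with the numbers from the CPDs above; the variable names are illustrative.

```python
# P(TrainStrike) and P(JohannaLate=True | TrainStrike), copied from the CPDs above.
p_strike = {"True": 0.05, "False": 0.95}
p_johanna_late_given_strike = {"True": 0.5, "False": 0.02}

# P(JohannaLate=True) = sum over s of P(TrainStrike=s) * P(JohannaLate=True | TrainStrike=s)
marginal = sum(p_strike[s] * p_johanna_late_given_strike[s] for s in p_strike)

print(round(marginal, 4))  # 0.044
```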


B) Probability that Peter arrives late to work

In [5]:
print(inference.query(variables=['PeterLate']))

+------------------+------------------+
| PeterLate        |   phi(PeterLate) |
+==================+==================+
| PeterLate(True)  |           0.0635 |
+------------------+------------------+
| PeterLate(False) |           0.9365 |
+------------------+------------------+
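
`PeterLate` has two parents, so the same check sums over all four parent combinations. This is a sketch with illustrative names, using `True`/`False` booleans instead of the string state names.

```python
from itertools import product

# Priors of the two parents and P(PeterLate=True | TrainStrike, PetersAlarmFails).
p_strike = {True: 0.05, False: 0.95}
p_alarm_fails = {True: 0.1, False: 0.9}
p_late_given = {
    (True, True): 0.7, (True, False): 0.3,
    (False, True): 0.4, (False, False): 0.01,
}

# The parents are independent a priori, so their joint is the product of the priors.
p_peter_late = sum(
    p_strike[s] * p_alarm_fails[a] * p_late_given[(s, a)]
    for s, a in product([True, False], repeat=2)
)

print(f"{p_peter_late:.5f}")
```

The exact value is 0.06355, which the four-decimal table above shows as 0.0635.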


C) What is the probability that Johanna arrives late to work if Peter's alarm clock does not work?

In [7]:
inference = VariableElimination(model)
evidence = {'PetersAlarmFails': 'True'}
print(inference.query(variables=['JohannaLate'], evidence=evidence))

+--------------------+--------------------+
| JohannaLate        |   phi(JohannaLate) |
+====================+====================+
| JohannaLate(True)  |             0.0440 |
+--------------------+--------------------+
| JohannaLate(False) |             0.9560 |
+--------------------+--------------------+
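
The result equals the unconditional probability from A). In this network, the only path between `PetersAlarmFails` and `JohannaLate` runs through the common effect `PeterLate`, and as long as `PeterLate` is unobserved, the alarm carries no information about Johanna. The sketch below checks this by brute-force enumeration of the joint distribution in plain Python. The helper names `joint` and `prob_johanna_late` are hypothetical, and this is not how `pgmpy` computes the query internally.

```python
from itertools import product

# CPT numbers copied from the model definition; states as Python booleans.
P_STRIKE = {True: 0.05, False: 0.95}
P_ALARM_FAILS = {True: 0.1, False: 0.9}
P_JOHANNA_LATE = {True: 0.5, False: 0.02}  # keyed by TrainStrike
P_PETER_LATE = {                           # keyed by (TrainStrike, PetersAlarmFails)
    (True, True): 0.7, (True, False): 0.3,
    (False, True): 0.4, (False, False): 0.01,
}


def joint(strike, alarm, johanna, peter):
    """Joint probability of one full assignment (chain rule over the network)."""
    p_j = P_JOHANNA_LATE[strike] if johanna else 1 - P_JOHANNA_LATE[strike]
    p_p = P_PETER_LATE[(strike, alarm)] if peter else 1 - P_PETER_LATE[(strike, alarm)]
    return P_STRIKE[strike] * P_ALARM_FAILS[alarm] * p_j * p_p


def prob_johanna_late(**evidence):
    """P(JohannaLate=True | evidence), summing the joint over all assignments."""
    numerator = denominator = 0.0
    for strike, alarm, johanna, peter in product([True, False], repeat=4):
        world = {"strike": strike, "alarm": alarm, "johanna": johanna, "peter": peter}
        if any(world[name] != value for name, value in evidence.items()):
            continue  # assignment contradicts the evidence
        p = joint(strike, alarm, johanna, peter)
        denominator += p
        if johanna:
            numerator += p
    return numerator / denominator


print(f"{prob_johanna_late():.4f}")            # no evidence
print(f"{prob_johanna_late(alarm=True):.4f}")  # Peter's alarm fails
```

Both calls print the same value, matching the tables for A) and C).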


D) What is the probability that Johanna arrives late to work if Peter arrives late to work?

In [8]:
inference = VariableElimination(model)
evidence = {'PeterLate': 'True'}
print(inference.query(variables=['JohannaLate'], evidence=evidence))

+--------------------+--------------------+
| JohannaLate        |   phi(JohannaLate) |
+====================+====================+
| JohannaLate(True)  |             0.1484 |
+--------------------+--------------------+
| JohannaLate(False) |             0.8516 |
+--------------------+--------------------+
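
Observing that Peter is late makes a train strike more likely, which in turn raises Johanna's probability of being late. A rough two-step check with Bayes' rule is sketched below. The names are illustrative and the numbers come from the CPDs above.

```python
p_strike = 0.05
p_alarm_fails = 0.1
p_johanna_late = {True: 0.5, False: 0.02}  # keyed by TrainStrike
p_peter_late = {                           # keyed by (TrainStrike, PetersAlarmFails)
    (True, True): 0.7, (True, False): 0.3,
    (False, True): 0.4, (False, False): 0.01,
}


def peter_late_given_strike(strike):
    """P(PeterLate=True | TrainStrike), with PetersAlarmFails summed out."""
    return (p_alarm_fails * p_peter_late[(strike, True)]
            + (1 - p_alarm_fails) * p_peter_late[(strike, False)])


# Step 1: posterior of a strike given that Peter is late (Bayes' rule).
weight_strike = p_strike * peter_late_given_strike(True)
weight_no_strike = (1 - p_strike) * peter_late_given_strike(False)
posterior_strike = weight_strike / (weight_strike + weight_no_strike)

# Step 2: Johanna depends on Peter only through TrainStrike.
p_johanna_late_given_peter_late = (
    posterior_strike * p_johanna_late[True]
    + (1 - posterior_strike) * p_johanna_late[False]
)

print(f"P(TrainStrike | PeterLate)  = {posterior_strike:.4f}")
print(f"P(JohannaLate | PeterLate)  = {p_johanna_late_given_peter_late:.4f}")
```

The strike probability rises from the prior 0.05 to roughly 0.27, which accounts for the 0.1484 in the table.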


E) What is the probability that Johanna arrives late to work if Peter arrives late to work even though Peter's alarm clock works?

In [9]:
inference = VariableElimination(model)
evidence = {'PeterLate': 'True', 'PetersAlarmFails': 'False' }
print(inference.query(variables=['JohannaLate'], evidence=evidence))

+--------------------+--------------------+
| JohannaLate        |   phi(JohannaLate) |
+====================+====================+
| JohannaLate(True)  |             0.3139 |
+--------------------+--------------------+
| JohannaLate(False) |             0.6861 |
+--------------------+--------------------+
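
Once `PeterLate` is observed, the state of the alarm does matter for Johanna: a working alarm rules out one explanation for Peter's lateness and shifts the blame to a train strike (often called "explaining away"). The sketch below compares both alarm states by hand. The function name is hypothetical, and the prior of `PetersAlarmFails` is left out because it cancels once the alarm state is observed.

```python
P_STRIKE = 0.05
P_JOHANNA_LATE = {True: 0.5, False: 0.02}  # keyed by TrainStrike
P_PETER_LATE = {                           # keyed by (TrainStrike, PetersAlarmFails)
    (True, True): 0.7, (True, False): 0.3,
    (False, True): 0.4, (False, False): 0.01,
}


def johanna_late_given_peter_late(alarm_fails):
    """Return (P(TrainStrike | evidence), P(JohannaLate | evidence)) for
    evidence PeterLate=True and the given PetersAlarmFails state."""
    weight_strike = P_STRIKE * P_PETER_LATE[(True, alarm_fails)]
    weight_no_strike = (1 - P_STRIKE) * P_PETER_LATE[(False, alarm_fails)]
    posterior_strike = weight_strike / (weight_strike + weight_no_strike)
    p_johanna = (posterior_strike * P_JOHANNA_LATE[True]
                 + (1 - posterior_strike) * P_JOHANNA_LATE[False])
    return posterior_strike, p_johanna


for alarm_fails in (False, True):
    posterior_strike, p_johanna = johanna_late_given_peter_late(alarm_fails)
    print(f"PetersAlarmFails={alarm_fails!s:<5}  "
          f"P(TrainStrike)={posterior_strike:.4f}  P(JohannaLate)={p_johanna:.4f}")
```

With a working alarm the strike posterior is about 0.61, giving the 0.3139 shown above. With a failed alarm it drops to about 0.08, and Johanna's probability of being late falls to about 0.06.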
