# UE09: Bayesian Networks Example, Exercise 2

## Implementation in Python using `pgmpy`.

Let's start by installing the `pgmpy` library.

In [None]:
!pip install -q pgmpy networkx matplotlib plotly daft

Import the required libraries.

In [None]:
import numpy as np
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
import networkx as nx
import matplotlib.pyplot as plt

## (a) Building the Bayesian network (structure + probabilities)
Create the Bayesian network:

In [None]:
# Define the network structure.
# Each tuple in the list is one directed edge (parent, child); the model derives its nodes from these edges.
model = BayesianNetwork([
    ('PetersAlarmFails', 'PeterLate'),
    ('TrainStrike', 'PeterLate'),
    ('TrainStrike', 'JohannaLate')
    ])

# "statenames" is a dictionary that maps each node to the list of its state names (every node here is binary: 'True' / 'False')
statenames = {
    'TrainStrike': ['True', 'False'],
    'PetersAlarmFails': ['True', 'False'],
    'PeterLate': ['True', 'False'],
    'JohannaLate': ['True', 'False'],
}

# render the bayesian network graph (shown as last output)
model.to_daft().render()

model.add_cpds(

    # Define the probability distribution for 'TrainStrike' --> P(TrainStrike)
    TabularCPD(
      variable='TrainStrike',
      variable_card=2,
      state_names=statenames,
      values=[[0.05], [0.95]]
    ),

    # Define the probability distribution for 'PetersAlarmFails' --> P(PetersAlarmFails)
    # CPD = conditional probability distribution; TabularCPD stores it as a table (CPT)
    TabularCPD(
      variable='PetersAlarmFails',
      variable_card=2,
      state_names=statenames,
      values=[[0.1], [0.9]]
    ),

    # Define the conditional probability distribution for 'JohannaLate' --> P(JohannaLate | TrainStrike)
    TabularCPD(
      variable='JohannaLate',
      variable_card=2,
      state_names=statenames,
      evidence=['TrainStrike'],
      evidence_card=[2],
      values=[[0.5, 0.02],
              [0.5, 0.98]]
    ),

    # Define the conditional probability distribution for 'PeterLate' --> P(PeterLate | PetersAlarmFails, TrainStrike)
    # Column order: the evidence list is read left to right, and the LAST evidence variable changes fastest:
    # --> (PetersAlarmFails, TrainStrike), (PetersAlarmFails, NOT(TrainStrike)), (NOT(PetersAlarmFails), TrainStrike), (NOT(PetersAlarmFails), NOT(TrainStrike))
    # Row 1 holds P(PeterLate | ...), row 2 holds P(NOT(PeterLate) | ...) for the same four columns
    TabularCPD(
      variable='PeterLate',
      variable_card=2,
      state_names=statenames,
      evidence=['PetersAlarmFails', 'TrainStrike'],
      evidence_card=[2, 2],
      values=[[0.7, 0.4, 0.3, 0.01],
              [0.3, 0.6, 0.7, 0.99]]
    )
)

for cpd in model.get_cpds():
  print(cpd, '\n\n')
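The column order of the `PeterLate` table is easy to get wrong, so here is a minimal sketch in plain Python (no `pgmpy` needed) that labels each column of the first row. It assumes the layout described in the comment above the `PeterLate` CPD, with the last evidence variable changing fastest; the names `evidence_states`, `p_late_row`, and `columns` are illustrative only.

```python
from itertools import product

# Evidence variables and their states, in the same order as `evidence=[...]`.
evidence_states = {
    "PetersAlarmFails": ["True", "False"],
    "TrainStrike": ["True", "False"],
}

# First row of the PeterLate table: P(PeterLate=True | evidence states of that column).
p_late_row = [0.7, 0.4, 0.3, 0.01]

# product() varies the last variable fastest, matching the column layout assumed above.
columns = list(product(*evidence_states.values()))

for (alarm_fails, strike), p in zip(columns, p_late_row):
    print(f"P(PeterLate=True | PetersAlarmFails={alarm_fails}, TrainStrike={strike}) = {p}")
```

Reading the output against the exercise text is a quick way to confirm that each number sits in the column it was meant for.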


## (b) What are the probabilities that Johanna and Peter, respectively, arrive late for work?


- According to the query result below, P(JohannaLate) ≈ 4.4% and P(NOT(JohannaLate)) ≈ 95.6%

- According to the query result below, P(PeterLate) ≈ 6.4% and P(NOT(PeterLate)) ≈ 93.6%

Compute the probabilities of JohannaLate being TRUE or FALSE.

In [None]:
from pgmpy.inference import VariableElimination
inference = VariableElimination(model)

In [None]:
print(inference.query(variables=['JohannaLate']))

Compute the probabilities of PeterLate being TRUE or FALSE.

In [None]:
print(inference.query(variables=['PeterLate']))
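As a cross-check on the two marginals, the law of total probability can be applied by hand to the numbers from the CPDs in (a). The following is a small sketch in plain Python rather than the way `pgmpy` computes the result internally; the variable and helper names are illustrative.

```python
# Priors of the two root nodes, taken from the CPDs in (a).
p_strike = 0.05
p_alarm_fails = 0.10

# P(JohannaLate=True | TrainStrike), keyed by the strike state.
p_johanna_late_given = {True: 0.5, False: 0.02}

# P(PeterLate=True | PetersAlarmFails, TrainStrike), keyed by (alarm_fails, strike).
p_peter_late_given = {
    (True, True): 0.7,
    (True, False): 0.4,
    (False, True): 0.3,
    (False, False): 0.01,
}


def prob(p_true, state):
    """Probability of a binary state, given the probability of True."""
    return p_true if state else 1.0 - p_true


# Sum out TrainStrike for Johanna.
p_johanna_late = sum(
    prob(p_strike, s) * p_johanna_late_given[s] for s in (True, False)
)

# Sum out both parents for Peter.
p_peter_late = sum(
    prob(p_alarm_fails, a) * prob(p_strike, s) * p_peter_late_given[(a, s)]
    for a in (True, False)
    for s in (True, False)
)

print(f"P(JohannaLate) = {p_johanna_late:.3f}")  # 0.044
print(f"P(PeterLate)   = {p_peter_late:.5f}")    # 0.06355
```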

## (c) What is the probability that Johanna arrives late for work if Peter's alarm clock fails?

- According to the result below, P(JohannaLate | PetersAlarmFails) stays at about 4.4%, i.e. the alarm failure alone tells us nothing about Johanna

In [None]:
evidence = {'PetersAlarmFails': 'True'}
print(evidence, '\n\n', inference.query(variables=['JohannaLate'], evidence=evidence))
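The unchanged value can also be derived by hand. The only connection between `PetersAlarmFails` and `JohannaLate` runs through `PeterLate`, which is a common effect and is not observed here, so the alarm carries no information about the train strike. Using the shorthand $J$ = JohannaLate, $A$ = PetersAlarmFails and $t$ ranging over the states of TrainStrike (symbols introduced here for illustration):

$$
P(J \mid A) \;=\; \sum_{t} P(J \mid t, A)\, P(t \mid A) \;=\; \sum_{t} P(J \mid t)\, P(t) \;=\; 0.5 \cdot 0.05 + 0.02 \cdot 0.95 \;=\; 0.044
$$

The second step uses $P(J \mid t, A) = P(J \mid t)$, because `JohannaLate` depends only on its parent, and $P(t \mid A) = P(t)$, because the two root nodes are independent.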

## (d) What is the probability that Johanna arrives late for work if Peter arrives late for work?

- According to the result below, P(JohannaLate | PeterLate) ≈ 15%

In [None]:
evidence = {'PeterLate': 'True'}
print(evidence, '\n\n', inference.query(variables=['JohannaLate'], evidence=evidence))
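This value can be reproduced with Bayes' rule by summing over the train strike. With the shorthand $J$ = JohannaLate, $L$ = PeterLate, $A$ = PetersAlarmFails and $T$ = TrainStrike (symbols introduced here for illustration), first marginalize the alarm out of Peter's CPD:

$$
P(L \mid T) = 0.7 \cdot 0.1 + 0.3 \cdot 0.9 = 0.34, \qquad
P(L \mid \neg T) = 0.4 \cdot 0.1 + 0.01 \cdot 0.9 = 0.049
$$

Given the state of the train strike, $J$ and $L$ are independent, so:

$$
P(J \mid L) \;=\; \frac{\sum_{t} P(J \mid t)\, P(L \mid t)\, P(t)}{\sum_{t} P(L \mid t)\, P(t)}
\;=\; \frac{0.5 \cdot 0.34 \cdot 0.05 + 0.02 \cdot 0.049 \cdot 0.95}{0.34 \cdot 0.05 + 0.049 \cdot 0.95}
\;=\; \frac{0.009431}{0.06355} \;\approx\; 0.148
$$

Peter being late makes a train strike more plausible, which in turn raises the probability that Johanna is late.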

## (e) What is the probability that Johanna arrives late for work if Peter arrives late for work even though his alarm clock works?

- According to the result below, P(JohannaLate | PeterLate, NOT(PetersAlarmFails)) rises to about 31%

In [None]:
evidence = {'PeterLate': 'True', 'PetersAlarmFails': 'False'}
print(evidence, '\n\n', inference.query(variables=['JohannaLate'], evidence=evidence))
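The increase comes from "explaining away": once the alarm is ruled out as the cause of Peter's lateness, a train strike becomes the more likely explanation, and a strike also affects Johanna. The sketch below reproduces this by brute-force enumeration of the full joint distribution in plain Python. It is an independent cross-check under the CPD values from (a), not how `VariableElimination` works internally, and all names in it are illustrative.

```python
from itertools import product

# CPD values from (a); every entry is the probability of the True state.
P_STRIKE = 0.05
P_ALARM_FAILS = 0.10
P_JOHANNA_LATE = {True: 0.5, False: 0.02}  # keyed by strike
P_PETER_LATE = {                            # keyed by (alarm_fails, strike)
    (True, True): 0.7,
    (True, False): 0.4,
    (False, True): 0.3,
    (False, False): 0.01,
}
NAMES = ("strike", "alarm_fails", "peter_late", "johanna_late")


def bernoulli(p_true, state):
    """Probability of a binary state, given the probability of True."""
    return p_true if state else 1.0 - p_true


def joint(strike, alarm_fails, peter_late, johanna_late):
    """Joint probability of one full assignment, using the network factorization."""
    return (
        bernoulli(P_STRIKE, strike)
        * bernoulli(P_ALARM_FAILS, alarm_fails)
        * bernoulli(P_PETER_LATE[(alarm_fails, strike)], peter_late)
        * bernoulli(P_JOHANNA_LATE[strike], johanna_late)
    )


def conditional(target, evidence):
    """P(target=True | evidence), summing the joint over all 16 assignments."""
    numerator = denominator = 0.0
    for values in product((True, False), repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if any(world[name] != value for name, value in evidence.items()):
            continue  # assignment contradicts the evidence
        p = joint(**world)
        denominator += p
        if world[target]:
            numerator += p
    return numerator / denominator


late = {"peter_late": True}
late_alarm_ok = {"peter_late": True, "alarm_fails": False}

print(f"P(strike | PeterLate)                 = {conditional('strike', late):.3f}")                 # 0.268
print(f"P(strike | PeterLate, alarm works)    = {conditional('strike', late_alarm_ok):.3f}")        # 0.612
print(f"P(JohannaLate | PeterLate)            = {conditional('johanna_late', late):.3f}")           # 0.148
print(f"P(JohannaLate | PeterLate, alarm works) = {conditional('johanna_late', late_alarm_ok):.3f}")  # 0.314
```

Enumeration is only practical for a network this small (16 assignments), but it makes the jump in the strike probability from about 27% to about 61% directly visible.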