<a href="https://colab.research.google.com/github/GabeMaldonado/UoL_Study_Materials/blob/main/Bayesian_networks_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# This is an example of domain knowledge that can be represented as a Bayesian Network:

I can contact my boss by email or Skype, and I may or may not get a quick response. If the boss is in the office (only 50% of the time), she/he responds in 50% of cases. The boss usually does not check Skype messages when in the office, but checks them 50% of the time when not in the office. The boss usually checks email when in the office and is unlikely to check it when not in the office. The boss is very likely to respond after checking both Skype and email, likely to respond after checking only one of them, and does not respond after checking neither.

In this lab, we will see how to implement this example in Python.


In [None]:
#Importing Library

import numpy as np

In [None]:
!pip install pgmpy

Collecting pgmpy
  Downloading pgmpy-0.1.17-py3-none-any.whl (1.9 MB)
Installing collected packages: pgmpy
Successfully installed pgmpy-0.1.17


In [None]:

from pgmpy.factors.discrete import TabularCPD

In [None]:
from pgmpy.models import BayesianModel

pgmpy is a pure Python implementation of Bayesian networks with a focus on modularity and extensibility:
https://pgmpy.org/
The pgmpy website provides tutorials and examples of Bayesian networks (BNs).

We define the network structure, named office_model, using BayesianModel. Each tuple is a directed edge from a parent node to a child node.
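
The structure (InOffice is the parent of Skype and Email, which are both parents of Response) determines how the joint distribution factorizes: each node is conditioned only on its parents. Written out, with I, S, E, R as our own shorthand for InOffice, Skype, Email and Response:

$$P(I, S, E, R) = P(I)\,P(S \mid I)\,P(E \mid I)\,P(R \mid S, E)$$

The model therefore needs exactly four probability tables, one per factor.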

In [None]:
office_model = BayesianModel([('InOffice', 'Skype'),
                              ('InOffice', 'Email'),
                              ('Skype', 'Response'),
                              ('Email', 'Response')])



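TabularCPD defines the conditional probability distribution (CPD) table of a node. We need one table per node. After defining them, we add them all to the model. Following the ['yes', 'no'] comment in the first table below, state 0 stands for 'yes' and state 1 for 'no' in every table.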
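
To make the table layout concrete, here is a minimal plain-Python sketch (no pgmpy involved) that labels the columns of the Response table defined further down. It assumes pgmpy's usual column order, in which the last evidence variable changes fastest. The names `response_values` and `columns` are ours, for illustration only.

```python
from itertools import product

# Values copied from the response_cpd cell: rows are the states of Response,
# columns are the combinations of the evidence states.
response_values = [[0.99, 0.90, 0.90, 0.0],
                   [0.01, 0.10, 0.10, 1.0]]
evidence = ['Skype', 'Email']
evidence_card = [2, 2]

# Assumed column order: the last evidence variable changes fastest.
columns = list(product(*(range(card) for card in evidence_card)))

for col, states in enumerate(columns):
    label = ", ".join(f"{name}={s}" for name, s in zip(evidence, states))
    column = [row[col] for row in response_values]
    # Each column is a distribution over Response, so it has to sum to 1.
    assert abs(sum(column) - 1.0) < 1e-9
    print(f"P(Response | {label}) = {column}")
```

Every column sums to 1 because it is a complete distribution over the child variable for one fixed combination of parent states.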

In [None]:
inoffice_cpd = TabularCPD(
    variable = 'InOffice',
    variable_card = 2,   # cardinality
    values = [[0.5], [0.5]])  # ['yes', 'no']
    


In [None]:
skype_cpd = TabularCPD(
    variable = 'Skype',
    variable_card = 2,
    values = [[.1, .5],
              [.9, .5]],
    evidence = ['InOffice'],
    evidence_card = [2])

In [None]:
email_cpd = TabularCPD(
    variable = 'Email',
    variable_card = 2,
    values = [[.8, .2],
              [.2, .8]],
    evidence = ['InOffice'],
    evidence_card = [2])

In [None]:
response_cpd = TabularCPD(
    variable = 'Response',
    variable_card = 2,
    values = [[.99, .90, .90, 0.0],
              [.01, .10, .10, 1.0]],
    evidence = ['Skype', 'Email'],
    evidence_card = [2,2])

In [None]:
office_model.add_cpds(inoffice_cpd, skype_cpd, email_cpd, response_cpd)

It is always good to check that all CPDs were added correctly, what the dependencies are, and so on.

In [None]:
office_model.get_cpds()

[<TabularCPD representing P(InOffice:2) at 0x7f0ff3b4f050>,
 <TabularCPD representing P(Skype:2 | InOffice:2) at 0x7f0ff3b30c50>,
 <TabularCPD representing P(Email:2 | InOffice:2) at 0x7f0ff3b4fb90>,
 <TabularCPD representing P(Response:2 | Skype:2, Email:2) at 0x7f0ff3b30bd0>]

In [None]:
office_model.active_trail_nodes('InOffice')

{'InOffice': {'Email', 'InOffice', 'Response', 'Skype'}}

In [None]:
office_model.local_independencies('InOffice')



In [None]:
office_model.get_independencies()

(Email ⟂ Skype | InOffice)
(Response ⟂ InOffice | Email, Skype)
(Skype ⟂ Email | InOffice)
(InOffice ⟂ Response | Email, Skype)
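
The statement (Email ⟂ Skype | InOffice) holds only once InOffice is known. As a rough check in plain Python (a sketch with our own variable names, not a pgmpy call), we can compare the joint probability P(Skype=0, Email=0) with the product of the two marginals when InOffice is not observed:

```python
# CPD values copied from the notebook; state 0 = "yes", state 1 = "no".
p_inoffice = [0.5, 0.5]
p_skype = [[0.1, 0.5], [0.9, 0.5]]   # p_skype[s][i] = P(Skype=s | InOffice=i)
p_email = [[0.8, 0.2], [0.2, 0.8]]   # p_email[e][i] = P(Email=e | InOffice=i)

# Marginals, obtained by summing InOffice out.
p_s0 = sum(p_inoffice[i] * p_skype[0][i] for i in range(2))
p_e0 = sum(p_inoffice[i] * p_email[0][i] for i in range(2))
# Joint P(Skype=0, Email=0); inside the sum the two factorize given InOffice.
p_s0_e0 = sum(p_inoffice[i] * p_skype[0][i] * p_email[0][i] for i in range(2))

print(round(p_s0 * p_e0, 4))  # product of the marginals: 0.15
print(round(p_s0_e0, 4))      # actual joint probability: 0.09
```

The two numbers differ, so Skype and Email are dependent when InOffice is unknown: learning that the boss checked email makes it more likely she/he is in the office, which in turn makes a Skype check less likely.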


We import VariableElimination so that we can run probabilistic inference and calculate various probabilities. For example: what is the probability of receiving a response? What is the probability that the boss checks Skype? Email? And so on.
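
Before using the library, it may help to see the idea of variable elimination done by hand for P(Response). The sketch below is plain Python under our own naming (`tau`, `marginal_response`); it is not how pgmpy implements the algorithm, only the same sums carried out explicitly.

```python
from itertools import product

# CPD values copied from the cells above; state 0 = "yes", state 1 = "no".
p_inoffice = [0.5, 0.5]
p_skype = [[0.1, 0.5], [0.9, 0.5]]             # [skype][inoffice]
p_email = [[0.8, 0.2], [0.2, 0.8]]             # [email][inoffice]
p_response = [[0.99, 0.90, 0.90, 0.0],         # [response][2 * skype + email]
              [0.01, 0.10, 0.10, 1.0]]

# Step 1: eliminate InOffice, leaving a factor over (Skype, Email).
tau = {
    (s, e): sum(p_inoffice[i] * p_skype[s][i] * p_email[e][i] for i in range(2))
    for s, e in product(range(2), repeat=2)
}

# Step 2: eliminate Skype and Email, leaving the marginal of Response.
marginal_response = [
    sum(p_response[r][2 * s + e] * tau[(s, e)] for (s, e) in tau)
    for r in range(2)
]

print([round(p, 4) for p in marginal_response])  # [0.6471, 0.3529]
```

Summing out one variable at a time keeps the intermediate factors small, which is the point of the method on larger networks.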


In [None]:
from pgmpy.inference import VariableElimination


In [None]:
office_infer = VariableElimination(office_model)

In [None]:
prob_response = office_infer.query(variables = ['Response'], joint=False)

In [None]:
print(prob_response['Response'])

+-------------+-----------------+
| Response    |   phi(Response) |
+=============+=================+
| Response(0) |          0.6471 |
+-------------+-----------------+
| Response(1) |          0.3529 |
+-------------+-----------------+


In [None]:
prob_skype = office_infer.query(variables = ['Skype'], joint=False)

In [None]:
print(prob_skype['Skype'])

+----------+--------------+
| Skype    |   phi(Skype) |
+==========+==============+
| Skype(0) |       0.3000 |
+----------+--------------+
| Skype(1) |       0.7000 |
+----------+--------------+


In [None]:
prob_email = office_infer.query(variables = ['Email'], joint=False)

In [None]:
print(prob_email['Email'])

+----------+--------------+
| Email    |   phi(Email) |
+==========+==============+
| Email(0) |       0.5000 |
+----------+--------------+
| Email(1) |       0.5000 |
+----------+--------------+


In [None]:
prob_inoffice = office_infer.query(variables = ['InOffice'], joint=False)

In [None]:
print(prob_inoffice['InOffice'])

+-------------+-----------------+
| InOffice    |   phi(InOffice) |
+=============+=================+
| InOffice(0) |          0.5000 |
+-------------+-----------------+
| InOffice(1) |          0.5000 |
+-------------+-----------------+


In [None]:
prob_response_inoffice = office_infer.query(
        variables = ['Response'], joint=False,
        evidence = {'InOffice':0})

In [None]:
print(prob_response_inoffice['Response'])

+-------------+-----------------+
| Response    |   phi(Response) |
+=============+=================+
| Response(0) |          0.7452 |
+-------------+-----------------+
| Response(1) |          0.2548 |
+-------------+-----------------+


In [None]:
prob_email_inoffice = office_infer.query(
        variables = ['Email'], joint=False,
        evidence = {'InOffice':0})

In [None]:
print(prob_email_inoffice['Email'])

+----------+--------------+
| Email    |   phi(Email) |
+==========+==============+
| Email(0) |       0.8000 |
+----------+--------------+
| Email(1) |       0.2000 |
+----------+--------------+


In [None]:
prob_skype_inoffice = office_infer.query(
        variables = ['Skype'], joint=False,
        evidence = {'InOffice':0})

In [None]:
print(prob_skype_inoffice['Skype'])

+----------+--------------+
| Skype    |   phi(Skype) |
+==========+==============+
| Skype(0) |       0.1000 |
+----------+--------------+
| Skype(1) |       0.9000 |
+----------+--------------+


In [None]:
prob_inoffice_response = office_infer.query(
        variables = ['InOffice'], joint=False,
        evidence = {'Response':0})

In [None]:
print(prob_inoffice_response['InOffice'])

+-------------+-----------------+
| InOffice    |   phi(InOffice) |
+=============+=================+
| InOffice(0) |          0.5758 |
+-------------+-----------------+
| InOffice(1) |          0.4242 |
+-------------+-----------------+
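
The posterior above can be reproduced with Bayes' rule from numbers already printed in this notebook: P(Response=0 | InOffice=0) = 0.7452, P(InOffice=0) = 0.5 and P(Response=0) = 0.6471.

$$P(\text{InOffice}=0 \mid \text{Response}=0) = \frac{P(\text{Response}=0 \mid \text{InOffice}=0)\,P(\text{InOffice}=0)}{P(\text{Response}=0)} = \frac{0.7452 \times 0.5}{0.6471} \approx 0.5758$$

Reading state 0 as 'yes', receiving a response raises the probability that the boss is in the office from 0.50 to about 0.58.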


In [None]:
prob_email_response = office_infer.query(
        variables = ['Email'], joint=False,
        evidence = {'Response':0})

In [None]:
print(prob_email_response['Email'])

+----------+--------------+
| Email    |   phi(Email) |
+==========+==============+
| Email(0) |       0.7079 |
+----------+--------------+
| Email(1) |       0.2921 |
+----------+--------------+


In [None]:
prob_skype_response = office_infer.query(
        variables = ['Skype'], joint=False,
        evidence = {'Response':0})

In [None]:
print(prob_skype_response['Skype'])

+----------+--------------+
| Skype    |   phi(Skype) |
+==========+==============+
| Skype(0) |       0.4298 |
+----------+--------------+
| Skype(1) |       0.5702 |
+----------+--------------+


In [None]:
prob_inoffice_response_skype = office_infer.query(
        variables = ['InOffice'], joint=False,
        evidence = {'Response':0, 'Skype':0})

In [None]:
print(prob_inoffice_response_skype['InOffice'])

+-------------+-----------------+
| InOffice    |   phi(InOffice) |
+=============+=================+
| InOffice(0) |          0.1748 |
+-------------+-----------------+
| InOffice(1) |          0.8252 |
+-------------+-----------------+


In [None]:
# Posterior of Email given a response and a Skype check
# (variable renamed to match what is actually queried).
prob_email_response_skype = office_infer.query(
        variables = ['Email'], joint=False,
        evidence = {'Response':0, 'Skype':0})

In [None]:
print(prob_email_response_skype['Email'])

+----------+--------------+
| Email    |   phi(Email) |
+==========+==============+
| Email(0) |       0.3204 |
+----------+--------------+
| Email(1) |       0.6796 |
+----------+--------------+
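
As a cross-check on the last two queries, any posterior in a network this small can be computed by brute-force enumeration of the joint distribution. The sketch below uses plain Python; `joint` and `posterior` are hypothetical helper names of ours, not part of pgmpy.

```python
from itertools import product

NODES = ['InOffice', 'Skype', 'Email', 'Response']
# CPD values copied from the notebook; state 0 = "yes", state 1 = "no".
P_INOFFICE = [0.5, 0.5]
P_SKYPE = [[0.1, 0.5], [0.9, 0.5]]             # [skype][inoffice]
P_EMAIL = [[0.8, 0.2], [0.2, 0.8]]             # [email][inoffice]
P_RESPONSE = [[0.99, 0.90, 0.90, 0.0],         # [response][2 * skype + email]
              [0.01, 0.10, 0.10, 1.0]]


def joint(i, s, e, r):
    """P(InOffice=i, Skype=s, Email=e, Response=r) as a product of the CPDs."""
    return P_INOFFICE[i] * P_SKYPE[s][i] * P_EMAIL[e][i] * P_RESPONSE[r][2 * s + e]


def posterior(variable, evidence):
    """Return [P(variable=0 | evidence), P(variable=1 | evidence)] by enumeration."""
    totals = [0.0, 0.0]
    for states in product(range(2), repeat=len(NODES)):
        assignment = dict(zip(NODES, states))
        # Skip assignments that contradict the evidence.
        if any(assignment[name] != value for name, value in evidence.items()):
            continue
        totals[assignment[variable]] += joint(*states)
    norm = sum(totals)
    return [t / norm for t in totals]


print([round(p, 4) for p in posterior('InOffice', {'Response': 0, 'Skype': 0})])
# [0.1748, 0.8252]
print([round(p, 4) for p in posterior('Email', {'Response': 0, 'Skype': 0})])
# [0.3204, 0.6796]
```

Enumeration visits all 16 joint assignments, which is fine here but grows exponentially with the number of nodes; that cost is what variable elimination is designed to avoid.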
