# Binary Bayes Net Inference

This short notebook exercise illustrates inference in a Bayes Net (BN).

Consider the following BN:  

![Imaginary SuperBowl Bayes Net Diagram](BN-NFL.png "Imaginary SuperBowl Bayes Net Diagram")


----
We can use the `BayesianNetwork` class from [pgmpy](https://pgmpy.org/) to construct this network:

In [2]:
pip install pgmpy

Successfully installed annotated-types-0.7.0 cachetools-5.5.1 google-ai-generativelanguage-0.6.15 google-api-core-2.24.1 google-api-python-client-2.161.0 google-auth-2.38.0 google-auth-httplib2-0.2.0 google-generativeai-0.8.4 googleapis-common-protos-1.67.0 grpcio-1.70.0 grpcio-status-1.70.0 httplib2-0.22.0 pgmpy-0.1.26 proto-plus-1.26.0 protobuf-5.29.3 pydantic-2.10.6 pydantic-core-2.27.2 rsa-4.9 typing-extensions-4.12.2 uritemplate-4.1.1 xgboost-2.1.4
Note: you may need to restart the kernel to use updated packages.


In [3]:
import numpy as np
import pandas as pd

from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD

In [4]:
# Define Bayesian Network structure
model = BayesianNetwork([('F', 'Q'), ('F', 'D'), ('Q', 'W'), ('D', 'W')])

# Define CPDs
cpd_f = TabularCPD(variable='F', variable_card=2, values=[[0.3], [0.7]], state_names={"F":["low", "high"]})
cpd_q = TabularCPD(variable='Q', variable_card=2, values=[[0.5, 0.2], [0.5, 0.8]],
                    evidence=['F'], evidence_card=[2], state_names={"F":["low", "high"], "Q": ["bad", "good"]})
cpd_d = TabularCPD(variable='D', variable_card=2, values=[[0.6, 0.3], [0.4, 0.7]],
                    evidence=['F'], evidence_card=[2], state_names={"F":["low", "high"], "D": ["weak", "strong"]})
cpd_w = TabularCPD(variable='W', variable_card=2, 
                    values=[[0.30, 0.5, 0.15, 0.25], [0.70, 0.5, 0.85, 0.75]],
                    evidence=['Q', 'D'], evidence_card=[2, 2], state_names={"Q":["bad", "good"], "D": ["weak", "strong"], "W": ["lose", "win"]})

# Add CPDs to model
model.add_cpds(cpd_f, cpd_q, cpd_d, cpd_w)

# Check Model
assert model.check_model()

In [5]:
# Print every CPD attached to the model
for cpd in model.get_cpds():
    print(cpd)

+---------+-----+
| F(low)  | 0.3 |
+---------+-----+
| F(high) | 0.7 |
+---------+-----+
+---------+--------+---------+
| F       | F(low) | F(high) |
+---------+--------+---------+
| Q(bad)  | 0.5    | 0.2     |
+---------+--------+---------+
| Q(good) | 0.5    | 0.8     |
+---------+--------+---------+
+-----------+--------+---------+
| F         | F(low) | F(high) |
+-----------+--------+---------+
| D(weak)   | 0.6    | 0.3     |
+-----------+--------+---------+
| D(strong) | 0.4    | 0.7     |
+-----------+--------+---------+
+---------+---------+-----------+---------+-----------+
| Q       | Q(bad)  | Q(bad)    | Q(good) | Q(good)   |
+---------+---------+-----------+---------+-----------+
| D       | D(weak) | D(strong) | D(weak) | D(strong) |
+---------+---------+-----------+---------+-----------+
| W(lose) | 0.3     | 0.5       | 0.15    | 0.25      |
+---------+---------+-----------+---------+-----------+
| W(win)  | 0.7     | 0.5       | 0.85    | 0.75      |
+---------+---------+-----------+---------+-----------+

----
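Calculate $P(W|F=\text{high})$:

$$
\begin{align}
P(W|F=\text{high}) & = \frac{P(W,F=\text{high})}{P(F=\text{high})} \\
& \propto P(W,F=\text{high}) \\
& = \sum_{q,d} P(F=\text{high}, q, d, W) \\
& = \sum_{q,d} P(F=\text{high})P(q|F=\text{high})P(d|F=\text{high})P(W|q,d)
\end{align}
$$

Normalizing the last line over the two values of $W$ (equivalently, dividing by $P(F=\text{high})$) recovers the conditional distribution.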

In [1]:
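P_w = None

# Calculate the probability of winning and losing given F=high
# and store them in the P_w and P_L variables

# YOUR CODE HERE
# Unnormalized joint terms: P(F=high) * P(q|F=high) * P(d|F=high) * P(W|q,d), summed over q and d
joint_win = (.7 * .2 * .3 * .7) + (.7 * .2 * .7 * .5) + (.7 * .8 * .3 * .85) + (.7 * .8 * .7 * .75)
joint_lose = (.7 * .2 * .3 * .3) + (.7 * .2 * .7 * .5) + (.7 * .8 * .3 * .15) + (.7 * .8 * .7 * .25)

# Normalize: joint_win + joint_lose equals P(F=high) = 0.7
P_w = joint_win / (joint_win + joint_lose)
P_L = joint_lose / (joint_win + joint_lose)

print(round(P_w, 3))
print(round(P_L, 3))
# The answer should be [0.736, 0.264]

0.736
0.264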


In [None]:
# This cell intentionally left empty


----
Then we can use Variable Elimination to do the same inference. 

Variable Elimination relies on the observation that a factor can be moved outside any sum whose variable it does not depend on, so each sum only ranges over the factors that mention its variable:

$$
\begin{align}
& \sum_{q,d} P(F=\text{high})P(q|F=\text{high})P(d|F=\text{high})P(W|q,d) \\
& = P(F=\text{high}) \sum_{q,d} P(q|F=\text{high})P(d|F=\text{high})P(W|q,d) \\
& = P(F=\text{high}) \sum_{q} P(q|F=\text{high})\sum_{d}P(d|F=\text{high})P(W|q,d)
\end{align}
$$
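
The nested sums above can be sketched in plain Python, without `pgmpy`, to make the elimination order concrete. This is only an illustrative sketch: the dictionaries copy the CPD values printed earlier, and the names `eliminate_d`, `joint_with_f_high`, and `tau` are hypothetical helpers introduced here, not part of any library.

```python
# CPD entries copied from the network above, with F fixed to "high".
p_f_high = 0.7                                   # P(F=high)
p_q_given_high = {"bad": 0.2, "good": 0.8}       # P(Q | F=high)
p_d_given_high = {"weak": 0.3, "strong": 0.7}    # P(D | F=high)
p_win = {                                        # P(W=win | Q, D)
    ("bad", "weak"): 0.70, ("bad", "strong"): 0.50,
    ("good", "weak"): 0.85, ("good", "strong"): 0.75,
}


def eliminate_d(w_is_win):
    """Sum out D first: tau(q) = sum_d P(d|F=high) * P(W|q,d)."""
    tau = {}
    for q in p_q_given_high:
        total = 0.0
        for d, p_d in p_d_given_high.items():
            p_w = p_win[(q, d)] if w_is_win else 1.0 - p_win[(q, d)]
            total += p_d * p_w
        tau[q] = total
    return tau


def joint_with_f_high(w_is_win):
    """Sum out Q next, then multiply by P(F=high) once at the end."""
    tau = eliminate_d(w_is_win)
    return p_f_high * sum(p_q * tau[q] for q, p_q in p_q_given_high.items())


joint_win = joint_with_f_high(True)
joint_lose = joint_with_f_high(False)

# Normalize over the two values of W to obtain P(W=win | F=high).
posterior_win = joint_win / (joint_win + joint_lose)
print(round(joint_win, 4), round(posterior_win, 3))  # prints: 0.5152 0.736
```

The intermediate factor `tau` depends only on `q` (and the chosen value of `W`), so `P(F=high)` and `P(q|F=high)` are each multiplied in once per `q` instead of once per `(q, d)` pair.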

----

Now use the `VariableElimination` functionality in `pgmpy` to calculate the same probability.


In [None]:
# YOUR CODE HERE
raise NotImplementedError()

----
Here's a more complex example, using the insurance BN:

In [None]:
from pgmpy.utils import get_example_model

model = get_example_model('insurance')
model.get_cardinality()

In [None]:
print(model.get_cpds(node="Age"))

In [None]:
print(model.get_cpds(node="DrivQuality"))

Can you calculate the probability of `DrivQuality` given `Age` for both `Adolescent` and `Senior` values of `Age`? 

In [None]:
# YOUR CODE HERE
raise NotImplementedError()