## Making Inferences in the ASIA graph

Notebook layout:
1. Load ASIA model
2. Make inferences

## Graph
![ASIA Graph](ASIA.jpg)

## 1. Load ASIA Model

In [12]:
# import essential libraries to manipulate data
import numpy as np
import pandas as pd

# import pgmpy functions
from pgmpy.readwrite import BIFReader
from pgmpy.inference import VariableElimination

# ignore warnings
import warnings
warnings.filterwarnings('ignore')

In [13]:
# load model
data = BIFReader('asia.bif')
asia_model = data.get_model()

In [14]:
# print nodes of the model
asia_model.nodes()

NodeView(('asia', 'tub', 'smoke', 'lung', 'bronc', 'either', 'xray', 'dysp'))

In [15]:
# print edges of the model
asia_model.edges()

OutEdgeView([('asia', 'tub'), ('tub', 'either'), ('smoke', 'lung'), ('smoke', 'bronc'), ('lung', 'either'), ('bronc', 'dysp'), ('either', 'xray'), ('either', 'dysp')])
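
Each directed edge `(parent, child)` determines which variables a node's CPD is conditioned on. The following is a minimal sketch in plain Python (no pgmpy required) that recovers the parent sets from the edge list above. `parents_of` is a hypothetical helper written for illustration, not a pgmpy function.

```python
# Edge list copied from the asia_model.edges() output above.
edges = [
    ("asia", "tub"), ("tub", "either"), ("smoke", "lung"), ("smoke", "bronc"),
    ("lung", "either"), ("bronc", "dysp"), ("either", "xray"), ("either", "dysp"),
]


def parents_of(edge_list):
    """Map every node to the sorted list of its parents (hypothetical helper)."""
    parents = {}
    for parent, child in edge_list:
        parents.setdefault(parent, [])            # make sure root nodes appear too
        parents.setdefault(child, []).append(parent)
    return {node: sorted(ps) for node, ps in parents.items()}


parents = parents_of(edges)
for node, ps in parents.items():
    print(node, "<-", ps)
```

The parent sets should agree with the conditioning variables reported by `get_cpds()`, for example `dysp` depends on `bronc` and `either`.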

In [16]:
# get model parameters
asia_model.get_cpds()

[<TabularCPD representing P(asia:2) at 0x24f7bcdd9e8>,
 <TabularCPD representing P(bronc:2 | smoke:2) at 0x24f7bcddc18>,
 <TabularCPD representing P(dysp:2 | bronc:2, either:2) at 0x24f7bcdddd8>,
 <TabularCPD representing P(either:2 | lung:2, tub:2) at 0x24f7bcddc50>,
 <TabularCPD representing P(lung:2 | smoke:2) at 0x24f7bcddc88>,
 <TabularCPD representing P(smoke:2) at 0x24f7bcddcf8>,
 <TabularCPD representing P(tub:2 | asia:2) at 0x24f7bcdd828>,
 <TabularCPD representing P(xray:2 | either:2) at 0x24f7bcdd860>]

In [36]:
# print the CPDs by iterating through them
for cpd in asia_model.get_cpds():
    print("CPD of {variable}:".format(variable=cpd.variable))
    print(cpd)
    print(cpd.name_to_no)

CPD of asia:
+-----------+------+
| asia(yes) | 0.01 |
+-----------+------+
| asia(no)  | 0.99 |
+-----------+------+
{'asia': {'yes': 0, 'no': 1}}
CPD of bronc:
+------------+------------+-----------+
| smoke      | smoke(yes) | smoke(no) |
+------------+------------+-----------+
| bronc(yes) | 0.6        | 0.3       |
+------------+------------+-----------+
| bronc(no)  | 0.4        | 0.7       |
+------------+------------+-----------+
{'smoke': {'yes': 0, 'no': 1}, 'bronc': {'yes': 0, 'no': 1}}
CPD of dysp:
+-----------+-------------+------------+-------------+------------+
| bronc     | bronc(yes)  | bronc(yes) | bronc(no)   | bronc(no)  |
+-----------+-------------+------------+-------------+------------+
| either    | either(yes) | either(no) | either(yes) | either(no) |
+-----------+-------------+------------+-------------+------------+
| dysp(yes) | 0.9         | 0.8        | 0.7         | 0.1        |
+-----------+-------------+------------+-------------+------------+
| dysp(n

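#### NOTE: In this model the state 'yes' has index 0 and 'no' has index 1, as the `name_to_no` dictionaries printed above show. Recent updates to pgmpy changed how it stores state names, so any integer state passed as evidence to a query must follow this mapping (for example, `'smoke': 1` means the person does not smoke).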
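
Because 'yes' maps to 0 here, integer evidence written by hand is easy to get backwards. One way to avoid this is to write evidence with state names and convert it through the printed mapping. The sketch below is plain Python; `evidence_to_indices` is a hypothetical helper, not part of pgmpy, and the dictionary is copied from the `name_to_no` output above.

```python
# Mapping copied from the name_to_no output printed above.
name_to_no = {
    "smoke": {"yes": 0, "no": 1},
    "bronc": {"yes": 0, "no": 1},
}


def evidence_to_indices(named_evidence, mapping):
    """Convert {'variable': 'state name'} to {'variable': state index}."""
    indices = {}
    for var, state in named_evidence.items():
        if state not in mapping[var]:
            raise ValueError(f"unknown state {state!r} for variable {var!r}")
        indices[var] = mapping[var][state]
    return indices


evidence = evidence_to_indices({"smoke": "no"}, name_to_no)
print(evidence)  # {'smoke': 1}
```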

In [7]:
# get independencies
asia_model.get_independencies()

(asia _|_ smoke, lung, bronc)
(asia _|_ smoke, lung | bronc)
(asia _|_ either, smoke, lung, xray, dysp, bronc | tub)
(asia _|_ xray | either)
(asia _|_ lung, bronc | smoke)
(asia _|_ smoke, bronc | lung)
(asia _|_ either, smoke, lung, xray, dysp | tub, bronc)
(asia _|_ dysp, xray | either, bronc)
(asia _|_ lung | smoke, bronc)
(asia _|_ smoke | lung, bronc)
(asia _|_ smoke, lung, xray, dysp, bronc | either, tub)
(asia _|_ either, lung, xray, dysp, bronc | smoke, tub)
(asia _|_ either, smoke, xray, dysp, bronc | lung, tub)
(asia _|_ either, smoke, lung, dysp, bronc | xray, tub)
(asia _|_ either, smoke, lung, xray, bronc | dysp, tub)
(asia _|_ dysp, xray, bronc | either, smoke)
(asia _|_ dysp, smoke, xray, bronc | either, lung)
(asia _|_ xray | dysp, either)
(asia _|_ bronc | smoke, lung)
(asia _|_ bronc | smoke, xray)
(asia _|_ smoke, bronc | lung, xray)
(asia _|_ dysp, smoke, lung, xray | either, tub, bronc)
(asia _|_ dysp, either, lung, xray | smoke, tub, bronc)
(asia _|_ dysp, either
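
Each statement of the form `(X _|_ Y | Z)` above can be checked from the graph structure alone using d-separation. The sketch below uses the moralized ancestral graph method in plain Python. `d_separated` and `ancestors_of` are hypothetical helpers written for illustration and are not pgmpy's implementation.

```python
from collections import deque

# Edge list copied from the asia_model.edges() output in section 1.
edges = [
    ("asia", "tub"), ("tub", "either"), ("smoke", "lung"), ("smoke", "bronc"),
    ("lung", "either"), ("bronc", "dysp"), ("either", "xray"), ("either", "dysp"),
]


def ancestors_of(nodes, edge_list):
    """Return the given nodes together with all of their ancestors."""
    parents = {}
    for parent, child in edge_list:
        parents.setdefault(child, set()).add(parent)
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def d_separated(x, y, given, edge_list):
    """True if x and y are d-separated by the set `given` in the DAG."""
    given = set(given)
    keep = ancestors_of({x, y} | given, edge_list)

    # Undirected adjacency restricted to the ancestral subgraph.
    adj = {node: set() for node in keep}
    parents = {node: set() for node in keep}
    for parent, child in edge_list:
        if parent in keep and child in keep:
            adj[parent].add(child)
            adj[child].add(parent)
            parents[child].add(parent)

    # Moralize: connect every pair of parents that share a child.
    for ps in parents.values():
        for a in ps:
            for b in ps:
                if a != b:
                    adj[a].add(b)

    # Breadth-first search from x that never enters a conditioning node.
    queue = deque([x])
    seen = {x}
    while queue:
        node = queue.popleft()
        if node == y:
            return False
        for neighbour in adj[node]:
            if neighbour not in seen and neighbour not in given:
                seen.add(neighbour)
                queue.append(neighbour)
    return True


print(d_separated("asia", "smoke", [], edges))         # True
print(d_separated("asia", "xray", ["either"], edges))  # True
print(d_separated("asia", "xray", [], edges))          # False
```

The first two checks correspond to `(asia _|_ smoke, lung, bronc)` and `(asia _|_ xray | either)` in the list above.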

## 2. Make Inferences

**Q1.** What is the probability that a person has bronchitis given that they do not smoke?

In [39]:
# initialise inference class
infer = VariableElimination(asia_model)

# make query: state index 1 is 'no' in this model, so this conditions on a non-smoker
bronchitis = infer.query(variables=['bronc'], evidence={'smoke': 1}, joint=False)

# print result
print(bronchitis['bronc'])

+------------+--------------+
| bronc      |   phi(bronc) |
+============+==============+
| bronc(yes) |       0.3000 |
+------------+--------------+
| bronc(no)  |       0.7000 |
+------------+--------------+

So P(bronc = yes | smoke = no) = 0.30, which matches the smoke(no) column of the bronc CPD printed in section 1. For comparison, `'smoke': 0` conditions on a smoker and gives P(bronc = yes) = 0.6000.
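
Because `smoke` is the only parent of `bronc` and no other evidence is given, this query reduces to a direct lookup in the CPD of `bronc`. A quick cross-check in plain Python, with the values copied from the CPD printed in section 1, shows which column each evidence state selects. The dictionary name is only illustrative.

```python
# CPD of bronc copied from the table printed in section 1.
# Outer key: state of smoke. Inner key: state of bronc.
p_bronc_given_smoke = {
    "yes": {"yes": 0.6, "no": 0.4},
    "no": {"yes": 0.3, "no": 0.7},
}

for smoke_state, dist in p_bronc_given_smoke.items():
    # Every column of a CPD is a probability distribution and must sum to 1.
    assert abs(sum(dist.values()) - 1.0) < 1e-12
    print(f"P(bronc=yes | smoke={smoke_state}) = {dist['yes']:.4f}")
```

State index 0 (smoke = yes) selects 0.6 / 0.4, and state index 1 (smoke = no) selects 0.3 / 0.7.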


**Q2.** What is the probability that the person is a smoker given that their x-ray result is positive?

In [38]:
# state index 0 is 'yes' in this model, so this conditions on a positive x-ray
smoker = infer.query(variables=['smoke'], evidence={'xray': 0}, joint=False)
print(smoker['smoke'])

+------------+--------------+
| smoke      |   phi(smoke) |
+============+==============+
| smoke(yes) |       0.6878 |
+------------+--------------+
| smoke(no)  |       0.3122 |
+------------+--------------+

So P(smoke = yes | xray = yes) ≈ 0.69. For comparison, `'xray': 1` conditions on a negative x-ray and gives P(smoke = yes) = 0.4767.
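
Variable elimination returns the same posterior as summing the full joint distribution over every unobserved variable, so a query on a network this small can be sanity-checked by brute-force enumeration. The sketch below is plain Python and rests on a loud assumption: only the `asia`, `bronc` and part of the `dysp` CPDs are visible in the output of section 1, so the remaining parameters are assumed to be the standard ASIA values and may differ from those in `asia.bif`. `posterior` and `joint` are hypothetical helpers.

```python
from itertools import product

YES, NO = 0, 1  # same state order as the model: 'yes' -> 0, 'no' -> 1

# ASSUMPTION: standard ASIA parameters; only asia, bronc and dysp(yes)
# are confirmed by the CPD output shown in section 1.
p_asia = [0.01, 0.99]
p_smoke = [0.5, 0.5]
p_tub = {YES: [0.05, 0.95], NO: [0.01, 0.99]}    # P(tub | asia)
p_lung = {YES: [0.1, 0.9], NO: [0.01, 0.99]}     # P(lung | smoke)
p_bronc = {YES: [0.6, 0.4], NO: [0.3, 0.7]}      # P(bronc | smoke)
p_xray = {YES: [0.98, 0.02], NO: [0.05, 0.95]}   # P(xray | either)
p_dysp = {                                       # P(dysp | bronc, either)
    (YES, YES): [0.9, 0.1], (YES, NO): [0.8, 0.2],
    (NO, YES): [0.7, 0.3], (NO, NO): [0.1, 0.9],
}

NAMES = ("asia", "tub", "smoke", "lung", "bronc", "either", "xray", "dysp")


def p_either(either, lung, tub):
    """Assumed deterministic OR: either is 'yes' iff lung or tub is 'yes'."""
    is_yes = lung == YES or tub == YES
    return 1.0 if (either == YES) == is_yes else 0.0


def joint(a):
    """Joint probability of one full assignment (chain rule over the DAG)."""
    return (
        p_asia[a["asia"]]
        * p_smoke[a["smoke"]]
        * p_tub[a["asia"]][a["tub"]]
        * p_lung[a["smoke"]][a["lung"]]
        * p_bronc[a["smoke"]][a["bronc"]]
        * p_either(a["either"], a["lung"], a["tub"])
        * p_xray[a["either"]][a["xray"]]
        * p_dysp[(a["bronc"], a["either"])][a["dysp"]]
    )


def posterior(variable, evidence):
    """Return [P(variable=yes | evidence), P(variable=no | evidence)]."""
    totals = [0.0, 0.0]
    for states in product((YES, NO), repeat=len(NAMES)):
        a = dict(zip(NAMES, states))
        if all(a[var] == state for var, state in evidence.items()):
            totals[a[variable]] += joint(a)
    norm = sum(totals)
    return [t / norm for t in totals]


print(round(posterior("smoke", {"xray": YES})[0], 4))  # 0.6878
print(round(posterior("smoke", {"xray": NO})[0], 4))   # 0.4767
```

Enumeration touches all 2^8 = 256 assignments, which is fine here but grows exponentially with the number of nodes; that is the cost variable elimination is designed to avoid.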


**Q3.** Consider the following two probabilities:

1. The probability that a person doesn't have dyspnea given that they have bronchitis.
2. The probability that a person doesn't have dyspnea given that they have bronchitis and don't have lung cancer.

By what percent does the probability increase or decrease from situation 1 to situation 2?

In [40]:
# situation 1: bronc = 'yes' is state index 0
dyspnea_b = infer.query(variables=['dysp'], evidence={'bronc': 0}, joint=False)
print(dyspnea_b['dysp'])

# situation 2: bronc = 'yes' is state index 0, lung = 'no' is state index 1
dyspnea_b_l = infer.query(variables=['dysp'], evidence={'bronc': 0, 'lung': 1}, joint=False)
print(dyspnea_b_l['dysp'])

+-----------+-------------+
| dysp      |   phi(dysp) |
+===========+=============+
| dysp(yes) |      0.8080 |
+-----------+-------------+
| dysp(no)  |      0.1920 |
+-----------+-------------+

+-----------+-------------+
| dysp      |   phi(dysp) |
+===========+=============+
| dysp(yes) |      0.8010 |
+-----------+-------------+
| dysp(no)  |      0.1990 |
+-----------+-------------+

The probability of not having dyspnea is the dysp(no) row. It rises from 0.1920 in situation 1 to 0.1990 in situation 2, a relative increase of ((0.1990 - 0.1920) * 100) / 0.1920 ≈ +3.6% (about 0.7 percentage points).

By contrast, `{'bronc': 1}` and `{'bronc': 1, 'lung': 0}` condition on no bronchitis, and on no bronchitis with lung cancer. They give dysp(yes) = 0.1316 and 0.7000 respectively, which do not answer this question.
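
The percentage asked for in Q3 is a relative change, (new - old) / old × 100, which is not the same as the difference in percentage points. A small sketch of both measures follows. The helper names are hypothetical, and the probabilities in the example are illustrative only and not taken from the model.

```python
def percent_change(old, new):
    """Relative change from old to new, expressed as a percentage of old."""
    if old == 0:
        raise ValueError("relative change is undefined when the old value is 0")
    return (new - old) / old * 100.0


def point_change(old, new):
    """Absolute change between two probabilities, in percentage points."""
    return (new - old) * 100.0


# Illustrative probabilities only, not model output.
print(f"{percent_change(0.20, 0.25):+.2f}%")       # +25.00%
print(f"{point_change(0.20, 0.25):+.2f} points")   # +5.00 points
```

Reporting both numbers avoids ambiguity when the starting probability is small, since a modest change in points can look very large in relative terms.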