## Making Inferences in the ASIA graph

Notebook layout:
1. Load ASIA model
2. Make inferences

## Graph
![ASIA Graph](resources/ASIA.jpg)

## 1. Load ASIA Model

In [1]:
# import essential libraries to manipulate data
import numpy as np
import pandas as pd

# import pgmpy functions
from pgmpy.readwrite import BIFReader
from pgmpy.inference import VariableElimination

# ignore warnings
import warnings
warnings.filterwarnings('ignore')

In [2]:
# load model
data = BIFReader('asia.bif')
asia_model = data.get_model()

In [3]:
# print nodes of the model
asia_model.nodes()

NodeView(('asia', 'tub', 'smoke', 'lung', 'bronc', 'either', 'xray', 'dysp'))

In [4]:
# print edges of the model
asia_model.edges()

OutEdgeView([('asia', 'tub'), ('tub', 'either'), ('smoke', 'lung'), ('smoke', 'bronc'), ('lung', 'either'), ('bronc', 'dysp'), ('either', 'xray'), ('either', 'dysp')])
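
Each node's parents in this edge list become the conditioning set of its CPD. A minimal plain-Python sketch that derives the parents from the edges printed above (the helper name `parents_of` is illustrative and not part of pgmpy):

```python
# Edge list copied from the output of asia_model.edges() above.
edges = [
    ('asia', 'tub'), ('tub', 'either'), ('smoke', 'lung'), ('smoke', 'bronc'),
    ('lung', 'either'), ('bronc', 'dysp'), ('either', 'xray'), ('either', 'dysp'),
]

def parents_of(edges):
    """Map every node to the list of its parents, in edge-list order."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(parent, [])            # make sure root nodes appear too
        parents.setdefault(child, []).append(parent)
    return parents

parents = parents_of(edges)
for node, node_parents in parents.items():
    print(node, "<-", node_parents)
```

Root nodes such as `asia` and `smoke` end up with an empty parent list, so their CPDs are plain marginal distributions.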

In [5]:
# get model parameters
asia_model.get_cpds()

[<TabularCPD representing P(asia:2) at 0x2bcc8d5b2b0>,
 <TabularCPD representing P(bronc:2 | smoke:2) at 0x2bcc8d7cfd0>,
 <TabularCPD representing P(dysp:2 | bronc:2, either:2) at 0x2bcc8d6ee48>,
 <TabularCPD representing P(either:2 | lung:2, tub:2) at 0x2bcc8d86da0>,
 <TabularCPD representing P(lung:2 | smoke:2) at 0x2bcc8d86e80>,
 <TabularCPD representing P(smoke:2) at 0x2bcc8d86eb8>,
 <TabularCPD representing P(tub:2 | asia:2) at 0x2bcc8d86c18>,
 <TabularCPD representing P(xray:2 | either:2) at 0x2bcc8d86d68>]

In [6]:
# print the CPDs by iterating through them
for cpd in asia_model.get_cpds():
    print("CPD of {variable}:".format(variable=cpd.variable))
    print(cpd)

CPD of asia:
╒════════╤══════╕
│ asia_0 │ 0.01 │
├────────┼──────┤
│ asia_1 │ 0.99 │
╘════════╧══════╛
CPD of bronc:
╒═════════╤═════════╤═════════╕
│ smoke   │ smoke_0 │ smoke_1 │
├─────────┼─────────┼─────────┤
│ bronc_0 │ 0.6     │ 0.3     │
├─────────┼─────────┼─────────┤
│ bronc_1 │ 0.4     │ 0.7     │
╘═════════╧═════════╧═════════╛
CPD of dysp:
╒════════╤══════════╤══════════╤══════════╤══════════╕
│ bronc  │ bronc_0  │ bronc_0  │ bronc_1  │ bronc_1  │
├────────┼──────────┼──────────┼──────────┼──────────┤
│ either │ either_0 │ either_1 │ either_0 │ either_1 │
├────────┼──────────┼──────────┼──────────┼──────────┤
│ dysp_0 │ 0.9      │ 0.7      │ 0.8      │ 0.1      │
├────────┼──────────┼──────────┼──────────┼──────────┤
│ dysp_1 │ 0.1      │ 0.3      │ 0.2      │ 0.9      │
╘════════╧══════════╧══════════╧══════════╧══════════╛
CPD of either:
╒══════════╤════════╤════════╤════════╤════════╕
│ lung     │ lung_0 │ lung_0 │ lung_1 │ lung_1 │
├──────────┼────────┼────────┼────────
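
Each column of a CPD table is a distribution over the child's states for one parent configuration, so it should sum to 1. A quick sanity check on the `bronc` and `dysp` tables printed above, with the values typed in by hand (plain Python; the dictionary layout and the helper name `columns_normalised` are only for illustration):

```python
import math

# Columns copied from the printed CPDs: key = parent configuration,
# value = probabilities of the child's states (state 0, state 1).
bronc_cpd = {
    ("smoke_0",): (0.6, 0.4),
    ("smoke_1",): (0.3, 0.7),
}
dysp_cpd = {
    ("bronc_0", "either_0"): (0.9, 0.1),
    ("bronc_0", "either_1"): (0.7, 0.3),
    ("bronc_1", "either_0"): (0.8, 0.2),
    ("bronc_1", "either_1"): (0.1, 0.9),
}

def columns_normalised(cpd):
    """Return True if every column sums to 1 within floating-point tolerance."""
    return all(math.isclose(sum(column), 1.0) for column in cpd.values())

print(columns_normalised(bronc_cpd), columns_normalised(dysp_cpd))
```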

In [7]:
# get independencies
asia_model.get_independencies()

(asia _|_ smoke, bronc, lung)
(asia _|_ xray | either)
(asia _|_ bronc, lung | smoke)
(asia _|_ smoke, lung | bronc)
(asia _|_ smoke, bronc | lung)
(asia _|_ either, smoke, xray, dysp, bronc, lung | tub)
(asia _|_ xray, dysp, bronc | either, smoke)
(asia _|_ xray, dysp | either, bronc)
(asia _|_ xray | either, dysp)
(asia _|_ smoke, xray, dysp, bronc | either, lung)
(asia _|_ smoke, xray, dysp, bronc, lung | either, tub)
(asia _|_ lung | smoke, bronc)
(asia _|_ bronc | smoke, xray)
(asia _|_ bronc | smoke, lung)
(asia _|_ either, xray, dysp, bronc, lung | smoke, tub)
(asia _|_ smoke | bronc, lung)
(asia _|_ either, smoke, xray, dysp, lung | tub, bronc)
(asia _|_ smoke, bronc | xray, lung)
(asia _|_ either, smoke, dysp, bronc, lung | xray, tub)
(asia _|_ either, smoke, xray, bronc, lung | dysp, tub)
(asia _|_ either, smoke, xray, dysp, bronc | tub, lung)
(asia _|_ xray, dysp | either, smoke, bronc)
(asia _|_ dysp, bronc | either, smoke, xray)
(asia _|_ xray, bronc | either, smoke, dysp)
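
The statements above are d-separation facts about the graph. As a rough cross-check, the sketch below tests d-separation with the standard "moralised ancestral graph" criterion in plain Python. It is an independent illustration, not pgmpy's own implementation, and the function name `d_separated` is made up for this example.

```python
from itertools import combinations

edges = [
    ('asia', 'tub'), ('tub', 'either'), ('smoke', 'lung'), ('smoke', 'bronc'),
    ('lung', 'either'), ('bronc', 'dysp'), ('either', 'xray'), ('either', 'dysp'),
]

def d_separated(edges, x, y, given=()):
    """True if x and y are d-separated by `given` in the DAG described by `edges`."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(parent, set())
        parents.setdefault(child, set()).add(parent)

    # 1. Keep only x, y, the evidence nodes and all of their ancestors.
    keep, stack = set(), [x, y, *given]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents[node])

    # 2. Moralise: link each child to its parents and "marry" co-parents (undirected).
    neighbours = {node: set() for node in keep}
    for child in keep:
        for parent in parents[child]:
            neighbours[parent].add(child)
            neighbours[child].add(parent)
        for a, b in combinations(parents[child], 2):
            neighbours[a].add(b)
            neighbours[b].add(a)

    # 3. Remove the evidence nodes and test whether y is still reachable from x.
    blocked, seen, stack = set(given), {x}, [x]
    while stack:
        node = stack.pop()
        if node == y:
            return False
        for nxt in neighbours[node]:
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                stack.append(nxt)
    return True

print(d_separated(edges, 'asia', 'smoke'))            # first statement in the list
print(d_separated(edges, 'asia', 'xray', ['either'])) # second statement in the list
print(d_separated(edges, 'asia', 'smoke', ['dysp']))  # not in the list
```

Conditioning on `dysp` opens the collider at `either` (`tub -> either <- lung`), which is why `asia` and `smoke` are no longer separated in the third call.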

## 2. Make Inferences

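**Q1.** What is the probability that a person has bronchitis given that they smoke? State index 0 is the first state declared in `asia.bif`, which is `yes` in this model (`asia_0` = 0.01 above is the probability of having visited Asia), so `'smoke': 0` selects smokers and `bronc_0` means bronchitis is present.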

In [8]:
# initialise inference class
infer = VariableElimination(asia_model)

# make query
bronchitis = infer.query(variables=['bronc'], evidence={'smoke': 0})

# print result
print(bronchitis['bronc'])

╒═════════╤══════════════╕
│ bronc   │   phi(bronc) │
╞═════════╪══════════════╡
│ bronc_0 │       0.6000 │
├─────────┼──────────────┤
│ bronc_1 │       0.4000 │
╘═════════╧══════════════╛
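
Because `smoke` is the only parent of `bronc` and no other evidence is given, this query reduces to reading the `smoke_0` column of the `bronc` CPD printed earlier. A small sketch with that table typed in by hand (the nested-dictionary layout is just an illustration):

```python
# P(bronc | smoke), copied from the printed CPD: outer key = parent state.
bronc_given_smoke = {
    "smoke_0": {"bronc_0": 0.6, "bronc_1": 0.4},
    "smoke_1": {"bronc_0": 0.3, "bronc_1": 0.7},
}

# Evidence {'smoke': 0} selects the smoke_0 column.
answer = bronc_given_smoke["smoke_0"]
print(answer)
```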


**Q2.** What is the probability that the person is a smoker given that their x-ray result is negative? With state index 0 meaning `yes`, the evidence `'xray': 1` is a negative x-ray and `smoke_0` is the smoker state.

In [9]:
smoker = infer.query(variables=['smoke'], evidence={'xray': 1})
print(smoker['smoke'])

╒═════════╤══════════════╕
│ smoke   │   phi(smoke) │
╞═════════╪══════════════╡
│ smoke_0 │       0.4767 │
├─────────┼──────────────┤
│ smoke_1 │       0.5233 │
╘═════════╧══════════════╛
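
This query reasons from an effect (`xray`) back to a cause (`smoke`), which is Bayes' rule applied along the graph. The full computation needs CPDs that are not all shown above, so the sketch below illustrates the same kind of inversion on the single edge `smoke -> bronc`. It uses the printed `bronc` CPD and an ASSUMED uniform prior on `smoke`. The `smoke` CPD does not appear in the output above, so the prior is a placeholder to replace with the model's real values.

```python
# P(bronc_0 | smoke), copied from the printed bronc CPD.
likelihood = {"smoke_0": 0.6, "smoke_1": 0.3}

# ASSUMPTION: uniform prior on smoke; the real smoke CPD is not shown above.
prior = {"smoke_0": 0.5, "smoke_1": 0.5}

# Bayes' rule: P(smoke | bronc_0) is proportional to P(bronc_0 | smoke) * P(smoke).
unnormalised = {state: likelihood[state] * prior[state] for state in prior}
evidence_prob = sum(unnormalised.values())
posterior = {state: value / evidence_prob for state, value in unnormalised.items()}

for state, prob in posterior.items():
    print(f"{state}: {prob:.4f}")
```

`VariableElimination` performs the same multiply, sum out and normalise steps, only across all the factors between `smoke` and `xray`.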


**Q3.** Consider the following two probabilities. State index 0 means `yes`, so `dysp_0` is dyspnea present, `'bronc': 1` is no bronchitis, and `'lung': 0` is lung cancer present.

1. The probability that a person has dyspnea given that they do not have bronchitis.
2. The probability that a person has dyspnea given that they do not have bronchitis but do have lung cancer.

By what percentage does the probability increase or decrease from situation 1 to situation 2?

In [10]:
dyspnea_b = infer.query(variables=['dysp'], evidence={'bronc': 1})
print(dyspnea_b['dysp'])

dyspnea_b_l = infer.query(variables=['dysp'], evidence={'bronc': 1, 'lung': 0})
print(dyspnea_b_l['dysp'])

╒════════╤═════════════╕
│ dysp   │   phi(dysp) │
╞════════╪═════════════╡
│ dysp_0 │      0.1369 │
├────────┼─────────────┤
│ dysp_1 │      0.8631 │
╘════════╧═════════════╛
╒════════╤═════════════╕
│ dysp   │   phi(dysp) │
╞════════╪═════════════╡
│ dysp_0 │      0.8000 │
├────────┼─────────────┤
│ dysp_1 │      0.2000 │
╘════════╧═════════════╛


The probability of `dysp_0` rises from 0.1369 (about 14%) to 0.8000 (80%), an increase of 66.31 percentage points. Using the unrounded values, the relative change is ((0.8000 - 0.1369) * 100) / 0.1369 ≈ +484.37%.
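
The percentage change can be computed directly from the two `dysp_0` values in the tables. Rounding 0.1369 to 0.14 before dividing gives 471.43% instead, so the sketch below works from the unrounded numbers and reports both the absolute and the relative change (variable names are illustrative):

```python
p_before = 0.1369  # phi(dysp_0) with evidence {'bronc': 1}
p_after = 0.8000   # phi(dysp_0) with evidence {'bronc': 1, 'lung': 0}

# Absolute change, in percentage points.
absolute_change = (p_after - p_before) * 100
# Relative change, as a percentage of the starting value.
relative_change = (p_after - p_before) / p_before * 100

print(f"absolute: {absolute_change:+.2f} percentage points")
print(f"relative: {relative_change:+.2f}%")
```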