# Title: Heart Disease Databases
The Cleveland database contains 76 attributes, but all published experiments use a subset of
14 of them. Among the heart disease databases, the Cleveland database is the only one that ML
researchers have used to date. The "heartdisease" field indicates the presence of heart disease
in the patient. It is integer valued from 0 (no presence) to 4.

Class distribution (value of heartdisease):
Database:    0    1    2    3    4  Total
Cleveland: 164   55   36   35   13    303
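
The class counts above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are not from the notebook, and collapsing values 1 to 4 into a single "disease present" class is an assumption, suggested by the binary `heartdisease` states that appear in the inference output later on.

```python
# Class counts for the Cleveland database, copied from the table above.
class_counts = {0: 164, 1: 55, 2: 36, 3: 35, 4: 13}

total = sum(class_counts.values())
# Assumption: values 1-4 all mean "disease present", 0 means "absent".
present = sum(n for label, n in class_counts.items() if label > 0)

print(f"total patients: {total}")
print(f"disease present: {present} ({present / total:.1%})")
```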
    
Attribute Information:
    
1. age: age in years
2. gender: gender (1 = male; 0 = female)
3. cp: chest pain type
 Value 1: typical angina
 Value 2: atypical angina
 Value 3: non-anginal pain
 Value 4: asymptomatic
4. trestbps: resting blood pressure (in mm Hg on admission to the hospital)
5. chol: serum cholesterol in mg/dl
6. fbs: (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false)
7. restecg: resting electrocardiographic results
 Value 0: normal
 Value 1: having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV)
 Value 2: showing probable or definite left ventricular hypertrophy by Estes' criteria
8. thalach: maximum heart rate achieved
9. exang: exercise induced angina (1 = yes; 0 = no)
10. oldpeak: ST depression induced by exercise relative to rest
11. slope: the slope of the peak exercise ST segment
 Value 1: upsloping
 Value 2: flat
 Value 3: downsloping
12. ca: number of major vessels (0-3) colored by fluoroscopy
13. thal: 3 = normal; 6 = fixed defect; 7 = reversible defect
14. heartdisease: diagnosis of heart disease (angiographic disease status); integer valued
from 0 (no presence) to 4
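
When inspecting rows by hand, it can help to translate the coded values in the list above back into labels. The helper below is a hypothetical sketch (`CODE_LABELS` and `decode` are not part of the notebook). It covers only the codes documented above and returns a placeholder for anything else, since the rows loaded later contain codes such as `cp = 0` that this list does not define.

```python
# Code-to-label tables transcribed from the attribute list above.
CODE_LABELS = {
    "cp": {1: "typical angina", 2: "atypical angina",
           3: "non-anginal pain", 4: "asymptomatic"},
    "restecg": {0: "normal", 1: "ST-T wave abnormality",
                2: "left ventricular hypertrophy"},
    "slope": {1: "upsloping", 2: "flat", 3: "downsloping"},
    "thal": {3: "normal", 6: "fixed defect", 7: "reversible defect"},
}

def decode(attribute, code):
    """Return the documented label for a code, or a placeholder if undocumented."""
    return CODE_LABELS.get(attribute, {}).get(code, f"undocumented code {code}")

print(decode("cp", 4))
print(decode("cp", 0))
```

Returning a placeholder instead of raising keeps the helper usable on files whose coding differs from the documentation, while still making the mismatch visible.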

In [2]:
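import numpy as np
import pandas as pd
# Note: depending on the pgmpy version, this class may instead be named
# BayesianNetwork (or DiscreteBayesianNetwork in the most recent releases).
from pgmpy.models import BayesianModel
from pgmpy.estimators import MaximumLikelihoodEstimator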

In [3]:
# If you see "No module named 'pgmpy'", run the commands below in the Anaconda prompt:
# pip install pgmpy
# pip install torch

In [4]:
# Read the Cleveland heart disease data
heartDisease = pd.read_csv('heart.csv')
heartDisease.head()

Unnamed: 0,age,gender,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,heartdisease
0,63,1,3,145,233,1,0,150,0,2.3,0,0,1,1
1,37,1,2,130,250,0,1,187,0,3.5,0,0,2,1
2,41,0,1,130,204,0,0,172,0,1.4,2,0,2,1
3,56,1,1,120,236,0,1,178,0,0.8,2,0,2,1
4,57,0,0,120,354,0,1,163,1,0.6,2,0,2,1


In [5]:
# Handle missing data: drop four columns and mark '?' entries as NaN

heartDisease = heartDisease.drop(columns=['oldpeak', 'slope', 'ca', 'thal'])

heartDisease = heartDisease.replace('?', np.nan)
heartDisease.head()

Unnamed: 0,age,gender,cp,trestbps,chol,fbs,restecg,thalach,exang,heartdisease
0,63,1,3,145,233,1,0,150,0,1
1,37,1,2,130,250,0,1,187,0,1
2,41,0,1,130,204,0,0,172,0,1
3,56,1,1,120,236,0,1,178,0,1
4,57,0,0,120,354,0,1,163,1,1


In [6]:
heartDisease.columns

Index(['age', 'gender', 'cp', 'trestbps', 'chol', 'fbs', 'restecg', 'thalach',
       'exang', 'heartdisease'],
      dtype='object')

In [7]:
# Define the structure of the Bayesian network as (parent, child) edges

model = BayesianModel([('age', 'trestbps'), ('age', 'fbs'), ('gender', 'trestbps'), ('gender', 'fbs'),
                       ('exang', 'trestbps'), ('trestbps', 'heartdisease'), ('fbs', 'heartdisease'),
                       ('heartdisease', 'restecg'), ('heartdisease', 'thalach'), ('heartdisease', 'chol')])
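
To see what this edge list implies, the following sketch derives each node's parents using plain Python rather than pgmpy. It is illustrative only: `edges` repeats the list passed to the model above, and `parents` and `nodes` are names introduced here. Nodes with no parents get an unconditional (marginal) distribution when the model is fitted, while the others get a table conditioned on their parents.

```python
from collections import defaultdict

# Same (parent, child) pairs as in the model definition above.
edges = [('age', 'trestbps'), ('age', 'fbs'), ('gender', 'trestbps'), ('gender', 'fbs'),
         ('exang', 'trestbps'), ('trestbps', 'heartdisease'), ('fbs', 'heartdisease'),
         ('heartdisease', 'restecg'), ('heartdisease', 'thalach'), ('heartdisease', 'chol')]

parents = defaultdict(list)
nodes = []  # nodes in order of first appearance
for parent, child in edges:
    parents[child].append(parent)
    for node in (parent, child):
        if node not in nodes:
            nodes.append(node)

# Print the factor each node contributes to the joint distribution.
for node in nodes:
    if parents[node]:
        print(f"P({node} | {', '.join(parents[node])})")
    else:
        print(f"P({node})")  # root node: marginal distribution
```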

In [8]:
# Learn the CPDs using maximum likelihood estimation
print('\n Learning CPD using Maximum likelihood estimators')
model.fit(heartDisease, estimator=MaximumLikelihoodEstimator)


 Learning CPD using Maximum likelihood estimators
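
For discrete data, maximum likelihood estimation of a conditional probability table reduces to relative frequencies: P(child = c | parent = p) is the number of rows with both values divided by the number of rows with parent = p. The sketch below shows that counting on a handful of made-up records. The records and the function name `mle_cpd` are hypothetical and are not taken from `heart.csv`, and the code is a plain-Python illustration rather than pgmpy's implementation.

```python
from collections import Counter

# Hypothetical (fbs, heartdisease) pairs, invented for illustration only.
records = [(0, 1), (0, 1), (0, 0), (1, 1), (1, 0), (1, 0), (0, 1), (1, 0)]

def mle_cpd(pairs):
    """Estimate P(child | parent) by relative frequency from (parent, child) pairs."""
    joint = Counter(pairs)                        # count of each (parent, child) combination
    parent_totals = Counter(p for p, _ in pairs)  # count of each parent value
    return {(p, c): n / parent_totals[p] for (p, c), n in joint.items()}

cpd = mle_cpd(records)
for (p, c), prob in sorted(cpd.items()):
    print(f"P(heartdisease={c} | fbs={p}) = {prob:.2f}")
```

Each column of the resulting table sums to 1, because the counts for a given parent value are divided by that parent value's total.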


In [9]:
# get_cpds('age') returns the marginal distribution P(age): age has no parents in the network
print('\n Marginal probability distribution of Age')
print(model.get_cpds('age'))


 Marginal probability distribution of Age
+---------+------------+
| age(29) | 0.00330033 |
+---------+------------+
| age(34) | 0.00660066 |
+---------+------------+
| age(35) | 0.0132013  |
+---------+------------+
| age(37) | 0.00660066 |
+---------+------------+
| age(38) | 0.00990099 |
+---------+------------+
| age(39) | 0.0132013  |
+---------+------------+
| age(40) | 0.00990099 |
+---------+------------+
| age(41) | 0.0330033  |
+---------+------------+
| age(42) | 0.0264026  |
+---------+------------+
| age(43) | 0.0264026  |
+---------+------------+
| age(44) | 0.0363036  |
+---------+------------+
| age(45) | 0.0264026  |
+---------+------------+
| age(46) | 0.0231023  |
+---------+------------+
| age(47) | 0.0165017  |
+---------+------------+
| age(48) | 0.0231023  |
+---------+------------+
| age(49) | 0.0165017  |
+---------+------------+
| age(50) | 0.0231023  |
+---------+------------+
| age(51) | 0.039604   |
+---------+------------+
| age(52) | 0.0429043  |
+-----

In [10]:
# gender is a root node, so this prints the marginal distribution P(gender)
print('\n Marginal probability distribution of Gender')
print(model.get_cpds('gender'))


 Marginal probability distribution of Gender
+-----------+----------+
| gender(0) | 0.316832 |
+-----------+----------+
| gender(1) | 0.683168 |
+-----------+----------+
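
Because `gender` has no parents in this network, its fitted table is just the relative frequency of each value. As a quick sanity check, the probabilities above can be turned back into row counts. This sketch assumes the model was fitted on all 303 Cleveland rows (the total quoted at the top); the variable names are introduced here for illustration.

```python
# Probabilities copied from the table above.
gender_probs = {0: 0.316832, 1: 0.683168}
n_rows = 303  # assumption: every Cleveland row was used for fitting

# Under maximum likelihood, probability = count / n_rows, so count = probability * n_rows.
implied_counts = {value: round(p * n_rows) for value, p in gender_probs.items()}
print(implied_counts)

# The implied counts should add up to n_rows and reproduce the printed probabilities.
assert sum(implied_counts.values()) == n_rows
for value, count in implied_counts.items():
    assert abs(count / n_rows - gender_probs[value]) < 1e-6
```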


In [11]:
print("Inferencing with Bayesian Network")

from pgmpy.inference import VariableElimination
HeartDisease_infer = VariableElimination(model)

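# Compute the probability of heart disease given age.
# Note: 28 is not among the age states listed above (the lowest is 29),
# which is most likely what triggers the "unknown state name" warning in the output below.
q = HeartDisease_infer.query(variables=['heartdisease'], evidence={'age': 28})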
print(q)

Inferencing with Bayesian Network


  "Found unknown state name. Trying to switch to using all state names as state numbers"
Finding Elimination Order: : 100%|██████████| 7/7 [00:00<00:00, 1315.59it/s]
Eliminating: restecg: 100%|██████████| 7/7 [00:00<00:00, 334.05it/s]


+-----------------+---------------------+
| heartdisease    |   phi(heartdisease) |
+=================+=====================+
| heartdisease(0) |              0.4001 |
+-----------------+---------------------+
| heartdisease(1) |              0.5999 |
+-----------------+---------------------+
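
Variable elimination answers a query like this by summing out the variables that are neither queried nor observed. The sketch below performs that basic step by hand for a simplified chain gender -> fbs -> heartdisease, which follows two edges of the network but ignores every other node. All probabilities are invented for illustration and the function name `query_hd_given_gender` is hypothetical; this is not how pgmpy is implemented internally.

```python
# Hypothetical CPDs for a simplified chain gender -> fbs -> heartdisease.
# All numbers are invented for illustration; they are not fitted values.
p_fbs_given_gender = {          # P(fbs | gender), indexed as [gender][fbs]
    0: {0: 0.9, 1: 0.1},
    1: {0: 0.8, 1: 0.2},
}
p_hd_given_fbs = {              # P(heartdisease | fbs), indexed as [fbs][heartdisease]
    0: {0: 0.5, 1: 0.5},
    1: {0: 0.25, 1: 0.75},
}

def query_hd_given_gender(gender):
    """P(heartdisease | gender), obtained by summing out the hidden variable fbs."""
    posterior = {0: 0.0, 1: 0.0}
    for fbs, p_fbs in p_fbs_given_gender[gender].items():
        for hd, p_hd in p_hd_given_fbs[fbs].items():
            posterior[hd] += p_fbs * p_hd   # accumulate P(fbs | gender) * P(hd | fbs)
    return posterior

result = query_hd_given_gender(1)
print(result)
```

In a larger network the same multiply-then-sum step is repeated once per hidden variable, in an order chosen to keep the intermediate tables small.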


In [12]:
# Compute the probability of heart disease given a serum cholesterol value of 100
q = HeartDisease_infer.query(variables=['heartdisease'], evidence={'chol': 100})
print(q)

Finding Elimination Order: : 100%|██████████| 7/7 [00:00<00:00, 7023.95it/s]
Eliminating: restecg: 100%|██████████| 7/7 [00:00<00:00, 254.27it/s]


+-----------------+---------------------+
| heartdisease    |   phi(heartdisease) |
+=================+=====================+
| heartdisease(0) |              0.0000 |
+-----------------+---------------------+
| heartdisease(1) |              1.0000 |
+-----------------+---------------------+
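
A posterior of exactly 0 and 1 deserves a second look. Since `chol` is a child of `heartdisease` in this network, the query applies Bayes' rule, P(heartdisease | chol) ∝ P(chol | heartdisease) · P(heartdisease), and a maximum likelihood estimate of zero for P(chol | heartdisease = 0) forces the posterior for that class to zero. One possible explanation, not verified against `heart.csv`, is that the cholesterol state selected by the query was observed in very few training rows, all of them with heart disease present. The sketch below reproduces the effect with invented numbers.

```python
# Invented numbers for illustration; not taken from the fitted model.
prior = {0: 0.45, 1: 0.55}       # P(heartdisease)
likelihood = {0: 0.0, 1: 0.01}   # P(chol = c | heartdisease) for one rare value c

# Bayes' rule: multiply prior by likelihood, then normalise.
unnormalised = {hd: prior[hd] * likelihood[hd] for hd in prior}
evidence_prob = sum(unnormalised.values())
posterior = {hd: value / evidence_prob for hd, value in unnormalised.items()}

print(posterior)  # the zero likelihood forces P(heartdisease=0 | chol=c) to 0
```

With a continuous-valued attribute such as cholesterol treated as a discrete state per distinct value, many states are backed by only one or two rows, so results like this say more about data sparsity than about the patient.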
