# Model 1 trained by algo 2

**The pgmpy library is installed to work with graphical models**

In [1]:
!pip install pgmpy
!pip install pandas
!pip install numpy

Collecting pgmpy
  Downloading https://files.pythonhosted.org/packages/a3/0e/d9fadbfaa35e010c04d43acd3ae9fbefec98897dd7d61a6b7eb5a8b34072/pgmpy-0.1.14-py3-none-any.whl (331kB)

**The model is trained on the heart.csv file and follows Algorithm 2**

In [13]:
import pandas as pd

data = pd.read_csv("/content/heart.csv")    #read dataset

from pgmpy.models import BayesianModel
from pgmpy.estimators import ParameterEstimator
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination


# Define the model structure by passing a list of (parent, child) edges.

model = BayesianModel([('age', 'trestbps'), ('age', 'fbs'), ('sex', 'trestbps'),
                       ('exang', 'trestbps'),('trestbps','target'),('fbs','target'),
                      ('target','restecg'),('target','thalach'),('target','chol')])

# Example paths: age -> trestbps -> target -> chol, and age -> fbs -> target -> restecg
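
The edge list above can be sanity-checked without pgmpy. The following is a minimal sketch in plain Python that lists each node's parents and confirms the graph is acyclic; the helper names `parents_of` and `topological_order` are assumptions made for illustration and are not part of the notebook or of pgmpy.

```python
from collections import defaultdict, deque

# Same (parent, child) edge list as the model above.
EDGES = [
    ("age", "trestbps"), ("age", "fbs"), ("sex", "trestbps"),
    ("exang", "trestbps"), ("trestbps", "target"), ("fbs", "target"),
    ("target", "restecg"), ("target", "thalach"), ("target", "chol"),
]


def parents_of(edges):
    """Map every node to the sorted list of its parents (hypothetical helper)."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(parent, [])            # root nodes get an empty list
        parents.setdefault(child, []).append(parent)
    return {node: sorted(p) for node, p in parents.items()}


def topological_order(edges):
    """Kahn's algorithm; raises ValueError if the edge list has a cycle."""
    parents = parents_of(edges)
    indegree = {node: len(p) for node, p in parents.items()}
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)
    queue = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if len(order) != len(indegree):
        raise ValueError("edge list contains a cycle")
    return order


PARENTS = parents_of(EDGES)
ORDER = topological_order(EDGES)
print(PARENTS["target"])  # parents of the class node
print(ORDER)              # parents always appear before their children
```

Reading the parents off this way makes the direction of each edge explicit: `target` has two parents (`trestbps` and `fbs`), while `restecg`, `thalach`, and `chol` are its children.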
 


**Maximum Likelihood Estimation**

In [14]:
# MLE fills the CPDs so that P(data | model) is maximal.

model.fit(data, estimator=MaximumLikelihoodEstimator)

# Build a Variable Elimination inference object for exact queries.

HeartDiseasetest_infer = VariableElimination(model)
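
For discrete variables, maximizing P(data | model) comes down to using relative frequencies: for each combination of parent values, count how often each child value occurs and normalize. The sketch below shows that idea in plain Python. The function `mle_cpd` and the toy records are assumptions made for illustration; they are not pgmpy's implementation and the numbers are not taken from heart.csv.

```python
from collections import Counter, defaultdict


def mle_cpd(rows, child, parents):
    """Estimate P(child | parents) by relative frequencies (hypothetical helper)."""
    counts = defaultdict(Counter)
    for row in rows:
        key = tuple(row[p] for p in parents)   # one parent configuration
        counts[key][row[child]] += 1
    cpd = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        cpd[key] = {state: n / total for state, n in counter.items()}
    return cpd


# Hypothetical toy records, NOT taken from heart.csv.
TOY_ROWS = [
    {"fbs": 0, "target": 1},
    {"fbs": 0, "target": 1},
    {"fbs": 0, "target": 0},
    {"fbs": 1, "target": 0},
]

TOY_CPD = mle_cpd(TOY_ROWS, child="target", parents=["fbs"])
print(TOY_CPD)
```

In this sketch a parent configuration that never occurs in the data simply has no entry, and a configuration seen only once gets probability 1 for whatever value it happened to have. Both effects are worth keeping in mind when the dataset is small.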

####  **Predicting values from new data points**

To predict a variable, we query its distribution given the features we observe as evidence.

In [17]:
# Query 1: takes the values of restecg and trestbps and predicts the probability of target

q1=HeartDiseasetest_infer.query(variables=['target'],evidence={'restecg':1,'trestbps':120}, joint=False)

print("\n\nAs an example, the following is the marginal probability when restecg is set to 1 and trestbps is set to 120:")
print(q1['target'])

# If P(target = 0) is greater than P(target = 1), the 'if' branch runs; otherwise the 'else' branch runs.

if q1['target'].values[0] > q1['target'].values[1]:
    print("You are not suffering from heart disease")  
else:
    print("You are suffering from heart disease, please visit  doctor")   

Finding Elimination Order: : 100%|██████████| 6/6 [00:00<00:00, 1541.93it/s]
Eliminating: age: 100%|██████████| 6/6 [00:00<00:00, 112.52it/s]



As an example, the following is the marginal probability when restecg is set to 1 and trestbps is set to 120:
+-----------+---------------+
| target    |   phi(target) |
+===========+===============+
| target(0) |        0.2628 |
+-----------+---------------+
| target(1) |        0.7372 |
+-----------+---------------+
You are suffering from heart disease, please visit  doctor
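
To make the query less of a black box, here is a minimal sketch of what such a posterior computation does on a deliberately simplified network. It assumes `target` has a single parent (`trestbps`) and a single observed child (`restecg`), which is simpler than the model above, where `fbs` is also a parent and has to be summed out. The function `posterior_target` and all probabilities are hypothetical illustration values, not values learned from heart.csv.

```python
def posterior_target(p_target_given_bp, p_restecg_given_target, bp, restecg):
    """P(target | trestbps=bp, restecg) in a toy chain trestbps -> target -> restecg."""
    # Multiply the factors that mention target, keeping the evidence fixed.
    unnormalized = {
        t: p_target_given_bp[bp][t] * p_restecg_given_target[t][restecg]
        for t in p_target_given_bp[bp]
    }
    # Normalize so the result sums to 1.
    z = sum(unnormalized.values())
    return {t: v / z for t, v in unnormalized.items()}


# Hypothetical CPDs, chosen only to make the arithmetic easy to follow.
P_TARGET_GIVEN_BP = {120: {0: 0.4, 1: 0.6}}
P_RESTECG_GIVEN_TARGET = {
    0: {0: 0.5, 1: 0.5},
    1: {0: 0.25, 1: 0.75},
}

POSTERIOR = posterior_target(P_TARGET_GIVEN_BP, P_RESTECG_GIVEN_TARGET, bp=120, restecg=1)
print(POSTERIOR)
```

Variable Elimination performs the same multiply-and-normalize step, with the extra work of summing out every variable that is neither queried nor observed.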





# Model 2 trained by algo 1

**Reading the dataset "environmental_factors_past_medical_records", which is used to implement Algorithm 1**

**We created this dataset ourselves. It combines environmental factors with past medical records.**

In [20]:
import pandas as pd

# Read the dataset for Algorithm 1:
environmental_factors_past_medical_records = pd.read_csv("/content/environmental_factors_past_medical_records.csv")
print(environmental_factors_past_medical_records.head(5))

   Temperature  Humidity  Noise  ...  cholestrol  heart rate(BPM)  output
0           22        20     66  ...         233              150       1
1           22        26     66  ...         250              187       1
2           22        26     66  ...         204              172       1
3           22        20     67  ...         236              178       1
4           22        23     67  ...         354              163       1

[5 rows x 7 columns]


Here we create the DAG from the attribute dependencies: the attribute output depends on Temperature, Humidity, and Noise. The Bayesian network is then created by calling BayesianModel from pgmpy.models.

In [6]:
from pgmpy.models import BayesianModel
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination

# We know that the variables relate as follows:
model = BayesianModel([('Temperature', 'output'), ('Humidity', 'output'), ('Noise', 'output')])

# Learn the CPDs using the Maximum Likelihood Estimator (uses relative frequencies).
# This calculates the CPD of each attribute in the environmental_factors_past_medical_records dataset.
model.fit(environmental_factors_past_medical_records, estimator=MaximumLikelihoodEstimator)

# Print the probabilities of the distinct values of each attribute.
print(model.get_cpds('Temperature'))
print(model.get_cpds('Humidity'))
print(model.get_cpds('Noise'))

model.get_independencies()


# Exact inference using the VariableElimination class.
# Create an inference object on which we can run queries to get our results.
environmental_factors_past_medical_records_infer = VariableElimination(model)


+-----------------+----------+
| Temperature(19) | 0.122449 |
+-----------------+----------+
| Temperature(20) | 0.346939 |
+-----------------+----------+
| Temperature(21) | 0.367347 |
+-----------------+----------+
| Temperature(22) | 0.163265 |
+-----------------+----------+
+---------------+-----------+
| Humidity(20)  | 0.306122  |
+---------------+-----------+
| Humidity(21)  | 0.346939  |
+---------------+-----------+
| Humidity(22)  | 0.102041  |
+---------------+-----------+
| Humidity(23)  | 0.0612245 |
+---------------+-----------+
| Humidity(24)  | 0.0204082 |
+---------------+-----------+
| Humidity(26)  | 0.0612245 |
+---------------+-----------+
| Humidity(27)  | 0.0612245 |
+---------------+-----------+
| Humidity(33)  | 0.0204082 |
+---------------+-----------+
| Humidity(140) | 0.0204082 |
+---------------+-----------+
+-----------+-----------+
| Noise(53) | 0.0204082 |
+-----------+-----------+
| Noise(54) | 0.0816327 |
+-----------+-----------+
| Noise(55) | 0.14285

Here we run the query method on the inference object created above.

In [7]:
# Compute the probability of the target class (output) given the evidence Temperature: 22, Humidity: 20, Noise: 67
query_result = environmental_factors_past_medical_records_infer.query(variables=['output'], evidence={'Temperature': 22,'Humidity':20, 'Noise':67},joint=False)
print("**************")
print(query_result['output'])

Finding Elimination Order: : : 0it [00:00, ?it/s]
0it [00:00, ?it/s]

**************
+-----------+---------------+
| output    |   phi(output) |
+===========+===============+
| output(0) |        0.0000 |
+-----------+---------------+
| output(1) |        1.0000 |
+-----------+---------------+
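
In this network every parent of `output` is supplied as evidence, so there is nothing left to eliminate (consistent with the progress bar reporting 0 iterations) and the answer is effectively the learned CPD entry for that parent combination. With relative-frequency estimates, that entry is the share of each `output` value among the rows matching the evidence. The sketch below reproduces that lookup in plain Python and also reports how many rows support it. The function `conditional_frequency` and the toy rows are assumptions made for illustration; they are not the contents of the CSV.

```python
from collections import Counter


def conditional_frequency(rows, evidence, child):
    """Return (distribution over child, number of rows matching the evidence)."""
    matching = [r for r in rows if all(r[k] == v for k, v in evidence.items())]
    counts = Counter(r[child] for r in matching)
    total = len(matching)
    dist = {state: n / total for state, n in counts.items()} if total else {}
    return dist, total


# Hypothetical toy rows, NOT the real dataset.
TOY_ROWS = [
    {"Temperature": 22, "Humidity": 20, "Noise": 67, "output": 1},
    {"Temperature": 22, "Humidity": 20, "Noise": 66, "output": 1},
    {"Temperature": 21, "Humidity": 20, "Noise": 67, "output": 0},
]

DIST, SUPPORT = conditional_frequency(
    TOY_ROWS, {"Temperature": 22, "Humidity": 20, "Noise": 67}, "output"
)
print(DIST, SUPPORT)
```

A probability of exactly 1 can rest on very few matching records, so checking the support count alongside the posterior helps judge how much weight the result deserves.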





# **FINAL OUTPUT FUNCTION**
This function outputs the result for the 'restecg' and 'trestbps' values entered by the patient, as in the earlier example (cell 17), where restecg was 1 and trestbps was 120.

**This function is called in the last cell, because the model for Algorithm 2 is triggered by the model for Algorithm 1.**


In [24]:
def call():
    print("ENTER restecg")
    X=int(input())
    print("ENTER trestbps")
    Y=int(input())

    q1=HeartDiseasetest_infer.query(variables=['target'],evidence={'restecg':X,'trestbps':Y},joint=False)
    print(q1['target'])

    # If P(target = 0) is greater than P(target = 1), the 'if' branch runs; otherwise the 'else' branch runs.

    if q1['target'].values[0] > q1['target'].values[1]:
        print("You are not suffering from heart disease")  
    else:
        print("You are suffering from heart disease, please visit doctor")     
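
`call()` converts whatever is typed with `int(...)`, so a non-numeric entry raises `ValueError`, and a value that never occurred in the training data may not be a known state of the fitted model. A guarded prompt is one way to handle both cases. The sketch below is a minimal version under stated assumptions: `read_state`, its `input_fn` parameter, and the allowed set `{0, 1, 2}` are hypothetical and not part of the notebook.

```python
def read_state(prompt, allowed, input_fn=input):
    """Ask repeatedly until the user enters an integer contained in `allowed`."""
    while True:
        raw = input_fn(prompt)
        try:
            value = int(raw)
        except ValueError:
            print(f"{raw!r} is not an integer, please try again")
            continue
        if value in allowed:
            return value
        print(f"{value} is not an allowed value; choose from {sorted(allowed)}")


# Scripted answers stand in for keyboard input so the example runs unattended.
_answers = iter(["abc", "7", "2"])
CHOSEN = read_state("ENTER restecg: ", {0, 1, 2}, input_fn=lambda _prompt: next(_answers))
print(CHOSEN)
```

In the notebook, the allowed set for each feature could be taken from the distinct values in the corresponding column of the training data.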


**Based on the result of the query above, we trigger Algorithm 2 by calling the call() function, which runs on the cloud.**

Test values: when the system asks for restecg and trestbps, try the following pairs:

(restecg, trestbps) -> (1, 130) and (2, 114)

**If the probability of output being 1 is greater than the probability of it being 0, the 'if' branch runs and the call() function is called; otherwise the 'else' branch runs.**
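
The gating rule described above can be sketched as a small function that takes the first-stage posterior and a callable for the second stage. The name `gate` and the use of a plain dictionary instead of a pgmpy result object are assumptions made for illustration.

```python
def gate(env_posterior, second_stage):
    """Run `second_stage` only when class 1 is strictly more probable than class 0."""
    if env_posterior[0] < env_posterior[1]:
        return second_stage()
    return "You are safe...."


# Posterior printed by the environmental query earlier: output(0)=0, output(1)=1.
RESULT_RISK = gate({0: 0.0, 1: 1.0}, lambda: "second stage ran")

# Hypothetical tie: the strict comparison sends it to the safe branch.
RESULT_TIE = gate({0: 0.5, 1: 0.5}, lambda: "second stage ran")

print(RESULT_RISK)
print(RESULT_TIE)
```

Because the comparison is strict, a tie between the two classes does not trigger the second model. Whether that is the desired behavior is a design choice worth making explicit.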





In [25]:
if query_result['output'].values[0] < query_result['output'].values[1]:
    call()   
else:
    print("You are safe....")    

ENTER restecg
2
ENTER trestbps
114


Finding Elimination Order: : 100%|██████████| 6/6 [00:00<00:00, 606.73it/s]
Eliminating: age: 100%|██████████| 6/6 [00:00<00:00, 169.13it/s]

+-----------+---------------+
| target    |   phi(target) |
+===========+===============+
| target(0) |        0.9735 |
+-----------+---------------+
| target(1) |        0.0265 |
+-----------+---------------+
You are not suffering from heart disease



