
Bayesian Network

A Bayesian network (also known as a Bayesian belief network) is a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph (DAG). Bayesian networks are widely used in machine learning for reasoning under uncertainty, decision-making, and prediction.

Key Concepts of Bayesian Networks

Nodes and Edges:

Nodes: Represent random variables, which can be discrete or continuous.

Edges: Directed edges (arrows) encode direct dependencies between nodes. An edge from node A to node B means that A is a parent of B and has a direct influence on B.

Conditional Probability Distribution (CPD):

Each node has an associated CPD that quantifies the effect of its parent nodes on that node. A node with no parents has an unconditional (prior) probability distribution instead.
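Taken together, the CPDs define the joint distribution over all variables as a product of one factor per node. In standard notation, where $\mathrm{Pa}(X_i)$ denotes the parents of $X_i$, the factorization reads as follows. The second line applies it to the edge list used in the heart disease program later in this notebook, with $H$ as shorthand for the `Heartdisease` node (the shorthand is an assumption made here for readability).

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)

P(\text{age}, \text{sex}, \text{exang}, \text{cp}, H, \text{restecg}, \text{chol})
  = P(\text{age})\, P(\text{sex})\, P(\text{exang})\, P(\text{cp})\,
    P(H \mid \text{age}, \text{sex}, \text{exang}, \text{cp})\,
    P(\text{restecg} \mid H)\, P(\text{chol} \mid H)
```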

Directed Acyclic Graph (DAG):

The structure of a Bayesian network is a directed acyclic graph: following the edge directions never leads back to the starting node. This guarantees a consistent ordering of the variables in which every parent precedes its children.
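The acyclicity requirement can be checked mechanically. The following is a minimal sketch using only the Python standard library (`graphlib`, Python 3.9 or later). The edge list is the one used for the heart disease model later in this notebook; the helper name `topological_order` and the extra edge `('chol', 'age')` are hypothetical and exist only for this illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Edge list copied from the heart disease model built later in this notebook.
edges = [
    ("age", "Heartdisease"),
    ("sex", "Heartdisease"),
    ("exang", "Heartdisease"),
    ("cp", "Heartdisease"),
    ("Heartdisease", "restecg"),
    ("Heartdisease", "chol"),
]


def topological_order(edge_list):
    """Return a parents-before-children ordering, or None if a directed cycle exists."""
    parents = {}
    for parent, child in edge_list:
        parents.setdefault(parent, set())
        parents.setdefault(child, set()).add(parent)
    try:
        # TopologicalSorter expects a mapping from each node to its predecessors.
        return list(TopologicalSorter(parents).static_order())
    except CycleError:
        return None


order = topological_order(edges)
# Hypothetical extra edge that closes the loop age -> Heartdisease -> chol -> age.
cyclic = topological_order(edges + [("chol", "age")])

print(order)   # a valid ordering: every parent appears before its children
print(cyclic)  # None, because the extra edge introduces a directed cycle
```

A graph that passes this check is a valid structure for a Bayesian network; one that fails it is not.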

Inference:

Bayesian networks enable reasoning about the relationships between variables. Inference involves computing the probability distribution of a subset of variables given evidence about other variables. Common inference algorithms include Variable Elimination and Belief Propagation.
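To make the idea of inference concrete, here is a small sketch that answers a query by brute-force enumeration: it sums the joint distribution over the unobserved variable and then normalizes. This is not Variable Elimination or Belief Propagation, which reach the same answer more efficiently on larger networks. The toy network (disease → symptom, disease → test) and all probabilities are hypothetical values chosen for illustration.

```python
# Hypothetical CPDs for a toy network: disease -> symptom, disease -> test.
p_disease = {1: 0.1, 0: 0.9}
# Indexed as [disease][symptom] and [disease][test].
p_symptom_given_disease = {1: {1: 0.8, 0: 0.2}, 0: {1: 0.2, 0: 0.8}}
p_test_given_disease = {1: {1: 0.9, 0: 0.1}, 0: {1: 0.05, 0: 0.95}}


def joint(disease, symptom, test):
    """Joint probability as the product of one CPD entry per node."""
    return (
        p_disease[disease]
        * p_symptom_given_disease[disease][symptom]
        * p_test_given_disease[disease][test]
    )


def posterior_disease(symptom):
    """P(disease | symptom), summing the joint over the unobserved test variable."""
    unnormalised = {
        d: sum(joint(d, symptom, t) for t in (0, 1))
        for d in (0, 1)
    }
    total = sum(unnormalised.values())
    return {d: value / total for d, value in unnormalised.items()}


posterior = posterior_disease(symptom=1)
print(posterior)
```

With these made-up numbers, observing the symptom raises the probability of disease from the prior of 0.1 to roughly 0.31.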

Learning:

Bayesian networks can be learned from data.

There are two main types of learning:

Parameter Learning: Estimating the CPDs given a structure.

Structure Learning: Identifying the network structure from data.
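For discrete variables, maximum-likelihood parameter learning amounts to counting: each CPD entry is the fraction of records with a given parent value that also show the given child value. The sketch below illustrates this for a single parent-child pair. The records and the helper name `estimate_cpd` are hypothetical and are not taken from the Heart Disease Data Set.

```python
from collections import Counter

# Hypothetical toy records of (parent_value, child_value), both binary.
records = [(0, 0), (0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 0), (1, 1)]


def estimate_cpd(pairs):
    """Maximum-likelihood estimate of P(child | parent) from (parent, child) pairs."""
    pair_counts = Counter(pairs)
    parent_counts = Counter(parent for parent, _ in pairs)
    return {
        (parent, child): count / parent_counts[parent]
        for (parent, child), count in pair_counts.items()
    }


cpd = estimate_cpd(records)
for (parent, child), probability in sorted(cpd.items()):
    print(f"P(child={child} | parent={parent}) = {probability:.2f}")
```

Structure learning is a harder problem, since it searches over possible graphs, and is not covered by this sketch.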

**13. Write a Python program to construct a Bayesian network considering medical data. Use this model to demonstrate the diagnosis of heart patients using the standard Heart Disease Data Set.**

In [1]:
!pip install pgmpy

Collecting pgmpy
  Downloading pgmpy-0.1.26-py3-none-any.whl.metadata (9.1 kB)
Downloading pgmpy-0.1.26-py3-none-any.whl (2.0 MB)
Installing collected packages: pgmpy
Successfully installed pgmpy-0.1.26


In [2]:
import numpy as np
import pandas as pd
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.models import BayesianModel
from pgmpy.inference import VariableElimination

# Read Cleveland Heart Disease data
heartDisease = pd.read_csv('/content/Lab13.csv')
heartDisease = heartDisease.replace('?', np.nan)

# Display the data
print('Few examples from the dataset are given below:')
print(heartDisease.head())

# Display the attribute names and data types
print('\nAttributes and datatypes:')
print(heartDisease.dtypes)

# Check the unique values in the 'restecg' column
print('\nUnique values in restecg:')
unique_restecg_values = heartDisease['restecg'].unique()
print(unique_restecg_values)  # Print unique values

FileNotFoundError: [Errno 2] No such file or directory: '/content/Lab13.csv'

In [None]:
# Create Model - Bayesian Network
model = BayesianModel([
    ('age', 'Heartdisease'),
    ('sex', 'Heartdisease'),
    ('exang', 'Heartdisease'),
    ('cp', 'Heartdisease'),
    ('Heartdisease', 'restecg'),
    ('Heartdisease', 'chol')
])

# Learning CPDs using Maximum Likelihood Estimators
print('\nLearning CPD using Maximum Likelihood Estimators:')
model.fit(heartDisease, estimator=MaximumLikelihoodEstimator)

# Inferencing with Bayesian Network
print('\nInferencing with Bayesian Network:')
HeartDiseasetest_infer = VariableElimination(model)

# Evidence must be a state that actually occurs in the data, so take it from
# unique_restecg_values instead of hard-coding it. For example, if
# unique_restecg_values is [0, 1, 2], any of those values is valid evidence.
if len(unique_restecg_values) > 0:
    valid_restecg_value = unique_restecg_values[0]  # Change this if needed based on your dataset
    print(f'\n1. Probability of HeartDisease given evidence= restecg: {valid_restecg_value}')
    q1 = HeartDiseasetest_infer.query(variables=['Heartdisease'], evidence={'restecg': valid_restecg_value})
    print(q1)
else:
    print("No unique values found for restecg.")

# Computing the Probability of HeartDisease given cp
print('\n2. Probability of HeartDisease given evidence= cp: 2')
q2 = HeartDiseasetest_infer.query(variables=['Heartdisease'], evidence={'cp': 2})
print(q2)
