# Title: **Write a Program to Construct a Bayesian Network from Given Data**

## Objective: To develop a Python program that constructs a Bayesian Network from a given dataset, demonstrating how probabilistic graphical models represent and reason about uncertainty.

### Theory
Bayesian Networks are probabilistic graphical models that represent a set of variables and their conditional dependencies via a directed acyclic graph (DAG). These networks are used extensively in various fields such as bioinformatics, risk management, and machine learning to model uncertain systems and reason about them efficiently.
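
To make the "directed acyclic graph" requirement concrete, here is a minimal sketch that checks an edge list for cycles using only the standard library. The edge list is the student network used later in this exercise; the helper names `parents_of` and `is_dag` are illustrative and not part of `pgmpy`.

```python
from graphlib import TopologicalSorter, CycleError  # Python 3.9+

# Edges of the student network used in this exercise: (parent, child)
edges = [("I", "G"), ("I", "S"), ("G", "S")]

def parents_of(edges):
    """Map each node to the set of its parents."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(parent, set())
        parents.setdefault(child, set()).add(parent)
    return parents

def is_dag(edges):
    """Return True if the directed graph described by `edges` has no cycle."""
    try:
        # TopologicalSorter expects a mapping node -> predecessors (here: parents)
        list(TopologicalSorter(parents_of(edges)).static_order())
        return True
    except CycleError:
        return False

parents = parents_of(edges)
order = list(TopologicalSorter(parents).static_order())
print("Parents:", parents)
print("Topological order:", order)
print("Is a DAG:", is_dag(edges))
print("Still a DAG after adding S -> I:", is_dag(edges + [("S", "I")]))
```

A topological order exists only when the graph is acyclic, which is why adding the edge `S -> I` makes the check fail.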

The nodes in a Bayesian Network represent different variables, and the edges depict the conditional dependencies between these variables. By applying Bayesian inference, one can perform reasoning and make predictions based on observed data.
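
For reference, the edges determine how the joint distribution factorizes: each variable is conditioned only on its parents. The general form is shown first, followed by the special case for the three-variable student network (I → G, I → S, G → S) used later in this exercise. The notation $\mathrm{Pa}(X_i)$ for the parents of $X_i$ is introduced here for illustration.

$$
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
$$

$$
P(I, G, S) = P(I)\, P(G \mid I)\, P(S \mid I, G)
$$

Each factor on the right-hand side corresponds to one conditional probability distribution (CPD) that the program estimates from the data.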

### Materials/Tools Required
- Python 3.x installed on a computer
- Python libraries: `pandas` for data manipulation, `pgmpy` for constructing Bayesian Networks, and `networkx` with `matplotlib` for visualizing the network
- Text editor or Integrated Development Environment (IDE) like PyCharm, Visual Studio Code, or Jupyter Notebook

### Procedure
1. Install the required Python libraries using pip:
   ```bash
   pip install pandas pgmpy networkx matplotlib
   ```
2. Open your Python development environment.
3. Type the provided code into the editor.
4. Save the file with a `.py` extension, for example, `construct_bayesian_network.py`.
5. Run the program in your development environment.
6. Observe how the Bayesian Network is constructed and visualize the network structure.
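
Before running the program, it can help to confirm that the libraries are importable in the active environment, since a missing module only shows up as an error partway through a cell. The following is a small sketch using the standard library; the helper name `missing_modules` is illustrative.

```python
import importlib.util

def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Libraries imported by the code in this exercise
required = ["pandas", "pgmpy", "networkx", "matplotlib"]

missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
    print("Install them with: pip install " + " ".join(missing))
else:
    print("All required modules are importable.")
```

This only checks that a package can be located; it does not verify that the installed versions are compatible with each other.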

In [2]:
!pip install pandas pgmpy

In [3]:
### Python Program Code

import pandas as pd
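# Note: newer pgmpy releases rename this class (BayesianNetwork, then DiscreteBayesianNetwork);
# if this import fails, use the name provided by the installed version.
from pgmpy.models import BayesianModel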
from pgmpy.estimators import MaximumLikelihoodEstimator

# Example dataset: Simple student model with variables Grade (G), Intelligence (I), and SAT (S)
data = pd.DataFrame(data={'I': ['low', 'low', 'high', 'high'],
                          'S': ['good', 'poor', 'good', 'poor'],
                          'G': ['A', 'B', 'A', 'C']})

# Constructing the Bayesian Network
# Defining the structure with edges representing conditional dependencies
model = BayesianModel([('I', 'G'), ('I', 'S'), ('G', 'S')])

# Fitting the model using Maximum Likelihood Estimation
model.fit(data, estimator=MaximumLikelihoodEstimator)

# Output the constructed model parameters
for cpd in model.get_cpds():
    print("CPD of {0}:".format(cpd.variable))
    print(cpd)

# Visualization: this pgmpy installation has no `pgmpy.visualization` module
# (importing it raises ModuleNotFoundError), so the network is drawn with
# networkx in the next cell instead.



CPD of I:
+---------+-----+
| I(high) | 0.5 |
+---------+-----+
| I(low)  | 0.5 |
+---------+-----+
CPD of G:
+------+---------+--------+
| I    | I(high) | I(low) |
+------+---------+--------+
| G(A) | 0.5     | 0.5    |
+------+---------+--------+
| G(B) | 0.0     | 0.5    |
+------+---------+--------+
| G(C) | 0.5     | 0.0    |
+------+---------+--------+
CPD of S:
+---------+---------+--------+---------+--------+---------+--------+
| G       | G(A)    | G(A)   | G(B)    | G(B)   | G(C)    | G(C)   |
+---------+---------+--------+---------+--------+---------+--------+
| I       | I(high) | I(low) | I(high) | I(low) | I(high) | I(low) |
+---------+---------+--------+---------+--------+---------+--------+
| S(good) | 1.0     | 1.0    | 0.5     | 0.0    | 0.0     | 0.5    |
+---------+---------+--------+---------+--------+---------+--------+
| S(poor) | 0.0     | 0.0    | 0.5     | 1.0    | 1.0     | 0.5    |
+---------+---------+--------+---------+--------+---------+--------+

In [5]:
import pandas as pd
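# Newer pgmpy releases rename BayesianModel; adjust this import to the installed version if needed.
from pgmpy.models import BayesianModel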
from pgmpy.estimators import MaximumLikelihoodEstimator
import networkx as nx
import matplotlib.pyplot as plt

# Example dataset: Simple student model with variables Grade (G), Intelligence (I), and SAT (S)
data = pd.DataFrame(data={'I': ['low', 'low', 'high', 'high'],
                          'S': ['good', 'poor', 'good', 'poor'],
                          'G': ['A', 'B', 'A', 'C']})

# Constructing the Bayesian Network
model = BayesianModel([('I', 'G'), ('I', 'S'), ('G', 'S')])

# Fitting the model using Maximum Likelihood Estimation
model.fit(data, estimator=MaximumLikelihoodEstimator)

# Output the constructed model parameters
for cpd in model.get_cpds():
    print("CPD of {0}:".format(cpd.variable))
    print(cpd)

# Visualize the network
# Passing an explicit Axes to nx.draw avoids the "'_AxesStack' object is not callable"
# TypeError that older networkx releases can raise with newer matplotlib
# when nx.draw has to create its own axes.
G = nx.DiGraph()
G.add_edges_from(model.edges())
pos = nx.spring_layout(G, seed=42)  # fixed seed for a reproducible layout
fig, ax = plt.subplots()
nx.draw(G, pos, ax=ax, with_labels=True, node_size=2000, node_color='skyblue', font_size=10, font_color='black')
ax.set_title('Bayesian Network Visualization')
plt.show()



CPD of I:
+---------+-----+
| I(high) | 0.5 |
+---------+-----+
| I(low)  | 0.5 |
+---------+-----+
CPD of G:
+------+---------+--------+
| I    | I(high) | I(low) |
+------+---------+--------+
| G(A) | 0.5     | 0.5    |
+------+---------+--------+
| G(B) | 0.0     | 0.5    |
+------+---------+--------+
| G(C) | 0.5     | 0.0    |
+------+---------+--------+
CPD of S:
+---------+---------+--------+---------+--------+---------+--------+
| G       | G(A)    | G(A)   | G(B)    | G(B)   | G(C)    | G(C)   |
+---------+---------+--------+---------+--------+---------+--------+
| I       | I(high) | I(low) | I(high) | I(low) | I(high) | I(low) |
+---------+---------+--------+---------+--------+---------+--------+
| S(good) | 1.0     | 1.0    | 0.5     | 0.0    | 0.0     | 0.5    |
+---------+---------+--------+---------+--------+---------+--------+
| S(poor) | 0.0     | 0.0    | 0.5     | 1.0    | 1.0     | 0.5    |
+---------+---------+--------+---------+--------+---------+--------+

In [4]:
pip install networkx matplotlib


### Observations
- The program prints the Conditional Probability Distribution (CPD) of each variable in the Bayesian Network: P(I), P(G | I), and P(S | G, I).
- The visualization step, if executed, draws the network structure (I → G, I → S, G → S), which makes the conditional dependencies easy to see.
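
To see where the printed numbers come from, the sketch below re-derives the CPDs by counting relative frequencies in the same four records, using plain Python rather than `pgmpy`. The function name `mle_cpd` is illustrative. Note that the printed table for S shows 0.5 for the parent combinations (G=B, I=high) and (G=C, I=low), which never occur in the data; this suggests a uniform fallback for unobserved combinations. The sketch simply leaves those combinations out so the gap stays visible.

```python
from collections import Counter, defaultdict

# The same four records as the example dataset
records = [
    {"I": "low",  "S": "good", "G": "A"},
    {"I": "low",  "S": "poor", "G": "B"},
    {"I": "high", "S": "good", "G": "A"},
    {"I": "high", "S": "poor", "G": "C"},
]

def mle_cpd(records, child, parents):
    """Estimate P(child | parents) by relative frequency.

    Returns {parent_values: {child_value: probability}} for the parent
    configurations that actually occur in the data.
    """
    counts = defaultdict(Counter)
    for row in records:
        key = tuple(row[p] for p in parents)
        counts[key][row[child]] += 1
    child_states = sorted({row[child] for row in records})
    cpd = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        cpd[key] = {state: counter[state] / total for state in child_states}
    return cpd

cpd_i = mle_cpd(records, "I", [])
cpd_g = mle_cpd(records, "G", ["I"])
cpd_s = mle_cpd(records, "S", ["G", "I"])

print("P(I):", cpd_i)
print("P(G | I):", cpd_g)
print("P(S | G, I):", cpd_s)
```

The observed columns match the printed tables, for example P(G | I=high) assigns 0.5 to A, 0.0 to B, and 0.5 to C.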

### Conclusion
Constructing a Bayesian Network from given data makes the dependencies between variables explicit and supports decision-making under uncertainty. The approach is useful wherever predictions must be drawn from observed data by probabilistic reasoning, although with only four records the probabilities estimated here are purely illustrative.
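
As an illustration of reasoning with the fitted network, the sketch below answers two queries by brute-force enumeration over the joint distribution P(I) · P(G | I) · P(S | G, I). The probabilities are copied from the printed CPD tables; the function names `joint` and `posterior` are illustrative. `pgmpy` ships its own inference classes (for example `pgmpy.inference.VariableElimination`), which would be the natural choice for larger networks; this hand-written version only shows the idea.

```python
from itertools import product

# CPDs copied from the printed tables
p_i = {"high": 0.5, "low": 0.5}
p_g_given_i = {  # P(G | I), keyed by (G, I)
    ("A", "high"): 0.5, ("A", "low"): 0.5,
    ("B", "high"): 0.0, ("B", "low"): 0.5,
    ("C", "high"): 0.5, ("C", "low"): 0.0,
}
p_good_given_gi = {  # P(S = good | G, I), keyed by (G, I)
    ("A", "high"): 1.0, ("A", "low"): 1.0,
    ("B", "high"): 0.5, ("B", "low"): 0.0,
    ("C", "high"): 0.0, ("C", "low"): 0.5,
}

STATES = {"I": ["high", "low"], "G": ["A", "B", "C"], "S": ["good", "poor"]}

def joint(i, g, s):
    """Joint probability P(I=i, G=g, S=s) from the network factorization."""
    p_good = p_good_given_gi[(g, i)]
    p_s = p_good if s == "good" else 1.0 - p_good
    return p_i[i] * p_g_given_i[(g, i)] * p_s

def posterior(query, evidence):
    """P(query | evidence), summing the joint over assignments that match the evidence.

    Assumes the evidence has non-zero probability.
    """
    scores = {state: 0.0 for state in STATES[query]}
    for i, g, s in product(STATES["I"], STATES["G"], STATES["S"]):
        assignment = {"I": i, "G": g, "S": s}
        if all(assignment[var] == val for var, val in evidence.items()):
            scores[assignment[query]] += joint(i, g, s)
    total = sum(scores.values())
    return {state: score / total for state, score in scores.items()}

print("P(G | S=good):", posterior("G", {"S": "good"}))
print("P(I | G=C):", posterior("I", {"G": "C"}))
```

With this tiny dataset the answers are extreme (a good SAT result implies grade A with probability 1.0), which reflects the four training records rather than any general relationship.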

### Applications
- Diagnostic systems (e.g., medical diagnosis)
- Recommendation systems
- Risk assessment and management
