# Lab 3: Bayesian Networks

## Learning Objectives

By the end of this lab, you will:
- Understand Bayesian networks and their structure
- Build networks from scratch with CPTs
- Perform exact inference by enumeration, and by variable elimination with pgmpy
- Use sampling for approximate inference
- Apply networks to real-world problems
- Use pgmpy to build, validate, and query networks

## What are Bayesian Networks?

A **Bayesian Network** (also called a **belief network**) is:
- A **graphical model** representing probabilistic relationships
- A **Directed Acyclic Graph (DAG)** where:
  - Nodes = random variables
  - Edges = direct probabilistic dependencies
- Each node has a **Conditional Probability Table (CPT)**

**Key advantage**: Compactly represent complex joint distributions!

## Why Bayesian Networks?

**Problem**: A full joint distribution over n binary variables requires 2ⁿ − 1 independent parameters
- 10 variables → 1,023 parameters
- 20 variables → 1,048,575 parameters

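**Solution**: Exploit conditional independence!
- If each node has at most k parents, a network over n binary variables needs at most n·2ᵏ parameters instead of 2ⁿ − 1
- Fewer parameters and smaller factors are what make inference tractable in sparse networks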
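
The parameter counts above can be checked with a few lines of Python. This is a minimal sketch: the helper names `full_joint_params` and `network_params`, and the 10-variable chain used as an example, are assumptions made for illustration and are not part of the lab's code.

```python
def full_joint_params(n: int) -> int:
    """Independent parameters in a full joint table over n binary variables."""
    return 2 ** n - 1


def network_params(num_parents: dict) -> int:
    """Independent parameters in a Bayesian network over binary variables.

    Each node needs one number, P(node = true | parents), for every
    combination of parent values, i.e. 2 ** (number of parents).
    """
    return sum(2 ** k for k in num_parents.values())


# Hypothetical 10-variable chain X1 -> X2 -> ... -> X10:
# the first node has no parent, every other node has exactly one.
chain = {f"X{i}": (0 if i == 1 else 1) for i in range(1, 11)}

print(full_joint_params(10))   # full joint table
print(network_params(chain))   # chain-structured network: 1 + 9 * 2
```

For the chain, the network needs 19 numbers where the full joint table needs 1,023.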

## Real-World Applications

- 🏥 **Medical Diagnosis**: Symptoms → Diseases → Test results
- 🤖 **Robot Localization**: Sensors → Position → Environment
- 💼 **Decision Support**: Risk factors → Outcomes → Actions
- 🔍 **Search Engines**: Query → User intent → Relevant docs
- 🧬 **Bioinformatics**: Genes → Proteins → Phenotypes


In [None]:
# Import libraries
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
from typing import Dict, List, Tuple, Set, Optional
from collections import defaultdict
from itertools import product
import pandas as pd

# Set random seed
np.random.seed(42)

# Plot settings
plt.style.use('seaborn-v0_8-darkgrid')
plt.rcParams['figure.figsize'] = (12, 8)

## Part 1: Understanding Bayesian Network Structure

### Example: The Classic "Alarm" Network

```
Burglary → Alarm ← Earthquake
          ↙     ↘
  JohnCalls     MaryCalls
```

**Story**:
- Burglary or Earthquake can trigger the Alarm
- If Alarm goes off, John and Mary might call
- John and Mary are independent given the Alarm state

### Conditional Independence

**Key insight**: Each node is conditionally independent of its non-descendants given its parents. If the variables are numbered in a topological order (parents before children), this means:

$$P(X_i | X_1, ..., X_{i-1}) = P(X_i | Parents(X_i))$$

**Joint distribution factorization**:

$$P(X_1, ..., X_n) = \prod_{i=1}^{n} P(X_i | Parents(X_i))$$
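
As a worked instance of this factorization, the sketch below multiplies one CPT entry per node to get the probability of a single complete assignment in the alarm network. The numbers are the CPT values used in the lab's code cell that follows; the dictionary layout and the helper names `bernoulli` and `joint_probability` are assumptions made for this illustration.

```python
from itertools import product

# Each entry stores P(variable = True | parents); the False case is 1 - p.
P_B = 0.001                                          # P(Burglary)
P_E = 0.002                                          # P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(Alarm | Burglary, Earthquake)
P_J = {True: 0.90, False: 0.05}                      # P(JohnCalls | Alarm)
P_M = {True: 0.70, False: 0.01}                      # P(MaryCalls | Alarm)


def bernoulli(p_true: float, value: bool) -> float:
    """Probability of `value` for a binary variable with P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint_probability(b: bool, e: bool, a: bool, j: bool, m: bool) -> float:
    """Product of one CPT entry per node, as in the factorization above."""
    return (bernoulli(P_B, b) * bernoulli(P_E, e) * bernoulli(P_A[(b, e)], a)
            * bernoulli(P_J[a], j) * bernoulli(P_M[a], m))


# No burglary, no earthquake, but the alarm rings and both neighbours call
p = joint_probability(b=False, e=False, a=True, j=True, m=True)
print(f"P(not b, not e, a, j, m) = {p:.6f}")

# Sanity check: the 32 joint probabilities sum to 1
total = sum(joint_probability(*values) for values in product([True, False], repeat=5))
print(f"Sum over all assignments = {total:.6f}")
```

Five small tables (10 numbers in total) are enough to recover any of the 32 joint probabilities.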


In [None]:
class BayesianNetworkNode:
    """Represents a node in a Bayesian Network."""
    
    def __init__(self, name: str, values: List[str]):
        """
        Initialize a node.
        
        Args:
            name: Variable name
            values: Possible values for this variable
        """
        self.name = name
        self.values = values
        self.parents = []
        self.children = []
        self.cpt = {}  # Conditional Probability Table
    
    def add_parent(self, parent: 'BayesianNetworkNode'):
        """Add a parent node."""
        if parent not in self.parents:
            self.parents.append(parent)
            parent.children.append(self)
    
    def set_cpt(self, cpt: Dict):
        """
        Set the conditional probability table.
        
        Args:
            cpt: Dictionary mapping parent values to probability distributions
        """
        self.cpt = cpt
    
    def get_probability(self, value: str, parent_values: Optional[Dict[str, str]] = None) -> float:
        """
        Get P(this=value | parent_values).
        
        Args:
            value: Value of this variable
            parent_values: Dictionary of parent variable values
        
        Returns:
            Conditional probability
        """
        if not self.parents:
            # No parents, return prior
            return self.cpt.get(value, 0.0)
        
        # Create key from parent values, in the same order as self.parents
        parent_values = parent_values or {}
        parent_key = tuple(parent_values.get(p.name, None) for p in self.parents)
        
        if parent_key in self.cpt:
            return self.cpt[parent_key].get(value, 0.0)
        
        return 0.0
    
    def __repr__(self):
        return f"Node({self.name})"


class BayesianNetwork:
    """Simple Bayesian Network implementation."""
    
    def __init__(self):
        self.nodes = {}
    
    def add_node(self, node: BayesianNetworkNode):
        """Add a node to the network."""
        self.nodes[node.name] = node
    
    def add_edge(self, parent_name: str, child_name: str):
        """Add a directed edge (dependency)."""
        parent = self.nodes[parent_name]
        child = self.nodes[child_name]
        child.add_parent(parent)
    
    def get_node(self, name: str) -> BayesianNetworkNode:
        """Get a node by name."""
        return self.nodes.get(name)
    
    def visualize(self, title: str = "Bayesian Network"):
        """Visualize the network structure."""
        G = nx.DiGraph()
        
        # Add nodes and edges
        for node in self.nodes.values():
            G.add_node(node.name)
            for parent in node.parents:
                G.add_edge(parent.name, node.name)
        
        # Create layout
        pos = nx.spring_layout(G, k=2, iterations=50)
        
        # Draw
        plt.figure(figsize=(12, 8))
        nx.draw_networkx_nodes(G, pos, node_color='lightblue', 
                              node_size=3000, alpha=0.9, edgecolors='navy', linewidths=2)
        nx.draw_networkx_labels(G, pos, font_size=12, font_weight='bold')
        nx.draw_networkx_edges(G, pos, edge_color='gray', arrows=True,
                              arrowsize=20, arrowstyle='->', width=2,
                              connectionstyle='arc3,rad=0.1')
        
        plt.title(title, fontsize=16, fontweight='bold')
        plt.axis('off')
        plt.tight_layout()
        plt.show()
    
    def __repr__(self):
        return f"BayesianNetwork({len(self.nodes)} nodes)"


# Example: Build the simple alarm network
print("Building Simple Alarm Network")
print("=" * 60)

# Create network
bn = BayesianNetwork()

# Create nodes
burglary = BayesianNetworkNode('Burglary', ['true', 'false'])
burglary.set_cpt({'true': 0.001, 'false': 0.999})

earthquake = BayesianNetworkNode('Earthquake', ['true', 'false'])
earthquake.set_cpt({'true': 0.002, 'false': 0.998})

alarm = BayesianNetworkNode('Alarm', ['true', 'false'])
# CPT: P(Alarm | Burglary, Earthquake)
alarm.set_cpt({
    ('true', 'true'): {'true': 0.95, 'false': 0.05},   # Both
    ('true', 'false'): {'true': 0.94, 'false': 0.06},  # Burglary only
    ('false', 'true'): {'true': 0.29, 'false': 0.71},  # Earthquake only
    ('false', 'false'): {'true': 0.001, 'false': 0.999} # Neither
})

john = BayesianNetworkNode('JohnCalls', ['true', 'false'])
john.set_cpt({
    ('true',): {'true': 0.90, 'false': 0.10},   # Alarm on
    ('false',): {'true': 0.05, 'false': 0.95}   # Alarm off
})

mary = BayesianNetworkNode('MaryCalls', ['true', 'false'])
mary.set_cpt({
    ('true',): {'true': 0.70, 'false': 0.30},   # Alarm on
    ('false',): {'true': 0.01, 'false': 0.99}   # Alarm off
})

# Add nodes to network
for node in [burglary, earthquake, alarm, john, mary]:
    bn.add_node(node)

# Add edges
bn.add_edge('Burglary', 'Alarm')
bn.add_edge('Earthquake', 'Alarm')
bn.add_edge('Alarm', 'JohnCalls')
bn.add_edge('Alarm', 'MaryCalls')

print("Network created!")
print(f"Nodes: {list(bn.nodes.keys())}")
print()

# Visualize
bn.visualize("Alarm Bayesian Network")

## Part 2: Exact Inference - Enumeration

### Inference Task

**Goal**: Calculate P(Query | Evidence)

**Example**: P(Burglary | JohnCalls=true, MaryCalls=true)

### Inference by Enumeration

**Algorithm**:
1. Sum over all possible values of hidden variables
2. Use the factorization from the network structure
3. Normalize to get a probability distribution

$$P(X|e) = \alpha \sum_{y} P(X, e, y)$$

where α is a normalization constant and y are hidden variables.
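
For the example query, the hidden variables are Earthquake and Alarm. Writing j and m for JohnCalls=true and MaryCalls=true, and letting e and a range over both values of Earthquake and Alarm, the sum expands with the alarm network's factorization as:

$$P(B | j, m) = \alpha \sum_{e} \sum_{a} P(B) \, P(e) \, P(a | B, e) \, P(j | a) \, P(m | a)$$

Factors that do not depend on a summation variable can be moved outside that sum, which is the idea that variable elimination builds on:

$$P(B | j, m) = \alpha \, P(B) \sum_{e} P(e) \sum_{a} P(a | B, e) \, P(j | a) \, P(m | a)$$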


In [None]:
def enumerate_all(variables: List[str], evidence: Dict[str, str], 
                 bn: BayesianNetwork) -> float:
    """
    Calculate probability by enumerating all possible assignments.
    
    Args:
        variables: Variables to sum over
        evidence: Observed variable values
        bn: Bayesian network
    
    Returns:
        Probability
    """
    if not variables:
        # Base case: calculate product of probabilities
        prob = 1.0
        for var_name in bn.nodes.keys():
            node = bn.get_node(var_name)
            value = evidence[var_name]
            parent_values = {p.name: evidence[p.name] for p in node.parents}
            prob *= node.get_probability(value, parent_values)
        return prob
    
    # Recursive case: sum over possible values of first variable
    var = variables[0]
    node = bn.get_node(var)
    rest = variables[1:]
    
    total = 0.0
    for value in node.values:
        extended_evidence = evidence.copy()
        extended_evidence[var] = value
        total += enumerate_all(rest, extended_evidence, bn)
    
    return total


def inference_enumeration(query_var: str, evidence: Dict[str, str], 
                         bn: BayesianNetwork) -> Dict[str, float]:
    """
    Perform exact inference using enumeration.
    
    Args:
        query_var: Variable to query
        evidence: Observed variables
        bn: Bayesian network
    
    Returns:
        Distribution over query variable
    """
    query_node = bn.get_node(query_var)
    hidden_vars = [v for v in bn.nodes.keys() if v != query_var and v not in evidence]
    
    distribution = {}
    for value in query_node.values:
        extended_evidence = evidence.copy()
        extended_evidence[query_var] = value
        prob = enumerate_all(hidden_vars, extended_evidence, bn)
        distribution[value] = prob
    
    # Normalize
    total = sum(distribution.values())
    if total > 0:
        distribution = {k: v/total for k, v in distribution.items()}
    
    return distribution


# Example inference queries
print("Inference Examples")
print("=" * 60)

# Query 1: P(Burglary | JohnCalls=true)
evidence1 = {'JohnCalls': 'true'}
result1 = inference_enumeration('Burglary', evidence1, bn)
print("Query 1: P(Burglary | John calls)")
for value, prob in result1.items():
    print(f"  P(Burglary={value} | John calls) = {prob:.6f}")
print()

# Query 2: P(Burglary | JohnCalls=true, MaryCalls=true)
evidence2 = {'JohnCalls': 'true', 'MaryCalls': 'true'}
result2 = inference_enumeration('Burglary', evidence2, bn)
print("Query 2: P(Burglary | John calls AND Mary calls)")
for value, prob in result2.items():
    print(f"  P(Burglary={value} | both call) = {prob:.6f}")
print()

print("Notice: More evidence (both calling) increases burglary probability!")

## Part 3: Approximate Inference - Sampling

### Why Sampling?

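**Problem**: Exact inference takes time exponential in the number of variables in the worst case
- Enumeration sums over every assignment of the hidden variables, so large networks become intractable
- We need approximate methods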

**Solution**: Monte Carlo sampling
- Generate random samples from the distribution
- Estimate probabilities from sample frequencies
- Trades accuracy for computational efficiency

### Rejection Sampling

**Algorithm**:
1. Generate a random sample from the prior distribution (ignoring the evidence)
2. If the sample matches the evidence, keep it
3. Otherwise, reject it
4. Estimate the query probability from the kept samples
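
Steps 2 and 3 mean that the fraction of samples kept is roughly P(evidence), so rare evidence wastes most of the work. The sketch below computes that fraction exactly for the evidence JohnCalls=true, MaryCalls=true in the alarm network. It reuses the lab's CPT values; the variable and helper names (`pr`, `p_evidence`) are assumptions made for this illustration.

```python
from itertools import product

# Each entry stores P(variable = True | parents); the False case is 1 - p.
P_B, P_E = 0.001, 0.002                              # P(Burglary), P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(Alarm | Burglary, Earthquake)
P_J = {True: 0.90, False: 0.05}                      # P(JohnCalls=true | Alarm)
P_M = {True: 0.70, False: 0.01}                      # P(MaryCalls=true | Alarm)


def pr(p_true: float, value: bool) -> float:
    """Probability of `value` for a binary variable with P(True) = p_true."""
    return p_true if value else 1.0 - p_true


# P(JohnCalls=true, MaryCalls=true): sum the joint over the unobserved variables
p_evidence = sum(
    pr(P_B, b) * pr(P_E, e) * pr(P_A[(b, e)], a) * P_J[a] * P_M[a]
    for b, e, a in product([True, False], repeat=3)
)

n_samples = 100_000
print(f"P(evidence) = {p_evidence:.6f}")
print(f"Expected kept samples out of {n_samples}: {n_samples * p_evidence:.0f}")
```

Only about 0.2% of the samples match this evidence, so a run of 100,000 samples bases its estimate on roughly 200 of them. This is why rejection sampling needs many samples when the evidence is unlikely.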


In [None]:
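def sample_from_network(bn: BayesianNetwork, evidence: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """
    Generate one sample from the Bayesian network by forward sampling.
    
    Args:
        bn: Bayesian network
        evidence: Values to clamp (optional). Clamped samples are not draws
            from the posterior unless they are also weighted, so
            rejection_sampling() below calls this without evidence.
    
    Returns:
        Complete assignment to all variables
    """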
    sample = evidence.copy() if evidence else {}
    
    # Topological sort to sample in correct order
    visited = set()
    order = []
    
    def visit(node_name):
        if node_name in visited:
            return
        visited.add(node_name)
        node = bn.get_node(node_name)
        for parent in node.parents:
            visit(parent.name)
        order.append(node_name)
    
    for node_name in bn.nodes:
        visit(node_name)
    
    # Sample each variable in topological order
    for var_name in order:
        if var_name in sample:
            continue
        
        node = bn.get_node(var_name)
        parent_values = {p.name: sample[p.name] for p in node.parents}
        
        # Get probability distribution
        probs = []
        values = []
        for value in node.values:
            prob = node.get_probability(value, parent_values)
            probs.append(prob)
            values.append(value)
        
        # Sample from distribution
        if sum(probs) > 0:
            probs = np.array(probs) / sum(probs)
            # Convert NumPy's string scalar to a plain str
            sample[var_name] = str(np.random.choice(values, p=probs))
        else:
            sample[var_name] = values[0]
    
    return sample


def rejection_sampling(query_var: str, evidence: Dict[str, str], 
                      bn: BayesianNetwork, n_samples: int = 10000) -> Tuple[Dict[str, float], int]:
    """
    Approximate inference using rejection sampling.
    
    Args:
        query_var: Variable to query
        evidence: Observed variables
        bn: Bayesian network
        n_samples: Number of samples to generate
    
    Returns:
        (distribution, kept): approximate distribution over the query variable
        and the number of samples that matched the evidence
    """
    counts = defaultdict(int)
    kept = 0
    
    for _ in range(n_samples):
        sample = sample_from_network(bn)
        
        # Check if sample matches evidence
        if all(sample[var] == val for var, val in evidence.items()):
            counts[sample[query_var]] += 1
            kept += 1
    
    # Convert counts to probabilities
    if kept > 0:
        distribution = {val: counts[val] / kept for val in bn.get_node(query_var).values}
    else:
        distribution = {val: 0.0 for val in bn.get_node(query_var).values}
    
    return distribution, kept


# Compare exact vs approximate inference
print("Comparing Exact vs Approximate Inference")
print("=" * 60)

evidence = {'JohnCalls': 'true', 'MaryCalls': 'true'}

# Exact inference
exact = inference_enumeration('Burglary', evidence, bn)
print("Exact inference: P(Burglary | both call)")
for value, prob in exact.items():
    print(f"  P(Burglary={value}) = {prob:.6f}")
print()

# Rejection sampling
approx, n_kept = rejection_sampling('Burglary', evidence, bn, n_samples=100000)
print(f"Rejection sampling (kept {n_kept} samples):")
for value, prob in approx.items():
    exact_prob = exact[value]
    error = abs(prob - exact_prob)
    print(f"  P(Burglary={value}) = {prob:.6f} (error: {error:.6f})")

## Part 4: Building a Medical Diagnosis Network

Let's build a small medical diagnosis network!

**Network structure**:
```
Age → Disease
Smoking → Disease
Disease → Cough
Disease → Fatigue
Disease → TestPositive
```


In [None]:
# Build medical diagnosis network
print("Medical Diagnosis Bayesian Network")
print("=" * 60)

medical_bn = BayesianNetwork()

# Create nodes
age = BayesianNetworkNode('Age', ['young', 'old'])
age.set_cpt({'young': 0.7, 'old': 0.3})

smoking = BayesianNetworkNode('Smoking', ['yes', 'no'])
smoking.set_cpt({'yes': 0.2, 'no': 0.8})

disease = BayesianNetworkNode('Disease', ['yes', 'no'])
# P(Disease | Age, Smoking)
disease.set_cpt({
    ('young', 'yes'): {'yes': 0.10, 'no': 0.90},
    ('young', 'no'): {'yes': 0.01, 'no': 0.99},
    ('old', 'yes'): {'yes': 0.50, 'no': 0.50},
    ('old', 'no'): {'yes': 0.20, 'no': 0.80}
})

cough = BayesianNetworkNode('Cough', ['yes', 'no'])
cough.set_cpt({
    ('yes',): {'yes': 0.80, 'no': 0.20},
    ('no',): {'yes': 0.05, 'no': 0.95}
})

fatigue = BayesianNetworkNode('Fatigue', ['yes', 'no'])
fatigue.set_cpt({
    ('yes',): {'yes': 0.75, 'no': 0.25},
    ('no',): {'yes': 0.10, 'no': 0.90}
})

test = BayesianNetworkNode('TestPositive', ['yes', 'no'])
test.set_cpt({
    ('yes',): {'yes': 0.90, 'no': 0.10},  # 90% sensitivity
    ('no',): {'yes': 0.05, 'no': 0.95}    # 5% false positive
})

# Add nodes
for node in [age, smoking, disease, cough, fatigue, test]:
    medical_bn.add_node(node)

# Add edges
medical_bn.add_edge('Age', 'Disease')
medical_bn.add_edge('Smoking', 'Disease')
medical_bn.add_edge('Disease', 'Cough')
medical_bn.add_edge('Disease', 'Fatigue')
medical_bn.add_edge('Disease', 'TestPositive')

print("Medical network created!")
medical_bn.visualize("Medical Diagnosis Network")

### Running Diagnostic Queries
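
Before running the queries, it helps to check one of them by hand. For a young non-smoker with a cough, Age and Smoking are observed, and the unobserved Fatigue and TestPositive sum out to 1, so only two CPT rows matter. The sketch below applies Bayes' rule directly, using the CPT entries from the medical network defined earlier in this lab; the variable names are assumptions made for this illustration.

```python
# CPT entries copied from the medical network in this lab
p_disease = 0.01              # P(Disease=yes | Age=young, Smoking=no)
p_cough_if_disease = 0.80     # P(Cough=yes | Disease=yes)
p_cough_if_healthy = 0.05     # P(Cough=yes | Disease=no)

# Unnormalized posterior for each value of Disease
unnorm_yes = p_disease * p_cough_if_disease
unnorm_no = (1 - p_disease) * p_cough_if_healthy

# Normalize so the two values sum to 1
posterior = unnorm_yes / (unnorm_yes + unnorm_no)
print(f"P(Disease=yes | young, non-smoker, cough) = {posterior:.4f}")
```

The result, 0.008 / 0.0575, should agree with what enumeration reports for Scenario 1.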

In [None]:
print("Diagnostic Queries")
print("=" * 60)

# Scenario 1: Young non-smoker with cough
print("Scenario 1: Young non-smoker with cough")
evidence1 = {'Age': 'young', 'Smoking': 'no', 'Cough': 'yes'}
result1 = inference_enumeration('Disease', evidence1, medical_bn)
print(f"P(Disease=yes) = {result1['yes']:.4f}")
print()

# Scenario 2: Old smoker with cough and fatigue
print("Scenario 2: Old smoker with cough and fatigue")
evidence2 = {'Age': 'old', 'Smoking': 'yes', 'Cough': 'yes', 'Fatigue': 'yes'}
result2 = inference_enumeration('Disease', evidence2, medical_bn)
print(f"P(Disease=yes) = {result2['yes']:.4f}")
print()

# Scenario 3: Unknown age/smoking, positive test
print("Scenario 3: Unknown background, positive test")
evidence3 = {'TestPositive': 'yes'}
result3 = inference_enumeration('Disease', evidence3, medical_bn)
print(f"P(Disease=yes) = {result3['yes']:.4f}")
print()

# Scenario 4: Old smoker with positive test
print("Scenario 4: Old smoker with positive test")
evidence4 = {'Age': 'old', 'Smoking': 'yes', 'TestPositive': 'yes'}
result4 = inference_enumeration('Disease', evidence4, medical_bn)
print(f"P(Disease=yes) = {result4['yes']:.4f}")
print()

# Visualize results
scenarios = ['Young\nnon-smoker\n+ cough', 
            'Old smoker\n+ cough\n+ fatigue',
            'Unknown\n+ test',
            'Old smoker\n+ test']
probs = [result1['yes'], result2['yes'], result3['yes'], result4['yes']]

plt.figure(figsize=(12, 6))
bars = plt.bar(scenarios, probs, color=['green', 'orange', 'yellow', 'red'], 
              alpha=0.7, edgecolor='black', linewidth=2)
plt.ylabel('P(Disease = yes | Evidence)', fontsize=12, fontweight='bold')
plt.title('Disease Probability Under Different Scenarios', fontsize=14, fontweight='bold')
plt.ylim(0, 1)
plt.grid(axis='y', alpha=0.3)

for bar, prob in zip(bars, probs):
    height = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2., height,
            f'{prob:.3f}', ha='center', va='bottom', fontsize=12, fontweight='bold')

plt.tight_layout()
plt.show()

## Part 5: Using pgmpy - Professional Library

Now let's use **pgmpy** for more efficient and feature-rich Bayesian network operations.
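
pgmpy's `TabularCPD` takes its `values` as a 2-D array with one row per state of the variable and one column per combination of evidence states. The sketch below lists the column order assumed in this lab: columns follow the order of `evidence`, with the last evidence variable changing fastest. It uses plain Python (no pgmpy calls), and the state names are those of the student network in this lab. Treat the ordering rule as an assumption to confirm by printing the CPD once it is built.

```python
from itertools import product

evidence = ['Intelligence', 'Difficulty']
state_names = {'Intelligence': ['low', 'high'],
               'Difficulty': ['easy', 'hard']}

# One column per combination of evidence states; the last variable varies fastest
columns = list(product(*(state_names[var] for var in evidence)))

# First row of the Grade CPD in this lab: P(Grade=A | column)
grade_a_row = [0.3, 0.05, 0.9, 0.5]

for (intel, diff), p in zip(columns, grade_a_row):
    print(f"P(Grade=A | Intelligence={intel}, Difficulty={diff}) = {p}")
```

Read this way, a student with high intelligence in an easy course gets an A with probability 0.9, which matches the intent of the table.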


In [None]:
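try:
    try:
        # Assumption: newer pgmpy releases name the class DiscreteBayesianNetwork
        from pgmpy.models import DiscreteBayesianNetwork as PgmpyBN
    except ImportError:
        # Older pgmpy releases expose it as BayesianNetwork
        from pgmpy.models import BayesianNetwork as PgmpyBN
    from pgmpy.factors.discrete import TabularCPD
    from pgmpy.inference import VariableElimination
    from pgmpy.sampling import BayesianModelSampling
    
    PGMPY_AVAILABLE = True
except ImportError:
    print("pgmpy not installed. Run: pip install pgmpy")
    PGMPY_AVAILABLE = False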

if PGMPY_AVAILABLE:
    print("Building Student Network with pgmpy")
    print("=" * 60)
    
    # Classic "Student" network
    # Difficulty → Grade ← Intelligence
    # Grade → Letter
    # Intelligence → SAT
    
    model = PgmpyBN([('Difficulty', 'Grade'), 
                     ('Intelligence', 'Grade'),
                     ('Grade', 'Letter'),
                     ('Intelligence', 'SAT')])
    
    # Define CPDs
    cpd_diff = TabularCPD(variable='Difficulty', variable_card=2,
                          values=[[0.6], [0.4]],  # [easy, hard]
                          state_names={'Difficulty': ['easy', 'hard']})
    
    cpd_intel = TabularCPD(variable='Intelligence', variable_card=2,
                           values=[[0.7], [0.3]],  # [low, high]
                           state_names={'Intelligence': ['low', 'high']})
    
    # P(Grade | Intelligence, Difficulty)
    # Columns: (low, easy), (low, hard), (high, easy), (high, hard)
    cpd_grade = TabularCPD(
        variable='Grade',
        variable_card=3,
        values=[[0.3, 0.05, 0.9, 0.5],   # grade A
                [0.4, 0.25, 0.08, 0.3],   # grade B
                [0.3, 0.7, 0.02, 0.2]],   # grade C
        evidence=['Intelligence', 'Difficulty'],
        evidence_card=[2, 2],
        state_names={'Grade': ['A', 'B', 'C'],
                    'Intelligence': ['low', 'high'],
                    'Difficulty': ['easy', 'hard']}
    )
    
    # P(Letter | Grade)
    cpd_letter = TabularCPD(
        variable='Letter',
        variable_card=2,
        values=[[0.1, 0.4, 0.99],   # weak letter
                [0.9, 0.6, 0.01]],  # strong letter
        evidence=['Grade'],
        evidence_card=[3],
        state_names={'Letter': ['weak', 'strong'],
                    'Grade': ['A', 'B', 'C']}
    )
    
    # P(SAT | Intelligence)
    cpd_sat = TabularCPD(
        variable='SAT',
        variable_card=2,
        values=[[0.95, 0.2],   # low SAT
                [0.05, 0.8]],  # high SAT
        evidence=['Intelligence'],
        evidence_card=[2],
        state_names={'SAT': ['low', 'high'],
                    'Intelligence': ['low', 'high']}
    )
    
    # Add CPDs to model
    model.add_cpds(cpd_diff, cpd_intel, cpd_grade, cpd_letter, cpd_sat)
    
    # Verify model
    assert model.check_model()
    print("Model is valid!")
    print()
    
    # Perform inference
    inference = VariableElimination(model)
    
    # Query 1: P(Intelligence | Grade=A)
    print("Query 1: P(Intelligence | Grade=A)")
    result = inference.query(variables=['Intelligence'], evidence={'Grade': 'A'})
    print(result)
    print()
    
    # Query 2: P(Intelligence | Grade=A, SAT=high)
    print("Query 2: P(Intelligence | Grade=A, SAT=high)")
    result = inference.query(variables=['Intelligence'], 
                            evidence={'Grade': 'A', 'SAT': 'high'})
    print(result)
    print()
    
    # Query 3: P(Letter | SAT=low, Difficulty=hard)
    print("Query 3: P(Letter | SAT=low, Difficulty=hard)")
    result = inference.query(variables=['Letter'], 
                            evidence={'SAT': 'low', 'Difficulty': 'hard'})
    print(result)
    print()
    
    # Generate samples
    print("Generating samples from the network...")
    sampler = BayesianModelSampling(model)
    samples = sampler.forward_sample(size=10)
    print(samples)
else:
    print("Install pgmpy to run this section: pip install pgmpy")

## Exercises

### Exercise 1: Build Your Own Network
Create a Bayesian network for a domain of your choice (e.g., car diagnosis, loan approval, weather prediction).
Include at least 5 variables.

In [None]:
# TODO: Build your own Bayesian network
# Your code here
pass

### Exercise 2: Explaining Away
Use the alarm network to demonstrate "explaining away":
- What is P(Burglary | Alarm=true)?
- What is P(Burglary | Alarm=true, Earthquake=true)?
- Explain why knowing about the earthquake reduces burglary probability.

In [None]:
# TODO: Demonstrate explaining away effect
# Your code here
pass

### Exercise 3: Sensitivity Analysis
For the medical diagnosis network, vary the CPT probabilities and observe how they affect the diagnosis.
Plot how P(Disease=yes | TestPositive=yes) changes as the test's sensitivity and false-positive rate vary.

In [None]:
# TODO: Perform sensitivity analysis
# Your code here
pass

## Summary

### Key Takeaways

1. **Bayesian Networks** - Compact representation of joint distributions
2. **Conditional Independence** - Key to computational efficiency
3. **Exact Inference** - Enumeration for small networks
4. **Approximate Inference** - Sampling for large networks
5. **Real Applications** - Medical diagnosis, decision support
6. **pgmpy** - Professional tool for practical work

### Why Bayesian Networks Matter

- **Intuitive**: Graphical representation is easy to understand
- **Efficient**: Exploit independence for tractable inference
- **Flexible**: Handle missing data and uncertain evidence
- **Causal**: Can represent cause-effect relationships
- **Widely used**: Medicine, robotics, finance, AI

### Next Steps

In Lab 4, we'll learn about **Hidden Markov Models** - Bayesian networks for sequential data!
