# Uncertainty Analysis with Bayesian Networks

## Overview

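This notebook models uncertainty in customer purchase behavior with a small discrete Bayesian network, queries it with variable elimination, and then illustrates a normal distribution for a continuous quantity (time on site).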

### Techniques Used

- **Bayesian Networks**: Model probabilistic dependencies between variables
- **Conditional Probability Distributions (CPDs)**: Specify each variable's distribution given the states of its parents
- **Variable Elimination**: Perform exact inference on the network
- **Normal Distributions**: Model continuous variables like time on site


In [None]:
## Implementation

import numpy as np
from scipy.stats import norm

# Note: This notebook requires pgmpy library
# Install with: pip install pgmpy

try:
    from pgmpy.models import DiscreteBayesianNetwork
    from pgmpy.factors.discrete import TabularCPD
    from pgmpy.inference import VariableElimination
    print("✓ All required libraries loaded successfully")
except ImportError as e:
    print(f"⚠ Missing library: {e}")
    print("Install with: pip install pgmpy")


In [None]:
## Bayesian Network for Purchase Prediction

# 1. Create network structure
# Variables: PurchaseHistory, TimeOnSite, ClickedPromotion -> Purchase
model = DiscreteBayesianNetwork([
    ('PurchaseHistory', 'Purchase'),
    ('TimeOnSite', 'Purchase'),
    ('ClickedPromotion', 'Purchase')
])

# 2. Define Conditional Probability Distributions
# Every variable is binary: state 0 = no / short / not clicked, state 1 = yes / long / clicked
cpd_history = TabularCPD(variable='PurchaseHistory', variable_card=2, values=[[0.7], [0.3]])
cpd_time = TabularCPD(variable='TimeOnSite', variable_card=2, values=[[0.6], [0.4]])
cpd_promotion = TabularCPD(variable='ClickedPromotion', variable_card=2, values=[[0.8], [0.2]])

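# Purchase probability depends on all three factors.
# Columns run over the evidence states with the last evidence variable changing fastest:
# (0,0,0), (0,0,1), (0,1,0), ..., (1,1,1) for (PurchaseHistory, TimeOnSite, ClickedPromotion)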
cpd_purchase = TabularCPD(
    variable='Purchase',
    variable_card=2,
    values=[
        [0.9, 0.7, 0.8, 0.4, 0.6, 0.2, 0.3, 0.1],  # P(Not Purchase)
        [0.1, 0.3, 0.2, 0.6, 0.4, 0.8, 0.7, 0.9]   # P(Purchase)
    ],
    evidence=['PurchaseHistory', 'TimeOnSite', 'ClickedPromotion'],
    evidence_card=[2, 2, 2]
)

# 3. Add CPDs to model
model.add_cpds(cpd_history, cpd_time, cpd_promotion, cpd_purchase)

# 4. Validate model
assert model.check_model()
print("✓ Bayesian Network created and validated successfully")
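
The column order of `cpd_purchase` is easy to get wrong, so it can help to spell it out. The snippet below is a rough, library-free sketch: it enumerates the parent states in the order pgmpy's `TabularCPD` is understood to use (first evidence variable slowest, last fastest) and pairs each column with its P(Purchase=1) entry. The names `column_of` and `p_purchase` are illustrative and not part of the notebook's model.

```python
from itertools import product

# Evidence order and P(Purchase=1) row copied from cpd_purchase above.
evidence = ["PurchaseHistory", "TimeOnSite", "ClickedPromotion"]
p_purchase = [0.1, 0.3, 0.2, 0.6, 0.4, 0.8, 0.7, 0.9]

# Assumption: columns follow itertools.product order, i.e. the last
# evidence variable changes fastest.
column_of = {states: i for i, states in enumerate(product([0, 1], repeat=3))}

for states, i in column_of.items():
    label = ", ".join(f"{name}={s}" for name, s in zip(evidence, states))
    print(f"column {i}: {label} -> P(Purchase=1) = {p_purchase[i]}")
```

Under that assumption, a customer with purchase history, a short visit, and a promotion click (states 1, 0, 1) maps to column 5.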


In [None]:
## Inference: Predicting Purchase Probability

# Scenario: a customer with purchase history who spent little time on site but clicked the promotion
inference = VariableElimination(model)
result = inference.query(variables=['Purchase'], evidence={
    'PurchaseHistory': 1,  # Has purchase history
    'TimeOnSite': 0,       # Short time on site
    'ClickedPromotion': 1   # Clicked promotion
})

print("Purchase Prediction Results:")
print(result)
print("\n✓ Purchase probability computed from the observed evidence")
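
As a cross-check on `VariableElimination`, the same kind of query can be answered by brute-force enumeration over the factorized joint P(PurchaseHistory) · P(TimeOnSite) · P(ClickedPromotion) · P(Purchase | parents). The following is a minimal sketch in plain Python with the numbers copied from the CPDs above. It assumes the CPD columns are ordered with the last parent changing fastest, and `purchase_probability` is a hypothetical helper, not a pgmpy function.

```python
from itertools import product

# Priors copied from the CPDs above: index 0 / 1 is the variable's state.
priors = {
    "PurchaseHistory": [0.7, 0.3],
    "TimeOnSite": [0.6, 0.4],
    "ClickedPromotion": [0.8, 0.2],
}
names = list(priors)

# P(Purchase=1 | parents), assuming the last parent changes fastest.
p_buy = dict(zip(product([0, 1], repeat=3),
                 [0.1, 0.3, 0.2, 0.6, 0.4, 0.8, 0.7, 0.9]))


def purchase_probability(evidence=None):
    """P(Purchase=1 | evidence on any subset of the parents), by enumeration."""
    evidence = evidence or {}
    numerator = 0.0
    denominator = 0.0
    for states in product([0, 1], repeat=3):
        assignment = dict(zip(names, states))
        # Skip parent configurations that contradict the evidence.
        if any(assignment[k] != v for k, v in evidence.items()):
            continue
        weight = 1.0
        for name, state in assignment.items():
            weight *= priors[name][state]
        numerator += weight * p_buy[states]
        denominator += weight
    return numerator / denominator


full = purchase_probability(
    {"PurchaseHistory": 1, "TimeOnSite": 0, "ClickedPromotion": 1}
)
partial = purchase_probability({"ClickedPromotion": 1})
marginal = purchase_probability()
print(f"all parents observed: {full:.4f}")
print(f"only promotion click observed: {partial:.4f}")
print(f"no evidence: {marginal:.4f}")
```

When all three parents are observed, the answer is simply the matching CPD column, so enumeration and variable elimination have nothing to sum out. The partial-evidence and no-evidence calls show where marginalization over the unobserved parents actually matters.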


In [None]:
## Continuous Probability Distributions

# Example: time on site is assumed to follow a normal distribution
mean_time = 5      # minutes
std_time = 2       # minutes
observed_time = 6  # minutes

# Probability that a visit lasts less than observed_time, from the normal CDF
time_prob = norm.cdf(observed_time, loc=mean_time, scale=std_time)

print(f"Mean time on site: {mean_time} minutes")
print(f"Standard deviation: {std_time} minutes")
print(f"P(Time < {observed_time} min) = {time_prob:.2%}")
print("\n✓ Continuous probability distributions model uncertainty in measurements")
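
The network above treats `TimeOnSite` as binary, while this cell treats time on site as continuous. One possible way to connect the two is to discretize the normal distribution at a cut-off. The sketch below uses the standard library's `statistics.NormalDist` as a dependency-free stand-in for `scipy.stats.norm`. The 5.5-minute `threshold` is a hypothetical value, and the notebook does not say how its 0.6 / 0.4 prior was actually chosen.

```python
from statistics import NormalDist

# Same parameters as the cell above (minutes).
time_on_site = NormalDist(mu=5, sigma=2)

# Hypothetical cut-off separating "short" (state 0) from "long" (state 1) visits.
threshold = 5.5

p_short = time_on_site.cdf(threshold)
p_long = 1 - p_short
print(f"P(TimeOnSite=0) = {p_short:.3f}, P(TimeOnSite=1) = {p_long:.3f}")

# Going the other way: the cut-off that would reproduce a 0.6 / 0.4 prior.
implied_threshold = time_on_site.inv_cdf(0.6)
print(f"Cut-off implied by a 0.6 / 0.4 split: {implied_threshold:.2f} minutes")
```

A cut-off near 5.5 minutes happens to give roughly the 0.6 / 0.4 split used in `cpd_time`, which is one plausible reading of how the discrete prior could relate to the continuous model.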


## Key Takeaways

- **Bayesian Networks** provide a structured way to model uncertain relationships
- **Conditional Probabilities** capture how variables influence each other
- **Inference** allows us to predict outcomes given observed evidence
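- **Discrete and continuous distributions** address different kinds of uncertainty: the network here uses binary variables, while the normal distribution describes a continuous quantity such as time on site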

This approach is widely used in recommendation systems, fraud detection, medical diagnosis, and risk assessment.
