# The Sprinkler Example in pgmpy

pgmpy is a Python library for working with Probabilistic Graphical Models.

Documentation and the list of supported algorithms are on the official site: http://pgmpy.org/
Examples on using pgmpy: https://github.com/pgmpy/pgmpy/tree/dev/examples

The following examples are taken from the basic tutorial on Probabilistic Graphical models using pgmpy: https://github.com/pgmpy/pgmpy_notebook  

Note 1: you need to install pgmpy and scikit-learn.

Note 2: you can load models from https://www.bnlearn.com/bnrepository/ with get_example_model (imported from pgmpy.utils), e.g.,

alarm = get_example_model("alarm")

In [1]:
from pgmpy.models import BayesianNetwork

## Step 1: Define the model structure

A BayesianNetwork can be initialized by passing a list of edges in the model structure. In this case, there are 4 edges in the model: Cloudy -> Sprinkler, Cloudy -> Raining, Raining -> Wet, Sprinkler -> Wet.

In [None]:
sprinkler_model = BayesianNetwork(
    [
        ("Cloudy", "Raining"),
        ("Cloudy", "Sprinkler"),
        ("Raining", "Wet"),
        ("Sprinkler", "Wet"),
    ]
)

## Step 2: Define the CPDs
Each node of a Bayesian Network has a CPD associated with it, hence we need to define 4 CPDs in this case. In pgmpy, CPDs can be defined using the TabularCPD class. For details on the parameters, please refer to the documentation: https://pgmpy.org/_modules/pgmpy/factors/discrete/CPD.html

In [None]:
from pgmpy.factors.discrete import TabularCPD

cpd_cloudy = TabularCPD(variable="Cloudy", variable_card=2, values=[[0.5], [0.5]])
cpd_sprinkler = TabularCPD(
    variable="Sprinkler",
    variable_card=2,
    values=[[0.1, 0.5], [0.9, 0.5]],
    evidence=["Cloudy"],
    evidence_card=[2],
)
cpd_raining = TabularCPD(
    variable="Raining",
    variable_card=2,
    values=[[0.8, 0.5], [0.2, 0.5]],
    evidence=["Cloudy"],
    evidence_card=[2],
)
cpd_wet = TabularCPD(
    variable="Wet",
    variable_card=2,
    values=[[1.0, 0.1, 0.1, 0.01], [0.0, 0.90, 0.90, 0.99]],
    evidence=["Sprinkler", "Raining"],
    evidence_card=[2, 2],
)
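
The column layout of `values` is easy to get wrong when a CPD has several parents. The following is a minimal sketch in plain Python (no pgmpy needed) of how the four columns of cpd_wet line up with the evidence states, assuming that the last evidence variable changes fastest, which is my understanding of TabularCPD. Printing cpd_wet in the notebook is the authoritative check.

```python
from itertools import product

# Values copied from the cpd_wet definition in this notebook.
wet_values = [[1.0, 0.1, 0.1, 0.01], [0.0, 0.90, 0.90, 0.99]]
evidence = ["Sprinkler", "Raining"]
evidence_card = [2, 2]

# Assumption: columns are enumerated with the LAST evidence variable
# changing fastest, i.e. (S=0,R=0), (S=0,R=1), (S=1,R=0), (S=1,R=1).
columns = list(product(*(range(card) for card in evidence_card)))

for col, states in enumerate(columns):
    label = ", ".join(f"{name}={state}" for name, state in zip(evidence, states))
    column = [row[col] for row in wet_values]
    # Each column is a distribution over Wet, so it must sum to 1.
    assert abs(sum(column) - 1.0) < 1e-9
    print(f"column {col}: {label} -> P(Wet=0)={column[0]}, P(Wet=1)={column[1]}")
```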

## Step 3: Add the CPDs to the model
After defining the model parameters, we can now add them to the model using the add_cpds method. The check_model method can then be used to verify that the CPDs are correctly defined for the model structure.

In [None]:
# Associating the parameters with the model structure.
sprinkler_model.add_cpds(cpd_cloudy, cpd_sprinkler, cpd_raining, cpd_wet)

# Checking if the cpds are valid for the model.
sprinkler_model.check_model()

print("Nodes in the model:", sprinkler_model.nodes())
print("Edges in the model:", sprinkler_model.edges())

print(sprinkler_model.get_cpds("Sprinkler"))
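
Together, the structure and the CPDs define a joint distribution through the chain rule: P(C, S, R, W) = P(C) P(S|C) P(R|C) P(W|S,R). The sketch below rebuilds that joint in plain Python from the numbers used in this notebook and confirms that it sums to 1. The table names and the `joint` helper are illustrative, not part of pgmpy.

```python
from itertools import product

# CPD values copied from this notebook (state indices 0 and 1).
p_cloudy = [0.5, 0.5]                    # P(Cloudy=c)
p_sprinkler = [[0.1, 0.5], [0.9, 0.5]]   # p_sprinkler[s][c] = P(Sprinkler=s | Cloudy=c)
p_raining = [[0.8, 0.5], [0.2, 0.5]]     # p_raining[r][c] = P(Raining=r | Cloudy=c)
p_wet = [[1.0, 0.1, 0.1, 0.01], [0.0, 0.90, 0.90, 0.99]]  # p_wet[w][2*s + r]


def joint(c, s, r, w):
    """Chain-rule factorization implied by the network structure."""
    return p_cloudy[c] * p_sprinkler[s][c] * p_raining[r][c] * p_wet[w][2 * s + r]


# Sum over all 2**4 assignments; a valid joint distribution sums to 1.
total = sum(joint(*assignment) for assignment in product(range(2), repeat=4))
print(total)
```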

## Step 4: Run basic operations on the model

In [None]:
# Check for d-separation between variables
print(sprinkler_model.is_dconnected("Cloudy", "Wet", observed=[]))
print(sprinkler_model.is_dconnected("Cloudy", "Wet", observed=["Sprinkler"]))
print(sprinkler_model.is_dconnected("Cloudy", "Wet", observed=["Raining"]))
print(sprinkler_model.is_dconnected("Cloudy", "Wet", observed=["Sprinkler", "Raining"]))
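
To see why these four calls return what they do, here is a minimal sketch of one standard d-separation test: restrict the graph to the ancestors of the variables involved, moralize it, delete the observed nodes, and look for a remaining undirected path. This is a textbook procedure written in plain Python. It is not necessarily how pgmpy implements is_dconnected, and the helper names are hypothetical.

```python
edges = [
    ("Cloudy", "Raining"),
    ("Cloudy", "Sprinkler"),
    ("Raining", "Wet"),
    ("Sprinkler", "Wet"),
]


def parents(node):
    return {u for u, v in edges if v == node}


def ancestors_of(nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents(node))
    return seen


def is_dconnected(start, end, observed=()):
    """Hypothetical helper: True if start and end are d-connected given observed."""
    keep = ancestors_of({start, end, *observed})
    # Moralize the ancestral graph: link children to parents and marry co-parents.
    neighbours = {node: set() for node in keep}
    for child in keep:
        family = parents(child)
        for p in family:
            neighbours[child].add(p)
            neighbours[p].add(child)
            neighbours[p] |= family - {p}
    # Drop observed nodes and search for any remaining undirected path.
    blocked, seen, stack = set(observed), set(), [start]
    while stack:
        node = stack.pop()
        if node in seen or node in blocked:
            continue
        seen.add(node)
        stack.extend(neighbours[node])
    return end in seen


print(is_dconnected("Cloudy", "Wet"))
print(is_dconnected("Cloudy", "Wet", observed=["Sprinkler"]))
print(is_dconnected("Cloudy", "Wet", observed=["Raining"]))
print(is_dconnected("Cloudy", "Wet", observed=["Sprinkler", "Raining"]))
```

Only observing both Sprinkler and Raining blocks every path from Cloudy to Wet.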

In [None]:
# Get all d-connected nodes
sprinkler_model.active_trail_nodes("Wet")

In [None]:
# List local independencies for a node
sprinkler_model.local_independencies("Wet")

In [None]:
sprinkler_model.local_independencies("Sprinkler")

In [None]:
sprinkler_model.local_independencies("Cloudy")

In [None]:
# Get all model implied independence conditions
sprinkler_model.get_independencies()
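
An independence statement can also be checked numerically. The sketch below, in plain Python with the CPD numbers from this notebook, shows that Sprinkler and Raining are dependent marginally but independent once Cloudy is known. The variable names are illustrative only.

```python
from itertools import product

p_cloudy = [0.5, 0.5]                    # P(Cloudy=c)
p_sprinkler = [[0.1, 0.5], [0.9, 0.5]]   # p_sprinkler[s][c]
p_raining = [[0.8, 0.5], [0.2, 0.5]]     # p_raining[r][c]


def joint_csr(c, s, r):
    """P(Cloudy=c, Sprinkler=s, Raining=r); Wet is already summed out."""
    return p_cloudy[c] * p_sprinkler[s][c] * p_raining[r][c]


pairs = list(product(range(2), repeat=2))

# Marginal case: compare P(S=1, R=1) with P(S=1) * P(R=1).
p_sr = {(s, r): sum(joint_csr(c, s, r) for c in range(2)) for s, r in pairs}
p_s1 = p_sr[1, 0] + p_sr[1, 1]
p_r1 = p_sr[0, 1] + p_sr[1, 1]
marginal_gap = abs(p_sr[1, 1] - p_s1 * p_r1)

# Conditional case: for each value of Cloudy, compare P(S, R | C) with
# P(S | C) * P(R | C), all computed from the joint.
conditional_gap = 0.0
for c in range(2):
    cond = {(s, r): joint_csr(c, s, r) / p_cloudy[c] for s, r in pairs}
    for s, r in pairs:
        p_s = cond[s, 0] + cond[s, 1]
        p_r = cond[0, r] + cond[1, r]
        conditional_gap = max(conditional_gap, abs(cond[s, r] - p_s * p_r))

print("marginal gap:", marginal_gap)        # clearly non-zero: dependent
print("conditional gap:", conditional_gap)  # numerically zero: independent
```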

## Step 5: Perform some inference via Variable Elimination
Currently, pgmpy supports two algorithms for inference: 1. Variable Elimination and 2. Belief Propagation. Both of these are exact inference algorithms. The following example uses VariableElimination, but BeliefPropagation has an identical API, so all the methods shown below also work for BeliefPropagation.

In [None]:
# Initializing the VariableElimination class

from pgmpy.inference import VariableElimination

sprinkler_infer = VariableElimination(sprinkler_model)

In [None]:
# Computing the probability of Wet (marginalization)
q = sprinkler_infer.query(variables=["Wet"])
print(q)
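
For intuition about what the query does, here is a hand-rolled sketch of variable elimination for P(Wet) in plain Python: sum out Cloudy, then Sprinkler, then Raining, keeping only small intermediate factors. The order and the factor names (tau1, tau2) are my own choices for illustration and say nothing about the order pgmpy picks internally.

```python
from itertools import product

p_cloudy = [0.5, 0.5]                    # P(Cloudy=c)
p_sprinkler = [[0.1, 0.5], [0.9, 0.5]]   # p_sprinkler[s][c]
p_raining = [[0.8, 0.5], [0.2, 0.5]]     # p_raining[r][c]
p_wet = [[1.0, 0.1, 0.1, 0.01], [0.0, 0.90, 0.90, 0.99]]  # p_wet[w][2*s + r]

pairs = list(product(range(2), repeat=2))

# Step 1: eliminate Cloudy. tau1(s, r) = sum_c P(c) P(s|c) P(r|c)
tau1 = {
    (s, r): sum(p_cloudy[c] * p_sprinkler[s][c] * p_raining[r][c] for c in range(2))
    for s, r in pairs
}

# Step 2: eliminate Sprinkler. tau2(r, w) = sum_s tau1(s, r) P(w|s, r)
tau2 = {
    (r, w): sum(tau1[s, r] * p_wet[w][2 * s + r] for s in range(2))
    for r, w in pairs
}

# Step 3: eliminate Raining. P(w) = sum_r tau2(r, w)
p_wet_marginal = [sum(tau2[r, w] for r in range(2)) for w in range(2)]
print(p_wet_marginal)
```

The result should agree with the table printed by the query above, up to floating-point rounding.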

In [None]:
# Computing the joint probability of Wet and Sprinkler.
q = sprinkler_infer.query(variables=["Wet","Sprinkler"])
print(q)

In [None]:
# Computing the probability of Wet and the probability of Cloudy
q = sprinkler_infer.query(variables=["Wet","Cloudy"], joint=False)
print(q["Wet"])
print(q["Cloudy"])

# or ...

for factor in q.values():
    print(factor)

In [None]:
# Computing the probability of Sprinkler given Wet=yes.
q = sprinkler_infer.query(variables=["Sprinkler"], evidence={"Wet":1})
print(q)

In [None]:
# Computing the probability of Raining given Wet=yes.
q = sprinkler_infer.query(variables=["Raining"], evidence={"Wet":1})
print(q)
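
Conditioning on evidence amounts to keeping only the joint entries consistent with the evidence and renormalizing. A brute-force sketch in plain Python, with a hypothetical `posterior` helper that is not a pgmpy function:

```python
from itertools import product

p_cloudy = [0.5, 0.5]                    # P(Cloudy=c)
p_sprinkler = [[0.1, 0.5], [0.9, 0.5]]   # p_sprinkler[s][c]
p_raining = [[0.8, 0.5], [0.2, 0.5]]     # p_raining[r][c]
p_wet = [[1.0, 0.1, 0.1, 0.01], [0.0, 0.90, 0.90, 0.99]]  # p_wet[w][2*s + r]

CLOUDY, SPRINKLER, RAINING, WET = range(4)  # positions in an assignment tuple


def joint(c, s, r, w):
    return p_cloudy[c] * p_sprinkler[s][c] * p_raining[r][c] * p_wet[w][2 * s + r]


def posterior(variable, evidence):
    """P(variable | evidence) by enumerating all assignments of the joint."""
    scores = [0.0, 0.0]
    for assignment in product(range(2), repeat=4):
        if all(assignment[var] == state for var, state in evidence.items()):
            scores[assignment[variable]] += joint(*assignment)
    total = sum(scores)  # equals P(evidence)
    return [score / total for score in scores]


p_sprinkler_given_wet = posterior(SPRINKLER, {WET: 1})
p_raining_given_wet = posterior(RAINING, {WET: 1})
print(p_sprinkler_given_wet)
print(p_raining_given_wet)
```

Enumeration is exponential in the number of variables, so it only serves as a cross-check on a model this small.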

In [None]:
q=sprinkler_infer.query(variables=["Sprinkler"])
print(q)

In [None]:
q=sprinkler_infer.query(variables=["Sprinkler"],evidence={"Wet":1})
print(q)

In [None]:
# Inference using virtual evidence
cloudy_virt_evidence = TabularCPD(variable="Cloudy", variable_card=2, values=[[0.7], [0.3]])

# Query with hard evidence Wet = 1 and virtual evidence Cloudy = [0.7, 0.3]
q = sprinkler_infer.query(variables=["Sprinkler"], evidence={"Wet": 1}, virtual_evidence=[cloudy_virt_evidence])
print(q)

q = sprinkler_infer.query(variables=["Sprinkler"], evidence={"Wet": 1, "Cloudy": 0}, show_progress=False)
print(q)
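
Virtual (soft) evidence can be read as an extra likelihood factor on Cloudy that reweights the joint before normalizing. The sketch below computes P(Sprinkler | Wet=1) with the weights [0.7, 0.3] in plain Python. It assumes this likelihood-weighting reading (Pearl's virtual evidence), which I believe matches pgmpy's behaviour but have not verified against its source, so compare against the query output before relying on it.

```python
from itertools import product

p_cloudy = [0.5, 0.5]                    # P(Cloudy=c)
p_sprinkler = [[0.1, 0.5], [0.9, 0.5]]   # p_sprinkler[s][c]
p_raining = [[0.8, 0.5], [0.2, 0.5]]     # p_raining[r][c]
p_wet = [[1.0, 0.1, 0.1, 0.01], [0.0, 0.90, 0.90, 0.99]]  # p_wet[w][2*s + r]


def joint(c, s, r, w):
    return p_cloudy[c] * p_sprinkler[s][c] * p_raining[r][c] * p_wet[w][2 * s + r]


# Assumption: virtual evidence acts as a likelihood weight on each Cloudy state.
virtual = [0.7, 0.3]

scores = [0.0, 0.0]
for c, s, r in product(range(2), repeat=3):
    # Hard evidence Wet=1 fixes w; the soft evidence scales each term by virtual[c].
    scores[s] += virtual[c] * joint(c, s, r, 1)

normalizer = sum(scores)
posterior_soft = [score / normalizer for score in scores]
print(posterior_soft)
```

Under this reading, the soft-evidence posterior falls between the answer with no evidence on Cloudy and the answer with hard evidence Cloudy=0.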

In the case of large models, or models in which variables have a lot of states, inference can be quite slow. Some of the ways to deal with it are:

 - Reduce the number of states of variables by combining states together.
 - Try a different elimination order heuristic by specifying the elimination_order argument. Possible options are: MinFill, MinNeighbors, MinWeight, WeightedMinFill.
 - Try a custom elimination order: The implemented heuristics for computing the elimination order might not be efficient in every case. If you can think of a more efficient order, you can also pass it as a list to the elimination_order argument.
 - If it is still too slow, try using approximate inference using sampling algorithms.
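
To illustrate why the elimination order matters, the sketch below simulates elimination on the moralized sprinkler graph and reports how many variables the largest intermediate factor contains. It is plain Python with a hypothetical helper, `largest_factor_scope`. It does not call pgmpy or reproduce its heuristics.

```python
edges = [
    ("Cloudy", "Raining"),
    ("Cloudy", "Sprinkler"),
    ("Raining", "Wet"),
    ("Sprinkler", "Wet"),
]


def largest_factor_scope(edges, order):
    """Hypothetical helper: number of variables in the largest factor
    created while eliminating the variables in `order`."""
    nodes = {node for edge in edges for node in edge}
    neighbours = {node: set() for node in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    # Moralize: parents of a common child share a factor, so connect them.
    for child in nodes:
        family = {u for u, v in edges if v == child}
        for p in family:
            neighbours[p] |= family - {p}

    largest = 0
    for node in order:
        scope = neighbours.pop(node)
        # Multiplying all factors that mention `node` gives a factor over
        # the node and its current neighbours.
        largest = max(largest, len(scope) + 1)
        # Summing the node out leaves a factor over the neighbours,
        # which therefore become pairwise connected.
        for other in scope:
            neighbours[other].discard(node)
            neighbours[other] |= scope - {other}
    return largest


# Both orders eliminate everything except the query variable Wet.
print(largest_factor_scope(edges, ["Cloudy", "Sprinkler", "Raining"]))
print(largest_factor_scope(edges, ["Sprinkler", "Cloudy", "Raining"]))
```

Eliminating Sprinkler first creates a factor over all four variables, while starting with Cloudy never needs more than three. On large models the same effect can dominate the running time.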

## Step 6: Perform some inference via Belief Propagation / Message Passing

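Note that pgmpy's BeliefPropagation class builds a junction tree (clique tree) from the model and calibrates it, so it performs exact inference rather than loopy belief propagation, even though the sprinkler network contains an undirected loop.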

In [None]:
# Initializing the BeliefPropagation class

from pgmpy.inference import BeliefPropagation

sprinkler_infer = BeliefPropagation(sprinkler_model)

In [None]:
# Computing the probability of Wet
q = sprinkler_infer.query(variables=["Wet"])
print(q)
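
As a rough picture of message passing on a clique tree, the sketch below uses two cliques, {Cloudy, Sprinkler, Raining} and {Sprinkler, Raining, Wet}, joined by the separator {Sprinkler, Raining}. This clique choice is an assumption made for illustration. The code is plain Python and does not mirror pgmpy's internal data structures.

```python
from itertools import product

p_cloudy = [0.5, 0.5]                    # P(Cloudy=c)
p_sprinkler = [[0.1, 0.5], [0.9, 0.5]]   # p_sprinkler[s][c]
p_raining = [[0.8, 0.5], [0.2, 0.5]]     # p_raining[r][c]
p_wet = [[1.0, 0.1, 0.1, 0.01], [0.0, 0.90, 0.90, 0.99]]  # p_wet[w][2*s + r]

pairs = list(product(range(2), repeat=2))
triples = list(product(range(2), repeat=3))

# Initial clique potentials: each CPD is assigned to one clique containing its scope.
psi1 = {(c, s, r): p_cloudy[c] * p_sprinkler[s][c] * p_raining[r][c] for c, s, r in triples}
psi2 = {(s, r, w): p_wet[w][2 * s + r] for s, r, w in triples}

# Messages over the separator (Sprinkler, Raining), one in each direction.
msg_1_to_2 = {(s, r): sum(psi1[c, s, r] for c in range(2)) for s, r in pairs}
msg_2_to_1 = {(s, r): sum(psi2[s, r, w] for w in range(2)) for s, r in pairs}

# Calibrated beliefs: own potential times the incoming message.
belief1 = {(c, s, r): psi1[c, s, r] * msg_2_to_1[s, r] for c, s, r in triples}
belief2 = {(s, r, w): psi2[s, r, w] * msg_1_to_2[s, r] for s, r, w in triples}

# After calibration both cliques agree on the separator marginal.
sep_from_1 = {(s, r): sum(belief1[c, s, r] for c in range(2)) for s, r in pairs}
sep_from_2 = {(s, r): sum(belief2[s, r, w] for w in range(2)) for s, r in pairs}

# Any marginal can now be read off a clique that contains the variable.
p_wet_marginal = [sum(belief2[s, r, w] for s, r in pairs) for w in range(2)]
print(p_wet_marginal)
```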

In [None]:
# Computing the joint probability of Wet and Cloudy.
q = sprinkler_infer.query(variables=["Wet","Cloudy"])
print(q)

In [None]:
# Computing the probability of Sprinkler given Wet=yes.
q = sprinkler_infer.query(variables=["Sprinkler"], evidence={"Wet":1})
print(q)

In [None]:
# Computing the probability of Raining given Wet=yes.
q = sprinkler_infer.query(variables=["Raining"], evidence={"Wet":1})
print(q)

In [None]:
# Inference using virtual evidence
cloudy_virt_evidence = TabularCPD(variable="Cloudy", variable_card=2, values=[[0.7], [0.3]])

# Query with hard evidence Wet = 1 and virtual evidence Cloudy = [0.7, 0.3]
q = sprinkler_infer.query(variables=["Sprinkler"], evidence={"Wet": 1}, virtual_evidence=[cloudy_virt_evidence])
print(q)

q = sprinkler_infer.query(variables=["Sprinkler"], evidence={"Wet": 1, "Cloudy": 0}, show_progress=False)
print(q)

## Step 7: Perform some inference via Sampling

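ApproxInference is a generic sampling-based inference interface that works for several kinds of models, not just Bayesian networks. Because its answers are estimated from random samples, they vary from run to run and become more accurate as n_samples grows.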


In [None]:
from pgmpy.inference import ApproxInference
sprinkler_infer = ApproxInference(sprinkler_model)

In [None]:
q = sprinkler_infer.query(variables=["Wet"])
print(q)

In [None]:
q = sprinkler_infer.query(n_samples=100000,variables=["Wet"])
print(q)
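
The idea behind sampling-based estimates can be sketched in a few lines of plain Python: draw each variable after its parents (forward sampling) and count how often Wet=1 occurs. The helper name and the fixed seed are my own choices. This is not a description of ApproxInference's internals.

```python
import random

p_cloudy = [0.5, 0.5]                    # P(Cloudy=c)
p_sprinkler = [[0.1, 0.5], [0.9, 0.5]]   # p_sprinkler[s][c]
p_raining = [[0.8, 0.5], [0.2, 0.5]]     # p_raining[r][c]
p_wet = [[1.0, 0.1, 0.1, 0.01], [0.0, 0.90, 0.90, 0.99]]  # p_wet[w][2*s + r]


def forward_sample(rng):
    """Draw one (cloudy, sprinkler, raining, wet) tuple, parents first."""
    c = int(rng.random() < p_cloudy[1])
    s = int(rng.random() < p_sprinkler[1][c])
    r = int(rng.random() < p_raining[1][c])
    w = int(rng.random() < p_wet[1][2 * s + r])
    return c, s, r, w


rng = random.Random(0)  # fixed seed so the run is repeatable
n_samples = 20000
wet_estimate = sum(forward_sample(rng)[3] for _ in range(n_samples)) / n_samples
print(wet_estimate)  # Monte Carlo estimate of P(Wet=1)
```

The estimate should land close to the exact value returned by the exact inference queries, with an error that shrinks roughly like 1/sqrt(n_samples).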

In [None]:
q = sprinkler_infer.query(n_samples=10,variables=["Sprinkler"],evidence={"Wet":1})
print(q)

In [None]:
q = sprinkler_infer.query(n_samples=10000,variables=["Sprinkler"],evidence={"Wet":1})
print(q)

In [None]:
q = sprinkler_infer.query(n_samples=10000,variables=["Cloudy"],evidence={"Wet":1})
print(q)

## Step 8: Sampling from a Bayesian Network

In [None]:
from pgmpy.sampling import BayesianModelSampling
sprinkler_infer = BayesianModelSampling(sprinkler_model)
samples = sprinkler_infer.forward_sample(size=10)
print(samples)

In [None]:
from pgmpy.factors.discrete import State
samples = sprinkler_infer.rejection_sample(evidence=[State(var="Wet",state=1)], size=20)
print(samples)
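
Rejection sampling keeps only those forward samples that agree with the evidence. A minimal plain-Python sketch of the idea, with hypothetical names and no pgmpy calls, estimating P(Sprinkler=1 | Wet=1):

```python
import random

p_cloudy = [0.5, 0.5]                    # P(Cloudy=c)
p_sprinkler = [[0.1, 0.5], [0.9, 0.5]]   # p_sprinkler[s][c]
p_raining = [[0.8, 0.5], [0.2, 0.5]]     # p_raining[r][c]
p_wet = [[1.0, 0.1, 0.1, 0.01], [0.0, 0.90, 0.90, 0.99]]  # p_wet[w][2*s + r]


def forward_sample(rng):
    c = int(rng.random() < p_cloudy[1])
    s = int(rng.random() < p_sprinkler[1][c])
    r = int(rng.random() < p_raining[1][c])
    w = int(rng.random() < p_wet[1][2 * s + r])
    return c, s, r, w


rng = random.Random(0)
accepted, drawn = [], 0
while len(accepted) < 5000:
    sample = forward_sample(rng)
    drawn += 1
    if sample[3] == 1:  # keep the sample only if it matches Wet=1
        accepted.append(sample)

sprinkler_estimate = sum(s for _, s, _, _ in accepted) / len(accepted)
print(sprinkler_estimate, "acceptance rate:", len(accepted) / drawn)
```

The acceptance rate approximates P(Wet=1), so evidence that is rare makes this method wasteful.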

In [None]:
from pgmpy.factors.discrete import State
samples = sprinkler_infer.likelihood_weighted_sample(evidence=[State(var="Wet",state=1)], size=10)
print(samples)
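
Likelihood weighting avoids discarding samples: the evidence variable is fixed to its observed value and each sample carries a weight equal to the probability of that evidence given the sampled parents. A minimal plain-Python sketch of the idea, with hypothetical names and not pgmpy's implementation:

```python
import random

p_cloudy = [0.5, 0.5]                    # P(Cloudy=c)
p_sprinkler = [[0.1, 0.5], [0.9, 0.5]]   # p_sprinkler[s][c]
p_raining = [[0.8, 0.5], [0.2, 0.5]]     # p_raining[r][c]
p_wet = [[1.0, 0.1, 0.1, 0.01], [0.0, 0.90, 0.90, 0.99]]  # p_wet[w][2*s + r]

rng = random.Random(0)
weighted_samples = []
for _ in range(20000):
    # Sample the non-evidence variables in topological order.
    c = int(rng.random() < p_cloudy[1])
    s = int(rng.random() < p_sprinkler[1][c])
    r = int(rng.random() < p_raining[1][c])
    # Wet is clamped to 1; the weight is P(Wet=1 | s, r).
    weight = p_wet[1][2 * s + r]
    weighted_samples.append((s, weight))

total_weight = sum(weight for _, weight in weighted_samples)
sprinkler_estimate = sum(weight for s, weight in weighted_samples if s == 1) / total_weight
print(sprinkler_estimate)  # weighted estimate of P(Sprinkler=1 | Wet=1)
```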

In [None]:
from pgmpy.sampling import GibbsSampling
sprinkler_gibbs = GibbsSampling(sprinkler_model)
sprinkler_gibbs.sample(size=200)