# Chapter 7
# Probabilistic Inference with Join Tree Clustering

## Graph Layout and Drawing
The following two functions use **Graphviz** to perform the graph layout and drawing tasks for directed acyclic and undirected graphs.
### dagviz(nodes, edges, name)
This function takes the **nodes** and **edges** of a directed acyclic graph, does the graph layout, creates the *dot* and *PNG* files for drawing the DAG, and writes them to files with the given **name**.
### gviz(nodes, edges, name)
This function takes the **nodes** and **edges** of an undirected graph, does the graph layout, creates the *dot* and *PNG* files for drawing the undirected graph, and writes them to files with the given **name**.

In [198]:
import pygraphviz as pgv


def dagviz(nodes, edges, name):
    G = pgv.AGraph(strict=False, directed=True)
    G.add_nodes_from(nodes)
    for edge in edges:
        G.add_edge(edge)
    G.write(name + ".dot")
    # use dot
    G.layout(prog="dot")
    # write previously positioned graph to PNG file
    G.draw(name + ".png")


def gviz(nodes, edges, name):
    A = pgv.AGraph()
    A.add_nodes_from(nodes)
    for edge in edges:
        A.add_edge(edge)
    A.write(name + ".dot")
    # use dot
    A.layout(prog="dot")
    # write previously positioned graph to PNG file
    A.draw(name + ".png")
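
As a rough illustration of what ends up in a *dot* file, the following sketch builds DOT source text by hand using only the standard library. The helper name `to_dot` and its parameters are assumptions made for this illustration and are not part of pygraphviz; the text that `G.write()` actually produces may differ in layout and attributes.

```python
def to_dot(nodes, edges, directed=True, graph_name="G"):
    """Return DOT source for a graph (hypothetical helper, stdlib only)."""
    keyword = "digraph" if directed else "graph"
    arrow = "->" if directed else "--"
    lines = [f"{keyword} {graph_name} {{"]
    # Quote every identifier so names containing spaces or commas stay valid.
    lines += [f'    "{node}";' for node in nodes]
    lines += [f'    "{u}" {arrow} "{v}";' for u, v in edges]
    lines.append("}")
    return "\n".join(lines)


# A three-node fragment of the circuit DAG.
dot_source = to_dot(["A1", "X", "Z"], [("A1", "Z"), ("X", "Z")])
print(dot_source)
```

The only difference between the directed and undirected cases in this sketch is the `digraph`/`graph` keyword and the `->`/`--` edge operator, which mirrors the split between **dagviz()** and **gviz()**.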

## Circuit diagnosis Bayesian Network
![circuit diagnosis example](circuit.png)
#### Create the BN model and specify its structure
The following code creates the circuit diagnosis BN model. The model factorizes according to a DAG having 10 nodes and 12 edges.
 
The nodes are the following network variables:

	A1, A3, Y, A2, X, Z, E, D, A4, F

Four of the above network variables,

	A1, A2, A3, A4 

represent hypotheses about the working condition of respective NOR gates of the circuit.

All network variables are binary, taking values 0 and 1.

The directed edges of the DAG are as follows:
	
	A1 -> Z, A3 -> E, Y -> Z , Y -> E, A2 -> D, 
	X -> Z, X -> D, Z -> E, Z -> D, E -> F, 
	D -> F, A4 -> F  

##  Qualitative Specification of the Bayesian Network
The following code creates the qualitative component of the circuit diagnosis Bayesian network model.

### 1. Specifying the nodes and directed edges of the Bayesian network

The following code specifies the nodes, which are the network variables, and the directed edges, which represent direct probabilistic influence between distinct pairs of network variables.

In [199]:
from pgmpy.models import BayesianNetwork
def circuit():
    circuit: BayesianNetwork = BayesianNetwork()
    circuit.add_nodes_from(["X", "Y", "Z", "D", "E", "F", "A1", "A2", "A3", "A4"])
    circuit.add_edge("X", "Z")
    circuit.add_edge("Y", "Z")
    circuit.add_edge("A1", "Z")
    circuit.add_edge("X", "D")
    circuit.add_edge("Z", "D")
    circuit.add_edge("A2", "D")
    circuit.add_edge("Z", "E")
    circuit.add_edge("Y", "E")
    circuit.add_edge("A3", "E")
    circuit.add_edge("D", "F")
    circuit.add_edge("A4", "F")
    circuit.add_edge("E", "F")
    return circuit

### 2. Creating a directed acyclic graph representing the Bayesian network model

The following code creates a directed acyclic graph object to represent the BN structure, using the nodes and edges specified for the BN model.

In [200]:
from pgmpy.base import DAG
def circuit_dag():
    model = circuit()
    G = DAG()
    nodes = model.nodes
    edges = model.edges
    G.add_nodes_from(nodes)
    G.add_edges_from(edges)
    return G

print("creating the directed acyclic graph for the circuit diagnosis example")
G = circuit_dag()
print(G)

creating the directed acyclic graph for the circuit diagnosis example
DAG with 10 nodes and 12 edges
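
The node and edge counts, and the absence of directed cycles, can also be checked without pgmpy. The sketch below is one way to do it, using Kahn's topological-sort algorithm on the edge list given earlier; the function name `topological_order` is an assumption for this illustration and is not a pgmpy call.

```python
from collections import deque

nodes = ["X", "Y", "Z", "D", "E", "F", "A1", "A2", "A3", "A4"]
edges = [
    ("X", "Z"), ("Y", "Z"), ("A1", "Z"),
    ("X", "D"), ("Z", "D"), ("A2", "D"),
    ("Z", "E"), ("Y", "E"), ("A3", "E"),
    ("D", "F"), ("A4", "F"), ("E", "F"),
]


def topological_order(nodes, edges):
    """Return a topological order, or None if the graph has a directed cycle."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    # Start from the nodes that have no parents.
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    # If some node was never released, it lies on a directed cycle.
    return order if len(order) == len(nodes) else None


order = topological_order(nodes, edges)
print(len(nodes), len(edges), order)
```

A non-`None` result means every node can be placed after all of its parents, which is exactly the acyclicity that a Bayesian network structure requires.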


## Creating DOT and PNG files for the DAG of the circuit diagnosis BN

The following code uses our **dagviz()** function to create the **DOT** and **PNG** files named *DAG_circuit* for the circuit diagnosis BN. Note that the nodes and edges parameters for **dagviz()** are inputted from the nodes and edges of the DAG object *G*.

In [201]:
dagviz(G.nodes, G.edges, "DAG_circuit")

## Getting the minimal d-separator set between two nodes

The following code uses the **minimal_dseparator()** method of the **DAG** object, *G*, to get the minimal d-separator set separating the node "E" from the node "X." The two variables for the two nodes are conditionally independent given the values of the variables in a d-separator set.

In [202]:
G.minimal_dseparator("E", "X")

{'Y', 'Z'}
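
The result can be cross-checked with the ancestral moral graph criterion for d-separation: restrict the DAG to the two nodes, the conditioning set, and all their ancestors; connect ("marry") the parents of every node; drop edge directions; and test whether the conditioning set blocks every path. The sketch below is a standard-library illustration of that criterion, not the algorithm pgmpy uses internally, and the function names are assumptions.

```python
from itertools import combinations

# Parents of each non-root node in the circuit DAG.
parents = {
    "Z": ["X", "Y", "A1"],
    "D": ["X", "Z", "A2"],
    "E": ["Z", "Y", "A3"],
    "F": ["D", "E", "A4"],
}


def ancestral_set(targets):
    """Return the targets together with all of their ancestors."""
    seen, stack = set(), list(targets)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen


def d_separated(a, b, given):
    """Test d-separation of a and b given a set, via the ancestral moral graph."""
    keep = ancestral_set({a, b} | set(given))
    neighbors = {n: set() for n in keep}
    for child in keep:
        family = parents.get(child, [])
        # Link each parent to the child, and marry every pair of parents.
        for p in family:
            neighbors[child].add(p)
            neighbors[p].add(child)
        for p, q in combinations(family, 2):
            neighbors[p].add(q)
            neighbors[q].add(p)
    # Search from a, never stepping onto a conditioning node.
    seen, stack = set(), [a]
    while stack:
        node = stack.pop()
        if node == b:
            return False
        if node in seen or node in given:
            continue
        seen.add(node)
        stack.extend(neighbors[node])
    return True


print(d_separated("E", "X", {"Y", "Z"}))
print(d_separated("E", "X", {"Z"}))
print(d_separated("E", "X", {"Y"}))
```

Under this check, the pair {'Y', 'Z'} separates "E" from "X" while neither 'Y' nor 'Z' alone does, which is consistent with the set being minimal.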

## Finding all independencies in the circuit diagnosis Bayesian Network

The following code uses the **get_independencies()** method of the **DAG** object, *G*, to compute all the d-separations in *G*, which imply conditional independencies in the probabilistic model. The result is written to a text file.

In [203]:
from contextlib import redirect_stdout

with open("circuit_dag_d-separations.txt", "w") as f:
    with redirect_stdout(f):
        print(G.get_independencies())
# The with-block closes the file, so no explicit f.close() is needed.
print(
    "Wrote all d-separations for the circuit DAG into file [circuit_dag_d-separations.txt]"
)

Wrote all d-separations for the circuit DAG into file [circuit_dag_d-separations.txt]


## Quantitative Specification of the Bayesian Network

The following code specifies the quantitative component of the circuit diagnosis Bayesian network model.

The following code specifies the conditional probability distributions (**CPD**s) for all the network variables and adds the **CPD**s to the model.

After quantifying the BN model for the circuit diagnosis example, we check the model to see if its specification is correct. The function **check_model()** should return *True* if the checking is successful.

In [204]:
from pgmpy.factors.discrete import TabularCPD
model = circuit()
cpd_X = TabularCPD(variable="X", variable_card=2, values=[[0.5], [0.5]])
cpd_Y = TabularCPD(variable="Y", variable_card=2, values=[[0.5], [0.5]])
cpd_A1 = TabularCPD(variable="A1", variable_card=2, values=[[0.95], [0.05]])
cpd_A2 = TabularCPD(variable="A2", variable_card=2, values=[[0.95], [0.05]])
cpd_A3 = TabularCPD(variable="A3", variable_card=2, values=[[0.95], [0.05]])
cpd_A4 = TabularCPD(variable="A4", variable_card=2, values=[[0.95], [0.05]])
cpd_Z = TabularCPD(
    variable="Z",
    variable_card=2,
    values=[
        [1e-9, 1 - 1e-9, 1 - 1e-9, 1 - 1e-9, 0.5, 0.5, 0.5, 0.5],
        [1 - 1e-9, 1e-9, 1e-9, 1e-9, 0.5, 0.5, 0.5, 0.5],
    ],
    evidence=["A1", "X", "Y"],
    evidence_card=[2, 2, 2],
)
cpd_D = TabularCPD(
    variable="D",
    variable_card=2,
    values=[
        [1e-9, 1 - 1e-9, 1 - 1e-9, 1 - 1e-9, 0.5, 0.5, 0.5, 0.5],
        [1 - 1e-9, 1e-9, 1e-9, 1e-9, 0.5, 0.5, 0.5, 0.5],
    ],
    evidence=["A2", "X", "Z"],
    evidence_card=[2, 2, 2],
)
cpd_E = TabularCPD(
    variable="E",
    variable_card=2,
    values=[
        [1e-9, 1 - 1e-9, 1 - 1e-9, 1 - 1e-9, 0.5, 0.5, 0.5, 0.5],
        [1 - 1e-9, 1e-9, 1e-9, 1e-9, 0.5, 0.5, 0.5, 0.5],
    ],
    evidence=["A3", "Y", "Z"],
    evidence_card=[2, 2, 2],
)
cpd_F = TabularCPD(
    variable="F",
    variable_card=2,
    values=[
        [1e-9, 1 - 1e-9, 1 - 1e-9, 1 - 1e-9, 0.5, 0.5, 0.5, 0.5],
        [1 - 1e-9, 1e-9, 1e-9, 1e-9, 0.5, 0.5, 0.5, 0.5],
    ],
    evidence=["A4", "D", "E"],
    evidence_card=[2, 2, 2],
)

model.add_cpds(cpd_X, cpd_Y, cpd_Z, cpd_D, cpd_E, cpd_F, cpd_A1, cpd_A2, cpd_A3, cpd_A4)

print(model.check_model())

True
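
The four gate CPDs above share the same 2 x 8 table. The sketch below shows one way such a table could be generated from the gate logic rather than typed by hand. It assumes that state 1 of an assumption variable means "faulty" (an interpretation of the 0.95/0.05 priors) and that a faulty gate outputs 0 or 1 with equal probability; the function name `nor_gate_cpd` is hypothetical.

```python
from itertools import product

EPS = 1e-9  # same near-zero value used in the CPD tables above


def nor_gate_cpd(eps=EPS):
    """Build the 2 x 8 table P(out | fault, in1, in2) for one NOR gate.

    Columns follow the evidence order [fault, in1, in2], with the last
    evidence variable changing fastest.
    """
    row_out0, row_out1 = [], []
    for fault, in1, in2 in product([0, 1], repeat=3):
        if fault == 1:
            # Assumed fault model: the output is uninformative.
            p_out0, p_out1 = 0.5, 0.5
        elif in1 == 0 and in2 == 0:
            # NOR is 1 only when both inputs are 0.
            p_out0, p_out1 = eps, 1 - eps
        else:
            p_out0, p_out1 = 1 - eps, eps
        row_out0.append(p_out0)
        row_out1.append(p_out1)
    return [row_out0, row_out1]


values = nor_gate_cpd()
print(values[0])
print(values[1])
```

The two rows can be passed as the `values` argument of a `TabularCPD` with `evidence_card=[2, 2, 2]`, provided the evidence list is ordered as assumption variable first, then the two gate inputs.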


## Markov Network Model 

The following code uses the method **to_markov_model()** for the **BayesianNetwork** object *model* to map the BN model of our circuit diagnosis example to a representation as a Markov network model. The Markov network model will map the **CPD**s of the Bayesian network to **factors**. We use the method **get_factors()** for the **MarkovNetwork** object, *markovmodel*, and then we print all the factors of the model using a **for** loop.

In [205]:
from pgmpy.models import MarkovNetwork
markovmodel = model.to_markov_model()
factors = markovmodel.get_factors()
for factor in factors:
    print(factor)

+------+----------+
| X    |   phi(X) |
| X(0) |   0.5000 |
+------+----------+
| X(1) |   0.5000 |
+------+----------+
+------+----------+
| Y    |   phi(Y) |
| Y(0) |   0.5000 |
+------+----------+
| Y(1) |   0.5000 |
+------+----------+
+------+-------+------+------+-----------------+
| Z    | A1    | X    | Y    |   phi(Z,A1,X,Y) |
| Z(0) | A1(0) | X(0) | Y(0) |          0.0000 |
+------+-------+------+------+-----------------+
| Z(0) | A1(0) | X(0) | Y(1) |          1.0000 |
+------+-------+------+------+-----------------+
| Z(0) | A1(0) | X(1) | Y(0) |          1.0000 |
+------+-------+------+------+-----------------+
| Z(0) | A1(0) | X(1) | Y(1) |          1.0000 |
+------+-------+------+------+-----------------+
| Z(0) | A1(1) | X(0) | Y(0) |          0.5000 |
+------+-------+------+------+-----------------+
| Z(0) | A1(1) | X(0) | Y(1) |          0.5000 |
+------+-------+------+------+-----------------+
| Z(0) | A1(1) | X(1) | Y(0) |          0.5000 |
+------+-------+------+--

## Checking the Markov Model is correct 

The following code uses the **check_model()** method of the **MarkovNetwork** object, *markovmodel*, to verify that the model specification is correct.

In [206]:
markovmodel.check_model()

True

## Getting the Partition Function for the Markov model

The following code uses the **get_partition_function()** method of the **MarkovNetwork** object, *markovmodel*, to get the partition function of the Markov model. The partition function is the sum, over all joint value assignments of the network variables, of the product of all factors. Dividing the product of all factors by this number yields a probability distribution, that is, a function that sums to 1 over all joint value assignments. Because the factors here are the CPDs of a Bayesian network, the partition function is 1 up to floating-point rounding.

In [207]:
markovmodel.get_partition_function()

1.0000000000000002
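
For a network this small, the partition function can be recomputed by brute force: enumerate all 2^10 joint assignments and sum the product of the ten factors. The sketch below does this with the standard library only; the helper names `p_gate`, `p_fault`, and `joint` are assumptions for this illustration, and the gate probabilities mirror the CPD tables defined earlier.

```python
from itertools import product

EPS = 1e-9


def p_gate(out, fault, in1, in2):
    """P(out | fault, in1, in2) for a NOR gate, mirroring the CPD tables."""
    if fault == 1:
        return 0.5
    nor = int(in1 == 0 and in2 == 0)
    return 1 - EPS if out == nor else EPS


def p_fault(a):
    """Prior of an assumption variable: 0.95 for state 0, 0.05 for state 1."""
    return 0.05 if a == 1 else 0.95


def joint(x, y, a1, a2, a3, a4, z, d, e, f):
    """Product of all ten factors for one full assignment."""
    return (
        0.5 * 0.5                      # phi(X) * phi(Y)
        * p_fault(a1) * p_fault(a2) * p_fault(a3) * p_fault(a4)
        * p_gate(z, a1, x, y)          # phi(Z, A1, X, Y)
        * p_gate(d, a2, x, z)          # phi(D, A2, X, Z)
        * p_gate(e, a3, y, z)          # phi(E, A3, Y, Z)
        * p_gate(f, a4, d, e)          # phi(F, A4, D, E)
    )


partition = sum(joint(*assignment) for assignment in product([0, 1], repeat=10))
print(partition)
```

The printed value should agree with the pgmpy result up to floating-point rounding; the last digits may differ because the summation order is not the same.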

## Getting the nodes and edges of the factor graph

To get the undirected graph, *factor_graph*, of the circuit diagnosis example's Markov network model, *markovmodel*, we use the method **to_factor_graph()**. We then print the nodes and edges of the undirected graph, *factor_graph*. The factor graph consists of two subsets of nodes; one contains the network variables, and the other contains the factors, which are the potential functions of the Markov model.

In [208]:
factor_graph = markovmodel.to_factor_graph()
print(factor_graph.nodes)
print(factor_graph.edges)

['X', 'Z', 'D', 'Y', 'A1', 'A2', 'E', 'A3', 'F', 'A4', 'phi_X', 'phi_Y', 'phi_Z_A1_X_Y', 'phi_D_A2_X_Z', 'phi_E_A3_Y_Z', 'phi_F_A4_D_E', 'phi_A1', 'phi_A2', 'phi_A3', 'phi_A4']
[('X', 'phi_X'), ('X', 'phi_Z_A1_X_Y'), ('X', 'phi_D_A2_X_Z'), ('Z', 'phi_Z_A1_X_Y'), ('Z', 'phi_D_A2_X_Z'), ('Z', 'phi_E_A3_Y_Z'), ('D', 'phi_D_A2_X_Z'), ('D', 'phi_F_A4_D_E'), ('Y', 'phi_Y'), ('Y', 'phi_Z_A1_X_Y'), ('Y', 'phi_E_A3_Y_Z'), ('A1', 'phi_Z_A1_X_Y'), ('A1', 'phi_A1'), ('A2', 'phi_D_A2_X_Z'), ('A2', 'phi_A2'), ('E', 'phi_E_A3_Y_Z'), ('E', 'phi_F_A4_D_E'), ('A3', 'phi_E_A3_Y_Z'), ('A3', 'phi_A3'), ('F', 'phi_F_A4_D_E'), ('A4', 'phi_F_A4_D_E'), ('A4', 'phi_A4')]
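
The structure of this output follows directly from the factor scopes: there is one node per variable, one node per factor, and one edge for each (variable, factor) pair in which the variable belongs to the factor's scope. The sketch below rebuilds those lists from the scopes alone. The `phi_...` naming imitates the output above, and the variable names in the code are assumptions for this illustration; node and edge ordering may differ from pgmpy's.

```python
# Scope of each factor, in the order the factors were printed earlier.
scopes = [
    ["X"], ["Y"],
    ["Z", "A1", "X", "Y"],
    ["D", "A2", "X", "Z"],
    ["E", "A3", "Y", "Z"],
    ["F", "A4", "D", "E"],
    ["A1"], ["A2"], ["A3"], ["A4"],
]

variable_nodes = sorted({v for scope in scopes for v in scope})
factor_nodes = ["phi_" + "_".join(scope) for scope in scopes]
# One undirected edge per (variable, factor) membership.
fg_edges = [
    (variable, "phi_" + "_".join(scope))
    for scope in scopes
    for variable in scope
]
print(len(variable_nodes) + len(factor_nodes), len(fg_edges))
```

Every edge joins a variable node to a factor node, so the factor graph is bipartite.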


## Creating DOT and PNG files for the undirected factor graph of the circuit diagnosis BN

The following code uses our **gviz()** function to create the **DOT** and **PNG** files named *factor_graph* for the circuit diagnosis Markov model. Note that the nodes and edges parameters for **gviz()** are inputted from the nodes and edges of the undirected factor graph object *factor_graph*.

In [209]:
gviz(factor_graph.nodes, factor_graph.edges, "factor_graph")

## Using the Pgmpy library to get a join tree from the Markov model 

The following code gets the junction tree structure built by the **Pgmpy** library by using the **to_junction_tree()** method of the Markov model object, *markovmodel*. The code prints the factors for the junction tree model for our circuit diagnosis example.

In [210]:
from pgmpy.models import JunctionTree
JT = markovmodel.to_junction_tree()
print(JT.factors)

[<DiscreteFactor representing phi(X:2, A2:2, Z:2, D:2) at 0x141486bd0>, <DiscreteFactor representing phi(X:2, Z:2, A4:2, A3:2, Y:2, A1:2, D:2) at 0x16c7fb1a0>, <DiscreteFactor representing phi(Z:2, A4:2, E:2, A3:2, F:2, Y:2, D:2) at 0x16c7f92b0>]


In [211]:
print(JT.nodes)

[('D', 'Z', 'A2', 'X'), ('D', 'Z', 'Y', 'A4', 'A3', 'X', 'A1'), ('D', 'Z', 'Y', 'A4', 'A3', 'F', 'E')]


## Creating DOT and PNG files for the Pgmpy Join Tree

The following code uses our **gviz()** function to create the **DOT** and **PNG** files named *pgmpy_JT* for the circuit diagnosis Join Tree computed by the Pgmpy library. Note that the nodes and edges parameters for **gviz()** are inputted from the nodes and edges of the **JunctionTree** object *JT*.

In [212]:
gviz(JT.nodes, JT.edges, "pgmpy_JT")

## Showing the maximum cliques computed by the Pgmpy library

The code above prints the nodes of the junction tree object, *JT*, computed by the **Pgmpy** library. The result shows three maximum cliques, two of which contain seven variables, compared to the six four-variable maximum cliques from our join tree computation. Our join tree clustering computation is preferable here because it significantly reduces the maximum clique size, and the space required for a clique's potential function grows exponentially with the clique size.
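
To put a number on the space comparison, the sketch below counts the potential-table entries needed by each join tree, assuming one table per clique and binary variables. The first clique list is copied from the **JT.nodes** output above, and the second is the six-clique join tree from the tree clustering algorithm used later in this chapter; the function name `table_entries` is an assumption for this illustration.

```python
pgmpy_cliques = [
    ("D", "Z", "A2", "X"),
    ("D", "Z", "Y", "A4", "A3", "X", "A1"),
    ("D", "Z", "Y", "A4", "A3", "F", "E"),
]
clustering_cliques = [
    ("Z", "X", "Y", "A1"), ("Z", "X", "Y", "E"), ("Z", "A3", "Y", "E"),
    ("Z", "X", "D", "E"), ("Z", "X", "A2", "D"), ("A4", "E", "D", "F"),
]


def table_entries(cliques, cardinality=2):
    """Total number of potential-table entries over all cliques."""
    return sum(cardinality ** len(clique) for clique in cliques)


print(table_entries(pgmpy_cliques), table_entries(clustering_cliques))
```

The count depends on the largest clique far more than on the number of cliques, since each extra variable in a clique doubles the size of its table.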

**Building the join tree, computed from the join tree clustering algorithm, for the circuit diagnosis BN**.

This process involves creating a join tree structure specific to our Bayesian network model. 

We use the **JunctionTree()** constructor to create a join tree structure as an object of the *JunctionTree* class from the *Pgmpy* library.

Then, we add the nodes and edges for the join tree computed from the join tree clustering algorithm for the circuit diagnosis DAG of its Bayesian network model. The nodes are the maximum cliques, and the edges are pairs of nodes connected by an edge in the join tree. 

We add the nodes by feeding the list of nodes to the **add_nodes_from()** method for the JunctionTree object, *circuit_JT*. We add the edges by feeding the list of edges to the **add_edges_from()** method for the JunctionTree object, *circuit_JT*.

In [213]:
circuit_JT = JunctionTree()
circuit_JT.add_nodes_from([('Z', 'X', 'Y', 'A1'), ('Z', 'X', 'Y', 'E'), ('Z', 'A3', 'Y', 'E'), ('Z', 'X', 'D', 'E'), ('Z', 'X', 'A2', 'D'), ('A4', 'E', 'D', 'F')])
circuit_JT.add_edges_from([(('Z', 'X', 'Y', 'A1'), ('Z', 'X', 'Y', 'E')),(('Z', 'X', 'Y', 'E'), ('Z', 'A3', 'Y', 'E')),(('Z', 'X', 'Y', 'E'),('Z', 'X', 'D', 'E')),(('Z', 'X', 'D', 'E'),('Z', 'X', 'A2', 'D')),(('Z', 'X', 'D', 'E'),('A4', 'E', 'D', 'F'))])
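
Before attaching potentials, it can be useful to confirm that the hand-built structure really is a join tree. The sketch below checks two conditions with the standard library only: the running intersection property (for each variable, the cliques containing it form a connected subtree) and family coverage (each variable together with its parents fits inside some clique). The labels `C1` to `C6` and the helper name are assumptions for this illustration.

```python
cliques = {
    "C1": {"Z", "X", "Y", "A1"},
    "C2": {"Z", "X", "Y", "E"},
    "C3": {"Z", "A3", "Y", "E"},
    "C4": {"Z", "X", "D", "E"},
    "C5": {"Z", "X", "A2", "D"},
    "C6": {"A4", "E", "D", "F"},
}
tree_edges = [("C1", "C2"), ("C2", "C3"), ("C2", "C4"), ("C4", "C5"), ("C4", "C6")]
# Each gate output together with its parents in the DAG.
families = [
    {"Z", "A1", "X", "Y"},
    {"D", "A2", "X", "Z"},
    {"E", "A3", "Y", "Z"},
    {"F", "A4", "D", "E"},
]


def induces_connected_subtree(variable):
    """True if the cliques containing the variable are connected in the tree."""
    holders = {name for name, clique in cliques.items() if variable in clique}
    start = next(iter(holders))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for u, v in tree_edges:
            for here, there in ((u, v), (v, u)):
                if here == node and there in holders and there not in seen:
                    seen.add(there)
                    stack.append(there)
    return seen == holders


variables = set().union(*cliques.values())
running_intersection = all(induces_connected_subtree(v) for v in variables)
families_covered = all(any(fam <= c for c in cliques.values()) for fam in families)
print(running_intersection, families_covered)
```

Family coverage is what allows every CPD to be allocated to a clique when the clique potentials are computed.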


## Creating DOT and PNG files for the Join Tree from the Tree Clustering Algorithm 

The following code uses our **gviz()** function to create the **DOT** and **PNG** files named *circuit_JT* for the circuit diagnosis Join Tree computed by the Tree clustering algorithm. Note that the nodes and edges parameters for **gviz()** are inputted from the nodes and edges of the **JunctionTree** object *circuit_JT*.

In [214]:
gviz(circuit_JT.nodes, circuit_JT.edges, "circuit_JT")

## Getting the Potential Functions of the Markov Model

The following code maps the CPDs of the Bayesian network, which are *Pgmpy* **TabularCPD** objects, to *Pgmpy* **DiscreteFactor** objects using the **to_factor()** method of the **TabularCPD** objects.

The code transforms the representation of the circuit BN to a Markov model where the *DiscreteFactor* objects represent the potential functions of the Markov model.

In [215]:
phi_A1 = cpd_A1.to_factor()
phi_A2 = cpd_A2.to_factor()
phi_A3 = cpd_A3.to_factor()
phi_A4 = cpd_A4.to_factor()
phi_X = cpd_X.to_factor()
phi_Y = cpd_Y.to_factor()
phi_Z = cpd_Z.to_factor()
phi_D = cpd_D.to_factor()
phi_E = cpd_E.to_factor()
phi_F = cpd_F.to_factor()


**Computing the Potential Functions for the Join Tree**

The following code computes the potential functions for the maximum cliques computed by the tree clustering algorithm for our circuit diagnosis Bayesian network. 

The computed join tree has six cliques, which are the following subsets of the Bayesian network variables:

C1: ('Z', 'X', 'Y', 'A1'),  
C2: ('Z', 'X', 'Y', 'E'),  
C3: ('Z', 'A3', 'Y', 'E'),  
C4: ('Z', 'X', 'D', 'E'),  
C5: ('Z', 'X', 'A2', 'D'),  
C6: ('A4', 'E', 'D', 'F')

To compute the potential function for each clique, we multiply all potential functions of the Markov model allocated to the clique. If a clique does not have any potential function allocated, we assign a potential function for the clique whose value equals 1 for all value instantiations of the clique variables. 

We use the **product()** method of the *DiscreteFactor* objects to multiply them, and the **Numpy** library to define the potential functions whose values are all equal to 1.

In [216]:
from pgmpy.factors.discrete import DiscreteFactor
import numpy as np
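# product() modifies the factor in place by default (inplace=True), so each
# phi_C* below is the same object as the Markov-model factor it starts from.
# clique C1: ('Z', 'X', 'Y', 'A1')
phi_C1 = phi_Z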
phi_C1.product(phi_X)
phi_C1.product(phi_Y)
phi_C1.product(phi_A1)
print(phi_C1)
# clique C2: ('Z', 'X', 'Y', 'E')
phi_C2 = DiscreteFactor(['Z', 'X', 'Y', 'E'], [2, 2, 2, 2], np.ones(16) )
print(phi_C2)
# clique C3: ('Z', 'A3', 'Y', 'E')
phi_C3 = phi_E
phi_C3.product(phi_A3)
print(phi_C3)
# clique C4: ('Z', 'X', 'D', 'E')
phi_C4 = DiscreteFactor(('Z', 'X', 'D', 'E'), [2, 2, 2, 2], np.ones(16))
print(phi_C4)
# clique C5: ('Z', 'X', 'A2', 'D')
phi_C5 = phi_D
phi_C5.product(phi_A2)
print(phi_C5)
# clique C6: ('A4', 'E', 'D', 'F')
phi_C6 = phi_F
phi_C6.product(phi_A4)
print(phi_C6)

+------+------+------+-------+-----------------+
| X    | Z    | Y    | A1    |   phi(X,Z,Y,A1) |
| X(0) | Z(0) | Y(0) | A1(0) |          0.0000 |
+------+------+------+-------+-----------------+
| X(0) | Z(0) | Y(0) | A1(1) |          0.0063 |
+------+------+------+-------+-----------------+
| X(0) | Z(0) | Y(1) | A1(0) |          0.2375 |
+------+------+------+-------+-----------------+
| X(0) | Z(0) | Y(1) | A1(1) |          0.0063 |
+------+------+------+-------+-----------------+
| X(0) | Z(1) | Y(0) | A1(0) |          0.2375 |
+------+------+------+-------+-----------------+
| X(0) | Z(1) | Y(0) | A1(1) |          0.0063 |
+------+------+------+-------+-----------------+
| X(0) | Z(1) | Y(1) | A1(0) |          0.0000 |
+------+------+------+-------+-----------------+
| X(0) | Z(1) | Y(1) | A1(1) |          0.0063 |
+------+------+------+-------+-----------------+
| X(1) | Z(0) | Y(0) | A1(0) |          0.2375 |
+------+------+------+-------+-----------------+
| X(1) | Z(0) | Y(0)
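
The entries of the first table can be reproduced by hand. The sketch below recomputes the clique potential for C1 as the pointwise product of the four factors allocated to it, using plain dictionaries instead of *DiscreteFactor* objects. The names `phi_z`, `phi_x`, `phi_y`, `phi_a1`, and `phi_c1` are assumptions for this illustration, and the key order (X, Z, Y, A1) follows the printed table.

```python
from itertools import product

EPS = 1e-9


def phi_z(z, a1, x, y):
    """phi(Z, A1, X, Y): the NOR-gate CPD of Z, as in cpd_Z."""
    if a1 == 1:
        return 0.5
    nor = int(x == 0 and y == 0)
    return 1 - EPS if z == nor else EPS


phi_x = {0: 0.5, 1: 0.5}
phi_y = {0: 0.5, 1: 0.5}
phi_a1 = {0: 0.95, 1: 0.05}

# Pointwise product over every joint state of the clique (X, Z, Y, A1).
phi_c1 = {
    (x, z, y, a1): phi_z(z, a1, x, y) * phi_x[x] * phi_y[y] * phi_a1[a1]
    for x, z, y, a1 in product([0, 1], repeat=4)
}
print(phi_c1[(0, 0, 1, 0)], phi_c1[(0, 0, 0, 1)])
```

For example, the entry for X(0), Z(0), Y(1), A1(0) is approximately 1 x 0.5 x 0.5 x 0.95 = 0.2375, matching the printed table.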

## Quantification of the Join Tree

The following code quantifies the join tree by adding the potential functions of the cliques, which are *Pgmpy* *DiscreteFactor* objects, as the join tree factors.
 
To add the factors to the join tree object, *circuit_JT*, we use the **add_factors()** method of the *Pgmpy* *JunctionTree* object.

Next, we use the **check_model()** method of the *Pgmpy* *JunctionTree* object to verify the join tree model.

In [217]:
circuit_JT.add_factors(phi_C1, phi_C2, phi_C3, phi_C4, phi_C5, phi_C6)
circuit_JT.check_model()

True

## Inference by Belief Propagation

The following code imports the **BeliefPropagation** class from the **pgmpy.inference.ExactInference** module.

The code uses the **query()** method of the **BeliefPropagation** object.

The query takes two parameters:

1. a list of variables over which the posterior distribution is defined

2. a dictionary of observed variable-value pairs on which the posterior distribution is conditioned.

## Belief Propagation using the Pgmpy Join Tree

In the following, we formulate queries for inference tasks using belief propagation with the join tree 

***JT***

computed by the **Pgmpy** method **to_junction_tree()**.

### Query 1.1- Finding the Posterior Probability on assumption variables given evidence e1

The query in the code below returns the posterior probability distribution over all four variables
 
['A1', 'A2', 'A3', 'A4'] 

conditioned on the evidence **e1**, where the input variables 'X' and 'Y' are on (value 1) and the output variable 'F' is off (value 0):

{'X': 1, 'Y': 1, 'F': 0}

The posterior probability distribution determines the probability over all single and multiple faults for the circuit components.


In [230]:
from pgmpy.inference.ExactInference import BeliefPropagation

belief_propagation = BeliefPropagation(JT)
res1 = belief_propagation.query(
    variables=['A1', 'A2', 'A3', 'A4'],
    evidence={'X': 1, 'Y': 1, 'F': 0})
print(res1)

+-------+-------+-------+-------+--------------------+
| A1    | A2    | A3    | A4    |   phi(A1,A2,A3,A4) |
| A1(0) | A2(0) | A3(0) | A4(0) |             0.0000 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(0) | A3(0) | A4(1) |             0.2981 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(0) | A3(1) | A4(0) |             0.2981 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(0) | A3(1) | A4(1) |             0.0157 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(1) | A3(0) | A4(0) |             0.2981 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(1) | A3(0) | A4(1) |             0.0157 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(1) | A3(1) | A4(0) |             0.0235 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(1) | A3(1) | A4(1) |             0.0008 |
+-------+-------+-------+-------+--------------------+
| A1(1) | 

### Query 1.2- Finding the Posterior Probability on assumption variables given evidence e2

The query in the code below finds the posterior probability distribution over all four variables, 

['A1', 'A2', 'A3', 'A4'] 

given the evidence **e2** where the input variables 'X' and 'Y' are on (value 1), the output variable 'F' is off (value 0), and the output variable 'D' of component #2 is on (value 1)

{'X': 1, 'Y': 1, 'F': 0, 'D': 1}

The query results are expected to give us a comprehensive probability distribution over all single and multiple faults for the circuit components. 

Relative to Query 1.1, the added observation of 'D' in Query 1.2 makes the inference more specific: in the visible part of the output below, every configuration with A2 = 0 has posterior probability 0.0000.

In [219]:
from pgmpy.inference.ExactInference import BeliefPropagation

belief_propagation = BeliefPropagation(JT)
res2 = belief_propagation.query(
    variables=['A1', 'A2', 'A3', 'A4'],
    evidence={'X': 1, 'Y': 1, 'F': 0, 'D': 1})
print(res2)

+-------+-------+-------+-------+--------------------+
| A1    | A2    | A3    | A4    |   phi(A1,A2,A3,A4) |
| A1(0) | A2(0) | A3(0) | A4(0) |             0.0000 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(0) | A3(0) | A4(1) |             0.0000 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(0) | A3(1) | A4(0) |             0.0000 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(0) | A3(1) | A4(1) |             0.0000 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(1) | A3(0) | A4(0) |             0.8794 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(1) | A3(0) | A4(1) |             0.0231 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(1) | A3(1) | A4(0) |             0.0463 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(1) | A3(1) | A4(1) |             0.0012 |
+-------+-------+-------+-------+--------------------+
| A1(1) | 
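
Because the network has only ten binary variables, both posteriors can be cross-checked by brute force: sum the joint distribution over all assignments consistent with the evidence and normalize. The sketch below does this with the standard library only. It is an independent check under the CPDs defined earlier, not how belief propagation computes the result, and the names `p_gate`, `joint`, and `posterior` are assumptions for this illustration.

```python
from itertools import product

EPS = 1e-9
NAMES = ["X", "Y", "A1", "A2", "A3", "A4", "Z", "D", "E", "F"]


def p_gate(out, fault, in1, in2):
    """P(out | fault, in1, in2) for a NOR gate, mirroring the CPD tables."""
    if fault == 1:
        return 0.5
    nor = int(in1 == 0 and in2 == 0)
    return 1 - EPS if out == nor else EPS


def joint(s):
    """Joint probability of one full assignment s (a dict name -> 0/1)."""
    prior = 0.25  # P(X) * P(Y), both uniform
    for a in ("A1", "A2", "A3", "A4"):
        prior *= 0.05 if s[a] == 1 else 0.95
    return (
        prior
        * p_gate(s["Z"], s["A1"], s["X"], s["Y"])
        * p_gate(s["D"], s["A2"], s["X"], s["Z"])
        * p_gate(s["E"], s["A3"], s["Y"], s["Z"])
        * p_gate(s["F"], s["A4"], s["D"], s["E"])
    )


def posterior(query, evidence):
    """P(query | evidence) by summing the joint over all 2**10 assignments."""
    table = {}
    for values in product([0, 1], repeat=len(NAMES)):
        s = dict(zip(NAMES, values))
        if any(s[name] != value for name, value in evidence.items()):
            continue
        key = tuple(s[name] for name in query)
        table[key] = table.get(key, 0.0) + joint(s)
    total = sum(table.values())
    return {key: value / total for key, value in table.items()}


query = ["A1", "A2", "A3", "A4"]
post_e1 = posterior(query, {"X": 1, "Y": 1, "F": 0})
post_e2 = posterior(query, {"X": 1, "Y": 1, "F": 0, "D": 1})
print(post_e1[(0, 0, 0, 1)], post_e2[(0, 1, 0, 0)])
```

The two printed entries correspond to the rows A1(0), A2(0), A3(0), A4(1) under **e1** and A1(0), A2(1), A3(0), A4(0) under **e2** in the query outputs above.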

### Query 1.3- Computing the MAP for assumption variables given evidence e1

The query in the code below computes the maximum a posteriori (MAP) hypothesis over all four variables,

['A1', 'A2', 'A3', 'A4'],

subject to the evidence **e1** where the input variables 'X' and 'Y' are on (value 1) and the output variable 'F' is off (value 0),

{'X': 1, 'Y': 1, 'F': 0}

The MAP query returns the joint value assignment to the four assumption variables that has the maximum posterior probability given the evidence **e1**, with the other unobserved variables summed out.

In [220]:
from pgmpy.inference.ExactInference import BeliefPropagation

belief_propagation = BeliefPropagation(JT)
res3 = belief_propagation.map_query(
    variables=['A1', 'A2', 'A3', 'A4'],
    evidence={'X': 1, 'Y': 1, 'F': 0})
print(res3)

{'A1': 0, 'A2': 0, 'A3': 1, 'A4': 0}
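
A MAP query amounts to an argmax over the posterior table. The sketch below applies that argmax to the rows visible in the Query 1.1 output, which use the same evidence **e1**. It only covers the A1 = 0 rows, since the remaining rows are cut off in that output and are not guessed here; the variable names are assumptions for this illustration.

```python
# Keys are (A1, A2, A3, A4); values are copied from the visible part of the
# Query 1.1 output. Rows with A1 = 1 are truncated there and are omitted.
visible_rows = {
    (0, 0, 0, 0): 0.0000,
    (0, 0, 0, 1): 0.2981,
    (0, 0, 1, 0): 0.2981,
    (0, 0, 1, 1): 0.0157,
    (0, 1, 0, 0): 0.2981,
    (0, 1, 0, 1): 0.0157,
    (0, 1, 1, 0): 0.0235,
    (0, 1, 1, 1): 0.0008,
}

best = max(visible_rows.values())
maximizers = [key for key, value in visible_rows.items() if value == best]
print(best, maximizers)
```

At the four-decimal precision of the printed table, three single-fault assignments share the largest value, and the assignment returned by **map_query()** is one of them. When maximizers are tied or nearly tied like this, which one is reported can depend on implementation details.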


### Query 1.4- Computing the MAP for assumption variables given evidence e2

The query in the code below computes the maximum a posteriori (MAP) hypothesis over all four variables,

['A1', 'A2', 'A3', 'A4'],

given the evidence **e2** where the input variables 'X' and 'Y' are on (value 1), the output variable 'F' is off (value 0), and the output variable 'D' of component #2 is on (value 1)

{'X': 1, 'Y': 1, 'F': 0, 'D': 1}

Relative to Query 1.3, the added observation in Query 1.4 strengthens our confidence in the computed MAP hypothesis: the MAP assignment under **e2** has posterior probability 0.8794 in the Query 1.2 output, compared to 0.2981 for the MAP assignment under **e1** in the Query 1.1 output.

In [221]:
from pgmpy.inference.ExactInference import BeliefPropagation

belief_propagation = BeliefPropagation(JT)
res4 = belief_propagation.map_query(
    variables=['A1', 'A2', 'A3', 'A4'],
    evidence={'X': 1, 'Y': 1, 'F': 0, 'D': 1})
print(res4)

{'A1': 0, 'A2': 1, 'A3': 0, 'A4': 0}


### Query 1.5- Computing the MPE given evidence e1

The code below computes the most probable explanation (MPE) over the unobserved variables,
 
['A1', 'A2', 'A3', 'A4', 'Z', 'D', 'E']

subject to the evidence **e1** where the input variables 'X' and 'Y' are on (value 1) and the output variable 'F' is off (value 0),

{'X': 1, 'Y': 1, 'F': 0}

In [222]:
from pgmpy.inference.ExactInference import BeliefPropagation

belief_propagation = BeliefPropagation(JT)
res5 = belief_propagation.map_query(
    variables=['A1', 'A2', 'A3', 'A4', 'Z', 'D', 'E'],
    evidence={'X': 1, 'Y': 1, 'F': 0})
print(res5)

{'A1': 0, 'A2': 0, 'A3': 1, 'A4': 0, 'Z': 0, 'D': 0, 'E': 1}


### Query 1.6- Computing the MPE given evidence e2

The code below computes the most probable explanation (MPE) over the unobserved variables,
 
['A1', 'A2', 'A3', 'A4', 'Z', 'E']

given the evidence **e2** where the input variables 'X' and 'Y' are on (value 1), the output variable 'F' is off (value 0), and the output variable 'D' of component #2 is on (value 1)

{'X': 1, 'Y': 1, 'F': 0, 'D': 1}

In [223]:
from pgmpy.inference.ExactInference import BeliefPropagation

belief_propagation = BeliefPropagation(JT)
res6 = belief_propagation.map_query(
    variables=['A1', 'A2', 'A3', 'A4', 'Z', 'E'],
    evidence={'X': 1, 'Y': 1, 'F': 0, 'D': 1})
print(res6)


{'A1': 0, 'A2': 1, 'A3': 0, 'A4': 0, 'Z': 0, 'E': 0}


## Belief Propagation using the Join Tree Computed by the Tree Clustering Algorithm

In the following, we formulate queries for inference tasks using belief propagation with the join tree 

***circuit_JT***

computed by the *Tree Clustering Algorithm*.

### Query 2.1- Finding the Posterior Probability on assumption variables given evidence e1

The query in the code below returns the posterior probability distribution over all four variables
 
['A1', 'A2', 'A3', 'A4'] 

conditioned on the evidence **e1**, where the input variables 'X' and 'Y' are on (value 1) and the output variable 'F' is off (value 0):

{'X': 1, 'Y': 1, 'F': 0}

The posterior probability distribution determines the probability over all single and multiple faults for the circuit components.

In [224]:
from pgmpy.inference.ExactInference import BeliefPropagation

belief_propagation = BeliefPropagation(circuit_JT)
res1 = belief_propagation.query(
    variables=['A1', 'A2', 'A3', 'A4'],
    evidence={'X': 1, 'Y': 1, 'F': 0})
print(res1)

+-------+-------+-------+-------+--------------------+
| A1    | A2    | A3    | A4    |   phi(A1,A2,A3,A4) |
| A1(0) | A2(0) | A3(0) | A4(0) |             0.0000 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(0) | A3(0) | A4(1) |             0.2981 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(0) | A3(1) | A4(0) |             0.2981 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(0) | A3(1) | A4(1) |             0.0157 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(1) | A3(0) | A4(0) |             0.2981 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(1) | A3(0) | A4(1) |             0.0157 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(1) | A3(1) | A4(0) |             0.0235 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(1) | A3(1) | A4(1) |             0.0008 |
+-------+-------+-------+-------+--------------------+
| A1(1) | 

### Query 2.2- Finding the Posterior Probability on assumption variables given evidence e2

The query in the code below finds the posterior probability distribution over all four variables, 

['A1', 'A2', 'A3', 'A4'] 

given the evidence **e2** where the input variables 'X' and 'Y' are on (value 1), the output variable 'F' is off (value 0), and the output variable 'D' of component #2 is on (value 1)

{'X': 1, 'Y': 1, 'F': 0, 'D': 1}

The query results are expected to give us a comprehensive probability distribution over all single and multiple faults for the circuit components. 

Relative to Query 2.1, the added observation of 'D' in Query 2.2 makes the inference more specific: in the visible part of the output below, every configuration with A2 = 0 has posterior probability 0.0000.

In [225]:
from pgmpy.inference.ExactInference import BeliefPropagation

belief_propagation = BeliefPropagation(circuit_JT)
res2 = belief_propagation.query(
    variables=['A1', 'A2', 'A3', 'A4'],
    evidence={'X': 1, 'Y': 1, 'F': 0, 'D': 1})
print(res2)

+-------+-------+-------+-------+--------------------+
| A1    | A2    | A3    | A4    |   phi(A1,A2,A3,A4) |
| A1(0) | A2(0) | A3(0) | A4(0) |             0.0000 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(0) | A3(0) | A4(1) |             0.0000 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(0) | A3(1) | A4(0) |             0.0000 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(0) | A3(1) | A4(1) |             0.0000 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(1) | A3(0) | A4(0) |             0.8794 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(1) | A3(0) | A4(1) |             0.0231 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(1) | A3(1) | A4(0) |             0.0463 |
+-------+-------+-------+-------+--------------------+
| A1(0) | A2(1) | A3(1) | A4(1) |             0.0012 |
+-------+-------+-------+-------+--------------------+
| A1(1) | 

### Query 2.3- Computing the MAP for assumption variables given evidence e1

The query in the code below computes the maximum a posteriori (MAP) hypothesis over all four variables,

['A1', 'A2', 'A3', 'A4'],

subject to the evidence **e1** where the input variables 'X' and 'Y' are on (value 1) and the output variable 'F' is off (value 0),

{'X': 1, 'Y': 1, 'F': 0}

The MAP query returns the joint value assignment to the four assumption variables that has the maximum posterior probability given the evidence **e1**, with the other unobserved variables summed out.

In [226]:
from pgmpy.inference.ExactInference import BeliefPropagation

belief_propagation = BeliefPropagation(circuit_JT)
res3 = belief_propagation.map_query(
    variables=['A1', 'A2', 'A3', 'A4'],
    evidence={'X': 1, 'Y': 1, 'F': 0})
print(res3)

{'A1': 0, 'A2': 0, 'A3': 1, 'A4': 0}


### Query 2.4- Computing the MAP for assumption variables given evidence e2

The query in the code below computes the maximum a posteriori (MAP) hypothesis over all four variables,

['A1', 'A2', 'A3', 'A4'],

given the evidence **e2** where the input variables 'X' and 'Y' are on (value 1), the output variable 'F' is off (value 0), and the output variable 'D' of component #2 is on (value 1)

{'X': 1, 'Y': 1, 'F': 0, 'D': 1}

Relative to Query 2.3, the added observation in Query 2.4 strengthens our confidence in the computed MAP hypothesis: the MAP assignment under **e2** has posterior probability 0.8794 in the Query 2.2 output, compared to 0.2981 for the MAP assignment under **e1** in the Query 2.1 output.

In [227]:
from pgmpy.inference.ExactInference import BeliefPropagation

belief_propagation = BeliefPropagation(circuit_JT)
res4 = belief_propagation.map_query(
    variables=['A1', 'A2', 'A3', 'A4'],
    evidence={'X': 1, 'Y': 1, 'F': 0, 'D': 1})
print(res4)

{'A1': 0, 'A2': 1, 'A3': 0, 'A4': 0}


### Query 2.5- Computing the MPE given evidence e1

The code below computes the most probable explanation (MPE) over the unobserved variables,
 
['A1', 'A2', 'A3', 'A4', 'Z', 'D', 'E']

subject to the evidence **e1** where the input variables 'X' and 'Y' are on (value 1) and the output variable 'F' is off (value 0),

{'X': 1, 'Y': 1, 'F': 0}


In [228]:
from pgmpy.inference.ExactInference import BeliefPropagation

belief_propagation = BeliefPropagation(circuit_JT)
res5 = belief_propagation.map_query(
    variables=['A1', 'A2', 'A3', 'A4','Z', 'D','E'],
    evidence={'X': 1, 'Y': 1, 'F': 0})
print(res5)

{'A1': 0, 'A2': 0, 'A3': 0, 'A4': 1, 'Z': 0, 'D': 0, 'E': 0}
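
This explanation differs from the one returned in Query 1.5 for the same evidence. One way to see whether the two answers conflict is to evaluate the joint probability of each full assignment directly from the CPDs. The sketch below does that with the standard library only; the helper names are assumptions for this illustration, and the gate probabilities mirror the CPD tables defined earlier.

```python
import math

EPS = 1e-9


def p_gate(out, fault, in1, in2):
    """P(out | fault, in1, in2) for a NOR gate, mirroring the CPD tables."""
    if fault == 1:
        return 0.5
    nor = int(in1 == 0 and in2 == 0)
    return 1 - EPS if out == nor else EPS


def joint(s):
    """Joint probability of one full assignment s (a dict name -> 0/1)."""
    prior = 0.25  # P(X) * P(Y), both uniform
    for a in ("A1", "A2", "A3", "A4"):
        prior *= 0.05 if s[a] == 1 else 0.95
    return (
        prior
        * p_gate(s["Z"], s["A1"], s["X"], s["Y"])
        * p_gate(s["D"], s["A2"], s["X"], s["Z"])
        * p_gate(s["E"], s["A3"], s["Y"], s["Z"])
        * p_gate(s["F"], s["A4"], s["D"], s["E"])
    )


evidence = {"X": 1, "Y": 1, "F": 0}
# Assignments as printed by Query 1.5 and Query 2.5.
mpe_pgmpy_jt = {"A1": 0, "A2": 0, "A3": 1, "A4": 0, "Z": 0, "D": 0, "E": 1}
mpe_circuit_jt = {"A1": 0, "A2": 0, "A3": 0, "A4": 1, "Z": 0, "D": 0, "E": 0}

p_first = joint({**evidence, **mpe_pgmpy_jt})
p_second = joint({**evidence, **mpe_circuit_jt})
print(p_first, p_second, math.isclose(p_first, p_second, rel_tol=1e-6))
```

Under these CPDs the two assignments have essentially the same joint probability, so both are equally good explanations of **e1**; the two join trees appear to break the tie differently rather than disagree.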


### Query 2.6- Computing the MPE given evidence e2

The code below computes the most probable explanation (MPE) over the unobserved variables,
 
['A1', 'A2', 'A3', 'A4', 'Z', 'E']

given the evidence **e2** where the input variables 'X' and 'Y' are on (value 1), the output variable 'F' is off (value 0), and the output variable 'D' of component #2 is on (value 1)

{'X': 1, 'Y': 1, 'F': 0, 'D': 1} 

In [229]:
from pgmpy.inference.ExactInference import BeliefPropagation

belief_propagation = BeliefPropagation(circuit_JT)
res6 = belief_propagation.map_query(
    variables=['A1', 'A2', 'A3', 'A4', 'Z', 'E'],
    evidence={'X': 1, 'Y': 1, 'F': 0, 'D': 1})
print(res6)

{'A1': 0, 'A2': 1, 'A3': 0, 'A4': 0, 'Z': 0, 'E': 0}
