# Introduction to Markov Networks



## CSCI E-83
## Stephen Elston

In the previous lesson we explored directed Bayes networks (BNs). BNs are directed acyclic graphs (DAGs) that can represent the causal structure of many probability distributions. Further, BNs can express the independencies of probability distributions. However, the representation of a given set of independencies is not unique. Many BNs can exhibit the same independency structure, but with different causal relationships.

Now, we will turn our attention to another class of probabilistic graphical models known as **Markov networks** (**MNs**) or **Markov graphical models**. Markov graphical models are not directed graphs. Yet, like BNs, Markov graphical models can represent probability distributions, including their independencies. In fact, as you will see, there are methods to map from BNs to Markov networks and back.


## Graph separation criteria

You can develop a useful view of the relationship between BNs and MNs by building on the **D-separation** (**directed-separation**) criteria. First, we need to introduce a definition:

> **Definition:** An **immorality** in a directed graph $G$ occurs where $X$ and $Y$ are both parents of the same node $Z$, but there is no directed edge between $X$ and $Y$.

This leads to a concept of a **moralized graph** that relates a directed BN to an undirected MN:

> **Definition:** A **moral graph**, $M(G)$, of a BN structure $G$ over $X$ is the **undirected graph** over $X$ that contains an undirected edge between $X$ and $Y$ if either: a) there is a directed edge between $X$ and $Y$, or b) $X$ and $Y$ are both parents of the same node $Z$.

This leads us to a corollary relating the independencies of the directed BN to the independencies of an MN:

> **Corollary:** Given a distribution $P_B$ such that $B$ is a parameterization on a graph $G$, then $M(G)$ is an I-map for $P_B$.

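What does all of this mean? Every independence you can read off the moral graph $M(G)$ also holds in $P_B$. Moralization can only add edges, so the moral graph may lose some of the independencies encoded in the BN, but it never asserts an independence that does not hold.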

<img src="img/MoralizedGraph.JPG" alt="Drawing" style="width:600px; height:200px"/>
<center> **Example of Graph Moralization** </center>
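
The moral graph definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `moralize`, the parent-list representation of the DAG, and the three-node example graph are all assumptions made for this sketch.

```python
from itertools import combinations

def moralize(parents):
    """Return the undirected edges of the moral graph of a DAG.

    `parents` maps each node to the list of its parents
    (a hypothetical representation chosen for this sketch).
    """
    edges = set()
    for child, pars in parents.items():
        # (a) Keep every directed edge, dropping its direction.
        for p in pars:
            edges.add(frozenset((p, child)))
        # (b) Connect every pair of parents that share this child.
        for p, q in combinations(pars, 2):
            edges.add(frozenset((p, q)))
    return edges

# Hypothetical v-structure: X -> Z <- Y
dag = {"X": [], "Y": [], "Z": ["X", "Y"]}
moral_edges = moralize(dag)
print(sorted(tuple(sorted(e)) for e in moral_edges))
# prints [('X', 'Y'), ('X', 'Z'), ('Y', 'Z')]
```

The edge between `X` and `Y` is not in the DAG. It appears only because both nodes are parents of `Z`.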

## Representation with MNs

Markov networks can represent a probability distribution using **potentials** on an **undirected graph** $H$:

> **Definition:** A distribution $P(X_1,\ldots,X_n)$ can be represented by an undirected graph $H$ using a set of positive **potential functions** $\psi_c(X_c)$ associated with the cliques of $H$:

$$P(X_1,\ldots,X_n) = \frac{1}{Z} \prod_{c \in C} \psi_c(X_c)$$

where

$$Z = \sum_{X_1,\ldots,X_n} \prod_{c \in C} \psi_c(X_c)$$

We call $Z$ the **partition function**. 
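
To make the role of $Z$ concrete, here is a small sketch that computes the partition function by brute-force enumeration. The chain $A - B - C$, its binary variables, and the potential values are all hypothetical and chosen only for illustration. Enumeration is practical only for very small graphs, since the number of joint states grows exponentially with the number of variables.

```python
from itertools import product
import math

# Hypothetical potentials for a chain A - B - C with binary variables.
# The cliques are {A, B} and {B, C}; the values are illustrative only.
psi_ab = {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}
psi_bc = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 4.0}

def unnormalized(a, b, c):
    """Product of the clique potentials for one joint assignment."""
    return psi_ab[(a, b)] * psi_bc[(b, c)]

# Enumerate every joint state and sum the products to obtain Z.
states = list(product([0, 1], repeat=3))
Z = sum(unnormalized(*s) for s in states)

# Dividing by Z turns the products into a proper distribution.
P = {s: unnormalized(*s) / Z for s in states}

print(Z)  # prints 27.0
print(math.isclose(sum(P.values()), 1.0))  # prints True
```

The potentials themselves are not probabilities. Only after division by $Z$ do the values sum to one.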

In other words, we can factor a probability distribution on a graphical model into potentials. Let's look at an example of factorizing a distribution on a graph, using the student letter example we worked on before.

<img src="img/LetterDAG.JPG" alt="Drawing" style="width:600px; height:400px"/>
<center> **DAG for student letter and GRE score** </center>

Referring to the above figure, we can factorize the directed graph, or BN, using conditional probabilities:

$$P(I,D,G,S,L) = P(I)\ P(D)\ P(S\ |\ I)\ P(G\ |\ I, D)\ P(L\ |\ G)$$

Notice that the **directed edges** of the graph provide **causal relationships**. 

With **undirected edges** we must model the **correlations** between the variables. We factorize the distribution on the MN using potentials. There is a potential for each **clique** on the **moralized** undirected graph. As a first step we need to create the undirected graph and moralize it:

<img src="img/MoralizedLetter.JPG" alt="Drawing" style="width:600px; height:400px"/>
<center> **Transforming DAG to Moralized MN** </center>

The graph is now undirected. The addition of the edge between $I$ and $D$ moralizes this graph.

## Cliques on undirected graphs

In order to factorize an undirected graph we must first decompose it into cliques. We can define a clique as follows:

> **Definition:** A **clique** is a fully connected set of nodes on an undirected graph. That is, every pair of nodes in the set is joined by an edge.



Given this definition it is easy to find the cliques of our MN as illustrated in the figure below:

<img src="img/LetterCliques.JPG" alt="Drawing" style="width:600px; height:400px"/>
<center> **Cliques of the Undirected Markov Network** </center>
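
As a rough check on the clique definition, the sketch below enumerates the maximal cliques of the moralized student graph by brute force. The edge list is read off the DAG factorization plus the added $I$ - $D$ edge; the helper `is_clique` and the variable names are assumptions of this sketch. The exhaustive search is only sensible for a graph this small.

```python
from itertools import combinations

# Undirected edges of the moralized student graph.
edges = {frozenset(e) for e in [
    ("D", "G"), ("I", "G"), ("I", "S"), ("G", "L"),  # from the DAG
    ("D", "I"),                                      # added by moralization
]}
nodes = sorted(set().union(*edges))

def is_clique(subset):
    """True if every pair of nodes in `subset` is joined by an edge."""
    return all(frozenset(pair) in edges for pair in combinations(subset, 2))

# All fully connected subsets with at least two nodes.
cliques = [set(c)
           for r in range(2, len(nodes) + 1)
           for c in combinations(nodes, r)
           if is_clique(c)]

# Keep only the cliques that are not contained in a larger clique.
maximal = [c for c in cliques if not any(c < other for other in cliques)]

print(sorted(sorted(c) for c in maximal))
# prints [['D', 'G', 'I'], ['G', 'L'], ['I', 'S']]
```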

## Factorization on Markov networks

Now, we are in a position to factorize the MN into a set of clique potentials:

$$\begin{aligned}
P(I,D,G,S,L) &= \frac{1}{Z} \exp\{ E(D,I,G) + E(G,L) + E(I,S) \} \\
&= \frac{1}{Z} \psi(D,I,G)\ \psi(G,L)\ \psi(I,S)
\end{aligned}$$

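Where  
$Z = $ the **partition function**,  
$E() = $ the **log-potential** function, so that $\psi() = \exp\{E()\}$ (with this sign convention, $E$ is the negative of the usual **energy** function), and  
$\psi() = $ **clique potential**.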
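
The two forms of the factorization above can be compared numerically. The sketch below is illustrative only: it treats all five variables as binary (a simplifying assumption), and the functions `E_dig`, `E_gl`, and `E_is` with their weights are hypothetical values invented for this example.

```python
import math
from itertools import product

# Hypothetical log-potentials, one per clique; the weights are made up.
def E_dig(d, i, g):
    return 0.5 * (d == g) + 1.0 * (i == g)

def E_gl(g, l):
    return 1.5 * (g == l)

def E_is(i, s):
    return 2.0 * (i == s)

def score_exp_sum(d, i, g, s, l):
    """exp of the summed log-potentials."""
    return math.exp(E_dig(d, i, g) + E_gl(g, l) + E_is(i, s))

def score_product(d, i, g, s, l):
    """Product of the clique potentials psi = exp(E)."""
    return (math.exp(E_dig(d, i, g))
            * math.exp(E_gl(g, l))
            * math.exp(E_is(i, s)))

# Enumerate all joint states (D, I, G, S, L), assuming binary variables.
states = list(product([0, 1], repeat=5))
Z = sum(score_exp_sum(*x) for x in states)
P = {x: score_exp_sum(*x) / Z for x in states}

# Both forms agree on every state, and P sums to one.
print(all(math.isclose(score_exp_sum(*x), score_product(*x)) for x in states))
print(math.isclose(sum(P.values()), 1.0))
```

Working with sums of log-potentials rather than products of potentials is often more convenient numerically, since it avoids multiplying many small or large numbers.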

## Trails