# Probabilistic Reasoning

This notebook serves as the supporting material for Chapter 14, **Probabilistic Reasoning**. In this notebook, we will learn how to use the code repository to build network models that reason under uncertainty according to the laws of probability theory. In the previous notebook, we briefly explained what Bayes' Rule is and how it can be used in probabilistic inference. This notebook introduces a systematic way to represent such conditional relationships in the form of **Bayesian Networks**. We will also look at a variety of approximate inference algorithms and explore ways in which probability theory can be applied to worlds with objects and relations (worlds represented in first-order logic).

## Representing knowledge in an uncertain domain

We saw in the previous notebooks that a full joint probability distribution can answer any question about the domain, but it is computationally expensive because its size grows exponentially with the number of variables. We also saw that independence and conditional independence relationships among variables can greatly reduce the number of probabilities needed to define the full joint distribution. Motivated by these shortcomings of full joint distributions and by the merits of conditional relations between random variables, AI researchers developed a data structure called the Bayesian network. Bayesian networks can represent essentially any full joint probability distribution and in many cases can do so very concisely.

A Bayesian network is a directed graph in which each node is annotated with quantitative probability information. The full specification is as follows:
* Each node corresponds to a random variable, which may be discrete or continuous.
* A set of directed links or arrows connects pairs of nodes. If there is an arrow from node X to node Y, X is said to be a parent of Y. The graph has no directed cycles (and hence is a directed acyclic graph, or DAG).
* Each node $X_i$ has a conditional probability distribution $P(X_i \mid Parents(X_i ))$ that quantifies the effect of the parents on the node.

The topology of the network—the set of nodes and links—specifies the conditional independence relationships that hold in the domain. The
intuitive meaning of an arrow is typically that X has a direct influence on Y, which suggests
that causes should be parents of effects. It is usually easy for a domain expert to decide what
direct influences exist in the domain—much easier, in fact, than actually specifying the probabilities themselves. Once the topology of the Bayesian network is laid out, we need only
specify a conditional probability distribution for each variable, given its parents.
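
The saving can be made concrete with a little arithmetic. The sketch below is illustrative only and does not use the repository's API; the class and method names (`ParameterCount`, `fullJointParameters`, `networkParameters`) are hypothetical. It assumes every variable is Boolean, so a full joint table over $n$ variables needs $2^n - 1$ independent numbers, while a node with $k$ parents needs $2^k$ numbers (one per assignment of its parents).

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ParameterCount {

    /** Independent numbers in a full joint table over n Boolean variables: 2^n - 1. */
    public static long fullJointParameters(int booleanVariables) {
        return (1L << booleanVariables) - 1;
    }

    /**
     * Independent numbers in a Bayesian network over Boolean variables.
     * The map gives the number of parents of each node; a node with k parents
     * contributes 2^k numbers.
     */
    public static long networkParameters(Map<String, Integer> parentCounts) {
        long total = 0;
        for (int k : parentCounts.values()) {
            total += 1L << k;
        }
        return total;
    }

    public static void main(String[] args) {
        // One root with two children, the shape of the Cavity network used later.
        Map<String, Integer> parentCounts = new LinkedHashMap<>();
        parentCounts.put("Cavity", 0);
        parentCounts.put("Toothache", 1);
        parentCounts.put("Catch", 1);

        System.out.println("Full joint: " + fullJointParameters(parentCounts.size()));
        System.out.println("Network:    " + networkParameters(parentCounts));
    }
}
```

For three variables the difference is small (7 numbers versus 5), but the full joint count doubles with every added variable, whereas the network count grows only with the number of nodes and their parent counts.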

Bayesian networks have many interesting properties, which are explained in detail in the text. Readers are advised to go through the text to get a feel for Bayesian networks. In this notebook, our main focus will be on using Bayesian networks to model various problems.

To work with Bayesian networks, let us first load the aima-core jar.

In [2]:
%classpath add jar ../out/artifacts/aima_core_jar/aima-core.jar

### The tooth cavity catch structure

Consider the simple world described in the text, consisting of the variables *Toothache*, *Cavity*, *Catch* and *Weather*. It is easy to see that *Weather* is independent of the other variables. Furthermore, it can be argued that *Toothache* and *Catch* are conditionally independent given *Cavity*. These relationships can be represented by a Bayesian network in which *Cavity* is the parent of both *Toothache* and *Catch*, and *Weather* is an isolated node. Formally, the conditional independence of *Toothache* and *Catch*, given *Cavity*, is indicated by the absence of a link between *Toothache* and *Catch*. Intuitively, the network represents the fact that *Cavity* is a direct cause of *Toothache* and *Catch*, whereas no direct causal relationship exists between *Toothache* and *Catch*.

## APIs from the code repository

Let's look at the structure of the APIs in the code repository. We will go through them by considering, in turn, the three defining points of a Bayesian network.

**Each node corresponds to a random variable, which may be discrete or continuous.**

The `Node` interface represents the node in a Bayesian Network. Given below is a description of the Node interface.
````java
public interface Node {

	/**
	 * 
	 * @return the Random Variable this Node is for/on.
	 */
	RandomVariable getRandomVariable();

	/**
	 * 
	 * @return true if this Node has no parents.
	 * 
	 * @see Node#getParents()
	 */
	boolean isRoot();

	/**
	 * 
	 * @return the parent Nodes for this Node.
	 */
	Set<Node> getParents();

	/**
	 * 
	 * @return the children Nodes for this Node.
	 */
	Set<Node> getChildren();

	/**
	 * Get this Node's Markov Blanket:<br>
	 * 'A node is conditionally independent of all other nodes in the network,
	 * given its parents, children, and children's parents - that is, given its
	 * <b>MARKOV BLANKET</b> (AIMA3e pg, 517).
	 * 
	 * @return this Node's Markov Blanket.
	 */
	Set<Node> getMarkovBlanket();

	/**
	 * 
	 * @return the Conditional Probability Distribution associated with this
	 *         Node.
	 */
	ConditionalProbabilityDistribution getCPD();
}
````
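
The Markov blanket mentioned in the Javadoc above (parents, children, and children's parents) can be computed from the parent and child links alone. The following is a minimal sketch in plain Java, not the repository's implementation: `MarkovBlanketSketch`, `addEdge` and `markovBlanket` are hypothetical names, and nodes are represented simply by their names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class MarkovBlanketSketch {

    // Hypothetical adjacency maps keyed by node name.
    private final Map<String, Set<String>> parents = new HashMap<>();
    private final Map<String, Set<String>> children = new HashMap<>();

    /** Records a directed link parent -> child. */
    public void addEdge(String parent, String child) {
        children.computeIfAbsent(parent, k -> new LinkedHashSet<>()).add(child);
        parents.computeIfAbsent(child, k -> new LinkedHashSet<>()).add(parent);
    }

    /** Parents, children, and children's other parents of the given node. */
    public Set<String> markovBlanket(String node) {
        Set<String> blanket = new LinkedHashSet<>();
        blanket.addAll(parents.getOrDefault(node, Collections.emptySet()));
        for (String child : children.getOrDefault(node, Collections.emptySet())) {
            blanket.add(child);
            blanket.addAll(parents.getOrDefault(child, Collections.emptySet()));
        }
        // The node is a parent of its own children, so drop it from the result.
        blanket.remove(node);
        return blanket;
    }

    public static void main(String[] args) {
        MarkovBlanketSketch net = new MarkovBlanketSketch();
        net.addEdge("Cavity", "Toothache");
        net.addEdge("Cavity", "Catch");
        System.out.println(net.markovBlanket("Cavity"));    // [Toothache, Catch]
        System.out.println(net.markovBlanket("Toothache")); // [Cavity]
    }
}
```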

This interface can be implemented to obtain customised nodes. A default implementation is provided in the repository via the `FullCPTNode` class.

The second point of the specification describes the structure of the network.

**A set of directed links or arrows connects pairs of nodes. If there is an arrow from node X to node Y, X is said to be a parent of Y. The graph has no directed cycles (and hence is a directed acyclic graph, or DAG).**

These links are stored as a set and can be obtained by the `getParents()` method of the `Node` interface.
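
Because the graph has no directed cycles, its nodes can always be listed so that every parent comes before its children. The sketch below shows one common way to do this (Kahn's algorithm) using only parent links. It is an illustration in plain Java and is not taken from the repository; `TopologicalOrderSketch` and `topologicalOrder` are hypothetical names, and the map is assumed to contain an entry for every node, with no duplicate parents.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TopologicalOrderSketch {

    /**
     * Returns the nodes so that every parent precedes its children.
     * parentsOf maps each node name to the list of its parents (empty for roots).
     * Throws IllegalArgumentException if the links contain a directed cycle.
     */
    public static List<String> topologicalOrder(Map<String, List<String>> parentsOf) {
        // Number of parents of each node that have not been emitted yet.
        Map<String, Integer> remainingParents = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : parentsOf.entrySet()) {
            remainingParents.put(e.getKey(), e.getValue().size());
        }

        // Roots have no parents and can be emitted immediately.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : remainingParents.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.remove();
            order.add(node);
            // Every child of the emitted node now waits for one parent fewer.
            for (Map.Entry<String, List<String>> e : parentsOf.entrySet()) {
                if (e.getValue().contains(node)) {
                    int left = remainingParents.merge(e.getKey(), -1, Integer::sum);
                    if (left == 0) {
                        ready.add(e.getKey());
                    }
                }
            }
        }

        if (order.size() != parentsOf.size()) {
            throw new IllegalArgumentException("The links contain a directed cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> parentsOf = new LinkedHashMap<>();
        parentsOf.put("Cavity", new ArrayList<>());
        parentsOf.put("Toothache", List.of("Cavity"));
        parentsOf.put("Catch", List.of("Cavity"));
        System.out.println(topologicalOrder(parentsOf)); // [Cavity, Toothache, Catch]
    }
}
```

If the links contained a cycle, no node on the cycle would ever run out of pending parents, so the method would emit fewer nodes than it was given and throw.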

The third specification states the data contained in each node.

**Each node $X_i$ has a conditional probability distribution $P(X_i \mid Parents(X_i ))$ that quantifies the effect of the parents on the node.**

This information is stored in a node as a `ConditionalProbabilityDistribution`. Once we have defined the nodes of our network, we can use the `BayesNet` class from the repository to construct a Bayesian network. Let's work through the Toothache, Cavity and Catch example.

In [12]:
package aima.notebooks.probabilisticreasoning;

import aima.core.probability.bayes.*;
import aima.core.probability.*;
import aima.core.probability.bayes.impl.*;
import aima.core.probability.util.*;
import aima.core.probability.domain.*;

// First let us define the Random Variables which make up our Bayes Network
RandVar cavityRv = new RandVar("Cavity", new BooleanDomain());
RandVar toothacheRv = new RandVar("Toothache", new BooleanDomain());
RandVar catchRv = new RandVar("Catch", new BooleanDomain());

// Now we will define the nodes that make up the network and represent the above network
// the order of the doubles in CPT is as follows
// If A,B and C are the three random variables then first the possibilities of C will be exhausted, then B and then A.
// For example if A, B and C are Boolean random variables then the doubles will be mentioned in the given order
//    A    B    C
//    1    1    1
//    1    1    0
//    1    0    1
//    1    0    0
//    0    1    1
//    0    1    0
//    0    0    1
//    0    0    0
FullCPTNode cavity = new FullCPTNode(cavityRv, new double[] {
                                    // True			
                                    0.2,
                                    // False
                                    0.8 });
FullCPTNode toothache = new FullCPTNode(toothacheRv,
				new double[] {
						// C=true, T=true
						0.6,
						// C=true, T=false
						0.4,
						// C=false, T=true
						0.1,
						// C=false, T=false
						0.9

				}, cavity);

FiniteNode catchNode = new FullCPTNode(catchRv, new double[] {
				// C=true, Catch=true
				0.9,
				// C=true, Catch=false
				0.1,
				// C=false, Catch=true
				0.2,
				// C=false, Catch=false
				0.8 }, cavity);

// Now let us build the Bayesian network.
// We need to specify only the root nodes of the network.
BayesNet cavityBayesNet = new BayesNet(cavity);

// Now let's extract whatever we can from the BayesNet
System.out.println("Random Variables = "+ cavityBayesNet.getVariablesInTopologicalOrder().toString());
System.out.println("The cavity Node: "+ cavityBayesNet.getNode(cavityRv).toString());
System.out.println("The toothache Node: "+ cavityBayesNet.getNode(toothacheRv).toString());
System.out.println("The catch Node: "+ cavityBayesNet.getNode(catchRv).toString());
return cavityBayesNet;

Random Variables = [Cavity, Toothache, Catch]
The cavity Node: Cavity
The toothache Node: Toothache
The catch Node: Catch


aima.core.probability.bayes.impl.BayesNet@5a94841e
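
The ordering convention described in the comments of the cell above (the last variable changes fastest, and `true` is listed before `false`) determines where each probability sits in the flat `double[]`. The sketch below computes that position for Boolean variables. It is a plain-Java illustration of the convention as described in those comments, not code from the repository; `CptIndexSketch` and `flatIndex` are hypothetical names.

```java
public class CptIndexSketch {

    /**
     * Position of an assignment in the flat CPT array for Boolean variables.
     * Values are given parents first and the node's own variable last;
     * true maps to digit 0 and false to digit 1, read as a binary number.
     */
    public static int flatIndex(boolean... assignment) {
        int index = 0;
        for (boolean value : assignment) {
            index = index * 2 + (value ? 0 : 1);
        }
        return index;
    }

    public static void main(String[] args) {
        // Same values as the Toothache CPT above: (Cavity, Toothache) pairs.
        double[] toothacheCpt = {0.6, 0.4, 0.1, 0.9};

        // P(Toothache = true | Cavity = false) sits at index 2.
        int index = flatIndex(false, true);
        System.out.println(index + " -> " + toothacheCpt[index]); // 2 -> 0.1
    }
}
```

With three Boolean variables A, B and C, `flatIndex(true, true, true)` is 0 and `flatIndex(false, false, false)` is 7, matching the first and last rows of the table in the comments.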

The above block shows how to construct a Bayesian network. However, a Bayesian network by itself is of little use; we need inference algorithms that can extract information from it. Before introducing the various inference algorithms, note that a **Bayesian network** is capable of describing a **full joint distribution** by itself. To see this, recall that a generic entry in the joint distribution is the probability of a conjunction of particular assignments to each variable, such as $P(X_1 = x_1 \land \ldots \land X_n = x_n)$. We use the notation $P(x_1, \ldots, x_n)$ as an abbreviation for this. We can rewrite such an entry in terms of conditional probabilities using the product rule:

$$P(x_1, \ldots, x_n) = P(x_n \mid x_{n-1}, \ldots, x_1)\,P(x_{n-1}, \ldots, x_1)$$

Then we repeat the process, reducing each conjunctive probability to a conditional probability
and a smaller conjunction. We end up with one big product:

$$P(x_1, \ldots, x_n) = P(x_n \mid x_{n-1}, \ldots, x_1)\,P(x_{n-1} \mid x_{n-2}, \ldots, x_1) \cdots P(x_2 \mid x_1)\,P(x_1)$$
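
As a worked instance of this product, take the ordering *Cavity*, *Toothache*, *Catch*. The chain rule gives $P(catch, toothache, cavity) = P(catch \mid toothache, cavity)\,P(toothache \mid cavity)\,P(cavity)$, and because *Catch* is conditionally independent of *Toothache* given *Cavity*, the first factor reduces to $P(catch \mid cavity)$. The sketch below evaluates this product with the CPT values used earlier in this notebook. It is plain Java rather than a call into the repository, and `ChainRuleSketch`, `joint` and `sumOfAllEntries` are hypothetical names.

```java
public class ChainRuleSketch {

    // CPT values copied from the network built earlier in this notebook.
    static final double P_CAVITY = 0.2;
    static final double P_TOOTHACHE_GIVEN_CAVITY = 0.6;
    static final double P_TOOTHACHE_GIVEN_NO_CAVITY = 0.1;
    static final double P_CATCH_GIVEN_CAVITY = 0.9;
    static final double P_CATCH_GIVEN_NO_CAVITY = 0.2;

    /** Probability that a Boolean variable takes the given value, given P(variable = true). */
    static double prob(double pTrue, boolean value) {
        return value ? pTrue : 1.0 - pTrue;
    }

    /** One entry of the joint: P(catch | cavity) * P(toothache | cavity) * P(cavity). */
    public static double joint(boolean cavity, boolean toothache, boolean catchValue) {
        double pCavity = prob(P_CAVITY, cavity);
        double pToothache = prob(
                cavity ? P_TOOTHACHE_GIVEN_CAVITY : P_TOOTHACHE_GIVEN_NO_CAVITY, toothache);
        double pCatch = prob(
                cavity ? P_CATCH_GIVEN_CAVITY : P_CATCH_GIVEN_NO_CAVITY, catchValue);
        return pCatch * pToothache * pCavity;
    }

    /** Sums all eight entries; a valid joint distribution must sum to 1. */
    public static double sumOfAllEntries() {
        boolean[] values = {true, false};
        double total = 0.0;
        for (boolean cavity : values) {
            for (boolean toothache : values) {
                for (boolean catchValue : values) {
                    total += joint(cavity, toothache, catchValue);
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("P(cavity, toothache, catch) = " + joint(true, true, true));
        System.out.println("Sum over all eight entries  = " + sumOfAllEntries());
    }
}
```

The first entry is $0.9 \times 0.6 \times 0.2 = 0.108$, and the eight entries sum to 1 up to floating-point rounding, so the three small tables do determine a complete joint distribution.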