# Machine Learning & Energy WS 20/21
## Exercise 9 - Probabilistic Graphical Models: Computing probabilities

In [None]:
%load_ext autoreload
%autoreload 2

In this last exercise we work with Probabilistic Graphical Models (PGMs). More specifically, all the examples we use are Bayesian Networks, which are PGMs that represent conditional dependencies via a Directed Acyclic Graph (DAG). First, we marginalize and condition random variables in a simple example. In the second notebook, we use a Python library that makes inference in PGMs much easier.

## 1. Conditional probability

The graphical model below describes the fuel system of a car.
It consists of a battery $B$ that is either CHARGED or FLAT, a fuel tank $F$ that is either FULL or EMPTY, and an electric fuel gauge $G$, which indicates FULL or EMPTY.

<div>
    <img src="images/GM1.jpg" width=450>
</div>

The distribution $p(G|B,F)$ is given as follows:

| $B$ | $F$  |  p($G=$ FULL) &nbsp | p($G=$ EMPTY)|
| :- | :- | :- | :- |
| FLAT | EMPTY &nbsp| .1 | .9 |
| FLAT | FULL | .2 | .8 |
| CHARGED &nbsp| EMPTY | .2 | .8 |
| CHARGED | FULL | .8 | .2 |


a) First, write down the formula for the joint probability $p(B=b,F=f,G=g)$ where $(b,f,g)$ are evaluation points. Then, complete the function ``joint_dist_eval()`` in the next cell. It should return the value of $p(B=b,F=f,G=g)$.

In [1]:
# define probability distributions
p_b = {'FLAT':.1, 'CHARGED':.9}
p_f = {'EMPTY':.1, 'FULL':.9}
p_g = {('FLAT','EMPTY','FULL'):.1, ('FLAT','EMPTY','EMPTY'):.9, 
       ('FLAT','FULL','FULL'):.2, ('FLAT','FULL','EMPTY'):.8,
       ('CHARGED','EMPTY','FULL'):.2, ('CHARGED','EMPTY','EMPTY'):.8,
       ('CHARGED','FULL','FULL'):.8, ('CHARGED','FULL','EMPTY'):.2,}

# function definition to evaluate joint distribution
def joint_dist_eval(p_b, p_f, p_g, b, f, g):
    return p_b[b]*p_f[f]*p_g[b,f,g]  # TODO: change code here

print("Joint probability p(B=CHARGED, F=FULL, G=EMPTY): ")
print("Value: {}".format(joint_dist_eval(p_b, p_f, p_g, 'CHARGED', 'FULL', 'EMPTY')))
print("Expected value: 0.162")

Joint probability p(B=CHARGED, F=FULL, G=EMPTY): 
Value: 0.16200000000000003
Expected value: 0.162
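
For reference, the printed value can be checked by hand. The factorization below is the one the code implies, assuming (as the arguments `p_b`, `p_f` and `p_g` suggest) that $B$ and $F$ have no parents and $G$ depends on both:

$$p(B=b, F=f, G=g) = p(B=b)\,p(F=f)\,p(G=g \mid B=b, F=f)$$

$$p(B=\text{CHARGED}, F=\text{FULL}, G=\text{EMPTY}) = 0.9 \cdot 0.9 \cdot 0.2 = 0.162$$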


b) Now, write down the formula of the marginal distribution $p(G=g)$ and complete the function ``marginal_dist_eval()`` in the next cell.

PS.: The method ``keys()`` returns a view of all keys of a dictionary, which you can iterate over.

In [6]:
# function definition to evaluate marginal distribution of G
def marginal_dist_eval(p_b, p_f, p_g, g):
    # sum the joint distribution over all values of B and F
    sumg = 0
    for b in p_b.keys():
        for f in p_f.keys():
            sumg += p_b[b]*p_f[f]*p_g[b,f,g]
    return sumg  # TODO: change code here

print("Marginal probability p(G=EMPTY): ")
print("Value: {}".format(marginal_dist_eval(p_b, p_f, p_g, 'EMPTY')))
print("Expected value: 0.315")

Marginal probability p(G=EMPTY): 
Value: 0.31500000000000006
Expected value: 0.315
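
One way to sanity-check a marginal is to verify that it sums to one over all values of $G$. The snippet below is a minimal, self-contained sketch of that check. The helper `marginal_g` and the tuple `G_VALUES` are illustrative names that are not part of the exercise template, and the distributions are copied from the cell in a).

```python
from itertools import product

# Distributions copied from the exercise
p_b = {'FLAT': .1, 'CHARGED': .9}
p_f = {'EMPTY': .1, 'FULL': .9}
p_g = {('FLAT', 'EMPTY', 'FULL'): .1, ('FLAT', 'EMPTY', 'EMPTY'): .9,
       ('FLAT', 'FULL', 'FULL'): .2, ('FLAT', 'FULL', 'EMPTY'): .8,
       ('CHARGED', 'EMPTY', 'FULL'): .2, ('CHARGED', 'EMPTY', 'EMPTY'): .8,
       ('CHARGED', 'FULL', 'FULL'): .8, ('CHARGED', 'FULL', 'EMPTY'): .2}

# Assumption: the gauge only takes the two readings named in the text
G_VALUES = ('FULL', 'EMPTY')

def marginal_g(g):
    """Sum the joint p(b) p(f) p(g | b, f) over all values of B and F."""
    return sum(p_b[b] * p_f[f] * p_g[b, f, g] for b, f in product(p_b, p_f))

marginals = {g: marginal_g(g) for g in G_VALUES}
total = sum(marginals.values())

print(marginals)
print("Sum over G:", total)
```

Up to floating-point rounding, the two marginals should add up to one. If they do not, either the table for $p(G|B,F)$ or the summation has an error.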


c) If we now observe the fuel gauge and see that it reads EMPTY, what is the probability that the fuel tank is EMPTY? Write down the formula for the conditional probability $p(F=$ EMPTY$| G=$ EMPTY$)$ and calculate it using the function ``generalized_dist_eval()``. This function returns the probability that the variables listed in ``evidences`` take the given values, marginalizing over all other variables. For example, if ``evidences={'b':'FLAT', 'f':'EMPTY'}`` the function's output is $p(B=$ FLAT$, F=$ EMPTY$)$. If ``evidences={'g':'FULL'}`` the function's output is $p(G=$ FULL$)$.

PS.: remember that $P(X|Y) = \frac{P(X,Y)}{P(Y)}$

In [17]:
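def generalized_dist_eval(p_b, p_f, p_g, evidences=None):
    evidences = evidences or {}
    # possible values of each variable; the keys of p_g are (b, f, g) tuples,
    # so the values of G are the last entry of each key
    variables = {'b':list(p_b.keys()), 'f':list(p_f.keys()), 'g':sorted({key[2] for key in p_g.keys()})}
    # fix every observed variable to its observed value
    for v in variables.keys():
        if v in evidences:
            variables[v] = [evidences[v]]
    # sum the joint distribution over all remaining (unobserved) values
    return sum([sum([sum([p_b[b]*p_f[f]*p_g[(b,f,g)] for b in variables['b']]) for f in variables['f']]) for g in variables['g']])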

# value of conditional probability: p(F=EMPTY, G=EMPTY) / p(G=EMPTY)
value = (generalized_dist_eval(p_b, p_f, p_g, evidences={'f':'EMPTY', 'g':'EMPTY'})
         / generalized_dist_eval(p_b, p_f, p_g, evidences={'g':'EMPTY'}))  # TODO: change code here

print("Conditional probability p(F=EMPTY| G=EMPTY): ")
print("Value: {}".format(value))
print("Expected value: 0.2571428571428572")


Conditional probability p(F=EMPTY| G=EMPTY): 
Value: 0.2571428571428572
Expected value: 0.2571428571428572
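
The same number can be obtained by hand, which is a useful check on the code. Using the definition of conditional probability and the marginal $p(G=\text{EMPTY}) = 0.315$ from b):

$$p(F=\text{EMPTY} \mid G=\text{EMPTY}) = \frac{\sum_{b} p(B=b)\,p(F=\text{EMPTY})\,p(G=\text{EMPTY} \mid B=b, F=\text{EMPTY})}{p(G=\text{EMPTY})}$$

$$= \frac{0.1 \cdot 0.1 \cdot 0.9 + 0.9 \cdot 0.1 \cdot 0.8}{0.315} = \frac{0.081}{0.315} \approx 0.257$$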


e) If, in addition to knowing that the fuel gauge reads EMPTY, we also know that the battery is FLAT, what is the probability that the fuel tank is EMPTY? Calculate $p(F=$ EMPTY$| B=$ FLAT$, G=$ EMPTY$)$.

In [23]:
# value of conditional probability: p(F=EMPTY, B=FLAT, G=EMPTY) / p(B=FLAT, G=EMPTY)
value = (generalized_dist_eval(p_b, p_f, p_g, evidences={'f':'EMPTY', 'g':'EMPTY', 'b':'FLAT'})
         / generalized_dist_eval(p_b, p_f, p_g, evidences={'g':'EMPTY', 'b':'FLAT'}))  # TODO: change code here

print("Conditional probability p(F=EMPTY| B=FLAT, G=EMPTY): ")
print("Value: {}".format(value))
print("Expected value: 0.11111111111111111")

Conditional probability p(F=EMPTY| B=FLAT, G=EMPTY): 
Value: 0.11111111111111112
Expected value: 0.11111111111111111
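
To see how each observation changes the belief about the fuel tank, the three probabilities of $F=$ EMPTY (prior, given the gauge, given the gauge and the battery) can be put side by side. The sketch below is self-contained and uses hypothetical helpers `joint`, `prob` and `conditional` that are not part of the exercise template. It also assumes that $G$ only takes the values FULL and EMPTY.

```python
from itertools import product

# Distributions copied from the exercise
p_b = {'FLAT': .1, 'CHARGED': .9}
p_f = {'EMPTY': .1, 'FULL': .9}
p_g = {('FLAT', 'EMPTY', 'FULL'): .1, ('FLAT', 'EMPTY', 'EMPTY'): .9,
       ('FLAT', 'FULL', 'FULL'): .2, ('FLAT', 'FULL', 'EMPTY'): .8,
       ('CHARGED', 'EMPTY', 'FULL'): .2, ('CHARGED', 'EMPTY', 'EMPTY'): .8,
       ('CHARGED', 'FULL', 'FULL'): .8, ('CHARGED', 'FULL', 'EMPTY'): .2}

def joint(b, f, g):
    """Joint probability p(B=b, F=f, G=g)."""
    return p_b[b] * p_f[f] * p_g[b, f, g]

def prob(**fixed):
    """Probability that the variables named in `fixed` take the given values."""
    total = 0.0
    for b, f, g in product(p_b, p_f, ('FULL', 'EMPTY')):
        assignment = {'b': b, 'f': f, 'g': g}
        # keep only the joint terms that agree with the fixed values
        if all(assignment[name] == value for name, value in fixed.items()):
            total += joint(b, f, g)
    return total

def conditional(query, given):
    """p(query | given) = p(query, given) / p(given)."""
    return prob(**query, **given) / prob(**given)

prior = p_f['EMPTY']
post_g = conditional({'f': 'EMPTY'}, {'g': 'EMPTY'})
post_gb = conditional({'f': 'EMPTY'}, {'g': 'EMPTY', 'b': 'FLAT'})

print("p(F=EMPTY)                  =", prior)
print("p(F=EMPTY | G=EMPTY)         =", post_g)
print("p(F=EMPTY | G=EMPTY, B=FLAT) =", post_gb)
```

Observing an EMPTY gauge raises the probability of an empty tank from 0.1 to about 0.257. Additionally learning that the battery is FLAT lowers it again to about 0.111, because the flat battery already accounts for the gauge reading.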


The example in this notebook is quite simple, but the calculations practiced here will help you understand how inference in PGMs works in general.