# Machine Learning & Energy WS 20/21
## Exercise 9 - Probabilistic Graphical Models: Inference with PGMs

**NOTE: Please additionally install pgmpy in our virtual environment:**
- Open the anaconda prompt
- type ``conda activate MLE`` and press enter
- type ``pip install pgmpy`` and press enter

In [1]:
%load_ext autoreload
%autoreload 2

We are now going to perform inference in a PGM that models a section of a distribution grid with prosumers. Instead of writing custom functions, we will use the library ``pgmpy`` to model the problem. We encourage you to read its documentation: http://pgmpy.org/

## 1. A Distribution Network Example

Consider a distribution grid section consisting of two households $HH\{1,2\}$ connected to a transformer $TRA$. The household demands $D\{1,2\}$ each take the states LOW (probability .8) and HIGH (probability .2). $HH1$ has a $PV$ plant whose production is either LOW (probability .3) or HIGH (probability .7). Depending on the combination of $D1$ and $PV$, the state of $HH1$ is deterministically given as CONS ($PV=$ LOW, $D1=$ HIGH), NONE ($PV=D1$), or PROD ($PV=$ HIGH, $D1=$ LOW).

$HH2$ owns a small combined heat and power plant $CHP$. Depending on the external temperature $T$, which can be either LOW (probability .3) or HIGH (probability .7), the production of $CHP$ can be either HIGH or LOW (see table below). Analogously to $HH1$, the power state of $HH2$ is determined by $D2$ and $CHP$.

The transformer $TRA$ can be under either NORMAL or CRITICAL operation. CRITICAL operation occurs with probability .8, and only if $HH1=HH2=$ PROD or $HH1=HH2=$ CONS; in all other cases the operation is NORMAL.

<br>

| | $T=$ HIGH | $T=$ LOW |
| :- | :-: | :-: |
| p($CHP=$ HIGH) | .1 | .8 |
| p($CHP=$ LOW) | .9 | .2 |
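
As a small worked example of how the table is used, the marginal distribution of $CHP$ follows from the law of total probability, $p(CHP=c) = \sum_t p(CHP=c \mid T=t)\,p(T=t)$. The sketch below is plain Python and does not use ``pgmpy``; the dictionary names are chosen only for this illustration.

```python
# Prior over the external temperature T, as stated in the text.
p_t = {'HIGH': 0.7, 'LOW': 0.3}

# Conditional distribution p(CHP | T) from the table; the outer key is the state of T.
p_chp_given_t = {
    'HIGH': {'HIGH': 0.1, 'LOW': 0.9},
    'LOW': {'HIGH': 0.8, 'LOW': 0.2},
}

# Law of total probability: sum out T.
p_chp = {c: sum(p_chp_given_t[t][c] * p_t[t] for t in p_t) for c in ('HIGH', 'LOW')}

print({c: round(p, 2) for c, p in p_chp.items()})
```

Under these numbers the $CHP$ plant produces HIGH with probability .31 and LOW with probability .69.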

<br>
<br>

<div>
    <img src="images/GM2.jpg" width=600>
</div>

a) Build a Bayesian network that models the problem above using the class ``BayesianModel`` from ``pgmpy.models``. Its constructor takes a list of tuples, each describing one (directed) edge of the PGM as (parent, child). See the example below.

In [8]:
from pgmpy.models import BayesianModel

# PGM's directed edges
edges = [('D1', 'HH1'), ('PV', 'HH1'), ('HH1', 'TRA'), ('T', 'CHP'), ('CHP', 'HH2'), ('D2', 'HH2'), ('HH2', 'TRA')]

# define Bayesian network model
model = BayesianModel(edges)

b) Define the probability distributions of the variables in the model using the class ``TabularCPD`` from ``pgmpy.factors.discrete``. Its inputs are (i) the variable the distribution is defined for, (ii) the number of values (states) the variable can take, (iii) the probability of each state, and (iv) the name of each state. Conditional distributions additionally take the parent variables (``evidence``) and their numbers of states (``evidence_card``). We have already coded the marginal distribution of $D1$ and the conditional probability of $CHP$, i.e. $p$($CHP=$ c | $T=$ t). Use them as examples.
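When a variable has several parents, the columns of ``values`` must follow a fixed order of parent-state combinations. The plain-Python sketch below assumes that the columns follow the order of ``evidence`` with the last evidence variable changing fastest, which is consistent with the expected $TRA$ table in this exercise. The helper ``hh_state`` is hypothetical and only encodes the deterministic rule from the problem description.

```python
from itertools import product

evidence = ['D1', 'PV']
state_names = {'D1': ['HIGH', 'LOW'], 'PV': ['HIGH', 'LOW']}

# itertools.product varies its last factor fastest, matching the assumed column order.
columns = list(product(*(state_names[v] for v in evidence)))


def hh_state(demand, generation):
    """Hypothetical helper: deterministic household state from demand and generation."""
    if demand == generation:
        return 'NONE'
    return 'CONS' if demand == 'HIGH' else 'PROD'


# One row per household state, one column per parent-state combination (one-hot columns).
hh_states = ['PROD', 'NONE', 'CONS']
values = [[1.0 if hh_state(d, g) == s else 0.0 for d, g in columns] for s in hh_states]

for idx, combo in enumerate(columns):
    print(idx, dict(zip(evidence, combo)))
for state, row in zip(hh_states, values):
    print(state, row)
```

Each column sums to one because the household state is a deterministic function of its parents.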

In [23]:
from pgmpy.factors.discrete import TabularCPD

# define probability for independent variables
p_d1 = TabularCPD(variable='D1', variable_card=2, values=[[0.2], [0.8]], state_names={'D1': ['HIGH', 'LOW']})
p_d2 = TabularCPD(variable='D2', variable_card=2, values=[[0.2], [0.8]], state_names={'D2': ['HIGH', 'LOW']})
p_t = TabularCPD(variable='T', variable_card=2, values=[[0.7], [0.3]], state_names={'T': ['HIGH', 'LOW']})
p_pv = TabularCPD(variable='PV', variable_card=2, values=[[0.7], [0.3]], state_names={'PV': ['HIGH', 'LOW']})

# define conditional probability for remaining variables
cp_chp = TabularCPD(variable='CHP', variable_card=2, 
                   values=[[0.1, 0.8],
                           [0.9, 0.2]],
                  evidence=['T'],
                  evidence_card=[2],
                  state_names={'CHP': ['HIGH', 'LOW'],
                               'T': ['HIGH', 'LOW']})

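# columns follow the evidence order (D1, PV), with the last variable changing fastest:
# (HIGH, HIGH), (HIGH, LOW), (LOW, HIGH), (LOW, LOW)
cp_hh1 = TabularCPD(variable='HH1', variable_card=3,
                    values=[[0, 0, 1, 0],
                            [1, 0, 0, 1],
                            [0, 1, 0, 0]],
                    evidence=['D1', 'PV'],
                    evidence_card=[2, 2],
                    state_names={'HH1': ['PROD', 'NONE', 'CONS'],
                                 'D1': ['HIGH', 'LOW'],
                                 'PV': ['HIGH', 'LOW']})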

cp_hh2 = None  # TODO: change code here

cp_tra = None  # TODO: change code here

# Associating the CPDs with the network
model.add_cpds(p_d1, p_d2, p_t, p_pv, cp_chp, cp_hh1, cp_hh2, cp_tra)

print("Conditional probability p(TRA| HH1=h1, HH2=h2): ")
print("Value:")
print(model.get_cpds('TRA'))
print("Expected value:")
print("+---------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+\n\
| HH1           | HH1(PROD) | HH1(PROD) | HH1(PROD) | HH1(NONE) | HH1(NONE) | HH1(NONE) | HH1(CONS) | HH1(CONS) | HH1(CONS) |\n\
+---------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+\n\
| HH2           | HH2(PROD) | HH2(NONE) | HH2(CONS) | HH2(PROD) | HH2(NONE) | HH2(CONS) | HH2(PROD) | HH2(NONE) | HH2(CONS) |\n\
+---------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+\n\
| TRA(NORMAL)   | 0.2       | 1.0       | 1.0       | 1.0       | 1.0       | 1.0       | 1.0       | 1.0       | 0.2       |\n\
+---------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+\n\
| TRA(CRITICAL) | 0.8       | 0.0       | 0.0       | 0.0       | 0.0       | 0.0       | 0.0       | 0.0       | 0.8       |\n\
+---------------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+")


c) With ``pgmpy``, inference in PGMs is straightforward: select an inference algorithm and call its ``query()`` method. Read the documentation to understand how it works, then calculate the marginal probability distribution $p$($TRA$) in the cell below.
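To see what such a query computes, the marginal $p$($TRA$) can also be obtained by brute-force enumeration of the joint distribution. The sketch below is plain Python without ``pgmpy``; the helpers ``hh_state`` and ``p_critical`` are hypothetical and encode the rules from the problem description. Enumeration is only feasible because the network is tiny, whereas variable elimination avoids building the full joint.

```python
from itertools import product

STATES = ('HIGH', 'LOW')
p_d = {'HIGH': 0.2, 'LOW': 0.8}   # demand, identical for D1 and D2
p_pv = {'HIGH': 0.7, 'LOW': 0.3}
p_t = {'HIGH': 0.7, 'LOW': 0.3}
p_chp_given_t = {'HIGH': {'HIGH': 0.1, 'LOW': 0.9},   # outer key: state of T
                 'LOW': {'HIGH': 0.8, 'LOW': 0.2}}


def hh_state(demand, generation):
    """Hypothetical helper: deterministic household state."""
    if demand == generation:
        return 'NONE'
    return 'CONS' if demand == 'HIGH' else 'PROD'


def p_critical(hh1, hh2):
    """Hypothetical helper: p(TRA = CRITICAL | HH1, HH2)."""
    return 0.8 if hh1 == hh2 and hh1 != 'NONE' else 0.0


# Sum the joint probability over all states of the root and intermediate variables.
p_tra = {'NORMAL': 0.0, 'CRITICAL': 0.0}
for d1, pv, t, chp, d2 in product(STATES, repeat=5):
    weight = p_d[d1] * p_pv[pv] * p_t[t] * p_chp_given_t[t][chp] * p_d[d2]
    crit = p_critical(hh_state(d1, pv), hh_state(d2, chp))
    p_tra['CRITICAL'] += weight * crit
    p_tra['NORMAL'] += weight * (1.0 - crit)

print({state: round(p, 4) for state, p in p_tra.items()})
```

The rounded result agrees with the expected value of the exercise (0.8823 for NORMAL and 0.1177 for CRITICAL).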

In [None]:
from pgmpy.inference import VariableElimination

# initialize "inference" object
infer = None  # TODO: change code here

# run inference
tra_dist = None  # TODO: change code here

print("Marginal probability p(TRA): ")
print("Value:")
print(tra_dist)
print("Expected value:")
print("+---------------+------------+\n\
| TRA           |   phi(TRA) |\n\
+===============+============+\n\
| TRA(NORMAL)   |     0.8823 |\n\
+---------------+------------+\n\
| TRA(CRITICAL) |     0.1177 |\n\
+---------------+------------+")

d) You can also calculate the probability distribution of a variable given evidence by passing the optional argument ``evidence`` to ``query()``. Now calculate $p$($HH1$|$PV=$ HIGH).

In [None]:
# make inference
hh1_dist = None  # TODO: change code here

print("Conditional probability p(HH1|PV=HIGH): ")
print("Value:")
print(hh1_dist)
print("Expected value:")
print("+-----------+------------+\n\
| HH1       |   phi(HH1) |\n\
+===========+============+\n\
| HH1(PROD) |     0.8000 |\n\
+-----------+------------+\n\
| HH1(NONE) |     0.2000 |\n\
+-----------+------------+\n\
| HH1(CONS) |     0.0000 |\n\
+-----------+------------+")

e) More efficient algorithms are also available for inference in PGMs. For example, the sum-product algorithm is implemented in the class ``BeliefPropagation``. Instantiate ``BeliefPropagation`` with the model and call the method ``get_cliques()`` to print the cliques of the moralized Bayesian network. Derive the cliques by hand as well to check that the result is correct.
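The hand derivation can be cross-checked with a few lines of plain Python. The sketch below moralizes the directed graph (drop edge directions and connect all parents that share a child) and then finds the maximal cliques by brute force, which is acceptable for eight nodes. It assumes that the moral graph needs no additional fill-in edges, so that its maximal cliques coincide with the cliques reported by ``get_cliques()``.

```python
from itertools import combinations

edges = [('D1', 'HH1'), ('PV', 'HH1'), ('HH1', 'TRA'), ('T', 'CHP'),
         ('CHP', 'HH2'), ('D2', 'HH2'), ('HH2', 'TRA')]

# Collect the parents of every node.
parents = {}
for parent, child in edges:
    parents.setdefault(child, set()).add(parent)

# Moralize: undirected copies of all edges plus an edge between each pair of co-parents.
moral = {frozenset(edge) for edge in edges}
for pa in parents.values():
    moral |= {frozenset(pair) for pair in combinations(sorted(pa), 2)}

nodes = sorted({node for edge in edges for node in edge})


def is_clique(group):
    """True if every pair of nodes in the group is connected in the moral graph."""
    return all(frozenset(pair) in moral for pair in combinations(group, 2))


# Brute force over all subsets with at least two nodes, then keep only the maximal cliques.
cliques = [set(group) for size in range(2, len(nodes) + 1)
           for group in combinations(nodes, size) if is_clique(group)]
maximal = [c for c in cliques if not any(c < other for other in cliques)]

print(sorted(sorted(c) for c in maximal))
```

The order of the cliques and of the nodes within each clique may differ from the ``pgmpy`` output; only the node sets matter.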

In [None]:
from pgmpy.inference import BeliefPropagation

# initialize "inference" object
bp = None  # TODO: change code here

# return cliques from moralized model
cliques = None  # TODO: change code here

print("Cliques of moralized model: ")
print("Value: {}".format(cliques))
print("Expected value: [('CHP', 'T'), ('CHP', 'HH2', 'D2'),  ('HH1', 'TRA', 'HH2'), ('HH1', 'D1', 'PV')]")

f) Finally, calculate $p$($PV$|$HH1=$ NONE) using belief propagation.
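This query reasons from an effect back to one of its causes, so the result can be verified with Bayes' rule. The plain-Python sketch below uses the fact that $HH1=$ NONE exactly when $D1$ and $PV$ are in the same state, so $p(HH1=\text{NONE} \mid PV=v) = p(D1=v)$; the variable names are chosen only for this illustration.

```python
p_d1 = {'HIGH': 0.2, 'LOW': 0.8}
p_pv = {'HIGH': 0.7, 'LOW': 0.3}

# Joint p(PV = v, HH1 = NONE) = p(PV = v) * p(D1 = v), since NONE requires D1 == PV.
joint = {v: p_pv[v] * p_d1[v] for v in p_pv}

# Normalizing constant p(HH1 = NONE).
p_none = sum(joint.values())

# Bayes' rule: p(PV = v | HH1 = NONE) = p(PV = v, HH1 = NONE) / p(HH1 = NONE).
posterior = {v: joint[v] / p_none for v in joint}

print({v: round(p, 4) for v, p in posterior.items()})
```

Observing a balanced household makes a LOW $PV$ state more likely than its prior of .3, because LOW demand is far more common than HIGH demand.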

In [None]:
# make inference
pv_dist = None  # TODO: change code here

print("Conditional probability p(PV|HH1=NONE): ")
print("Value:")
print(pv_dist)
print("Expected value:")
print("+----------+-----------+\n\
| PV       |   phi(PV) |\n\
+==========+===========+\n\
| PV(HIGH) |    0.3684 |\n\
+----------+-----------+\n\
| PV(LOW)  |    0.6316 |\n\
+----------+-----------+")

That's it! We hope these exercises gave you a better understanding of the lecture's content. We also advise you to keep these notebooks close at hand. Who knows when you'll need to perform inference in a probabilistic model whose structure has to be learned from a regression model that requires feature engineering with cross-validation ;)