# Workshop 9: Smoothing

## Overview

This workshop relates to Lecture 09, in which we discussed Dynamic Bayesian Networks
(DBNs). Here you will see how DBNs work using a mixture of Excel and the Python `pomegranate`
package. In particular, you will carry out filtering, prediction and smoothing tasks on
the umbrella network that we studied in the lecture.

## Task 1: Excel for filtering and prediction (redux)

The spreadsheet `umbrella-smoothing.xls`, which can be found on Blackboard, models the umbrella example over the first 3 days.

The first tab/sheet, `Filtering D3`, provides solutions to two of the problems from last week. First, it gives an answer to the problem from Task 4 on predicting rain. The upper cells of the sheet provide a prediction forward until `Day 10` (by which point the predicted probability of rain has converged to `0.5`).

**BTW, how do we know that this means it has converged?**

Second, the tab/sheet also provides the computation of the filtered probability of rain for `Day 3`. This is the answer to the problem from Task 5. (Note that the tab/sheet `Filtering D2` provides predictions from `Day 3` onwards — it filters to `Day 2` and then predicts — so provides another prediction forward to `Day 10`.)
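
The prediction step in the sheet can also be reproduced in a few lines of plain Python. The following is a minimal sketch, not a transcription of the spreadsheet: the function name `predict_forward` and the starting value `0.883` are assumptions chosen for illustration, while the `0.7` persistence probability is the transition model used in this workshop.

```python
def predict_forward(p_rain, days, p_persist=0.7):
    """Roll P(rain) forward `days` steps with no new evidence.

    p_persist is P(rain tomorrow | rain today) = P(dry tomorrow | dry today),
    the transition model used throughout this workshop.
    """
    history = [p_rain]
    for _ in range(days):
        # Sum over today's two states: stay rainy, or switch from dry to rainy.
        p_rain = p_persist * p_rain + (1 - p_persist) * (1 - p_rain)
        history.append(p_rain)
    return history


# Hypothetical starting point: a filtered P(rain) of 0.883 on Day 2.
for day, p in enumerate(predict_forward(0.883, days=8), start=2):
    print(f"Day {day}: P(rain) = {p:.4f}")
```

Comparing successive values, and feeding `0.5` itself into the function, is one way to explore how the prediction settles.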


## Task 2: Excel for smoothing (Day 1)

The tab/sheet `Smoothing D1` gives the smoothing calculation for `Day 1`.

The forward message is computed just as for filtering (it is the same message after all). The backward message (at the bottom of the sheet) is computed exactly as on Slide 26 and Slide 28 (the layout is similar to that on Slide 28 which hopefully makes it easy to see the correspondence).

The smoothed probability is then just the product of the forward and backward messages, normalised. This is the calculation on Slide 27 (and 28).
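
As a cross-check on the sheet, the same three quantities (forward message, backward message, normalised product) can be sketched in plain Python. This assumes the umbrella was seen on both `Day 1` and `Day 2`; the variable names are illustrative and do not come from the spreadsheet.

```python
def normalise(values):
    """Scale a list of non-negative numbers so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


# Index 0 = rain, index 1 = no rain.
PRIOR = [0.5, 0.5]                       # P(Rain0)
TRANSITION = [[0.7, 0.3], [0.3, 0.7]]    # TRANSITION[today][tomorrow]
SENSOR_UMBRELLA = [0.9, 0.2]             # P(umbrella seen | rain), P(umbrella seen | no rain)

# Forward message for Day 1: predict from Day 0, then weight by the Day 1 evidence.
predicted = [sum(PRIOR[i] * TRANSITION[i][j] for i in range(2)) for j in range(2)]
forward = normalise([SENSOR_UMBRELLA[j] * predicted[j] for j in range(2)])

# Backward message from Day 2: for each Day 1 state, sum over the Day 2 states
# of P(evidence on Day 2 | Day 2 state) * P(Day 2 state | Day 1 state).
backward = [sum(SENSOR_UMBRELLA[j] * TRANSITION[i][j] for j in range(2)) for i in range(2)]

# Smoothed estimate: elementwise product of the two messages, normalised.
smoothed = normalise([forward[i] * backward[i] for i in range(2)])

print("forward :", [round(v, 3) for v in forward])
print("backward:", [round(v, 3) for v in backward])
print("smoothed:", [round(v, 3) for v in smoothed])
```

Swapping `SENSOR_UMBRELLA` for `[0.1, 0.8]` in either step models a day on which no umbrella was seen.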

Look at what happens when the probabilities of umbrella/not umbrella on `Day 1` and `Day 2` vary.

## Task 3: Excel for smoothing (Day 2)

The tab/sheet `Smoothing D2` gives the smoothing calculation for `Day 2`.

Compared to the calculation for `Day 1`, this involves predicting forward another day, entering evidence, and computing the backward message.
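
The pattern of "predict, enter evidence, pass a message backward" generalises to any number of days. Below is a hedged sketch of that recursion in plain Python; the helper names (`forward_step`, `backward_step`, `smooth`) are made up for this example, and evidence is given as a list of booleans (`True` meaning the umbrella was seen).

```python
def normalise(values):
    """Scale a list of non-negative numbers so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


# Index 0 = rain, index 1 = no rain.
TRANSITION = [[0.7, 0.3], [0.3, 0.7]]              # TRANSITION[today][tomorrow]
SENSOR = {True: [0.9, 0.2], False: [0.1, 0.8]}     # SENSOR[umbrella_seen][state]


def forward_step(f, umbrella_seen):
    """One filtering step: predict a day ahead, then weight by the evidence."""
    predicted = [sum(f[i] * TRANSITION[i][j] for i in range(2)) for j in range(2)]
    return normalise([SENSOR[umbrella_seen][j] * predicted[j] for j in range(2)])


def backward_step(b, umbrella_seen):
    """One backward step: fold the evidence from the following day into b."""
    return [
        sum(SENSOR[umbrella_seen][j] * b[j] * TRANSITION[i][j] for j in range(2))
        for i in range(2)
    ]


def smooth(evidence, prior=(0.5, 0.5)):
    """Return the smoothed P(rain) for Day 1..t given all t observations."""
    forwards = []
    f = list(prior)
    for seen in evidence:
        f = forward_step(f, seen)
        forwards.append(f)

    smoothed = [None] * len(evidence)
    b = [1.0, 1.0]  # no evidence beyond the last day
    for k in range(len(evidence) - 1, -1, -1):
        smoothed[k] = normalise([forwards[k][i] * b[i] for i in range(2)])[0]
        b = backward_step(b, evidence[k])
    return smoothed


print(smooth([True, True]))         # umbrella seen on Day 1 and Day 2
print(smooth([True, True, False]))  # add a third day with no umbrella
```

The estimate for the final day is simply the filtered value, because the backward message there is all ones.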

Again, look at what happens when the probabilities of umbrella/not umbrella on `Day 1` and `Day 2` vary.

## Task 4: Smoothing with Python

As in Workshop 8, we will use a Python package called `pomegranate`, which provides support for
probabilistic reasoning.

In [None]:
!pip install pomegranate==0.15.0

from pomegranate import *

### Model Setup

You can then run the version of the umbrella model in the following cells. `pomegranate` can only solve Bayesian networks (not Dynamic Bayesian Networks), so we have to unroll the whole example to the depth that we want. The following code has the network unrolled to a depth of 2 days. Read through the code, which defines the probability distributions, nodes, and network edges.

In [None]:
# Define the distributions
Rain0 = DiscreteDistribution({'y': 0.5, 'n': 0.5})

# Conditional distributions for rain on subsequent days
Rain1 = ConditionalProbabilityTable([
    ['y', 'y', 0.7],
    ['y', 'n', 0.3],
    ['n', 'y', 0.3],
    ['n', 'n', 0.7]
], [Rain0])

Rain2 = ConditionalProbabilityTable([
    ['y', 'y', 0.7],
    ['y', 'n', 0.3],
    ['n', 'y', 0.3],
    ['n', 'n', 0.7]
], [Rain1])

# Sensor model for umbrella
Umbrella1 = ConditionalProbabilityTable([
    ['y', 'y', 0.9],
    ['y', 'n', 0.1],
    ['n', 'y', 0.2],
    ['n', 'n', 0.8]
], [Rain1])

Umbrella2 = ConditionalProbabilityTable([
    ['y', 'y', 0.9],
    ['y', 'n', 0.1],
    ['n', 'y', 0.2],
    ['n', 'n', 0.8]
], [Rain2])

# Nodes in the network
s1 = Node(Rain0, name='Rain0')
s2 = Node(Rain1, name='Rain1')
s3 = Node(Umbrella1, name='Umbrella1')
s4 = Node(Rain2, name='Rain2')
s5 = Node(Umbrella2, name='Umbrella2')

# Define the network
model = BayesianNetwork('Umbrella Network')
model.add_states(s1, s2, s3, s4, s5)

# Add edges between nodes
model.add_edge(s1, s2)
model.add_edge(s2, s3)
model.add_edge(s2, s4)
model.add_edge(s4, s5)

# Finalize the network
model.bake()
print('Model setup complete')

To do smoothing, we need to add the following code, which extends the model with a node for rain on `Day 3`.

In [None]:
Rain3 = ConditionalProbabilityTable([
    ['y', 'y', 0.7],
    ['y', 'n', 0.3],
    ['n', 'y', 0.3],
    ['n', 'n', 0.7]
], [Rain2])

# Node for rain on Day 3
s6 = Node(Rain3, name='Rain3')
# Add the new state to the network
model.add_states(s6)
# Edge connecting Rain2 to Rain3
model.add_edge(s4, s6)

model.bake()

Note that `model.bake()` is called again here, once the last elements have been entered, so that the finalized network includes `Rain3`.

Now that we have the model entered, we can ask it questions. But first let's tell the model that we saw umbrellas on `Day 1` and `Day 2`.

In [None]:
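# Umbrellas on Day 1 and 2. Entries follow the order in which the states were added:
# Rain0, Rain1, Umbrella1, Rain2, Umbrella2, Rain3
scenario = [[None, None, 'y', None, 'y', None]]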
# Run the model
predict_proba = model.predict_proba(scenario)
# Ask for the probability of rain on Day 1:
print(predict_proba[0][1].items())
# Ask for the probability of rain on Day 2:
print(predict_proba[0][3].items())

This should give exactly the results from the lecture (Slide 28), and from the Excel model.
Note that we didn’t tell `pomegranate` to do smoothing. As we saw last time with `Day 0`, it (in effect) always runs the backwards propagation and gives us smoothed probabilities for all days before the latest piece of evidence.

I said “in effect” because `pomegranate` doesn’t do the computation the way we studied. It just computes the probability of every hidden variable given the evidence. To see this, try looking up the probability of rain on `Day 3`.
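
To make "the probability of every hidden variable given the evidence" concrete, here is a brute-force sketch in plain Python that sums the joint distribution of the unrolled network over all rain sequences. It is an illustration only and makes no claim about how `pomegranate` is implemented internally; the function names are invented for this example.

```python
from itertools import product


def p_transition(prev, nxt):
    """P(rain state tomorrow | rain state today) for states 'y' / 'n'."""
    return 0.7 if prev == nxt else 0.3


def p_umbrella(rain, umbrella):
    """P(umbrella observation | rain state) for values 'y' / 'n'."""
    p_seen = 0.9 if rain == 'y' else 0.2
    return p_seen if umbrella == 'y' else 1 - p_seen


def joint(r0, r1, r2, r3, u1, u2):
    """Joint probability of one full assignment to the unrolled network."""
    return (
        0.5
        * p_transition(r0, r1) * p_umbrella(r1, u1)
        * p_transition(r1, r2) * p_umbrella(r2, u2)
        * p_transition(r2, r3)
    )


def posterior_rain(day, u1='y', u2='y'):
    """P(Rain<day> | Umbrella1=u1, Umbrella2=u2), for day in 0..3."""
    totals = {'y': 0.0, 'n': 0.0}
    for rains in product('yn', repeat=4):
        totals[rains[day]] += joint(*rains, u1, u2)
    evidence_prob = sum(totals.values())
    return {state: p / evidence_prob for state, p in totals.items()}


for day in range(4):
    print(f"Rain{day}:", posterior_rain(day))
```

Days before the last observation come out smoothed, the day of the last observation comes out filtered, and `Day 3` comes out as a prediction, all from the same conditional-probability query.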

## Task 5: More smoothing using Excel 

Now go back to the Excel spreadsheet and calculate the smoothed probability for `Day 3`.

## Task 6: Day 3 using pomegranate

Extend the `pomegranate` model to compute the smoothed probability of rain on `Day 3`, first given that the umbrella is seen on `Day 4` and then given that it is not (that is, two separate calculations).

Check your result against the values from the Excel model.

In [None]:
# Extend to day 4
Umbrella3 = ConditionalProbabilityTable(# TODO: fill in the table)
Rain4 = ConditionalProbabilityTable(# TODO: fill in the table)
Umbrella4 = ConditionalProbabilityTable(# TODO: fill in the table)

# TODO: Define and add nodes
s7 = # TODO: Add node for Umbrella3
s8 = # TODO: Add node for Rain4
s9 = # TODO: Add node for Umbrella4
model.add_states(s7, s8, s9)

# TODO: Add edges for the new nodes

# Re-bake the model
model.bake()

# Run prediction with new model
scenario = # TODO: Create a new model scenario
predict_proba = model.predict_proba(scenario)
# Ask for the probability of rain on Day 3:
print(predict_proba[0][5].items())