### Bayesian Network Inference Library

We will be using the pomegranate library for Bayes net inference.

  * Installation instructions https://pomegranate.readthedocs.io/en/latest/install.html
  * Tutorial / documentation https://pomegranate.readthedocs.io/en/latest/BayesianNetwork.html
  
In the tutorial / documentation, ignore the part about "initializing a Bayesian network based completely on data" and the sections on "Probability", "Prediction", and "Fitting". See the example below for how to determine the probability distribution on a node in the graph based on evidence.

Just to make sure things are working, first load in the Monty Hall code from the tutorial and answer the question about whether or not a contestant should take Monty up on his offer to switch doors.

In [None]:
from pomegranate import *

# The three "random variables" are
#    guest -- what door will the guest choose -- doors are A, B, and C
#    prize -- what door is the prize behind
#    monty -- what door will Monty open.  This is a function of both guest and prize:
#               Monty will never open the door the guest chooses and will never open the 
#               door with the prize (if the guest doesn't choose it)
#             So the first three lines of the CPT below say the guest chooses A and 
#               the prize is behind A, and then Monty will choose B or C with equal probability

# Notice the pattern of building networks:  
#   -- build your distributions -- either DiscreteDistribution for nodes without parents
#          or ConditionalProbabilityTable for nodes with parents.  The CPT for Monty needs 27 
#          entries, since there are 9 possible combinations of parent values, and three possible
#          values the monty random variable can take.

guest = DiscreteDistribution({'A': 1./3, 'B': 1./3, 'C': 1./3})
prize = DiscreteDistribution({'A': 1./3, 'B': 1./3, 'C': 1./3})
monty = ConditionalProbabilityTable(
        [['A', 'A', 'A', 0.0],
         ['A', 'A', 'B', 0.5],
         ['A', 'A', 'C', 0.5],
         ['A', 'B', 'A', 0.0],
         ['A', 'B', 'B', 0.0],
         ['A', 'B', 'C', 1.0],
         ['A', 'C', 'A', 0.0],
         ['A', 'C', 'B', 1.0],
         ['A', 'C', 'C', 0.0],
         ['B', 'A', 'A', 0.0],
         ['B', 'A', 'B', 0.0],
         ['B', 'A', 'C', 1.0],
         ['B', 'B', 'A', 0.5],
         ['B', 'B', 'B', 0.0],
         ['B', 'B', 'C', 0.5],
         ['B', 'C', 'A', 1.0],
         ['B', 'C', 'B', 0.0],
         ['B', 'C', 'C', 0.0],
         ['C', 'A', 'A', 0.0],
         ['C', 'A', 'B', 1.0],
         ['C', 'A', 'C', 0.0],
         ['C', 'B', 'A', 1.0],
         ['C', 'B', 'B', 0.0],
         ['C', 'B', 'C', 0.0],
         ['C', 'C', 'A', 0.5],
         ['C', 'C', 'B', 0.5],
         ['C', 'C', 'C', 0.0]], [guest, prize])

s1 = Node(guest, name="guest")
s2 = Node(prize, name="prize")
s3 = Node(monty, name="monty")

model = BayesianNetwork("Monty Hall Problem")
model.add_states(s1, s2, s3)
model.add_edge(s1, s3)
model.add_edge(s2, s3)
model.bake()
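
A long hand-typed table like the one above is easy to get wrong. The following is a minimal sketch, in plain Python and independent of pomegranate, of a check that every combination of parent values has probabilities summing to 1. The helper name `check_cpt_rows` and the small two-door tables are hypothetical, introduced only for illustration; the row layout (parents first, then the child value, then the probability) follows the `monty` table.

```python
from collections import defaultdict

def check_cpt_rows(rows, tol=1e-9):
    """Return the parent combinations whose probabilities do not sum to 1.

    Each row is [parent_1, ..., parent_k, child_value, probability].
    An empty result means every parent combination is properly normalized.
    """
    totals = defaultdict(float)
    for *parents, _child, prob in rows:
        totals[tuple(parents)] += prob
    return {parents: total for parents, total in totals.items()
            if abs(total - 1.0) > tol}

# Hypothetical one-parent tables, used only to exercise the check
good = [['A', 'A', 0.0], ['A', 'B', 1.0],
        ['B', 'A', 1.0], ['B', 'B', 0.0]]
bad = [['A', 'A', 0.2], ['A', 'B', 0.7],   # parent 'A' is not normalized
       ['B', 'A', 1.0], ['B', 'B', 0.0]]

print(check_cpt_rows(good))
print(check_cpt_rows(bad))
```

Passing the list of rows used to build `monty` to the same function should return an empty dictionary if the table was typed in correctly.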

In [None]:
# Based on no more evidence, what is the likelihood that the contestant will win the prize?
model.marginal()

In [None]:
##  Suppose the guest chooses A, and Monty chooses B.
##  Monty gives the guest the chance to switch from A to C.  Should she?
model.predict_proba({"guest": 'A', "monty": 'B'})

In [None]:
##  Is predict_proba with no evidence the same as marginal?
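
As a cross-check on the Monty Hall query that does not rely on pomegranate, the posterior over the prize door can be computed by enumerating the joint distribution directly. This is a sketch under the same assumptions as the table above (uniform guest and prize, Monty never opens the guest's door or the prize door, and chooses uniformly when he has two options). The function names `monty_prob` and `prize_posterior` are hypothetical.

```python
from fractions import Fraction

DOORS = ['A', 'B', 'C']

def monty_prob(guest, prize, monty):
    """P(monty | guest, prize): Monty avoids the guest's door and the prize door."""
    allowed = [d for d in DOORS if d != guest and d != prize]
    return Fraction(1, len(allowed)) if monty in allowed else Fraction(0)

def prize_posterior(guest, monty):
    """P(prize | guest, monty), obtained by enumerating and normalizing."""
    prior = Fraction(1, 3)
    # Unnormalized weight per prize door: P(prize) * P(monty | guest, prize).
    # P(guest) is the same factor in every term, so it cancels on normalizing.
    weights = {p: prior * monty_prob(guest, p, monty) for p in DOORS}
    total = sum(weights.values())
    return {p: w / total for p, w in weights.items()}

posterior = prize_posterior('A', 'B')
for door in DOORS:
    print(door, posterior[door])
```

The printed values should agree with the distribution that `predict_proba` reports for the `prize` node, up to floating-point rounding.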

### Second Example:  Typical Noisy Sensor

* The variable **water** says whether or not there is water in my basement.  This variable takes values **{none, some, lots}**
* I have a water detector **waterDetector** that is either **on** or **off**
  * It is supposed to be **on** if and only if **water** is either **some** or **lots**
  * However, it sometimes fails to alert (is **off** when **water** is either **some** or **lots**)
  * It also sometimes false alarms (is **on** when **water** is **none**)

This is what I discovered by observing the basement over time:
* On any given day, the probability of **water** is **(.98, .015, .005)** for values **(none, some, lots)**
* The likelihood of a false alarm **P(waterDetector = on | water = none) = 0.01**
* The likelihood of the sensor missing water depends on the water level: **P(waterDetector = off | water = some) = .10**;   **P(waterDetector = off | water = lots) = .005**
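
The figures above give only one entry per water level; the other entry in each row of the detector's table is its complement. The sketch below, in plain Python rather than pomegranate, expands them into full rows and applies Bayes' rule by hand, which gives numbers to compare against the network's output. It assumes the prior over **water** is (0.98, 0.015, 0.005) so that it sums to 1; all variable names are hypothetical.

```python
# Prior over water. ASSUMPTION: "lots" is 0.005, so the prior sums to 1.
prior = {'none': 0.98, 'some': 0.015, 'lots': 0.005}

# P(waterDetector = on | water), from the false-alarm and miss rates
p_on = {'none': 0.01, 'some': 1 - 0.10, 'lots': 1 - 0.005}

# Full table in [parent value, child value, probability] order;
# the "off" entry is the complement of the "on" entry.
cpt_rows = []
for level in prior:
    cpt_rows.append([level, 'on', p_on[level]])
    cpt_rows.append([level, 'off', 1 - p_on[level]])

# Marginal probability that the detector is on: sum over water levels
p_detector_on = sum(prior[w] * p_on[w] for w in prior)

# Bayes' rule: P(water | on) = P(water) * P(on | water) / P(on)
posterior = {w: prior[w] * p_on[w] / p_detector_on for w in prior}

print(round(p_detector_on, 6))
print({w: round(p, 4) for w, p in posterior.items()})
```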


In [None]:
from pomegranate import *

# This is the distribution for my water node
wdist = DiscreteDistribution(....)


# Distribution for my water detector node
detectordist = ConditionalProbabilityTable(...)

#  My two nodes
water = Node(...)
waterDetector = Node(...)

# The Network
model = BayesianNetwork("Water Detector")
model.add_states(...)
model.add_edge(...)
model.bake()
                                     

With no further information, what is the likelihood that (a) there is some or lots of water in my basement, and (b) my water detector is displaying **on**?

In [None]:
# Compute probabilities on the basis of no additional evidence.  Its output is a list of 
# distributions over node values, in the order they were added -- in our case, water is at [0] and waterDetector is at [1]
model.predict_proba({})

In [None]:
# This is the distribution over values of water -- not surprisingly, it is the same as the priors 
#  (subject to rounding error)
model.predict_proba({})[0].parameters

In [None]:
# This is the distribution over values of waterDetector
model.predict_proba({})[1].parameters

Suppose I learn that the water detector is **on**.  How does that affect my beliefs about the basement water level?

In [None]:
model.predict_proba({"waterDetector": "on"})[0].parameters

Suppose instead I go to the basement and observe that there is no water in the basement.  
How does that affect my belief as to whether or not the water detector is on?

####  Question 1 ####
My beliefs about water level change with the season. There are two seasons, the dry season and the wet season.
The season affects my prior belief in the water level, not the behavior of the sensor. If the season is
**wet**, my prior distribution on **water** is
{"none": 0.80, "some": 0.15, "lots": 0.05}
and if the season is **dry**, the distribution is
{"none": 0.95, "some": 0.035, "lots": 0.015}

Adjust the model accordingly.  Suppose it's the dry season, but my water detector is saying **on** -- what do I believe about water in the basement?


#### Question 2 ####

Code the example from the lectures about burglaries, alarms, and phone calls.
* Variables are B (Burglary), E (Earthquake), A (Alarm), J (John calls), M (Mary calls)
  * All are true/false
* Assumptions we make as domain experts
  * Burglaries do not cause earthquakes, and vice versa
  * The alarm's behavior depends only on B and E and uncorrelated error behavior (for example, if there is no Burglary or Earthquake, then the likelihood the alarm sounds anyway does not depend on any other factor)
  * John and Mary have their own parameters, but those depend only on A, and otherwise they act independently
* Parameters we have gathered
  * Prob(B) = .001
  * Prob(E) = .002
  * Prob(A | B,E) = (.95, .94, .29, .001)    ((+b, +e), (+b, -e), (-b, +e), (-b, -e))
  * Prob(J | A) = (.9, .45) (+a, -a)
  * Prob(M | A) = (.7, .01) (+a, -a)
 
Code this as a Bayes network, and answer these three questions:

1.  My phone is off so I don't know whether or not I got a call.  What is the likelihood there was a burglary?  What is the likelihood my alarm went off?

2.  I just got a call from Mary but not John.  How does that affect my belief that there was a burglary?

3.  I heard a reliable news report that there was an earthquake.  What is the chance I will be getting a call from John soon?
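
Because this network has only five true/false variables, its answers can be cross-checked by brute-force enumeration over all 32 assignments. The sketch below is plain Python, not the pomegranate model the question asks for. It uses the parameters exactly as listed above, and the names `bern`, `joint`, and `query` are hypothetical helpers.

```python
from itertools import product

# Parameters as listed above; tuple keys are (burglary, earthquake)
P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.9, False: 0.45}   # keyed by alarm state
P_M = {True: 0.7, False: 0.01}   # keyed by alarm state

NAMES = ('B', 'E', 'A', 'J', 'M')

def bern(p_true, value):
    """Probability of `value` for a true/false variable with P(True) = p_true."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, j, m):
    """Joint probability, factored according to the network structure."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a)
            * bern(P_J[a], j) * bern(P_M[a], m))

def query(target, evidence=None):
    """P(target = True | evidence), summing the joint over all assignments."""
    evidence = evidence or {}
    numer = denom = 0.0
    for values in product([True, False], repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if any(world[name] != val for name, val in evidence.items()):
            continue  # inconsistent with the evidence
        p = joint(*values)
        denom += p
        if world[target]:
            numer += p
    return numer / denom

print(query('B'), query('A'))                  # no evidence
print(query('B', {'M': True, 'J': False}))     # Mary called, John did not
print(query('J', {'E': True}))                 # earthquake reported
```

Enumeration grows exponentially with the number of variables, so it is useful only as a check on small networks like this one.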
