# LECTURE 6 NOTES

## Content

* continued introduction to decision theory
* Bayesian models
* modeling decisions

## Objectives

* understand and build decision trees
* perform basic decision modeling with decision trees
* understand and explain Bayes' rule
* recognize when Bayes' rule applies in decision processes, and apply it

## DECISION TREES

As we work our way through decision making processes, we will quickly find that visualizing decisions is often an excellent way to understand them!  One of the most useful tools in visualization is the **decision tree**.  A decision tree is typically constructed with two types of nodes -- a _decision_ or _choice node_ and a _chance node_.  We will return to the chance node in a moment, but for now let's understand how we will build the tree on a very basic example decision.

**EXAMPLE**

> Bob owns a used bookstore and is also a lover of art.  He has recently become interested in promoting some of the artists in his community and wants to display artwork for sale in his bookstore. Bob decides not to charge a display fee and all of the proceeds will go directly to the artists, as he is sensitive to the difficulties of being an independent artist -- so Bob is seeing this as an opportunity to bring business to his store.  His bookstore has two floors and the main entrance to the cafe is on the first floor.  However, the main sales floor of the bookstore is actually _on the second floor_.  Thus, if the artwork is placed on the first floor, the visitors may not go to the second floor to browse the general book collection (as Bob would like to encourage).  However, if the artwork is placed on the first floor, visitors will have access to the cafe and thus increase the possibility of the sale of a coffee or pastry.  The manager has space near the cafe or near the window, where the stairs lead to the second floor.

> Bob knows that visitors who only come into the first floor spend an average of _\$4 at the coffee shop_.  He also knows that when visitors come upstairs, they also spend an average of _\$4 on bookstore items_.  At the current moment, he does not know what the chances are of visitors going upstairs, either if he places the artwork near the cafe or near the stairs.

The diagram below shows the decision tree model for the scenario described.  Each **square node** is a decision node and each edge captures the _payoff_ Bob will receive if he chooses that decision.  The **green triangle nodes** represent the termination of the decision path.

![](./assets/decisiontree@2017.03.26_15.18.47.png)

### Decision Trees With Probabilities of States of  Nature

Now that we have a tool for building the basic decision tree, let's move our discussion to a more realistic scenario.  In typical decisions, there is some level of _risk_ or _uncertainty_ associated with a decision ... recall that we are rarely faced with complete knowledge of the states of nature (as in deterministic models).  Modeling uncertainty requires probabilistic models, and we will begin by extending the example of Bob's bookstore above.

>We will simplify (for the sake of the example) Bob's dilemma to placing the artwork on the first or second floor (regardless of the exact location on that floor).  Since not _every_ customer buys something, Bob has looked in his transaction data and determined that his net first-floor sales (payoff) are \$100 on sunny days and \$175 on cloudy days, and his net second-floor sales are \$125 on sunny days and \$200 on cloudy days.  Bob will be putting the display up on Sunday and the forecast predicts a 0.65 chance of sun and a 0.35 chance of clouds.


Let's model this first with a payoff table:

|                         |  Sunny    |  Cloudy     |
|-------------------------|:---------:|:-----------:|
| Artwork on first floor  |    100    |   175       |
| Artwork on second floor |    125    |   200       |

What is missing from this are the probabilities, so we will add them in parentheses, next to the states of nature:

|                    |  Sunny (0.65)  | Cloudy (0.35) |
|--------------------|:------:|:-----------:|
| Artwork on first floor  |    100    |   175       |
| Artwork on second floor |    125    |   200       |
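
One common way to combine the payoffs and probabilities in this table is the expected payoff of each action: weight each payoff by the probability of its state of nature and add the results.  The sketch below does this in Python.  The names `payoffs`, `probabilities`, and `expected_payoff` are illustrative, and choosing by expected payoff is an assumption about how Bob wants to decide, not something the table itself dictates.

```python
# Payoff table from above: action -> payoff under each state of nature.
payoffs = {
    "first floor": {"sunny": 100, "cloudy": 175},
    "second floor": {"sunny": 125, "cloudy": 200},
}

# Forecast probabilities for the states of nature.
probabilities = {"sunny": 0.65, "cloudy": 0.35}


def expected_payoff(action_payoffs, state_probabilities):
    """Probability-weighted average payoff of one action."""
    return sum(
        state_probabilities[state] * payoff
        for state, payoff in action_payoffs.items()
    )


# Expected payoff of every action in the table.
expected = {
    action: expected_payoff(row, probabilities)
    for action, row in payoffs.items()
}

for action, value in expected.items():
    print(f"{action}: {value:.2f}")

# Action with the highest expected payoff (assumed decision criterion).
best_action = max(expected, key=expected.get)
print("highest expected payoff:", best_action)
```

With these numbers the second floor pays more in both states of nature, so it also has the higher expected payoff.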

Modeling this with a decision tree requires us to put the probabilities of the states of nature on each edge.  So we will see that the tree with probabilities (on the edges coming out of the chance node indicated by the **yellow circle**) is now:

![](./assets/decisiontree@2017.03.27_12.24.34.png)

[Silver Decisions source file (JSON) for this diagram](./assets/decisiontree@2017.03.27_13.09.30.json)

### RESOURCES
* [Silver Decisions](http://www.silverdecisions.pl/) is an open-source, free browser-based tool for making basic decision trees.  It is developed and maintained by the [Decision Support and Analysis Division, Warsaw School of Economics](http://www.sgh.waw.pl/en/Pages/default.aspx).
* [Silver Decisions gallery](https://github.com/SilverDecisions/SilverDecisions/wiki/Gallery) should give you a good idea of some of the nicer completed decision trees and their contexts.
*  Dr. Hossein Arsham's [Introduction and Summary / Analysis of Risky Decisions](http://home.ubalt.edu/ntsbarsh/Business-stat/opre/partIX.htm#rintrodecisionanaly), which provides a very good introduction to these topics.

## BAYESIAN DECISION PROCESSES

Though we have begun to introduce probabilities into our formulation of decision trees, we are going to go a few steps further to help us answer questions that are slightly more difficult and tricky.  Before we do, we will quickly review conditional probability, which leads us to a key result that will help us moving forward.

### CONDITIONAL PROBABILITY
We won't spend a lot of time reviewing probability, since this is something that you can do as you need, but we will review one key idea that is important in getting us to a Bayesian approach to decision making: conditional probability.

**Conditional probability**, $\Pr(A\big|B)$, is defined as $$\Pr(A\big|B) = {{\Pr(A \cap B)}\over{\Pr(B)}}.$$  That is, the probability of A is conditional on B occurring, or stated another way, it is the "probability of A given B". Concretely, let's consider the _probability of clouds given it is raining_.  With the definition of conditional probability, this is the probability of _rain_ **and** _clouds_ -- $\Pr(rain \cap clouds)$ -- divided by the _probability of rain_ -- $\Pr(rain)$.  You can also think about this as the _ratio_ of days that are both rainy and cloudy to rainy days (as opposed to the ratio of days that are both rainy and sunny to rainy days, which you will note is not a high ratio in most places).  To solve this, we need to know the probability of rain and clouds occurring at the same time and also the probability of rain.
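
The "ratio of days" reading of the definition can be sketched in a few lines of Python.  The day counts below are hypothetical and made up purely for illustration; only the formula comes from the definition above.

```python
# Hypothetical counts of days by weather; these numbers are made up
# for illustration and do not come from real data.
day_counts = {
    ("rain", "clouds"): 57,
    ("rain", "no clouds"): 3,
    ("no rain", "clouds"): 40,
    ("no rain", "no clouds"): 100,
}

total_days = sum(day_counts.values())

# Pr(rain and clouds): fraction of all days that are both rainy and cloudy.
pr_rain_and_clouds = day_counts[("rain", "clouds")] / total_days

# Pr(rain): fraction of all days that are rainy, cloudy or not.
pr_rain = sum(
    count for (rain, _), count in day_counts.items() if rain == "rain"
) / total_days

# Definition of conditional probability: Pr(clouds | rain).
pr_clouds_given_rain = pr_rain_and_clouds / pr_rain
print(f"Pr(clouds | rain) = {pr_clouds_given_rain:.2f}")
```

Because both probabilities share the same denominator (`total_days`), the result is simply the count of rainy and cloudy days divided by the count of rainy days.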

Let's make note of this as we will be returning to these results shortly.

Moving on, let's say we want to cast these conditional probabilities into a slightly different light.  What if you're traveling in another state and receive a call from your sister who is into weather (and probability).  Your sister loves to offer you challenges and says to you "it is cloudy, what is the probability that it is raining right now?".  What do you need to solve this seemingly difficult problem?

Let's return to our definition of conditional probability.  Let's shape the problem this way -- let $R$ be rain, and $C$ be cloudy -- and $\neg R$ be not rain and $\neg C$ be not cloudy.  If we use our conditional probability definition above, let's first set up the part of the question we know.  Being a weather geek, you know a lot about the probability of clouds given that it is rainy, which is given by the following:

$$ 
    \Pr(C\big|R) = {{\Pr(R \cap C)} \over \Pr(R)}
$$

There is some good news already -- you know the probability of rain, $\Pr(R)$, quite well, so we're already in a good position to work with this probability.  Unfortunately, we _want_ to know the probability of rain given that it is cloudy, not the other way around.  Using the definition of conditional probability, this can be written as:

$$ 
    \Pr(R\big|C) = {{\Pr(C \cap R)} \over \Pr(C)}
$$

More good news -- you know the probability of clouds.  But notice something else interesting ... on the right-hand side of each equation, the numerators are the same probability, $\Pr(R \cap C) = \Pr(C \cap R)$.  That means that we can rearrange the first equation and substitute it into the second to get a form for the question your sister is asking:

$$
{\Pr(R \cap C)} = {\Pr(C \cap R)} = {\Pr(C\big|R) \Pr(R)}
$$

And thus,

$$
\begin{align*}
    \Pr(R\big|C) &= {{\Pr(C \cap R)} \over \Pr(C)} \\
                 &= {{\Pr(C\big|R) \Pr(R)} \over \Pr(C)}
\end{align*}
$$

One last thing we need to realize is that $\Pr(C)$ depends on whether or not it is raining.  Since there are two situations in which it could be cloudy (when it is raining or when it is not raining), we need to add their probabilities up.  The probability of clouds is the probability of clouds given rain weighted by the probability of rain, **plus** the probability of clouds given no rain weighted by the probability of no rain: $\Pr(C) = \Pr(C\big|R)\Pr(R) + \Pr(C\big|\neg R)\Pr(\neg R)$.  This fact will need to be remembered when we do our final calculations.

We now have all we need to answer the question that your sister thought would stump you.  At the same time, we have derived the formula for a very important result in probability called **Bayes Theorem**.

### BAYES THEOREM
As formulated, **Bayes Theorem** allows us to answer questions where we have probabilistic knowledge about the world, and specifically in contexts where conditional probabilities are known.  To recap, **Bayes theorem** states generally:

$$ 
{\Pr(A \big| B)} = {{\Pr(B\big|A)} \Pr(A) \over \Pr(B)}
$$

Let's finish your sister's challenge and end with concrete values.  Being the weather bug you are, and knowing a few bits of historical data, you find the following for the day (today) your sister is trying to stump you:

- $\Pr(R) = 0.65, \Pr(\neg R) = 0.35$
- $\Pr(C) = \Pr(R)\Pr(C\big|R) + \Pr(\neg R)\Pr(C\big|\neg R)$
- $\Pr(C\big|R) = 0.95$
- $\Pr(C\big|\neg R) = 0.40$

Thus,

$$
\begin{align*}
    \Pr(R\big|C)
                 &= {{0.95 \times 0.65 } \over (0.65)0.95 + (0.35)0.40} \\
                 &= 0.8152
\end{align*}
$$

If you told your sister that it is probably raining, you have the evidence to support that answer: given clouds, the probability of rain is about 0.82.
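
The same calculation can be sketched in Python.  The function name `posterior` and its parameter names are illustrative choices; the numbers are the ones listed above, and the denominator is built from the sum over the rain and no-rain cases.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Pr(A | B) from Pr(A), Pr(B | A), and Pr(B | not A)."""
    # Denominator Pr(B): add up the two ways B can happen.
    evidence = prior * likelihood + (1 - prior) * likelihood_given_not
    return prior * likelihood / evidence


pr_rain = 0.65                 # Pr(R)
pr_clouds_given_rain = 0.95    # Pr(C | R)
pr_clouds_given_dry = 0.40     # Pr(C | not R)

pr_rain_given_clouds = posterior(
    pr_rain, pr_clouds_given_rain, pr_clouds_given_dry
)
print(f"Pr(R | C) = {pr_rain_given_clouds:.4f}")
```

Passing $\Pr(C\big|\neg R)$ rather than $\Pr(C)$ keeps the inputs to the three quantities that were actually given.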

### Probability Trees
In the prior example, we've encoded a lot of information that gets complex rather quickly, and one way to keep up with it all is with a _probability tree_.  These trees look similar to, but _are not_ the same as decision trees -- recall a decision tree encodes _states of nature_ and _actions_, while probability trees encode probabilities of states of nature (events) alone.  They visually represent the data required to understand conditional probabilities (and more generally, things like Bayes).  

To build a probability tree, build from the root node. Place the non-conditional events as children of the root. From these children nodes, build up subtrees by adding conditional events as children of the children nodes, then continue moving through the tree building it based on the conditional represented by the parent node of the subtrees until you've reached the leaves.  Note that the probabilities on the branches out of any single node will add up to 1, as they should, and so will the path probabilities (the product of the branch probabilities from the root) across all the leaf nodes.

Using our method for your sister's challenge, the probability tree looks like:

![](./assets/sister_challenge_pr_tree.png)
<!-- built using GraphvizFiddle http://stamm-wilbrandt.de/GraphvizFiddle/1.0.1/ -->

With the probability tree above, we need only work backwards to find the probabilities required to solve Bayes Theorem for our question. 
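
Working backwards through a probability tree can be sketched with a nested dictionary.  This sketch assumes the tree branches first on rain and then on clouds; the values 0.05 and 0.60 are the complements of $\Pr(C\big|R)$ and $\Pr(C\big|\neg R)$, and the variable names are illustrative.

```python
# Probability tree for the sister's challenge as a nested dictionary:
# first level is rain / no rain, second level is clouds / no clouds
# conditional on the first level (assumed layout of the tree).
tree = {
    "rain": (0.65, {"clouds": 0.95, "no clouds": 0.05}),
    "no rain": (0.35, {"clouds": 0.40, "no clouds": 0.60}),
}

# Multiply along each root-to-leaf path to get the path probabilities.
leaves = {
    (rain, clouds): pr_rain * pr_clouds
    for rain, (pr_rain, children) in tree.items()
    for clouds, pr_clouds in children.items()
}

# The leaves cover every possible outcome, so they should sum to 1.
total = sum(leaves.values())

# Working backwards: Pr(rain | clouds) is the "rain and clouds" leaf
# divided by the sum of all leaves that end in "clouds".
pr_clouds = sum(p for (_, clouds), p in leaves.items() if clouds == "clouds")
pr_rain_given_clouds = leaves[("rain", "clouds")] / pr_clouds

print(f"sum of leaves = {total:.2f}")
print(f"Pr(R | C) = {pr_rain_given_clouds:.4f}")
```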



## Bayesian Decision Processes

Let us revisit the decision tree above involving Bob's Bookstore.  Bob wants to make the right decision, so he is considering hiring a "shopping agency" that has professional shoppers who understand the effect of weather on shopping.  They claim to be accurate when it comes to using weather as a predictor of sales.  Prior agency results indicate that when the weather was sunny, their sales predictions were correct (R) 65% of the time and incorrect (I) 35% of the time.  When the weather was cloudy, their predictions were correct (R) 85% of the time and incorrect (I) 15% of the time.

We can represent these probabilities like so (here $S$ is sunny, $C$ is cloudy, and $R$ now stands for a correct prediction rather than rain),

- $\Pr(R\big|S) = 0.65, \Pr(I\big|S) = 0.35$
- $\Pr(R\big|C) = 0.85, \Pr(I\big|C) = 0.15$

When we produce the probability tree, we must do one for each floor, but in this case the probabilities of the weather are the same for both.  We could, however, model weather on various days and update our decision tree and probability trees accordingly.

![](./assets/bobs_sales_predictor.png)

Now we must cast our new shopping agency information into the decision tree.  What we'd like to do is determine the probabilities with this new information, or more precisely, the _updated_ probabilities of sunny and cloudy weather given the correct and incorrect outcomes the agency has reported.

Take a close look at the original decision tree:

![](assets/decisiontree@2017.04.03_20.22.57.png)
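
The probabilities we need are $\Pr(S\big|R), \Pr(S\big|I), \Pr(C\big|R),$ and $\Pr(C\big|I)$, that is, the probability of each weather state given the agency's outcome.  These are the reverse of the conditional probabilities the agency reported, so Bayes Theorem can help us.  We will go through a single example: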

$$
\begin{align*}
\Pr(S\big|R) &= {{\Pr(R\big|S)\Pr(S)} \over {\Pr(R)}} \\
             &= {{\Pr(R\big|S)\Pr(S)} \over {\Pr(S)\Pr(R\big|S) + \Pr(C)\Pr(R\big|C)}} \\
             &= {{(0.65)0.65}\over{(0.65)0.65 + (0.35)0.85}} \\
             &= 0.5868
\end{align*}
$$
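
The remaining three probabilities follow the same pattern.  As a minimal sketch, the Python below applies Bayes Theorem to every weather state and agency outcome, using the forecast (0.65 sunny, 0.35 cloudy) as the prior and the agency's track record as stated above.  The dictionary layout and the function name `posteriors` are illustrative assumptions.

```python
# Prior probabilities of the weather, from the forecast.
prior = {"S": 0.65, "C": 0.35}

# Agency track record: Pr(outcome | weather), where the outcome is
# R (correct) or I (incorrect).
likelihood = {
    "S": {"R": 0.65, "I": 0.35},
    "C": {"R": 0.85, "I": 0.15},
}


def posteriors(prior, likelihood, outcome):
    """Pr(weather | outcome) for every weather state, via Bayes' theorem."""
    joint = {w: prior[w] * likelihood[w][outcome] for w in prior}
    evidence = sum(joint.values())  # Pr(outcome), summed over weather states
    return {w: joint[w] / evidence for w in joint}


for outcome in ("R", "I"):
    for weather, p in posteriors(prior, likelihood, outcome).items():
        print(f"Pr({weather} | {outcome}) = {p:.4f}")
```

For each outcome the two updated weather probabilities sum to 1, which is a quick check that the denominator was built correctly.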

The decision tree (for just the first floor) now looks like:

![](assets/bayes_example.png)

From here you can compute the second floor tree and hence have enough information to determine if the shopping agency's information is worth the cost they are charging.
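
One way to read "worth the cost" is to compare the best expected payoff with and without the agency's information.  The sketch below does that under two assumptions that are not stated in the notes: Bob chooses the floor with the highest expected payoff, and R and I are treated as the two possible agency outcomes observed before he decides.  All names in the code are illustrative.

```python
prior = {"S": 0.65, "C": 0.35}                      # forecast
likelihood = {"S": {"R": 0.65, "I": 0.35},          # Pr(outcome | sunny)
              "C": {"R": 0.85, "I": 0.15}}          # Pr(outcome | cloudy)
payoffs = {"first floor": {"S": 100, "C": 175},
           "second floor": {"S": 125, "C": 200}}


def expected_payoffs(weather_probs):
    """Expected payoff of each floor under the given weather probabilities."""
    return {floor: sum(weather_probs[w] * row[w] for w in row)
            for floor, row in payoffs.items()}


# Without the agency: use the forecast directly.
best_without = max(expected_payoffs(prior).values())

# With the agency: update the weather probabilities for each outcome,
# pick the best floor in that branch, and weight by Pr(outcome).
value_with = 0.0
for outcome in ("R", "I"):
    joint = {w: prior[w] * likelihood[w][outcome] for w in prior}
    pr_outcome = sum(joint.values())
    updated = {w: joint[w] / pr_outcome for w in joint}
    branch = expected_payoffs(updated)
    print(outcome, {floor: round(v, 2) for floor, v in branch.items()})
    value_with += pr_outcome * max(branch.values())

print(f"expected payoff without agency: {best_without:.2f}")
print(f"expected payoff with agency:    {value_with:.2f}")
```

With the payoffs in this example the second floor is better in both weather states, so the updated probabilities never change the choice and the two figures come out equal.  Information only has value under this criterion when some outcome would lead Bob to pick a different action.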

## RESOURCES

* [Think Bayes](http://greenteapress.com/wp/think-bayes/) is an excellent introduction to Bayesian statistics from a computational perspective in Python.  There are a lot of coding examples there.
* For a formal and thorough treatment, see [Bayesian Reasoning and Machine Learning, David Barber (2010)](http://web4.cs.ucl.ac.uk/staff/D.Barber/textbook/090310.pdf).
* For a general introduction to Bayesian statistics, see [Bayesian Data Analysis, Gelman et al.](http://www.stat.columbia.edu/~gelman/book/).