Decisions Under Risk
======================

Shane Steinert-Threlkeld

S.N.M.Steinert-Threlkeld AT uva DOT nl

Recap
-----

* Decisions under ignorance: the true state is unknown, and so are the probabilities of the states
* Many choice rules:
    - avoid dominance, maximin, leximin, maximax, optimism-pessimism, minimax regret
* Every rule seemed to have pros and cons.  Is there a single normatively _correct_ choice rule?

Outline
------

* Decisions under risk: probabilities known
* Choice rule: maximize _expected_ utility
* Justifying the MEU principle
    - What is utility?
    - What are the probabilities?
* Evidential versus Causal Decision Theory
* Problems for MEU

In [1]:
%%HTML
<style type="text/css">
.rendered_html tbody tr td:first-child {
    border-right: 1px solid black;
}
    
.rendered_html table {
    font-size: 28px;
}
</style>

Ride to Work Revisited
----------

| &#160; | rain | no rain |
| ----- | ----- | ----- |
| take clothes | 1 | 1 |
| leave clothes | 0 | 2 |

Ride to Work Revisited
----------

| &#160; | rain ($0.7$) | no rain ($0.3$) |
| ----- | ----- | ----- |
| take clothes | 1 | 1 |
| leave clothes | 0 | 2 |

(NB: we now assume that these utilities are _cardinal_, representing my actual values, not merely the ordering of my preferences over the outcomes.)

In [2]:
from collections import namedtuple

decision_problem = namedtuple(
    'decision_problem',
    ('states', 'actions', 'utilities', 'probabilities'),
    # defaults apply to the rightmost fields, so only `probabilities` is optional
    defaults=[None]
)
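
As a small sketch of what `defaults=[None]` buys us: a decision under ignorance can be built from the same type by leaving the probabilities out. The name `ignorance_problem` is hypothetical, introduced only for this illustration; the utilities are the ones from the first Ride to Work table.

```python
from collections import namedtuple

decision_problem = namedtuple(
    'decision_problem',
    ('states', 'actions', 'utilities', 'probabilities'),
    defaults=[None]
)

# Hypothetical example: the same problem with no probabilities supplied,
# i.e. a decision under ignorance.
ignorance_problem = decision_problem(
    ('rain', 'no_rain'),
    ('take_clothes', 'leave_clothes'),
    [[1, 1], [0, 2]],
)

# The omitted rightmost field falls back to its default.
print(ignorance_problem.probabilities is None)
```

A decision under risk then differs only in supplying the fourth field.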

Maximize Expected Utility
---------

* Observation: for each $a$, $u( \cdot , a)$ is a _random variable_ with domain $S$ and range $\mathbb{R}$.
* For full generality, we assume that the function $p$ in the agent's decision problem works as follows: for each action $a$, $p_a$ is a probability distribution over the states $S$.  Intuitively, this assigns probabilities to statements of the form "if $a$, then $s$".  Much more on this later.

$$\text{MEU}(D) = \text{argmax}_a \mathbb{E}_{p_a} u(\cdot , a)$$

Intuitively: we weight the utility of the action in each state by how probable that state is (given the action), sum the results, and choose an action for which this sum is greatest.
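
Because $S$ is finite in these examples, the expectation unpacks into a weighted sum. As a worked check on the Ride to Work numbers, writing $\text{EU}(a)$ as shorthand (introduced here only for convenience) for $\mathbb{E}_{p_a} u(\cdot , a)$:

$$\text{EU}(a) = \sum_{s \in S} p_a(s) \, u(s, a)$$

$$\text{EU}(\text{take clothes}) = 0.7 \cdot 1 + 0.3 \cdot 1 = 1$$

$$\text{EU}(\text{leave clothes}) = 0.7 \cdot 0 + 0.3 \cdot 2 = 0.6$$

Since $1 > 0.6$, MEU recommends taking the clothes.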

MEU in Python
------------

In [3]:
import numpy as np

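ride_to_work = decision_problem(
    ('rain', 'no_rain'),
    ('take_clothes', 'leave_clothes'),
    # utilities[i, j]: utility of action i in state j
    np.array([
        [1, 1],
        [0, 2]
    ]),
    # probabilities[i, j]: probability of state j given action i
    # (the rows are identical here: the weather does not depend on the action)
    np.array([
        [0.7, 0.3],
        [0.7, 0.3]
    ])
)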

In [10]:
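def maximize_expected_utility(decision):
    """Return the list of actions with maximal expected utility (ties included)."""
    # rows are actions, columns are states; element-wise multiplication
    weighted_utilities = decision.utilities * decision.probabilities
    # sum over states to get one expected utility per action
    expected_utilities = np.sum(weighted_utilities, axis=1)
    # isclose rather than == so that floating-point rounding does not hide ties
    is_max = np.isclose(expected_utilities, np.amax(expected_utilities))
    return [act for act, best in zip(decision.actions, is_max) if best]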

In [11]:
maximize_expected_utility(ride_to_work)

['take_clothes']
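
To see where that answer comes from, here is a minimal, self-contained sketch that prints the expected utility of each action instead of only the winner. It also runs a hypothetical variant in which rain has probability $0.5$ (not a number from the example above), where the two actions tie; this is why the function returns a list. The names `eu_by_action`, `tie_probabilities`, and `tied` are introduced here for illustration only.

```python
import numpy as np

# Rows are actions, columns are states, as in ride_to_work.
actions = ('take_clothes', 'leave_clothes')
utilities = np.array([[1, 1], [0, 2]])
probabilities = np.array([[0.7, 0.3], [0.7, 0.3]])

# Expected utility of each action: sum over states of p_a(s) * u(s, a).
expected_utilities = np.sum(utilities * probabilities, axis=1)
eu_by_action = dict(zip(actions, expected_utilities.tolist()))
print(eu_by_action)

# Hypothetical variant: with p(rain) = 0.5 both actions have the same
# expected utility, so MEU is indifferent between them.
tie_probabilities = np.array([[0.5, 0.5], [0.5, 0.5]])
tie_eus = np.sum(utilities * tie_probabilities, axis=1)
tied = [a for a, eu in zip(actions, tie_eus) if np.isclose(eu, tie_eus.max())]
print(tied)
```

With the original probabilities the gap between the two actions is $0.4$; it shrinks as rain becomes less likely, and in this example it closes at exactly $p(\text{rain}) = 0.5$.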