Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE". As a reminder, there is **NO COLLABORATION** whatsoever on the final.

---

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

from IPython.display import YouTubeVideo

In the gameshow *Deal or No Deal*, the contestant is presented with 26 closed briefcases numbered from 1 to 26, each containing a piece of paper with a different amount of money written on it. The amounts range from 0.01 to 1,000,000 USD and are posted on a board that everyone can see:

<img src="images/deal-or-no-deal-board.gif" alt="Deal or no deal board" width="200px"/>


At the beginning of the show, the contestant selects a briefcase, but does not get to open it. Unless the contestant accepts a deal along the way, the amount written in that suitcase is how much the contestant will win at the end of the game. But what is the chosen suitcase worth? This is the essential question that drives the game's dynamic.

On each round, the participant selects a briefcase to eliminate from those that remain. The selected briefcase is opened, revealing the dollar amount inside. The contestant thus learns that the originally-chosen suitcase does not contain that amount, and the amount is struck off the board.

Then the participant is confronted with a decision: continue playing by eliminating another suitcase and getting an even better estimate of the value of the briefcase in hand, or accept a deal offered by The Banker: take a fixed amount of money and end the game immediately. The amount of money offered by The Banker depends on the suitcases that remain.

Should the contestant take the risk and continue playing, possibly beating The Banker's offer? Or should the contestant accept the deal and walk away, money in hand? This is the game's essential tension, always present. It's what makes the show exciting. Deal, or no deal? 

If, by the end of the game, the contestant has not taken any of The Banker's offers, then the contestant walks away with the amount of money written in the suitcase that was chosen at the start of the show.

If you are confused about how the game works, consider watching a bit of the show on YouTube. Here's a particularly exciting clip.

In [None]:
YouTubeVideo("hmZFHjQfx-o")

---
## Part A (1 point)

Decision theory recommends that a contestant should accept the deal if the utility of The Banker's offer exceeds the expected utility of continuing to play. What *is* expected utility, you ask? First, consider the amount of money that is offered to the contestant. It's a number in units of dollars. But most of us don't experience life in units of dollars, and that's where utility comes into play. The utility function transforms dollars into *utility*, the satisfaction that a decision-maker will experience in receiving that amount of money. Using the utility function, it is straightforward to transform The Banker's offer into a utility. But what about the utility of continuing to play? We don't know it for sure, so instead we must compute the expected utility — the probability-weighted average utility of all possible futures.

We can decide whether to accept or reject a deal by comparing the offer's utility to the *expected utility* of continuing to play. Note that if the contestant continues playing, there are as many possible outcomes as there are dollar values remaining on the board, and these outcomes are all equally likely. Thus the expected utility of the action "continue-to-play" $A$ is given by the probability-weighted average utility over all possible outcomes, $o\in A$:

$$E[U(A)] = \sum_{o\in A} p(o)U(o)$$

For example, if three briefcases remain (\$10, \$300, and \$5,000), then the expected utility of rejecting a deal and keeping the current case is:

$$E[U] = p(\$10)U(\$10) + p(\$300)U(\$300) + p(\$5000)U(\$5000),$$

where $p(x)$ is the probability of receiving $x$ dollars and $U(x)$ is the utility of receiving $x$ dollars.

The expected utility thus depends both on the probability of the possible outcomes and on the utility function $U$. For this problem, we will assume the cases are equally likely to be chosen. We will explore various choices of the utility function, $U$.
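
As a quick illustration of the formula, the three-briefcase example above can be evaluated numerically. This is a minimal sketch that assumes equal probabilities and, purely as a placeholder, treats utility as equal to the dollar amount; the variable names are illustrative and not part of the exam code.

```python
import numpy as np

# Dollar amounts left on the board in the three-briefcase example
amounts = np.array([10.0, 300.0, 5000.0])

# Assumption: every remaining amount is equally likely
p = np.full(amounts.shape, 1.0 / amounts.size)

# Placeholder utility: U(x) = x (utility measured directly in dollars)
u = amounts

# Probability-weighted average utility, E[U] = sum_o p(o) U(o)
expected_u = np.sum(p * u)
print(expected_u)  # approximately 1770
```

Swapping in a different utility only changes the line that defines `u`; the weighting by `p` stays the same.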

One of the simplest utility functions is a linear utility function, which equates utility with dollars.  

- `linear` — equates utility with dollar amount,
  i.e. $U(x)=x$
  
However, a classic result in the study of economics and decision-making is that people are risk-averse for gains: they prefer a sure outcome to a gamble with the same expected dollar value. Given a choice between \$0.50 and a gamble with a 50% chance of getting a dollar and a 50% chance of getting nothing, they'll choose the sure thing over the gamble. This can be explained by a concave utility function, such as the logarithm:

- `log` — natural logarithm of the dollar
  amount, i.e. $U(x)=\log{x}$
  
Other concave utility functions have been proposed, too:
  
- `sqrt` — square root of the dollar amount,
  i.e. $U(x)=\sqrt{x}$
- `negexp` — negative exponential of the dollar amount,
  i.e. $U(x)=1-e^{-x/200000}$
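
To see why a concave utility function produces risk-averse choices, the coin-flip gamble described above can be worked through numerically. The sketch below uses the square-root utility as one illustrative choice (the logarithm is undefined at \$0, so it is less convenient for this particular gamble); the variable names are made up for the example.

```python
import numpy as np

# Sure thing: $0.50 for certain
sure_amount = 0.50

# Gamble: 50% chance of $1, 50% chance of $0 (same expected dollar value)
gamble_amounts = np.array([1.0, 0.0])
gamble_probs = np.array([0.5, 0.5])

# Concave utility chosen for illustration: U(x) = sqrt(x)
u_sure = np.sqrt(sure_amount)
eu_gamble = np.sum(gamble_probs * np.sqrt(gamble_amounts))

# Both options are worth $0.50 on average, but the sure thing has higher utility
print(u_sure, eu_gamble)
```

Because the square root flattens out as the amount grows, the utility of the average amount is larger than the average of the utilities, which is what makes the sure thing more attractive.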

<div class="alert alert-success">To compare these various utility functions, complete the function `utility`. This function takes an $N$-length vector of dollar amounts, `x`, and the name of a utility function, `utility_function`, and computes the utility of each value of `x` using the given utility function. The four utility functions you should implement are described above.
</div>

In [None]:
def utility(x, utility_function):
    """Compute the utility.
    
    Parameters
    ----------
    x : numpy array
        A numpy array with shape (N,) containing dollar values.
    utility_function : string
        The name of the utility function ("linear", "log", "sqrt", or "negexp").
        
    Returns
    -------
    numpy array
        A numpy array with shape (N,) corresponding to the utility of
        each value in x under the named utility function.

    """
    # YOUR CODE HERE
    raise NotImplementedError()

Here's a cell to test your code:

In [None]:
# add your own test cases here


In [None]:
"""Check that utility is implemented correctly."""

from numpy.testing import assert_allclose

arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([ 0.60726381,  0.41245142,  0.90066819,  0.4128766 ,  0.47827406])
arr3 = np.array([ 7.,  7.,  2.,  5.,  7.,  6.,  7.,  1.,  4.,  2.])

# Test linear utility function
assert_allclose(utility(arr1, 'linear'), arr1)
assert_allclose(utility(arr2, 'linear'), arr2)
assert_allclose(utility(arr3, 'linear'), arr3)
    
# Test log utility function
expected_log_1 = np.array([ 0.        ,  0.69314718,  1.09861229,  1.38629436])
expected_log_2 = np.array([-0.49879197, -0.88563685, -0.10461836, -0.88460652, -0.73757136])
expected_log_3 = np.array([ 
    1.94591015,  1.94591015,  0.69314718,  1.60943791,  1.94591015,
    1.79175947,  1.94591015,  0.        ,  1.38629436,  0.69314718])

assert_allclose(utility(arr1, 'log'), expected_log_1)
assert_allclose(utility(arr2, 'log'), expected_log_2)
assert_allclose(utility(arr3, 'log'), expected_log_3)
    
# Test sqrt utility function
expected_sqrt_1 = np.array([ 1.        ,  1.41421356,  1.73205081,  2.        ])
expected_sqrt_2 = np.array([ 0.77927133,  0.64222381,  0.9490354 ,  0.64255474,  0.69157361])
expected_sqrt_3 = np.array([ 
    2.64575131,  2.64575131,  1.41421356,  2.23606798,  2.64575131,
    2.44948974,  2.64575131,  1.        ,  2.        ,  1.41421356])

assert_allclose(utility(arr1, 'sqrt'), expected_sqrt_1)
assert_allclose(utility(arr2, 'sqrt'), expected_sqrt_2)
assert_allclose(utility(arr3, 'sqrt'), expected_sqrt_3)
    
# Test negexp utility function
expected_negexp_1 = np.array([4.99998750e-06, 9.99995000e-06, 1.49998875e-05, 1.99998000e-05])
expected_negexp_2 = np.array([3.03631444e-06, 2.06225497e-06, 4.50333081e-06, 2.06438087e-06, 2.39136744e-06])
expected_negexp_3 = np.array([
    3.49993875e-05, 3.49993875e-05, 9.99995000e-06, 2.49996875e-05, 3.49993875e-05,
    2.99995500e-05, 3.49993875e-05, 4.99998750e-06, 1.99998000e-05, 9.99995000e-06])

assert_allclose(utility(arr1, 'negexp'), expected_negexp_1)
assert_allclose(utility(arr2, 'negexp'), expected_negexp_2)
assert_allclose(utility(arr3, 'negexp'), expected_negexp_3)

print("Success!")

---
## Part B (2 points)

<div class="alert alert-success">
By graphing the utility functions, you can see how the relative utility of different amounts of money changes. To do this, we'll first create a function that lets you visualize the utility function for a range of values, in this case from \$100 to \$500,000 (including \$500,000) in increments of \$100. Complete the following function `plot_utility`.
</div>
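
One detail worth checking when building the range is that `np.arange` excludes its stop value, so the stop has to lie one step beyond \$500,000 for the endpoint to be included. A small sketch (the name `dollars` is only illustrative):

```python
import numpy as np

# np.arange(start, stop, step) excludes `stop`, so use a stop value one step past 500,000
dollars = np.arange(100, 500000 + 100, 100)

print(dollars[0], dollars[-1], dollars.size)  # 100 500000 5000
```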

In [None]:
def plot_utility(axis, utility_function):
    """Plot a utility function over the range $100 to $500,000 
    (including $500,000) in increments of $100. You should make use 
    of your completed `utility` function to compute the utilities. 
    In addition:
    
    * give the plot an appropriate title which includes the name of
      the utility function
    * make sure you include x and y labels
    
    Parameters
    ----------
    axis : matplotlib axis object
        The axis on which to plot the utility function
    utility_function : string
        The name of the utility function

    """
    # YOUR CODE HERE
    raise NotImplementedError()

Once your function is done, you can use it to plot any of the utility functions, for example the log utility function:

In [None]:
fig, axis = plt.subplots()
plot_utility(axis, 'log')

In [None]:
"""Check that plot_utility is correct"""
from numpy.testing import assert_array_equal

functions = ['linear', 'log', 'sqrt', 'negexp']
fig, axes = plt.subplots(1, 4)
for i in range(len(functions)):
    axis = axes[i]
    plot_utility(axis, functions[i])
    
    # check that only one thing is plotted
    # (plain assert avoids depending on the unmaintained `nose` package)
    assert len(axis.lines) == 1, "expected exactly one line on the axis"
    
    # check the xdata
    assert_array_equal(axis.lines[0].get_xdata(), np.arange(100, 500100, 100))
    # check the ydata
    assert_array_equal(axis.lines[0].get_ydata(), utility(np.arange(100, 500100, 100), functions[i]))
    
    # check the axis labels
    assert axis.get_xlabel() != ''
    assert axis.get_ylabel() != ''
    
    # check the title
    assert functions[i] in axis.get_title(), "the utility function name is not in the plot title"
plt.close()
  
# make sure it calls the utility function
fig, axis = plt.subplots()
old_utility = utility
del utility
try:
    plot_utility(axis, 'log')
except NameError:
    pass
else:
    raise AssertionError("plot_utility does not call utility")
finally:
    utility = old_utility
    del old_utility
    plt.close()
    
print("Success!")

---
## Part C (2 points)

<div class="alert alert-success">
Run the cell below to create plots for all four of the utility functions. Examine the slopes of these functions. How do the slopes vary? How do they relate to the degree of risk aversion implied by each utility function? Make sure you refer to *all* of the different functions in your answer.
</div>

<div class="alert alert-warning"> **NB**. Upon running the code below, you should see four separate plots, each with a different line plotted. If you do not, you probably have a bug in your `plot_utility` function from Part B!
</div>

In [None]:
# create a 2x2 grid of subplots
fig, axes = plt.subplots(2, 2)

# plot each of the utility functions
functions = ['linear', 'log', 'sqrt', 'negexp']
for i in range(len(functions)):
    plot_utility(axes.flat[i], functions[i])

# make the figure bigger
fig.set_figwidth(8)
fig.set_figheight(8)
plt.tight_layout()

YOUR ANSWER HERE

---
## Part D (1 point)

<div class="alert alert-success">
Let's now use our utility functions to compute the expected utility of continuing to play given that a certain set of suitcases remain. Assuming all the cases are equally likely to be chosen, complete the function `expected_utility` so that it computes the equation from Part A with the given utility function. You should again make use of your completed `utility` function.
</div>
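
With $N$ suitcases remaining and each equally likely, $p(o) = 1/N$ for every outcome, so the sum from Part A reduces to a plain average of the utilities. Writing the remaining dollar values as $x_1, \ldots, x_N$ (notation introduced here only for convenience):

$$E[U(A)] = \sum_{i=1}^{N} \frac{1}{N}\, U(x_i) = \frac{1}{N}\sum_{i=1}^{N} U(x_i)$$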

In [None]:
def expected_utility(remaining_suitcases, utility_function):
    """Compute the expected utility of continuing to play.
    
    Parameters
    ----------
    remaining_suitcases : numpy array
        A numpy array with shape (N,) containing dollar values of
        the remaining cases.
    utility_function : string
        The name of the utility function ("linear", "log", "sqrt", 
        or "negexp").
        
    Returns
    -------
    float
       The expected utility of the remaining suitcases under the 
       passed `utility_function`
    """
    # YOUR CODE HERE
    raise NotImplementedError()

Here's a cell to test your code:

In [None]:
# add your own test cases here


In [None]:
"""Check that expected_utility is correct"""
from numpy.testing import assert_allclose

arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([ 0.60726381,  0.41245142,  0.90066819,  0.4128766 ,  0.47827406])
arr3 = np.array([ 7.,  7.,  2.,  5.,  7.,  6.,  7.,  1.,  4.,  2.])

# check some different inputs
assert_allclose(expected_utility(arr1, 'linear'), 2.5)
assert_allclose(expected_utility(arr2, 'linear'), 0.56230681599999999)
assert_allclose(expected_utility(arr3, 'linear'), 4.7999999999999998)
assert_allclose(expected_utility(arr1, 'log'), 0.79451345758698633)
assert_allclose(expected_utility(arr2, 'log'), -0.62224501212915961)
assert_allclose(expected_utility(arr3, 'log'), 1.395742670012319)
assert_allclose(expected_utility(arr1, 'sqrt'), 1.5365660924854931)
assert_allclose(expected_utility(arr2, 'sqrt'), 0.74093177939662735)
assert_allclose(expected_utility(arr3, 'sqrt'), 2.109699008928752)
assert_allclose(expected_utility(arr1, 'negexp'), 1.2499906250518222e-05)
assert_allclose(expected_utility(arr2, 'negexp'), 2.8115297067365931e-06)
assert_allclose(expected_utility(arr3, 'negexp'), 2.3999647503747389e-05)

# make sure it calls utility
old_utility = utility
del utility
try:
    expected_utility(np.array([1, 2, 3]), 'linear')
except NameError:
    pass
else:
    raise AssertionError("expected_utility does not call utility")
finally:
    utility = old_utility
    del old_utility

print("Success!")

---
## Part E (1 point)

<div class="alert alert-success">
The contestant accepts The Banker's offer if the utility of the offer exceeds the expected utility of continuing to play. Using both your `utility` and `expected_utility` functions, create a `contestant` function that plays the game, deciding whether to accept the deal.
</div>
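
Written as an inequality, the rule is to accept when $U(\text{offer}) > E[U(A)]$. For instance, with two suitcases remaining (\$100 and \$110) and the linear utility function, continuing to play has expected utility

$$E[U(A)] = \tfrac{1}{2}(100) + \tfrac{1}{2}(110) = 105,$$

so an offer of \$104 is rejected and an offer of \$106 is accepted. Under the log utility function the threshold is lower: the offer is accepted when $\log(\text{offer}) > \tfrac{1}{2}(\log 100 + \log 110)$, that is, when the offer exceeds $\sqrt{100 \times 110} \approx 104.88$.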

In [None]:
def contestant(offer, remaining_suitcases, utility_function):
    """Play Deal or No Deal, deciding whether to accept The Banker's offer.
    The contestant accepts The Banker's offer if the utility of the offer
    exceeds the expected utility of continuing to play.
    
    Parameters
    ----------
    offer : float
        The dollar value of the offer.
    remaining_suitcases : numpy array
        A numpy array with shape (N,) containing dollar values of the remaining cases.
    utility_function : string
        The name of the utility function ("linear", "log", "sqrt", or "negexp").
        
    Returns
    -------
    boolean
        True if the contestant accepts the deal, otherwise False.

    """
    # YOUR CODE HERE
    raise NotImplementedError()

Here is a cell to test out your code:

In [None]:
response = contestant(104, np.array([110, 100]), 'linear')
response

In [None]:
"""Is contestant implemented correctly?"""

# Some offers should be accepted.
assert contestant(106, np.array([110, 100]), 'linear')
assert contestant(105, np.array([110, 100]), 'log')

# Other offers should be rejected.
assert not contestant(104, np.array([110, 100]), 'linear')
assert not contestant(104, np.array([110, 100]), 'log')

# check some other values, too
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([ 0.60726381,  0.41245142,  0.90066819,  0.4128766 ,  0.47827406])
arr3 = np.array([ 7.,  7.,  2.,  5.,  7.,  6.,  7.,  1.,  4.,  2.])

assert not contestant(2.5, arr1, 'linear')
assert contestant(3.5, arr1, 'linear')
assert not contestant(0.56, arr2, 'linear')
assert contestant(0.57, arr2, 'linear')
assert not contestant(4.7, arr3, 'linear')
assert contestant(4.9, arr3, 'linear')

assert not contestant(2.213363839400643, arr1, 'log')
assert contestant(2.22, arr1, 'log')
assert not contestant(0.53, arr2, 'log')
assert contestant(0.54, arr2, 'log')
assert not contestant(4.03, arr3, 'log')
assert contestant(4.04, arr3, 'log')

assert not contestant(2.3610353565761373, arr1, 'sqrt')
assert contestant(2.37, arr1, 'sqrt')
assert not contestant(0.54, arr2, 'sqrt')
assert contestant(0.55, arr2, 'sqrt')
assert not contestant(4.45, arr3, 'sqrt')
assert contestant(4.46, arr3, 'sqrt')

assert not contestant(2.4999968749994781, arr1, 'negexp')
assert contestant(2.5, arr1, 'negexp')
assert not contestant(0.56, arr2, 'negexp')
assert contestant(0.57, arr2, 'negexp')
assert not contestant(4.7, arr3, 'negexp')
assert contestant(4.8, arr3, 'negexp')

# make sure it calls utility
old_utility = utility
del utility
try:
    contestant(106, np.array([110, 100]), 'linear')
except NameError:
    pass
else:
    raise AssertionError("contestant does not call utility")
finally:
    utility = old_utility
    del old_utility

# make sure it calls expected_utility
old_expected_utility = expected_utility
del expected_utility
try:
    contestant(106, np.array([110, 100]), 'linear')
except NameError:
    pass
else:
    raise AssertionError("contestant does not call expected_utility")
finally:
    expected_utility = old_expected_utility
    del old_expected_utility

print("Success!")

---
## Part F (2 points)

Which utility function do contestants actually use when playing the game? Researchers have collected game-play data from the show and modeled the utility functions and choice models used by the contestants and The Banker. In this question, you will determine, based on actual game-play data, which utility function provides the closest fit to human behavior.

The data come in the form of an array where rows are dollar values (those on the board) and columns are game rounds. The value is `True` if that dollar value remains in play on the board and `False` if it has been eliminated. Your goal is to evaluate each utility function by tallying up the number of rounds where the predicted response matches that of the contestant.
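
A boolean column of this array can be used directly as a mask to recover the dollar values still in play on a given round. The sketch below uses a made-up four-value board with two rounds; the arrays are toy data for illustration only, not taken from the contestant files.

```python
import numpy as np

# Toy board: four dollar values (rows) tracked over two rounds (columns)
values = np.array([1.0, 10.0, 100.0, 1000.0])
board_history = np.array([
    [True,  True],
    [True,  False],   # $10 is eliminated before the second round
    [False, False],   # $100 was already gone in the first round
    [True,  True],
])

# Boolean indexing keeps only the values marked True in the chosen round
for r in range(board_history.shape[1]):
    remaining = values[board_history[:, r]]
    print(r, remaining)
```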

Here's an example history of the board from one particular contestant:

In [None]:
contestant_001 = dict(np.load("data/deal_or_no_deal/contestant_001.npz"))
contestant_001["board_history"]

And here are the dollar values of the board:

In [None]:
contestant_001["values"]

And here are the offers made by The Banker on each round:

In [None]:
contestant_001["offers"]

These are the decisions the contestant made on each round. `False` indicates that they rejected the deal, while `True` indicates that they accepted the deal:

In [None]:
contestant_001["decisions"]

To compare our different utility functions, we will measure how well they predict the contestants' decisions using a metric called an [F-score](http://en.wikipedia.org/wiki/F1_score). We have provided a function `F_score` for you:

In [None]:
def F_score(model, human):
    """Compute the F-score between model predictions
    and human data. A larger F-score means that the model
    predicts human behavior better.
    
    Parameters
    ----------
    model : boolean numpy array with shape (n,)
        The model decisions
    human : boolean numpy array with shape (n,)
        The human's decisions
        
    Returns
    -------
    float : the F-score for the model

    """
    TT = (model & human).sum()
    TF = (model & ~human).sum()
    FT = (~model & human).sum()

    if TT == 0 and TF == 0:
        precision = 0
    else:
        precision = TT / (TT + TF)

    if TT == 0 and FT == 0:
        recall = 0
    else:
        recall = TT / (TT + FT)

    if precision == 0 and recall == 0:
        return 0

    return 2 * precision * recall / (precision + recall)

<div class="alert alert-success">
Using the `F_score` function, as well as your `contestant` function, complete the `best_fitting_utility_function` below to compute F-scores for each of the utility functions:
</div>
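
To make the F-score bookkeeping concrete, here is a small hand-checkable example with made-up decisions. In this sketch `TT` counts rounds where both the model and the human accept, `TF` counts rounds where only the model accepts, and `FT` counts rounds where only the human accepts; the arrays are illustrative, not real data.

```python
import numpy as np

# Made-up decisions over four rounds (True = accept the deal)
model = np.array([True, True, False, False])
human = np.array([True, False, True, False])

TT = (model & human).sum()    # both accept: 1
TF = (model & ~human).sum()   # model accepts, human rejects: 1
FT = (~model & human).sum()   # model rejects, human accepts: 1

precision = TT / (TT + TF)    # 1 / 2
recall = TT / (TT + FT)       # 1 / 2
f_score = 2 * precision * recall / (precision + recall)
print(f_score)  # 0.5
```

Rounds where both the model and the human reject the deal do not enter the score at all, so a model is only rewarded for correctly predicting acceptances.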

In [None]:
def best_fitting_utility_function(contestant_data, utility_functions):
    """Which utility function best fits the data? Your function should use your
    contestant function from above to determine the decisions that a contestant
    would make for the given board and offers, for each utility function. For 
    each utility function, you should compute the F-score for the model compared
    to the contestant's real decisions. (The model is your `contestant` function, 
    using the given utility function.)
    
    Parameters
    ----------
    contestant_data : dict
        A dictionary containing the data for a contestant in the gameshow, with
        the following keys and values:
            board_history : (m, n) Numpy array
                Whether the dollar values remain in each of n rounds.
            values : (m,) Numpy array
                All the possible dollar values at the start of the game.
            offers : (n,) Numpy array
                The dollar value of The Banker's offer.
            decisions : (n,) Numpy array
                Whether the contestant accepted the offer.
    utility_functions : list
        A list of the utility functions you should compare.

    Returns
    -------
    numpy array
        The F-score for each utility function. The ith element should correspond to
        utility_functions[i].

    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
data = dict(np.load("data/deal_or_no_deal/contestant_001.npz"))
best_fitting_utility_function(data, ["log", "linear", "sqrt", "negexp"])

In [None]:
# add your own test cases here


In [None]:
"""Check that best_fitting_utility_function is correct"""
from numpy.testing import assert_allclose

data_001 = dict(np.load("data/deal_or_no_deal/contestant_001.npz"))

# make sure it calls the contestant function
old_contestant = contestant
del contestant
try:
    best_fitting_utility_function(data_001, ["log", "linear", "sqrt", "negexp"])
except NameError:
    pass
else:
    raise AssertionError("best_fitting_utility_function does not call contestant")
finally:
    contestant = old_contestant
    del old_contestant
    
# make sure it calls the F_score function
old_F_score = F_score
del F_score
try:
    best_fitting_utility_function(data_001, ["log", "linear", "sqrt", "negexp"])
except NameError:
    pass
else:
    raise AssertionError("best_fitting_utility_function does not call F_score")
finally:
    F_score = old_F_score
    del old_F_score

assert_allclose(
    best_fitting_utility_function(data_001, ["log", "linear", "sqrt", "negexp"]), 
    np.array([ 0.2       ,  0.        ,  0.22222222,  0.28571429]))
assert_allclose(
    best_fitting_utility_function(data_001, ["log", "sqrt", "linear", "negexp"]),
    np.array([ 0.2       ,  0.22222222,  0.        ,  0.28571429]))

data_002 = dict(np.load("data/deal_or_no_deal/contestant_002.npz"))
assert_allclose(
    best_fitting_utility_function(data_002, ["log", "linear", "sqrt", "negexp"]),
    np.array([ 0.22222222,  0.        ,  1.        ,  0.5       ]))
assert_allclose(
    best_fitting_utility_function(data_002, ["log", "sqrt", "linear", "negexp"]), 
    np.array([ 0.22222222,  1.        ,  0.        ,  0.5       ]))

data_003 = dict(np.load("data/deal_or_no_deal/contestant_003.npz"))
assert_allclose(
    best_fitting_utility_function(data_003, ["log", "linear", "sqrt", "negexp"]), 
    np.array([ 0.25      ,  0.        ,  0.33333333,  0.4       ]))
assert_allclose(
    best_fitting_utility_function(data_003, ["log", "sqrt", "linear", "negexp"]), 
    np.array([ 0.25      ,  0.33333333,  0.        ,  0.4       ]))

print("Success!")

---
## Part G (1 point)

Let's examine how well the utility functions fit the different contestants, on average. The following cell will print out the average F-score for each model across all the contestants. A larger F-score means that the model is a better match to the contestants' decisions:

In [None]:
utility_functions = ["linear", "log", "sqrt", "negexp"]

# compute the F-score of each utility function for each participant
scores = np.empty((47, 4))
for i in range(1, 48):
    data = dict(np.load("data/deal_or_no_deal/contestant_{:03d}.npz".format(i)))
    scores[i - 1] = best_fitting_utility_function(data, utility_functions)

# print out the mean F-score of each utility function
for j in range(len(utility_functions)):
    print("{}: {}".format(utility_functions[j], scores[:, j].mean()))

<div class="alert alert-success">
How well does each function explain the contestants' decisions? Which is the best-fitting function? Which is the worst? (Make sure you mention all the functions in your answer).
</div>

YOUR ANSWER HERE

<div class="alert alert-success">
What do these results suggest about human utility functions?
</div>

YOUR ANSWER HERE

---

Before turning this problem in remember to do the following steps:

1. **Restart the kernel** (Kernel$\rightarrow$Restart)
2. **Run all cells** (Cell$\rightarrow$Run All)
3. **Save** (File$\rightarrow$Save and Checkpoint)

<div class="alert alert-danger">After you have completed these three steps, ensure that the following cell has printed "No errors". If it has <b>not</b> printed "No errors", then your code has a bug in it and has thrown an error! Make sure you fix this error before turning in your exam.</div>

In [None]:
print("No errors!")