# Chapter 9. Decision Analysis

[Link to chapter online](https://allendowney.github.io/ThinkBayes2/chap09.html)

## Warning

The content of this file may be incorrect, erroneous and/or harmful. Use it at your own risk.

## Imports

In [None]:
include("./pmfAndCdf.jl")
include("./simplestat.jl")

In [None]:
import CairoMakie as Cmk
import CSV as Csv
import DataFrames as Dfs
import Distributions as Dsts
import KernelDensity as Kde
import Statistics as Stats

In [None]:
Num = Union{Int,Float64} # custom type

## The Price is Right Problem

*The Price is Right* is a game show. The objective is to guess the price of a collection of prizes (a showcase).
The contestant who comes closest to the actual price, without going over, wins the prizes.

In one episode there were two contestants (N and L):
- N Prize: dishwasher, wine cabinet, laptop, car.
- L Prize: pinball machine, video arcade game, pool table, cruise of the Bahamas.

Bids:
- N: $26'000 (real price: $25'347, diff: $653)
- L: $21'500 (real price: $21'578, diff: $78)

N overbid and loses. L wins her showcase and, because her bid was off by less than $250, N's showcase as well.

Several questions for a Bayesian thinker.

1. Before seeing the prizes, what prior beliefs should the contestants have about the price of the showcase?
2. After seeing the prizes, how should the contestants update those beliefs?
3. Based on the posterior distribution, what should the contestants bid?

Problem inspired by Cameron Davidson-Pilon's [book](https://dataorigami.net/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/).

## The Prior

To choose the prior distribution we can use the record of prices from previous shows.
See [the book repo.](https://github.com/AllenDowney/ThinkBayes2/tree/master/data)

In [None]:
function read_data(filename::String):: Dfs.DataFrame
    df = Csv.read(filename, Dfs.DataFrame; header=false, skipto=4) 
    df = Dfs.dropmissing(df)
    df = Dfs.permutedims(df, 1)
    df[!, 2:end]
end

In [None]:
df2011 = read_data("./showcases2011.csv")
df2012 = read_data("./showcases2012.csv")
df = vcat(df2011, df2012)
first(df, 3)

The first two columns, `Showcase 1` and `Showcase 2`, are the values of the
showcases in dollars. The next two columns are the bids the contestants made.
The last two columns are the differences between the actual values and the bids.

## Kernel Density Estimation

We can use this sample to estimate the prior distribution of showcase prices, e.g. using KDE, i.e. [kernel density estimation](https://mathisonian.github.io/kde/).

More info on the [KDE library for Julia](https://github.com/JuliaStats/KernelDensity.jl) used here.
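As a rough illustration of what a Gaussian KDE computes, here is a minimal sketch written without `KernelDensity.jl`. The function name `gaussianKdeSketch`, the sample values, and the fixed `bandwidth` are assumptions made for this example only; the library picks a default bandwidth itself, so its numbers may differ.

```julia
# Gaussian kernel density estimate evaluated at the points `qs`.
# `bandwidth` is a hypothetical, hand-picked smoothing parameter.
function gaussianKdeSketch(
    sample::Vector{<:Real}, qs::Vector{<:Real}, bandwidth::Real)::Vector{Float64}

    n = length(sample)
    # standard normal density
    kernel(u) = exp(-0.5 * u^2) / sqrt(2 * pi)
    # at each q: average of the kernels centered at the observations
    return [sum(kernel((q - x) / bandwidth) for x in sample) / (n * bandwidth)
            for q in qs]
end

toySample = [20_000, 22_000, 30_000, 31_000, 45_000]  # made-up prices
toyQs = collect(0:1000:80_000)
density = gaussianKdeSketch(toySample, toyQs, 3000)

# normalize the densities so they can serve as a discrete PMF
toyPmf = density ./ sum(density)
```

Each observation contributes one small bell curve; the estimate is their average, so a larger bandwidth gives a smoother curve.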

In [None]:
function getKDEfromSample(sample::Vector{A}, qs::Vector{B}) where {A<:Num, B<:Num}
    # optional keyword argument is kernel (defaults to Dsts.Normal)
    gaussianKde::Kde.KernelDensity.UnivariateKDE = Kde.kde(sample) 
    ps::Vector{Float64} = Kde.pdf(gaussianKde, qs)
    pmf::Pmf{B} = Pmf(qs, ps)
    return pmf
end

In [None]:
qs = range(0, 80000, 81) |> collect
prior1 = getKDEfromSample(df[!, "Showcase 1"], qs)
prior2 = getKDEfromSample(df[!, "Showcase 2"], qs)

In [None]:
fig = Cmk.Figure()
ax1, l1 = Cmk.lines(
    fig[1, 1],
    prior1.names, prior1.priors,
    color=:blue, linewidth=3,
    axis=(;title="Prior distribution of showcase value",
    xlabel="Showcase value in \$", ylabel="PMF",
    xticks=(0:10000:80000, map(x -> string(x, "k"), 0:10:80)),
    yticks=0:0.01:0.06, 
    )

)
l2 = Cmk.lines!(
    fig[1, 1],
    prior2.names, prior2.priors,
    color=:orange, linewidth=3,
)
Cmk.axislegend(
    ax1,
    [l1, l2],
    ["Showcase 1", "Showcase 2"]
)
fig

## Distribution of Error

To update the priors we need to know:
- What data should we consider and how should we quantify it?
- Can we compute a likelihood function; that is, for each hypothetical price,
can we compute the conditional likelihood of the data?


[...] model each contestant as a price-guessing instrument with known error characteristics.

Now the question we have to answer is, “If the actual price is `price`, what is
the likelihood that the contestant’s guess would be `guess`?”

Equivalently, if we define `error = guess - price`, we can ask, “What is the likelihood that the contestant’s guess is off by `error`?”

In [None]:
sampleDiff1 = df[:, "Bid 1"] .- df[:, "Showcase 1"]
sampleDiff2 = df[:, "Bid 2"] .- df[:, "Showcase 2"];

In [None]:
qs = range(-40000, 20000, 61) |> collect
kdeDiff1 = getKDEfromSample(sampleDiff1, qs)
kdeDiff2 = getKDEfromSample(sampleDiff2, qs);

In [None]:
fig = Cmk.Figure()
ax1, l1 = Cmk.lines(
    fig[1, 1],
    kdeDiff1.names, kdeDiff1.priors,
    color=:blue, linewidth=3,
    axis=(;title="Difference between bid and actual value",
    xlabel="Difference in value in \$", ylabel="PMF",
    xticks=(-40000:10000:20000, map(x -> string(x, "k"), -40:10:20)),
    yticks=0:0.01:0.07, 
    )

)
l2 = Cmk.lines!(
    fig[1, 1],
    kdeDiff2.names, kdeDiff2.priors,
    color=:orange, linewidth=3,
)
Cmk.axislegend(
    ax1,
    [l1, l2],
    ["Diff 1", "Diff 2"]
)
fig

It looks like these distributions are well modeled by a normal distribution.

In [None]:
meanDiff1 = Stats.mean(sampleDiff1)
stdDiff1 = Stats.std(sampleDiff1)

(meanDiff1, stdDiff1)

We will use the diffs to model distribution of errors.

Assumptions:

- contestants underbid because they are being strategic, and on average their guesses are accurate. In other words, the mean of their errors is 0.
- the spread of the differences reflects the actual spread of their errors. So, the standard deviation of the differences is the standard deviation of their errors.

In [None]:
errorDist1 = Dsts.Normal(0, stdDiff1)

Now we can, e.g., calculate the probability density of `error=-100` for Player 1.

In [None]:
Dsts.pdf(errorDist1, -100)

By itself, this number doesn’t mean very much, because probability densities are not probabilities. But they are proportional to probabilities, so we can use them as likelihoods in a Bayesian update [...].
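A minimal sketch of why densities work as likelihoods: after normalization only their ratios matter, so any constant factor drops out. The helper `normalPdf`, the value of `sigma`, and the three errors are assumptions chosen for illustration.

```julia
# density of Normal(mu, sigma) at x, written out by hand
normalPdf(x, mu, sigma) = exp(-0.5 * ((x - mu) / sigma)^2) / (sigma * sqrt(2 * pi))

sigma = 6000.0                      # hypothetical spread of the errors
errors = [-2000.0, -100.0, 3000.0]  # hypothetical errors, one per hypothesis
densities = normalPdf.(errors, 0.0, sigma)

# normalizing removes any constant factor, so only the ratios matter
weights = densities ./ sum(densities)
weightsScaled = (1000 .* densities) ./ sum(1000 .* densities)
```

Both `weights` and `weightsScaled` hold the same relative weights, even though the raw densities are tiny numbers.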

## Update

Suppose we are Player 1 and our guess for the total price of the prizes is $23'000.

In [None]:
guess1 = 23_000
error1 = guess1 .- prior1.names;

Moreover, we assume that our estimation error is well modeled by `errorDist1`.

In [None]:
likelihood1 = Dsts.pdf.(errorDist1, error1);

With the likelihood of the error under each hypothesis in hand, we can update the prior.

In [None]:
posterior1 = Pmf(prior1.names |> copy, prior1.priors)
posterior1.likelihoods = likelihood1
updatePosteriors!(posterior1, true);
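`Pmf` and `updatePosteriors!` come from the included `pmfAndCdf.jl`, which is not shown here. As a sketch of the arithmetic such an update presumably performs, here is the same step with plain vectors. The three-point grid, the prior, and `sigma` are made-up values, and the likelihood omits the normalizing constant of the normal density because it cancels out.

```julia
# hypothetical three-point grid of showcase prices and a prior over them
prices = [20_000.0, 25_000.0, 30_000.0]
prior = [0.3, 0.4, 0.3]

guess = 23_000
sigma = 5_000.0  # assumed standard deviation of the guessing error

# unnormalized Normal(0, sigma) density of the error under each hypothesis
likelihood = exp.(-0.5 .* ((guess .- prices) ./ sigma) .^ 2)

# Bayes's rule on a grid: multiply, then normalize
unnormalized = prior .* likelihood
posterior = unnormalized ./ sum(unnormalized)

priorMean = sum(prices .* prior)
posteriorMean = sum(prices .* posterior)
```

In this toy case the guess lies below the prior mean, so the posterior mean moves down toward the guess.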

In [None]:
fig = Cmk.Figure()
ax1, l1 = Cmk.lines(
    fig[1, 1],
    prior1.names, prior1.priors,
    color=:gray, linewidth=3,
    axis=(;title="Prior and posterior distribution of showcase value",
    xlabel="Showcase value in \$", ylabel="PMF",
    xticks=(0:10000:80000, map(x -> string(x, "k"), 0:10:80)),
    yticks=0:0.01:0.09, 
    )

)
l2 = Cmk.lines!(
    fig[1, 1],
    posterior1.names, posterior1.posteriors,
    color=:purple, linewidth=3,
)
Cmk.axislegend(
    ax1,
    [l1, l2],
    ["Prior 1", "Posterior 1"]
)
fig

In [None]:
getMean(prior1, true),
getMean(posterior1, false)

Before we saw the prizes, we expected a showcase with a value of ~$30k.
After making a guess of 23k we updated the distribution. Now we expect the actual
price to be ~26k.

### Exercise 1

Now it's Player 2's turn. He evaluates the prize to be worth $38'000.
Perform the above calculations for Player 2.

In [None]:
guess2 = 38_000
error2 = guess2 .- prior2.names;

In [None]:
meanDiff2 = Stats.mean(sampleDiff2)
stdDiff2 = Stats.std(sampleDiff2)

(meanDiff2, stdDiff2)

In [None]:
errorDist2 = Dsts.Normal(0, stdDiff2)

In [None]:
likelihood2 = Dsts.pdf.(errorDist2, error2);

In [None]:
posterior2 = Pmf(prior2.names |> copy, prior2.priors)
posterior2.likelihoods = likelihood2
updatePosteriors!(posterior2, true);

In [None]:
fig = Cmk.Figure()
ax1, l1 = Cmk.lines(
    fig[1, 1],
    prior2.names, prior2.priors,
    color=:gray, linewidth=3,
    axis=(;title="Prior and posterior distribution of showcase value",
    xlabel="Showcase value in \$", ylabel="PMF",
    xticks=(0:10000:80000, map(x -> string(x, "k"), 0:10:80)),
    yticks=0:0.01:0.09, 
    )

)
l2 = Cmk.lines!(
    fig[1, 1],
    posterior2.names, posterior2.posteriors,
    color=:purple, linewidth=3,
)
Cmk.axislegend(
    ax1,
    [l1, l2],
    ["Prior 2", "Posterior 2"]
)
fig

In [None]:
getMean(prior2, true),
getMean(posterior2, false)

## Probability of Winning

From the point of view of Player 1, let's compute the probability that Player 2 overbids. To keep it simple, I’ll use only the performance of past players, ignoring the value of the showcase.

In [None]:
function getProbOverbid(sampleDiff::Vector{<:Num})::Float64
    return Stats.mean(sampleDiff .> 0) 
end

In [None]:
# an estimate of P(Player 2 overbid)
getProbOverbid(sampleDiff2)

Now suppose Player 1 underbids by $5'000. What is the probability that Player 2
underbids by more?

In [None]:
function getProbWorseThan(diff::A, sampleDiff::Vector{B})::Float64 where {A<:Num, B<:Num}
    return Stats.mean(sampleDiff .< diff)    
end

In [None]:
# P(Player 2 underbids by more than 5k)
getProbWorseThan(-5_000, sampleDiff2)

In [None]:
# P(Player 2 underbids by more than 10k)
getProbWorseThan(-10_000, sampleDiff2)

We can combine these functions to compute the probability that Player 1 wins,
given the difference between their bid and the actual price:

In [None]:
function getProbWin(diff::A, sampleDiff::Vector{B})::Float64 where {A<:Num, B<:Num}
    # if you overbid you lose 
    if diff > 0
        return 0
    else
        # if the opponent overbids, you win
        pOppOverbid = getProbOverbid(sampleDiff)
        # or if their bid is worse than yours, you win
        pOppBidWorse = getProbWorseThan(diff, sampleDiff)

        # pOppOverbid and pOppBidWorse are mutually exclusive
        return pOppOverbid + pOppBidWorse
    end
end
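To check the logic by hand, here is a small sketch with five made-up opponent differences. The names `toyDiffs`, `probOverbid`, `probWorseThan`, and `probWin` are hypothetical stand-ins that mirror the functions above without the type annotations.

```julia
toyDiffs = [-8000, -3000, -1000, 500, 2000]  # made-up differences (bid - price)

# fraction of past opponents who overbid
probOverbid(diffs) = count(d -> d > 0, diffs) / length(diffs)
# fraction of past opponents who underbid by more than `diff`
probWorseThan(diff, diffs) = count(d -> d < diff, diffs) / length(diffs)
# the two events cannot both happen, so their probabilities add
probWin(diff, diffs) =
    diff > 0 ? 0.0 : probOverbid(diffs) + probWorseThan(diff, diffs)

winUnder2k = probWin(-2000, toyDiffs)  # 2/5 overbid + 2/5 underbid by more
winOver = probWin(100, toyDiffs)       # overbidding always loses
```

With these numbers, underbidding by 2000 wins against four of the five past opponents, while any overbid wins against none.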

In [None]:
# prob Player 1 wins, given they underbid by $5k
getProbWin(-5000, sampleDiff2)

In [None]:
xs = range(-30_000, 5_000, 121) |> collect
ys = map(x -> getProbWin(x, sampleDiff2), xs);

In [None]:
fig = Cmk.Figure()
Cmk.lines(
    fig[1, 1],
    xs, ys,
    color=:blue, linewidth=2,
    axis=(;title="Player 1",
        xlabel="Difference between bid and actual price (\$)",
        ylabel="Probability of winning with Player 2",
        xticks=(-30_000:5_000:5_000, map(x -> string(x, "k"), -30:5:5)),
        yticks=0:0.2:1, 
    )
)
fig

If Player 1 underbids by $30k, the probability of winning is ~0.3, which is
mostly the chance Player 2 overbids.

As the bid gets closer to the actual price, the probability of winning approaches 1.
If Player 1 overbids, they lose even if Player 2 overbids.

### Exercise 2

Run the same analysis from the point of view of Player 2.

In [None]:
# an estimate of P(Player 1 overbid)
getProbOverbid(sampleDiff1)

In [None]:
# P(Player 1 underbids by more than 5k)
getProbWorseThan(-5_000, sampleDiff1)

In [None]:
# prob Player 2 wins, given they underbid by $5k
getProbWin(-5000, sampleDiff1)

In [None]:
xs = range(-30_000, 5_000, 121) |> collect
ys = map(x -> getProbWin(x, sampleDiff1), xs);

In [None]:
fig = Cmk.Figure()
Cmk.lines(
    fig[1, 1],
    xs, ys,
    color=:blue, linewidth=2,
    axis=(;title="Player 2",
        xlabel="Difference between bid and actual price (\$)",
        ylabel="Probability of winning with Player 1",
        xticks=(-30_000:5_000:5_000, map(x -> string(x, "k"), -30:5:5)),
        yticks=0:0.2:1, 
    )
)
fig

## Decision Analysis

Unlike in the previous section, the contestants don’t know whether and by how
much they underbid or overbid (they don’t know the actual price).

But they do have a posterior distribution that represents their beliefs about
the actual price, and they can use that to estimate their probability of winning
with a given bid.

$P(\text{win}) = \sum_i P(\text{price}_i) \, P(\text{win} \mid \text{price}_i)$
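A small sketch of this sum on a toy problem. The three-point posterior, the opponent differences, and the names `probWin` and `totalProbWin` are assumptions made for illustration, not values from the data.

```julia
toyPrices = [20_000, 25_000, 30_000]         # hypothetical showcase prices
toyPosterior = [0.2, 0.5, 0.3]               # hypothetical posterior probabilities
toyDiffs = [-8000, -3000, -1000, 500, 2000]  # made-up opponent differences

# P(win | diff): zero when overbidding, otherwise the opponent must
# either overbid or underbid by more
function probWin(diff, diffs)
    diff > 0 && return 0.0
    n = length(diffs)
    return count(d -> d > 0, diffs) / n + count(d -> d < diff, diffs) / n
end

# P(win) = sum over i of P(price_i) * P(win | price_i)
totalProbWin(bid) = sum(
    p * probWin(bid - price, toyDiffs)
    for (price, p) in zip(toyPrices, toyPosterior))

pWin24k = totalProbWin(24_000)  # 0.2 * 0.0 + 0.5 * 0.8 + 0.3 * 0.6
```

A bid of 24'000 is an overbid if the price is 20'000, so that hypothesis contributes nothing to the total.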

In [None]:
"""
    getTotalProbOfWin(bid::Num,
        posteriors::Pmf{Float64},
        diffsForOpponent::Vector{<:Num})::Float64

Computes the total probability of winning with a given bid.

---
args:
    bid: your bid
    posteriors: Pmf of showcase values
    diffsForOpponent: differences (bid - price) observed for the opponent

returns:
    probability of winning
"""
function getTotalProbOfWin(
    bid::A, posteriors::Pmf{Float64},
    diffsForOpponent::Vector{B})::Float64 where {A<:Num, B<:Num}

    total::Float64 = 0.0
    diff::Float64 = 0.0
    for (price, prob) in zip(posteriors.names, posteriors.posteriors)
        diff = bid - price 
        total += prob * getProbWin(diff, diffsForOpponent)
    end
    return total
end

Here's the probability that Player 1 wins, based on a bid of $25'000 and the
posterior distribution `posterior1`:

In [None]:
getTotalProbOfWin(25_000, posterior1, sampleDiff2)

Now we can loop through a series of possible bids and compute the probability of
winning for each one.

In [None]:
bids = posterior1.names
probs = map(bid -> getTotalProbOfWin(bid, posterior1, sampleDiff2), bids);

In [None]:
fig = Cmk.Figure()
ax, l = Cmk.lines(
    fig[1, 1],
    bids, probs,
    color=:orange, linewidth=2,
    axis=(;title="Optimal bid: probability of winning",
        xlabel="Bid [\$]",
        ylabel="Probability of winning",
        xticks=(0:10_000:80_000, map(x -> string(x, "k"), 0:10:80)),
        yticks=0:0.1:1,
    )
)
Cmk.axislegend(
    ax,
    [l],
    ["Player 1"]
)
fig

In [None]:
indxMaxProb = argmax(probs)

In [None]:
probs[indxMaxProb],
bids[indxMaxProb]

It appears that the bid that maximizes Player 1's chance of winning is $21'000.

### Exercise 3

Do the same analysis for Player 2.

In [None]:
bids = posterior2.names
probs = map(bid -> getTotalProbOfWin(bid, posterior2, sampleDiff1), bids);

In [None]:
fig = Cmk.Figure()
ax, l = Cmk.lines(
    fig[1, 1],
    bids, probs,
    color=:orange, linewidth=2,
    axis=(;title="Optimal bid: probability of winning",
        xlabel="Bid [\$]",
        ylabel="Probability of winning",
        xticks=(0:10_000:80_000, map(x -> string(x, "k"), 0:10:80)),
        yticks=0:0.1:1,
    )
)
Cmk.axislegend(
    ax,
    [l],
    ["Player 2"]
)
fig

In [None]:
indxMaxProb = argmax(probs)

In [None]:
probs[indxMaxProb],
bids[indxMaxProb]

## Maximizing Expected Gain

[...] if your bid is off by $250 or less, you win both showcases. So it might be
a good idea to increase your bid a little: it increases the chance you overbid
and lose, but it also increases the chance of winning both showcases.

In [None]:
"""
Compute expected gain given a bid and actualPrice.
"""
function getGain(
    bid::A,
    actualPrice::B,
    sampleDiffs::Vector{Int})::Float64 where {A<:Num, B<:Num}

    diff::Int = bid - actualPrice
    prob::Float64 = getProbWin(diff, sampleDiffs)
    # if you are within 250 dollars, you win both showcases
    if -250 <= diff <= 0
        return 2 * actualPrice * prob
    else
        return actualPrice * prob
    end
end

In [None]:
getGain(30_000, 35_000, sampleDiff2)

In reality we don’t know the actual price, but we have a posterior distribution
that represents what we know about it. By averaging over the prices and
probabilities in the posterior distribution, we can compute the expected gain
for a particular bid.

In this context, “expected” means the average over the possible showcase values,
weighted by their probabilities.
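The weighted average described above can be sketched on a toy problem. All numbers and the names `probWin`, `gain`, and `expectedGain` are hypothetical; the gain rule follows the one used in this section, where a bid at most 250 dollars below the price doubles the payoff.

```julia
toyPrices = [20_000, 25_000, 30_000]         # hypothetical showcase prices
toyPosterior = [0.2, 0.5, 0.3]               # hypothetical posterior probabilities
toyDiffs = [-8000, -3000, -1000, 500, 2000]  # made-up opponent differences

function probWin(diff, diffs)
    diff > 0 && return 0.0
    n = length(diffs)
    return count(d -> d > 0, diffs) / n + count(d -> d < diff, diffs) / n
end

function gain(bid, price, diffs)
    diff = bid - price
    p = probWin(diff, diffs)
    # a bid within $250 below the price wins both showcases
    return (-250 <= diff <= 0 ? 2 * price : price) * p
end

# average of the gains over the prices, weighted by their probabilities
expectedGain(bid) = sum(
    p * gain(bid, price, toyDiffs)
    for (price, p) in zip(toyPrices, toyPosterior))

gain24k = expectedGain(24_000)
gain25k = expectedGain(25_000)
```

In this toy setup, bidding 25'000 has a much higher expected gain than bidding 24'000, because it hits the most probable price exactly and triggers the double payoff.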

In [None]:
"""
Compute the expected gain of a given bid.
"""
function getExpectedGain(
    bid::A,
    posteriors::Pmf{Float64},
    sampleDiffs::Vector{Int})::Float64 where {A<:Num}
    
    total::Float64 = 0
    for (price, prob) in zip(posteriors.names, posteriors.posteriors)
        total += prob * getGain(bid, price, sampleDiffs)
    end

    return total
end

In [None]:
getExpectedGain(21_000, posterior1, sampleDiff2)

But can we do any better?

To find out, we can loop through a range of bids and find the one that maximizes
expected gain.

In [None]:
bids = posterior1.names
gains = map(bid -> getExpectedGain(bid, posterior1, sampleDiff2), bids);

In [None]:
fig = Cmk.Figure()
ax, l = Cmk.lines(
    fig[1, 1],
    bids, gains,
    color=:green, linewidth=2,
    axis=(;title="Optimal bid: expected gain",
        xlabel="Bid [\$]",
        ylabel="Expected gain [\$]",
        xticks=(0:10_000:80_000, map(x -> string(x, "k"), 0:10:80)),
        yticks=(0:2500:17500, map(x -> string(x, "k"), 0:2.5:17.5)), 
    )
)
Cmk.axislegend(
    ax,
    [l],
    ["Player 1"]
)
fig

In [None]:
maxGainIndx = argmax(gains)

In [None]:
bids[maxGainIndx],
gains[maxGainIndx]

### Exercise 4

Do the same analysis for Player 2.

In [None]:
bids = posterior2.names
gains = map(bid -> getExpectedGain(bid, posterior2, sampleDiff1), bids);

In [None]:
fig = Cmk.Figure()
ax, l = Cmk.lines(
    fig[1, 1],
    bids, gains,
    color=:green, linewidth=2,
    axis=(;title="Optimal bid: expected gain",
        xlabel="Bid [\$]",
        ylabel="Expected gain [\$]",
        xticks=(0:10_000:80_000, map(x -> string(x, "k"), 0:10:80)),
        yticks=(0:2500:17500, map(x -> string(x, "k"), 0:2.5:17.5)), 
    )
)
Cmk.axislegend(
    ax,
    [l],
    ["Player 2"]
)
fig

In [None]:
maxGainIndx = argmax(gains)

In [None]:
bids[maxGainIndx],
gains[maxGainIndx]

## Summary

Let’s review what we did in this chapter:

- we used KDE and data from past shows to estimate prior distributions for the values/prices of the showcases.
- we used bids from past shows to model the distribution of errors as a normal distribution.
- we did a Bayesian update using the distribution of errors to compute the likelihood of the data.
- we used the posterior distribution for the value/price of the showcase to
compute the probability of winning for each possible bid, and identified the bid
that maximizes the chance of winning.
- we used probability of winning to compute the expected gain for each possible
bid, and identified the bid that maximizes expected gain.

## Discussion

[...] because Bayesian and frequentist methods produce different kinds of results:
- The result of frequentist methods is usually a single value that is considered
to be the best estimate (by one of several criteria) or an interval that
quantifies the precision of the estimate.
- The result of Bayesian methods is a posterior distribution that represents all
possible outcomes and their probabilities.

Granted, you can use the posterior distribution to choose a “best” estimate or
compute an interval. And in that case the result might be the same as the
frequentist estimate.

[...] the primary benefit of Bayesian methods: the posterior distribution is
more useful than a single estimate, or even an interval. For example:

Using the entire posterior distribution, we can compute the bid that maximizes
the probability of winning, or the bid that maximizes expected gain, even if the
rules for computing the gain are complicated (and nonlinear).

With a single estimate or an interval, we can’t do that, even if they are
“optimal” in some sense. In general, frequentist estimation provides little
guidance for decision-making.


## Exercises

### Exercise 5

The task description below is copied from the book; I/me/my etc. refer to Allen Downey.

When I worked in Cambridge, Massachusetts, I usually took the subway to South Station and then a commuter train home to Needham. Because the subway was unpredictable, I left the office early enough that I could wait up to 15 minutes and still catch the commuter train.

When I got to the subway stop, there were usually about 10 people waiting on the platform. If there were fewer than that, I figured I just missed a train, so I expected to wait a little longer than usual. And if there were more than that, I expected another train soon.

But if there were a lot more than 10 passengers waiting, I inferred that something was wrong, and I expected a long wait. In that case, I might leave and take a taxi.

We can use Bayesian decision analysis to quantify the analysis I did intuitively. Given the number of passengers on the platform, how long should we expect to wait? And when should we give up and take a taxi?

My analysis of this problem is in redline.ipynb, which is in the repository for this book. [Click here to run this notebook on Colab](https://colab.research.google.com/github/AllenDowney/ThinkBayes2/blob/master/notebooks/redline.ipynb).