# Chapter 9. Decision Analysis

[Link to chapter online](https://allendowney.github.io/ThinkBayes2/chap09.html)

## Warning

The content of this file may be incorrect, erroneous, and/or harmful. Use it at your own risk.

## Imports

In [None]:
include("./pmfAndCdf.jl")
include("./simplestat.jl")

In [None]:
import CSV as Csv
import DataFrames as Dfs
import KernelDensity as Kde
import CairoMakie as Cmk # used by the plotting code at the end of this section

In [None]:
Num = Union{Int,Float64} # custom type

## The Price is Right Problem

*The Price is Right* is a game show in which contestants guess the price of a collection of prizes (a showcase).
The contestant who comes closest to the actual price, without going over, wins the prizes.

In one episode, two contestants (N and L) were shown the following showcases:
- N's showcase: dishwasher, wine cabinet, laptop, car.
- L's showcase: pinball machine, video arcade game, pool table, cruise of the Bahamas.

Bids:
- N: $26'000 (real price: $25'347, diff: $653)
- L: $21'500 (real price: $21'578, diff: $78)

N overbid by $653, so he lost. L underbid by only $78, so she won her showcase and, because her bid was off by less than $250, also N's showcase.
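
The rules above can be sketched in a few lines of Julia. This is only an illustration, not code from the book: the helper names (`price_diff`, `showcase_winner`) are made up here, the tie-breaking choice is arbitrary, and the $250 threshold for winning both showcases is an assumption taken from the show's rules.

```julia
# Difference between the actual price and the bid.
# A negative value means the contestant overbid.
price_diff(bid::Int, price::Int)::Int = price - bid

# Who wins: the contestant closest to the price without going over.
# Ties go to L here (arbitrary choice made for this sketch).
function showcase_winner(diff_n::Int, diff_l::Int)::Symbol
    n_valid = diff_n >= 0
    l_valid = diff_l >= 0
    if n_valid && l_valid
        return diff_n < diff_l ? :N : :L
    elseif n_valid
        return :N
    elseif l_valid
        return :L
    else
        return :nobody # both contestants overbid
    end
end

# Bids and prices from the episode described above.
diff_n = price_diff(26_000, 25_347)
diff_l = price_diff(21_500, 21_578)
winner = showcase_winner(diff_n, diff_l)

# Assumed rule: a winning bid that is off by at most $250 wins both showcases.
l_wins_both = winner == :L && diff_l <= 250
```

With the numbers above, `diff_n` is negative (an overbid), so only L's bid is valid.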

This scenario raises several questions for a Bayesian thinker:

1. Before seeing the prizes, what prior beliefs should the contestants have about the price of the showcase?
2. After seeing the prizes, how should the contestants update those beliefs?
3. Based on the posterior distribution, what should the contestants bid?

The problem is inspired by an example in Cameron Davidson-Pilon's [book](https://dataorigami.net/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/).

## The Prior

To choose the prior distribution, we can use the record of showcase prices from previous episodes.
The data files are in [the book repo](https://github.com/AllenDowney/ThinkBayes2/tree/master/data).

In [None]:
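function read_data(filename::String)::Dfs.DataFrame
    # the file has no header row; start reading at the 4th row
    df = Csv.read(filename, Dfs.DataFrame; header=false, skipto=4)
    df = Dfs.dropmissing(df)
    # turn rows into columns; values from column 1 become the new column names
    df = Dfs.permutedims(df, 1)
    # drop the first column (it holds the old column names)
    df[!, 2:end]
end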

In [None]:
df2011 = read_data("./showcases2011.csv")
df2012 = read_data("./showcases2012.csv")
df = vcat(df2011, df2012)
first(df, 3)

The first two columns, `Showcase 1` and `Showcase 2`, are the values of the
showcases in dollars. The next two columns are the bids the contestants made.
The last two columns are the differences between the actual values and the bids.

## Kernel Density Estimation

We can use this sample to estimate the prior distribution of showcase prices. One way to do that is [kernel density estimation](https://mathisonian.github.io/kde/) (KDE), which uses the sample to estimate a smooth distribution.

More info on the [KDE library for Julia](https://github.com/JuliaStats/KernelDensity.jl) used here.
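
To see what KDE does under the hood, here is a minimal hand-written sketch that uses only the standard library. It is not how `KernelDensity.jl` is implemented. The function names, the toy sample (made-up prices), and the normal-reference rule of thumb for the bandwidth are all assumptions of this example. Each observation contributes a small Gaussian bump, the bumps are averaged, and the values on the grid are normalized so they sum to 1.

```julia
import Statistics as Stats

# Standard normal density, used here as the kernel.
gaussian_kernel(u::Float64)::Float64 = exp(-0.5 * u^2) / sqrt(2 * pi)

# Normal-reference rule of thumb for the bandwidth (assumption of this sketch).
function rule_of_thumb_bandwidth(sample::Vector{Float64})::Float64
    return 1.06 * Stats.std(sample) * length(sample)^(-1 / 5)
end

# Density estimate at x: the average of kernels centred on each observation,
# scaled by the bandwidth h.
function kde_at(x::Float64, sample::Vector{Float64}, h::Float64)::Float64
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (length(sample) * h)
end

# Evaluate the estimate on a grid and normalize, so the result can serve as a PMF.
function kde_pmf(sample::Vector{Float64}, grid::Vector{Float64})::Vector{Float64}
    h = rule_of_thumb_bandwidth(sample)
    ps = [kde_at(q, sample, h) for q in grid]
    return ps ./ sum(ps)
end

# Made-up showcase prices, for illustration only.
toy_sample = [19_000.0, 23_000.0, 25_000.0, 28_000.0, 31_000.0, 35_000.0, 42_000.0]
toy_qs = collect(range(0, 80_000, length=81))
toy_ps = kde_pmf(toy_sample, toy_qs)
```

The resulting `toy_ps` plays the same role as the probabilities stored in the priors built in the cells that follow, except that here the normalization is done explicitly.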

In [None]:
function getKDEfromSample(sample::Vector{A}, qs::Vector{B}) where {A<:Num, B<:Num}
    # the optional keyword argument `kernel` defaults to a normal (Gaussian) distribution
    gaussianKde::Kde.UnivariateKDE = Kde.kde(sample)
    # evaluate the estimated density at each quantity in qs
    ps::Vector{Float64} = Kde.pdf(gaussianKde, qs)
    pmf::Pmf{B} = Pmf(qs, ps)
    return pmf
end

In [None]:
qs = range(0, 80000, 81) |> collect
prior1 = getKDEfromSample(df[!, "Showcase 1"], qs)
prior2 = getKDEfromSample(df[!, "Showcase 2"], qs)

In [None]:
fig = Cmk.Figure()
ax1, l1 = Cmk.lines(
    fig[1, 1],
    prior1.names, prior1.priors,
    color=:blue, linewidth=3,
    axis=(;
        title="Prior distribution of showcase value",
        xlabel="Showcase value in \$", ylabel="PMF",
        xticks=(0:10000:80000, map(x -> string(x, "k"), 0:10:80)),
        yticks=0:0.01:0.06,
    )
)
l2 = Cmk.lines!(
    fig[1, 1],
    prior2.names, prior2.priors,
    color=:orange, linewidth=3,
)
Cmk.axislegend(
    ax1,
    [l1, l2],
    ["Showcase 1", "Showcase 2"]
)
fig