In [8]:
import os
import sys
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)
    
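# Import the module itself (used below as thinkbayes.Suite) and GaussianPdf (used by Player)
import thinkbayes
from thinkbayes import Pmf, Suite, Percentile, CredibleInterval, EstimatedPdf, MakeCdfFromList, GaussianPdf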
import numpy as np

## Chap 6: Decision Analysis

How to decide on the price of a showcase?
Bayesian thinking towards an answer:    
1) Prior beliefs on what the showcase prices could be: Analyse previous prices on the show.    
2) Likelihood/Update: Seeing the prizes, how should you update? i.e. How to interpret the data?    
3) Results from Update on the Prior: the Posterior. How to choose from the posterior distribution?    

All of these steps require subjective decisions. 

**Modeling the contestants**   
If you were a contestant on the show you could use this distribution (fig 6.1) to quantify your prior belief about the price of each showcase (before you even see the prizes). 

<img src="thinkbayesprice.png">

We want to model the contestant as a price-guessing instrument with known error characteristics. 
Under this model we have to answer this question: "If the actual price is `price`, what is the likelihood that the contestant's estimate would be `guess`?"

We define `error` as `error = price - guess`

and can then ask: "What is the likelihood that the contestant's estimate is off by `error`?"

To answer this we use historical data and the cumulative distribution of `diff` computed from these data:

`diff = price - bid`

When `diff` is negative, the bid is too high. We use this distribution to estimate the reliability of the contestants' guesses. We make some assumptions: 

- the distribution of `error` is Gaussian with mean 0
- it has the same variance as `diff`
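
As a minimal sketch of these two assumptions, using only the Python standard library rather than `thinkbayes`: the prices and bids below are made-up illustrative numbers, not data from the show, and `statistics.NormalDist` is an assumed stand-in for the Gaussian error model.

```python
from statistics import NormalDist, pstdev

# Hypothetical showcase prices and bids (illustrative values only).
prices = [30000, 25000, 40000, 22000, 35000]
bids = [28000, 27000, 33000, 20000, 36000]

# diff = price - bid; a negative diff means the contestant overbid.
diffs = [p - b for p, b in zip(prices, bids)]
overbid_fraction = sum(d < 0 for d in diffs) / len(diffs)

# Assumption: error is Gaussian with mean 0 and the same spread as diff.
# pstdev is the population standard deviation, matching np.std's default.
sigma = pstdev(diffs)
error_dist = NormalDist(mu=0, sigma=sigma)

# Density of an error of 0 versus an error of $5000.
density_at_0 = error_dist.pdf(0)
density_at_5000 = error_dist.pdf(5000)
```

Under a zero-mean Gaussian, small errors get a higher density than large ones, so `density_at_0` exceeds `density_at_5000`.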

The `Player` class implements this model:

In [4]:
class Player(object):
    
    def __init__(self, prices, bids, diffs):
        self.pdf_price = EstimatedPdf(prices)
        self.cdf_diff = MakeCdfFromList(diffs)
        
        mu = 0
        sigma = np.std(diffs)
        self.pdf_error = GaussianPdf(mu, sigma)


`prices` = sequence of showcase prices   
`bids` = sequence of bids     
`diffs` = sequence of `diff` values, where `diff` = `price` - `bid`

Again, we use the variance of `diff` to estimate the variance of `error`. This estimate is not perfect because contestants' bids are sometimes strategic: in that case `diff` does not reflect `error`. 

An alternative would be for someone to estimate their own distribution of error by watching previous shows and recording their guesses and the actual prices.

## Likelihood

Now we are ready to write the likelihood function. As usual we define a new class that extends `thinkbayes.Suite`:

In [None]:
class Price(thinkbayes.Suite):
    
    def __init__(self, pmf, player):
        thinkbayes.Suite.__init__(self, pmf)
        self.player = player

`pmf` represents the prior distribution and `player` is a `Player` object, as described previously.

In [5]:
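# Method of Price, shown on its own here
def Likelihood(self, data, hypo):
    # hypo: hypothetical showcase price; data: the contestant's guess
    price = hypo
    guess = data
    
    # The likelihood of the guess is the density of the implied error
    error = price - guess
    like = self.player.ErrorDensity(error)
    
    return like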

`hypo` is the hypothetical price of the showcase. `data` is the contestant's best guess at the price. `error` is the difference, and `like` is the likelihood of the data, given the hypothesis.

`ErrorDensity` is defined in `Player`:

In [None]:
class Player:
    def ErrorDensity(self, error):
        return self.pdf_error.Density(error)

`ErrorDensity` works by evaluating `pdf_error` at the given value of `error`. The result is a probability density, so it is not really a probability. 

Remember that `Likelihood` doesn't need to compute a probability; it only has to compute something *proportional* to a probability. As long as the constant of proportionality is the same for all likelihoods, it gets cancelled out when we normalise the posterior distribution. Therefore a probability density is a perfectly good likelihood.
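
The cancellation can be checked numerically. This is a small sketch with the standard library only; the three candidate prices, the prior weights, the guess and the error spread are all assumed values chosen for illustration.

```python
from statistics import NormalDist

def normalize(weights):
    """Scale a list of non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical prior over three candidate prices (illustrative values).
hypos = [20000, 30000, 40000]
prior = [0.2, 0.5, 0.3]

guess = 25000
error_dist = NormalDist(mu=0, sigma=5000)  # assumed error model

# Likelihood of the guess under each hypothetical price, as a density.
likes = [error_dist.pdf(price - guess) for price in hypos]

posterior = normalize([p * l for p, l in zip(prior, likes)])

# Multiply every likelihood by the same arbitrary constant and renormalize.
scaled = normalize([p * (1000.0 * l) for p, l in zip(prior, likes)])
```

`posterior` and `scaled` agree up to floating-point rounding, which is why the units of the density do not matter.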

### Update

`Player` provides a method that takes the contestant's guess and computes the posterior distribution.

To Update, we have to answer these questions:

1) What data should we consider and how should we quantify it?     
2) Can we compute a likelihood function; i.e., for each hypothetical value of `price`, can we compute the conditional likelihood of the data?

In [6]:
# class Player
    def MakeBeliefs(self, guess):
        pmf = self.PmfPrice()
        self.prior = Price(pmf, self)
        self.posterior = self.prior.Copy()
        self.posterior.Update(guess)

# PmfPrice generates a discrete approximation to the PDF of price, which we use to construct the prior.
# PmfPrice uses MakePmf, which evaluates pdf_price at a sequence of values:

In [None]:
# class Player

    n = 101
    price_xs = np.linspace(0, 75000, n)
    
    def PmfPrice(self):
        return self.pdf_price.MakePmf(self.price_xs)

To construct the posterior, we make a copy of the prior and then invoke `Update`, which invokes `Likelihood` for each hypothesis, multiplies the priors by the likelihoods and renormalizes.
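
The discretize-then-update steps can be sketched without `thinkbayes`. The helpers `make_pmf`, `update` and `mean` are hypothetical names written for this illustration, not the library's API, and the Gaussian prior density and error spread are assumed stand-ins for `pdf_price` and `pdf_error`.

```python
from statistics import NormalDist

def make_pmf(density, xs):
    """Evaluate a density at each x and normalize into a discrete PMF."""
    weights = [density(x) for x in xs]
    total = sum(weights)
    return {x: w / total for x, w in zip(xs, weights)}

def update(prior, likelihood, data):
    """Multiply each prior probability by its likelihood, then renormalize."""
    unnorm = {hypo: p * likelihood(data, hypo) for hypo, p in prior.items()}
    total = sum(unnorm.values())
    return {hypo: w / total for hypo, w in unnorm.items()}

def mean(pmf):
    """Expected value of a PMF stored as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

# Same grid as price_xs: 101 evenly spaced prices from 0 to 75000.
n = 101
price_xs = [75000 * i / (n - 1) for i in range(n)]

# Assumed stand-ins: a Gaussian prior density and a Gaussian error model.
prior_density = NormalDist(mu=30000, sigma=10000).pdf
error_dist = NormalDist(mu=0, sigma=7000)

def likelihood(guess, price):
    """Density of the error implied by this guess and hypothetical price."""
    return error_dist.pdf(price - guess)

prior = make_pmf(prior_density, price_xs)
posterior = update(prior, likelihood, 20000)

prior_mean = mean(prior)
posterior_mean = mean(posterior)
```

With these assumed numbers the posterior mean falls between the guess and the prior mean, since the update pulls the prior toward the guess.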

Original scenario: you are Player 1 and when you see your showcase, your best guess is that the total price of the prizes is \$20000.

Fig 6.3 shows prior and posterior beliefs about the actual price. The posterior is shifted to the left because your guess is on the low end of the prior range. 

<img src="thinkbayesprice2.png">

We are treating the historical data as the prior and updating it based on your guess, but we could equivalently use your guess as a prior and update it based on historical data.

If you think of it that way, it is less surprising that the most likely value in the posterior is not your original guess.
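
This symmetry is easy to check on a grid, because the posterior is proportional to the product of the two factors and multiplication commutes. The sketch below uses assumed Gaussian stand-ins for both the historical price density and the error model; the numbers are illustrative only.

```python
from statistics import NormalDist

def normalize(weights):
    """Scale a list of non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

price_xs = [750 * i for i in range(101)]  # 0 to 75000
guess = 20000

# Assumed stand-ins for the historical price density and the error model.
historical = [NormalDist(30000, 10000).pdf(x) for x in price_xs]
from_guess = [NormalDist(0, 7000).pdf(x - guess) for x in price_xs]

# Historical data as prior, guess as likelihood.
a = normalize([p * l for p, l in zip(normalize(historical), from_guess)])
# Guess as prior, historical data as likelihood.
b = normalize([p * l for p, l in zip(normalize(from_guess), historical)])

# Most likely price on the grid under the posterior.
mode = price_xs[a.index(max(a))]
```

Both orderings give the same posterior up to rounding, and with these assumed values `mode` sits above the guess and below the peak of the historical density.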