In [10]:
import os
import sys
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)
    
from thinkbayes import Pmf, Suite, Percentile, CredibleInterval, EstimatedPdf
from thinkplot import Pmf as Plot_Pmf 
from thinkplot import Show as Plot_Show 

## Chap 6 Decision Analysis

### Representing PDFs

**PDFs** are defined over a continuous range of values, whereas PMFs are defined over a discrete set. In a PMF, each value maps directly to its probability.

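**PDFs** are written as functions, i.e. f(x), where for a given value of x the function computes the probability density. A density is not itself a probability; probabilities come from integrating the density over a range of values.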
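To make the difference between a density and a probability concrete, here is a minimal sketch using `scipy.stats.norm`. The mean and standard deviation are arbitrary values chosen for illustration and do not come from the chapter.

```python
from scipy.stats import norm

# A narrow Gaussian: mean 0, standard deviation 0.1 (illustrative values only).
dist = norm(loc=0, scale=0.1)

# The density at the mean is greater than 1, which is fine for a density
# but would be impossible for a probability.
density_at_mean = dist.pdf(0)

# Probability of landing within one standard deviation of the mean,
# computed as a difference of CDF values (the integral of the density).
prob_within_one_sd = dist.cdf(0.1) - dist.cdf(-0.1)

print(density_at_mean, prob_within_one_sd)
```

The first number is a density (roughly 4), the second a probability (roughly 0.68).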

**Pdf** is an abstract type: it defines an interface but not a complete implementation. The Pdf interface contains two methods, `Density` and `MakePmf`.

In [11]:
class Pdf(object):
    def Density(self,x):
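        # thinkbayes.py raises its own UnimplementedMethodException here, which is not
        # imported in this notebook; the built-in NotImplementedError serves the same purpose.
        raise NotImplementedError('Density must be provided by a child class')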
        
    def MakePmf(self,xs):
        pmf = Pmf()
        for x in xs:
            pmf.Set(x, self.Density(x))
        pmf.Normalize()
        return pmf

`Density` takes a value, `x`, and returns the corresponding density. `MakePmf` makes a discrete approximation to the PDF.
`Pdf` provides an implementation of `MakePmf`, but not `Density`, which has to be provided by a child class.
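The normalization step inside `MakePmf` can be sketched without `thinkbayes` at all. The version below is a stand-in, not the library's code: `make_pmf` and `exp_density` are hypothetical names, and a plain dict plays the role of `Pmf`.

```python
import math

def make_pmf(density, xs):
    """Evaluate `density` at each x and normalize so the values sum to 1."""
    weights = {x: density(x) for x in xs}
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}

# Hypothetical density for illustration: exponential with rate 1, f(x) = exp(-x).
def exp_density(x):
    return math.exp(-x)

xs = [0.0, 0.5, 1.0, 1.5, 2.0]
pmf = make_pmf(exp_density, xs)

print(sum(pmf.values()))    # 1.0, up to floating-point rounding
print(pmf[0.0] / pmf[1.0])  # ratios of densities are preserved: e, about 2.718
```

Normalizing changes the scale of the values but not their ratios, which is why a set of densities can be turned into a usable discrete distribution this way.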

A **concrete type** is a child class that extends an abstract type and provides an implementation of the missing methods. For example, `GaussianPdf` extends `Pdf` and provides `Density`.

In [12]:
import scipy.stats

# Provide an implementation of Density to create a concrete type. GaussianPdf is a child of class Pdf.
class GaussianPdf(Pdf):
    
    def __init__(self, mu, sigma):
        self.mu = mu
        self.sigma = sigma
    
    def Density(self, x):
        return scipy.stats.norm.pdf(x, self.mu, self.sigma) # This is the line not included in thinkbayes.py

`__init__` takes `mu` and `sigma` (the mean and standard deviation of the distribution) and stores them as attributes.

`Density` uses a function from `scipy.stats` to evaluate the Gaussian PDF. The function is called `norm` because the Gaussian distribution is also called the "normal" distribution.

The Gaussian PDF is defined by a simple mathematical function, so it is easy to evaluate. It is also useful, because many quantities in the real world have distributions that are approximately Gaussian.
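As a rough check of how well a discrete approximation tracks a Gaussian, the sketch below evaluates the density on an evenly spaced grid and normalizes it, in the same spirit as `MakePmf`. The values of `mu` and `sigma` and the grid size are assumptions made for illustration, and numpy arrays are used in place of a `Pmf` object.

```python
import numpy
import scipy.stats

mu, sigma = 100, 15  # illustrative values, not from the source

# Evaluate the Gaussian density on an evenly spaced grid covering mu +/- 4 sigma.
xs = numpy.linspace(mu - 4 * sigma, mu + 4 * sigma, 201)
densities = scipy.stats.norm.pdf(xs, mu, sigma)

# Normalize so the values sum to 1, as MakePmf does.
probs = densities / densities.sum()

# The mean of the discrete approximation should sit very close to mu.
approx_mean = (xs * probs).sum()
print(approx_mean)
```

Because the grid is symmetric around `mu`, the mean of the approximation matches `mu` up to floating-point error.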

With real data, there is no guarantee that the distribution is Gaussian. In that case, we can use a sample to estimate the PDF of the whole population.

In *The Price is Right* data we have 313 prices for the first showcase. We can think of these values as a sample from the population of all possible showcase prices.

**Kernel Density Estimation (KDE)** is an algorithm that takes a sample and finds an appropriately smooth PDF that fits the data. 
`thinkbayes.py` provides a class called `EstimatedPdf` that uses it: 

In [13]:
import scipy.stats

class EstimatedPdf(Pdf):
    
    def __init__(self, sample):
        # Store the kernel density estimate as an attribute so Density can use it.
        self.kde = scipy.stats.gaussian_kde(sample)
        
    def Density(self, x):
        return self.kde.evaluate(x)

`__init__` takes a sample and computes a kernel density estimate. The result is a `gaussian_kde` object that provides an `evaluate` method.

`Density` takes a value, calls `gaussian_kde.evaluate`, and returns the resulting density.
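The behaviour of `gaussian_kde` can be explored on its own. The sketch below uses a synthetic sample drawn from a normal distribution; it is a stand-in and not the showcase data, and the seed, location, and scale are arbitrary.

```python
import numpy
import scipy.stats

# Synthetic sample standing in for real data (NOT the showcase prices).
rng = numpy.random.default_rng(0)
sample = rng.normal(loc=30000, scale=8000, size=300)

kde = scipy.stats.gaussian_kde(sample)

# evaluate accepts a scalar or an array and returns an array of densities.
single = kde.evaluate(30000)
grid = numpy.linspace(0, 75000, 501)
densities = kde.evaluate(grid)

# A density is non-negative and should integrate to roughly 1 over a wide range.
# A simple Riemann sum is used here as the integral approximation.
step = grid[1] - grid[0]
area = densities.sum() * step
print(single.shape, area)
```

Note that `evaluate` returns an array even for a single input, so `Density` as written above returns a one-element array, not a plain float.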

Figure 6.1:

<img src="thinkbayesprice.png">

Here is an outline of the code used to generate Figure 6.1:

In [15]:
# Don't run this cell!
import numpy

prices = ReadData()

# Build a smoothed KDE estimate of the distribution from the price data.
pdf = EstimatedPdf(prices)

low, high = 0, 75000
n = 101
# linspace stands for "linear space". It takes a range (low and high) and the number of points, n,
# and returns a new numpy array with n elements equally spaced between low and high, endpoints included.
xs = numpy.linspace(low, high, n)


The discrete values in the array `xs` are passed to the `MakePmf` method of the `pdf` object (which was created from a sample using KDE) to make a Pmf:

In [None]:
pmf = pdf.MakePmf(xs)

`pdf` is a Pdf object, estimated by KDE. `pmf` is a Pmf object that approximates the Pdf by evaluating the density at a sequence of equally spaced values.
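The whole pipeline (sample, KDE, grid, normalized Pmf) can be sketched end to end without `thinkbayes` or the showcase file. Everything below is an assumption made for illustration: the prices are synthetic, and a dict stands in for the `Pmf` object.

```python
import numpy
import scipy.stats

# Synthetic stand-in for the showcase prices (assumed values, for illustration only).
rng = numpy.random.default_rng(1)
prices = rng.normal(loc=30000, scale=8000, size=313)

kde = scipy.stats.gaussian_kde(prices)

low, high, n = 0, 75000, 101
xs = numpy.linspace(low, high, n)  # 101 points, so the spacing is 750

# Evaluate the density on the grid and normalize, mirroring MakePmf.
densities = kde.evaluate(xs)
probs = densities / densities.sum()
pmf = dict(zip(xs, probs))

print(xs[1] - xs[0], sum(pmf.values()))
```

Choosing `n = 101` over the range 0 to 75000 gives round grid values (0, 750, 1500, ...), which makes the resulting Pmf easier to read and plot.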