# Week 3

Week three took us deeper into frequentist statistical methods and introduced us to Bayesian methods. The basic difference we explored is that frequentist methods quantify the probability of the data given specific parameters, while Bayesian methods quantify the probability of the parameters given the data. With frequentist methods you start with a hypothesis, gather data, and compare the data to that hypothesis. With Bayesian methods you do not have to commit to a single hypothesis up front: you start with a prior over the possible parameter values and let the data update it. Bayesian statistics therefore lets us make probabilistic statements about our model parameters, while frequentist methods only let us make probabilistic statements about our data given some parameters.

In layman's terms, a frequentist needs a hypothesis in order to have something to compare the data to. A Bayesian can start without a fixed hypothesis and just see where the data leads.
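
To make the two directions of conditioning concrete, here is a minimal sketch with made-up numbers: a coin flipped 10 times that lands heads 8 times, and only two candidate biases (0.5 and 0.8) with equal prior weight. The function names, the data, and the prior are illustrative assumptions, not part of the course material.

```python
from math import comb


def binomial_likelihood(heads, flips, p):
    """P(data | p): probability of `heads` heads in `flips` flips of a coin with bias p."""
    return comb(flips, heads) * p ** heads * (1 - p) ** (flips - heads)


def posterior(prior, heads, flips):
    """P(p | data) for each candidate p, via Bayes' rule on a discrete prior."""
    unnormalized = {p: w * binomial_likelihood(heads, flips, p) for p, w in prior.items()}
    evidence = sum(unnormalized.values())  # P(data), the normalizing constant
    return {p: u / evidence for p, u in unnormalized.items()}


# Hypothetical data: 8 heads in 10 flips
heads, flips = 8, 10

# Frequentist-style question: how probable is this data if the coin is fair?
likelihood_fair = binomial_likelihood(heads, flips, 0.5)

# Bayesian-style question: given this data, how probable is each candidate bias?
prior = {0.5: 0.5, 0.8: 0.5}  # assumed equal prior weight
post = posterior(prior, heads, flips)

print(f"P(data | p=0.5) = {likelihood_fair:.4f}")
for p, prob in post.items():
    print(f"P(p={p} | data) = {prob:.3f}")
```

The first number is a statement about the data under a fixed parameter; the second set of numbers is a statement about the parameter itself, which is only available once a prior has been chosen.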

Topics covered this week include:  

* Power analysis
* Bayesian updating
 * Beta distribution
 * Conjugate prior distributions
* Multi-armed Bandit testing
* Linear systems of equations as matrix multiplication
* Markov Chains
* Exploratory data analysis
* Linear Regression
 * Loss functions
 * Residual plots
 * StatsModels
 * Linearity
 * Homoscedasticity
 * Normality
 * Multicollinearity
* Cross Validation and Regularization
 * Train-test split
 * K-fold cross validation
 * Leave-one-out cross validation
* Regularized regression
 * Ridge
 * Lasso
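
As one illustration of the Beta distribution and conjugate prior topics listed above, here is a minimal sketch of a Beta-Binomial update. With a Beta(a, b) prior on a coin's heads probability, observing h heads and t tails gives a Beta(a + h, b + t) posterior, so the update reduces to addition. The function names and the flip counts are hypothetical.

```python
def beta_update(a, b, heads, tails):
    """Return the posterior Beta parameters after observing the given flip counts."""
    return a + heads, b + tails


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Uniform prior Beta(1, 1), then hypothetical data: 7 heads, 3 tails
a, b = 1, 1
a_post, b_post = beta_update(a, b, heads=7, tails=3)

# Updating in two batches lands on the same posterior as one batch
a_mid, b_mid = beta_update(a, b, heads=4, tails=1)
a_two, b_two = beta_update(a_mid, b_mid, heads=3, tails=2)

print(a_post, b_post)
print(beta_mean(a_post, b_post))
print((a_two, b_two) == (a_post, b_post))
```

Because the posterior stays in the Beta family, it can serve directly as the prior for the next batch of data.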
 

The notebook *Regression* has some worked examples.

**Here is some code we wrote:**

```python
import matplotlib.pyplot as plt


class Bayes(object):
    '''
    Args:
        prior (dict): each key is a possible parameter value (e.g. 4-sided die),
                      each value is the associated probability of that parameter value

        likelihood_func (function): takes a new piece of data and a parameter value,
                                    outputs the likelihood of getting that data given
                                    that value of the parameter
    '''
    def __init__(self, prior, likelihood_func):
        self.prior = prior
        self.likelihood_func = likelihood_func


    def normalize(self):
        '''
        Makes the sum of the probabilities in self.prior equal 1.

        Args: None

        Returns: None

        '''
        denom = sum(self.prior.values())
        for k, v in self.prior.items():
            self.prior[k] = v / denom
    
    def update(self, data):
        '''
        Conduct a Bayesian update. For each possible parameter value
        in self.prior, multiply the prior probability by the likelihood
        of the data and make this the new prior. The result is left
        unnormalized; call normalize() afterwards.

        Args:
            data: A single observation (data point), e.g. a die roll
                  or a coin flip

        Returns: None
        '''
        for k, v in self.prior.items():
            # Unnormalized posterior: prior * likelihood of the data given k
            self.prior[k] = v * self.likelihood_func(data, k)

    def print_distribution(self):
        '''
        Print the current posterior probability.
        '''
        print(self.prior)
    
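    def plot(self, color=None, title=None, label=None):
        '''
        Plot the current prior as probability against parameter value.
        color, title, and label are optional matplotlib styling arguments.
        '''
        # Materialize the dict views as lists so matplotlib receives sequences
        x = list(self.prior.keys())
        y = list(self.prior.values())
        plt.scatter(x, y, color=color, label=label)
        if title is not None:
            plt.title(title)
        if label is not None:
            plt.legend()
        plt.show()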

    def subplot(self):
        '''
        Creates a 4x2 grid of subplots of specific pmfs for a biased coin.
        For each sequence of outcomes (flips), plots the probability
        that the coin's bias is any given amount between 0.0 and 0.99.
        '''
        flips = ['H', 'T', 'HH', 'TH', 'HHH', 'THT', 'HHHH', 'THTH']
        fig, ax = plt.subplots(nrows=4, ncols=2, figsize=(8, 10))
        for n, sequence in enumerate(flips):
            # Fill the grid left to right, top to bottom
            row, col = divmod(n, 2)
            # Start each panel from a fresh copy of the current prior
            coin = Bayes(self.prior.copy(), self.likelihood_func)
            for item in sequence:
                coin.update(item)
                coin.normalize()
            x = list(coin.prior.keys())
            y = list(coin.prior.values())
            ax[row, col].plot(x, y)
            ax[row, col].set_title(sequence)
        plt.tight_layout()
        plt.show()


```
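
The class expects a `likelihood_func(data, parameter)` callable, but none is shown above. Below is a sketch of two likelihood functions that would fit that signature: one for the dice case hinted at in the class docstring and one for the biased coin used in `subplot`. The names `die_likelihood`, `coin_likelihood`, and `sequential_update` are assumptions for illustration, as are the priors and observations. `sequential_update` repeats the multiply-then-normalize steps of `update` and `normalize` on a plain dict so that the block runs on its own.

```python
def die_likelihood(roll, sides):
    """P(roll | fair die with `sides` faces): 1/sides if the roll is possible, else 0."""
    return 1 / sides if 1 <= roll <= sides else 0.0


def coin_likelihood(flip, p_heads):
    """P(flip | coin with heads probability p_heads); flip is 'H' or 'T'."""
    return p_heads if flip == 'H' else 1 - p_heads


def sequential_update(prior, likelihood_func, observations):
    """Apply prior * likelihood, then normalize, once per observation."""
    posterior = dict(prior)
    for data in observations:
        for k, v in posterior.items():
            posterior[k] = v * likelihood_func(data, k)
        total = sum(posterior.values())
        posterior = {k: v / total for k, v in posterior.items()}
    return posterior


# Dice: uniform prior over 4-, 6-, 8-, 12- and 20-sided dice, then a roll of 8
dice_prior = {sides: 1 / 5 for sides in (4, 6, 8, 12, 20)}
dice_post = sequential_update(dice_prior, die_likelihood, [8])

# Coin: uniform prior over biases 0.00, 0.01, ..., 0.99, then flips H, H, T
coin_prior = {i / 100: 1 / 100 for i in range(100)}
coin_post = sequential_update(coin_prior, coin_likelihood, 'HHT')

print(dice_post)
print(max(coin_post, key=coin_post.get))  # most probable bias on the grid
```

With the class above, the same functions would be passed in as `Bayes(dice_prior, die_likelihood)`, followed by `update(8)` and `normalize()`.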