**Tutorial 8 - Markov Chain Monte Carlo**

In this tutorial we will learn how to construct a basic Metropolis-Hastings Markov Chain sampler and apply it to supernova data.

We will use the same supernova data as before, but this time we will take into account the true, nonlinear relationship between redshift and distance modulus.  In practice it would be better to solve this problem with a nonlinear $\chi^2$ fit because there are few parameters, but we will do it by MCMC as an exercise.

 1) Make a Metropolis-Hastings stepping function.

 The function should take the following inputs: 
 p - a numpy vector giving the current position in parameter space
 
 loglike - the value of the log of the likelihood 
           evaluated at the current position
           
 loglike_func() - a function that returns the log of the 
            likelihood given a position in parameter space
            
 proposal_func() - a proposal function that takes a position in 
                  parameter space and returns another point

 The function should return the updated position, the updated 
 loglike at that position and a Boolean that is True if the proposed step was 
 accepted and False if it was not. 
 You can assume the proposal function is symmetric, 
i.e. $q(x | y) = q(y | x)$

Call the function MH_step()

In [1]:

def MH_step(p,loglike,loglike_func,proposal_func):
    '''
    Metropolis-Hastings Monte Carlo Step
    '''
    # ...
    # ... (your code goes here)
    return p,loglike,False
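
For a symmetric proposal the acceptance probability reduces to $\min(1, L_{new}/L_{old})$, which is most safely evaluated using the difference of the log likelihoods. The block below is one possible sketch of such a step, not the reference solution. The optional `rng` argument is a hypothetical addition that is not part of the signature requested above; it is there only so that runs can be made reproducible.

```python
import numpy as np

def MH_step(p, loglike, loglike_func, proposal_func, rng=None):
    """One Metropolis step, assuming a symmetric proposal (sketch)."""
    # rng is an assumed extra argument, not part of the exercise signature
    if rng is None:
        rng = np.random.default_rng()

    p_new = proposal_func(p)           # propose a new point
    loglike_new = loglike_func(p_new)  # log likelihood at the proposed point

    # acceptance probability min(1, L_new / L_old), computed from the logs
    accept_prob = np.exp(min(0.0, loglike_new - loglike))
    if rng.uniform() < accept_prob:
        return p_new, loglike_new, True
    return p, loglike, False
```

Working with the difference of logs avoids overflow or underflow when the likelihood values themselves are extremely small.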


2) Now make a *class* for a Gaussian proposal function called 
"gaussian_proposal_class".  The class should have a constructor that takes the standard deviation of the proposal in each dimension.  This is done by defining an \_\_init\_\_ method within it.  You should be able to call an instance of this class like a function, which is done by defining a \_\_call\_\_ method.

In [2]:
# Complete this code for a Gaussian proposal function class

#class gaussian_proposal_class :
    ## This part is the constructor and 
    ## sets the internal information in the object
    ## when the object is declared with
    ## "func = gaussian_proposal_class(sigma_vector)"
#    def __init__(self, sigma):
#        self.n = len(sigma)
#        self.s = sigma
#
    ##  This part defines what happens when 
    ##  you do "y = func(params)"
    ##  This should return a new point
#    def __call__(self,params):
#        return ________________
#
# Once this class is defined:
#
# example of creating an instance of this class
# gpf = gaussian_proposal_class(sigma)
#
# using it after it has been created
# result = gpf(params)
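
One way the class described above could be completed is sketched below. It assumes independent Gaussian steps in each dimension. The `rng` constructor argument is a hypothetical addition used only for reproducibility; the exercise asks only for `sigma`.

```python
import numpy as np

class gaussian_proposal_class:
    """Symmetric Gaussian proposal with a separate width per dimension (sketch)."""

    def __init__(self, sigma, rng=None):
        self.s = np.asarray(sigma, dtype=float)  # standard deviation per dimension
        self.n = len(self.s)
        # rng is an assumed extra argument, not required by the exercise
        self.rng = rng if rng is not None else np.random.default_rng()

    def __call__(self, params):
        # new point = current point + independent Gaussian offsets
        step = self.s * self.rng.standard_normal(self.n)
        return np.asarray(params, dtype=float) + step
```

Because the offset is drawn from a distribution centred on zero, the probability of stepping from x to y equals that of stepping from y to x, so the proposal is symmetric as the stepping function assumes.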

In [4]:
# 3) Make a Gaussian likelihood class that stores the data and errors and 
# returns the log likelihood when an instance is called like a function.

#class LogGaussianLikelihood :
#    def __init__(self,y_data,x_data,y_model,sigma):
#    
#         store the data, model and errors in the object
#         The function y_model(params,x_data) will return the 
#         predicted value for y to be compared to y_data for 
#         any input vectors params and x_data.  This does not 
#         need to be specified here.
#
#    def __call__(self,params):
#        This is the prior on Omega matter
#        if params[1] < 0 or params[1] > 1: return -1.0e100
#        
#        Use the stored data and model and the input parameters 
#        to calculate the log of the Gaussian likelihood and return 
#        its value.
#        
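
The outline above could be filled in along the following lines. This is a sketch under two assumptions: the errors are independent, and the normalization term of the Gaussian is dropped because it does not depend on the parameters when the errors are fixed.

```python
import numpy as np

class LogGaussianLikelihood:
    """Log of a Gaussian likelihood with independent errors (sketch)."""

    def __init__(self, y_data, x_data, y_model, sigma):
        # store the data, the model function and the errors
        self.y = np.asarray(y_data, dtype=float)
        self.x = np.asarray(x_data, dtype=float)
        self.model = y_model  # called as y_model(params, x_data)
        self.sigma = np.asarray(sigma, dtype=float)

    def __call__(self, params):
        # flat prior on Omega matter: effectively zero likelihood outside [0, 1]
        if params[1] < 0 or params[1] > 1:
            return -1.0e100
        # normalized residuals between data and model prediction
        resid = (self.y - self.model(params, self.x)) / self.sigma
        # constant normalization term omitted (assumption stated above)
        return -0.5 * np.sum(resid ** 2)
```

Only differences of the log likelihood enter the Metropolis acceptance test, so the omitted constant has no effect on the chain.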

4) Write a function that returns the distance modulus, with 
signature def mu_model(p,z), where the parameters p are:

p[0] is the absolute magnitude normalization 

p[1] is omega_matter.

z is the redshift.

Use the library function 
astropy.cosmology.FlatLambdaCDM.luminosity_distance(z).value 
to calculate the luminosity distance.  The luminosity distance is a nonlinear function of omega_matter.


In [6]:
#from astropy.cosmology import FRW
import astropy.cosmology as cosmo

def mu_model(p,z):
    # flat Lambda-CDM cosmology with H0 = 70 km/s/Mpc and Omega_matter = p[1]
    cos = cosmo.FlatLambdaCDM(70,p[1])
    return ...
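
For reference, the standard relations behind this function are written out below. The distance modulus is defined from the luminosity distance $D_L$, which astropy returns in Mpc, and for a flat universe with matter and a cosmological constant (neglecting radiation) $D_L$ is an integral over redshift:

$$
\mu = 5\log_{10}\left(\frac{D_L}{10\,\mathrm{pc}}\right) = 5\log_{10}\left(\frac{D_L}{\mathrm{Mpc}}\right) + 25,
\qquad
D_L(z) = (1+z)\,\frac{c}{H_0}\int_0^z \frac{dz'}{\sqrt{\Omega_m (1+z')^3 + 1 - \Omega_m}}
$$

How p[0] enters is not spelled out above. One plausible reading, which is an assumption here, is that it acts as an additive offset, so that the model is p[0] plus the expression for $\mu$.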

5) Read in the supernova data from SCPUnion2.1_mu_vs_z.txt and plot it.

 6) Make an instance of LogGaussianLikelihood with the data.  Call it loglike_func

Make an instance of gaussian_proposal_class.

Set up the initial point p[] and its log likelihood 
 using  loglike_func(p)

Run an MCMC loop of 1000 or more steps and make a scatter plot of the chain.  Record the acceptance fraction.
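
The loop described above might be organized as in the following sketch. The helper name `run_chain` and its `step_func` argument are hypothetical. The stepping function is passed in so that the block stands on its own, and it is assumed to have the same call signature and return values as MH_step().

```python
import numpy as np

def run_chain(p0, loglike_func, proposal_func, step_func, n_steps=1000):
    """Run a chain and return (chain, acceptance fraction). Hypothetical helper."""
    p = np.asarray(p0, dtype=float)
    loglike = loglike_func(p)            # log likelihood at the starting point
    chain = np.empty((n_steps, len(p)))  # one row per step
    n_accepted = 0

    for i in range(n_steps):
        # step_func is assumed to behave like MH_step()
        p, loglike, accepted = step_func(p, loglike, loglike_func, proposal_func)
        chain[i] = p  # a rejected step repeats the current point
        n_accepted += int(accepted)

    return chain, n_accepted / n_steps
```

The columns chain[:,0] and chain[:,1] can then be passed to a scatter plot. Recording the current point again after a rejected step is essential, since the repeats are what weight the chain correctly.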


7) Make histograms of the two parameters.

8) Use plt.hist2d() to make a 2 dimensional histogram of the chain with labels.

 9) I have written a function with just a few lines missing that returns the cross-correlation function between two vectors with lag m.
 
 Use this function to estimate the correlation length 
 of your chain. Plot the auto-correlation function for 
 lags of zero to a few hundred.
 
 What is the correlation length of your chain?

In [51]:
import numpy as np

def corrfunction(x,y):
    '''
    This function calculates the correlation coefficient 
    as a function of lag between the vectors x and y
    '''
    xc = x - np.mean(x)
    yc = y - ...
    
    N = len(x)
    out = np.empty(N-2)
    stdx = np.std(xc)
    stdy = ...

    for i in range(N-2) :
        xt = xc[0:N-i]
        yt = yc[i:N]
        if(stdx == 0 or stdy == 0):  ## this can happen for last elements
            out[i] = 0
        else :
            out[i] = np.mean(xt*yt)

    out /= ....
    return out
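
Once the auto-correlation function has been computed, a correlation length can be read off from it. There is no single definition; the sketch below assumes one common convention, the first lag at which the auto-correlation falls below 1/e. The function name `correlation_length` is hypothetical.

```python
import numpy as np

def correlation_length(acf, threshold=np.exp(-1.0)):
    """First lag at which acf drops below threshold, or None (hypothetical helper)."""
    acf = np.asarray(acf, dtype=float)
    below = np.nonzero(acf < threshold)[0]  # indices of lags under the threshold
    if below.size == 0:
        return None  # the function never decays that far in the range given
    return int(below[0])
```

For example, correlation_length(corrfunction(x, x)) would give the estimate for one parameter of the chain, where x is the corresponding column.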

10) Calculate the mean, variance and normalized covariance of the parameters.  For a flat universe, $\Omega_m + \Omega_\Lambda = 1$ where $\Omega_\Lambda$ is the density of the cosmological constant.  What are the mean value and "1 sigma" error bars on $\Omega_\Lambda$?
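
These summary statistics can be computed directly from the chain, as in the sketch below. It assumes the chain is stored as an array with one row per step and with omega_matter in column 1, and the function name `summarize_chain` is hypothetical. Since $\Omega_\Lambda = 1 - \Omega_m$ is a linear transformation, its mean is one minus the mean of $\Omega_m$ and its standard deviation is the same as that of $\Omega_m$.

```python
import numpy as np

def summarize_chain(chain):
    """Mean, variance and correlation matrix of a chain (hypothetical helper)."""
    chain = np.asarray(chain, dtype=float)
    mean = chain.mean(axis=0)
    cov = np.cov(chain, rowvar=False)        # columns are the parameters
    var = np.diag(cov)
    corr = np.corrcoef(chain, rowvar=False)  # normalized covariance

    # Omega_Lambda = 1 - Omega_m, assuming Omega_m is stored in column 1
    omega_lambda_mean = 1.0 - mean[1]
    omega_lambda_sigma = np.sqrt(var[1])
    return mean, var, corr, omega_lambda_mean, omega_lambda_sigma
```

Because successive points in the chain are correlated, these variances describe the width of the posterior distribution, not the uncertainty on the estimated mean.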