# P09: Analyzing Monte Carlo Markov Chains 

In the last problem sheet, we wrote our own Metropolis-Hastings sampler and applied it to the SNLS SNe Ia likelihood. In this problem sheet, we will analyze the resulting Markov chains and, in particular, investigate their convergence.

To this end, we will sample the SNLS likelihood using the initial condition generator and proposal covariance matrix given below:

In [1]:
import numpy as np

In [2]:
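def initGen(init0 = np.array([0.2, 1.5, 2.0, 24.5]), sig = np.array([0.03, 0.1, 0.1, 0.02])):
    # Draw one independent Gaussian perturbation per parameter
    # (a bare randn() would shift all four parameters by the same random factor).
    return np.random.randn(len(init0)) * sig + init0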

In [3]:
proposal_cov = np.array([[ 0.003, -0.004, -0.005,  0.002],
                         [-0.004,  0.036,  0.011, -0.002],
                         [-0.005,  0.011,  0.061, -0.003],
                         [ 0.002, -0.002, -0.003,  0.002]])
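
As a reminder of how the proposal covariance enters the sampler, the sketch below draws a single Gaussian proposal around a starting point. It assumes a symmetric Gaussian proposal; the helper name `propose` and the use of a Cholesky factor are illustrative choices and are not prescribed by this problem sheet.

```python
import numpy as np

# Proposal covariance copied from the cell above.
proposal_cov = np.array([[ 0.003, -0.004, -0.005,  0.002],
                         [-0.004,  0.036,  0.011, -0.002],
                         [-0.005,  0.011,  0.061, -0.003],
                         [ 0.002, -0.002, -0.003,  0.002]])

def propose(current, cov, rng):
    """Hypothetical helper: return current + N(0, cov) using a Cholesky factor."""
    chol = np.linalg.cholesky(cov)  # cov = chol @ chol.T; needs cov positive definite
    return current + chol @ rng.standard_normal(len(current))

rng = np.random.default_rng(0)
start = np.array([0.2, 1.5, 2.0, 24.5])  # central values used by initGen
candidate = propose(start, proposal_cov, rng)
print(candidate)
```

In a real sampler the Cholesky factor would be computed once outside the loop, since the proposal covariance does not change between steps.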

## Problem 1: Visualization

Use your implementation of the Metropolis-Hastings sampler to generate 500 samples from the SNLS posterior, starting from an initial position drawn with the initial condition generator given above and using the proposal covariance `proposal_cov`.

(i) Plot the trace plots for all 4 parameters $\Omega_m, \alpha, \beta, M$ and estimate the burn-in.

(ii) Does the scatter of the burn-in-removed chains around their mean agree with your expectations? Why or why not?

(iii) Estimate the acceptance fraction of the burn-in-removed chain.

(iv) Plot the 2D contours for the burn-in-removed chains using [`corner`](https://corner.readthedocs.io/en/latest/) (or your contour plotting tool of choice).
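
For part (iii), one possible way to estimate the acceptance fraction directly from a stored chain is sketched below. It assumes that your sampler repeats the previous position whenever a proposal is rejected (the usual Metropolis-Hastings convention), so that every step at which the position changes corresponds to an accepted proposal. The function name `acceptance_fraction` and the toy chain are illustrative only.

```python
import numpy as np

def acceptance_fraction(chain):
    """Fraction of steps at which the chain moved to a new position.

    `chain` has shape (n_samples,) or (n_samples, n_params); a rejected
    proposal is assumed to show up as a repeated row.
    """
    chain = np.asarray(chain, dtype=float)
    if chain.ndim == 1:
        chain = chain[:, None]
    moved = np.any(chain[1:] != chain[:-1], axis=1)
    return moved.mean()

# Toy chain with 4 transitions, 2 of which change the position.
toy = np.array([[0.0, 1.0], [0.0, 1.0], [0.5, 1.2], [0.5, 1.2], [0.7, 1.1]])
print(acceptance_fraction(toy))  # 0.5
```

If your sampler already counts accepted proposals internally, that counter is the more direct estimate; just make sure it only covers the burn-in-removed part of the chain.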

## Problem 2: Autocorrelation function

(i) Create 1500 samples from the SNLS posterior, cut off the burn-in, and estimate the autocorrelation function $\rho_{ff}(T)$ for $T \leq 50$ for all 4 parameters $\Omega_m, \alpha, \beta, M$. Plot your results.

(ii) What do you observe for the autocorrelation function at large $T$? How do you interpret your results? What happens if you increase the length of the chain to 10,000 samples?

(iii) Estimate $\tau_{ff}$ by determining the best-fit value such that $\hat{\rho}_{ff}(T)=e^{-T/\tau_{ff}}$ up to a suitably chosen $T_{\mathrm{max}}$. Note that you can use linear regression if you reparametrize the problem accordingly.

(iv) Discuss your results.
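
The estimator in (i) and the linear-regression fit in (iii) can be sketched as follows. This is a minimal example under stated assumptions: the autocorrelation is normalized by the sample variance so that $\hat{\rho}_{ff}(0)=1$, and the fit regresses $\ln\hat{\rho}_{ff}(T)$ on $T$ through the origin, so that the slope is $-1/\tau_{ff}$. The function names `autocorr` and `fit_tau` are hypothetical, and the AR(1) toy process merely stands in for a real chain because its autocorrelation, $\rho(T)=\phi^T$, is known exactly.

```python
import numpy as np

def autocorr(x, max_lag):
    """Estimate the normalized autocorrelation rho(T) for T = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    dx = x - x.mean()
    var = np.mean(dx**2)
    n = len(x)
    return np.array([np.mean(dx[:n - lag] * dx[lag:]) / var
                     for lag in range(max_lag + 1)])

def fit_tau(rho, t_max):
    """Least-squares fit of log(rho(T)) = -T / tau for 1 <= T <= t_max."""
    lags = np.arange(1, t_max + 1)
    r = rho[1:t_max + 1]
    keep = r > 0  # the logarithm needs positive values
    # Regression through the origin: slope = sum(T * y) / sum(T**2).
    slope = np.sum(lags[keep] * np.log(r[keep])) / np.sum(lags[keep]**2)
    return -1.0 / slope

# Toy check on an AR(1) process, for which rho(T) = phi**T exactly.
rng = np.random.default_rng(0)
phi, n = 0.8, 100_000
noise = rng.standard_normal(n)
x = np.empty(n)
x[0] = noise[0]
for i in range(1, n):
    x[i] = phi * x[i - 1] + noise[i]

rho = autocorr(x, 50)
print(fit_tau(rho, 8), -1.0 / np.log(phi))  # fitted vs. analytic tau
```

The choice of `t_max` matters: at large lags the estimate is dominated by noise and may turn negative, which is why the fit above is restricted to small lags and positive values.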

## Problem 3: MCMC errors

(i) Use your estimate of $\tau_{ff}$ to estimate the errors on the mean of the posterior as estimated from the sample for each model parameter. 

(ii) Confirm your results by repeated sampling of the SNLS posterior.
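
One way to turn $\tau_{ff}$ into an error on the mean is sketched below. It assumes that the exponential model $\rho_{ff}(T)=e^{-T/\tau_{ff}}$ from Problem 2 holds at all lags and that the chain is much longer than $\tau_{ff}$, in which case summing the autocorrelation over all lags gives $\sigma_{\bar f}^2 \approx \frac{\sigma_f^2}{N}\left(1+\frac{2}{e^{1/\tau_{ff}}-1}\right)$, which approaches $2\tau_{ff}\,\sigma_f^2/N$ for $\tau_{ff}\gg 1$. Your lecture notes may use a different convention for $\tau_{ff}$, so check the prefactor. The names `mean_error` and `scatter_of_means` are hypothetical, and the AR(1) chains are a stand-in for repeated runs of your sampler.

```python
import numpy as np

def mean_error(x, tau):
    """Error on the sample mean of a correlated chain with rho(T) = exp(-T/tau)."""
    x = np.asarray(x, dtype=float)
    # Sum of exp(-|T|/tau) over all integer lags T (geometric series).
    corr_sum = 1.0 + 2.0 / np.expm1(1.0 / tau)
    return np.sqrt(x.var(ddof=1) * corr_sum / len(x))

def scatter_of_means(chains):
    """Empirical check: standard deviation of the means of independent chains."""
    means = np.array([np.mean(c) for c in chains])
    return means.std(ddof=1)

# Toy check with independent AR(1) chains (rho(T) = phi**T, tau = -1/ln(phi)).
rng = np.random.default_rng(0)
phi, n_chains, n = 0.8, 200, 2000
noise = rng.standard_normal((n_chains, n))
chains = np.zeros((n_chains, n))
for i in range(1, n):
    chains[:, i] = phi * chains[:, i - 1] + noise[:, i]

tau = -1.0 / np.log(phi)
print(mean_error(chains[0], tau), scatter_of_means(chains))
```

The two printed numbers should agree up to statistical noise; for part (ii), the same comparison can be made with independently started SNLS chains in place of the toy chains.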

## Problem 4: Gelman-Rubin convergence test

(i) Use the samples generated in Problem 3 to compute the Gelman-Rubin statistic $R$.

(ii) What does your result suggest about the chain convergence?

(iii) Do your results depend on the number of chains you consider?
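
A minimal sketch of the Gelman-Rubin statistic for a single parameter is given below. It assumes the common convention with $m$ chains of equal length $n$, within-chain variance $W$, between-chain variance $B$, and $R=\sqrt{\hat{V}/W}$ with $\hat{V}=\frac{n-1}{n}W+\frac{B}{n}$; some references quote $\hat{V}/W$ without the square root, so compare with the definition used in the lecture. The function name `gelman_rubin` and the Gaussian toy chains are illustrative only.

```python
import numpy as np

def gelman_rubin(chains):
    """Gelman-Rubin R for one parameter; `chains` has shape (m, n)."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    W = chains.var(axis=1, ddof=1).mean()    # mean within-chain variance
    B = n * chains.mean(axis=1).var(ddof=1)  # between-chain variance
    var_hat = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(0)
good = rng.standard_normal((4, 5000))  # chains sampling the same distribution
bad = good + np.arange(4)[:, None]     # chains stuck at different offsets
print(gelman_rubin(good), gelman_rubin(bad))
```

For the SNLS chains, the function would be applied separately to each of the four parameters, after removing the burn-in from every chain.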