# Model selection & comparison

_Alex Malz (LINCC@CMU)_
_LSSTC Data Science Fellowship Program_

In [None]:
from astropy import cosmology as apcosmo
import numpy as np
import scipy.stats as sps

In [None]:
import matplotlib.pyplot as plt

## Overview

Let's try to infer the cosmological parameters from redshifts and distances to Type Ia SNe.
This problem is adapted from [Supernova Cosmology Inference with Probabilistic Photometric Redshifts (SCIPPR)](https://github.com/aimalz/scippr), specifically the [forward model](https://github.com/aimalz/scippr/blob/master/code/demos/Simulation.ipynb) and [posterior inference](https://github.com/aimalz/scippr/blob/master/code/demos/Inference.ipynb) procedures.

### Data

__TODO__: fix formatting

In [None]:
# Planck values
prior_H0 = sps.norm(loc=67.4, scale=0.5)
prior_Om0 = sps.norm(loc=0.315, scale=0.007)
true_H0 = prior_H0.rvs()
true_Om0 = prior_Om0.rvs()
print(rf'$H_{{0}}^{{true}}=${true_H0}, $\Omega_{{m}}^{{true}}=${true_Om0}')
true_cosmo = apcosmo.FlatLambdaCDM(H0=true_H0, Om0=true_Om0)

In [None]:
# def nz_func(z, c=0.3):
#     return 1./(2.*c) * (z/c)**2 * np.exp(-1. * z/c)
nz_true = sps.gamma(3, scale=0.3)
zs_true = nz_true.rvs(100)

In [None]:
plt.hist(zs_true);
plt.xlabel(r'$z$')
plt.title('SNIa redshift distribution')

In [None]:
mus_true = true_cosmo.distmod(zs_true).value

In [None]:
plt.scatter(zs_true, mus_true)
plt.xlabel(r'$z$')
plt.ylabel(r'$\mu$')
plt.title('true Hubble diagram')

In [None]:
# bias_lsst = 0.003
# scatter_lsst = 0.02
# outlier_lsst = 0.1

z_min = 0.
z_max = 3.
# z_norm_low_true = (z_min - zs_true - bias_lsst * (1 + zs_true)) / (scatter_lsst * (1+zs_true))
# z_norm_high_true = (z_max - zs_true - bias_lsst * (1 + zs_true)) / (scatter_lsst * (1+zs_true))
# zs_pdf_true = sps.truncnorm(z_norm_low_true, z_norm_high_true, loc=zs_true+bias_lsst, scale=scatter_lsst*(1+zs_true))
# zs_obs = zs_pdf_true.rvs()
# z_norm_low_obs = (z_min - zs_obs - bias_lsst * (1 + zs_obs)) / (scatter_lsst * (1+zs_obs))
# z_norm_high_obs = (z_max - zs_obs - bias_lsst * (1 + zs_obs)) / (scatter_lsst * (1+zs_obs))
# zs_pdf_obs = sps.truncnorm(z_norm_low_obs, z_norm_high_obs, loc=zs_obs+bias_lsst, scale=scatter_lsst*(1+zs_obs))
# zs_est = zs_pdf_obs.rvs()
zs_est = zs_true

In [None]:
# plt.scatter(zs_true, zs_est)
# plt.xlabel(r'$z_{true}$')
# plt.ylabel(r'$z_{obs}$')
# plt.title('redshift uncertainties')

__TODO: look up realistic mu error__

In [None]:
mus_err = (1. + zs_est) / z_max
mus_err_dist = sps.norm(loc=mus_true, scale=mus_err)
mus_obs = mus_err_dist.rvs()
mus_pdf_obs = sps.norm(loc=mus_obs, scale=mus_err)

In [None]:
plt.scatter(mus_true, mus_obs)
plt.xlabel(r'$\mu_{true}$')
plt.ylabel(r'$\mu_{obs}$')
plt.title('distance modulus uncertainties')

In [None]:
plt.errorbar(zs_est, mus_obs, #xerr=scatter_lsst*(1.+zs_est), 
             yerr=mus_err, fmt='.')
plt.xlabel(r'$z$')
plt.ylabel(r'$\mu$')
plt.title('observed Hubble diagram')

### Models

In [None]:
z_grid = np.linspace(z_min, z_max, 100)
est_H0 = prior_H0.rvs()
est_Om0 = prior_Om0.rvs()
est_cosmo = apcosmo.FlatLambdaCDM(H0=est_H0, Om0=est_Om0)
mus_est = est_cosmo.distmod(z_grid).value

In [None]:
prior_w0 = sps.norm(loc=0.89, scale=0.13)
new_cosmo = apcosmo.wCDM(est_H0, est_Om0, 1.-est_Om0, w0=prior_w0.rvs())
mus_new = new_cosmo.distmod(z_grid).value

In [None]:
plt.errorbar(zs_est, mus_obs, #xerr=scatter_lsst*(1.+zs_est), 
             yerr=mus_err, fmt='.', c='k', alpha=0.5)
plt.plot(z_grid, true_cosmo.distmod(z_grid).value, label='true model')
plt.plot(z_grid, mus_est, label='estimated model')
plt.plot(z_grid, mus_new, label='misspecified model')
plt.xlabel(r'$z$')
plt.ylabel(r'$\mu$')
plt.title('observed Hubble diagram with models')

## Goodness-of-fit & hypothesis testing

### Problem 0a

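Implement a function calculating $\chi^{2} = \sum_{i=1}^{N}\left(\frac{y_{i} - M_{i}(\theta)}{\sigma_{i}}\right)^{2}$ and calculate the $\chi^{2}$ for the two models. (The reduced $\chi^{2}$ would additionally divide this sum by the number of degrees of freedom, $N - N_{param}$.)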

In [None]:
def chi_sq(model, xvals, yvals_obs, yerrs_obs):
    r"""
    Calculates the $\chi^{2}$ statistic
    
    Parameters
    ----------
    model: function
        function taking xvals and producing yvals
    xvals: array, float
        values of the independent variable at which yvals_obs were measured
    yvals_obs: array, float
        values of the dependent variable at xvals
    yerrs_obs: array, float
        errors on dependent variable observations
        
    Returns
    -------
    chi_sq: float
        value of the $\chi^{2}$ statistic
    """

In [None]:
# print(chi_sq(est_cosmo, zs_est, mus_obs, mus_err))
# print(chi_sq(new_cosmo, zs_est, mus_obs, mus_err))

# make a grid of values for cosmological parameters, plot both on same axes at all points
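
Here's a minimal sketch of one way to fill in `chi_sq`, assuming `model` is a plain callable mapping `xvals` to predicted `yvals`, as the docstring describes. The straight-line `toy_line` model and the three data points are made up for illustration and are not part of the SN data above.

```python
import numpy as np

def chi_sq(model, xvals, yvals_obs, yerrs_obs):
    """Sum of squared, error-normalized residuals between data and model."""
    residuals = (np.asarray(yvals_obs) - model(np.asarray(xvals))) / np.asarray(yerrs_obs)
    return float(np.sum(residuals ** 2))

# Hypothetical toy check: a straight-line "model" and three made-up data points
def toy_line(x):
    return 2.0 * x + 1.0

x_toy = np.array([0.0, 1.0, 2.0])
y_toy = np.array([1.5, 3.0, 4.0])
yerr_toy = np.array([0.5, 1.0, 0.5])

# normalized residuals are 1, 0, -2, so the sum of squares is 5
chi2_toy = chi_sq(toy_line, x_toy, y_toy, yerr_toy)
print(chi2_toy)
```

With the astropy cosmologies defined earlier, a wrapper such as `lambda z: est_cosmo.distmod(z).value` could be passed as `model`.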

### Problem 0b

Minimize the $\chi^{2}$ to find the maximum likelihood estimator of the cosmological parameters.
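
For Gaussian errors with known $\sigma_{i}$, minimizing $\chi^{2}$ is the same as maximizing the likelihood. Below is a rough sketch of that minimization over $(H_{0}, \Omega_{m})$ using `scipy.optimize.minimize`. To keep it self-contained it uses a hand-rolled flat $\Lambda$CDM distance modulus (`distmod_flat_lcdm`, a hypothetical helper that ignores radiation) instead of astropy, and mock data with an assumed 0.1 mag error rather than the `mus_obs` generated above.

```python
import numpy as np
from scipy.optimize import minimize

C_KM_S = 299792.458  # speed of light in km/s

def distmod_flat_lcdm(z, H0, Om0, n_grid=2000):
    """Distance modulus in flat LambdaCDM (matter + Lambda only)."""
    z = np.asarray(z, dtype=float)
    z_fine = np.linspace(0.0, z.max(), n_grid)
    inv_E = 1.0 / np.sqrt(Om0 * (1.0 + z_fine) ** 3 + (1.0 - Om0))
    dz = z_fine[1] - z_fine[0]
    # cumulative trapezoid rule for the integral of dz'/E(z') from 0 to each grid point
    integral = np.concatenate(([0.0], np.cumsum(0.5 * (inv_E[1:] + inv_E[:-1]) * dz)))
    d_lum = (1.0 + z) * (C_KM_S / H0) * np.interp(z, z_fine, integral)  # in Mpc
    return 5.0 * np.log10(d_lum) + 25.0

def chi_sq_cosmo(theta, zs, mus_obs, mus_err):
    """chi^2 of observed distance moduli for theta = (H0, Om0)."""
    H0, Om0 = theta
    if H0 <= 0.0 or not (0.0 < Om0 < 1.0):
        return np.inf  # keep the optimizer inside the physically sensible region
    return np.sum(((mus_obs - distmod_flat_lcdm(zs, H0, Om0)) / mus_err) ** 2)

# Mock data (assumed values, seeded for reproducibility)
rng = np.random.default_rng(42)
theta_true = (67.4, 0.315)
zs_mock = rng.gamma(3.0, 0.3, size=100)
mus_err_mock = np.full(zs_mock.shape, 0.1)
mus_mock = distmod_flat_lcdm(zs_mock, *theta_true) + rng.normal(0.0, mus_err_mock)

# Start away from the truth and let Nelder-Mead walk downhill in chi^2
theta_start = np.array([70.0, 0.3])
result = minimize(chi_sq_cosmo, theta_start,
                  args=(zs_mock, mus_mock, mus_err_mock), method='Nelder-Mead')
print(result.x, result.fun)
```

Nelder-Mead is used here only because it needs no gradients. $H_{0}$ and $\Omega_{m}$ are degenerate in this kind of data, so the best-fit point can sit some way from the truth along that degeneracy.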

### Problem 1a

Implement a function empirically calculating the Fisher matrix $F$ where $F_{ij} = \frac{1}{2}\frac{\partial^{2}}{\partial\theta_{i}\partial\theta_{j}}\chi^{2}(M, \theta)$.
When is $(F^{-1})_{ij}$ the parameter covariance $\rho_{ij}\sigma_{i}\sigma_{j}$, so that $(F^{-1})_{ii} = \sigma_{i}^{2}$? (When the likelihood is Gaussian.)
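
One possible sketch of an empirical Fisher matrix, assuming the $\chi^{2}$ is available as a function of the parameter vector alone. It takes central finite differences of $\chi^{2}$ with user-chosen step sizes. The name `fisher_from_chi_sq`, its signature, and the quadratic `toy_chi_sq` are illustrative assumptions, not the RESSPECT helpers mentioned in this problem.

```python
import numpy as np

def fisher_from_chi_sq(chi_sq_func, theta, steps):
    """
    Estimates F_ij = 0.5 * d^2 chi^2 / (d theta_i d theta_j) at theta
    using central finite differences with per-parameter step sizes.
    """
    theta = np.asarray(theta, dtype=float)
    steps = np.asarray(steps, dtype=float)
    n_param = theta.size
    fisher_mat = np.zeros((n_param, n_param))
    for i in range(n_param):
        for j in range(i, n_param):
            d_i = np.zeros(n_param)
            d_j = np.zeros(n_param)
            d_i[i] = steps[i]
            d_j[j] = steps[j]
            # four-point stencil; for i == j it reduces to a second difference with step 2h
            second_deriv = (chi_sq_func(theta + d_i + d_j)
                            - chi_sq_func(theta + d_i - d_j)
                            - chi_sq_func(theta - d_i + d_j)
                            + chi_sq_func(theta - d_i - d_j)) / (4.0 * steps[i] * steps[j])
            fisher_mat[i, j] = fisher_mat[j, i] = 0.5 * second_deriv
    return fisher_mat

# Toy quadratic chi^2 whose Fisher matrix is known exactly (it equals `curvature`)
theta_peak = np.array([0.5, 0.2])
curvature = np.array([[4.0, 1.0], [1.0, 2.0]])

def toy_chi_sq(theta):
    delta = np.asarray(theta) - theta_peak
    return delta @ curvature @ delta

fisher_toy = fisher_from_chi_sq(toy_chi_sq, theta_peak, steps=[1e-3, 1e-3])
cov_toy = np.linalg.inv(fisher_toy)      # parameter covariance in the Gaussian case
sigmas_toy = np.sqrt(np.diag(cov_toy))   # marginalized 1-sigma errors
print(fisher_toy)
print(sigmas_toy)
```

For a real $\chi^{2}$ surface the step sizes matter: too large and the curvature is smeared out, too small and round-off dominates.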

In [None]:
# outline helper functions from https://github.com/COINtoolbox/RESSPECT/blob/master/resspect/cosmo_metric_utils.py

In [None]:
def fisher(model, ):
    raise NotImplementedError  # stub to be filled in for Problem 1a

### Problem 1b

Implement a function that plots error ellipses given a Fisher matrix.
This is a useful thing to have around -- I'm still recycling code I wrote to do this in grad school!
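
A sketch of the Fisher-to-ellipse step for two parameters, under the assumption that the inverse Fisher matrix is the parameter covariance. The eigenvalues of the covariance give the squared semi-axes and the eigenvector of the larger one gives the orientation. The `scale` argument is an addition for illustration and is not part of the stub signature below.

```python
import numpy as np

def fisher_to_ellipse(fisher_mat, scale=1.0):
    """
    Converts a 2x2 Fisher matrix to ellipse parameters.

    Returns (semimajor, semiminor, angle), with angle in radians measured
    counterclockwise from the first parameter's axis to the major axis.
    """
    cov = np.linalg.inv(np.asarray(fisher_mat, dtype=float))
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    semiminor, semimajor = scale * np.sqrt(eigvals)
    major_vec = eigvecs[:, 1]               # eigenvector of the largest eigenvalue
    angle = np.arctan2(major_vec[1], major_vec[0]) % np.pi
    return semimajor, semiminor, angle

# Hypothetical Fisher matrix with correlated parameters
fisher_example = np.array([[4.0, 1.0], [1.0, 2.0]])
semimajor, semiminor, angle = fisher_to_ellipse(fisher_example)
print(semimajor, semiminor, np.degrees(angle))
```

With `scale=1.0` the axes are the square roots of the covariance eigenvalues; a joint 68.3% region for two parameters corresponds to $\Delta\chi^{2} = 2.30$, i.e. `scale` of about 1.52. For plotting, `matplotlib.patches.Ellipse` takes the full width and height (twice the semi-axes) and an angle in degrees.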

In [None]:
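def fisher_to_ellipse(fisher_mat):
    # stub to be filled in for Problem 1b
    raise NotImplementedError

def plot_ellipse(semimajor, semiminor):
    # stub to be filled in for Problem 1b
    raise NotImplementedError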

## Information theory & likelihoods

I really want to introduce you to metrics from the perspective of information theory, but more appropriate data for the problem will be available for the experimental design lecture.

KLD/relative entropy
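
For reference, the Kullback-Leibler divergence of $Q$ from $P$ is $D_{KL}(P || Q) = \int p(x)\log\left[\frac{p(x)}{q(x)}\right]dx$. Here's a small sketch evaluating it on a grid and checking against the closed form for two Gaussians. $P$ borrows the Planck-like $H_{0}$ prior from the top of the notebook; $Q$ (mean 73, width 1) is a hypothetical second distribution picked only for illustration.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def kld_on_grid(p_vals, q_vals, dx):
    """Riemann-sum approximation of the integral of p * log(p / q), in nats."""
    mask = p_vals > 0.0  # terms with p = 0 contribute nothing
    return np.sum(p_vals[mask] * np.log(p_vals[mask] / q_vals[mask])) * dx

def kld_gaussians(mean_p, std_p, mean_q, std_q):
    """Closed-form KLD between two 1D Gaussians, in nats."""
    return (np.log(std_q / std_p)
            + (std_p ** 2 + (mean_p - mean_q) ** 2) / (2.0 * std_q ** 2) - 0.5)

h0_grid = np.linspace(60.0, 80.0, 20001)
dh0 = h0_grid[1] - h0_grid[0]
p_h0 = gaussian_pdf(h0_grid, 67.4, 0.5)  # same shape as prior_H0 above
q_h0 = gaussian_pdf(h0_grid, 73.0, 1.0)  # hypothetical comparison distribution

kld_pq_numeric = kld_on_grid(p_h0, q_h0, dh0)
kld_pq_exact = kld_gaussians(67.4, 0.5, 73.0, 1.0)
kld_qp_exact = kld_gaussians(73.0, 1.0, 67.4, 0.5)
print(kld_pq_numeric, kld_pq_exact, kld_qp_exact)
```

The last two numbers differ because the KLD is not symmetric in its arguments, which is why it is a divergence and not a distance.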

## Model comparison

$AIC = 2\log[p(data | \theta)] - 2N_{param}$

$BIC = 2\log[p(data | \theta)] - N_{param}\log[N_{data}]$

$\dots$

Sometimes these are defined as the negative of what's shown here, e.g. $AIC = 2N_{param} - 2\log[p(data | \theta)]$, in which case smaller values are preferred.

__TODO: Is the $N_{data}$ MCMC samples or something to do with the actual data?__ It is the number of data points upon which the likelihood is based (the sample size of the observations), not the number of MCMC samples drawn to explore the posterior.

__TODO: might develop a simple example for model comparison rather than introducing some real outside data?__

Let's look at the MCMC chains from [Chang+18](https://doi.org/10.1093/mnras/sty2902).

__TODO: explain the scenarios__

In [None]:
# read in and parse here
# pull out likelihoods, number of parameters

### Problem 2a

Write functions for each of the above information criteria.
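
A minimal sketch of what these functions could look like, in the "higher is better" convention where each criterion is twice the maximized log-likelihood minus a complexity penalty. The argument names and the numbers for the two hypothetical models are assumptions for illustration.

```python
import numpy as np

def get_aic(log_like_max, n_param):
    """AIC = 2 log L_max - 2 N_param (higher is better in this convention)."""
    return 2.0 * log_like_max - 2.0 * n_param

def get_bic(log_like_max, n_param, n_data):
    """BIC = 2 log L_max - N_param log N_data (higher is better in this convention)."""
    return 2.0 * log_like_max - n_param * np.log(n_data)

# Two hypothetical models fit to the same 100 data points:
# model B fits slightly better but pays for an extra parameter
n_data_toy = 100
aic_a = get_aic(-50.0, 2)
aic_b = get_aic(-49.5, 3)
bic_a = get_bic(-50.0, 2, n_data_toy)
bic_b = get_bic(-49.5, 3, n_data_toy)
print(aic_a, aic_b)
print(bic_a, bic_b)
```

In this made-up case both criteria favor the simpler model A, since the half-unit gain in log-likelihood does not cover the penalty for the extra parameter.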

In [None]:
def get_aic():
    raise NotImplementedError  # stub to be filled in for Problem 2a

### Problem 2b

Compare each dataset's cosmological constraints over $S_{8}$ under the published and matched scenarios and interpret the results.

### Problem 2c

Compare the matched-assumption constraints across all the data sets and interpret the results.

## Bayesian metrics

The Bayes Factor $BF_{0,1} = \frac{\frac{P(\Theta_{0} | x)}{P(\Theta_{1} | x)}}{\frac{P(\Theta_{0})}{P(\Theta_{1})}} = \frac{\int_{\Theta_{0}}f(x|\theta)g_{0}(\theta)d\theta}{\int_{\Theta_{1}}f(x|\theta)g_{1}(\theta)d\theta}$ compares two hypotheses conditioned on the same data: it is the ratio of posterior odds to prior odds, which equals the ratio of their marginal likelihoods.
In Chang+18, we technically had different data, so it wasn't entirely kosher to calculate it.
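
To make the marginal-likelihood ratio concrete, here's a toy sketch (unrelated to the Chang+18 chains) in which both hypotheses see the same data. It assumes Gaussian data with known unit scatter and two hypotheses that differ only in their Gaussian prior on the mean $\theta$. The data values, prior settings, and function names are all made up for illustration.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def evidence_on_grid(data, data_std, prior_mean, prior_std, theta_grid):
    """Marginal likelihood: integral of f(x|theta) g(theta) dtheta by the trapezoid rule."""
    # likelihood of the full data set at each theta on the grid
    like = np.prod(gaussian_pdf(data[:, None], theta_grid[None, :], data_std), axis=0)
    integrand = like * gaussian_pdf(theta_grid, prior_mean, prior_std)
    dtheta = theta_grid[1] - theta_grid[0]
    return np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dtheta

# Made-up data scattered around theta = 1
data_toy = np.array([0.8, 1.3, 0.9, 1.1, 1.4, 0.7, 1.0, 1.2])
theta_grid = np.linspace(-5.0, 5.0, 4001)

# Hypothesis 0: prior centered on +1; hypothesis 1: prior centered on -1
evidence_0 = evidence_on_grid(data_toy, 1.0, 1.0, 0.5, theta_grid)
evidence_1 = evidence_on_grid(data_toy, 1.0, -1.0, 0.5, theta_grid)
bayes_factor_01 = evidence_0 / evidence_1
print(bayes_factor_01)
```

A value well above 1 favors hypothesis 0, as expected for data clustered near $+1$. Multiplying the likelihoods directly is fine for eight points, but for realistic data sets the sum of log-likelihoods is the safer route.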
