# PSI Numerical Methods 2024 - Homework Assignment on Model Fitting & MCMC

We're going to put together everything we have learned so far to re-do the data analysis for the
Perlmutter et al. 1999 paper on the discovery of dark energy!  (https://ui.adsabs.harvard.edu/abs/1999ApJ...517..565P/abstract)

Start by forking this repository on GitHub: https://github.com/dstndstn/PSI-Numerical-Methods-2024-MCMC-Homework
Then clone your fork to your laptop or to Symmetry.
You can modify this notebook, and when you are done, save it, `git commit -a` the results,
and `git push` them back to your fork of the repository.  You will "hand in" your homework by giving
a link to your GitHub repository, where the marker will be able to read your notebook.

First, a little bit of background on the cosmology and astrophysics.  The paper reports measurements
of a group of supernova explosions of a specific type, "Type 1a".  These are thought to be caused by
a white dwarf star that has a companion star that "donates" gas to the white dwarf.  The white dwarf gradually gains
mass until it exceeds the Chandrasekhar mass, and explodes.  Since they all explode through the same
mechanism, and with the same mass, they should all have the same intrinsic brightness.  It turns out to
be a _little_ more complicated than that, but in the end, these Type-1a supernovae can be turned into
"standard candles", objects that are all the same brightness.  If you can also measure the redshift of
each galaxy containing the supernova, then you can map out this brightness--redshift relation, and the
shape of that relation depends on how the universe grows over cosmic time.  In turn, the growth rate of
the universe depends on the contents of the universe!

In this way, these Type-1a supernovae allow us to constrain the parameters of a model of the universe.
Specifically, the model is called "Lambda-CDM", a universe containing dark energy and matter (cold dark matter,
plus regular matter).  We will consider a two-parameter version of this model: $\Omega_M$, the
amount of matter, and $\Omega_{\Lambda}$, the amount of dark energy.  These are in cosmology units of
"energy density now relative to the critical density", where the critical density is the energy density you need
for the universe to be spatially flat (angles of a large triangle sum to 180 degrees).
So $\Omega_M = 1$, $\Omega_{\Lambda} = 0$ would be a flat universe containing only matter, while
$\Omega_M = 0.25$, $\Omega_{\Lambda} = 0.5$ would be a spatially open universe (total density below critical) with dark energy and matter.
Varying these ingredients changes the growth history of the universe, which changes how much the light from a
supernova is redshifted, and how its brightness drops off with distance.

(In the code below, we will call these `Omega_M` = $\Omega_M$ and `Omega_DE` = $\Omega_{\Lambda}$.)

Distance measurements in cosmology are complicated -- see https://arxiv.org/abs/astro-ph/9905116 for details!
For this assignment, we will use a cosmology package that will handle all this for us.  All we need to use is
the "luminosity distance", which is the one that tells you how objects get fainter given a redshift.

In [None]:
# Let's start by installing the Cosmology package!
using Pkg
Pkg.add("Cosmology")

In [None]:
# We'll also end up using all our old friends:
using WGLMakie
using CSV
using DataFrames
using Cosmology
using Statistics

In [None]:
# There is a data file in this directory, taken basically straight out of the Perlmutter+1999 paper.  We can read it with the CSV package.
data = CSV.read("p99-data.txt", DataFrame, delim=" ", ignorerepeated=true);

In [None]:
# Make a copy of the data columns that we want to treat as the "y" measurements.
# These are the measured brightnesses, and their Gaussian uncertainties (standard deviations).
data.mag = data.m_b_eff
data.sigma_mag = data.sigma_m_b_eff;

In [None]:
f = Figure()
Axis(f[1,1], title="Perlmutter+99 Supernovae", xlabel="Redshift z", ylabel="m_B")
errorbars!(data.z, data.mag, data.sigma_mag)
scatter!(data.z, data.mag, markersize=5, color=:maroon)
f

In [None]:
# Here is how we will use the "cosmology" package.  This will create a cosmology "object" with the parameters we pass in.
# It does not take an Omega_Lambda parameter; instead, it takes Omega_Matter, and Omega_K (for "curvature"), where
# Omega_K = 1. - Omatter - Olambda.  We will also pass in "Tcmb=0", which tells it to ignore the effects of radiation.

universe = cosmology(OmegaK=0.1, OmegaM=0.4, Tcmb=0)
@show universe
@show universe.Ω_Λ;
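
Since the rest of this notebook talks about `Omega_M` and `Omega_DE` rather than `Omega_K`, it may help to see the conversion written out.  The following is a minimal sketch; `omega_k` is a hypothetical helper name, not part of the Cosmology package, and it assumes radiation is ignored as with `Tcmb=0`.

```julia
# Hypothetical helper (not part of the Cosmology package): the curvature
# density implied by the matter and dark-energy densities, ignoring radiation.
omega_k(Omatter, Ode) = 1.0 - Omatter - Ode

# Densities that sum to the critical density give Omega_K = 0 (flat):
@show omega_k(0.29, 0.71)
# A total density below critical gives Omega_K > 0:
@show omega_k(0.25, 0.5)

# With the Cosmology package loaded, this would be used like
#   universe = cosmology(OmegaK=omega_k(0.25, 0.5), OmegaM=0.25, Tcmb=0)
```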

In [None]:
# We can then pass that "universe" object to other functions to compute things about it.  Basically the only one you'll
# need is this `distance_modulus`, which tells you, in _magnitudes_, how much fainter an object is at the given redshift,
# versus how faint it would be if it were 10 parsecs away.

function distance_modulus(universe, z)
    DL = luminosity_dist(universe, z)
    # DL is in Megaparsecs; the distance for absolute to observed mag is 10 pc.
    5. * log10.(DL.val * 1e6 / 10.)
end;
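
As a quick sanity check on the magnitude scale, the same formula can be evaluated on plain numbers, without a cosmology object.  This is only an illustrative sketch; `modulus_from_mpc` is a hypothetical name that takes a luminosity distance in megaparsecs.

```julia
# Same formula as in `distance_modulus`, but taking a plain number of
# megaparsecs instead of a cosmology object and a redshift.
modulus_from_mpc(dl_mpc) = 5.0 * log10(dl_mpc * 1e6 / 10.0)

@show modulus_from_mpc(1e-5)    # 10 parsecs: the modulus is zero by definition
@show modulus_from_mpc(100.0)   # 100 Mpc
# Ten times farther away is always 5 magnitudes fainter:
@show modulus_from_mpc(1000.0) - modulus_from_mpc(100.0)
```

Remember that larger magnitudes mean fainter objects, so the distance modulus grows with distance.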

There is one more parameter to the model we will be fitting: $M$, the _absolute magnitude_ of the supernovae.  This is a
"nuisance parameter" - a parameter that we have to fit for, but that we don't really care about; it's basically a calibration
of what the intrinsic brightness of a supernova is.  To start out, we will fix this value to a constant, but later we will
fit for it along with our Omegas.

The _observed_ brightness of a supernova will be its _absolute mag_ plus its _distance modulus_.  The _distance modulus_ depends on
the redshift _z_ and our parameters Omega_M and Omega_DE.
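
Written out, with $m$ for the observed magnitude, $\mu$ for the distance modulus and $D_L$ for the luminosity distance (symbols chosen here for convenience), the relation is:

$$
m(z) = M + \mu(z;\, \Omega_M, \Omega_{\Lambda}), \qquad
\mu = 5 \log_{10}\!\left(\frac{D_L}{10\,\mathrm{pc}}\right)
$$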

In [None]:
# We'll cheat a bit and use a "nominal" cosmology with currently-accepted values of Omega_M = 0.29, Omega_DE = 0.71.
nominal = cosmology(Tcmb=0)

f = Figure()
ax = Axis(f[1,1], title="Perlmutter+99 Supernovae", xlabel="Redshift z", ylabel="Observed mag")
errorbars!(data.z, data.mag, data.sigma_mag)
scatter!(data.z, data.mag, markersize=5, color=:maroon)

# Estimate the absolute magnitude M as the median of (observed mag - distance modulus) under the nominal cosmology.
DLx = map(z->distance_modulus(nominal, z), data.z)
abs_mag = median(data.mag - DLx)

# Here's another way to plot a function evaluated on a grid of values.
zgrid = 0.01:0.01:1.
DL = map(z->distance_modulus(nominal, z), zgrid)
lines!(zgrid, DL .+ abs_mag, label="Nominal OmegaM = 0.29, OmegaDE = 0.71")

universe = cosmology(OmegaK=0.0, OmegaM=0.6, Tcmb=0)
DL = map(z->distance_modulus(universe, z), zgrid)
lines!(zgrid, DL .+ abs_mag, color=:red, label="OmegaM = 0.6, OmegaDE = 0.4")

universe = cosmology(OmegaK=0.0, OmegaM=0.1, Tcmb=0)
DL = map(z->distance_modulus(universe, z), zgrid)
lines!(zgrid, DL .+ abs_mag, color=:green, label="OmegaM = 0.1, OmegaDE = 0.9")

#f[2,1] = Legend(f, ax, "Cosmologies", framevisible = false)
# Create a legend for our plot
axislegend(ax, position = :rb)
f

In [None]:
# Here's our scalar estimate of the absolute mag.
abs_mag

## Part 1 - The Log-likelihood terrain

First, you have to write out the likelihood function for the observed supernova data, given cosmological model parameters.

That is, please complete the following function.  It will be passed vectors of `z`, `mag`, and `mag_error` measurements,
plus scalar parameters `M`, `Omega_M` and `Omega_DE`.  You will need to create a "cosmology" object, find the _distance modulus_ for
each redshift `z`, and add that to the absolute mag `M` to get the _predicted_ magnitude.  You will then compare that to each
measured magnitude, and compute the likelihood.
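
For reference, assuming the measurements are independent and their uncertainties are Gaussian, the log-likelihood has the standard form below.  Here $m_i$ and $\sigma_i$ are the measured magnitude and its uncertainty for supernova $i$, and $m_{\mathrm{pred},i}$ is the predicted magnitude (absolute mag plus distance modulus at redshift $z_i$); these symbols are introduced here for illustration.

$$
\ln \mathcal{L} = -\frac{1}{2} \sum_i \left[ \frac{\left(m_i - m_{\mathrm{pred},i}\right)^2}{\sigma_i^2} + \ln\!\left(2\pi\sigma_i^2\right) \right]
$$

The $\ln(2\pi\sigma_i^2)$ term does not depend on the model parameters, so dropping it only shifts the log-likelihood by a constant.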

In [None]:
function supernova_log_likelihood(z, mag, mag_error, M, Omatter, Ode)
    # z: vector of redshifts
    # mag: vector of measured magnitudes
    # mag_error: vector of uncertainties on the measured magnitudes (sigmas).
    # M: scalar, absolute magnitude of a Type-1a supernova
    # Omatter: scalar Omega_M, amount of matter in the universe
    # Ode: scalar Omega_DE, amount of dark energy in the universe

    ###   YOUR CODE HERE!!

    # You must return a scalar value
end;

Next, please keep `M` fixed to the `abs_mag` value we computed above, and call your `supernova_log_likelihood` on a grid of
`Omega_M` and `Omega_DE` values.  (You will pass in `data.z`, `data.mag`, and `data.sigma_mag` for the `z`, `mag`, and `mag_error` values.)

Try a grid from 0 to 1 for both Omega_M and Omega_DE, and show the `supernova_log_likelihood` values using the `heatmap` function.
You may find it helpful to limit the range using something like `heatmap(om_grid, ode_grid, sn_ll, colorrange=[maximum(sn_ll)-20, maximum(sn_ll)])`.

Alternatively, instead of showing the _log_-likelihood, you can show the likelihood itself by taking the `exp` of your `sn_ll` grid: `heatmap(om_grid, ode_grid, exp.(sn_ll))`.
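
One way to build such a grid is with a two-variable array comprehension.  The sketch below uses a made-up function, `toy_ll`, as a stand-in for your `supernova_log_likelihood`; the peak position and widths are arbitrary and have nothing to do with the real data.

```julia
# Toy stand-in for the real log-likelihood (an assumption for illustration):
# a 2-d Gaussian log-density peaked at (0.3, 0.7).
toy_ll(om, ode) = -0.5 * (((om - 0.3) / 0.1)^2 + ((ode - 0.7) / 0.2)^2)

om_grid = range(0, 1, length=51)
ode_grid = range(0, 1, length=51)

# The first index follows om_grid and the second follows ode_grid, which is
# the layout that `heatmap(om_grid, ode_grid, sn_ll)` expects.
sn_ll = [toy_ll(om, ode) for om in om_grid, ode in ode_grid]

# Subtracting the maximum before exponentiating puts the peak at exactly 1
# and avoids everything underflowing to zero when the values are very negative.
rel_like = exp.(sn_ll .- maximum(sn_ll))

# Locate the grid point with the highest likelihood.
i, j = Tuple(argmax(sn_ll))
@show om_grid[i] ode_grid[j]
```

Subtracting the maximum only rescales the likelihood by a constant factor, so the shape of the heatmap is unchanged.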

Please compare your plot to Figure 7 in the Perlmutter et al. 1999 paper, shown below.  Does your likelihood contour look consistent with the blue ellipses?

<img src="perlmutter-fig7.png" width="400"/>



Next, try expanding the grid ranges for Omega_M and Omega_DE up to, say, 0 to 2 or 0 to 3.  You should encounter a problem -- the cosmology package will fail to compute the `distance_modulus` for some combinations!  You can work around this by using Julia's `try...catch` syntax,
like this:

In [None]:
# Example of Julia's try-catch syntax:
ll = 0.
try
    ll = supernova_log_likelihood(data.z, data.mag, data.sigma_mag, abs_mag, 2.0, 2.0)
catch err
    ll = -Inf
end

This will "try" to run the `supernova_log_likelihood` function, and if it fails, it will go into the "catch" branch.
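
If you need this in several places (the grid here, and the MCMC later), it can be convenient to wrap the pattern in a small function.  This is a sketch only: `safe_loglike` is a hypothetical helper, and `toy_fail` is a made-up function that fails for some inputs in place of the real log-likelihood.

```julia
# Hypothetical helper: call `f(args...)` and map any exception to -Inf, so a
# failed evaluation is treated as a point with zero likelihood.
function safe_loglike(f, args...)
    try
        return f(args...)
    catch err
        return -Inf
    end
end

# Toy function standing in for the real log-likelihood: `sqrt` throws a
# DomainError for negative real arguments.
toy_fail(x) = -sqrt(x)

@show safe_loglike(toy_fail, 4.0)
@show safe_loglike(toy_fail, -1.0)
```

Catching every exception also hides genuine bugs (a typo inside the function would quietly give `-Inf` everywhere), so it is worth checking that your function works on a known-good point first.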

## Part 2 - Using MCMC to sample from the likelihood

Next, we will use Markov Chain Monte Carlo to draw samples from the likelihood distribution.

You can start with the `mcmc` function from the lecture.

You will need to tune the MCMC proposal's step sizes (also known as "jump sizes").  To do this, you can use
the variant of the `mcmc` routine that cycles through the parameters and only jumps one at a time, named
`mcmc_cyclic` in the updated lecture notebook.  After tuning the step sizes with `mcmc_cyclic`, you can go back
to the plain `mcmc` routine if you want, or stick with `mcmc_cyclic`; it is up to you.
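
To see why the step size matters, here is a self-contained Metropolis sketch with Gaussian proposals, run on a toy 2-d Gaussian target.  It is not the lecture's `mcmc` routine: `metropolis_sketch`, its arguments and its return values are assumptions made for this illustration, and the lecture version may differ.

```julia
using Random

# Minimal Metropolis sampler (illustrative; not the lecture's `mcmc`).
# logprob: function from a parameter vector to a log-probability
# p0: starting parameter vector; stepsizes: proposal standard deviations
function metropolis_sketch(logprob, p0, stepsizes, nsteps; rng=Random.default_rng())
    ndim = length(p0)
    chain = zeros(nsteps, ndim)        # one row per step, one column per parameter
    p = collect(float.(p0))
    lp = logprob(p)
    naccept = 0
    for i in 1:nsteps
        # Propose a Gaussian jump in all parameters at once.
        p_new = p .+ stepsizes .* randn(rng, ndim)
        lp_new = logprob(p_new)
        # Accept with probability min(1, exp(lp_new - lp)).
        if log(rand(rng)) < lp_new - lp
            p, lp = p_new, lp_new
            naccept += 1
        end
        chain[i, :] = p                # a rejected step repeats the old position
    end
    return chain, naccept / nsteps
end

# Toy target: a standard 2-d Gaussian.
toy_logprob(p) = -0.5 * sum(abs2, p)

rng = MersenneTwister(42)
for s in (0.01, 1.0, 100.0)
    chain, rate = metropolis_sketch(toy_logprob, [0.0, 0.0], [s, s], 5000; rng=rng)
    println("step size ", s, ": acceptance rate ", round(rate, digits=2))
end
```

Very small steps are almost always accepted but barely move, while very large steps are almost always rejected; useful step sizes sit in between.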

Please plot the samples from your MCMC chains, to demonstrate that the chain looks like it has converged.  Ideally, you
would like to see reasonable acceptance rates, and you would like to see the samples "exploring" the parameter space.
Decide how many steps you need to run the MCMC routine for, and write a sentence or two describing why you think that's
a good number.
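
If your MCMC routine does not report the acceptance rate directly, it can be estimated from the chain itself.  The sketch below assumes the chain is a matrix with one row per step and one column per parameter, and that a rejected proposal repeats the previous row; `acceptance_rate` is a hypothetical helper name.

```julia
using Statistics

# Fraction of steps at which the chain moved to a new position.
# Assumed layout: one row per MCMC step, one column per parameter.
function acceptance_rate(chain)
    nsteps = size(chain, 1)
    moved = [chain[i, :] != chain[i-1, :] for i in 2:nsteps]
    return mean(moved)
end

# Tiny hand-made chain: of the 4 transitions, the 2nd and 4th are rejections.
toy_chain = [0.0 1.0;
             0.1 1.0;
             0.1 1.0;
             0.2 0.9;
             0.2 0.9]
@show acceptance_rate(toy_chain)
```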

For this part, please include the `M` (absolute magnitude) as a parameter that you are fitting -- so you are fitting for `M`
in addition to `Omega_M` and `Omega_DE`.  This is a quite standard situation where you have a "nuisance" parameter `M`
that you don't really care about, in addition to the `Omega` parameters that you do care about.

It is quite common to plot the results from an MCMC sampling using a "corner plot", which shows the distribution of
each of the individual parameters, and the joint distributions of pairs of parameters.  This will help you determine
whether some of the parameters are correlated with each other.

Below is a function you can use to generate corner plots from your chain -- call it like `cornerplot(chain, ["M", "Omega_M", "Omega_DE"])`.  There is also a CornerPlot package (https://juliapackages.com/p/cornerplot) but I have not had luck getting it
to work for me.

Once you have made your corner plots, please write a few sentences interpreting what you see.  Is the nuisance parameter `M` correlated with the Omegas?  Are the Omegas correlated with each other?

In [None]:
function cornerplot(x, names; figsize=(600,600))
    # how many columns of data
    dim = size(x, 2)
    # rows to plot
    idxs = 1:size(x,1)
    f = Figure(size=figsize)
    for i in 1:dim, j in 1:dim
        if i < j
            continue
        end
        ax = Axis(f[i, j], aspect = 1,
                  topspinevisible = false,
                  rightspinevisible = false,)
        if i == j
            hist!(x[idxs,i], direction=:y)
            ax.xlabel = names[i]
        else
            #scatter!(x[idxs,j], x[idxs,i], markersize=4)
            hexbin!(x[idxs,j], x[idxs,i])
            ax.xlabel = names[j]
            ax.ylabel = names[i]
        end
    end
    f
end;

Finally, please try to make a contour plot similar to Perlmutter et al.'s Figure 7.  From your MCMC chain, you can pull out the `Omega_M` and `Omega_DE` arrays, and then create a 2-d histogram.  Once you have a 2-d histogram, you can use the `contour` function to find and plot the contours in that histogram.
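
As a starting point, a 2-d histogram on a regular grid can be built by hand.  The sketch below is one possible approach, not a required one: `hist2d` is a hypothetical helper, the sample values are made up, and the bin edges are assumed to be evenly spaced ranges.

```julia
# Hypothetical helper: count samples in a regular 2-d grid of bins.
# `xedges` and `yedges` are evenly spaced ranges of bin edges; samples outside
# them (or exactly on the upper edge) are ignored.
function hist2d(x, y, xedges, yedges)
    nx, ny = length(xedges) - 1, length(yedges) - 1
    counts = zeros(Int, nx, ny)
    dx, dy = step(xedges), step(yedges)
    for (xi, yi) in zip(x, y)
        i = floor(Int, (xi - first(xedges)) / dx) + 1
        j = floor(Int, (yi - first(yedges)) / dy) + 1
        if 1 <= i <= nx && 1 <= j <= ny
            counts[i, j] += 1
        end
    end
    return counts
end

# Made-up samples standing in for the Omega_M and Omega_DE columns of a chain.
om_samples  = [0.25, 0.26, 0.31, 0.75]
ode_samples = [0.65, 0.66, 0.71, 0.15]

edges = 0.0:0.5:1.0          # just two bins per axis, to keep the output readable
counts = hist2d(om_samples, ode_samples, edges, edges)
@show counts

# Bin centres are the natural x and y coordinates for plotting, e.g.
#   centers = edges[1:end-1] .+ step(edges) / 2
#   contour(centers, centers, counts)
```

With a real chain you would use many more bins, and you will probably want to think about which contour levels correspond to the confidence regions drawn in the paper.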