# Sampling a posterior distribution

This notebook shows how to instantiate a posterior and sample from it.

In [None]:
import itertools
import os
import sys
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pathlib

sys.path.append('..')
import cogwheel
from cogwheel import posterior, sampling, utils

In [None]:
from IPython.display import IFrame

In [None]:
# Event names are the `.npz` file names without the extension.
all_events = [name.removesuffix('.npz') for name in os.listdir(cogwheel.data.DATADIR)
              if name.endswith('.npz')]

In [None]:
parentdir = NotImplemented  # Set to a path to save output
eventnames = ['GW151226', 'GW190521']  # Pick from `all_events` 
approximant = 'IMRPhenomXPHM'  # See cogwheel.waveform.APPROXIMANTS for options
prior_class = 'IASPrior'  # See `cogwheel.prior.prior_registry` for options

assert parentdir is not NotImplemented, 'But I told you to set `parentdir` to a path!'

## Set up `Posterior` objects

For each event, we need to find a waveform with high likelihood to use as the reference for relative binning.
This is managed from the `Posterior` class.

Below, we submit this task to the cluster for each event. This may take about 10 minutes. When the jobs finish, `parentdir` will contain one subdirectory per event, each with a `Posterior.json` file from which we can instantiate a `Posterior`.

In [None]:
posterior.initialize_posteriors_slurm(eventnames, approximant, prior_class, parentdir)

In [None]:
# Wait for the cluster jobs to finish, then run this cell to check.
incomplete = [eventname for eventname in eventnames
              if not (utils.get_eventdir(parentdir, prior_class, eventname)
                      /'Posterior.json').exists()]
if incomplete:
    print(', '.join(incomplete), 'did not complete yet. Wait and rerun this cell.')
else:
    print('All completed, good to go!')
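
Instead of rerunning the cell by hand, the check can be wrapped in a polling loop. The following is a minimal sketch, assuming that the existence of `Posterior.json` is a sufficient completion signal; `wait_for_files` is a hypothetical helper and not part of cogwheel.

```python
import pathlib
import time


def wait_for_files(paths, poll_seconds=30.0, timeout_seconds=3600.0):
    """
    Block until every path in `paths` exists, or raise TimeoutError.

    Hypothetical helper for illustration; not part of cogwheel.
    """
    paths = [pathlib.Path(path) for path in paths]
    deadline = time.monotonic() + timeout_seconds
    while True:
        missing = [path for path in paths if not path.exists()]
        if not missing:
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(
                'Still missing: ' + ', '.join(str(path) for path in missing))
        time.sleep(poll_seconds)
```

In this notebook it could be called as `wait_for_files(utils.get_eventdir(parentdir, prior_class, eventname) / 'Posterior.json' for eventname in eventnames)`.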

## Run PyMultinest

For each event we will make several parameter estimation runs, with different sampler settings.
Here we construct a table of such settings. Its columns should be keyword arguments of `pymultinest.run`.

In [None]:
nlives = 512 * 2**np.arange(4)[::-1]
tols = (1/8, 1/2)
importance_nested_samplings = (True, False)

table = pd.DataFrame(
    itertools.product(nlives, tols, importance_nested_samplings),
    columns=['n_live_points', 'evidence_tolerance', 'importance_nested_sampling'])
table
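
As a quick sanity check on the size of the grid, the same Cartesian product can be written with plain Python lists. This sketch only mirrors the values from the cell above; the variable names are chosen here for illustration.

```python
import itertools

# Same values as the settings grid, written with plain Python lists.
nlive_values = [512 * 2**k for k in range(4)][::-1]  # [4096, 2048, 1024, 512]
tol_values = (1/8, 1/2)
ins_values = (True, False)

rows = list(itertools.product(nlive_values, tol_values, ins_values))

# One row per combination: 4 * 2 * 2 settings.
n_expected = len(nlive_values) * len(tol_values) * len(ins_values)
assert len(rows) == n_expected

print(len(rows))  # 16
print(rows[0])    # (4096, 0.125, True)
print(rows[-1])   # (512, 0.5, False)
```

The last factor varies fastest, so consecutive rows first toggle `importance_nested_sampling`, then the tolerance, then the number of live points.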

Once the table looks good, we submit to the cluster.

For each event, we load the previously saved `Posterior` and submit one run per row of the table.

In [None]:
print(f'About to submit {len(eventnames) * len(table)} jobs...')

In [None]:
for eventname in eventnames:
    post = utils.read_json(utils.get_eventdir(parentdir, prior_class, eventname))
    pym = sampling.PyMultiNest(post)
    
    for _, run_kwargs in table.iterrows():
        pym.run_kwargs |= run_kwargs
        pym.submit_slurm(pym.get_rundir(parentdir))
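
The in-place merge `|=` in the loop above overwrites only the keys named by the row and leaves every other setting as it was. Here is a minimal sketch with plain dictionaries; the starting values of `run_kwargs` are made up for illustration and are not cogwheel's defaults.

```python
# Hypothetical stand-in for a sampler's keyword arguments; these
# starting values are made up for illustration and are not cogwheel's.
run_kwargs = {'n_live_points': 400, 'evidence_tolerance': 0.5, 'resume': False}

# Two rows of settings, as dictionaries keyed by column name.
rows = [
    {'n_live_points': 4096, 'evidence_tolerance': 0.125},
    {'n_live_points': 512},
]

history = []
for row in rows:
    run_kwargs |= row  # In-place dict merge (Python 3.9+); the row wins
    history.append(dict(run_kwargs))  # Snapshot the settings for this run

print(history[0])
print(history[1])
```

Because the dictionary persists between iterations, a key missing from a later row keeps the value set by an earlier one (here `evidence_tolerance` stays at 0.125 in the second run). In the table above every row sets all three columns, so no value carries over between runs.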

Wait for the jobs to complete before continuing.

## Plot samples

We can use `sampling.diagnostics` to make diagnostic plots and save them in PDF format.

In [None]:
for eventname in eventnames:
    sampling.diagnostics(utils.get_eventdir(parentdir, prior_class, eventname))

Now we create a symlink to `parentdir` in the working directory.
This is needed because `IFrame` can only display PDFs located in subdirectories of the working directory.

In [None]:
! ln -s {parentdir} parentdir
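
Rerunning `ln -s` does not replace a link that already exists. As an alternative, the link can be created from Python in a way that is safe to rerun. This is a sketch using only the standard library; `link_into_cwd` is a hypothetical helper and not part of cogwheel.

```python
import pathlib


def link_into_cwd(target, link_name='parentdir'):
    """
    Create (or refresh) a symlink `link_name` in the working directory
    pointing at `target`. Hypothetical helper, not part of cogwheel.
    """
    target = pathlib.Path(target).resolve()
    link = pathlib.Path(link_name)
    if link.is_symlink():
        link.unlink()  # Replace a stale link from a previous run
    elif link.exists():
        raise FileExistsError(f'{link} exists and is not a symlink')
    link.symlink_to(target, target_is_directory=True)
    return link
```

In this notebook it would be called as `link_into_cwd(parentdir)`. The helper refuses to touch an existing regular file or directory, so it cannot overwrite real data.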

View the PDF plots:

In [None]:
IFrame(utils.get_eventdir('parentdir', prior_class, eventnames[0])/'diagnostics.pdf',
       width=950, height=800)