## Running pbilby on a cluster (or laptop)

This tutorial demonstrates how to set up parallel_bilby (pbilby) jobs, which can be run on a cluster via slurm or on a laptop (though a laptop run is limited to the number of cores available on the machine).

There are three steps to running pbilby. The first is the ini file, which contains everything needed to set up the run. The ini file `GW170817.ini` is contained in this directory. This tutorial is similar to the GW150914 tutorial, except that here we manually specify the data rather than going through GWOSC. This type of example (where we manually point to data frames) is closer to how we would run on real data for production LVC analyses.



### ini file

The first part of the ini file sets the data-specific settings:

In [None]:

################################################################################
## Data generation arguments
################################################################################

trigger_time = 1187008882.43

################################################################################
## Detector arguments
################################################################################

detectors = [H1, L1, V1]
psd_dict = {H1=psd_data/h1_psd.txt, L1=psd_data/l1_psd.txt, V1=psd_data/v1_psd.txt}

# Download the data from https://www.gw-openscience.org/events/GW170817/ and place in raw_data/
data-dict = {H1:raw_data/H-H1_LOSC_CLN_4_V1-1187007040-2048.gwf, L1:raw_data/L-L1_LOSC_CLN_4_V1-1187007040-2048.gwf, V1:raw_data/V-V1_LOSC_CLN_4_V1-1187007040-2048.gwf}
channel_dict = {H1:LOSC-STRAIN, L1:LOSC-STRAIN, V1=LOSC-STRAIN}
duration = 128

The trigger time is the time of the merger, as estimated by the search pipelines. It can be found on GWOSC. By convention, the data segment is chosen so that the trigger time falls 2s before the end of the segment.

Next, we specify the detectors, PSDs, data channels and data duration. The LIGO Hanford, LIGO Livingston and Virgo detectors were all operational at the time of GW170817, so we specify these three instruments. The PSD files are contained in the `psd_data` directory. Lastly, because GW170817 is in band for around 100s, we analyze 128s of data containing the signal.
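To make the segment convention concrete, the sketch below computes the GPS start and end of the analysis segment from the `trigger_time` and `duration` given above, assuming the 2s post-trigger convention, and checks that the segment lies inside the downloaded frame. The frame start (1187007040) and length (2048s) are read off the `.gwf` file names; the function and variable names are illustrative and are not part of pbilby.

```python
# Values copied from GW170817.ini
trigger_time = 1187008882.43
duration = 128.0
post_trigger = 2.0  # assumption: trigger sits 2 s before the end of the segment

# GPS start and length as encoded in the frame file names
# (...-1187007040-2048.gwf)
frame_start = 1187007040
frame_length = 2048


def analysis_segment(trigger_time, duration, post_trigger=2.0):
    """Return the (start, end) GPS times of the analysis segment."""
    end = trigger_time + post_trigger
    start = end - duration
    return start, end


start, end = analysis_segment(trigger_time, duration, post_trigger)
inside_frame = frame_start <= start and end <= frame_start + frame_length
print(f"segment: [{start:.2f}, {end:.2f}], inside frame: {inside_frame}")
```

If `inside_frame` were `False`, the requested segment would run past the edge of the frame file and a different frame (or a shorter duration) would be needed.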

The data is expected to be in the directory `raw_data`. This directory is not contained in the git repo because of file size limits. To run this tutorial, you will first need to download the `.gwf` data from GWOSC (https://www.gw-openscience.org/events/GW170817/).
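Before generating the run, it can be useful to confirm that the frame files are where the ini file says they are. The following is a minimal sketch: `parse_ini_dict` and `missing_frames` are hypothetical helpers written for this tutorial (they are not bilby_pipe's own parser), and the parser simply accepts either `:` or `=` as the key/value separator, since both appear in the ini file above.

```python
from pathlib import Path


def parse_ini_dict(text):
    """Parse a '{key:value, key=value}' string into a dict.

    Illustrative helper only; it is not bilby_pipe's own parser.
    """
    entries = {}
    for item in text.strip().strip("{}").split(","):
        item = item.strip()
        if not item:
            continue
        # Split on whichever separator (':' or '=') appears first
        positions = [p for p in (item.find(":"), item.find("=")) if p != -1]
        if not positions:
            raise ValueError(f"no ':' or '=' separator in {item!r}")
        sep = min(positions)
        entries[item[:sep].strip()] = item[sep + 1:].strip()
    return entries


def missing_frames(data_dict, base_dir="."):
    """Return the detectors whose frame file does not exist under base_dir."""
    return [det for det, path in data_dict.items()
            if not (Path(base_dir) / path).is_file()]


data_dict = parse_ini_dict(
    "{H1:raw_data/H-H1_LOSC_CLN_4_V1-1187007040-2048.gwf, "
    "L1:raw_data/L-L1_LOSC_CLN_4_V1-1187007040-2048.gwf, "
    "V1:raw_data/V-V1_LOSC_CLN_4_V1-1187007040-2048.gwf}"
)
print("missing frame files for:", missing_frames(data_dict))
```

An empty list means all three frame files were found relative to the current directory.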

The next set of arguments set up the likelihood and prior:

In [None]:
################################################################################
## Likelihood arguments
################################################################################

distance-marginalization=True
phase-marginalization=True
time-marginalization=True

################################################################################
## Prior arguments
################################################################################

prior-file = GW170817.prior


The likelihood arguments are flags that specify whether the three parameters `distance, phase, time` should be numerically/analytically marginalized over each time the likelihood is called. Setting these to True can significantly speed up the run, and these parameters can be recovered in postprocessing (i.e., they're not lost if you choose to marginalize over them). The only time you might not want to set these to True is when using a waveform that contains higher-order-mode content. In this case, the prescription for phase marginalization is formally invalid and `phase-marginalization` should be set to False.
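As a toy illustration of what analytic marginalization buys, assume the phase-dependent part of the log-likelihood has the form `kappa * cos(phi - phi0)` (the situation for a waveform without higher-order modes). Averaging the likelihood over a uniform phase then gives the modified Bessel function `I0(kappa)`, so no numerical integral is needed at each likelihood call. The sketch below checks this identity by brute force; the value of `kappa` is made up, and this is a standalone check rather than pbilby's implementation.

```python
import math


def bessel_i0(x, terms=60):
    """Modified Bessel function I0(x) evaluated from its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x / 2.0) ** 2 / (k + 1) ** 2
    return total


def phase_average(kappa, n=2000):
    """Brute-force average of exp(kappa * cos(phi)) over a uniform phase."""
    return sum(math.exp(kappa * math.cos(2.0 * math.pi * i / n))
               for i in range(n)) / n


kappa = 10.0  # made-up amplitude of the phase-dependent term
numeric = phase_average(kappa)
analytic = bessel_i0(kappa)
print(numeric, analytic)
```

The two numbers should agree closely, which is why the phase can be dropped from the sampled parameters and reconstructed afterwards.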

Next we set the prior. These priors are chosen to be wide enough to capture the bulk of the posterior, but narrow enough so the run converges fairly quickly. The prior is specified in the file `GW170817.prior` which is contained in this directory.

The next set of arguments sets the template waveform. Here we will use `IMRPhenomPv2_NRTidal` because it achieves a good trade-off between accuracy and speed. More contemporary waveform models could easily be swapped in. Note we also set `frequency-domain-source-model = lal_binary_neutron_star`. This essentially "turns on" sampling over tidal parameters.

In [None]:
################################################################################
## Waveform arguments
################################################################################

waveform_approximant = IMRPhenomPv2_NRTidal
frequency-domain-source-model = lal_binary_neutron_star

Next, we set up dynesty:

In [None]:
################################################################################
## Sampler settings
################################################################################

sampler = dynesty
nlive = 1000
nact = 5


These settings should be fine for a "quick" run, though our recommended settings for "production" analyses are `nlive=1500` and `nact=10`.

Lastly, we set up the slurm scheduler:
    

In [None]:
################################################################################
## Slurm Settings
################################################################################

nodes = 10
ntasks-per-node = 16
time = 24:00:00

The actual settings you choose will depend entirely on the cluster you run on. Here `ntasks-per-node` is the number of CPUs per node (or cores per node), so this job requests 10 nodes, each with 16 cores, for a total of 160 cores/CPUs.
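The arithmetic behind this request can be written out as a short sketch. The helper names are illustrative; the core-hour figure is only an upper bound, reached if the job runs for its full `time` limit.

```python
# Values copied from the slurm section of GW170817.ini
nodes = 10
ntasks_per_node = 16
wall_time = "24:00:00"  # HH:MM:SS


def total_tasks(nodes, ntasks_per_node):
    """Total number of cores/CPUs requested across all nodes."""
    return nodes * ntasks_per_node


def max_core_hours(nodes, ntasks_per_node, wall_time):
    """Upper bound on core-hours if the job uses its whole time limit."""
    hours, minutes, seconds = (int(x) for x in wall_time.split(":"))
    wall_hours = hours + minutes / 60 + seconds / 3600
    return total_tasks(nodes, ntasks_per_node) * wall_hours


print(total_tasks(nodes, ntasks_per_node))                # 160
print(max_core_hours(nodes, ntasks_per_node, wall_time))  # 3840.0
```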

With the ini file written, running pbilby is a two-step process. First, we run `parallel_bilby_generation`. This creates the run directory structure and produces the `data_dump` file, which contains the data, PSDs, etc., as well as the slurm submit script:

In [2]:
!parallel_bilby_generation GW170817.ini

14:26 bilby_pipe INFO    : Command line arguments: Namespace(Tmax=10000, accounting=None, adapt=False, autocorr_c=5.0, autocorr_tol=50.0, bilby_zero_likelihood_mode=False, burn_in_nact=50.0, calibration_model=None, catch_waveform_errors=False, channel_dict='{H1:LOSC-STRAIN, L1:LOSC-STRAIN, V1=LOSC-STRAIN}', check_point_deltaT=3600, clean=False, cluster=None, coherence_test=False, convert_to_flat_in_component_mass=False, create_plots=False, create_summary=False, data_dict='{H1:raw_data/H-H1_LOSC_CLN_4_V1-1187007040-2048.gwf, L1:raw_data/L-L1_LOSC_CLN_4_V1-1187007040-2048.gwf, V1:raw_data/V-V1_LOSC_CLN_4_V1-1187007040-2048.gwf}', data_dump_file=None, data_format=None, default_prior='BBHPriorDict', deltaT=0.2, detectors=['H1', 'L1', 'V1'], distance_marginalization=True, distance_marginalization_lookup_table=None, dlogz=0.1, do_not_save_bounds_in_resume=False, duration=128.0, dynesty_bound='multi', dynesty_sample='rwalk', email=None, enlarge=1.5, existing_dir=None, extra_likelihood_kwargs=

14:30 bilby INFO    : Complete ini written: outdir/GW170817_config_complete.ini
14:30 bilby INFO    : Setup complete, now run:
 $ bash outdir/submit/bash_GW170817.sh


When this runs successfully, you should see:
```
14:04 bilby INFO    : Complete ini written: outdir/GW170817_config_complete.ini
14:04 bilby INFO    : Setup complete, now run:
$ bash outdir/submit/bash_GW170817.sh
```

If you now inspect the directory, you'll see a new folder called `outdir`. This is where results, logs, data, and submit files are contained.
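One quick way to inspect the new folder from the notebook is to list its contents a couple of levels deep. The helper below is a hypothetical convenience function, not part of pbilby, and it makes no assumption about which subdirectories the generation step creates.

```python
from pathlib import Path


def list_tree(root, max_depth=2):
    """Return relative paths under root, down to max_depth levels."""
    root = Path(root)
    entries = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if len(rel.parts) <= max_depth:
            # Mark directories with a trailing slash
            entries.append(rel.as_posix() + ("/" if path.is_dir() else ""))
    return entries


if Path("outdir").is_dir():
    print("\n".join(list_tree("outdir")))
else:
    print("outdir not found; run parallel_bilby_generation first")
```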

The last thing to do is to run parallel bilby. If you're running on a cluster, the easiest thing to do at this point is to run `bash outdir/submit/bash_GW170817.sh`. However, let's first take a look at the contents of `outdir/submit/bash_GW170817.sh`. The bash script contains instructions to run another script, `analysis_GW170817_0.sh`. Inside `analysis_GW170817_0.sh` is the *actual* command that's submitted by the slurm scheduler:

`mpirun parallel_bilby_analysis outdir/data/GW170817_data_dump.pickle --label GW170817_0 --outdir /path/to/this/example/outdir/result --sampling-seed 1234`

(Here `/path/to/this/example` stands for the absolute path of the directory in which you ran `parallel_bilby_generation`.)


This command can be run directly on your laptop, or on a cluster head node, to test that everything is working.
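For a local test, one option is to cap the number of MPI processes at what the machine can handle. The sketch below rewrites an `mpirun` command to include `-n <n_procs>`, the process-count option accepted by common MPI launchers such as Open MPI and MPICH. The helper name and the shortened example command are hypothetical; check the command in your own analysis script before running anything.

```python
import os
import shlex


def local_test_command(command, n_procs=None):
    """Insert '-n <n_procs>' after 'mpirun' so the job fits on one machine."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "mpirun":
        raise ValueError("expected a command starting with 'mpirun'")
    if n_procs is None:
        # Fall back to the number of cores reported by the OS
        n_procs = os.cpu_count() or 1
    return shlex.join([tokens[0], "-n", str(n_procs)] + tokens[1:])


# Hypothetical command, shortened from the one in the analysis script
cmd = ("mpirun parallel_bilby_analysis "
       "outdir/data/GW170817_data_dump.pickle --label GW170817_0")
print(local_test_command(cmd, n_procs=4))
```

The printed command can then be pasted into a terminal, or into a notebook cell prefixed with `!`.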