# Designing a `pycbc_inference` workflow generator for a batch of injections

>>> **ALGORITHM**

>> **Requirements**:
 1. Take as input **one** `injection.ini` with configuration options to generate a list of
     appropriate injections.
     - This is passed on to `pycbc_create_injections` with a new seed for every run.
 2. Take as input **multiple** `data.ini`:
     - Each can configure the data differently (noise vs. no noise).
 3. Take as input **multiple** `sampler.ini`:
     - Each can configure a different sampler.
 4. Take as input **multiple** `inference.ini` with options for the `pycbc_inference` jobs.
     - Each `inference.ini` can use different sampling priors and `model`s.
 5. Take as input **one** `config.ini` with configuration options for the workflow.
     - Specifies which `data???.ini` to use
     - Specifies which `sampler???.ini` to use
     - Specifies which `inference???.ini` to use
     - Each subset of injections will have a full set of operations.
     - A new directory structure is created as:
         - `${ROOT}/injNNNNNN/` ...


>>> **Outline**:
 1. Read in `config.ini` and parse it. 
 1. Say the user asks for `N` injections, `D` different data configs, `S` different samplers, and `M` different inferencing combinations. We have a total of `N x D x S x M` independent runs.
 1. Create subdirs `injection???` for each of the `N` physically distinct injections.
 1. Create `D` subdirs of each of those for each data config
 1. Create `S` subdirs of each of those for each sampler type
 1. Create `M` subdirs of each of those for each model / prior set
 1. Run directory for each injection is `injection???/data???/sampler???/model???`
 1. Copy over `injection.ini` (common for all injections) to `injection???/data???/sampler???/model???`
 1. Copy over the relevant `data.ini`, `sampler.ini` and `inference???.ini` as `inference.ini`
 1. Create run script for creating the injection. Call it `make_injection.sh`
 1. Create run script for running `pycbc_inference` on it. Call it `run_inference.sh`.
 1. Create `2` `BaseJob`s PER injection + inference combination, and make the former a parent job. **Use fresh seed for the injection job**.
 1. Add both jobs as nodes to the `DAG`
 1. **SUBMIT** the `DAG` to chosen scheduler: condor or slurm;
      - All jobs at once? : of course!
      - Do we check if an analysis is running? : yes, for the `DAG` only though!
      - Do we resume? Or, do we kill and restart all forcefully? (BAD IDEA)
          - Is `pycbc_inference` able to resume **correctly** yet?


**NOTE**: Steps `9-11` assume `Condor`! Upgrade to use `Slurm` later.
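The directory layout described in the outline can be sketched with `itertools.product`. This is a minimal illustration, not the workflow generator itself: the function name `run_directories` is hypothetical, and the naming scheme (zero-padded `injectionNNN`, config file names with the `.ini` suffix stripped) is an assumption modelled on the run directories that appear later in this notebook.

```python
import itertools
import os


def run_directories(n_injections, data_inis, sampler_inis, inference_inis):
    """Return one relative run directory per
    (injection, data, sampler, inference) combination."""
    def tag(ini_name):
        # 'sampler2.ini' -> 'sampler2' (assumed naming convention)
        return os.path.splitext(os.path.basename(ini_name))[0]

    dirs = []
    for i, d, s, m in itertools.product(
            range(n_injections), data_inis, sampler_inis, inference_inis):
        dirs.append(os.path.join(
            "injection{:03d}".format(i), tag(d), tag(s), tag(m)))
    return dirs


# N = 2 injections, D = 1, S = 2, M = 1  ->  N x D x S x M = 4 runs
dirs = run_directories(2, ["data.ini"],
                       ["sampler.ini", "sampler2.ini"],
                       ["inference.ini"])
for d in dirs:
    print(d)
```

Building the full list first keeps the directory creation, config copying and job creation steps as simple loops over the same sequence.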

>> **Implementation**:
 1. `DAG` managed through `glue`,
 2. `CondorJob` managed through `BaseJob`,
 3. Way to define variables in equivalent representation,
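As a `glue`-independent illustration of the parent/child relationship between the two jobs of each run, the sketch below produces the plain-text DAG description that HTCondor DAGMan reads (`JOB` and `PARENT ... CHILD` lines). The helper name `dag_lines`, the node names, and the submit-file names `make_injection.sub` and `run_inference.sub` are all hypothetical; the actual workflow is meant to build the DAG through `glue.pipeline`.

```python
def dag_lines(run_dirs):
    """Return DAGMan description lines: two jobs per run directory,
    with the injection job as parent of the inference job."""
    lines = []
    for idx, run_dir in enumerate(run_dirs):
        inj = "inj{:06d}".format(idx)   # hypothetical node names
        inf = "inf{:06d}".format(idx)
        # DIR makes DAGMan submit the job from inside the run directory
        lines.append("JOB {} make_injection.sub DIR {}".format(inj, run_dir))
        lines.append("JOB {} run_inference.sub DIR {}".format(inf, run_dir))
        # The inference job starts only after its injection exists
        lines.append("PARENT {} CHILD {}".format(inj, inf))
    return lines


for line in dag_lines(["injection000/data/sampler/inference"]):
    print(line)
```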
 

>>> **Technical**
 1. Directory structure for all jobs ... ?

## Imports and setup

In [1]:
import os
import sys
import shutil
import glob
import h5py

import matplotlib.pyplot as plt
from matplotlib import rc
rc('text', usetex = True)

plt.rcParams.update(  # try to match font sizes of document
  {'axes.labelsize': 20,
   'axes.titlesize': 20,
   'legend.fontsize': 20,
   'xtick.labelsize': 20,
   'ytick.labelsize': 20,
   'text.usetex': True,
   'font.family': 'serif',
   'font.serif': ['palatino'],
   'savefig.dpi': 300
   })

%pylab inline

from glue.ligolw import ligolw
from glue.ligolw import table
from glue.ligolw import lsctables
from glue.ligolw import ilwd
from glue.ligolw import utils as ligolw_utils

import pycbc.strain
import pycbc.psd
from pycbc.pnutils import mass1_mass2_to_mchirp_eta
from pycbc.waveform import td_approximants, fd_approximants
from pycbc.waveform import get_two_pol_waveform_filter, get_td_waveform
from pycbc import DYN_RANGE_FAC
from pycbc.types import FrequencySeries, zeros
from pycbc.filter import match, overlap, sigma, make_frequency_series
from pycbc.scheme import CPUScheme, CUDAScheme
from pycbc import pnutils

from GWNRTools.Utils.SupportFunctions import make_padded_frequency_series
from GWNRTools.DataAnalysis import get_unique_hex_tag
import GWNRTools.DataAnalysis as DA

sys.path.append('/home/prayush.kumar/src/GWNRTools/bin/')
sys.path.append('/home/prayush/src/GWNRTools/bin/')
#sys.path.append('/home/prayush.kumar/local/venv/pycbc_master_enigma/src/GWNRTools/bin/')

Populating the interactive namespace from numpy and matplotlib


`%matplotlib` prevents importing * from pylab and numpy
  "\n`%matplotlib` prevents importing * from pylab and numpy"


Could not import ligolw in /home/prayush/src/GWNRTools/GWNRTools/Stats/FisherMatrixUtilities.pyc, LIGO XML tables wont be read


In [2]:
run_dir = '/home/prayush/research/test_pycbc_inf'
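# Create the run directory only if it is missing, instead of
# silencing every possible error with a bare `except`
if not os.path.isdir(run_dir):
    os.makedirs(run_dir)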
os.chdir(run_dir)

In [3]:
!ls

data.ini		    make_inj.sh
emcee_pt-gw150914_like.ini  output_strain_after_inj0.txt
gw150914_like.ini	    output_strain_after_inj2.txt
inference.hdf.bkup	    output_strain_after_inj4.txt
inference.hdf.checkpoint    output_strain_after_inj6.txt
injection.10.hdf	    output_strain_after_inj8.txt
injection.hdf		    output_strain.txt
injection.ini		    run.sh
input_strain.txt


## Injector scripts

In [None]:
with open("injection.ini", "w") as fout:
    fout.write("""\
[static_params]
tc = 1126259462.420
;mass1 = 37
mass2 = 32
ra = 2.2
dec = -1.25
inclination = 2.5
coa_phase = 1.5
polarization = 1.75
distance = 100
f_ref = 20
f_lower = 18
approximant = ENIGMA
taper = start

[variable_params]
mass1 =
eccentricity =
mean_per_ano =

[prior-mass1]
name = uniform
min-mass1 = 10.
max-mass1 = 80.

[prior-eccentricity]
name = uniform
min-eccentricity = 0.
max-eccentricity = 0.2

[prior-mean_per_ano]
name = uniform
min-mean_per_ano = 0.
max-mean_per_ano = 3.1416
""")

In [None]:
!cat injection.ini

In [None]:
with open("make_inj.sh", "w") as fout:
    fout.write("""#!/bin/sh
pycbc_create_injections --verbose \\
        --config-files injection.ini \\
        --ninjections 10 \\
        --seed 10 \\
        --output-file injection.hdf \\
        --variable-params-section variable_params \\
        --static-params-section static_params \\
        --dist-section prior \\
        --force
""")

In [None]:
!cat make_inj.sh

In [None]:
!./make_inj.sh

In [None]:
!h5ls -rv injection.hdf

In [None]:
from pycbc.inject import InjectionSet

In [None]:
inj = InjectionSet("injection.hdf")

In [None]:
for r in inj.table:
    print(r)

In [None]:
for n in r.dtype.names:
    print(n, r[n])

In [None]:
inj_f = h5py.File('injection.hdf', 'r')  # open read-only

In [None]:
inj_f.attrs['mass2']

In [None]:
inj_f['mass1'][()]

In [None]:
inj_f.close()

In [None]:
from argparse import Namespace
opt = Namespace(analysis_end_time=2,
       analysis_start_time=-6,
       asd_file=None,
       autogating_cluster=5.0,
       autogating_max_iterations=1,
       autogating_pad=16, autogating_taper=0.25, autogating_threshold=None, autogating_width=0.25,
       channel_name='H1:STRAIN',
       dq_segment_name='DATA', dq_server='segments.ligo.org', dq_source='any',
       fake_strain='aLIGOaLIGODesignSensitivityT1800044',
       fake_strain_from_file=None, fake_strain_seed=44,
       frame_cache=None, frame_files=None, frame_sieve=None, frame_type=None,
       gate=None, gate_overwhitened=False, gating_file=None,
       gps_end_time=1126259468, gps_start_time=1126259452,
       hdf_store=None,
       injection_f_final=None, injection_f_ref=None,
       injection_file='injection.hdf', injection_scale_factor=1.0,
       instruments=['H1', 'L1'],
       low_frequency_cutoff=20.0,
       normalize_strain=None, pad_data=8,
       psd_end_time=1126259718, psd_estimation='median-mean', psd_file=None,
       psd_gate=None, psd_inverse_length=8.0, psd_model=None, psd_num_segments=None,
       psd_output=None, psd_segment_length=8.0, psd_segment_stride=4.0,
       psd_start_time=1126259206, psdvar_high_freq=None, psdvar_long_segment=None,
       psdvar_low_freq=None, psdvar_psd_duration=None, psdvar_psd_stride=None, psdvar_segment=None,
       psdvar_short_segment=None,
       sample_rate=2048,
       sgburst_injection_file=None,
       strain_high_pass=15.0, taper_data=0,
       trigger_time=1126259462, veto_definer=None, zpk_k=None, zpk_p=None, zpk_z=None)

In [None]:
!ls *txt

In [None]:
inj_tc = 1126259462.420

In [None]:
in_s = np.loadtxt('input_strain.txt')
out_s = np.loadtxt('output_strain.txt')

In [None]:
inj_s = {}
for ii in range(10):
    print("Making figure for injection {}".format(ii))
    inj_s[ii] = np.loadtxt('output_strain_after_inj{}.txt'.format(ii))
    
    figure(figsize = (12, 5))
    plot(in_s[:,0], in_s[:,1], alpha = 0.4, label = 'input')
    plot(inj_s[ii][:,0], inj_s[ii][:,1], alpha = 0.6, label = 'output')
    
    xlim(inj_tc - 8, inj_tc + 2)
    ylim( -2e-20, 2e-20 )
    legend(loc = 'best')
    grid()
    title('INJECTION {}'.format(ii))

## Code

>> **DECISION**: We will generate a new `injection.hdf` for each individual run,
even if the intrinsic parameters of the injection are common amongst many.
>>> **pros**: Each `pycbc_inference` run remains self-contained and independent.<br>
>>> **cons**: Redundant calls to `pycbc_create_injections`, but this is a one-time cost.
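Since every run gets its own `injection.hdf`, each call to `pycbc_create_injections` needs its own seed. One way to obtain distinct yet reproducible seeds is to draw them without replacement from a generator that is itself seeded once. This is only a sketch: the helper name `draw_seeds`, the master seed and the seed range are assumptions chosen for illustration.

```python
import random


def draw_seeds(n_runs, master_seed, low=10, high=10000):
    """Draw `n_runs` distinct integer seeds in [low, high),
    reproducibly determined by `master_seed`."""
    rng = random.Random(master_seed)
    # sample() draws without replacement, so no two runs share a seed
    return rng.sample(range(low, high), n_runs)


seeds = draw_seeds(5, master_seed=1234)
print(seeds)
```

Drawing without replacement avoids the small but nonzero chance that two independent random draws hand the same seed to two runs.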

> **`ConfigWriter`**:
> - takes in an `opts` object that contains info on sections and options
> - writes them to desired output file

In [None]:
class ConfigWriter():
    def __init__(self, opts, run_dir):
        self.opts = opts
        self.run_dir = run_dir
    def write(self):
        return
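The `write` method above is still a stub. A minimal sketch of what it could do, assuming the options arrive as a nested dict of `{section: {option: value}}` (the function name `write_config` and that input layout are assumptions, not the final `ConfigWriter` interface):

```python
import os
import tempfile

try:
    import configparser                   # Python 3
except ImportError:
    import ConfigParser as configparser   # Python 2


def write_config(sections, path):
    """Write a {section: {option: value}} dict to an ini file."""
    # RawConfigParser leaves values such as ${data|trigger-time} untouched
    parser = configparser.RawConfigParser()
    for section, options in sections.items():
        parser.add_section(section)
        for key, value in options.items():
            parser.set(section, key, value)
    with open(path, "w") as fout:
        parser.write(fout)


sections = {"sampler": {"name": "emcee_pt", "nwalkers": "200"}}
path = os.path.join(tempfile.mkdtemp(), "sampler.ini")
write_config(sections, path)

# Read the file back to confirm the round trip
check = configparser.RawConfigParser()
check.read(path)
print(check.items("sampler"))
```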

> **`InferenceConfigs`**:
> - stores all `config.ini` files
> - returns on demand. Compatible with ConfigWriter

In [None]:
# InferenceConfigs
class InferenceConfigs():
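    def __init__(self, opts, run_dir, configs=None):
        '''
 - stores all config.ini files
 - returns on demand. Compatible with ConfigWriter
        '''
        self.opts = opts
        self.run_dir = run_dir
        # Use a fresh dict per instance: a mutable default argument
        # would be shared between all instances of this class.
        if configs is None:
            configs = {}
        assert(isinstance(configs, dict))
        self.configs = configs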
        self.configs['data'] = """\
[data]
instruments = H1 L1
trigger-time = 1126259462.42
analysis-start-time = -6
analysis-end-time = 2
; strain settings
sample-rate = 2048
fake-strain = H1:aLIGOaLIGODesignSensitivityT1800044 L1:aLIGOaLIGODesignSensitivityT1800044
fake-strain-seed = H1:44 L1:45
; psd settings
psd-estimation = median-mean
psd-inverse-length = 8
psd-segment-length = 8
psd-segment-stride = 4
psd-start-time = -256
psd-end-time = 256
; even though we're making fake strain, the strain
; module requires a channel to be provided, so we'll
; just make one up
channel-name = H1:STRAIN L1:STRAIN
; Providing an injection file will cause a simulated
; signal to be added to the data
injection-file = injection.hdf
; We'll use a high-pass filter so as not to get numerical errors from the large
; amplitude low frequency noise. Here we use 15 Hz, which is safely below the
; low frequency cutoff of our likelihood integral (20 Hz)
strain-high-pass = 15
; The pad-data argument is for the high-pass filter: 8s are added to the
; beginning/end of the analysis/psd times when the data is loaded. After the
; high pass filter is applied, the additional time is discarded. This pad is
; *in addition to* the time added to the analysis start/end time for the PSD
; inverse length. Since it is discarded before the data is transformed for the
; likelihood integral, it has little effect on the run time.
pad-data = 8
"""
        self.configs['emcee_pt-gw150914_like'] = """\
[sampler]
name = emcee_pt
nwalkers = 200
ntemps = 20
effective-nsamples = 1000
checkpoint-interval = 2000
max-samples-per-chain = 1000

[sampler-burn_in]
burn-in-test = nacl & max_posterior

;
;   Sampling transforms
;
[sampling_params]
; parameters on the left will be sampled in
; parameters on the right
mass1, mass2 : mchirp, q

[sampling_transforms-mchirp+q]
; inputs mass1, mass2
; outputs mchirp, q
name = mass1_mass2_to_mchirp_q
"""
        self.configs['gw150914_like'] = """\
[model]
name = gaussian_noise
low-frequency-cutoff = 20.0

[variable_params]
; waveform parameters that will vary in MCMC
delta_tc =
mass1 =
mass2 =
spin1_a =
spin1_azimuthal =
spin1_polar =
spin2_a =
spin2_azimuthal =
spin2_polar =
distance =
coa_phase =
inclination =
polarization =
ra =
dec =

[static_params]
; waveform parameters that will not change in MCMC
approximant = ENIGMA
f_lower = 20
f_ref = 20
; we'll set the tc by using the trigger time in the data
; section of the config file + delta_tc
trigger_time = ${data|trigger-time}

[prior-delta_tc]
; coalescence time prior
name = uniform
min-delta_tc = -0.1
max-delta_tc = 0.1

[waveform_transforms-tc]
; we need to provide tc to the waveform generator
name = custom
inputs = delta_tc
tc = ${data|trigger-time} + delta_tc

[prior-mass1]
name = uniform
min-mass1 = 10.
max-mass1 = 80.

[prior-mass2]
name = uniform
min-mass2 = 10.
max-mass2 = 80.

[prior-spin1_a]
name = uniform
min-spin1_a = 0.0
max-spin1_a = 0.99

[prior-spin1_polar+spin1_azimuthal]
name = uniform_solidangle
polar-angle = spin1_polar
azimuthal-angle = spin1_azimuthal

[prior-spin2_a]
name = uniform
min-spin2_a = 0.0
max-spin2_a = 0.99

[prior-spin2_polar+spin2_azimuthal]
name = uniform_solidangle
polar-angle = spin2_polar
azimuthal-angle = spin2_azimuthal

[prior-distance]
; following gives a uniform volume prior
name = uniform_radius
min-distance = 10
max-distance = 1000

[prior-coa_phase]
; coalescence phase prior
name = uniform_angle

[prior-inclination]
; inclination prior
name = sin_angle

[prior-ra+dec]
; sky position prior
name = uniform_sky

[prior-polarization]
; polarization prior
name = uniform_angle
"""
        self.config_names = self.configs.keys()
        
        self.config_writers = {}
        for config_name in self.config_names:
            self.config_writers[config_name] = ConfigWriter(opts, run_dir)
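    def available_configs(self):
        # Query the dict directly so that configs added via `set` are listed
        return list(self.configs.keys())
    def get(self, config_name):
        return self.configs[config_name]
    def set(self, config_name, config):
        # Store the `config` argument (the original referenced
        # the undefined name `configs` here)
        self.configs[config_name] = config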

> **`InjectionInferenceAnalysis`**:
> - setup individual analysis dir
> - setup all analysis dirs
> - start / stop / restart individual analysis
> - check status of individual analysis
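The first responsibility listed above, setting up an individual analysis directory, could look roughly like the sketch below. The function name `setup_analysis_dir` and the `{destination name: source path}` input are hypothetical; the renaming of a chosen config to a canonical name such as `inference.ini` follows the outline at the top of this notebook.

```python
import os
import shutil
import tempfile


def setup_analysis_dir(run_dir, config_files):
    """Create `run_dir` and copy config files into it.

    `config_files` maps the name a file should have inside the
    run directory to the path it is copied from.
    """
    if not os.path.isdir(run_dir):
        os.makedirs(run_dir)
    copied = []
    for dest_name, src in sorted(config_files.items()):
        dest = os.path.join(run_dir, dest_name)
        shutil.copy(src, dest)
        copied.append(dest)
    return copied


# Demonstration in a scratch directory
root = tempfile.mkdtemp()
src = os.path.join(root, "inference002.ini")
with open(src, "w") as fout:
    fout.write("[model]\nname = gaussian_noise\n")

run_dir = os.path.join(root, "injection000", "data", "sampler", "inference002")
copied = setup_analysis_dir(run_dir, {"inference.ini": src})
print(copied)
```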

In [150]:
confs.sections()

['executables', 'workflow', 'inspinj']

In [154]:
confs.get('workflow', 'data').split()

['data.ini']

In [174]:
confs.get('workflow', 'sampler').split()

['sampler.ini', 'sampler2.ini']

In [175]:
from six.moves import configparser as ConfigParser

In [176]:
sampler_configs = {}
for f in confs.get('workflow', 'sampler').split():
    sampler_configs[f] = ConfigParser.ConfigParser()
    sampler_configs[f].read(f)

In [166]:
!cat sampler.ini

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;      SAMPLER
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[sampler]
name = emcee_pt
nwalkers = 200
ntemps = 20
effective-nsamples = 1000
checkpoint-interval = 2000
max-samples-per-chain = 1000

[sampler-burn_in]
burn-in-test = nacl & max_posterior

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;   Sampling transforms
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[sampling_params]
; parameters on the left will be sampled in
; parameters on the right
mass1, mass2 : mchirp, q

[sampling_transforms-mchirp+q]
; inputs mass1, mass2
; outputs mchirp, q
name = mass1_mass2_to_mchirp_q


In [179]:
cp = sampler_configs['sampler2.ini']

In [180]:
cp.sections()

['sampler',
 'sampler-burn_in',
 'sampling_params',
 'sampling_transforms-mchirp+q']

In [171]:
for s in cp.sections():
    print(s)
    print(cp.items(s))

sampler
[('name', 'emcee_pt'), ('nwalkers', '200'), ('ntemps', '20'), ('effective-nsamples', '1000'), ('checkpoint-interval', '2000'), ('max-samples-per-chain', '1000')]
sampler-burn_in
[('burn-in-test', 'nacl & max_posterior')]
sampling_params
[('mass1, mass2', 'mchirp, q')]
sampling_transforms-mchirp+q
[('name', 'mass1_mass2_to_mchirp_q')]


In [231]:
confs.items('executables')

[('inspinj', '/home/prayush/local/venv/pycbc_inf/bin/pycbc_create_injections'),
 ('inference', '/home/prayush/local/venv/pycbc_inf/bin/pycbc_inference')]

In [245]:
confs.items('inspinj')

[('config', 'injection.ini'),
 ('ninjections', '10'),
 ('seed', '1234'),
 ('output-file', 'injection.hdf'),
 ('variable-params-section', 'variable_params'),
 ('static-params-section', 'static_params'),
 ('dist-section prior', ''),
 ('force', '')]
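The `(option, value)` pairs above map naturally onto a command line: options with an empty value become bare flags. The sketch below shows one plausible conversion. The helper `items_to_argv` is hypothetical, and how the workflow script actually consumes these items is an assumption. Splitting the key on whitespace is what would let the entry `'dist-section prior'` expand to `--dist-section prior`.

```python
def items_to_argv(items):
    """Turn (option, value) pairs from a config section into
    a flat list of command-line arguments."""
    argv = []
    for key, value in items:
        # 'dist-section prior' -> ['--dist-section', 'prior']
        argv.extend(("--" + key).split())
        # Empty values contribute nothing, leaving a bare flag
        argv.extend(value.split())
    return argv


items = [("ninjections", "10"),
         ("seed", "1234"),
         ("dist-section prior", ""),
         ("force", "")]
print(" ".join(items_to_argv(items)))
```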

In [244]:
confs.set('inspinj', 'seed', '1234')

In [238]:
def test_local_dir_change():
    idir = os.getcwd()
    print(idir)
    os.chdir('/home/prayush/research/')
    print(os.getcwd())    
    os.chdir(idir)

In [239]:
test_local_dir_change()

/home/prayush/research
/home/prayush/research


In [189]:
np.random.randint(10, 10000)

3274

In [195]:
os.getcwd()

'/home/prayush/research/test_pycbc_inf2'

In [209]:
'data.ini'.split('.ini')[0]

'data'

In [196]:
import itertools

In [199]:
for p in itertools.product(['data.ini', 'data2.ini'], ['sampler.ini', 'sampler2.ini'], ['model.ini']):
    print(p)

('data.ini', 'sampler.ini', 'model.ini')
('data.ini', 'sampler2.ini', 'model.ini')
('data2.ini', 'sampler.ini', 'model.ini')
('data2.ini', 'sampler2.ini', 'model.ini')


In [200]:
list(itertools.product(['data.ini', 'data2.ini'], ['sampler.ini', 'sampler2.ini'], ['model.ini']))

[('data.ini', 'sampler.ini', 'model.ini'),
 ('data.ini', 'sampler2.ini', 'model.ini'),
 ('data2.ini', 'sampler.ini', 'model.ini'),
 ('data2.ini', 'sampler2.ini', 'model.ini')]

In [229]:
len('')

0

In [206]:
confs['x'], confs['y'], confs['z'] = ['data.ini', 'data2.ini', 'data3.ini']

In [224]:
confs.items('executables')

[('inspinj', '/home/prayush/local/venv/pycbc_inf/bin/pycbc_create_injections'),
 ('inference', '/home/prayush/local/venv/pycbc_inf/bin/pycbc_inference')]

In [131]:
with open('data.ini', 'r') as d:
    x = d.read()

In [133]:
with open('/home/prayush/test.ini', 'w') as f:
    f.write(x)

In [137]:
!git diff --no-index data.ini /home/prayush/test.ini

## Workflow generator

In [27]:
from glue.pipeline import CondorDAGJob, CondorDAGNode, CondorDAG, CondorJob

In [104]:
!pwd

/home/prayush/research/test_pycbc_inf


In [2]:
os.chdir('/home/prayush/research//test_pycbc_inf2/')

In [249]:
!ls

config.ini  delme	   injection.ini  plots		sampler.ini
data.ini    inference.ini  log		  sampler2.ini	scripts


In [157]:
# Write CONFIGS
with open("injection.ini", "w") as fout:
    fout.write("""\
[static_params]
tc = 1126259462.420
;mass1 = 37
mass2 = 32
ra = 2.2
dec = -1.25
inclination = 2.5
coa_phase = 1.5
polarization = 1.75
distance = 100
f_ref = 20
f_lower = 18
approximant = ENIGMA
taper = start

[variable_params]
mass1 =
eccentricity =
mean_per_ano =

[prior-mass1]
name = uniform
min-mass1 = 10.
max-mass1 = 80.

[prior-eccentricity]
name = uniform
min-eccentricity = 0.
max-eccentricity = 0.2

[prior-mean_per_ano]
name = uniform
min-mean_per_ano = 0.
max-mean_per_ano = 3.1416
""")


# data.ini
with open("data.ini", "w") as fout:
    fout.write("""\
[data]
instruments = H1 L1
trigger-time = 1126259462.42
analysis-start-time = -6
analysis-end-time = 2
; strain settings
sample-rate = 2048
fake-strain = H1:aLIGOaLIGODesignSensitivityT1800044 L1:aLIGOaLIGODesignSensitivityT1800044
fake-strain-seed = H1:44 L1:45
; psd settings
psd-estimation = median-mean
psd-inverse-length = 8
psd-segment-length = 8
psd-segment-stride = 4
psd-start-time = -256
psd-end-time = 256
; even though we're making fake strain, the strain
; module requires a channel to be provided, so we'll
; just make one up
channel-name = H1:STRAIN L1:STRAIN
; Providing an injection file will cause a simulated
; signal to be added to the data
injection-file = injection.hdf
; We'll use a high-pass filter so as not to get numerical errors from the large
; amplitude low frequency noise. Here we use 15 Hz, which is safely below the
; low frequency cutoff of our likelihood integral (20 Hz)
strain-high-pass = 15
; The pad-data argument is for the high-pass filter: 8s are added to the
; beginning/end of the analysis/psd times when the data is loaded. After the
; high pass filter is applied, the additional time is discarded. This pad is
; *in addition to* the time added to the analysis start/end time for the PSD
; inverse length. Since it is discarded before the data is transformed for the
; likelihood integral, it has little effect on the run time.
pad-data = 8
""")
    

# sampler.ini
with open("sampler.ini", "w") as fout:
    fout.write("""\
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;      SAMPLER
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[sampler]
name = emcee_pt
nwalkers = 200
ntemps = 20
effective-nsamples = 1000
checkpoint-interval = 2000
max-samples-per-chain = 1000

[sampler-burn_in]
burn-in-test = nacl & max_posterior

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;   Sampling transforms
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[sampling_params]
; parameters on the left will be sampled in
; parameters on the right
mass1, mass2 : mchirp, q

[sampling_transforms-mchirp+q]
; inputs mass1, mass2
; outputs mchirp, q
name = mass1_mass2_to_mchirp_q
""")

import subprocess  # not imported in the setup cell above
subprocess.call(['cp', '-v', 'sampler.ini', 'sampler2.ini'])

# inference.ini
with open("inference.ini", "w") as fout:
    fout.write("""\
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;   Model
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[model]
name = gaussian_noise
low-frequency-cutoff = 20.0


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;   Sampling parameters
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[variable_params]
; waveform parameters that will vary in MCMC
delta_tc =
mass1 =
mass2 =
spin1_a =
spin1_azimuthal =
spin1_polar =
spin2_a =
spin2_azimuthal =
spin2_polar =
distance =
coa_phase =
inclination =
polarization =
ra =
dec =

[static_params]
; waveform parameters that will not change in MCMC
approximant = IMRPhenomPv2
f_lower = 20
f_ref = 20
; we'll set the tc by using the trigger time in the data
; section of the config file + delta_tc
trigger_time = ${data|trigger-time}

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;   Sampling priors
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[prior-delta_tc]
; coalescence time prior
name = uniform
min-delta_tc = -0.1
max-delta_tc = 0.1

[waveform_transforms-tc]
; we need to provide tc to the waveform generator
name = custom
inputs = delta_tc
tc = ${data|trigger-time} + delta_tc

[prior-mass1]
name = uniform
min-mass1 = 10.
max-mass1 = 80.

[prior-mass2]
name = uniform
min-mass2 = 10.
max-mass2 = 80.

[prior-spin1_a]
name = uniform
min-spin1_a = 0.0
max-spin1_a = 0.99

[prior-spin1_polar+spin1_azimuthal]
name = uniform_solidangle
polar-angle = spin1_polar
azimuthal-angle = spin1_azimuthal

[prior-spin2_a]
name = uniform
min-spin2_a = 0.0
max-spin2_a = 0.99

[prior-spin2_polar+spin2_azimuthal]
name = uniform_solidangle
polar-angle = spin2_polar
azimuthal-angle = spin2_azimuthal

[prior-distance]
; following gives a uniform volume prior
name = uniform_radius
min-distance = 10
max-distance = 1000

[prior-coa_phase]
; coalescence phase prior
name = uniform_angle

[prior-inclination]
; inclination prior
name = sin_angle

[prior-ra+dec]
; sky position prior
name = uniform_sky

[prior-polarization]
; polarization prior
name = uniform_angle
""")

# config.ini (workflow configuration)
with open("config.ini", "w") as fout:
    fout.write("""\
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;   Executables
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[executables]
inspinj = ${which:pycbc_create_injections}
inference = ${which:pycbc_inference}
plot = ${which:pycbc_inference_plot_posterior}

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;   Workflow
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[workflow]
accounting-group = ligo.dev.o3.cbc.explore.test
templates-per-job = 100
log-path = log
banksim-request-memory = 8G
data = data.ini
sampler = sampler.ini sampler2.ini
inference = inference.ini

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;   Injections
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[inspinj]
config-files = injection.ini
ninjections = 10
seed = 10
output-file = injection.hdf
variable-params-section = variable_params
static-params-section = static_params
dist-section prior =
force =

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;   Inference
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[inference]
verbose =
seed = 12
config-files = inference.ini data.ini sampler.ini
output-file = inference.hdf
nprocesses = 10
force =

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;   Visualize
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
[plot]
input-file = inference.hdf
output-file = plots/posteriors.png
plot-scatter =
plot-marginal =
plot-prior = inference.ini data.ini
""")

In [130]:
!cat injection.ini
!cat data.ini
!cat sampler.ini
!cat inference.ini
!cat config.ini

[static_params]
tc = 1126259462.420
;mass1 = 37
mass2 = 32
ra = 2.2
dec = -1.25
inclination = 2.5
coa_phase = 1.5
polarization = 1.75
distance = 100
f_ref = 20
f_lower = 18
approximant = ENIGMA
taper = start

[variable_params]
mass1 =
eccentricity =
mean_per_ano =

[prior-mass1]
name = uniform
min-mass1 = 10.
max-mass1 = 80.

[prior-eccentricity]
name = uniform
min-eccentricity = 0.
max-eccentricity = 0.2

[prior-mean_per_ano]
name = uniform
min-mean_per_ano = 0.
max-mean_per_ano = 3.1416
[data]
instruments = H1 L1
trigger-time = 1126259462.42
analysis-start-time = -6
analysis-end-time = 2
; strain settings
sample-rate = 2048
fake-strain = H1:aLIGOaLIGODesignSensitivityT1800044 L1:aLIGOaLIGODesignSensitivityT1800044
fake-strain-seed = H1:44 L1:45
; psd settings
psd-estimation = median-mean
psd-inverse-length = 8
psd-segment-length = 8
psd-segment-stride = 4
psd-start-time = -256
psd-end-time = 256
; even though we're making fake strain, the strain
; module requires a channel to be prov

In [13]:
!/home/prayush/src/GWNRTools/bin/gwnrtools_create_injection_inference_workflow -h

Could not import ligolw in /home/prayush/src/GWNRTools/GWNRTools/Stats/FisherMatrixUtilities.pyc, LIGO XML tables wont be read
usage: /home/prayush/src/GWNRTools/bin/gwnrtools_create_injection_inference_workflow [--options]

Setup workflow to perferm Bayesian parameter estimation runs on a custom set
of simulated signals

optional arguments:
  -h, --help            show this help message and exit
  --version             Prints version information.
  --verbose             Print logging messages.
  --skip-creating-injections
                        Skip calling lalapps_inspinj and assume injections
                        already exist
  --output-file OUTPUT_FILE
                        Output file path.
  --force               If the output-file already exists, overwrite it.
                        Otherwise, an OSError is raised.
  --save-backup         Don't delete the backup file after the run has
                        completed.
  --nprocesses NPROCESSES
                        Nu

In [25]:
!which pycbc_create_injections

/home/prayush/local/venv/pycbc_inf/bin/pycbc_create_injections


In [158]:
!/home/prayush/src/GWNRTools/bin/gwnrtools_create_injection_inference_workflow --verbose\
    --config-file=config.ini --force

Could not import ligolw in /home/prayush/src/GWNRTools/GWNRTools/Stats/FisherMatrixUtilities.pyc, LIGO XML tables wont be read
2020-02-25 11:43:26,023 Using seed 0
2020-02-25 11:43:26,024 Will setup analyses in /home/prayush/research/test_pycbc_inf2
2020-02-25 11:43:26,024 Running with CPU support: 1 threads
2020-02-25 11:43:26,120 Reading configuration file
2020-02-25 11:43:26,121 Making workspace directories
2020-02-25 11:43:26,144 Creating DAG
('config options: ', <pycbc.workflow.configuration.WorkflowConfigParser instance at 0x7f4e7f1e6488>)
2020-02-25 11:43:26,144 Making injection000/data/sampler2/inference in /home/prayush/research/test_pycbc_inf2
2020-02-25 11:43:26,180 Copying config files to injection000/data/sampler2/inference
2020-02-25 11:43:26,180 Copying executables to injection000/data/sampler2/inference/scripts/
2020-02-25 11:43:26,184 Making injection001/data/sampler2/inference in /home/prayush/research/test_pycbc_inf2
2020-02-25 11:43:26,217 Copying config files to in

In [42]:
os.path.dirname(os.path.abspath('injection000/data/sampler/inference/make_injection'))

'/home/prayush/research/test_pycbc_inf2/injection000/data/sampler/inference'

In [149]:
import copy

def populate(options, k):
    
    output = []
    
    if k == 0: return [[]]
    
    for i, first_elem in enumerate(options, 1):
        
        #first_elem = options[i]
        
        #reduced_options = copy.deepcopy(options)
        #reduced_options.pop(i)
        reduced_options = options[i:]
                
        last_k_minus_1_elems = populate(reduced_options, k - 1)
        
        for j in range(len(last_k_minus_1_elems)):
            li = last_k_minus_1_elems[j]
            li.extend([first_elem])
        
        print(last_k_minus_1_elems)
        
        output.extend(last_k_minus_1_elems)
            
    return output

def combinations(sequence, length, NULL = object()):
    if length <= 0:
        combos = [NULL]
    else:
        combos = []
        for i, item in enumerate(sequence, 1):
            rem_items = sequence[i:]
            rem_combos = combinations(rem_items, length - 1)
            
            print([item if combo is NULL else [item, combo] for combo in rem_combos])
            
            combos.extend(item if combo is NULL else [item, combo] for combo in rem_combos)
    return combos

In [150]:
populate([1,2,3,4], 2)

[[2]]
[[3]]
[[4]]
[[2, 1], [3, 1], [4, 1]]
[[3]]
[[4]]
[[3, 2], [4, 2]]
[[4]]
[[4, 3]]
[]


[[2, 1], [3, 1], [4, 1], [3, 2], [4, 2], [4, 3]]

In [136]:
combinations([1,2,3,4], 2)

[2]
[3]
[4]
[[1, 2], [1, 3], [1, 4]]
[3]
[4]
[[2, 3], [2, 4]]
[4]
[[3, 4]]
[]


[[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]
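For pairs, the hand-written `combinations` above can be cross-checked against the standard library's `itertools.combinations`, which yields tuples in the same lexicographic order. Note that for `length > 2` the hand-written version nests its results (`[item, combo]` with `combo` itself a list), whereas the library version always returns flat tuples.

```python
import itertools

# Convert the tuples to lists to match the output format used above
pairs = [list(c) for c in itertools.combinations([1, 2, 3, 4], 2)]
print(pairs)
# -> [[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]
```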

In [96]:
'SEOBNRv4' in td_approximants()

True

In [None]:
# DATA.INI
with open("data.ini", "w") as fout:
    fout.write("""\
[data]
instruments = H1 L1
trigger-time = 1126259462.42
analysis-start-time = -6
analysis-end-time = 2
; strain settings
sample-rate = 2048
fake-strain = H1:aLIGOaLIGODesignSensitivityT1800044 L1:aLIGOaLIGODesignSensitivityT1800044
fake-strain-seed = H1:44 L1:45
; psd settings
psd-estimation = median-mean
psd-inverse-length = 8
psd-segment-length = 8
psd-segment-stride = 4
psd-start-time = -256
psd-end-time = 256
; even though we're making fake strain, the strain
; module requires a channel to be provided, so we'll
; just make one up
channel-name = H1:STRAIN L1:STRAIN
; Providing an injection file will cause a simulated
; signal to be added to the data
injection-file = injection.hdf
; We'll use a high-pass filter so as not to get numerical errors from the large
; amplitude low frequency noise. Here we use 15 Hz, which is safely below the
; low frequency cutoff of our likelihood integral (20 Hz)
strain-high-pass = 15
; The pad-data argument is for the high-pass filter: 8s are added to the
; beginning/end of the analysis/psd times when the data is loaded. After the
; high pass filter is applied, the additional time is discarded. This pad is
; *in addition to* the time added to the analysis start/end time for the PSD
; inverse length. Since it is discarded before the data is transformed for the
; likelihood integral, it has little effect on the run time.
pad-data = 8
""")

In [None]:
!cat data.ini

In [None]:
# emcee_pt-gw150914_like
with open("emcee_pt-gw150914_like.ini", "w") as fout:
    fout.write("""\
[sampler]
name = emcee_pt
nwalkers = 200
ntemps = 20
effective-nsamples = 1000
checkpoint-interval = 2000
max-samples-per-chain = 1000

[sampler-burn_in]
burn-in-test = nacl & max_posterior

;
;   Sampling transforms
;
[sampling_params]
; parameters on the left will be sampled in
; parameters on the right
mass1, mass2 : mchirp, q

[sampling_transforms-mchirp+q]
; inputs mass1, mass2
; outputs mchirp, q
name = mass1_mass2_to_mchirp_q
""")

In [None]:
!cat emcee_pt-gw150914_like.ini

In [None]:
# gw150914_like
with open("gw150914_like.ini", "w") as fout:
    fout.write("""\
[model]
name = gaussian_noise
low-frequency-cutoff = 20.0

[variable_params]
; waveform parameters that will vary in MCMC
delta_tc =
mass1 =
mass2 =
spin1_a =
spin1_azimuthal =
spin1_polar =
spin2_a =
spin2_azimuthal =
spin2_polar =
distance =
coa_phase =
inclination =
polarization =
ra =
dec =

[static_params]
; waveform parameters that will not change in MCMC
approximant = ENIGMA
f_lower = 20
f_ref = 20
; we'll set the tc by using the trigger time in the data
; section of the config file + delta_tc
trigger_time = ${data|trigger-time}

[prior-delta_tc]
; coalescence time prior
name = uniform
min-delta_tc = -0.1
max-delta_tc = 0.1

[waveform_transforms-tc]
; we need to provide tc to the waveform generator
name = custom
inputs = delta_tc
tc = ${data|trigger-time} + delta_tc

[prior-mass1]
name = uniform
min-mass1 = 10.
max-mass1 = 80.

[prior-mass2]
name = uniform
min-mass2 = 10.
max-mass2 = 80.

[prior-spin1_a]
name = uniform
min-spin1_a = 0.0
max-spin1_a = 0.99

[prior-spin1_polar+spin1_azimuthal]
name = uniform_solidangle
polar-angle = spin1_polar
azimuthal-angle = spin1_azimuthal

[prior-spin2_a]
name = uniform
min-spin2_a = 0.0
max-spin2_a = 0.99

[prior-spin2_polar+spin2_azimuthal]
name = uniform_solidangle
polar-angle = spin2_polar
azimuthal-angle = spin2_azimuthal

[prior-distance]
; following gives a uniform volume prior
name = uniform_radius
min-distance = 10
max-distance = 1000

[prior-coa_phase]
; coalescence phase prior
name = uniform_angle

[prior-inclination]
; inclination prior
name = sin_angle

[prior-ra+dec]
; sky position prior
name = uniform_sky

[prior-polarization]
; polarization prior
name = uniform_angle
""")

In [None]:
!cat gw150914_like.ini

In [None]:
# RUN.SH
with open("run.sh", "w") as fout:
    fout.write("""#!/bin/sh

# sampler parameters
PRIOR_CONFIG=gw150914_like.ini
DATA_CONFIG=data.ini
SAMPLER_CONFIG=emcee_pt-gw150914_like.ini
OUTPUT_PATH=inference.hdf

# the following sets the number of cores to use; adjust as needed to
# your computer's capabilities
NPROCS=10

# run sampler
# Running with OMP_NUM_THREADS=1 stops lalsimulation
# from spawning multiple threads that would otherwise compete
# with the pycbc_inference processes for cores and slow the run down.
OMP_NUM_THREADS=1 \\
pycbc_inference --verbose \\
    --seed 12 \\
    --config-file ${PRIOR_CONFIG} ${DATA_CONFIG} ${SAMPLER_CONFIG} \\
    --output-file ${OUTPUT_PATH} \\
    --nprocesses ${NPROCS} \\
    --force
""")

In [None]:
!cat run.sh