# To start a PyCBC run:

Everything you need to start your runs is in this directory **which you need to copy** to where you'd like to run from:

In [None]:
cp -r /home/hannah.griggs/nu/pynu_tests/o2grbs/pynuruns .  # "." copies into the current directory; replace it with where you want to run from

Once you copy that directory, you will need to adjust some of the files to point to your directories/namespace. To see all of the places in here that point to MY directory, run 

In [None]:
grep -i "hannah.griggs" *

This should print every file that contains "hannah.griggs", along with each matching line. Yay grep!
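Note that `*` only matches entries in the current directory, so files inside subdirectories (such as `maps`) are not searched. A minimal sketch of a recursive search, assuming a standard `grep`; the function name `files_mentioning` is hypothetical and not part of pynuruns:

```shell
#!/usr/bin/env bash
# Hypothetical helper: list every line under a directory that mentions a
# username, searching subdirectories as well.
files_mentioning() {
    local user="$1" dir="${2:-.}"
    # -r recurse, -i ignore case, -n show line numbers, -I skip binary files,
    # -F treat the pattern as a fixed string (so "." is not a wildcard)
    grep -rinIF -- "$user" "$dir"
}
```

For example, `files_mentioning hannah.griggs .` run from inside your copy of pynuruns.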

**You only need to change the INI_LOC and STATISTIC_FILE lines in runhlo2.sh**. 

The STATISTIC_FILE should be changed to the location of the custom dtdp PDF that you made.

**Launch the necessary environment (pynumods)**.

You must deactivate the (igwn) environment first, then activate the modified pynu environment with the `source` command:

In [None]:
conda deactivate # Get out of the (igwn) environment. Make sure to have no active environment at all before sourcing.
source src/nu-dev/pynumods/bin/activate # Activate the pynu environment.
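To confirm that no environment is active before sourcing, a small guard can help. This is a sketch that assumes conda's usual behavior of setting `CONDA_DEFAULT_ENV` while an environment is active and unsetting it once the last one is deactivated; `no_conda_env_active` is a hypothetical name:

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeed only when no conda environment is active.
# Assumes conda sets CONDA_DEFAULT_ENV to the active environment's name.
no_conda_env_active() {
    [ -z "${CONDA_DEFAULT_ENV:-}" ]
}

if no_conda_env_active; then
    echo "No conda environment active; safe to source the pynu environment."
else
    echo "Still inside '${CONDA_DEFAULT_ENV}'; run 'conda deactivate' again."
fi
```

If (base) is still active after leaving (igwn), a second `conda deactivate` may be needed.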

## Part 1: Setting up your GRB

For the O2 rerun experiments, we are rerunning the O2 PyCBC boxes with targeted TPA PDFs. 

**Gather your GRB information**

Visit https://docs.google.com/spreadsheets/d/1FuLUsVUoQGJPYha1vU2znyChYIt1mCyerwVhiiYxeBU/edit?usp=sharing to see the list of GRBs for O2. This contains the information you will use to generate the sky-phase-amplitude PDFs, following the TPA PDF tutorial.

Once the PDF is generated, identify which PyCBC offline chunk the GRB GPS time falls into.

Chunk times are gathered into this directory:

In [None]:
/home/hannah.griggs/nu/pynu_tests/o2grbs/chunkinis

I also copied this directory into pynuruns. The files within are actually the full configurations for the offline O2 PyCBC runs, but to see the GPS time span for each, run:

In [None]:
head ch2

Replace `ch2` with the chunk you want to look at. The `head` command prints the first 10 lines of a file. Likewise, `tail` prints the last 10 lines.

Once you find the correct chunk, note the start and end times. You will use these for the analysis.

## Part 2: Running an analysis with reused data:

In the pynuruns directory are ini files for the pycbc workflow. 

### **runhlo2.sh** will need to be edited as:

In [None]:
WORKFLOW_NAME=mmatest ## you can keep this the same
CONFIG_TAG=v2.3.2.3  ## keep this
GITLAB_URL="https://git.ligo.org/pycbc/offline-analysis/-/raw/${CONFIG_TAG}/production/o4/broad/config"
ID=grb161210524 ## Change this to your GRB name
RUNID=_4 ## If this is a rerun, use this line to indicate which rerun. Otherwise this can be blank.

In [None]:
INI_LOC="/home/hannah.griggs/nu/pynu_tests/o2grbs/" # Change to your pynuruns location
STATISTIC_FILE="/home/hannah.griggs/nu/pynu_tests/skyloc/dtdphase/L1H1-stat-GRB161212652.hdf" # Change to your custom PDF location
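If you would rather not edit these two lines by hand, the change can be scripted. A minimal sketch, assuming each assignment starts at the beginning of a line in `runhlo2.sh`; `set_script_var` is a hypothetical helper, not something shipped in pynuruns:

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite a NAME="value" assignment in a shell script.
# Assumes the assignment starts at the beginning of a line and that the new
# value contains no '|' or '&' characters (both are special to this sed call).
set_script_var() {
    local file="$1" name="$2" value="$3"
    # Keep a .bak copy of the original so the edit is easy to undo.
    sed -i.bak "s|^${name}=.*|${name}=\"${value}\"|" "$file"
}
```

For example, `set_script_var runhlo2.sh INI_LOC "/home/albert.einstein/pynuruns/"`, where the path is a placeholder for your own. Any trailing comment on the rewritten line is dropped, and the original file is kept as `runhlo2.sh.bak`.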

### To rerun an analysis using existing results, we need to use a **cache file**

In the `maps` directory in pynuruns, you will notice a file called `chunk2.map`. This is an example of a cache file that tells PyCBC the jobs it doesn't need to redo. In that file are two `HDF_TRIGGER_MERGE` files, one for each IFO.

With the GPS start and end times you identified for the PyCBC chunk corresponding to your GRB, locate the `HDF_TRIGGER_MERGE` file that matches the chunk times in this archive directory:

In [None]:
/home/ian.harry/aLIGO/O2/analyses/ALL_TRIGGER_FILES/
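The file names in that directory appear to follow the pattern `IFO-DESCRIPTION-GPSSTART-DURATION.hdf`, so the matching can be scripted. This is a sketch under that assumption; both function names are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a GPS time falls inside the span encoded in
# a file name of the form IFO-DESCRIPTION-GPSSTART-DURATION.hdf.
gps_in_file_span() {
    local gps="$1" name stem duration rest start
    name="$(basename "$2")"
    stem="${name%.hdf}"          # drop the extension
    duration="${stem##*-}"       # last dash-separated field
    rest="${stem%-*}"            # everything before the duration
    start="${rest##*-}"          # second-to-last field
    # Half-open interval: start <= gps < start + duration
    [ "$gps" -ge "$start" ] && [ "$gps" -lt $((start + duration)) ]
}

# Print every HDF_TRIGGER_MERGE file in a directory whose span contains gps.
find_trigger_files() {
    local gps="$1" dir="$2" f
    for f in "$dir"/*-HDF_TRIGGER_MERGE_*.hdf; do
        [ -e "$f" ] || continue   # the glob matched nothing
        if gps_in_file_span "$gps" "$f"; then
            echo "$f"
        fi
    done
}
```

Calling `find_trigger_files <GRB GPS time> /home/ian.harry/aLIGO/O2/analyses/ALL_TRIGGER_FILES` should then list one file per IFO for the chunk containing your GRB.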

Copy the file for each IFO (the example below shows H1) to your `chunkinis` directory. By trial and error, transfers seem to succeed more often when the file is in your namespace:

In [None]:
cp /home/ian.harry/aLIGO/O2/analyses/ALL_TRIGGER_FILES/H1-HDF_TRIGGER_MERGE_FULL_DATA-1164556817-1929600.hdf chunkinis

Copy the `chunk2.map` cache file and name it after the chunk you are working with. 

The files have entries that tell PyCBC where to find the files that can be reused, like:

In [None]:
L1-HDF_TRIGGER_MERGE_FULL_DATA-1164556817-1929600.hdf /home/hannah.griggs/nu/pynu_tests/o2grbs/chunkinis/L1-HDF_TRIGGER_MERGE_FULL_DATA-1164556817-1929600.hdf pool="local"

**Replace the GPS times in the file with those matching the `HDF_TRIGGER_MERGE` file you found.**
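As an alternative to editing the copy by hand, the entries can be generated. A minimal sketch, assuming the cache file holds exactly one `HDF_TRIGGER_MERGE_FULL_DATA` entry per IFO in the `name path pool="local"` format shown above; `make_map_entries` is a hypothetical helper:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print cache-file entries for both IFOs of one chunk.
# Each line has the form: <file name> <full path> pool="local"
make_map_entries() {
    local dir="${1%/}" start="$2" duration="$3" ifo name
    for ifo in H1 L1; do
        name="${ifo}-HDF_TRIGGER_MERGE_FULL_DATA-${start}-${duration}.hdf"
        printf '%s %s/%s pool="local"\n' "$name" "$dir" "$name"
    done
}
```

For example, `make_map_entries /path/to/your/chunkinis 1164556817 1929600 > maps/mychunk.map`, where the directory and the output name `mychunk.map` are placeholders and the two numbers are the GPS start and duration from the trigger file name.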

Edit `runhlo2.sh` to point to your cache file instead of `chunk2.map`.

In [None]:
  --cache-file maps/chunk2.map \

If you happen to need chunk 2, then these steps are done for you.

### Good to go! Run the analysis with:

In [None]:
./runhlo2.sh

You will be prompted to enter your password, then it'll be off. 

## Part 3: Troubleshooting if jobs are struggling:

You'll need to babysit the jobs since they've been having issues with disk space.
Check how the queue is doing from within the run directory with:

In [None]:
./status

**If a small cluster of jobs fails**, let the analysis get as far as it can until the status updates to (FAILURE).

### Restarting a job that failed

Once it fails, edit the "start" script (in your run output directory) to include the preamble for the run.sh script (for authentication reasons):

In [None]:
ecp-get-cert --destroy
htdestroytoken
kinit hannah.griggs ## REMEMBER TO CHANGE TO YOUR NAME
unset XDG_RUNTIME_DIR
htgettoken -a vault.ligo.org -i igwn --scopes dqsegdb.read,gwdatafind.read,read:/frames,read:/ligo,read:/virgo,read:/kagra
condor_vault_storer -v igwn
export GWDATAFIND_SERVER="datafind.ligo.org:443"
PEGASUS_PYTHON=/home/ian.harry/conda_envs/pegasus_python/bin/python PATH=/home/ian.harry/conda_envs/pegasus_python/bin/:${PATH}

pegasus-run /local/hannah.griggs/pycbc-tmp_u_1hqa3g/work $@

Then you can restart the job with:

In [None]:
./start

### Restarting a job that's held

**If jobs are getting held**, see the reason with:

In [None]:
condor_q -better-analyze

This will tell you which job requirements are insufficient and by how much. If memory or disk space are the problem, update held jobs like this:

In [None]:
condor_qedit -constraint "JOBSTATUS==5" RequestDisk=newrequestamount

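Change RequestDisk to RequestMemory as needed, and only request a little over what the jobs seem to need. HTCondor stores RequestDisk in KiB and RequestMemory in MiB, so give the new amount in those units.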
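To pick "a little over" consistently, the new amount can be computed from the observed usage. A minimal sketch; `padded_request` and its default 10% headroom are assumptions for illustration, not an HTCondor feature:

```shell
#!/usr/bin/env bash
# Hypothetical helper: add a percentage of headroom to an observed usage and
# round up to a whole number, keeping the same units as the input.
padded_request() {
    local usage="$1" percent="${2:-10}"
    echo $(( (usage * (100 + percent) + 99) / 100 ))
}
```

For example, `condor_qedit -constraint "JOBSTATUS==5" RequestDisk "$(padded_request 4000000)"` would request 4400000; the 4000000 here is a placeholder for the usage your jobs actually report.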

Release jobs again with 

In [None]:
condor_release -constraint "JOBSTATUS==5"