# To start a PyCBC run:

Everything you need to start your runs is in this directory **which you need to copy** to where you'd like to run from:

In [None]:
cp -r /home/hannah.griggs/nu/pynu_tests/o2grbs/pynuruns <your-run-location>

Once you copy that directory, you will need to adjust some of the files to point to your own directories/namespace. To see every place that still points to MY directory, run this from inside your copy:

In [None]:
grep -i "hannah.griggs" *

This should print every file that contains "hannah.griggs", along with each matching line (add `-r` to search subdirectories too). Yay grep!

**You only need to change the INI_LOC and STATISTIC_FILE lines in runhlo2.sh**. 

The STATISTIC_FILE should be changed to the location of the custom dtdp PDF that you made.

#### Refer to DtDpPDFTutorial for how to generate the TPA PDF for your GRB

**Launch the necessary environment (pynumods)**.

You must deactivate the (igwn) environment first, then activate the modified pynu environment with the `source` command:

In [None]:
conda deactivate # Get out of the (igwn) environment. Make sure to have no active environment at all before sourcing.
source src/nu-dev/pynumods/bin/activate # Activate the pynu environment.

## Part 0: **BEFORE STARTING**, Check if your GRB has an associated potential PyCBC trigger at all.

I have a script which checks the existing PyCBC all sky offline search for evidence of *any* trigger which might fall into an acceptable time window with the GRB event: `timewindow.py`

This lives in the `pynuruns` directory you just copied over.

To run, gather the GRB's T90 GPS time and the chunk number of your GRB. We will use +/- 10 seconds as a generous window for when a GW might arrive associated with a GRB. This time window can change depending on the event you're working with, but for GRBs, we'll use 10s.

Run:

In [None]:
python timewindow.py <chunk-number> <T90-time> <time-window>

For example, for GRB 170127067:

In [None]:
python timewindow.py 4 1169516165.79 10

**If the script returns the following**: `No triggers found within the given time window. Skip this event.`, then that means there is no viable trigger. Move on to your next GRB. Enter `NaN	NaN	0` into the GRB table to indicate this result.

**If it returns a line of the allsky output file**, then continue to Part 1.
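
The check itself is simple to picture. The sketch below is not the contents of `timewindow.py` (I am only illustrating the idea); it assumes you already have a list of trigger GPS times, and the trigger times shown are made up. Only the GRB time comes from the example above.

```python
def triggers_in_window(trigger_times, grb_time, window=10.0):
    """Return the trigger GPS times within +/- window seconds of grb_time."""
    return [t for t in trigger_times if abs(t - grb_time) <= window]

# GRB time from the example above; the trigger times are hypothetical.
grb_time = 1169516165.79
hypothetical_triggers = [1169516100.20, 1169516170.41, 1169516300.00]

matches = triggers_in_window(hypothetical_triggers, grb_time, window=10.0)
if matches:
    print("Candidate triggers:", matches)
else:
    print("No triggers found within the given time window. Skip this event.")
```

Widening `window` is the knob to turn if you are working with an event type where a larger GW arrival delay is plausible.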

## Part 1: Setting up your GRB

For the O2 rerun experiments, we are rerunning the O2 PyCBC boxes with targeted TPA PDFs. 

**Gather your GRB information**

Visit https://docs.google.com/spreadsheets/d/1FuLUsVUoQGJPYha1vU2znyChYIt1mCyerwVhiiYxeBU/edit?usp=sharing to see the list of GRBs for O2. This contains the information you will use to generate the sky-phase-amplitude PDFs, following the TPA PDF tutorial.

Once the PDF is generated, identify which PyCBC offline chunk the GRB GPS time falls into.

Chunk times are gathered into this directory:

In [None]:
/home/hannah.griggs/nu/pynu_tests/o2grbs/chunkinis

I also copied this directory into pynuruns. The files within are actually the full configurations for the offline O2 PyCBC runs, but to see the GPS time span for each, run:

In [None]:
head ch2

Replace `ch2` with the file for the chunk you want to look at. The `head` command prints the first 10 lines of a file. Likewise, `tail` prints the last 10 lines.

I also collected the times in a list in the file `times`, also in `chunkinis`, if that format works better for you.

Once you find the correct chunk, note the start and end times. You will use these for the analysis.
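
If you would rather not eyeball the spans, the lookup can be sketched in a few lines of Python. This is only an illustration: it assumes you have typed the spans into a dictionary by hand (I am not assuming anything about the layout of the `times` file). The chunk 6 span is the one used in the `runhlo2.sh` example in Part 2; the chunk 5 span and the GRB time are hypothetical placeholders.

```python
def find_chunk(gps_time, chunk_spans):
    """Return the chunk whose [start, end) span contains gps_time, or None."""
    for chunk, (start, end) in chunk_spans.items():
        if start <= gps_time < end:
            return chunk
    return None

chunk_spans = {
    5: (1170000000, 1170948618),  # hypothetical span, for illustration only
    6: (1170948618, 1171632618),  # GPS_START_TIME and GPS_END_TIME for chunk 6
}
example_grb_time = 1171000000.0  # hypothetical GRB GPS time

chunk = find_chunk(example_grb_time, chunk_spans)
print("Chunk:", chunk, "span:", chunk_spans.get(chunk))
```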

## Part 2: Running an analysis with reused data:

In the pynuruns directory are ini files for the pycbc workflow. 

### **runhlo2.sh** will need to be edited as:

In [None]:
# Edit for O2 PyNu runs
TYPE=grb
ID=170219002 ## GRB identifier from the spreadsheet
RUNID= ## If this is a rerun, use this line to indicate which rerun. Otherwise this can be blank.

INI_LOC="/home/hannah.griggs/nu/pynu_tests/o2grbs/" ## Change to the location of your run
STATISTIC_FILE="/home/hannah.griggs/nu/pynu_tests/skyloc/dtdphase/L1H1-stat-GRB${ID}.hdf" ## Change path to where your stat files are

CHUNK=6 ## Change to chunk you are working with
GPS_START_TIME=1170948618 ## Change to start and end times of the chunk
GPS_END_TIME=1171632618
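
Before launching, it can help to sanity-check the values you just edited. The helper below is a hypothetical convenience, not part of pynuruns: it only checks that both paths are absolute and that the GPS span runs forward. The `albert.einstein` directories are placeholders for your own.

```python
import posixpath

def check_run_settings(ini_loc, statistic_file, gps_start, gps_end):
    """Return a list of problems found; an empty list means the basics look sane."""
    problems = []
    for name, path in (("INI_LOC", ini_loc), ("STATISTIC_FILE", statistic_file)):
        if not posixpath.isabs(path):
            problems.append(f"{name} is not an absolute path: {path}")
    if gps_end <= gps_start:
        problems.append("GPS_END_TIME must be later than GPS_START_TIME")
    return problems

grb_id = "170219002"
problems = check_run_settings(
    ini_loc="/home/albert.einstein/o2grbs/",  # hypothetical user directory
    statistic_file=f"/home/albert.einstein/dtdphase/L1H1-stat-GRB{grb_id}.hdf",  # hypothetical
    gps_start=1170948618,
    gps_end=1171632618,
)
print(problems or "settings look consistent")
```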
 

### To rerun an analysis using existing results, we need to use a **cache file**

In the `maps` directory in pynuruns, you will notice a file called `chunk2.map`. This is an example of a cache file that tells PyCBC the jobs it doesn't need to redo. In that file are two `HDF_TRIGGER_MERGE` files, one for each IFO.

I have collected the results of previous PyCBC runs into my version of the `chunkinis` directory. In each `.map` file in the `maps` directory are the relevant output files that we don't need to rerun with the PyNu trigger re-ranking. 

**Test a run with these map files left as is.** If file transfers are failing during your run, you may need to copy the files in my `chunkinis` directory over into yours. This makes transferring more efficient during runs. If you need to do this, note the new locations of the files in the `.map` file for that chunk.
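
If you do end up copying the files, updating the `.map` file by hand is error-prone. Here is a minimal sketch that swaps one directory prefix for another throughout a map file. It assumes nothing about the map format beyond the old directory appearing literally in each affected line, and it writes a `.bak` copy first. The function name and any paths you pass in are my own illustration, not something shipped in pynuruns.

```python
from pathlib import Path

def repoint_map_file(map_path, old_dir, new_dir):
    """Replace old_dir with new_dir on every line of a .map file.

    Returns the number of lines changed. The original is saved alongside
    as <name>.bak before the file is rewritten.
    """
    map_path = Path(map_path)
    original = map_path.read_text()
    lines = original.splitlines(keepends=True)
    updated = [line.replace(old_dir, new_dir) for line in lines]
    changed = sum(1 for before, after in zip(lines, updated) if before != after)
    if changed:
        Path(str(map_path) + ".bak").write_text(original)
        map_path.write_text("".join(updated))
    return changed

# Example call (paths are placeholders for your own):
# repoint_map_file("maps/chunk2.map", "/home/hannah.griggs/nu/pynu_tests/o2grbs/chunkinis",
#                  "/home/albert.einstein/o2grbs/chunkinis")
```

For `chunk2.map` you would expect two lines to change, one per `HDF_TRIGGER_MERGE` file.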

### Good to go! Run the analysis with:

In [None]:
./runhlo2.sh

You will be prompted to enter your password, then it'll be off. 

## Part 3: Troubleshooting if jobs are struggling:

You'll need to babysit the jobs since they've been having issues with disk space. Your run will live in a directory named after your GRB, like `output<GRBNAME_RUNID>`.
Check how the queue is doing from within the run directory with:

In [None]:
./status

**If a small cluster of jobs fails**, let the analysis get as far as it can until the status updates to (FAILURE).

### Restarting a job that failed

Once it fails, edit the "start" script (in your run output directory) to include the preamble for the run.sh script (for authentication reasons):

In [None]:
ecp-get-cert --destroy
htdestroytoken
kinit hannah.griggs ## REMEMBER TO CHANGE TO YOUR NAME
unset XDG_RUNTIME_DIR
htgettoken -a vault.ligo.org -i igwn --scopes dqsegdb.read,gwdatafind.read,read:/frames,read:/ligo,read:/virgo,read:/kagra
condor_vault_storer -v igwn
export GWDATAFIND_SERVER="datafind.ligo.org:443"
PEGASUS_PYTHON=/home/ian.harry/conda_envs/pegasus_python/bin/python PATH=/home/ian.harry/conda_envs/pegasus_python/bin/:${PATH}

pegasus-run /local/hannah.griggs/pycbc-tmp_u_1hqa3g/work $@

**I made a script that does this for you. From your output directory, simply run:**

In [None]:
./../startmodifier.sh start

Then you can restart the job with:

In [None]:
./start

### Restarting a job that's held

**If jobs are getting held**, see the reason with:

In [None]:
condor_q -better-analyze

This will tell you which job requirements are insufficient and by how much. If memory or disk space are the problem, update held jobs like this:

In [None]:
condor_qedit -constraint "JOBSTATUS==5" RequestDisk=newrequestamount

Change RequestDisk to RequestMemory as needed, and only request a little over what the jobs seem to need. HTCondor reads RequestDisk in KiB and RequestMemory in MiB. If a job fails continuously for memory issues, you can up the RequestMemory to `100000` (roughly 100 GB). That is a lot of memory but the job will finish.

Release jobs again with 

In [None]:
condor_release -constraint "JOBSTATUS==5"

## Part 4: When the run is done

The run is done when a file called:

In [None]:
H1L1-PAGE_FOREGROUND_FULL_DATA-......html

appears in the `output<GRBNAME_RUNID>/results/8._open_box_result` directory. The file will be tagged with the GPS start time and duration of the chunk you used.

**Once this file populates**, copy it to the `results` directory that I put in pynuruns. Rename it to indicate the GRB it reflects, as:

In [None]:
cp ${PWD}/output${GRBNAME}/results/8._open_box_result/H1L1-PAGE_FOREGROUND_FULL_DATA-1239800000-200000.html ${RESULTS_PATH}/results/output${GRBNAME}_FG.html

Adjust the specifics of the copy command to reflect the html file you wish to copy, the location of the `results` directory to which you want to copy, and the name you want it to have. 
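
Since the GPS start time and duration in the file name change with every chunk, a glob saves you from typing them. The sketch below is a hypothetical helper (not in pynuruns) that finds the single foreground page in a run directory and copies it under the naming convention above. It assumes the `8._open_box_result` directory name shown earlier.

```python
import shutil
from pathlib import Path

def copy_foreground_page(run_dir, results_dir, grb_name):
    """Copy the open-box foreground page to results_dir/output<GRBNAME>_FG.html."""
    box_dir = Path(run_dir) / "results" / "8._open_box_result"
    pages = sorted(box_dir.glob("H1L1-PAGE_FOREGROUND_FULL_DATA-*.html"))
    if len(pages) != 1:
        raise FileNotFoundError(
            f"expected exactly one foreground page in {box_dir}, found {len(pages)}"
        )
    destination = Path(results_dir) / f"output{grb_name}_FG.html"
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(pages[0], destination)
    return destination

# Example call (directories are placeholders for your own):
# copy_foreground_page("output170219002", "results", "170219002")
```

Raising an error when the page is missing doubles as a check that the run has actually finished.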

Now, **copy the PyCBC all-sky file from the corresponding chunk** from this directory archived by Derek Davis:

In [None]:
/home/derek.davis/public_html/cbc/O2/clean_data_runs/

For example, **if I am working with Chunk 2**, I would copy this file into my `results` directory and rename it:

In [None]:
cp /home/derek.davis/public_html/cbc/O2/clean_data_runs/o2-c02-clean-analysis-2-v1.9.1/7._open_box_result/H1L1-PAGE_FOREGROUND_HTML-1164556817-1929600.html /home/hannah.griggs/<path-to-results>/outputallskychunk2_FG.html

## Part 5: Calculating a p-value for your box

### I wrote a wrapper script `pvalue.sh` which does the three parts of the PyNu processing: `csv_maker`, `backgroundpval`, and `foregroundpval`. 

#### Edit `pvalue.sh` with the relevant information for your GRB as well as your results location:

In [None]:
# Identification info for PyNu runs
grbid='170121067'
grbtime=1168997831.64
chunk='3'
input_directory='/home/hannah.griggs/nu/pynu_tests/o2grbs/results'

To keep things organized, the results from here on out will be dumped into a directory within `results` called `pvals`.

There are three scripts to run here, `csv_maker.py`, `backgroundpval.py` and `foregroundpval.py`. `csv_maker` combines the results from PyNu with the corresponding PyCBC allsky chunk. `background` calculates the incidence of a range of modified Z-scores in the full box. With this frequency of Z-values established, we can compare the significance of our trigger Z-scores to the background, which is what `foreground` does.
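
To make the idea concrete, here is a toy version of the calculation. Treat it as a sketch under assumptions, not the code in `backgroundpval.py` or `foregroundpval.py`: I am assuming the common median/MAD definition of the modified Z-score (with the 0.6745 scaling constant) and a simple counting p-value. The ranking statistics are made up.

```python
import statistics

def modified_z_scores(values):
    """Modified Z-scores, 0.6745 * (x - median) / MAD (assumed definition)."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; modified Z-scores are undefined")
    return [0.6745 * (v - median) / mad for v in values]

def empirical_p_value(background_scores, foreground_score):
    """Fraction of background scores at least as large as the foreground score."""
    count = sum(1 for s in background_scores if s >= foreground_score)
    return count / len(background_scores)

# Hypothetical ranking statistics for every trigger in the box.
all_stats = [5.1, 5.3, 5.6, 6.0, 6.4, 7.2, 8.9]
scores = modified_z_scores(all_stats)

# Pretend the 6.4 trigger is the one near the GRB; the rest are background.
foreground_score = scores[4]
background_scores = scores[:4] + scores[5:]
p_value = empirical_p_value(background_scores, foreground_score)
print(f"Z = {foreground_score:.4f}, p = {p_value:.4f}")
```

A large p-value means many background triggers score at least as high as yours, so the trigger is unremarkable.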

### You don't have to run these individually. With all of the variables in place in `pvalue.sh`, it will run `csv_maker.py`, `backgroundpval.py`, and `foregroundpval.py`. Simply set the run off with:

In [None]:
./pvalue.sh cfb

The `cfb` indicates which of the scripts you want to run. If you want to just run the foreground script again, for instance, just put `f`.

### This will print four lines which you need to save in any way you see fit. **Please report the p-value printed, the trigger time identified as most significant for the GRB, and its modified Z-score in the spreadsheet by your GRB.**

For example, the run for GRB 161212652 prints the following:

In [None]:
Top Z-score for signal end time 1165408451: 0.6744712302670444 at 1165407882.6
Number of non-signal end times with Z-score >= 0.6744712302670444: 13243
Probability of another end time having a Z-score >= 0.6744712302670444: 0.3645

So, I would say that the trigger at time **1165407882.6** received a **Z-score of 0.6744712302670444** and **p-value of 0.3645**. 

Note that this trigger time is very far away from the GRB T0. In reality, within a generous +/-10 second allowance, there was no trigger above a ranking statistic of 5, which is the lowest that PyCBC saved back in O2. So, for illustration, I expanded the trigger_timewindow to +/-1000 seconds. 

If there's no matching time, it will look more like:

In [None]:
Top Z-score for signal end time 1165408451: nan at nan
Number of non-signal end times with Z-score >= nan: 0
Probability of another end time having a Z-score >= nan: 0.0000

In O3, triggers were saved down to far lower ranking statistics, so this shouldn't be a problem for the O3 GRB analysis.
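
If you are processing many GRBs, you may want to pull the three numbers for the spreadsheet out of the printed text automatically. The parser below is a sketch that assumes the output looks exactly like the examples above; the function name is my own. It returns `None` for the no-match (`nan`) case.

```python
import math
import re

NUMBER = r"([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?|nan)"
TOP_LINE = re.compile(rf"Top Z-score for signal end time [\d.]+: {NUMBER} at {NUMBER}")
P_LINE = re.compile(
    rf"Probability of another end time having a Z-score >= {NUMBER}: {NUMBER}"
)

def parse_pvalue_output(text):
    """Extract trigger time, Z-score and p-value; None if no trigger matched."""
    top = TOP_LINE.search(text)
    prob = P_LINE.search(text)
    if top is None or prob is None:
        raise ValueError("output does not match the expected format")
    z_score = float(top.group(1))
    if math.isnan(z_score):
        return None  # the 'nan at nan' case: no trigger in the window
    return {
        "trigger_time": float(top.group(2)),
        "z_score": z_score,
        "p_value": float(prob.group(2)),
    }

example_output = """Top Z-score for signal end time 1165408451: 0.6744712302670444 at 1165407882.6
Number of non-signal end times with Z-score >= 0.6744712302670444: 13243
Probability of another end time having a Z-score >= 0.6744712302670444: 0.3645
"""
print(parse_pvalue_output(example_output))
```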

## That's all! Please reach out with any questions or if things are not working.