# Cookbook

## 0 - Archive update
[0]: SNfactoryDataProcessing.html#a-archive-update

**Remark**: Only step [0.1][0.1] is needed and should be checked prior to the data processing. The other steps are not needed for data processing (but are needed to secure our raw data with multiple copies) and are under the responsibility of the SNIFS run coordinator.

### 0.1 - Check that the disk copy at CC-IN2P3 is up to date
[0.1]: SNfactoryDataProcessing.html#check-that-the-disk-copy-at-cc-in2p3-is-up-to-date

* log in at the summit on lbl1 in the visitor VNC ([port 33](https://snf-doc.lbl.gov/twiki/bin/view/Top/ConnectingToMaunaKea))

        ssh -C -L 5903:128.171.72.171:5933 snifs@snifsgateway.ifa.hawaii.edu
        vncviewer -FullColor -Shared  localhost:3

* check available data under /lbl2/data/11/ 
* launch:
   
       export_sync -d day_start,day_end -q
       
* this will perform any needed copy to the CC-IN2P3 in quiet mode (`-q` = no questions asked as it runs), bringing up to date not only the raw data archive, but also the logs and finding charts.

**Remark**:

* `day_end` should be over; keep in mind that these are UTC days. In practice, the copy of the data taken on a given night can only be checked on the morning (French time) of the day after the end of data taking.
* At the summit a cron job runs each day at 1h UTC to automatically prepare the logs of the past day and copy them to CC-IN2P3. This cron job also checks that the archives at CC-IN2P3 for the corresponding day are OK. This automated update of the archive at CC-IN2P3 can fail, which is why this "by hand" check should be done. Known problems:
    * SRB / network / CC-IN2P3 could be down at the time of the cron job run, leaving the CC-IN2P3 archive out of date. This happens about twice a year.
    * from time to time a file is not correctly copied even after the cron process. This was related to hiccups of the name server at the summit (should be over now). Running `export_sync` by hand as described above should protect us against this kind of problem. If a file is found as not copied (and then updated by hand with `export_sync`), this indicates that some problem occurred; try to see if there is a good reason for it (CC-IN2P3 down at some point, lbl2 down?).
* there is a problem that this step will not fix:
    * `ls -tlr /sps/snovae/SRBregister/log/11/28*/snifs_run*`
    * lbl2 could have been down when the cron job should have run, so not only may the archive at CC-IN2P3 be out of date, but the logs on lbl2 (and at CC-IN2P3) for the corresponding day may also be missing. Such a problem is not automatically detected by this step; if in doubt, check that there is a *snifs_run* file in the */lbl2/log/YY/DAY/* directory and a corresponding one in */sps/snovae/SRBregister/log/YY/DAY* at CC-IN2P3. If this file is not at CC-IN2P3 when the corresponding night is registered, _nothing_ will happen (no registration). This is a major problem, which can "easily" be fixed by PierreAntilogus (= bad guy who does not provide full documentation).
* Empty files: sometimes guiding videos (P_vid) are not saved properly, for example when a `stop_script -t now` is executed. In that case, the corresponding files will be empty (0k), and `export_sync` will ignore them during the transfer to the CC. If you see a difference between the number of files at the summit and at the CC, check this first. If this is the case, then there is nothing you can do about it.
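
Since empty files are skipped by the transfer, a fair comparison between the summit and the CC counts only the non-empty files. The following is a minimal sketch of that count; the function names are hypothetical helpers and are not part of the SNIFS tools.

```shell
# Sketch only: helper names are hypothetical, not part of the SNIFS tools.

# Print the number of non-empty regular files under directory $1.
count_nonempty_files() {
    find "$1" -type f -size +0 | wc -l | tr -d ' '
}

# Print the empty (0-byte) regular files under directory $1, one per line.
list_empty_files() {
    find "$1" -type f -size 0 | sort
}
```

Running `count_nonempty_files` on a night's directory at the summit and on the corresponding directory at CC-IN2P3 should give the same number; `list_empty_files` shows which summit files explain a difference in the raw file counts.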
   
### 0.2 - Update the LBL HPSS archive
[0.2]: SNfactoryDataProcessing.html#update-the-lbl-hpss-archive

Can only be run once [0.1][0.1] has been completed.

   * you can check the availability of HPSS [here](http://www.nersc.gov/users/live-status/)
   * log under `snprod@ccage.in2p3.fr` then:

        cd data_copy
        hsi_import -d DDD,DDD -y YY

* copy and paste the last line returned by the last command and execute it; this submits the archive update in batch. `qstat` will show you the status of the job. Once the job is over, check the error log file (`jobname.exxxx`) to see whether the backup was done successfully.

* to check that the data was transferred do:

        hsi    # now has to be run under Scientific Linux 5 (snprod@ccagesl5.in2p3.fr)
        cd /nersc/projects/snifs/hawaii_new/
        [check the year and days you expect to find]
        [compare them with the data in /sps/snovae/SRBregister/hawaii/ (for old data go to /nersc/projects/snifs/hawaii/) under CC-IN2P3]
        exit

* note that sometimes HPSS has hiccups or goes down for routine maintenance (see the webpage - usually it's Tuesdays). If it has a hiccup on one particular day of your data as indicated in the `jobname.exxxx` file, you can do `hsi_import -d day` to simply re-import that day when HPSS is available again.
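
The comparison between the days on disk at CC-IN2P3 and the days in HPSS can be sketched as a set difference between two listings. This assumes each listing has been saved to a text file with one day directory name per line; how the listings are produced is left open, and `missing_days` is a hypothetical helper name.

```shell
# Sketch only: missing_days is a hypothetical helper.
# Print the entries of listing $1 that are absent from listing $2.
# Each listing is a text file with one day directory name per line.
missing_days() {
    md_a=$(mktemp) && md_b=$(mktemp) || return 1
    sort -u "$1" > "$md_a"
    sort -u "$2" > "$md_b"
    # comm -23 keeps the lines that appear only in the first sorted file.
    comm -23 "$md_a" "$md_b"
    rm -f "$md_a" "$md_b"
}
```

With the CC-IN2P3 listing as first argument and the HPSS listing as second, any day printed is a candidate for a re-import with `hsi_import -d day`.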
   
### 0.3 - Clean the summit 
[0.3]: SNfactoryDataProcessing.html#cleanthe-summit

Once [0.2][0.2] is completed, you can clean the raw data summit directories which have a double archive (on disk at CC-IN2P3 and in one of the HPSS archives).

* clean /lbl2/data/ and /lbl3/data/ (but leave /lbl1/) with `rm -Rf /lbl2/data/11/ddd`. Do one night at a time.
* at the summit, the log and finding chart directories should not be "cleaned"
    * for them the double copy is: disk at summit and disk at CC-IN2P3
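
Because a recursive removal with a mistyped or empty day would delete far more than one night, it can help to build the command through a small guard first. This is only a sketch: `clean_night_cmd` is a hypothetical helper that validates its arguments and prints the command without running it.

```shell
# Sketch only: clean_night_cmd is a hypothetical helper.
# Print (do not run) the removal command for one night.
# $1 = two-digit year (YY), $2 = three-digit day of year (DDD),
# $3 = optional data root, defaulting to /lbl2/data.
clean_night_cmd() {
    case "$1" in
        [0-9][0-9]) ;;
        *) echo "bad year: '$1'" >&2; return 1 ;;
    esac
    case "$2" in
        [0-9][0-9][0-9]) ;;
        *) echo "bad day: '$2'" >&2; return 1 ;;
    esac
    echo "rm -Rf ${3:-/lbl2/data}/$1/$2"
}
```

For example, `clean_night_cmd 11 280` prints `rm -Rf /lbl2/data/11/280`, which can then be reviewed and pasted; an empty day or a wildcard is rejected.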

### 0.4 - Update the CC-IN2P3 HPSS archive 
[0.4]: SNfactoryDataProcessing.html#update-the-cc-in2p3-hpss-archive

Can only be run once [0.1][0.1] has been completed.

* this is done under the *snprod* account in /afs/in2p3.fr/group/snovae/snprod/backup.
* useful when [0.2][0.2] doesn't work for some large period of time, otherwise PierreAntilogus takes care of that once a year.
* at this time (Feb 29, 2016), it is unclear when the last CC-IN2P3 HPSS update was done.
   
**From now on, you should be logged into CC-IN2P3 as `snprod@ccage.in2p3.fr`.** You can check job status with `qstat` or `qstat2`.

## A - Header DB update    
[A]: SNfactoryDataProcessing.html#a-header-db-update

Can only be run once [0.1][0.1] has been completed.

   1. edit *~/db/SGE/snf_header* by changing the *-u* argument (i.e., the last day to update) of the `SnfDjangoMigrUpdate` command
   2. submit the update job: 
      * `qsub snf_header`
   3. when job has finished, check the SNF-PRODSETUP-header.oxxxxxxx output
   4. test for weird PI_NAME entries
      * `../test_header`
      * is everything OK?

Remarks

* due to the stupid logic used to select the files to load in the DB, before starting to load data for a new year you first have to fill by hand the first day of the year with data (be careful: all the days before the one you select will never be entered in the DB). For example, for 2009 we first had to run `SnfDjangoMigrUpdate -y '09' -d '010' -v`, as day 010 was the first day with data in 2009. Once this is done you can start to fill the files of the year as described in [A.1][A.1].
* the header job has been known to get stuck (I think it didn't like Pxxxx files in the same directory as the corresponding xxx files: someone at the summit ran a preprocessing in the raw data directory, and I even found a script in the same directory to do that). In this case you may have to kill the job; some other problem may also lead to a premature end of the job. Either way, one night may end up not fully integrated into the header DB. To finish the night you cannot use [A.1][A.1]; use the same method as in the first remark, setting the arguments to the "incomplete day".
* this "per day" command does nothing if the requested day is already 100% entered in the DB, so there is no risk in running it twice for a given day.
   
## B - DB update
[B]: SNfactoryDataProcessing.html#b-db-update

Can only be run once [A][A] has been completed.

### B.1 - Pickle file
[B.1]: SNfactoryDataProcessing.html#b-1-pickle-file

* update by hand `~/db/SGE/snf_db_make` with new days
* submit the job
    * `qsub snf_db_make`  [ `-me -mu user@node` ]
* if you need a different code version (`HEAD`) to run a modified SnfUpdate, submit the job instead using
    * `./qsub_version snf_db_make HEAD`
* look carefully at the job log:
    * check whether there are PI programs other than SNfactory for the new period; did they fill this in correctly in their event header?
    * check for warnings
        * `grep WARN <*.e* log file>`
    * check for bugs on target association
        * `grep "is known as" <*.e* log file>`
    * check for bugs on new targets
        * `grep -E "New .* Target" <*.e* log file>`
    * make sure Meteo has run. If not, wait until you get Meteo, testing with
        * `Meteo -a`
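
The three `grep` checks on the job log can be gathered into one pass. The sketch below only counts matching lines and does not replace reading the log; `check_db_log` is a hypothetical helper name, and the search patterns are the ones listed above.

```shell
# Sketch only: check_db_log is a hypothetical helper.
# Summarise the three grep checks on the snf_db_make error log given as $1.
check_db_log() {
    [ -r "$1" ] || { echo "cannot read log: $1" >&2; return 1; }
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    warn=$(grep -c 'WARN' "$1" || true)
    assoc=$(grep -c 'is known as' "$1" || true)
    new=$(grep -c -E 'New .* Target' "$1" || true)
    echo "warnings=$warn associations=$assoc new_targets=$new"
}
```

A non-zero count is a pointer to rerun the corresponding `grep` and read the matching lines.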

Remark: the CFHT meteo files and a current IAUC list are fetched automatically during the execution of `snf_db_make`.
   
### B.2 - Fill the DB
[B.2]: SNfactoryDataProcessing.html#b-2-fill-the-db

* submit the job
    * `qsub snf_db_fill`   [ `-me -mu user@node` ]
* if you need a different code version (`HEAD`) to run a modified SnfUpdate, submit the job instead using
    * `./qsub_version snf_db_fill HEAD`

Remark:

* `snf_db_fill` does not need to be modified: it will search for the *latest* created pickle file and insert its contents into the DB.
* the [night plots](http://snovae.in2p3.fr/snprod/PhotoNight/) and SkyProbe photometricity information are fetched automatically in this step.
   
## C - Update Target & Run info
[C]: SNfactoryDataProcessing.html#c-update-target-run-info

Can only be run once [B.2][B.2] has been completed.

* the values of **Target.Kind** & **Target.Type** can be synced with the WareHouse DB and/or the IAUC listing by running
    * `SyncTarget -RE` to see what will happen
    * `SyncTarget -RE` to register changes into the DB
* the value of `Run.Kind` can be updated to flag FinalRefs etc., by running
    * `FlagRunKind` to see what will happen
    * `FlagRunKind -rq` to register changes into the DB (remove the `-q` to have a question asked for each proposed change)

   
## D - Preprocessing
[D]: SNfactoryDataProcessing.html#d-preprocessing

Can only be run once [C][C] has been completed.

### D.1 - Preparation for the data processing
[D.1]: #d-1-preparation-for-the-data-processing

* under **snprod** go to the right area to submit batch jobs
    * cd $JOBDIR/
* generate the list of nights to process. In practice we limit ourselves for the moment to the nights with at least 1 spectrum. For this we use a "simple" program which generates the list of nights; you can then edit this list to further limit it to the nights you want. In the rest of the cookbook, the file with this list of nights is called `night.list`.
* to produce it you should run `list_night`.
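
Restricting the list to a range of nights can be done by hand in an editor or with a small filter. The sketch below assumes, without confirmation from the tools, that `night.list` holds one night per line in the `YY_DDD` form used elsewhere in this cookbook; `select_nights` is a hypothetical helper.

```shell
# Sketch only: select_nights is a hypothetical helper, and the
# one-night-per-line YY_DDD layout of the list file is an assumption.
# Keep only the nights between $1 and $2 (inclusive) from the list file $3.
select_nights() {
    # Fixed-width YY_DDD strings sort chronologically as plain strings.
    awk -v first="$1" -v last="$2" '$1 >= first && $1 <= last' "$3"
}
```

For example, `select_nights 11_280 11_290 night.list` prints only the nights from day 280 to day 290 of 2011.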

### D.2 - Prepare and send the jobs
[D.2]: SNfactoryDataProcessing.html#d-2-prepare-and-send-the-jobs

* do not forget to `cd $JOBDIR/PFQ` (or any other subdirectory that is necessary) to submit batch jobs
* prepare the jobs
    * `batch_jobs -p Pr${SNF_VERSION_LITE} -m z night.list`
* submit them
    * `batch_jobs -p Pr${SNF_VERSION_LITE} -m z night.list -s`
* check the results:  `jobErrors id-s 10/*/*Pr*s`
* OK? Validate the production
    * `manage_jobs -j "SNF-0115-Pr0115z"` (*check* if it returns the proper jobs)
    * `manage_jobs -j "SNF-0115-Pr0115z" -s 2` (set each Process.Status to 2, *-b* for batch job)
    * `manage_jobs -j "SNF-0115-Pr0115z-10145" -s 2` (validate job 145 only)
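
The job names in the validation commands appear to follow the pattern `SNF-<version>-<prefix><mode>`, optionally followed by `-<YYDDD>` (or the start of it) to narrow the selection. This rule is inferred from the examples in this cookbook, not taken from the `batch_jobs` documentation, and `job_pattern` is a hypothetical helper that only composes the string.

```shell
# Sketch only: job_pattern is a hypothetical helper; the naming rule is
# inferred from the examples in this cookbook.
# $1 = code version (the value of SNF_VERSION_LITE)
# $2 = prefix given to "batch_jobs -p"
# $3 = mode letter given to "batch_jobs -m"
# $4 = optional YYDDD suffix (or its first digits) to narrow the selection
job_pattern() {
    if [ -n "${4:-}" ]; then
        echo "SNF-$1-$2$3-$4"
    else
        echo "SNF-$1-$2$3"
    fi
}
```

For example, `job_pattern 0115 Pr0115 z 10145` prints `SNF-0115-Pr0115z-10145`, the name used above to validate a single job.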

Remark:

* you can use *-D* on any `batch_jobs` call to have it print the commands that it will execute without really doing it (dry-run)
* In case of problems:
    * Stop the jobs: `qdel jobname` (you get the full job name with `job -wide`)
    * Delete the information already in the database: `manage_jobs -j "SNF-0110-Pr0110z" -d`
      
## E - Cube generation
[E]: SNfactoryDataProcessing.html#e-cube-generation

Can only be run once [D][D] has been completed.

        cd $JOBDIR/PCG
        batch_jobs -p Cu${SNF_VERSION_LITE} -m c -a "--qcopt='-J' -D" night.list
        batch_jobs -p Cu${SNF_VERSION_LITE} -m c night.list

The arguments used for `plan_cube_generation` are:

* `-D`: perform dark correction with fit_background
* `--qcopt='-J'`: use extract_spec2 (replacement for '-O') in quick_cube

In case of problems with your commands:

* do not worry about the following comments from `batch_jobs`: `batch_jobs: failed test p_ok`, `batch_jobs: failed test b_ok`, `batch_jobs: failed test m_ok`
* `qdel <jobname>`
* `manage_jobs -j <jobname> -d` (to clean the "job table" before resubmitting)
* check the results: `jobErrors -s YY/*/*Cu*sh`
* OK? Validate the production
    * `manage_jobs -j "SNF-0117-Cu0117c-10"` (*check* if it returns the proper jobs for jobs from 2010)
    * `manage_jobs -j "SNF-0117-Cu0117c-101" -s 2` (set each Process.Status to 2, `-b` for batch job)
   
## F - PSF extraction
[F]: SNfactoryDataProcessing.html#f-psf-extraction

Can only be run once [E][E] has been completed.

        cd $JOBDIR/PES
        batch_jobs -p ES${SNF_VERSION_LITE} -m e -a "--truncateR 5100,9700 -R" night.list
        batch_jobs -p ES${SNF_VERSION_LITE} -m e night.list -s

The arguments used for `plan_extract_star` are:

* `--truncateR`: truncate the R spectrum wavelength range after extraction
* `-R`: create residual spectrum of StdStars

* OK? Validate the production
    * `manage_jobs -j "SNF-0115-ES0115e"` (*check* if it returns the proper jobs)
    * `manage_jobs -j "SNF-0115-ES0115e" -s 2` (set each Process.Status to 2, `-b` for batch job)
   
## G - Multi-filter ratios
[G]: SNfactoryDataProcessing.html#g-multi-filter-ratios

Can only be run once [D][D] has been completed.

### From scratch
* `cd $JOBDIR/PPR`
* `batch_jobs -p NEWMFR -m h -a "-D" targets.list`
* `batch_jobs -p NEWMFR -m h targets.list -s`

### Incrementation
* `batch_jobs -p INCMFR -m h -a "-ID -e PTF11kly" nights.list` (exclude `PTF11kly`, that's a special one for MFR)
* how to increment until a particular day (eg. *12_262*), instead of the latest one available in the DB
    * `batch_jobs -p INCMFR -m h -a "-ID -e PTF11kly --until 12262" nights.list`
* check the objects that will be incremented or remade
    * `egrep "args = \[" ??/???/*-?????.py` (incremented)
    * `egrep "args = \[" ??/???/*-?????new.py` (new)
* `batch_jobs -p INCMFR -m h nights.list -s`
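
The `--until` value in the example above is the day written without the underscore (`12262` for *12_262*). A small conversion with a format check avoids passing a malformed day; `until_arg` is a hypothetical helper, shown as a sketch.

```shell
# Sketch only: until_arg is a hypothetical helper.
# Convert a night written as YY_DDD into the YYDDD form used with --until.
until_arg() {
    case "$1" in
        [0-9][0-9]_[0-9][0-9][0-9]) printf '%s\n' "$1" | tr -d '_' ;;
        *) echo "expected YY_DDD, got '$1'" >&2; return 1 ;;
    esac
}
```

For example, `until_arg 12_262` prints `12262`.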

### Validation
* just an *example*, it could be **NEWMFR** instead of **INCMFR**
* `manage_jobs -j "SNF-${SNF_VERSION_LITE}-INCMFRh-112[89]"` (*check* if it returns the proper jobs)
* `manage_jobs -j "SNF-${SNF_VERSION_LITE}-INCMFRh-112[89]" -s2`

### Scale factors

* the scale factors plan should *only* be run *after* PMS has been run on new nights and the photometricity updated
* there is *no need* to run this if there are no new photometric nights
    * you can test for that using:
    * `SnfPhotometricity nights.txt | grep True`
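
The same test can be used as a condition before launching the scale factors plan. The sketch below relies only on the exit status of `grep` and assumes nothing about the output of `SnfPhotometricity` beyond the presence of the word `True`; `has_photometric_nights` is a hypothetical helper.

```shell
# Sketch only: has_photometric_nights is a hypothetical helper.
# Succeed if the text on standard input contains "True", i.e. if at least
# one night is reported as photometric by the grep test above.
has_photometric_nights() {
    grep -q 'True'
}
```

A possible use is `SnfPhotometricity nights.txt | has_photometric_nights && echo "new photometric nights: run the scale factors plan"`.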

* `cd $JOBDIR/PPR/SF`
* `plan_photometric_ratios --scale_factors -I -p SNF-${SNF_VERSION_LITE} -e PTF11kly`
* `snf_qsub SNF-${SNF_VERSION_LITE}h-SFxxxxx.py`

* Validation
    * `manage_jobs -j SNF-${SNF_VERSION_LITE}h-SFxxxxx` (*check* if it returns the proper jobs)
    * `manage_jobs -j SNF-${SNF_VERSION_LITE}h-SFxxxxx -s2`
   
## H - PSF estimations
[H]: SNfactoryDataProcessing.html#h-psf-estimations

Can only be run once [F][F] has been completed.

* `list_followed_targets > sne.list`
* Full run (redo everything)
    * `batch_jobs -p gsPSF -m g -k sne.list`
    * if there is a new calibration by Gerard, only rerun `spectrophoto` (this will not happen frequently)
    * `batch_jobs -p gsPSF -m g -a "-P" -k sne.list`
* Incrementation
    * `batch_jobs -p gsPSF -m g -a "-i" -k sne.list`
* `batch_jobs -p gsPSF -m g -k sne.list -s`

* validate
    * `manage_jobs -j "SNF-${SNF_VERSION_LITE}-gsPSFg" -s 2`
   
## I - Flux calibration
[I]: SNfactoryDataProcessing.html#i-flux-calibration

Can only be run once steps [F][F] (PSF/spectrum extraction) and [G][G] (multi-filter ratios) have been completed.

       cd $JOBDIR/MoreFlux/
       meta_plan_flux YY_DDD YY_DDD YY_DDD    # the list of new nights to process

This sets up a directory like FluxABCD with the necessary target and night lists, subdirectories, and a set of instructions.

* Follow the instructions in the produced `FluxABCD/README.txt` file to run the production
*  BEWARE!!!! There are many steps in the README to follow before one gets calibrated spectra!

Notes:

* If you need to change the job configuration (e.g. a different prefix), you can do this via a yaml config file or via `meta_plan_flux` command line options.  Run `meta_plan_flux --help` for details.
     
## Additional steps

### Cleaning Up

Occasionally things get really screwed up and it is better to clean up a set of jobs and start again.  You can use `manage_jobs` to do this.  First select the jobs you want to clean up:

        manage_jobs -j JOBPREFIX

Make sure that list contains the jobs you want _and only those jobs_.
Incremental production jobs have similar names and if you get this wrong, you
could accidentally delete a lot.  When you are really sure you have the correct
set of jobs:

        manage_jobs -j JOBPREFIX --delete --batch

Then use `qsub` (not `snf_qsub`) to submit the script it creates.  If you are only
deleting one or two jobs, you can leave off the `--batch` option and it will
delete the jobs without needing to submit a batch job.

### Full Flux Calibration

Occasionally we need to do a complete, self-consistent flux calibration on all targets on all nights including DDT host galaxy subtraction and lightcurve fits. This is not necessary on a nightly basis. Instructions for a full flux calibration processing run are at Offline.SnifsPipelineHowTo#Flux_Pipeline_Putting_it_all_tog.