# HST WFC3/UVIS Reduction and Dark Correction for a Set of Observations

This notebook describes the step-by-step procedure required to improve the reduction of darks and include these in the reduction of a set of Hubble Space Telescope (HST) Wide Field Camera 3 (WFC3)/UV-Visible (UVIS) observations. This is based on the Space Telescope Science Institute (ST/STScI) standard WFC3/UVIS darks reduction pipeline as copied from GitHub 19th April 2019. All changes made to the ST codes should be denoted by "LP" in the comments. Written by Laura Prichard, April–September 2019.

___________________
# Setup

This notebook should come in a darks reduction package (`hst_wfc3_uvis_reduction/`) in the directory `hst_wfc3_uvis_reduction/darks_codes/lp_darks/`. In `darks_codes/` is: 

    - st_darks/ – the ST standard darks reduction copied from GitHub 19th April 2019.
    - lp_darks/ – copy of the st_darks/ directory that was edited.
    - mr_darks/ – copy of Marc Rafelski's codes and reference tables some of which were implemented in the lp_darks/ version of the ST pipeline.
    
In `darks_codes/lp_darks/` is the following: 

    - the CTE correction code directory (cal_uvis_make_ctecorr_darks/), 
    - the directory of the main codes for the pipeline (cal_uvis_make_darks/, main code is cal_uvis_make_darks.py), 
    - a folder for reference tables i.e. dark_lookup.txt (ref_files/),
    - an empty folder for terminal outputs for the codes (logs/),
    - a code to download data with astroquery and organize files ready for reduction (download_data.py), 
    - a code to CTE correct science data (ctecorr_scidata.py),
    - and this notebook (darks_reduction.ipynb). 

The darks codes are kept in a root directory `hst_wfc3_uvis_reduction/` along with the data directory, e.g., `darks_data/`. This should contain three more directories (all empty): one for the raw data (e.g., `raw_darks/`), one for the reduced data (e.g., `red_darks/`) and one for the STScI calibration files (e.g., `st_calib/`, more info below on this). By default these live in the `hst_wfc3_uvis_reduction/darks_data/` directory; otherwise they should be made in a location of choice but kept separate from the code directory, e.g.:

    - darks_data/red_darks/ – location for the reduced data pipeline outputs
    - darks_data/raw_darks/ – location of the raw data
    - darks_data/st_calib/ – location of STScI calibration files used by the pipeline (more info below)



**1) Set directories**

They should be defined and made if they don't exist. An example of this is below:

In [None]:
# Edit and run this to define and make the following directories
import os

# Setting path names
DARK_ROOT = "/user/lprichard/hst_wfc3_uvis_reduction"  #EDIT! Root directory for whole of the darks codes and data
CODE_DIR = os.path.join(DARK_ROOT, 'darks_codes')  #Location of the darks_codes directories that come with this darks reduction package
DAT_DIR = os.path.join(DARK_ROOT, 'darks_data')    #Data directory to contain the following sub directories
RAW_DIR = os.path.join(DAT_DIR, 'raw_darks')       #Raw data directory
RED_DIR = os.path.join(DAT_DIR, 'red_darks')       #Reduced data directory
CAL_DIR = os.path.join(DAT_DIR, 'st_calib')        #ST calibration files directory

print('DARK_ROOT =', DARK_ROOT)
print('CODE_DIR =', CODE_DIR)
print('DAT_DIR =', DAT_DIR)
print('RAW_DIR =', RAW_DIR)
print('RED_DIR =', RED_DIR)
print('CAL_DIR =', CAL_DIR)

# Making data directories if they don't exist
if not os.path.exists(DAT_DIR): 
    os.makedirs(DAT_DIR, 0o774)
if not os.path.exists(RAW_DIR): 
    os.makedirs(RAW_DIR, 0o774)
if not os.path.exists(RED_DIR): 
    os.makedirs(RED_DIR, 0o774)
if not os.path.exists(CAL_DIR): 
    os.makedirs(CAL_DIR, 0o774)

**2) Add the codes directory to your PYTHONPATH**

Add the folder *above* `cal_uvis_make_ctecorr_darks/` and `cal_uvis_make_darks/` to your PYTHONPATH,

    e.g. in ~/.bashrc `export PYTHONPATH="[CODE_DIR]/lp_darks:$PYTHONPATH"`

Initialize each directory if not done already, in `DARK_ROOT/darks_codes/lp_darks/cal_uvis_make_ctecorr_darks/` and `DARK_ROOT/darks_codes/lp_darks/cal_uvis_make_darks/`, do `touch __init__.py`
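
If editing `~/.bashrc` is not convenient, roughly the same setup can be done from inside a Python session. The sketch below is not part of the package: `setup_code_path` is a hypothetical helper that creates empty `__init__.py` files (the equivalent of `touch __init__.py`) and prepends the `lp_darks/` directory to `sys.path` for the current session only.

```python
import sys
from pathlib import Path

# Sub-packages named in this step that need an __init__.py
PACKAGES = ("cal_uvis_make_ctecorr_darks", "cal_uvis_make_darks")

def setup_code_path(lp_darks_dir):
    """Hypothetical helper: touch __init__.py files and add lp_darks/ to sys.path."""
    lp_darks_dir = Path(lp_darks_dir)
    touched = []
    for pkg in PACKAGES:
        pkg_dir = lp_darks_dir / pkg
        if not pkg_dir.is_dir():
            # Skip rather than create: the package directories ship with the codes
            continue
        init_file = pkg_dir / "__init__.py"
        init_file.touch(exist_ok=True)  # same effect as `touch __init__.py`
        touched.append(init_file)
    # Session-only stand-in for the PYTHONPATH export
    if str(lp_darks_dir) not in sys.path:
        sys.path.insert(0, str(lp_darks_dir))
    return touched
```

Calling `setup_code_path(os.path.join(CODE_DIR, 'lp_darks'))` after Step 1 would cover both sub-packages; the `~/.bashrc` route is still needed for commands run in a separate terminal.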

**3) Get STScI standard calibration files**

Copies of the STScI standard calibration files used in the reduction are required. These should be put in the `st_calib/` directory (or another if preferred). Set this location of the calibration files directory as an input to the main darks reduction code (`cal_uvis_make_darks.py`) using the flag `-c|--cal_dir`.
    
Get the latest copy of the calibration files (from Catherine Martlin - cmartlin@stsci.edu, or any of the WFC3 team with access to the files below) and put each in `st_calib/` (or your preferred location):

    /grp/hst/wfc3k/uvis_darks/exclude.list
    /grp/hst/wfc3k/uvis_darks/crr_for_dark.fits
    /grp/hst/wfc3k/uvis_darks/crr_for_hotpix.fits
    /grp/hst/wfc3b/calibration/history.txt
    
According to Catherine, history.txt is the only one that is regularly updated and is done by Sylvia Baggett (WFC3 Branch Manager) around every month with both anneal (publicly available) AND lockups (not public), making this file internal to ST. A new copy of this file and the others (to be sure) are required regularly.
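
Before running the pipeline it can be worth confirming that all four files made it into the calibration directory. The following is a minimal sketch; `missing_calib_files` is a hypothetical helper, and only the file names are taken from the list above.

```python
import os

# File names taken from the list of STScI calibration files above
REQUIRED_CALIB_FILES = (
    "exclude.list",
    "crr_for_dark.fits",
    "crr_for_hotpix.fits",
    "history.txt",
)

def missing_calib_files(cal_dir):
    """Return the required calibration files that are not present in cal_dir."""
    return [name for name in REQUIRED_CALIB_FILES
            if not os.path.isfile(os.path.join(cal_dir, name))]
```

For example, `missing_calib_files(CAL_DIR)` should return an empty list once the files are in place.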

____________________
# Download and Organize Darks

**4) Get anneal cycle dates**

Get the dates of the observations in question (e.g. from MAST https://mast.stsci.edu/portal/Mashup/Clients/Mast/Portal.html) and then find the WFC3 anneal cycles that span the dates of the observations. From http://www.stsci.edu/hst/instrumentation/wfc3/performance/monitoring/complete-anneal-history (try Safari or another browser if not working with Chrome, email a WFC3 team member if not up to date), get a list of the dates & times for all anneal cycles that span your data, e.g. 

    58561.85943287  2019.078 20:37:35 Mar 19 anneal
    
This is the start date of the anneal cycle (MJD, YEAR.DAY, HH:MM:SS, Month DD, anneal). The anneal cycle ends just before the next listed anneal date and time, e.g.:

    58590.47559028  2019.107 11:24:51 Apr 17  anneal
    
Get the start/end anneal cycle Modified Julian Date (MJD, first value) for ONE anneal cycle at a time as inputs to the codes below.

If wanting to process darks from the **current** anneal cycle, the anneal start date should still be given, and the anneal end date can be now. This is generated in MJD using the following code:

In [None]:
from astropy.time import Time

now = Time(Time.now(), format='iso').mjd

print('MJD Now:', now)

# Or can convert the YEAR.DAY HH:MM:SS format to MJD with the following
in_date = '2019.107 11:24:51'

# Calculate MJD
in_date = in_date.replace('.', ':').replace(' ', ':')
out_date = Time(in_date, format='yday').mjd
print('MJD of input date:', out_date)
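
Picking the anneal cycle that spans a given observation (Step 4) can also be done programmatically once the anneal start MJDs have been copied from the webpage. This is a sketch under that assumption; `anneal_cycle_for` is a hypothetical helper and the list of start dates has to be supplied by hand.

```python
from bisect import bisect_right

def anneal_cycle_for(obs_mjd, anneal_starts):
    """Return (start, end) MJDs of the anneal cycle containing obs_mjd.

    anneal_starts: anneal start MJDs copied from the anneal history page.
    end is None when obs_mjd falls in the most recent listed (current) cycle.
    """
    starts = sorted(anneal_starts)
    # Index of the last anneal that started at or before the observation
    idx = bisect_right(starts, obs_mjd) - 1
    if idx < 0:
        raise ValueError("Observation predates the first listed anneal")
    end = starts[idx + 1] if idx + 1 < len(starts) else None
    return starts[idx], end

# The two anneal start dates quoted in Step 4
starts = [58561.85943287, 58590.47559028]
print(anneal_cycle_for(58575.0, starts))  # (58561.85943287, 58590.47559028)
```

An observation taken exactly at an anneal start is assigned to the cycle that begins then, which matches the rule that a cycle runs up to but not including the next anneal date.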

**5) Download data with astroquery and organize**

Then run the following code that makes directories, downloads data with astroquery and sorts raw dark files ready for CTE corrections using `hst_wfc3_uvis_reduction/darks_codes/lp_darks/download_data.py`. Run using the following command in `hst_wfc3_uvis_reduction/darks_codes/lp_darks/`:

        python download_data.py [-t|--type] [-s|--data_start] [-e|--data_end] [-p|--proposal_id] [-r|--raw_dir] [-d|--download] [-l|--dload_date]

Required:
    
    [-t|--type] – String, the type of data to be downloaded, either ``dark`` for raw darks or ``science`` for raw science data. ``-s --data_start`` and ``-e --data_end`` must be set to the anneal start and end (or present) dates if --type==``dark``. ``-p --proposal_id`` must be set if --type==``science``, and start and end dates within that proposal ID are optional.
    [-s|--data_start] – String/float, required if --type==``dark``: MJD value of start date of ONE anneal cycle to at least 6 d.p. from the webpage above (Step 4). Optional if --type==``science``: science data start date if a section of a proposal's data is to be downloaded rather than the whole program.
    [-e|--data_end] – String/float, required if --type==``dark``: MJD value of end date of ONE anneal cycle to at least 6 d.p. from the webpage above (Step 4). Optional if type==``science``: science data end date if a section of a proposal's data is to be downloaded rather than the whole program. All files up to but NOT INCLUDING the end date are selected.
    [-p|--proposal_id] – String, proposal ID of the science data to be downloaded, REQUIRED if --type==``science``. NOT needed for darks download.
    [-r|--raw_dir] – String, path to the directory where raw data should be stored. A directory for each anneal cycle will be made within that, along with separate sub-folders for downloaded, raw and CTE corrected darks. No trailing slash "/".
    
Optional:

    [-d|--download] – Flag, is ``True`` if provided, ``False`` otherwise and is required if you want astroquery to download the raw files.
    [-l|--dload_date] – String, download date, should be set if -d|--download is not set as this points to the directory where the data were downloaded which is timestamped. Should be in the format ``YYYYMonDD`` e.g. ``2019Aug13``.
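
The rules above for which inputs go together can be summarized in a few lines. This is only an illustration of the documented rules; `check_download_args` is a hypothetical helper, and how `download_data.py` itself enforces them is not shown here.

```python
def check_download_args(data_type, data_start=None, data_end=None,
                        proposal_id=None, download=False, dload_date=None):
    """Return a list of problems with a set of download_data.py inputs (empty if none)."""
    problems = []
    if data_type not in ("dark", "science"):
        problems.append("--type must be 'dark' or 'science'")
    # Darks are selected by anneal cycle, so both dates are needed
    if data_type == "dark" and (data_start is None or data_end is None):
        problems.append("--data_start and --data_end are required for darks")
    # Science data are selected by program
    if data_type == "science" and proposal_id is None:
        problems.append("--proposal_id is required for science data")
    # Without a fresh download, the timestamped download directory must be named
    if not download and dload_date is None:
        problems.append("set --dload_date when --download is not given")
    return problems
```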
    
Execution of this script will create the following file tree:

    <anneal_date:YYYYMMMDD>anneal_rawdarks_aquery<download_date:YYYYMMMDD>/
        ctecorr_darks/
        mastDownload/HST/  (created by the astroquery download command)    
            id**/          (creates ID specific folders with the raw files in)
        raw_darks/         (folder for the raw darks combined from their individual download directories)
    
An example of using the code in a terminal is below:

In [None]:
# ************************************************************
# Example of inputs
# -t|--type = 'dark'
# -s|--data_start = '58676.18376157'
# -e|--data_end = '58694.25990740'
# -r|--raw_dir = '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/raw_darks'

# Set the following flag which will set download to ``True``
# -d|--download
# OTHERWISE set dload_date to point to a directory of data downloaded 
# on this date to do just the file organization and copying
# -l|--dload_date = '2019Aug13'
# ************************************************************

# Move into codes directory and start screen
# cd /user/lprichard/hst_wfc3_uvis_reduction/darks_codes/lp_darks
# screen -S darks_dload

# Use the following command in a terminal, the "now" definition ahead is for the log 
# name that the terminal output is piped to, which is used at the end of the command. "pdb" ipython command is for debugging.
# Set to download
now=$(date +"%m_%d_%Y") && ipython --pdb -c "%run download_data.py  -t 'dark' -s '58676.18376157' -e '58694.25990740' \
    -r '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/raw_darks' -d" 2>&1 | tee /user/lprichard/hst_wfc3_uvis_reduction/darks_codes/lp_darks/logs/log_aquery_$now.txt
# OR for file organization of downloaded data
now=$(date +"%m_%d_%Y") && ipython --pdb -c "%run download_data.py  -t 'dark' -s 58676.18376157 -e 58694.25990740 \
    -r '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/raw_darks' -l '2019Aug13'" 2>&1 | tee /user/lprichard/hst_wfc3_uvis_reduction/darks_codes/lp_darks/logs/log_aquery_$now.txt

________________________
# CTE Correct Darks

**6) Check WFC3 CTE correction executable**

____________________________
NOTE: The following method is based on an old CTE correction code. As of Feb 2020, STScI are testing a new CTE correction code. Check back to the GitHub repository or the STScI website for updates prior to running this step in case this new code has been released.
____________________________

Using the edited version of `cal_uvis_make_ctecorr_darks.py`

    hst_wfc3_uvis_reduction/darks_codes/lp_darks/cal_uvis_make_ctecorr_darks/cal_uvis_make_ctecorr_darks.py
    
This does not depend on Quicklook (as used by STScI) and the files are organized using information in the header (rather than the default STScI file organization). This code is dependent on the standard CTE corrections code written by Jay Anderson and available here:

http://www.stsci.edu/hst/instrumentation/wfc3/software-tools/cte-tools

In a software directory of choice, download the file using `copy link address` on the above web page 
    
    e.g. cd /user/lprichard/software
    wget http://www.stsci.edu/~jayander/X/EXPORT_WFC3UV_CTE/wfc3uv_ctereverse.F
    
Following instructions on the web page, then do the following:

    gfortran wfc3uv_ctereverse.F -o wfc3uv_ctereverse.e
    
WARNING, the code must then be run on the computer that the software is compiled on. The SOFTWARE_DIR is a required input to the code.
    
The webpage says the following 

    "Then you can run the program.  This particular routine 
     insists that all the exposures be in the same directory.  
     The executable can be in a different directory, but the
     images you're reading in and operating on *must* be in 
     your current directory."
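
Since the compiled executable must exist on the machine running the pipeline, a quick check before starting a long run can save time. A minimal sketch, assuming the executable keeps the name `wfc3uv_ctereverse.e` used in the compile command above; `cte_executable_ok` is a hypothetical helper.

```python
import os

def cte_executable_ok(software_dir, exe_name="wfc3uv_ctereverse.e"):
    """Return True if the compiled CTE code exists in software_dir and is executable."""
    exe = os.path.join(software_dir, exe_name)
    return os.path.isfile(exe) and os.access(exe, os.X_OK)
```

This only confirms the file is present and executable; it does not verify that it was compiled on the current machine.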

**7) Run CTE correction**

Calling sequence of the CTE correction code:

    python cal_uvis_make_ctecorr_darks.py [-c|--ctecorr_dir] [-r|--rwd_dir] [-s|--software_dir]
        
Required:

    [-c|--ctecorr_dir] -- String, path to the output CTE corrected data directory (as made in download_data.py to [raw_dir]/[anneal_dir]/ctecorr_darks), no trailing slash "/".
    [-r|--rwd_dir] -- String, path to the input raw data directory (all files in one folder as done in previous step with download_data.py to [raw_dir]/[anneal_dir]/raw_darks), no trailing slash "/".
    [-s|--software_dir] -- String, path to the compiled CTE correction code ./wfc3uv_ctereverse.e, no trailing slash "/".
    
The following is an example of running the code in a terminal:    

In [None]:
# ************************************************************
# Example of inputs
# -c|--ctecorr_dir = '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/raw_darks/2019Mar19anneal_rawdarks_aquery2019Aug13/ctecorr_darks'
# -r|--rwd_dir = '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/raw_darks/2019Mar19anneal_rawdarks_aquery2019Aug13/raw_darks'
# -s|--software_dir = '/user/lprichard/software' 
# ************************************************************

# Move into codes directory and start screen, should take ~1hour depending on processing power
# cd /user/lprichard/hst_wfc3_uvis_reduction/darks_codes/lp_darks/cal_uvis_make_ctecorr_darks
# screen -S ctecorr

# Use the following command in a terminal, the "now" definition ahead is for the log 
# name that the terminal output is piped to, which is used at the end. "pdb" ipython command is for debugging.
now=$(date +"%m_%d_%Y") && ipython --pdb -c "%run cal_uvis_make_ctecorr_darks.py \
    -c '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/raw_darks/2019Mar19anneal_rawdarks_aquery2019Aug13/ctecorr_darks' \
    -r '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/raw_darks/2019Mar19anneal_rawdarks_aquery2019Aug13/raw_darks' \
    -s '/user/lprichard/software'" 2>&1 | tee /user/lprichard/hst_wfc3_uvis_reduction/darks_codes/lp_darks/logs/log_ctecorr_$now.txt


___________
# Update Headers to Point to Right Calibration Files

**If required**

Case specific: i.e. if you want to use a new bias file, check in the headers if the calibration files are as expected. If not, update them using something like the following:

In [None]:
import os
import glob
import astropy.io.fits as fits

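# Move into CTE correction directory
CTE_CORR_DIR = '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/raw_darks/2019Mar19anneal_rawdarks_aquery2019Aug13/ctecorr_darks'  #EDIT! CTE corrected darks directory from Step 7
os.chdir(CTE_CORR_DIR)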

# For each file in there, check the headers and update:
for f in glob.glob('*rac.fits'):
    print(f)
    
    # Open image and get the header
    hdul = fits.open(f, 'update')
    hdr = hdul[0].header
    
    # Get and update bias
    print("Updating ", hdr['BIASFILE'], "to iref$38c19068i_bia.fits")
    hdr['BIASFILE'] = 'iref$38c19068i_bia.fits'    #Changed from old bias file 'iref$37n1502si_bia.fits'
    
    # Save changes
    hdul.flush()
    hdul.close()

__________________
# Reduce Darks

**8) Get anneal dates as inputs**

As in step 4), get the dates of the observations in question (e.g. from MAST https://mast.stsci.edu/portal/Mashup/Clients/Mast/Portal.html) and then find the anneal cycles needed for those data from http://www.stsci.edu/hst/instrumentation/wfc3/performance/monitoring/complete-anneal-history (try Safari or other browser if not working with Chrome, email WFC3 team member if not up to date).

Get start and end dates/times for **ONE** anneal cycle at a time in the form ``YYYYMMDD-HH:MM:SS``. An anneal cycle is labeled by its *start* date, i.e. the March 19 2019 anneal appears as `58561.85943287 2019.078 20:37:35 19-Mar anneal` on the webpage, which translates to an `--anneal_date` input of `20190319-20:37:35`. The appropriate `--endtime` of this anneal cycle is the start date of the next anneal cycle (April 17 2019) as all files *before* this time are included in the anneal cycle. So from the webpage: `58590.47559027 2019.107 11:24:51 17-Apr anneal` translates to `--endtime` input of `20190417-11:24:50`.

Again, if wanting to process darks from the **current** anneal cycle, the anneal start date should still be given, and the anneal end date can be now. This is generated in the right format below. For convenience, you can convert the start and end times to the required input format from the Modified Julian Date (MJD; values from link above) using the following code:

In [None]:
from astropy.time import Time

# ------------------------------
# INPUT
# Set the input anneal cycle start and end MJD dates
anneal_start = 58561.85943287
anneal_end = 58590.47559027
# ------------------------------

# Converting MJD to YYYYMMDD-HH:MM:SS format
# Anneal start date
t_start = Time(anneal_start, format='mjd')
an_start = t_start.strftime('%Y%m%d-%H:%M:%S')
print('Anneal start date: ', an_start)

# Anneal end date
t_end = Time(anneal_end, format='mjd')
an_end = t_end.strftime('%Y%m%d-%H:%M:%S')
print('Anneal end date: ', an_end)

# Anneal end date of now if in present cycle
now = Time(Time.now(), format='iso').mjd
t_now = Time(now, format='mjd')
an_now = t_now.strftime('%Y%m%d-%H:%M:%S')
print('Now end date: ', an_now)
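
The `YEAR.DAY HH:MM:SS` column of the anneal history page (Step 8) can also be converted to the required format directly, without going through MJD. A minimal sketch using only the standard library; `yday_to_anneal_format` is a hypothetical helper.

```python
from datetime import datetime

def yday_to_anneal_format(yday_str):
    """Convert 'YYYY.DDD HH:MM:SS' (day of year) to 'YYYYMMDD-HH:MM:SS'."""
    parsed = datetime.strptime(yday_str, "%Y.%j %H:%M:%S")
    return parsed.strftime("%Y%m%d-%H:%M:%S")

# Start of the March 19 2019 anneal as listed on the webpage
print(yday_to_anneal_format("2019.078 20:37:35"))  # 20190319-20:37:35
```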

**If running the code in mode (-m) "prod"**

This mode is currently the default used by STScI (although developments to change this default are underway as of Feb 2020). With this code, it is advised to use the ``dev`` mode, which improves the quality of the darks significantly. ``dev`` replaces good pixels using darks from the concurrent anneal cycle, rather than from the previous anneal cycle (as in ``prod`` mode). ``prod`` is usually used just to save time, as with ``dev`` one has to wait for the anneal cycle to end before processing can begin. 

If you wish to run the code in ``prod`` mode then this is possible, but you have to retrieve the masterdark from the previous anneal cycle in order to do so. These are stored in the following directory and can be requested from the WFC3 team:

    /grp/hst/wfc3k/uvis_darks/masterdarks/
    
Masterdarks have the following naming structure. The file below is for, e.g., Feb 21, 2019, the start date of the anneal cycle preceding the one being run (e.g., March 19, 2019):

    masterdark_2019-02-21_ctecorr.fits

These should be saved locally to the `masterdarks/` directory that should be made *within the reduction directory prior to running the code* and this masterdark placed there e.g.,

    /user/lprichard/hst_wfc3_uvis_reduction/darks_data/red_darks/masterdarks
    
NOTE: If running in ``dev`` mode, the `masterdarks/` directory will be made automatically in the next step.
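
For ``prod`` mode, the expected masterdark file name can be built from the start date of the previous anneal cycle and checked before starting a run. This sketch assumes the `masterdark_YYYY-MM-DD_ctecorr.fits` pattern shown above; `masterdark_path` is a hypothetical helper.

```python
import os
from datetime import date

def masterdark_path(masterdark_dir, prev_anneal_date):
    """Build the path of the CTE corrected masterdark for the previous anneal cycle."""
    name = "masterdark_{}_ctecorr.fits".format(prev_anneal_date.isoformat())
    return os.path.join(masterdark_dir, name)

# Previous anneal cycle (Feb 21, 2019) for a run on the March 19, 2019 cycle
path = masterdark_path(
    "/user/lprichard/hst_wfc3_uvis_reduction/darks_data/red_darks/masterdarks",
    date(2019, 2, 21))
print(os.path.basename(path))  # masterdark_2019-02-21_ctecorr.fits
```

A call to `os.path.isfile(path)` then confirms whether the file has been put in place.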

**9) Run the code**

Calling sequence of the darks reduction code:

    python cal_uvis_make_darks.py [-a|--anneal_date] [-e|--endtime]
        [-c|--ctecorr] [-p|--postflash] [-m|--mode] [-r|--red_dir]
        [-d|--ctecorr_dir] [-l|--cal_dir] [-i|--iref_dir] [-f|--fitpix]
        
Required:

    [-a|--anneal_date] -- String, start date for ONE anneal in format generated in Step 8 from MJD date.
    [-e|--endtime] -- String, end date for ONE anneal in format generated in Step 8 from MJD date. OR today's date if within the anneal cycle to be reduced.
    [-r|--red_dir] -- String, path to reduction directory for outputs and reduced data, no trailing slash "/". A directory for each anneal cycle is made within that.
    [-d|--ctecorr_dir] -- String, path to raw CTE corrected data directory, no trailing slash "/".
    [-l|--cal_dir] -- String, path to STScI calibration files needed for reduction (Step 3), no trailing slash "/".

Optional:

    [-c|--ctecorr] -- Flag, advised with this version. Sets CTE corrected to ``True``, use for data that was CTE corrected prior to running code.
    [-p|--postflash] -- Flag, advised. Sets postflash to ``True``, postflashed darks will be used.
    [-m|--mode] -- String, method of replacing pixels in the superdarks, ``dev`` takes pixels from the concurrent anneal cycle's masterdark, ``prod`` takes pixels from the previous anneal's masterdark. Advised ``dev``, default ``dev``, other option ``prod`` (which is the current ST default).
    [-i|--iref_dir] -- String, path to IREF files. Default ``/grp/hst/cdbs/iref``, set if different. 
    [-f|--fitpix] -- Flag, advised. Option to create a hot pixel threshold function by row number to find the number of hot pixels to match that close to the read out, i.e. increases completeness of identifying hot pixels. The value is ``True`` (i.e. fit hot pixels with a threshold function) if provided and ``False`` (i.e. don't fit hot pixels with a threshold function) if not provided, in which case a constant hot pixel threshold (0.015 e-/s) will be used (standard in the STScI pipeline).
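
When reducing several anneal cycles, assembling the call as a list of arguments is less error-prone than editing a long shell line. The sketch below uses only the flags documented above; `build_darks_command` is a hypothetical helper and is not part of the pipeline.

```python
def build_darks_command(anneal_date, endtime, red_dir, ctecorr_dir, cal_dir,
                        mode="dev", iref_dir=None,
                        ctecorr=True, postflash=True, fitpix=True):
    """Return the cal_uvis_make_darks.py call as a list of command-line arguments."""
    if mode not in ("dev", "prod"):
        raise ValueError("mode must be 'dev' or 'prod'")
    cmd = ["python", "cal_uvis_make_darks.py",
           "-a", anneal_date, "-e", endtime, "-m", mode,
           # Directories are documented as taking no trailing slash
           "-r", red_dir.rstrip("/"),
           "-d", ctecorr_dir.rstrip("/"),
           "-l", cal_dir.rstrip("/")]
    if iref_dir is not None:
        cmd += ["-i", iref_dir]
    # Boolean flags are True when present, False when absent
    for flag, enabled in (("-c", ctecorr), ("-p", postflash), ("-f", fitpix)):
        if enabled:
            cmd.append(flag)
    return cmd
```

The returned list could be passed to `subprocess.run(cmd, cwd=...)` with `cwd` set to the `cal_uvis_make_darks/` directory.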
    
The following is an example of running the code in a terminal:

In [None]:
# *******************************************************************************
# Example of inputs
# Must be run for one anneal cycle at a time
# -a|--anneal_date = '20190319-20:37:35'
# -e|--endtime = '20190417-11:24:51'
# -m|--mode = 'dev'

# Directories set by user
# -r|--red_dir = '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/red_darks'    #Reduction directory for all output files, advised not to put with data
# -d|--ctecorr_dir = '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/raw_darks/2019Mar19anneal_rawdarks_aquery2019Aug13/ctecorr_darks'
# -l|--cal_dir = '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/st_calib'
# -i|--iref_dir = '/grp/hst/cdbs/iref'   #same as default OR '/user/lprichard/hst_wfc3_uvis_reduction/red_dir_lp/myref/'

# Set the following flags which will set each to ``True``
# -c|--ctecorr
# -p|--postflash
# -f|--fitpix    
# *******************************************************************************

# Move into codes directory and start screen, should take ~1hour depending on processing power
# cd /user/lprichard/hst_wfc3_uvis_reduction/darks_codes/lp_darks/cal_uvis_make_darks
# screen -S darks
        
# Use the following command in a terminal, the "mode" and "now" definitions ahead are for the log 
# name that the terminal output is piped to which is called at the end. "pdb" ipython command is for debugging.
mode="dev" && now=$(date +"%m_%d_%Y") && ipython --pdb -c "%run cal_uvis_make_darks.py -a '20190319-20:37:35' \
    -e '20190417-11:24:51' -c -p -m '$mode' -r '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/red_darks' \
    -d '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/raw_darks/2019Mar19anneal_rawdarks_aquery2019Aug13/ctecorr_darks' \
    -l '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/st_calib' -i '/grp/hst/cdbs/iref' -f" \
    2>&1 | tee /user/lprichard/hst_wfc3_uvis_reduction/darks_codes/lp_darks/logs/log_inputtest_$now.txt


__________________
# Copying and Delivering Darks

**10) Copy darks to the combined superdark directory**

If it doesn't exist (it should be created automatically in `cal_uvis_make_darks.py`), the following example code makes a `superdarks/` directory in the reduction directory e.g., `darks_data/red_darks/` for a combined directory of all the superdarks. It then copies the superdarks from a specified anneal cycle directory (made by `cal_uvis_make_darks.py` in the previous step) to that directory. The anneal cycle directory has the following form: `post-anneal-<anneal_date:YYYYMMDD>_procd-<process_date:YYYYMMDD>_<flags>/`.

In [None]:
import os
import glob
import shutil

# ------------------------------
# INPUTS
#These should be set by the user
RED_DIR = '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/red_darks'  #General reduction directory containing the anneal directories
ANN_DIR = os.path.join(RED_DIR, 'post-anneal-20190319_procd-20190910_ctecorr_dev_fitpix')    #Anneal directory from which superdarks should be copied

# These shouldn't require editing if the above file structure is the same
PDARK_DIR = os.path.join(ANN_DIR, 'masterdark_create')  #This is the standard processed darks directory made by the pipeline
SDARK_DIR = os.path.join(RED_DIR, 'superdarks')         #Location of master superdark directory, this is made in cal_uvis_make_darks.py but checked for and made below if it doesn't exist
# ------------------------------

# Check for the superdark combined directory and make if it doesn't exist
if os.path.exists(SDARK_DIR):
    print('Superdark directory exists:', SDARK_DIR)
else:
    os.makedirs(SDARK_DIR, 0o774)
    print('Created superdark directory:', SDARK_DIR)

# Move to the processed superdark directory
os.chdir(PDARK_DIR)

# Copies superdarks out of the processed data directory to the superdarks/ directory if it doesn't exist
n=0
for sdark in glob.glob('d*_drk.fits'):
    # Define source and destination paths for each superdark
    src = os.path.join(PDARK_DIR, sdark)   #Source filepath of superdark to be copied
    dst = os.path.join(SDARK_DIR, sdark)   #Destination filepath of superdark to be copied
    
    # Check if the superdark is already in the superdarks/ combined directory, if not it is copied
    if not os.path.exists(dst):
        shutil.copy(src, dst)
        print('Copied {} to {}'.format(src, dst))
        n+=1
    else: 
        print('File exists {}, not copying superdark from {}'.format(dst, PDARK_DIR))
    
print('Copied {} files to combined superdark directory {}'.format(n, SDARK_DIR))
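
The anneal cycle directory name from Step 10 encodes the anneal date, processing date and reduction flags, so it can be parsed when looping over several cycles. A sketch assuming the `post-anneal-<YYYYMMDD>_procd-<YYYYMMDD>_<flags>` form described above; `parse_anneal_dir` is a hypothetical helper.

```python
import re

# Pattern for post-anneal-<anneal_date>_procd-<process_date>_<flags>
ANNEAL_DIR_RE = re.compile(
    r"^post-anneal-(?P<anneal>\d{8})_procd-(?P<procd>\d{8})(?:_(?P<flags>.+))?$")

def parse_anneal_dir(name):
    """Split an anneal directory name into its parts, or return None if it does not match."""
    match = ANNEAL_DIR_RE.match(name.rstrip("/"))
    if match is None:
        return None
    flags = match.group("flags")
    return {"anneal_date": match.group("anneal"),
            "process_date": match.group("procd"),
            "flags": flags.split("_") if flags else []}

print(parse_anneal_dir("post-anneal-20190319_procd-20190910_ctecorr_dev_fitpix"))
```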

**11) Update the dark_lookup.txt file**

For processing your data using these darks, a reference table (`dark_lookup.txt`) is needed to identify which files to use. The following code reads in the current dark_lookup.txt file (**create a blank one if it doesn't yet exist**) and updates it with any new superdarks (names and useafter date) in the combined superdarks directory. It saves it to a specified output `dark_lookup.txt` file which can be the same as the input.

The input dark_lookup.txt file should have `|` delimiters and contain: superdark name (e.g. `d190320201_drk.fits`), and useafter dates (in long form e.g. `Mar 19 2019 20:37:35`, and MJD `58561.85943287037`) per row. 

Below is code to update the `dark_lookup.txt` file with options to set if the MJD useafter date is not listed in the input file (`calc_mjd=True`), and an overwrite option to overwrite existing filenames in the file (shouldn't need to overwrite usually). To save the output file, set `writef=True`, if `False` the combined output will be printed but the file not updated.

In [None]:
import os
import glob
from astropy.io.fits import getheader
from pdb import set_trace as st
from astropy.time import Time
from datetime import datetime as dt
import pandas as pd

# ---------------------------
# INPUTS
# Location of the copied superdarks, no trailing "/"
RED_DIR = '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/red_darks'
SDARK_DIR = os.path.join(RED_DIR, 'superdarks')
# The location of the input dark_lookup.txt file, should have '|' delimiters
infile = '/user/lprichard/hst_wfc3_uvis_reduction/darks_codes/lp_darks/ref_files/dark_lookup.txt'
# The location of the output dark_lookup.txt file (can be the same as the input), will have filename, useafter date in long form and in MJD, and '|' delimiters
outfile = '/user/lprichard/hst_wfc3_uvis_reduction/darks_codes/lp_darks/ref_files/dark_lookup.txt'

# If the infile does not have a useafter MJD column, set this to True to calculate one
calc_mjd = False
# Set overwrite to 'False' if you don't want to replace entries for superdarks already listed in dark_lookup.txt
overwrite = False
# If wanting to write the output file, set to 'True', otherwise the output will be printed but not saved to a file
writef = True
# ---------------------------

# Read in the input dark_lookup.txt
# If no useafter_mjd column exists, make one
if calc_mjd==True:
    df_in = pd.read_csv(infile, names='superdark useafter'.split(), delimiter='|')

    # Convert useafter dates to MJD for old file format
    uain_mjds = []
    for i, ua_in in enumerate(df_in['useafter']):
        date = dt.strptime(ua_in, '%b %d %Y %H:%M:%S')
        iso = date.strftime('%Y-%m-%d %H:%M:%S.%f')
        uain_mjds.append(Time(iso, format='iso').mjd)

    # Add new column to the input file for MJD
    df_in['useafter_mjd'] = uain_mjds
    
# If a useafter_mjd column exists, read in file as is
else:
    df_in = pd.read_csv(infile, names='superdark useafter useafter_mjd'.split(), delimiter='|')
    
# Move to the directory with the copied superdarks
os.chdir(SDARK_DIR)    

# Starting arrays for inputs of new superdarks
filenames = []
useafters = []
useafter_mjds = []

# For each new superdark, get the filenames, useafter dates from header and calculate an MJD useafter date
n=0
for f in glob.glob('d*_drk.fits'):

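    # Check if the superdark is already listed in the dark_lookup.txt file
    listed = df_in['superdark'].str.contains(f, regex=False, na=False).any()
    if listed and overwrite:
        # Drop the old entry so the values read from the header below replace it
        df_in = df_in[~df_in['superdark'].str.contains(f, regex=False, na=False)]
    if overwrite or not listed: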
        
        print('Adding {} to dark_lookup.txt'.format(f))
        
        # Read in superdark header
        hdr = getheader(f)

        # Converting useafter date to MJD to sort
        useafter = hdr['USEAFTER']
        date = dt.strptime(useafter, '%b %d %Y %H:%M:%S')
        iso = date.strftime('%Y-%m-%d %H:%M:%S.%f')
        mjd = Time(iso, format='iso').mjd

        # Storing values for each superdark
        filenames.append(f)
        useafters.append(useafter)
        useafter_mjds.append(mjd)
        
        n+=1
    else:
        print('{} already listed in dark_lookup.txt, not overwriting entry'.format(f))

# Store new superdark values in a data frame
df_new = pd.DataFrame({})
df_new['superdark'] = filenames
df_new['useafter'] = useafters
df_new['useafter_mjd'] = useafter_mjds

# Combine the old and new dark_lookup data frames
df_out = pd.concat([df_in, df_new])

# Remove duplicates if they exist in the combined data frame by superdark filename
df_out.drop_duplicates(subset ="superdark", keep = "first", inplace = True)

# Sort the combined data frame based on useafter MJD date
df_out = df_out.sort_values(by=['useafter_mjd']).reset_index(drop=True)

print(df_out)
print('Added {} new superdarks to {}'.format(n, outfile))

# Save the new dark_lookup.txt file to the specified output file (can be the same as input) with delimiter '|'
if writef==True:
    print('Writing to', outfile)
    df_out.to_csv(outfile, index=False, sep='|', header=False)
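
As a quick cross-check of the `dark_lookup.txt` rows described in Step 11, the long-form useafter date and the MJD column can be compared without astropy. The sketch below counts days since the MJD epoch (1858 Nov 17) with the standard library and ignores any time-scale subtleties; `useafter_to_mjd` and `parse_lookup_row` are hypothetical helpers, and the example row simply combines the example values quoted above.

```python
from datetime import datetime

MJD_EPOCH = datetime(1858, 11, 17)  # MJD 0.0

def useafter_to_mjd(useafter):
    """Convert a long-form useafter date, e.g. 'Mar 19 2019 20:37:35', to MJD."""
    parsed = datetime.strptime(useafter, "%b %d %Y %H:%M:%S")
    return (parsed - MJD_EPOCH).total_seconds() / 86400.0

def parse_lookup_row(line):
    """Split one '|'-delimited row into (superdark, useafter, useafter_mjd)."""
    name, useafter, mjd = [part.strip() for part in line.split("|")]
    return name, useafter, float(mjd)

# Illustrative row assembled from the example values in Step 11
row = "d190320201_drk.fits|Mar 19 2019 20:37:35|58561.85943287037"
name, useafter, mjd = parse_lookup_row(row)
print(name, abs(useafter_to_mjd(useafter) - mjd) < 1e-8)  # d190320201_drk.fits True
```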

**If required**

Create a map of all raw dark file IDs that went into making each superdark.

In [None]:
import os
import glob
from astropy.io.fits import getheader
from pdb import set_trace as st

# ---------------------------
# INPUT, NO trailing "/"
RED_DIR = '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/red_darks'
SDARK_DIR = os.path.join(RED_DIR, 'superdarks')
write_map = True
# ---------------------------

# Move to the combined superdarks directory
os.chdir(SDARK_DIR)

# Defining the output filename above where the superdarks are kept
outfile = os.path.join(os.path.split(SDARK_DIR)[0], 'superdark_idmap.txt')
print('Writing to', outfile)
if write_map==True: file = open(outfile, "w")

if write_map==True: 
    # Writing ID map file pre-amble
    file.write('# Superdark raw input file ID map saved to' + '\n')
    file.write('# ' + outfile + '\n')
    file.write('# The parent directory has the following format:' + '\n') 
    file.write('# post-anneal-[anneal date YYYYMMDD]_procd-[processed date YYYYMMDD]_[reduction method]' + '\n')
    file.write('# Each superdark in the following directory (d*_drk.fits) was made of the raw IDs' + '\n') 
    file.write('# listed below each superdark in this file, info also in each superdark header' + '\n')
    file.write('# ' + SDARK_DIR + '\n')

# For each superdark, get input raws from header and print
for f in glob.glob('*'):
    # The superdark name
    print("superdark: ", f)
    if write_map==True: file.write(f + '\n')
    
    # Read in superdark header
    hdr = getheader(f)
    # get USEAFTER date
    print("USEAFTER =", hdr['USEAFTER'])
    if write_map==True: file.write("USEAFTER = " + hdr['USEAFTER'] + '\n') 
    
    # The raw ID names
    hist = hdr['HISTORY']
    print("raw IDs:") 
    for i, ht in enumerate(hist):
        if ('id' in hist[i]) and (' ' not in ht):
            print(ht)
            if write_map==True: file.write(ht + '\n')
            
if write_map==True: file.close() 
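
The ID map written above is plain text, so it can be read back into a dictionary when checking which raw darks went into a given superdark. The following is a minimal sketch, assuming the layout produced by the cell above (a superdark name, then a `USEAFTER = ...` line, then one raw ID per line). The function name `parse_idmap` and the sample file names are hypothetical.

```python
def parse_idmap(lines):
    """Parse ID map lines into {superdark: {'useafter': str, 'raw_ids': [str, ...]}}."""
    # Drop blank lines and the '#' preamble
    entries = [ln.strip() for ln in lines if ln.strip() and not ln.strip().startswith('#')]
    idmap = {}
    current = None
    for i, line in enumerate(entries):
        nxt = entries[i + 1] if i + 1 < len(entries) else ''
        if nxt.startswith('USEAFTER'):
            # A line directly followed by a USEAFTER line is a superdark name
            current = line
            idmap[current] = {'useafter': None, 'raw_ids': []}
        elif line.startswith('USEAFTER') and current is not None:
            idmap[current]['useafter'] = line.split('=', 1)[1].strip()
        elif current is not None:
            # Everything else belongs to the current superdark as a raw ID
            idmap[current]['raw_ids'].append(line)
    return idmap

# Hypothetical sample in the same layout as superdark_idmap.txt
sample = """# Superdark raw input file ID map saved to
# /some/path/superdark_idmap.txt
d190101_drk.fits
USEAFTER = Jan 01 2019 00:00:00
id0001abq
id0002cdq
d190105_drk.fits
USEAFTER = Jan 05 2019 00:00:00
id0003efq
""".splitlines()

idmap = parse_idmap(sample)
for sd, info in idmap.items():
    print(sd, info['useafter'], info['raw_ids'])
```

For a real file, pass the open file handle instead of `sample`, e.g. `parse_idmap(open(outfile))`.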

____________

# Download and CTE Correct Science Data

**12) Set up the science data directories**

Outside of the `hst_wfc3_uvis_reduction/` parent directory, set up a parent directory for all science data. A project-specific directory will be made inside it later by `download_data.py` (Step 13). The following cell sets and makes example directories:

In [None]:
import os

# Setting path names
SCI_ROOT = "/user/lprichard/project"         #EDIT! Root directory for a whole science program
DAT_DIR = os.path.join(SCI_ROOT, 'sci_data')      #Location of the science data both raw and reduced
RAW_DIR = os.path.join(DAT_DIR, 'raw_sci')        #Raw science data directory
RED_DIR = os.path.join(DAT_DIR, 'red_sci')        #Reduced science data directory

print('SCI_ROOT =', SCI_ROOT)
print('DAT_DIR =', DAT_DIR)
print('RAW_DIR =', RAW_DIR)
print('RED_DIR =', RED_DIR)

# Making data directories if they don't exist
if not os.path.exists(DAT_DIR): 
    os.makedirs(DAT_DIR, 0o774)
if not os.path.exists(RAW_DIR): 
    os.makedirs(RAW_DIR, 0o774)
if not os.path.exists(RED_DIR): 
    os.makedirs(RED_DIR, 0o774)

**13) Download the raw data files with astroquery**

Get the proposal ID for the data (e.g., from MAST https://mast.stsci.edu/portal/Mashup/Clients/Mast/Portal.html). Then use `hst_wfc3_uvis_reduction/darks_codes/lp_darks/download_data.py` to download the science data and do file organization as for Step 5 with the darks, but now with different inputs for the science data. The code makes directories, downloads data with astroquery and sorts raw science data files ready for CTE corrections. Run using the following command in `hst_wfc3_uvis_reduction/darks_codes/lp_darks/`:

        python download_data.py [-t|--type] [-s|--data_start] [-e|--data_end] [-p|--proposal_id] [-r|--raw_dir] [-d|--download] [-l|--dload_date]
        
For the science data, ``--type='science'`` and ``--proposal_id`` must be set. To select files from only part of a proposal (e.g., a large or ongoing program), also set the ``--data_start`` and ``--data_end`` dates along with the required ``--proposal_id``. These must be in MJD, and all files up to but **NOT INCLUDING** the end date are selected. Today's date can be used as the end date with the code below. **UPDATE April 2020: use astroquery to get the dates (example in the first cell below), as data start times differ slightly between MAST and astroquery. Files will be missed if a MAST start time is used with an astroquery command.**

Execution of this script will create the following file tree:

    PID<proposal_id>_rawdata_aquery<download_date:YYYYMonDD>/
        ctecorr_sci/       (for CTE corrected data)
        mastDownload/HST/  (created by the astroquery download command)    
            id**/          (creates ID specific folders with the files in)
        raw_sci/           (for raw data copied from the download directory into a single directory)
        calwf3_sci/        (for processing science data with new darks with calwf3)
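
The later steps need the full paths of these directories as inputs. The following is a minimal sketch of how the directory names in the tree above can be built from a proposal ID and download date. The naming pattern is inferred from the tree and the example paths in this notebook, so `download_data.py` itself may differ in detail (e.g., in how the day is padded). The function names are hypothetical.

```python
import os
from datetime import date

# Sub-directories listed in the file tree above
SUBDIRS = ('ctecorr_sci', 'mastDownload', 'raw_sci', 'calwf3_sci')

def download_dir_name(proposal_id, download_date):
    """Build PID<proposal_id>_rawdata_aquery<YYYYMonDD> (assumed pattern)."""
    # %b gives the abbreviated month name (e.g. 'Sep'); this depends on the locale
    return 'PID{}_rawdata_aquery{}'.format(proposal_id, download_date.strftime('%Y%b%d'))

def expected_tree(raw_dir, proposal_id, download_date):
    """Return the expected sub-directory paths under the user-defined raw_dir."""
    top = os.path.join(raw_dir, download_dir_name(proposal_id, download_date))
    return [os.path.join(top, sub) for sub in SUBDIRS]

name = download_dir_name('12345', date(2019, 9, 13))
print(name)
for path in expected_tree('/user/lprichard/project/sci_data/raw_sci', '12345', date(2019, 9, 13)):
    print(path)
```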

For convenience, the first cell below produces a readable table of observations and times for a proposal ID (PID) from astroquery. The second cell converts calendar dates to the MJD values they correspond to.

In [None]:
# Search for the files, ASNs and the start times with astroquery rather than taking them from MAST

from astroquery.mast import Observations
import pandas as pd
import numpy as np
from astropy.time import Time

# ---------------------------------
# INPUTS
# Proposal ID
PID = '12345'
# ---------------------------------

def mjd_to_str(t):
    """Converts Modified Julian Date (MJD) input (t) 
    to a readable date string (t_str)."""
    t_mjd = Time(t, format='mjd')
    t_str = t_mjd.strftime('%Y-%m-%d %H:%M:%S')
    return t_str

# Select all science observations for the proposal ID 
sciobs = Observations.query_criteria(intentType='science', instrument_name="WFC3/UVIS", proposal_id='{}'.format(PID))
# See available columns in the result
print(sciobs.columns)

# Convert astropy table to a dataframe for manipulation
so_df = pd.DataFrame(np.array(sciobs))

# Get the easily readable date string from the MJD dates
so_df['t_min_str'] = so_df['t_min'].apply(mjd_to_str)
so_df['t_max_str'] = so_df['t_max'].apply(mjd_to_str)

# Print just the ASN numbers (obs_id), start and end times (t_min, t_max) in MJD and string form in descending date order
so_df[['obs_id','t_min','t_min_str','t_max','t_max_str']].sort_values('t_min', ascending=False)

In [None]:
from astropy.time import Time
from datetime import datetime as dt

# ---------------------------------
# Start and end dates (take these from the astroquery t_min_str column above rather than MAST, see the April 2020 update)
# All files up to but NOT INCLUDING the end_date are selected
start_date = '2019-03-21 05:07:16'
end_date = '2019-04-10 18:12:42'
# ---------------------------------

# Determine the MJD for now
now = Time(Time.now(), format='iso').mjd
print('MJD Now:', now)

# Convert these to MJD
start_mjd = Time(start_date, format='iso').mjd
end_mjd = Time(end_date, format='iso').mjd

print('MJD start:', start_mjd)
print('MJD end:', end_mjd)
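
Because the end date is excluded, the selection behaves like a half-open interval. The following is a minimal sketch of that logic on a few hypothetical records. Whether the start date is inclusive in `download_data.py` is an assumption here, and the observation IDs and MJD values are made up for illustration.

```python
def select_in_window(records, start_mjd, end_mjd):
    """Return records with start_mjd <= t_min < end_mjd (end date excluded)."""
    return [r for r in records if start_mjd <= r['t_min'] < end_mjd]

# Hypothetical observations with start times in MJD
records = [
    {'obs_id': 'obs_a', 't_min': 58563.2},
    {'obs_id': 'obs_b', 't_min': 58570.0},
    {'obs_id': 'obs_c', 't_min': 58583.75},
]

# obs_c starts exactly at the end date, so it is NOT selected
selected = select_in_window(records, 58563.2, 58583.75)
print([r['obs_id'] for r in selected])
```

To include the last observation, set the end date slightly later than its start time. The same mask can be applied to the astroquery table above with `so_df[(so_df['t_min'] >= start_mjd) & (so_df['t_min'] < end_mjd)]`.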

An example of running download_data.py from the command line using sample inputs for science raw files: 

In [None]:
# ************************************************************
# Example of inputs
# -t|--type = 'science'
# -p|--proposal_id = '12345'
# -r|--raw_dir = '/user/lprichard/project/sci_data/raw_sci'

# Set the following flag which will set download to ``True``
# -d|--download
# OTHERWISE set dload_date to point to a directory of data downloaded on this date to do just the file organization and copying
# -l|--dload_date = '2019Sep13'
# ************************************************************

# Move into codes directory and start screen
# cd /user/lprichard/hst_wfc3_uvis_reduction/darks_codes/lp_darks
# screen -S sci_dload

# Use the following command in a terminal. The "now" definition sets the date used in the name of the
# log file that the terminal output is piped to at the end. The "--pdb" ipython flag is for debugging.
# Set to download
now=$(date +"%m_%d_%Y") && ipython --pdb -c "%run download_data.py  -t 'science' -p '12345'\
    -r '/user/lprichard/project/sci_data/raw_sci' -d" 2>&1 | tee /user/lprichard/hst_wfc3_uvis_reduction/darks_codes/lp_darks/logs/log_sciaquery_$now.txt
# OR for file organization of downloaded data
now=$(date +"%m_%d_%Y") && ipython --pdb -c "%run download_data.py  -t 'science' -p '12345'\
    -r '/user/lprichard/project/sci_data/raw_sci' -l '2019Sep13'" 2>&1 | tee /user/lprichard/hst_wfc3_uvis_reduction/darks_codes/lp_darks/logs/log_sciaquery_$now.txt

**14) CTE correct the science data**

____________________________
NOTE: The following method is based on an old CTE correction code. As of Feb 2020, STScI are testing a new CTE correction code. Check back to the GitHub repository or the STScI website for updates prior to running this step in case this new code has been released.
____________________________

Next, CTE correct the science data using `hst_wfc3_uvis_reduction/darks_codes/lp_darks/ctecorr_scidata.py`, an adapted version of `cal_uvis_make_ctecorr_darks.py` that checks, copies and CTE corrects the science data. It is called using the following command:

    python ctecorr_scidata.py [-c|--ctecorr_dir] [-r|--rwd_dir] [-s|--software_dir]

Required:

    -c|--ctecorr_dir -- String, path to output CTE corrected data directory made in download_data.py, no trailing slash "/".

    -r|--rwd_dir -- String, path to raw data directory (all files in one folder as done in previous step with download_data.py), no trailing slash "/".

    -s|--software_dir -- String, path to the compiled CTE correction code ./wfc3uv_ctereverse.e, no trailing slash "/".

An example of running ctecorr_scidata.py from the command line using sample inputs: 

In [None]:
# *******************************************************************************
# Example of inputs
# -c|--ctecorr_dir = '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/ctecorr_sci'
# -r|--rwd_dir = '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/raw_sci'
# -s|--software_dir = '/user/lprichard/software' 
# *******************************************************************************

# Move into codes directory and start screen
# cd /user/lprichard/hst_wfc3_uvis_reduction/darks_codes/lp_darks
# screen -S ctesci

# Use the following command in a terminal. The "now" definition sets the date used in the name of the
# log file that the terminal output is piped to at the end. The "--pdb" ipython flag is for debugging.
now=$(date +"%m_%d_%Y") && ipython --pdb -c "%run ctecorr_scidata.py  -c '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/ctecorr_sci'\
    -r '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/raw_sci' \
    -s '/user/lprichard/software'" 2>&1 | tee /user/lprichard/hst_wfc3_uvis_reduction/darks_codes/lp_darks/logs/log_ctecorrPID12345_$now.txt
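
CTE correction can take a long time, so it may be worth checking the three inputs before starting the run. The following is a minimal sketch of such a check. The function `check_ctecorr_inputs` is hypothetical and is not part of `ctecorr_scidata.py`; the executable name is taken from the `--software_dir` description above.

```python
import os

def check_ctecorr_inputs(ctecorr_dir, rwd_dir, software_dir, exe='wfc3uv_ctereverse.e'):
    """Return a list of problems found with the ctecorr_scidata.py inputs (empty if none)."""
    problems = []
    inputs = [('ctecorr_dir', ctecorr_dir), ('rwd_dir', rwd_dir), ('software_dir', software_dir)]
    for label, path in inputs:
        # The script expects paths without a trailing slash
        if path.endswith('/'):
            problems.append('{} has a trailing slash: {}'.format(label, path))
        if not os.path.isdir(path):
            problems.append('{} is not a directory: {}'.format(label, path))
    # The compiled CTE correction code should sit in software_dir
    exe_path = os.path.join(software_dir, exe)
    if not os.path.isfile(exe_path):
        problems.append('CTE executable not found: {}'.format(exe_path))
    return problems

problems = check_ctecorr_inputs(
    '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/ctecorr_sci',
    '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/raw_sci',
    '/user/lprichard/software')
for p in problems:
    print('WARNING:', p)
```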

____________
# Process Science Data with New Darks

**15) Select which superdarks to use for the science data**

For the CTE corrected science data files, the following code cross-references the superdarks listed in `dark_lookup.txt` that are within the `superdarks/` directory to find which superdarks to use to process the data. It creates two files within the CTE correction science data directory:

    file_summary.txt -- a list of the science data raw file names, date of the observations, exposure start time in MJD, and the superdark to use for each raw science file
    superdarks.txt -- is a comma separated version of dark_lookup.txt at the time of running the below code
    
If not all science data files have a corresponding superdark, a warning is given and the output files are **not saved**. You must therefore make sure you have the relevant superdarks for the science data, that these have been moved to the `superdarks/` directory, and that the `dark_lookup.txt` has been updated to reflect the `superdarks/` directory. You must have the saved `file_summary.txt` file to proceed to the next steps.

In [None]:
# Written by Ben Sunnquist, adapted by Laura Prichard
# see what dark to use in calwf3 for each file

from astropy.time import Time
from datetime import datetime as dt
import astropy.io.fits as fits
import numpy as np
from pdb import set_trace as st
import os
import glob
import pandas as pd

# ----------------------------------
# INPUT
CTECORR_DIR = '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/ctecorr_sci'
DARK_LOOKUP = '/user/lprichard/hst_wfc3_uvis_reduction/darks_codes/lp_darks/ref_files/dark_lookup.txt'
# ----------------------------------

# get info for rac files
files = glob.glob(os.path.join(CTECORR_DIR, '*rac.fits'))
times = []
expstarts = []
filenames = []
files_raw = []
for f in files:
    filenames.append(os.path.basename(f))
    # Only swap the file suffix, in case 'rac' also appears elsewhere in the file name
    files_raw.append(os.path.basename(f).replace('rac.fits','raw.fits'))
    # Read the primary header once per file
    hdr = fits.getheader(f, 0)
    times.append('{} {}'.format(hdr['DATE-OBS'], hdr['TIME-OBS']))
    expstarts.append(hdr['EXPSTART'])

# Save rac file info to data frame
df = pd.DataFrame({})
df['file'] = filenames
df['file_raw'] = files_raw
df['date'] = times
df['expstart'] = expstarts
df = df.sort_values(by=['expstart']).reset_index(drop=True)

# get info for superdarks
df_sd = pd.read_csv(DARK_LOOKUP, names='superdark useafter useafter_mjd'.split(), delimiter='|')
df_sd = df_sd.sort_values(by=['useafter_mjd']).reset_index(drop=True)

# add superdark to use to df for each file
sds = []
t = np.array(df_sd['useafter_mjd'])
n=0
for i in range(len(df)):
    expstart = float(df['expstart'][i])
    if len(np.where(expstart>t)[0])>0:
        sd_match = df_sd['superdark'].iloc[np.where(expstart>t)[0][-1]]
        n+=1
    else:
        sd_match = ''
        print('There is no superdark for science data file {} with start time {}'.format(df['file'][i], df['expstart'][i]))
    sds.append(sd_match)

df['superdark'] = sds

# Check if all the raw files have a superdark, if so saving the output files, if not print warning
if n==len(df):
    print('All raw science data files were matched to a superdark')
    df.to_csv(os.path.join(CTECORR_DIR, 'file_summary.txt'), index=False)
    df_sd.to_csv(os.path.join(CTECORR_DIR, 'superdarks.txt'), index=False)
else: 
    print('WARNING: There are not superdarks for all of the raw science data, output not saved')
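
The matching rule used in the cell above is: for each exposure, take the superdark with the latest USEAFTER date that is strictly earlier than the exposure start. The following is a minimal sketch of the same rule using the standard-library `bisect` module in place of `np.where`. The function name `match_superdark`, the file names and the MJD values are hypothetical.

```python
from bisect import bisect_left

def match_superdark(expstart, useafter_mjds, superdarks):
    """Return the superdark with the latest USEAFTER strictly before expstart, or ''.

    useafter_mjds must be sorted in ascending order, with superdarks in the same order.
    """
    # bisect_left counts the USEAFTER dates that are strictly less than expstart
    idx = bisect_left(useafter_mjds, expstart) - 1
    return superdarks[idx] if idx >= 0 else ''

# Hypothetical lookup table, sorted by USEAFTER MJD
useafter_mjds = [58500.0, 58510.0, 58520.0]
superdarks = ['d_a_drk.fits', 'd_b_drk.fits', 'd_c_drk.fits']

for expstart in (58499.0, 58510.0, 58515.5, 58600.0):
    print(expstart, repr(match_superdark(expstart, useafter_mjds, superdarks)))
```

An exposure that starts exactly at a USEAFTER date is matched to the previous superdark, which mirrors the `expstart>t` comparison in the cell above.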

**16) Move science data to processing directory**

The processing directory is where all the science data and the relevant superdarks will be moved to, to run calwf3. This process also involves renaming files and updating header information so that calwf3 can run normally on them without factoring in the previous reduction stages that have been changed from the standard procedure.

The code below copies the `*rac.fits` files from the user-defined CTE correction directory into the user-defined processing directory, renaming them to `*raw.fits`, if the renamed file doesn't already exist there. These directories are both within the same proposal ID directory and are made in Step 13 by `download_data.py`.

In [None]:
# Copy the science data *rac.fits files into the processing directory, renaming them to *raw.fits so that calwf3 can run on them
import os
import glob
import shutil

# ----------------------------------
# INPUT
CTECORR_DIR = '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/ctecorr_sci'
PROC_DIR = '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/calwf3_sci'
# ----------------------------------

# Check for the processing directory and make it if it doesn't exist
if os.path.exists(PROC_DIR):
    print('Directory to process data in exists:', PROC_DIR)
else:
    os.makedirs(PROC_DIR, 0o774)
    print('Created directory to process data in:', PROC_DIR)

# Move to the CTE correction directory
os.chdir(CTECORR_DIR)

# Copy science data *rac.fits files out of the CTE correction directory to the processing directory if they don't exist there
n=0
for rac in glob.glob('*rac.fits'):
    # Define source and destination paths for each rac file
    src = os.path.join(CTECORR_DIR, rac)   #Source filepath of rac file to be copied
    dst = os.path.join(PROC_DIR, rac.replace('rac.fits','raw.fits'))   #Destination filepath of rac to be copied and renamed
    
    # Check if the renamed file is already in the processing directory, if not it is copied
    if not os.path.exists(dst):
        shutil.copy(src, dst)
        print('Copied {} to {}'.format(src, dst))
        n+=1
    else: 
        print('File {} exists, not copying file from {}'.format(dst, CTECORR_DIR))
    
print('Copied {} rac.fits files to the directory {} to be processed by calwf3'.format(n, PROC_DIR))
    

**17) Move superdarks to processing directory**

Again, using information from the `file_summary.txt` file produced in Step 15, copy the superdarks necessary to process the science data into the processing directory if they don't already exist there.

In [None]:
import os
import glob
import shutil
import pandas as pd

# ----------------------------------
# INPUT
CTECORR_DIR = '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/ctecorr_sci'
PROC_DIR = '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/calwf3_sci'
RED_DIR = '/user/lprichard/hst_wfc3_uvis_reduction/darks_data/red_darks'
SDARK_DIR = os.path.join(RED_DIR, 'superdarks')
# ----------------------------------

# Read in the file made previously for the data in the processing directory
df = pd.read_csv(os.path.join(CTECORR_DIR,'file_summary.txt'))

n=0
# For each superdark required for the data, copy it to the processing directory if it doesn't exist
for sd in df['superdark'].unique():
    
    # Define source and destination paths for each superdark
    src = os.path.join(SDARK_DIR, sd)   #Source filepath of the superdark to be copied
    dst = os.path.join(PROC_DIR, sd)   #Destination filepath of the copied superdark
    
    # Check if the superdark is already in the processing directory, if not it is copied
    if not os.path.exists(dst):
        shutil.copy(src, dst)
        print('Copied {} to {}'.format(src, dst))
        n+=1
    else: 
        print('File exists {}, not copying superdark from {}'.format(dst, SDARK_DIR))
    
print('Copied {} superdarks to the directory {} to be processed by calwf3'.format(n, PROC_DIR))

**18) Update headers of darks and science data to point to current best standard reference files**

Use `crds.bestrefs` (https://hst-crds.stsci.edu/static/users_guide/command_line_tools.html) to automatically update the reference files in the headers to the current best files. Do this **before** setting any custom header keywords below.

From `prep_header_keys.py` in the darks pipeline where `crds.bestrefs` is used:
    "In order for the ``crds.bestrefs.BestrefsScript`` to run, the
        following environments must be configured:

            setenv CRDS_PATH /grp/crds/cache
            setenv CRDS_SERVER_URL https://hst-crds.stsci.edu
 ...The ``FLSHFILE`` must also be set for postflash
        correction (if necessary).  However, as to avoid hardcoding the file in
        the script, the ``crds.bestrefs`` routine is used to determine the best
        postflash reference file to use."
        
Set the iref environment variable if not already set:
    
    e.g., export iref="/grp/hst/cdbs/iref/"
    
If you don't have access to this iref/ directory, get the iref/ files needed from STScI and point to that local directory: http://www.stsci.edu/itt/review/2008_HST_Docs/WFC3_DHB/wfc3_Ch53.html#63144 (use Safari or other browser if not working on Chrome).
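
Before running the cells below, it can help to confirm that the environment variables mentioned above are visible to the notebook. The following is a minimal sketch of such a check. The helper `missing_env_vars` is hypothetical, and the list of names is simply those quoted in this step.

```python
import os

# Environment variables named in this step
REQUIRED_VARS = ('CRDS_PATH', 'CRDS_SERVER_URL', 'iref')

def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given environment."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]

missing = missing_env_vars()
if missing:
    print('Not set:', ', '.join(missing))
else:
    print('All required environment variables are set')
```

If a variable is missing, it can be set for the current notebook session with, e.g., `os.environ['iref'] = '/grp/hst/cdbs/iref/'`.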

In [None]:
def set_bestrefs(image_list, key='', update=True):
    """From prep_header_keys.py written by Matt Bourque adapted by Laura Prichard.
    Leave `key` unset to update all reference files, or set `key` to the header 
    keyword of a specific reference file to update only that one: 
    e.g. ``FLSHFILE`` for the best postflash reference file, ``BIASFILE`` for the best bias.
    Set update=True to update the headers, or update=False to print the reference files 
    that will be updated.
    

    Parameters
    ----------
    image_list : list
        The list of absolute paths to the images to update.
    key : str
        Header key word to update. If not set, all reference files will be updated.
    update : bool
        Set to True to update the reference files in the header, 
        False to print out the current and replacement reference files
    """
    # Check if the header keyword has been set
    if key=='':
        # Find and update the headers with the best reference file
        images = ' '.join(image_list)
        
        # Updating or just printing those that will be updated
        if update==True: 
            print('Updating all reference files with bestrefs')
            bestrefs_arg = "crds.bestrefs  --update-bestrefs --files {} --verbosity 0".format(images)
        else: 
            print('Will update the following reference files if update=True')
            bestrefs_arg = "crds.bestrefs --print-new-references --files {}".format(images)
        
        script = BestrefsScript(argv=bestrefs_arg)
        script.run()
    else:
        # Find and update the headers with the best reference file
        images = ' '.join(image_list)
        
        if update==True:
            print('Updating {} reference file'.format(key))
            bestrefs_arg = "crds.bestrefs  --update-bestrefs --files {} --types {} --verbosity 0".format(images, key)
        else: 
            print('Will update the following {} reference file if update=True'.format(key))
            bestrefs_arg = "crds.bestrefs --print-new-references --files {} --types {}".format(images, key)
        script = BestrefsScript(argv=bestrefs_arg)
        script.run()

def print_hdr_keys(image_list, key=''):
    """Prints reference files in a *raw.fits/*rac.fits or other HST file if `key` is not set,
    or any specified header key word (`key` str), for a list of images.
    
        
    Parameters
    ----------
    image_list : list
        The list of absolute paths to the images to update.
    key : str
        Header key word to print. If not set, all reference files will be printed."""
    
    for image in image_list:
        print('File: {}'.format(image))

        # Open image and get the header
        hdul = fits.open(image)
        hdr = hdul[0].header

        if key=='':
            print('---------------------------------------')
            print('Reference files for {}:'.format(os.path.basename(image)))
            keys = ['ATODTAB', 'BIACFILE', 'BIASFILE', 'BPIXTAB', 'CCDTAB', 'CRREJTAB', 'D2IMFILE', 'DARKFILE', 'DRKCFILE', 'FLSHFILE', \
            'IDCTAB', 'IMPHTTAB', 'MDRIZTAB', 'NLINFILE', 'NPOLFILE', 'OSCNTAB', 'PCTETAB', 'PFLTFILE', 'SNKCFILE']

            for k in keys:
                print("{} is {}".format(k, hdr[k]))
            print('---------------------------------------')
        else:
            # Print the requested header key word
            print("{} is {}".format(key, hdr[key]))
        
        # Close the file (nothing is modified)
        hdul.close()

In [None]:
# Updating the header keys with the best reference files
from crds.bestrefs import BestrefsScript
import os
import glob
import astropy.io.fits as fits

# ------------------------------
# INPUTS
PROC_DIR = '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/calwf3_sci'
# ------------------------------

# For each file in there, check the headers and update:
image_list = glob.glob(os.path.join(PROC_DIR,'*.fits'))

# Print existing reference files (or a key word) from the file headers
print_hdr_keys(image_list)

# To print reference files and any updates, set update=False, to update headers update=True
set_bestrefs(image_list, update=False)

# Print updated reference files (or a key word) from the file headers
# print_hdr_keys(image_list)

**19) Update the header of the science data with custom calibration files**

For each CTE corrected science file (that has been renamed from `rac` to `raw`), set the dark file in the header to the relevant new superdark, based on the `file_summary.txt` file produced in Step 15. Also set the CTE correction flag to `OMIT`, as this correction has already been done separately. 

In [None]:
# Written by Ben Sunnquist, adapted by Laura Prichard
# Update the headers of the science data to point to the right superdarks and run with calwf3

import astropy.io.fits as fits
import numpy as np
import os
import glob
import pandas as pd

# ----------------------------------
# INPUT
CTECORR_DIR = '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/ctecorr_sci'
PROC_DIR = '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/calwf3_sci'
# ----------------------------------

# Update DARKFILE headers in the renamed raw files to the new superdarks and turn off CTE correction
# (THESE ARE RENAMED RACS, I.E. ALREADY CTE CORRECTED)
df = pd.read_csv(os.path.join(CTECORR_DIR, 'file_summary.txt'))

files = glob.glob(os.path.join(PROC_DIR, '*raw.fits'))
n=0
for f in files:
    # Files in PROC_DIR were renamed to *raw.fits, so match on the 'file_raw' column
    sd = df['superdark'][df['file_raw']==os.path.basename(f)].values[0]
    # Open in update mode so the header changes are written back to the file on close
    with fits.open(f, mode='update') as h:
        h[0].header['DARKFILE'] = sd
        h[0].header['PCTECORR'] = 'OMIT'
    n+=1
    print('Updated header for {}'.format(os.path.basename(f)))
    
print('Updated headers for {} files ready for processing by calwf3 in {}'.format(n, PROC_DIR))
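
Since the files in the processing directory carry the `*raw.fits` names, the superdark for each one is found through the `file_raw` column of `file_summary.txt`. The following is a minimal sketch of building that lookup once with the standard-library `csv` module. The column names follow the data frame saved in Step 15, while the function name `superdark_lookup` and the sample rows are hypothetical.

```python
import csv
import io

def superdark_lookup(summary_file):
    """Map each renamed raw file name to its superdark from an open file_summary.txt."""
    reader = csv.DictReader(summary_file)
    return {row['file_raw']: row['superdark'] for row in reader}

# Hypothetical contents in the same comma-separated layout as file_summary.txt
sample = io.StringIO(
    'file,file_raw,date,expstart,superdark\n'
    'idxx01aaq_rac.fits,idxx01aaq_raw.fits,2019-03-21 05:07:16,58563.2134,d_a_drk.fits\n'
    'idxx02bbq_rac.fits,idxx02bbq_raw.fits,2019-04-02 11:30:00,58575.4792,d_b_drk.fits\n'
)

lookup = superdark_lookup(sample)
print(lookup['idxx01aaq_raw.fits'])
```

A missing file name raises a `KeyError`, which flags science files that were not matched in Step 15.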


**20) Update header info of superdarks so they can run with calwf3**

For the copied superdarks in the processing directory, change the `FILETYPE` key word to `DARK` so that calwf3 can run on them.

In [None]:
import os
import glob
import astropy.io.fits as fits

# ----------------------------------
# INPUT
PROC_DIR = '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/calwf3_sci'
# ----------------------------------

# change superdark filetype to dark (so calwf3 will run)
n=0
for f in glob.glob(os.path.join(PROC_DIR,'d*.fits')):
    # Open in update mode so the header change is written back to the file on close
    with fits.open(f, mode='update') as h:
        h[0].header['FILETYPE'] = 'DARK'
    n+=1
    print('Updated header for superdark {}'.format(f))
print('Updated headers for {} superdarks ready for processing by calwf3 in {}'.format(n, PROC_DIR))

**21) Run calwf3 on data**

Within the processing directory with the copied updated files, run calwf3 from the terminal using the following commands:

In [None]:
# From the terminal using the following example commands

# Move into the processing directory (PROC_DIR) with the science data and renamed darks with updated headers
cd /user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/calwf3_sci

# Set the iref environment variable if not already set
export iref="/grp/hst/cdbs/iref/"

# Run calwf3 on each raw science file
ls *raw.fits | awk '{print "calwf3.e",$1}' | csh
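
The shell one-liner above runs `calwf3.e` once per raw file. The following is a minimal sketch of the same loop in Python with `subprocess`, which stops at the first failure instead of continuing silently. It assumes `calwf3.e` is on the `PATH` and that `iref` is already set; the function names are hypothetical.

```python
import glob
import subprocess

def build_calwf3_commands(raw_files, exe='calwf3.e'):
    """Return one [exe, file] command per raw file, in sorted order."""
    return [[exe, f] for f in sorted(raw_files)]

def run_calwf3(raw_files, exe='calwf3.e'):
    """Run calwf3 on each raw file in turn, raising an error if a run fails."""
    for cmd in build_calwf3_commands(raw_files, exe):
        print('Running:', ' '.join(cmd))
        subprocess.run(cmd, check=True)

# List the commands that would be run from the processing directory
for cmd in build_calwf3_commands(glob.glob('*raw.fits')):
    print(' '.join(cmd))

# Uncomment to run calwf3 on all raw files in the current directory
# run_calwf3(glob.glob('*raw.fits'))
```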

**22) Correct for amp offsets (produces smoother images)**

This step subtracts the median background from each of the amp quadrants in the `*flt.fits` files produced by `calwf3`, resulting in smoother final and drizzled images. The median-subtracted files produced by this step are named `*flt_medsub.fits`. Go to Ben Sunnquist's `make_uvis_skydark.py` GitHub page and download the latest version to the processing directory (PROC_DIR), e.g.:

    cd /user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/calwf3_sci
    wget https://github.com/bsunnquist/uvis-skydarks/raw/master/make_uvis_skydark.py
    
Run the code from the terminal using the example below. Use the `--no_multiply_flat` flag for a simple median subtraction, or remove it to subtract a median amp value multiplied by the flat (which is preferable depends on the data type). 

In [None]:
# Move into PROC_DIR
cd /user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/calwf3_sci

# Set the iref environment variable if not already set
export iref="/grp/hst/cdbs/iref/"

# To perform median amp subtraction 
ipython --pdb -c "%run make_uvis_skydark.py --no_multiply_flat"

**23) Rename and move files**

The files produced by `calwf3` will be labeled `*flt.fits` because it doesn't know that the files have already been CTE corrected. The files that have had the median amp background subtracted (medsub) in Step 22 with `make_uvis_skydark.py` will be labeled `*flt_medsub.fits`. Copy these files to a `final/` directory and rename them to `*flc.fits` to reflect the CTE correction. These science files are then ready to be drizzled together with AstroDrizzle.

In [None]:
# ONLY for the new CTE corrected code, make a final/ directory, copy the flts to there and rename flcs
import os
import shutil
import glob

# ---------------------------
# INPUTS
PROC_DIR = '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/calwf3_sci'
FINAL_DIR = '/user/lprichard/project/sci_data/raw_sci/PID12345_rawdata_aquery2019Sep13/calwf3_sci/final'
# ---------------------------

# Check for the final directory and make it if it doesn't exist
if os.path.exists(FINAL_DIR):
    print('FINAL_DIR directory exists:', FINAL_DIR)
else:
    os.makedirs(FINAL_DIR, 0o774)
    print('Created FINAL_DIR directory:', FINAL_DIR)

# Move to the processing directory
os.chdir(PROC_DIR)

# Copies science data *flt_medsub.fits files to the final directory if they don't exist and renames them to *flc.fits
n=0
for flt in glob.glob('*flt_medsub.fits'):
    # Define source and destination paths for each flt file
    src = os.path.join(PROC_DIR, flt) 
    dst = os.path.join(FINAL_DIR, flt.replace('flt_medsub','flc'))
    
    # Check if the flc file is already in final directory, if so it is not copied
    if not os.path.exists(dst):
        shutil.copy(src, dst)
        print('Copied {} to {}'.format(src, dst))
        n+=1
    else: 
        print('{} exists, not copying file from {}'.format(dst, PROC_DIR))
    
print('Copied {} flt_medsubs and renamed to flcs to {}'.format(n, FINAL_DIR))
    