# Additional Corrections to HST FLCs

_________________

Written by Laura Prichard, May 2021 (on GitHub [here](https://github.com/lprichard/HST_FLC_corrections)). Includes codes developed by Ben Sunnquist (on [GitHub](https://github.com/bsunnquist/uvis-skydarks)) & Marc Rafelski. 

Please reference Prichard et al. 2021, *in prep.* (check back for updated reference) and codes by Ben Sunnquist if you use any of the corrections outlined here.

Notebook tested with: `Python v3.7`, `astropy v4.0`, `astroquery v0.4`, `drizzlepac v3.1.8`, `photutils v1.0.2`, and `stwcs v1.6.1`. These versions and higher are recommended.
_________________

This notebook describes the step-by-step procedure to download data from the Mikulski Archive for Space Telescopes ([MAST](https://mast.stsci.edu/portal/Mashup/Clients/Mast/Portal.html)) with [`astroquery`](https://astroquery.readthedocs.io/en/latest/mast/mast.html) and apply additional calibrations to reduced Hubble Space Telescope (HST) single-visit images (called FLCs). Different corrections are needed for the Wide Field Camera 3 (WFC3)/UV-Visible (UVIS) and Advanced Camera for Surveys (ACS) FLCs that are listed below.  
<br /> 

**WFC3/UVIS corrections**

- As of May 2021, many improvements to the WFC3/UVIS darks pipeline (outlined on Laura Prichard's GitHub [here](https://github.com/lprichard/hst_wfc3_uvis_reduction)) have been adopted as standard by the WFC3 team. (These are detailed in Prichard et al. 2021.)
- Therefore, much improved cleaned images are available on MAST that also include the new [time-dependent photometry](https://www.stsci.edu/hst/instrumentation/wfc3/data-analysis/photometric-calibration/uvis-photometric-calibration) information for WFC3/UVIS.
- The additional corrections outlined here are to equalize the four amps of the WFC3/UVIS FLCs to reduce offsets (referenced as `medsub` in the image below). Correction code [`make_uvis_skydark.py`](https://github.com/bsunnquist/uvis-skydarks/blob/master/make_uvis_skydark.py) by Ben Sunnquist.
- The other correction is to flag read out cosmic rays (ROCRs). ROCRs are a residual effect of the new and improved charge transfer efficiency (CTE) correction code (`calwf3 v3.6.0`; [Anderson 2020](https://ui.adsabs.harvard.edu/abs/2020wfc..rept....8A/abstract), [Anderson et al. 2021](https://www.stsci.edu/contents/news/wfc3-stans/wfc3-stan-issue-35-April-1.html#1%20-%20New%20CTE%20correction)). The new CTE code reduces gradients and noise in the images. However, CRs that fall on the array while it is being read out are overcorrected by the new CTE code because it does not know their real location. This results in divots in the combined images if these pixels are not flagged. Correction code [`flag_rocrs.ipynb`](https://github.com/bsunnquist/uvis-skydarks/blob/master/flag_rocrs.ipynb) by Ben Sunnquist. An example of a ROCR (dark tail of cosmic ray) is shown below.
<img src="./images/ROCR.png" alt="FLC" width="200"/>

An example of the new corrections on the WFC3/UVIS FLCs (right panel) compared with the FLCs previously available on MAST (left) is shown below. The corrections shown in the image are the new CTE code, the improved darks (both now included as standard for WFC3/UVIS MAST FLCs), and the equalizing amps `make_uvis_skydark.py` (or `medsub`) routine.
<img src="./images/FLC_comp.png" alt="FLC" width="400"/>  
<br />

**ACS corrections**

The corrections for ACS data are typically only required for a few exposures, and may affect those observed in Continuous Viewing Zones (CVZs) more.

- Removing gradients caused by the reflection of the Sun on the Earth’s atmosphere into the telescope at low limb angles (see [Biretta et al. 2003](https://ui.adsabs.harvard.edu/abs/2003acs..rept....5B/abstract) for more information). This is a stronger effect at the redder wavelengths covered by ACS. The image shows an example FLC with a strong gradient (left) and with the gradient removed (right). Correction code [`remove_gradients.ipynb`](https://github.com/bsunnquist/uvis-skydarks/blob/master/remove_gradients.ipynb) by Ben Sunnquist.
<img src="./images/ACS_gradients.png" alt="FLC" width="400"/>
- NOTE: in future, the ACS FLCs may also need the ROCR correction as the CTE code for ACS will also be updated with similar improvements to the WFC3/UVIS CTE corrections. Check for any ACS CTE code updates [here](https://www.stsci.edu/hst/instrumentation/acs/performance/cte-information). The ROCR correction code is already written to handle both WFC3 and ACS FLCs.  
<br />
<br />

**Extra FLC corrections if required**

Ben Sunnquist also developed a routine to flag satellite trails or other anomalies in FLCs using DS9 region files. The code is not included below but the link to it with more details is here:
[flag_regions.ipynb](https://github.com/bsunnquist/uvis-skydarks/blob/master/flag_regions.ipynb)

The regions are flagged in the data quality (DQ) arrays and are then not included in the final drizzles. The image shows an example of the SCI extension (left) with DS9 region (green) and flagged DQ extension (right) of an FLC.
<img src="./images/bsunnquist_region_flag.png" alt="FLC" width="400"/>  
<br />
_________________
_________________

**Load packages and adjust display settings**

In [None]:
# Load packages
import os
import glob
import shutil
import datetime
import numpy as np
import pandas as pd
from astropy.io import fits
from astropy.time import Time
from astropy.modeling import models, fitting
from astropy.convolution import Gaussian2DKernel
from astropy.stats import sigma_clip, gaussian_fwhm_to_sigma, SigmaClip
from astroquery.mast import Observations
from stwcs import updatewcs
from stsci.tools import teal
from platform import python_version
from drizzlepac import astrodrizzle as ad
from photutils import Background2D, detect_sources, detect_threshold, MedianBackground

# Load LP's codes in this directory
import copy_files as cf

# Setting pandas display options
pd.set_option('display.expand_frame_repr', False)
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

# 1) Download & Organize Data

**a) <span style="color:red">REQUIRED USER INPUTS:</span> Set your proposal ID, directories and options**

In [None]:
# -----------------------------------------------------
# INPUTS

# REQUIRED: Proposal ID
PID = '12345'

# REQUIRED: Root directory for FLCs, will be created if it doesn't exist
FLC_DIR = 'your_root_data_directory/flcs_PID{}'.format(PID)   # Set proposal ID above

# REQUIRED: Root path to the directory containing the downloaded codes
CODE_DIR = 'your_code_directory'

# REQUIRED: Set instrument for FLCs to download (either WFC3 or ACS). If running corrections for both ACS & WFC3 FLCs,
# run this whole section (1) for each instrument to create separate directories
inst = 'WFC3' #'ACS' 

# Set to False to see the files to be downloaded, True to download the files
download=True

# If the files are downloaded (previously or with download=True), set copy=True (recommended) to copy each FLC
# out of its own subdirectory in the download directory (DLD_DIR set below) to a combined directory ALL_DIR
copy=True

# NOTE: if copy=True and download=False, manually set DLD_DIR to that of the astroquery download directory 
# (includes PID and download date) 
# DLD_DIR = ''    # EXAMPLE: os.path.join(FLC_DIR, 'mastDownload_PID12345_ACS_2021May18', 'HST') 
# -----------------------------------------------------

In [None]:
# Set sub-directories to be made
ALL_DIR = os.path.join(FLC_DIR, 'all_flcs_{}'.format(inst.lower()))  # Directory for all FLCs combined
COR_DIR = os.path.join(FLC_DIR, 'cor_flcs_{}'.format(inst.lower()))  # Directory for correcting FLCs
    
# Make the download data directory if it doesn't exist
if os.path.exists(ALL_DIR):
    print('ALL_DIR directory exists: {}'.format(ALL_DIR))
else:
    os.makedirs(ALL_DIR, 0o774)
    print('Created ALL_DIR directory: {}'.format(ALL_DIR))
    
# Make the corrected data directory if it doesn't exist
if os.path.exists(COR_DIR):
    print('COR_DIR directory exists: {}'.format(COR_DIR))
else:
    os.makedirs(COR_DIR, 0o774)
    print('Created COR_DIR directory: {}'.format(COR_DIR))
    

**b) Check available data products on MAST**

In [None]:
# Search for science observations by proposal ID

# Convert MJD to a readable date string
def mjd_to_str(t):
    """Converts Modified Julian Date (MJD) input (t) 
    to a readable date string (t_str)."""
    t_mjd = Time(t, format='mjd')
    t_str = t_mjd.strftime('%Y-%m-%d %H:%M:%S')
    return t_str

# Select all science observations for the proposal ID 
sciobs = Observations.query_criteria(intentType='science', proposal_id='{}'.format(PID))
# See available columns in the result
print(sciobs.columns)

# Convert astropy table to a dataframe for manipulation
so_df = pd.DataFrame(np.array(sciobs))

# Get the easily readable date string from the MJD dates
so_df['t_min_str'] = so_df['t_min'].apply(mjd_to_str)
so_df['t_max_str'] = so_df['t_max'].apply(mjd_to_str)

# Print just the ASN numbers (obs_id), start and end times (t_min, t_max) in MJD and string form in descending date order
so_df[['obs_id', 'target_name', 'instrument_name', 't_min','t_min_str','t_max','t_max_str']].sort_values('t_min', ascending=False).reset_index(drop=True)

**c) Download the data and organize**

If duplicate FLCs appear in the `mastflcs` list, the `astroquery` download is cached, so it will not download more than one FLC with the same name. FLCs can be listed under different observation IDs (`parent_obsid`), so they may be listed more than once by `astroquery` but not on MAST, which does not filter by that information.

Check the printed product list to see if the WFC3/UVIS data is processed with the new calibration codes (`calwf3 v3.6.0(Dec-31-2020)` and above) prior to downloading (set `download=False` above).
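
The duplicate listings described above can be collapsed by filename before inspecting the list. The following is a minimal sketch in plain Python, assuming the product list has been reduced to simple dictionaries; the filenames, observation IDs, and the helper `unique_by_filename` are hypothetical and are not part of `astroquery`.

```python
# Hypothetical product rows standing in for an astroquery product table;
# only the two fields used here are included.
products = [
    {"productFilename": "abcd01xyq_flc.fits", "parent_obsid": "1001"},
    {"productFilename": "abcd01xyq_flc.fits", "parent_obsid": "1002"},
    {"productFilename": "abcd01zzq_flc.fits", "parent_obsid": "1001"},
]


def unique_by_filename(rows):
    """Keep the first row seen for each productFilename, preserving order."""
    seen = set()
    kept = []
    for row in rows:
        name = row["productFilename"]
        if name not in seen:
            seen.add(name)
            kept.append(row)
    return kept


unique_products = unique_by_filename(products)
print("{} listed, {} unique".format(len(products), len(unique_products)))
```

This only changes what is displayed; the download itself already skips repeated filenames through caching, as noted above.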

In [None]:
def copy_dload_data(DLD_DIR, ALL_DIR, ext='_flc.fits'):
    """Copies the downloaded files out of the nested astroquery directories 
    (DLD_DIR) and into one combined directory (ALL_DIR). Files that already 
    exist in ALL_DIR are not copied again. The file type to copy is set by 
    the filename extension (ext).
    """
    # Move to the download directory
    os.chdir(DLD_DIR)

    # Copies data out of the astroquery sub-directories and into the ALL_DIR directory if not already there
    print('Copying downloaded files from astroquery subdirectories {}'.format(DLD_DIR))
    n=0
    
    for i, idob in enumerate(glob.glob('*')):
        # Check if the file is already in the directory, if not it is copied
        src = os.path.join(DLD_DIR, idob, idob +ext)
        dst = os.path.join(ALL_DIR, idob +ext)
        if not os.path.exists(dst):
            print('Copying {} to {}'.format(src, dst))
            shutil.copy(src, dst)
            n+=1
        else: 
            print('File exists {}, not copying file from {}'.format(dst, DLD_DIR))

    print('Copied {} files to combined raw data directory {}'.format(n, ALL_DIR))

In [None]:
# Search for files with astroquery by instrument and PID (specified in section 1a) 
if inst=='WFC3': sciobs = Observations.query_criteria(intentType='science', instrument_name="WFC3/UVIS", proposal_id='{}'.format(PID)) 
elif inst=='ACS': sciobs = Observations.query_criteria(intentType='science', instrument_name="ACS/WFC", proposal_id='{}'.format(PID))

# Get the FLCs for each instrument
sciprod = Observations.get_product_list(sciobs)
mastflcs = Observations.filter_products(sciprod, productSubGroupDescription="FLC", type='S')
print('{} FLCs found:'.format(len(mastflcs)))
mastflcs['productFilename', 'project', 'prvversion'].pprint(max_lines=100)

# If download option is set, check if the download directory exists, if not then download
MDLD_DIR = os.path.join(FLC_DIR, 'mastDownload', 'HST')
if download==True:
    if os.path.exists(MDLD_DIR):
        print('Download directory exists! Not downloading files')
    else: 
        # Move to FLC directory and download FLCs with astroquery
        os.chdir(FLC_DIR)
        Observations.download_products(mastflcs, mrp_only=False) 
        
        # Rename the download directory to add the PID and date
        now = Time.now()
        pnow = now.strftime('%Y%b%d')
        DLD_DIR = MDLD_DIR.replace('mastDownload', 'mastDownload_PID{}_{}_{}'.format(PID, inst, pnow))
        os.rename(os.path.dirname(MDLD_DIR), os.path.dirname(DLD_DIR))
        print('Download to {} complete'.format(DLD_DIR))
        
# Copy files out of the sub-directories into a combined directory
if copy==True:
    if "DLD_DIR" in locals(): copy_dload_data(DLD_DIR, ALL_DIR, ext='_flc.fits')
    else: print('WARNING: Download directory (DLD_DIR) not set, set download=True or manually set in Step 1a')

**d) Optional data check** 

Check whether the WFC3/UVIS data were processed with the new calibration codes (`calwf3 v3.6.0(Dec-31-2020)` and above) and have time-dependent photometry information in their headers.
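
The version check can also be done programmatically rather than by eye. The sketch below assumes the version string has the form quoted above (dotted numbers followed by a date in parentheses); the helper `calwf3_at_least` is hypothetical and compares only the leading numbers.

```python
import re


def calwf3_at_least(cal_ver, minimum=(3, 6, 0)):
    """Return True if a CAL_VER-style string is at or above `minimum`.

    Only the leading dotted numbers are compared; a trailing
    date such as '(Dec-31-2020)' is ignored.
    """
    match = re.match(r"\s*(\d+(?:\.\d+)*)", cal_ver)
    if match is None:
        raise ValueError("No version number found in {!r}".format(cal_ver))
    parts = tuple(int(p) for p in match.group(1).split("."))
    # Pad short versions such as '3.6' so tuples compare element by element
    parts = parts + (0,) * (len(minimum) - len(parts))
    return parts >= minimum


print(calwf3_at_least("3.6.0(Dec-31-2020)"))
print(calwf3_at_least("3.5.2(Apr-14-2020)"))
```

In the cell below, the string passed in would be `hdr['CAL_VER']`.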

In [None]:
# Move into the combined FLC directory
os.chdir(ALL_DIR)

# List flcs
files = sorted(glob.glob('*_flc.fits'))

# Get FLC and processing information
for f in files:
    print('--------------------------------------------------------------')
    # Open file header
    hdr = fits.getheader(f, 0)
    
    # Get FLC info
    print('FILE: {}, INST: {}, DATE OBS:{}'.format(f, hdr['INSTRUME'], hdr['DATE-OBS']))

    # Get processing info
    if 'WFC3' in hdr['INSTRUME']: caltype='CALWF3'
    elif 'ACS' in hdr['INSTRUME']: caltype='CALACS'
    print('DATE PROCESSED: {}, {} VERSION: {}'.format(hdr['DATE'], caltype, hdr['CAL_VER']))
    
    # Check for time-dependent photometry header updates in WFC3/UVIS files only
    if '2020 Time-dependent Inverse Sensitivity' in str(hdr): timedep='True'
    else: timedep='False'
    if 'WFC3' in hdr['INSTRUME']: print('Includes new WFC3/UVIS time-dependent photometry: {}'.format(timedep))
    print('--------------------------------------------------------------')


# 2) Apply WFC3/UVIS FLC Corrections

**a) Equalize amp offsets**

This step equalizes the amplifiers (four quadrants) on WFC3/UVIS FLCs. The default behavior of this code measures the median of each amp, multiplies it by the flat, subtracts that from each amp, and equalizes the amps to the average amp value. This removes bias offsets between the quadrants to produce smoother images. The corrected output FLCs are the `*_flc_eq.fits` files.

The code [`make_uvis_skydark.py`](https://github.com/bsunnquist/uvis-skydarks/blob/master/make_uvis_skydark.py) used for this step was developed by Ben Sunnquist. A copy of the code, downloaded from GitHub 17 May 2021 (last updated  Apr 19, 2021), is included in this directory but check the link to Ben's GitHub code for the latest version.
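
To illustrate the idea of equalizing amps, here is a simplified sketch using `numpy` on a synthetic chip. It is not the `make_uvis_skydark.py` implementation: it assumes the two amps of a chip are its left and right halves, and it omits the flat-field multiplication and any source masking. The function `equalize_amps` and the test data are hypothetical.

```python
import numpy as np


def equalize_amps(chip):
    """Shift each half of `chip` so both share the average of their medians."""
    left, right = np.hsplit(chip.astype(float), 2)
    medians = [np.median(left), np.median(right)]
    target = np.mean(medians)
    # Subtract each amp's own median, then add back the common level
    return np.hstack([left - medians[0] + target, right - medians[1] + target])


rng = np.random.default_rng(0)
chip = rng.normal(loc=10.0, scale=1.0, size=(8, 16))
chip[:, 8:] += 3.0  # artificial offset in the right-hand amp

balanced = equalize_amps(chip)
left_med, right_med = (np.median(half) for half in np.hsplit(balanced, 2))
print("Residual amp offset: {:.3f}".format(abs(left_med - right_med)))
```

Because only a constant is added to each half, the pixel-to-pixel structure within each amp is unchanged.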



In [None]:
if inst=='WFC3':
    # Set correction subdirectory 
    EQ_DIR = os.path.join(COR_DIR, 'eq_flcs')

    # Directory is made and files copied over to be corrected
    cf.copy_files_check(ALL_DIR, EQ_DIR, files='*_flc.fits')

    # Copy over the make_uvis_skydark.py code from the code directory to the FLC correction directory
    cf.copy_files_check(CODE_DIR, EQ_DIR, files='make_uvis_skydark.py')

    # Move into the correction directory
    os.chdir(EQ_DIR)

    # Run the amp offset code
    %run make_uvis_skydark.py
else: print('Set inst=WFC3 and ensure you have the WFC3 FLCs to apply this correction')

**b) Correct for read out cosmic rays (ROCRs)**

Applying the ROCR correction to FLCs is a multi-stage process that changes more information than necessary on the FLCs. Therefore, multiple copies of the FLCs are made to ensure that the only thing changed on the final clean output FLCs is an updated data quality (DQ) array. 

The processed FLCs are run through a WCS update (with `updatewcs`) of the header so that they can be run through `astrodrizzle` grouped by association number. This is done to create cosmic ray maps/flags used in the ROCR correction. The flagging of ROCR pixels is then performed on the DQ arrays of those FLCs processed with `astrodrizzle`. These DQ arrays are then copied back into the untouched clean copy of the FLCs.  
<br />
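
The DQ arrays store flags as bits, so a flag can be tested or added without disturbing the others. The sketch below uses the two values that appear in the ROCR code in this notebook (4096 for cosmic ray hits, 4 for the flag that is added); the small arrays and the ROCR mask are hypothetical.

```python
import numpy as np

CR_FLAG = 4096    # cosmic-ray flag tested by the ROCR code in this notebook
ROCR_FLAG = 4     # "bad detector pixel" flag the ROCR code adds

# Hypothetical DQ arrays: one edited by astrodrizzle, one clean copy
dq_drizzled = np.array([[0, 4096, 4], [4100, 0, 16]], dtype=np.int16)
dq_clean = np.array([[0, 0, 4], [4, 0, 16]], dtype=np.int16)

# Pixels carrying the CR bit, regardless of any other bits set
is_cr = (dq_drizzled & CR_FLAG) != 0

# Hypothetical ROCR mask (in the notebook this comes from the tail search)
rocr_mask = np.array([[True, False, True], [False, True, False]])

# Set the ROCR bit with a bitwise OR so already-flagged pixels are unchanged
dq_clean[rocr_mask] |= ROCR_FLAG

print(is_cr)
print(dq_clean)
```

The bitwise OR is one way to avoid double counting a flag; the notebook's own code reaches the same result by adding 4 only where that bit is not yet set.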

i) Copy and rename files

In [None]:
# Set two new directories for updating WCS and for the final clean ROCR corrected FLCs
WCS_DIR = os.path.join(COR_DIR, 'wcs_updt')      # For the FLCs to be processed
ROCR_DIR = os.path.join(COR_DIR, 'rocr_clean')   # For the final clean FLCs that will have updated DQ arrays

# Creates directories, copies, and renames files if they don't exist
cf.copy_files_check(EQ_DIR, WCS_DIR, files='*flc_eq.fits', rename=True, src_str='flc_eq', dst_str='flc')
cf.copy_files_check(EQ_DIR, ROCR_DIR, files='*flc_eq.fits', rename=True, src_str='flc_eq', dst_str='flc')

ii) Update WCS (to avoid `astrodrizzle` errors) and move the updated files to a drizzle directory

In [None]:
# Move into the update WCS directory
os.chdir(WCS_DIR)

# List flcs
files = sorted(glob.glob('*_flc.fits'))

# Update WCS in header to avoid errors with astrodrizzle
updatewcs.updatewcs(files, use_db=False)    #use_db=False for use w drizzlepac 3.1.6 and above

# Set the drizzle directory
DRIZ_DIR = os.path.join(COR_DIR, 'rocr_driz')

# Makes destination directory if it doesn't exist, checks if files exist, copies them if not
cf.copy_files_check(WCS_DIR, DRIZ_DIR, files='*flc.fits')

iii) Get batches of FLC based on association (ASN i.e. observing group) number and run them through `astrodrizzle`

`astrodrizzle` creates cosmic ray maps in the DQ arrays of each FLC. The drizzles themselves are not used but the FLCs that are edited by `astrodrizzle` are. Only the basic parameters with CR flagging are used for the drizzle.
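
Batching by association can be sketched with a dictionary keyed on a filename prefix. This is a simplified illustration with hypothetical filenames; it groups on the first six characters of each name, whereas the notebook cell below reads `ASN_ID` from the headers and then matches that prefix anywhere in the filename.

```python
from collections import defaultdict

# Hypothetical FLC filenames; the first six characters stand in for the
# association prefix used to batch files
filenames = [
    "iabc01a1q_flc.fits",
    "iabc01a2q_flc.fits",
    "iabc02b1q_flc.fits",
]

batches = defaultdict(list)
for name in sorted(filenames):
    batches[name[:6]].append(name)

for prefix, members in batches.items():
    print(prefix, members)
```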

NOTE: The `driz_cr_snr` should be set to best suit your data. Tips from Ben Sunnquist: "[The ROCR correction code (step v)] typically flagged an additional ~ 5000-50,000 pixels in each chip [(this is printed out in step v)]. If you find it's over/under flagging... you could raise/lower the sigma in the ROCR code [(step v)], or set the `driz_cr_snr='5 4'` [(or higher, below in this drizzle step)] rather than `'3.5 3'` when making the cosmic ray maps, which sometimes over flags CRs (and thus can over flag ROCRs as well). To verify, I blink the FLC SCI (ext=1 & 4) and DQ (ext=3 & 6) extensions, and make sure the negative tails attached to some CRs are flagged."

In [None]:
# Move into the drizzle directory
os.chdir(DRIZ_DIR)

# Get all FLCs
files = sorted(glob.glob('*_flc.fits'))

# Loop to get batches of all files to run through astrodrizzle based on ASN ID
fields = []
asns_full = []
asns = []
for f in files:

    # Read in the primary header for each file and get field and ASN ID
    hdr = fits.getheader(f, 0)
    field = hdr['TARGNAME']
    asn = hdr['ASN_ID']

    # For each ASN ID, store the full ASN ID, file abbreviation, and fields/target names of observations
    if asn not in asns_full:
        asns_full.append(asn)
        asns.append(f[0:6])
        fields.append(field)  

print('Unique ASNs: {}'.format(asns_full))
print('Unique ASN filenames: {}'.format(asns))
print('Fields: {}'.format(fields))

# Create lists of files associated with each ASN ID:
lists = []
for asn in asns:
    asn_files = [files[i] for i, s in enumerate(files) if asn in s]
    lists.append(asn_files)

print('Lists of files for each ASN ID that will be run by astrodrizzle in batches:')
print(lists)

# Reset astrodrizzle parameters to their defaults and print versions
teal.unlearn('astrodrizzle')
print('Python version {}'.format(python_version()))
print('astrodrizzle version {}'.format(ad.__version__))

# Timestamp for drizzles
now = datetime.datetime.now()
print('*****************************************************************************')
print(DRIZ_DIR)
print('Drizzle started at ', now.strftime("%Y-%m-%d %H:%M"))
print('*****************************************************************************\n\n')

# Run astrodrizzle with lists and ASN IDs defined above
for l, asn in zip(lists, asns):
    ad.AstroDrizzle(l, 
        driz_cr_corr=True, 
        driz_combine=True,
        preserve=False,  
        clean=True, 
        build=True, 
        driz_cr_snr='3.5 3.0',   # Set this option to best suit your data ('5.0 4.0' or higher), see notes above
        output='{}'.format(asn))

# Timestamp for drizzles
now = datetime.datetime.now()
print('\n\n*****************************************************************************')
print(DRIZ_DIR)
print('Drizzle complete at ', now.strftime("%Y-%m-%d %H:%M"))
print('*****************************************************************************')

iv) If desired, make an additional copy of the drizzled, but not yet ROCR corrected, FLCs

The drizzle can take a while depending on your data set. If you would like to test the ROCR flagging parameters on a clean set of drizzled FLCs each time, then make an extra copy here.

In [None]:
# Makes the directory if it doesn't exist, checks for files, copies them over if not there
cf.copy_files_check(DRIZ_DIR, DRIZ_DIR.replace('rocr_driz', 'PRErocr_driz'), files='*flc.fits')

v) Run the ROCR corrections

This code was developed by Ben Sunnquist and the original version is here: [flag_rocrs.ipynb](https://github.com/bsunnquist/uvis-skydarks/blob/master/flag_rocrs.ipynb). Check there for the latest version. 

I made some small edits (denoted with LP in the comments) so that it would be compatible with `Python v3.7`/`astropy v4.0` (the original works with `Python v3.6` and `astropy v<4.0`).

Tips from Ben Sunnquist: "[The ROCR correction code (this step)] typically flagged an additional ~ 5000-50,000 pixels in each chip [(this is printed out)]." Adjust the threshold parameter below or adjust `driz_cr_snr` in step iii to get this level of flagging. To check outputs: "blink the FLC SCI (ext=1 & 4) and DQ (ext=3 & 6) extensions, and make sure the negative tails attached to some CRs are flagged."
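
The core of the flagging logic can be shown on a single image column. This is a minimal sketch, not the code in the cell below: the threshold uses fixed stand-in values for the Gaussian-fit mean and standard deviation, and the function `flag_tail` and the column values are hypothetical.

```python
import numpy as np


def flag_tail(column, cr_index, thresh, step=1, max_pix=5):
    """Flag pixels within `max_pix` of a CR hit, moving in direction `step`,
    whose values fall below `thresh`."""
    flags = np.zeros(column.shape, dtype=bool)
    for offset in range(1, max_pix + 1):
        y = cr_index + step * offset
        if 0 <= y < column.size and column[y] < thresh:
            flags[y] = True
    return flags


# Hypothetical column: a CR hit at index 2 followed by an over-corrected tail
column = np.array([0.1, 0.0, 50.0, -4.0, -3.0, -0.2, 0.1, 0.0, -5.0])
mean, stddev, sigma = 0.0, 1.0, 2.75   # stand-ins for the Gaussian fit values
thresh = mean - sigma * stddev

flags = flag_tail(column, cr_index=2, thresh=thresh, step=1)
print(np.flatnonzero(flags))
```

Raising `sigma` lowers the threshold and flags fewer pixels. The low pixel at index 8 is not flagged because it lies more than five pixels from the CR hit. In the notebook, the walk direction is `+1` for extension 1 and `-1` for extension 4.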

In [None]:
# Flag pixels as "bad detector pixel" (DQ value 4) that are within 5 pixels of a CR hit (away from the
# readout direction) AND X sigma below the image mean (where the sigma and mean here are from a Gaussian 
# fit to the sigma-clipped image data).
#
# These pixels are read out cosmic rays (ROCRs), i.e. CRs that fall during readout and therefore 
# trick the CTE correction into over correcting them since it thinks they fell farther from the readout
# than they actually did.


###################################### USER INPUTS ######################################
# LP added: Move into the drizzle directory
os.chdir(DRIZ_DIR)

# the files to use to find the ROCRs (i.e. drizzling has been done on these files so they have CR flags)
files = sorted(glob.glob('*_flc.fits'))  # the files to flag ROCRs in

# the directory containing the files to add the ROCR flags to (no drizzling has been done on these files)
# LP edit: set to the pre-made clean FLC directory
untouched_files_dir = ROCR_DIR

# the sigma to use when determining the threshold for flagging ROCRs
sigma = 2.75   # LP note: adjust to flag the appropriate no. of ROCR pixels for your data
#########################################################################################

for f in files:
    basename = os.path.basename(f)
    print('Flagging ROCRs in {} ...'.format(basename))
    untouched_file = os.path.join(untouched_files_dir, basename)
    h = fits.open(f)
    h_untouched = fits.open(untouched_file)

    for ext in [1,4]:
        data = h[ext].data
        dq = h[ext+2].data
        dq_untouched = h_untouched[ext+2].data

        # Find lower limit for flagging ROCRs
        clipped = sigma_clip(data, sigma=3, maxiters=5)
        d = clipped[clipped.mask==False].data
        n, bins = np.histogram(d, bins=70)
        bin_centers = (bins[:-1] + bins[1:]) / 2
        #LP added [0:1] indexes
        g_init = models.Gaussian1D(amplitude=np.array(n[n==max(n)][0:1]), mean=np.array(bin_centers[n==max(n)][0:1]), stddev=np.std(d))
        
        fit_g = fitting.LevMarLSQFitter()
        g = fit_g(g_init, bin_centers, n)
        # LP removed [0] from g.mean.value
        thresh = g.mean.value - sigma*g.stddev.value
        print('\t Threshold Ext {} = {:.3f} - {}*{:.3f} = {:.3f}'.format(ext, g.mean.value, sigma, g.stddev.value, thresh)) 
        
        # Make mask of all CR hits
        cr_mask = np.zeros(dq.shape, dtype=int)
        cr_mask[dq&4096!=0] = 1

        # Flag pixels within 5 pixels of a CR hit (away from readout) that are below the threshold 
        coords = np.where(cr_mask==1)
        cr_mask_new = np.zeros(cr_mask.shape)
        for i in np.arange(len(coords[0])):
            x,y = coords[1][i], coords[0][i]

            # Get the first y-coordinate to check
            if ext==1:
                running_y = y + 1
            elif ext==4:
                running_y = y - 1
            else:
                print('extension {} not expected'.format(ext))

            # See if this coordinate has a value below the threshold
            count = 0
            while count < 5:  # stay within 5 pixels of cr hit
                if (running_y <= 2050) & (running_y >= 0):  # avoid going off the image y-dimension          
                    val = data[running_y, x]
                    if val < thresh:
                        cr_mask_new[running_y, x] = 1
                    if ext==1:
                        running_y += 1
                    elif ext==4:
                        running_y -= 1
                    else:
                        print('extension {} not expected'.format(ext))
                else:
                    pass

                count += 1

        # Add in new ROCR flags as 4 (bad detector pixel)
        dq_untouched[(dq_untouched&4==0) & (cr_mask_new==1)] += 4
        h_untouched[ext+2].data = dq_untouched

        # Write out ROCR flag map
        fits.writeto(f.replace('_flc.fits','_rocr_map_ext_{}.fits'.format(ext)), cr_mask_new, overwrite=True)
        print('\t # of ROCR flags in Ext {}: {}'.format(ext, len(cr_mask_new[cr_mask_new==1])))

    # Write out the ROCR-flagged flc file
    outname = untouched_file.replace('_flc.fits','_rocr_flagged_flc.fits')
    h_untouched.writeto(outname, overwrite=False)
    h_untouched.close()
    print('\t ROCR-flagged image saved to {}'.format(os.path.basename(outname)))

vi) Set a final directory, copy and rename FLCs

In [None]:
# Set final directory
FIN_DIR = os.path.join(FLC_DIR, 'final_flcs')

# Makes destination directory, checks if files exist, copies them if they don't, and renames files as specified
cf.copy_files_check(ROCR_DIR, FIN_DIR, files='*_rocr_flagged_flc.fits', rename=True, src_str='rocr_flagged_flc', dst_str='flc')


vii) Update headers of final corrected FLCs

In [None]:
# Move into directory
os.chdir(FIN_DIR)

# Set time now
now = Time(Time.now(), format='iso')

# List flcs
flcs = glob.glob('*flc.fits')

# update filenames, history, date
n=0
for f in flcs:
    print('Opening and updating {}...'.format(f))

    # Open file for editing
    h = fits.open(f)
    
    # Update header with corrections performed
    h[0].header['HISTORY'] = 'FLC corrections (from L Prichard) performed {}'.format(now)
    h[0].header['HISTORY'] = 'Code: https://github.com/lprichard/HST_FLC_corrections.ipynb'
    h[0].header['HISTORY'] = 'Includes FLC amp equalization (by B Sunnquist)'
    h[0].header['HISTORY'] = 'Code: https://github.com/bsunnquist/uvis-skydarks/blob/master/make_uvis_skydark.py'
    h[0].header['HISTORY'] = 'Includes ROCR corrections (by B Sunnquist)'
    h[0].header['HISTORY'] = 'Code: https://github.com/bsunnquist/uvis-skydarks/blob/master/flag_rocrs.ipynb'

    h.writeto(f, overwrite=True)
    h.close()
    n+=1
    print('Updated header for {}/{}: {}'.format(n, len(flcs), f))

print('Updated headers for {} final flcs in {}'.format(n, FIN_DIR))

# 3) Apply ACS FLC Corrections

**a) Gradient removal and chip equalization for ACS FLCs**

Only a handful of FLCs may have gradients, and strong gradients may be more likely for observations in [Continuous Viewing Zones (CVZs)](https://www.stsci.edu/itt/review/cp_primer_cy17/CP_PRIMER/4_Observation_Types2.html). Correction code [`remove_gradients.ipynb`](https://github.com/bsunnquist/uvis-skydarks/blob/master/remove_gradients.ipynb) by Ben Sunnquist; check there for the latest version. By default, the code only applies corrections to FLCs with gradients larger than 5 electrons (`gradient_threshold=5`). To apply the corrections (gradient removal and chip equalization) to all FLCs (which results in basically the same output for those FLCs least affected), set `gradient_threshold=None`.
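
The gradient test compares the background level on the left and right sides of each chip. The sketch below shows the comparison on a synthetic ramp; it omits the sigma clipping used in the notebook code, and the function `has_gradient` and the test image are hypothetical.

```python
import numpy as np


def has_gradient(data, gradient_threshold=5.0):
    """Compare the medians of the left and right quarters of a chip."""
    left, _, _, right = np.split(data, 4, axis=1)
    diff = abs(np.median(left) - np.median(right))
    return diff > gradient_threshold, diff


# Hypothetical chip with a linear ramp of 8 e- across its width
ny, nx = 16, 64
ramp = np.linspace(0.0, 8.0, nx)
chip = np.tile(ramp, (ny, 1))

flagged, diff = has_gradient(chip)
print("Left/right difference: {:.2f} e-, corrected: {}".format(diff, flagged))
```

A file is corrected if either chip exceeds the threshold, so a gradient confined to one chip is still caught.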

Examples of the outputs of the code are below:
<img src="./images/remove_grad_outputs.png" alt="grad" width="400"/>

Notes from Ben Sunnquist on the code functionality and tips for its application:

"The following is the full process used to remove the gradients from the input FLCs:
1. Subtract the clipped median to remove the overall background level
2. Find the 2D background gradient of #1 using `photutils` 2D median background estimator
3. Subtract the gradient found in #2 from the original FLC data to remove the gradient
4. Create a source segmap using the image from #3
5. Repeat steps 1-3 using the original, untouched FLCs, but this time masking the sources found in #4 when finding the background gradient
6. [Equalizing the gradient subtracted chips to the average chip level as for the WFC3/UVIS FLCs]

Documentation on the `photutils` background estimator [here](https://photutils.readthedocs.io/en/stable/background.html#d-background-and-noise-estimation) and on the `photutils` image segmentation for source finding [here](https://photutils.readthedocs.io/en/stable/segmentation.html#source-extraction-using-image-segmentation).

For each FLC, the code will output a corresponding `*bkg.fits` and `*segmap.fits` image to inspect the quality of the background gradient subtracted and the source finding. If you find remnants of e.g. diffuse, patchy sources showing in the `*bkg.fits` image, I would recommend either increasing the box size in the background estimate, or decreasing the `nsigma` threshold used in the source detection."  
<br />
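
Steps 1 to 3 of the process quoted above can be sketched with `numpy` alone. This is a crude stand-in for the `photutils` 2D background estimator, not the method used in the notebook: it takes the median in fixed boxes with no filtering or interpolation, and it skips the source masking of steps 4 and 5. The function `box_median_background` and the synthetic image are hypothetical.

```python
import numpy as np


def box_median_background(data, box=(8, 8)):
    """Crude background map: median of each box, repeated to full size.

    Assumes the image dimensions are exact multiples of the box size.
    """
    ny, nx = data.shape
    by, bx = box
    # Reshape so each box becomes its own pair of axes, then take medians
    tiles = data.reshape(ny // by, by, nx // bx, bx)
    coarse = np.median(tiles, axis=(1, 3))
    return np.repeat(np.repeat(coarse, by, axis=0), bx, axis=1)


rng = np.random.default_rng(1)
ny, nx = 32, 64
gradient = np.tile(np.linspace(0.0, 20.0, nx), (ny, 1))
image = gradient + rng.normal(0.0, 0.5, size=(ny, nx))

# Steps 1-3: remove the overall level, estimate the gradient, subtract it
levelled = image - np.median(image)
background = box_median_background(levelled)
flattened = image - background

print("Peak-to-peak before: {:.1f}, after: {:.1f}".format(
    np.ptp(image), np.ptp(flattened)))
```

Larger boxes follow the background less closely but are less affected by sources, which is the trade-off behind the box size tip above.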

i) Copy the FLCs to be corrected

In [None]:
if inst=='ACS':
    # Set correction subdirectory 
    EQ_DIR = os.path.join(COR_DIR, 'eq_flcs')

    # Directory is made and files copied over (if they don't exist) to be corrected
    cf.copy_files_check(ALL_DIR, EQ_DIR, files='*_flc.fits')
else: print('Set inst=ACS and ensure you have the ACS FLCs to apply this correction')

ii) Remove gradients from FLCs and equalize chip levels

(Only some minor edits have been made to the original code, denoted by LP in the comments.)

In [None]:
# Removes large-scale background gradients from the input flc files, and equalizes the overall
# background levels between chips


################################# USER INPUTS #################################
# LP added: Move into the correction sub-dir
os.chdir(EQ_DIR)

# The files to correct
files = glob.glob('./*flc.fits')

# Only those files whose gradients are larger than this threshold will be corrected.
# Set to None to correct all files regardless of the measured gradient.
gradient_threshold = 5.0

# The box size to use when creating the 2D background image
box_size = (128, 128)

# Option to mask sources when finding the background gradient/pedestal level
mask_sources = True

###############################################################################


# Remove those files with no gradient from the processing list
if gradient_threshold:
    files_to_process = []
    for f in files:
        diffs = []
        for ext in [1, 4]:
            data = fits.getdata(f, ext)
            left, _,  _, right = np.split(data, 4, axis=1)
            clipped_left = sigma_clip(left, sigma=3, maxiters=5)
            clipped_right = sigma_clip(right, sigma=3, maxiters=5)
            med_left = np.nanmedian(clipped_left.data[clipped_left.mask==False])
            med_right = np.nanmedian(clipped_right.data[clipped_right.mask==False])
            diffs.append(abs(med_left - med_right))
        diffs = np.array(diffs)
        if len(diffs[diffs > gradient_threshold]) > 0:
            files_to_process.append(f)
else:
    files_to_process = files

# LP added files check
if len(files_to_process)>0:
    
    # STEP 1: Remove the large-scale 2D background gradient from each chip
    print('STEP 1: Removing 2D gradients from input FLCs...')
    for f in files_to_process:
        basename = os.path.basename(f)
        print('Working on {}:'.format(basename))
        h = fits.open(f)
        for ext in [1,4]:
            print('\tWorking on extension {}:'.format(ext))
            data_orig = np.copy(h[ext].data)
            data = h[ext].data

            # Subtract off median
            clipped = sigma_clip(data, sigma=3, maxiters=5)
            data = data - np.nanmedian(clipped.data[clipped.mask==False])

            # Find sources in the gradient-removed image
            if mask_sources:
                print('\tMaking source segmap...')
                s = SigmaClip(sigma=3.)
                bkg_estimator = MedianBackground()
                bkg = Background2D(data, box_size=box_size, filter_size=(10, 10), 
                                   sigma_clip=s, bkg_estimator=bkg_estimator, exclude_percentile=15.0)
                skydark = bkg.background
                data_flat = data_orig - skydark
                threshold = detect_threshold(data_flat, nsigma=0.75)
                sigma = 3.0 * gaussian_fwhm_to_sigma
                kernel = Gaussian2DKernel(sigma, x_size=3, y_size=3)
                kernel.normalize()
                segm = detect_sources(data_flat, threshold, npixels=5, filter_kernel=kernel)
                segmap = segm.data
                fits.writeto(f.replace('_flc.fits', '_segmap_ext{}.fits'.format(ext)), 
                             segmap, overwrite=True)
            else:
                segmap = np.zeros(data.shape).astype(int)

            # Find the background gradient, incorporating the source mask
            print('\tFinding the background gradient...')
            s = SigmaClip(sigma=3.)
            bkg_estimator = MedianBackground()
            mask = (segmap > 0)
            bkg = Background2D(data, box_size=box_size, filter_size=(10, 10), 
                               sigma_clip=s, bkg_estimator=bkg_estimator, mask=mask, exclude_percentile=15.0)
            skydark = bkg.background
            fits.writeto(f.replace('_flc.fits', '_bkg_ext{}.fits'.format(ext)), 
                         skydark, overwrite=True)

            # Subtract the background gradient from the original image
            data_new = data_orig - skydark
            h[ext].data = data_new.astype('float32')

        h.writeto(f, overwrite=True)
        h.close()
        print('Finished removing background gradient from {}'.format(basename))
       
    
    # STEP 2: equalize the overall background/pedestal level between chips to the average of the two chips
    print('\nSTEP 2: Equalizing overall background levels in the input FLCs...')
    for f in files_to_process:
        basename = os.path.basename(f)
        print('Working on {}:'.format(basename))
        h = fits.open(f)

        # Find the overall background levels in each chip
        background_levels = []
        for ext in [1,4]:
            data = np.copy(h[ext].data)

            # Mask sources using the previous segmap
            if mask_sources:
                segmap = fits.getdata(f.replace('_flc.fits', '_segmap_ext{}.fits'.format(ext)))
            else:
                segmap = np.zeros(data.shape).astype(int)
            data[segmap > 0] = np.nan

            # Calculate the background level in the chip
            clipped = sigma_clip(data, sigma=3, maxiters=5)
            background_levels.append(np.nanmedian(clipped.data[clipped.mask==False]))
            print('\tBackground in ext {} = {:0.5f}'.format(ext, np.nanmedian(clipped.data[clipped.mask==False])))

        # Equalize the background levels of the two chips to the average of the two
        avg_bkg = np.mean(background_levels)
        ext1_orig = np.copy(h[1].data)
        ext1_diff = background_levels[0] - avg_bkg
        ext1_new = ext1_orig - ext1_diff
        h[1].data = ext1_new.astype('float32')
        ext4_orig = np.copy(h[4].data)
        ext4_diff = background_levels[1] - avg_bkg
        ext4_new = ext4_orig - ext4_diff
        h[4].data = ext4_new.astype('float32')

        # Write out the final calibrated file
        h.writeto(f, overwrite=True)
        h.close()
        print('Finished equalizing background levels between chips in {}'.format(basename))
        
# LP added
else: print('{} files with gradient_threshold>{}, no gradients removed'.format(len(files_to_process), gradient_threshold))

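The file selection at the top of the cell above compares the median of the left and right quarters of each chip and keeps a file if either chip exceeds `gradient_threshold`. The sketch below reproduces that comparison on synthetic images. For brevity it uses a plain `np.nanmedian` rather than the sigma-clipped median used in the cell, and the function name `left_right_difference` is hypothetical.

```python
import numpy as np


def left_right_difference(data):
    """Absolute difference between the medians of the left and right quarters.

    Simplified: no sigma clipping. np.split needs the image width to be
    divisible by 4, which is assumed here.
    """
    left, _, _, right = np.split(data, 4, axis=1)
    return abs(np.nanmedian(left) - np.nanmedian(right))


# Synthetic test images (made-up values, not real data)
ny, nx = 32, 64
flat = np.full((ny, nx), 10.0)           # no gradient
ramp = flat + np.linspace(0.0, 8.0, nx)  # left-to-right ramp, broadcast over rows

gradient_threshold = 5.0
for name, image in [('flat', flat), ('ramp', ramp)]:
    diff = left_right_difference(image)
    print('{}: diff = {:.3f}, correct = {}'.format(name, diff, diff > gradient_threshold))
```

With these inputs only the ramp image exceeds the threshold, so only that file would be passed on to the gradient removal.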

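STEP 2 of the cell above shifts each chip by its offset from the mean of the two chip background levels, so that both end up on the same pedestal. A small worked example with made-up numbers follows. The array values and the name `equalize_pair` are illustrative only, and the source masking and sigma clipping used in the cell are omitted.

```python
import numpy as np


def equalize_pair(chip1, chip2):
    """Shift two chips so their median levels meet at the mean of the two."""
    levels = [np.nanmedian(chip1), np.nanmedian(chip2)]
    avg_bkg = np.mean(levels)
    # Subtract each chip's offset from the average level
    return chip1 - (levels[0] - avg_bkg), chip2 - (levels[1] - avg_bkg)


chip1 = np.full((4, 4), 12.0)  # background level 12
chip2 = np.full((4, 4), 10.0)  # background level 10
new1, new2 = equalize_pair(chip1, chip2)
print(np.nanmedian(new1), np.nanmedian(new2))  # prints 11.0 11.0
```

Because each chip is shifted by a constant, the relative pixel values within a chip are unchanged. Only the pedestal difference between the chips is removed.
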
iii) Set a final directory and copy FLCs

In [None]:
# Set final directory
FIN_DIR = os.path.join(FLC_DIR, 'final_flcs')

# Makes destination directory, checks if files exist, copies them if they don't, and renames files as specified
cf.copy_files_check(ROCR_DIR, FIN_DIR, files='*flc.fits')

iv) Update headers of final corrected FLCs

In [None]:
# Move into directory
os.chdir(FIN_DIR)

# Set time now
now = Time(Time.now(), format='iso')

# List flcs
flcs = glob.glob('*flc.fits')

# Add HISTORY entries (corrections applied and date) to each file
n=0
for f in flcs:
    print('Opening and updating {}...'.format(f))

    # Open file for editing
    h = fits.open(f)
    
    # Update header with corrections performed
    h[0].header['HISTORY'] = 'FLC corrections (from L Prichard) performed {}'.format(now)
    h[0].header['HISTORY'] = 'Code: https://github.com/lprichard/HST_FLC_corrections.ipynb'
    h[0].header['HISTORY'] = 'Includes gradient removal & amp equalization (by B Sunnquist)'
    h[0].header['HISTORY'] = 'Code: https://github.com/bsunnquist/uvis-skydarks/blob/master/remove_gradients.ipynb'
    
    h.writeto(f, overwrite=True)
    h.close()
    n+=1
    print('Updated header for {}/{}: {}'.format(n, len(flcs), f))

print('Updated headers for {} final flcs in {}'.format(n, FIN_DIR))