<a id="title_ID"></a>
# JWST Pipeline Validation Testing Notebook: Calwebb_Detector1 for MIRI TSO imaging

<span style="color:red"> **Instruments Affected**</span>: MIRI

Tested on MIRI simulated data.

### Table of Contents
<div style="text-align: left"> 

<br>  [Introduction](#intro_ID) <br> [Imports](#imports_ID) <br>[Run JWST Pipeline](#pipeline_ID) <br> [Examine Input and Output Data](#examine_data)  <br> [About This Notebook](#about_ID) <br>


</div>

<a id="intro_ID"></a>
# Introduction

This notebook processes a data set through the Detector1 pipeline for TSO imaging data (calwebb_tso1). The steps are as follows:

1) Read in an uncalibrated TSO imaging file.

2) Process through calwebb_detector1 with step settings that mirror calwebb_tso1.cfg.

3) Test various steps and outputs from the pipeline run.

These steps are set up with an example simulated MIRI dataset.

The pipeline documentation can be found here: https://jwst-pipeline.readthedocs.io/en/latest/

The pipeline code is available on GitHub: https://github.com/spacetelescope/jwst

### Defining Terms

Terms and acronyms used in this notebook that may be unfamiliar to a general audience (e.g. a new employee at the institute or an external user):

    JWST: James Webb Space Telescope
    MIRI: Mid-Infrared Instrument
    LRS: Low Resolution Spectrometer
    TSO: Time Series Observation



<a id="imports_ID"></a>
## Imports

* jwst.datamodels for building the data models used by the JWST pipeline
* jwst.pipeline provides the pipeline being tested
* matplotlib.pyplot to generate plots
* numpy for array calculations and manipulation
* pysiaf to get coordinates of MIRI apertures
* astropy.io and download_file for downloading and accessing files
* ci_watson and get_bigdata for accessing files stored in Artifactory

In [None]:
import os
if 'CRDS_CACHE_TYPE' in os.environ:
    if os.environ['CRDS_CACHE_TYPE'] == 'local':
        os.environ['CRDS_PATH'] = os.path.join(os.environ['HOME'], 'crds', 'cache')
    elif os.path.isdir(os.environ['CRDS_CACHE_TYPE']):
        os.environ['CRDS_PATH'] = os.environ['CRDS_CACHE_TYPE']
# Use .get() so the cell does not raise a KeyError when CRDS_PATH is not set
print('CRDS cache location: {}'.format(os.environ.get('CRDS_PATH', 'not set')))

In [None]:
from astropy.io import fits, ascii
from astropy.utils.data import download_file
from ci_watson.artifactory_helpers import get_bigdata
from jwst.datamodels import RampModel, ImageModel, dqflags, CubeModel
from jwst.pipeline import Detector1Pipeline
import matplotlib.pyplot as plt
import numpy as np
import os
from photutils import centroids  # used by checkheaders() to centroid the SUB64 source
import pysiaf

### Read in file and update headers to have needed keywords for TSO mode


In [None]:
# Create a temporary directory to hold notebook output, and change the working directory to that directory.
from tempfile import TemporaryDirectory
import os
data_dir = TemporaryDirectory()
os.chdir(data_dir.name)

In [None]:
def checkheaders(model):
    """Set TSOVISIT and CRPIX1/CRPIX2 in the model metadata as needed for TSO processing."""

    # TSOVISIT must be True for all TSO data
    if not model.meta.visit.tsovisit:
        model.meta.visit.tsovisit = True
        print('Setting TSOVISIT keyword')
        
    # check that CRPIX1 and CRPIX2 are set to the center of the siaf aperture for the array being used.
    # Read in array being used
    array = model.meta.subarray.name
    print(array)
    if array == 'FULL':
        siaf = pysiaf.Siaf('MIRI') 
        full = siaf['MIRIM_FULL']
        model.meta.wcsinfo.crpix1 = full.XSciRef
        model.meta.wcsinfo.crpix2 = full.YSciRef
    if array == 'SUB64':
        # subarray siaf values are not quite right in MIRISim. Need to centroid to find x and y
        # start with siaf values
        siaf = pysiaf.Siaf('MIRI')
        sub = siaf['MIRIM_SUB64']
        x_initial = sub.XSciRef - 8 # known 8 pixel shift in subarray source position fixed in latest MIRISim
        y_initial = sub.YSciRef
        
        print(x_initial, y_initial)
        
        # Take initial estimate and centroid to find source
        center = centroids.centroid_sources(model.data[0,0,:,:], x_initial, y_initial, box_size=11)
        xcentroid = center[0][0]
        ycentroid = center[1][0]
        
        print(center[0][0], center[1][0])   
        model.meta.wcsinfo.crpix1 = xcentroid
        model.meta.wcsinfo.crpix2 = ycentroid

<a id="pipeline_ID"></a>
## Run JWST Pipeline

### Set up parameters for individual steps and run calwebb_detector1

In [None]:
# set up pipeline parameters and file names
# Input file names

# Download the data from the remote server so the notebook can run locally
mainurl = "https://data.science.stsci.edu/redirect/JWST/TSO/pipeline_testing_miri_ima_tso/"
filename = 'pipetest_miri_imtso_FULL_10g10i_F770W.fits'
file = download_file(mainurl+filename)

# open file into correct format and write to local disk for processing
with fits.open(file) as hdu:
    hdu.info()
    hdu.writeto(filename)

satfile = get_bigdata('jwst_validation_notebooks',
                     'validation_data',
                     'jump',
                     'jump_miri_test', 
                     'miri_sat_55k.fits')

readnoisefile = get_bigdata('jwst_validation_notebooks',
                     'validation_data',
                     'jump',
                     'jump_miri_test', 
                     'jwst_mirisim_readnoise.fits')

tag='_b75_tso'  # string tag to distinguish different tests in output file name

# Read in data file to model    
with RampModel(filename) as modelinput:
    # raises exception if file is not the correct model
    model = modelinput

In the next step we pick 7 pixel locations, and manually add in jumps that represent cosmic ray hits. The magnitude of the hit is different for each. The hit is added in frame 5 (zero-indexed), first integration. This will be used for testing the jump detection step.
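
The injection can be tried on a synthetic ramp first. The sketch below is only an illustration, assuming NumPy alone; the array shape, the 100 DN per group slope, and the helper name `inject_jump` are made up for the example and are not taken from the notebook's data.

```python
import numpy as np

def inject_jump(data, integration, frame, y, x, magnitude):
    """Add a constant offset to one pixel's ramp from group `frame` onward (in place)."""
    data[integration, frame:, y, x] += magnitude
    return data

# Synthetic ramp cube with axes (integration, group, row, column):
# every pixel gains 100 DN per group (illustrative values only).
nints, ngroups, ny, nx = 2, 10, 4, 4
ramp = np.zeros((nints, ngroups, ny, nx))
ramp += 100.0 * np.arange(ngroups).reshape(1, ngroups, 1, 1)

inject_jump(ramp, integration=0, frame=5, y=1, x=2, magnitude=50.0)

# Group-to-group differences: the step shows up between groups 4 and 5 only.
diffs = np.diff(ramp[0, :, 1, 2])
print(diffs)
```

Because the offset is added to every group from `frame` onward, the ramp keeps its slope after the hit and only one group-to-group difference changes, which is the signature the jump step looks for.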

In [None]:
# Set up cosmic ray jump testing by adding in cosmic rays
# set variables

# Choose selected pixels to put cr hits of varying fluxes in
xpos = [460, 480, 500, 520, 540, 560, 580]
ypos = [150, 150, 150, 150, 150, 150, 150]
crmags = [10, 25, 50, 100, 200, 500, 1000]

frame = 5  # frame to add cr
integration = 0  # integration to add crs
    
# loop through arrays of x, y and crmags to populate array with values
for x, y, crmag in zip(xpos, ypos, crmags):
    # add cr to ramps from point of 'frame' in ramp
    model.data[integration, frame:, y, x] = model.data[integration, frame:, y, x] + crmag    

Now we run the Detector1 pipeline. The jump detection threshold is set manually; this is important again for jump step testing. A number of reference files are overridden with versions that are compatible with MIRISim simulated data.

As we are not running with the tso1 config file, we have to ensure a few steps are skipped manually:
* ipc
* first frame correction
* last frame correction
* refpix

In [None]:
# Run detector1 pipeline

# step parameters
rej_thresh=8.0  # rejection threshold for jump step (higher for simulated data)
    
# set up pipeline parameters for input
pipe1 = Detector1Pipeline()
pipe1.jump.rejection_threshold = rej_thresh
pipe1.saturation.override_saturation = satfile
pipe1.jump.override_readnoise = readnoisefile
pipe1.ramp_fit.override_readnoise = readnoisefile

# skip steps to make it like 'tso1 config file'
pipe1.ipc.skip = True
pipe1.firstframe.skip = True
pipe1.lastframe.skip = True
    
# Until MIRISim is updated, best to skip refpix step for simulated data
pipe1.refpix.skip = True

# check that header has needed keywords set
        
checkheaders(model)

nints = model.meta.exposure.nints
print('CRPIX1 = ',model.meta.wcsinfo.crpix1)
print('CRPIX2 = ',model.meta.wcsinfo.crpix2)
    
# set up output file name
base, remainder = filename.split('.')

outname = base+tag
print(outname)

pipe1.saturation.output_file = outname+'.fits'
pipe1.jump.output_file = outname+'.fits'    
pipe1.ramp_fit.output_file = outname+'.fits'
pipe1.output_file = outname+'.fits'
            
# Run the pipeline on the input model
pipe1.run(model) 

print('Detector 1 steps completed.')

In [None]:
print(outname)

<a id="examine_data"></a>
## Examine input and output data

### Take a look at the input data
* Look at the last frame
* plot a pixel up the ramp from the source
* plot a pixel up the ramp from the background

In [None]:
sci_data = model.data

ngroups = model.meta.exposure.ngroups
nints = model.meta.exposure.nints

# identify a science pixel
sci_px = [512, 692]

# identify a pixel in blank sky
bgr_px = [560, 915]

fig, ax = plt.subplots(nrows=1, ncols=3, figsize=[12,4])

# plot 1: frame[-1] in the first integration
lastgrp = ax[0].imshow(sci_data[0,ngroups-1,:,:], origin='lower', interpolation='None', aspect='equal', cmap='Greys',
                      vmin=10000, vmax=12000)
ax[0].scatter(sci_px[1], sci_px[0], marker='x', color='r', label='sci pixel')
ax[0].scatter(bgr_px[1], bgr_px[0], marker='+', color='y', label='bgr pixel')
ax[0].set_title('Group {} Int 0'.format(ngroups-1))
ax[0].set_xlabel('px')
ax[0].set_ylabel('px')

# plot 2: ramps for the science pixel, one line per integration

ax[1].set_title('Ramps, sci pixel (red x)')
for i in range(nints):
    ax[1].plot(sci_data[i, :, sci_px[0], sci_px[1]])
ax[1].set_xlabel('group')
ax[1].set_ylabel('DN')

# plot 3: ramps for the background pixel, one line per integration

ax[2].set_title('Ramps, bgr pixel (yellow +)')
for i in range(nints):
    ax[2].plot(sci_data[i, :, bgr_px[0], bgr_px[1]])
ax[2].set_xlabel('group')
ax[2].set_ylabel('DN')

fig.colorbar(lastgrp, ax=ax[0])
fig.tight_layout()

## Test individual output

### Test Saturation output

The saturation step should flag any saturated pixels in the DQ extension. We check this here by stepping through each integration and looking at the pixel values in a 25 x 25 px box, and checking the maximum counts against the groupdq attribute of the saturation output model. 

The code below should check that pixels with counts > satvalue are flagged, and report an error if the groupdq flag is incorrect.
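
The flag test can be sketched on synthetic arrays before it is applied to pipeline output. This is a minimal illustration, not the notebook's method: the flag values are hard-coded on the assumption that `dqflags.group['SATURATED']` is 2 and `dqflags.group['DO_NOT_USE']` is 1, and the helper name `unflagged_saturated` is hypothetical. It uses a bitwise AND, so a saturated pixel that also carries another flag still passes.

```python
import numpy as np

SATURATED = 2   # assumed value of dqflags.group['SATURATED']
DO_NOT_USE = 1  # assumed value of dqflags.group['DO_NOT_USE']

def unflagged_saturated(box, dq, satvalue):
    """Return a boolean mask of pixels at or above satvalue that lack the SATURATED bit."""
    saturated = box >= satvalue
    flagged = (dq & SATURATED) > 0
    return saturated & ~flagged

# Synthetic 2x2 cutout: two pixels are above the saturation level.
box = np.array([[1000.0, 56000.0],
                [60000.0, 54999.0]])
# One saturated pixel carries SATURATED only, the other SATURATED plus DO_NOT_USE.
dq = np.array([[0, SATURATED],
               [SATURATED | DO_NOT_USE, 0]], dtype=np.uint8)

missing = unflagged_saturated(box, dq, satvalue=55000)
print('Any saturated pixel left unflagged?', missing.any())
```

An equality comparison against the flag value would reject the pixel that carries both flags, even though it is correctly marked as saturated.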

In [None]:
# read in file output from saturation step

with RampModel(outname+'_saturation.fits') as satmodel:
    # raises exception if file is not the correct model
    data = satmodel.data
    satdq = satmodel.groupdq

satvalue = 55000
print('Saturation level is: ', satvalue, ' counts.')

# Test last frame of each integration for saturation and see if it was flagged.

ngroups = model.meta.exposure.ngroups
nints = model.meta.exposure.nints

for integration in range(nints):
    # check last frame for saturation in region of star
    box = data[integration, ngroups-1 , 500:525, 680:705]
    print()
    print('Max value in 25x25 box around star position: ',np.nanmax(box))
    satframe = satdq[integration, ngroups-1, 500:525, 680:705 ]
    
    satpix = (box >= satvalue)

    # if pixels greater than value, then check that they are flagged
    if satpix.any():
        print('Saturation detected in last frame of integration: ', integration)
        # use a bitwise AND so that pixels carrying additional flags still pass
        assert np.all(satframe[satpix] & dqflags.group['SATURATED'] > 0)
    else:
        print('No pixels saturate in last frame of integration: ', integration)

### Look at plots of output
Compare the slope image from the _rate output file to the median of the slopes of the _rateints file. They should be similar in appearance and flux levels.
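
Alongside the visual check, the comparison can be put into numbers. The sketch below is one possible way to do that, assuming NumPy only; the synthetic cube stands in for the rateints data, its mean stands in for the rate image, and the helper name `compare_slopes` is hypothetical.

```python
import numpy as np

def compare_slopes(rate, rateints):
    """Return (ratio of maxima, median fractional difference) between a rate image
    and the per-pixel median of a rateints cube, ignoring NaNs."""
    med = np.nanmedian(rateints, axis=0)
    max_ratio = np.nanmax(rate) / np.nanmax(med)
    frac_diff = np.nanmedian(np.abs(rate - med) / np.abs(med))
    return max_ratio, frac_diff

# Synthetic stand-ins: 10 integrations of an 8x8 image at about 5 DN/s with noise.
rng = np.random.default_rng(0)
rateints = 5.0 + rng.normal(0.0, 0.1, size=(10, 8, 8))
rate = rateints.mean(axis=0)

max_ratio, frac_diff = compare_slopes(rate, rateints)
print('ratio of maxima:', max_ratio)
print('median fractional difference:', frac_diff)
```

A ratio of maxima close to 1 and a small median fractional difference correspond to the "similar in appearance and flux levels" criterion; what counts as close enough is left to the tester.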

In [None]:
# this plot compares the slope image from the _rate file with the median of the slope images in the _rateints file
# check criterion: they should look similar and the maximum values seen in both images should be similar

#with ImageModel(outname+'_rate.fits') as rmod:
    # raises exception if file is not the correct model
#    rdata = rmod.data
 
#with CubeModel(outname+'_rateints.fits') as rimod:
    # raises exception if file is not the correct model
#    rimoddata = rimod.data

rmod = ImageModel(outname+'_rate.fits')
rdata = rmod.data
rimod = CubeModel(outname+'_rateints.fits')
rimoddata = rimod.data

fig, ax = plt.subplots(nrows=1, ncols=2, figsize=[8,10])

rplt = ax[0].imshow(rdata, origin='lower', aspect='equal', interpolation='None', vmin=0, vmax=10)
ax[0].set_title('Integrated rate file slope image')
ax[0].set_xlabel('px')
ax[0].set_ylabel('px')

riplt = ax[1].imshow(np.median(rimoddata, axis=0), origin='lower', aspect='equal', interpolation='None',
                    vmin=0, vmax=10)
ax[1].set_title('Median of rateints file slope images')
ax[1].set_xlabel('px')
ax[1].set_ylabel('px')

cbar = fig.colorbar(rplt, ax=ax, orientation='horizontal')
cbar.set_label('DN/s')
#fig.tight_layout()

# nanmax ignores NaN pixels, consistent with the rateints line below
print('Max DN/s in the rate.fits slope image: {} DN/s'.format(np.nanmax(rdata)))
print('Max DN/s of the median of the rateints.fits slope images: {} DN/s'.format(np.nanmax(np.median(rimoddata, axis=0))))

### Test output of jump step to see if specified pixels (and their neighbors) were flagged

In this step we check the output of the jump detection step. The check looks at the pixels to which a CR hit was added above and reports whether they, and their four nearest neighbors, were flagged. The threshold can be adjusted above and the step re-run to check consistency with the input.
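
The pixel-and-neighbor test can be illustrated on a small synthetic groupdq frame. This is a sketch under stated assumptions: the flag value is hard-coded on the assumption that `JUMP_DET` is 4, and the helper name `jump_flag_status` is hypothetical.

```python
import numpy as np

JUMP_DET = 4  # assumed value of the JUMP_DET flag

def jump_flag_status(dqframe, y, x):
    """Return (pixel_flagged, all_four_neighbors_flagged) for one 2-D groupdq frame."""
    flagged = (dqframe & JUMP_DET) > 0
    pixflagged = bool(flagged[y, x])
    neighbors = [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    # Neighbors are only reported when the central pixel itself is flagged.
    neighborflagged = pixflagged and all(bool(flagged[j, i]) for j, i in neighbors)
    return pixflagged, neighborflagged

# Synthetic 7x7 frame: a full cross flagged around (3, 3), a lone flag at (1, 5).
dqframe = np.zeros((7, 7), dtype=np.uint8)
for j, i in [(3, 3), (4, 3), (2, 3), (3, 4), (3, 2)]:
    dqframe[j, i] |= JUMP_DET
dqframe[1, 5] |= JUMP_DET

for y, x in [(3, 3), (1, 5), (5, 1)]:
    print((y, x), jump_flag_status(dqframe, y, x))
```

The helper indexes `y ± 1` and `x ± 1` directly, so it assumes the tested pixel is not on the edge of the array, as is the case for the positions used in this notebook.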

In [None]:
# load jump step output file
with RampModel(outname+'_jump.fits') as jumpim:
    # raises exception if file is not the correct model
    jumpdata = jumpim.data
    jumpdq = jumpim.groupdq
    
integration = 0
frame = 5

# look for CR flags in the groupdq array at the specified locations
dqframe = jumpdq[integration, frame, :, :]

# print output on which fluxes had neighbors flagged
# output should include pixel coord, average pixel value nearby, cr value, whether neighbors were flagged
print('   xpos       ypos      crmag      avgcounts  pixflagged  neighborflagged \n')
for x, y, crmag in zip(xpos, ypos, crmags):
    # check if pixel is flagged
    # set default flag
    pixflagged = False
    neighborflagged = False

    # get stats on flux values near cr hit
    avgcounts = np.mean(jumpdata[integration, frame, y - 10: y - 5, x - 10: x - 5])

    if dqframe[y, x] & dqflags.pixel['JUMP_DET'] > 0:
        pixflagged = True
        # check neighbor pixels
        if ((dqframe[y + 1, x] & dqflags.pixel['JUMP_DET'] > 0) and
            (dqframe[y - 1, x] & dqflags.pixel['JUMP_DET'] > 0) and
            (dqframe[y, x + 1] & dqflags.pixel['JUMP_DET'] > 0) and
            (dqframe[y, x - 1] & dqflags.pixel['JUMP_DET'] > 0)):
                neighborflagged = True

    # write output
    print('{:8.0f} {:8.0f} {:10.0f} {:15.2f} {:>10} {:>10} \n'.format(x, y, crmag, avgcounts, str(pixflagged), 
                                                                      str(neighborflagged)))

In [None]:
# plot data to see what is being flagged
i=10
nframes = model.meta.exposure.ngroups
frames = np.arange(nframes)

# set up titles for plot
plt.xlabel('Frame number')
plt.ylabel('DN value up the ramp')

for x, y in zip(xpos, ypos):
    # get locations of flagged pixels within the ramps
    jumps = jumpdq[integration, :, y, x] & dqflags.pixel['JUMP_DET'] > 0
    ramp = jumpdata[integration, :, y, x]

    # plot ramps of selected pixels and flagged jumps
    plt.plot(ramp+i*10)
    plt.plot(frames[jumps], ramp[jumps]+i*10, color='r', marker='o')
    i = i+10

#plt.legend()
plt.show()

In [None]:
# show region of dq array to see if cross pixels were flagged 
data = jumpdq[integration, frame, 140:160, 440:600]
plt.imshow(data, cmap='Greys', origin='lower', vmin=0,vmax=5)
plt.show()

### RSCD testing
The RSCD step currently only flags frames as 'DO_NOT_USE' in the groupdq array, so that frames showing the RSCD effect are excluded from ramp fitting.

For FULL frame FAST mode data, the RSCD reference file indicates that the first four frames of every integration after the first (integration index 1 and above with zero-based indexing) should be flagged as 'DO_NOT_USE', which sets bit value 1 in the groupdq array. An odd flag value in a frame therefore means the frame has been flagged correctly.
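
The odd-value rule can be turned into an automatic check. The sketch below runs on a synthetic groupdq slice for a single pixel and assumes `DO_NOT_USE` has bit value 1, as stated above; the helper name `rscd_flags_ok` and the array sizes are illustrative only.

```python
import numpy as np

DO_NOT_USE = 1  # bit value of the DO_NOT_USE flag, as described in the text

def rscd_flags_ok(pixel_dq, nflagged=4):
    """Check one pixel's groupdq, shape (nints, ngroups): the first `nflagged`
    groups of every integration after the first must carry the DO_NOT_USE bit."""
    return bool(np.all(pixel_dq[1:, :nflagged] & DO_NOT_USE))

# Synthetic groupdq for one pixel: 3 integrations of 10 groups.
nints, ngroups = 3, 10
good = np.zeros((nints, ngroups), dtype=np.uint8)
good[1:, :4] |= DO_NOT_USE
good[1, 2] |= 4  # an extra flag on top keeps the value odd (5)

bad = good.copy()
bad[2, 3] = 4    # even value: DO_NOT_USE bit missing

print(rscd_flags_ok(good), rscd_flags_ok(bad))
```

This only verifies that the expected frames are flagged; it does not test whether other frames were flagged unexpectedly.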

In [None]:
# Use the groupdq output of the jump step to test whether the proper frames are flagged.
# Any pixel will do, since all pixels should be flagged identically; (500, 500) is used here.

print(jumpdq[:,:,500,500])

<a id="about_ID"></a>
## About this Notebook
**Author:** Misty Cracraft, Senior Staff Scientist, MIRI Branch
<br>**Updated On:** 07/28/2020