<a id='top'></a>
# calwebb_image2 step-by-step notebook
---
For running on a single file

**Author**: Jonathan Aguilar (jaguilar@stsci.edu) | **Latest Update**: 23 Oct 2023

* [Introduction](#intro)
* [Pipeline Resources and Documentation](#resources)
* [Imports](#imports)
* [Convenience tools](#convenience_tools)
* [File selection](#file_selection)
* [Run the individual pipeline steps](#image2_step_by_step)
   * [Import the pipeline steps](#import_pipeline)
   * [The `Background Subtraction` step](#bkg_subtract)
   * [The `Assign WCS` step](#assign_wcs)
   * [The `Flat Fielding` step](#flatfield)
   * [The `Photometric Scaling` step](#photom)
   * [The `Resampling` step](#resample)
   * [Saving to cal/calints](#close_out)

<a id='intro'></a>
## Introduction

The Stage 2 JWST pipeline for coronagraphy takes a `rateints` pipeline product as input, and performs the operations necessary to convert the units of the pixel counts into `MJy/sr`. These include:
- background subtraction, to remove background flux
- WCS assignment, to transform between pixel and sky coordinates
- flat fielding, to normalize pixel sensitivities
- photometric calibration, to scale DN/s to MJy/sr

The input `rateints` data product is a 3-D slope cube, with one 2-D slope image per integration, in units of DN/s. DN stands for Data Number, a unit used to label uncalibrated counts coming from the detector readout electronics.
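As a rough illustration of that layout, the sketch below builds a synthetic stand-in for a `rateints` cube and averages over the integration axis, which is the same reduction this notebook uses for its quick-look plots. The array shape and values are made up for the example and are not taken from a real MIRI exposure.

```python
import numpy as np

# Synthetic stand-in for a rateints SCI array: (n_integrations, n_rows, n_cols).
# The shape and values are illustrative only.
rng = np.random.default_rng(seed=0)
demo_cube = rng.normal(loc=10.0, scale=1.0, size=(5, 8, 8))  # pretend DN/s
demo_cube[0, 0, 0] = np.nan  # unusable pixels may appear as NaN

# Collapse the integration axis into a single 2-D slope image, ignoring NaNs.
demo_mean = np.nanmean(demo_cube, axis=0)
print(demo_cube.shape, "->", demo_mean.shape)
```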

This notebook breaks the calwebb_image2 (also called Image2Pipeline) pipeline class into steps, runs each step independently, and examines the output. It demonstrates how to change step-specific parameters at the step level. Much of the material, especially the documentation, is based on the example notebook found [here](https://github.com/spacetelescope/jwebbinar_prep/blob/main/pipeline_inflight/imaging_mode_stage_2.ipynb), written by Bryan Hilbert. Materials and videos can be found [here](https://www.stsci.edu/jwst/science-execution/jwebbinars) under JWebbinar 18. Here, it has been tailored to the specific case of MIRI coronagraphy.

Since this notebook focuses on running a single file through the pipeline, it skips the background subtraction step, which requires background exposures. For examples that include background subtraction, please see the following notebooks:

<a id='resources'></a>
## Pipeline resources and documentation

Documentation on `calwebb_image2` and the steps run on MIRI coronagraphy data specifically can be found here: https://jwst-pipeline.readthedocs.io/en/latest/jwst/pipeline/calwebb_image2.html

We also refer the user to the Imaging example notebook: https://github.com/spacetelescope/jwebbinar_prep/blob/main/pipeline_inflight/imaging_mode_stage_2.ipynb. This contains a wealth of information that has been omitted here for brevity.




<a id='imports'></a>
## Non-pipeline imports

In [None]:
import os
from collections import OrderedDict
from pathlib import Path

In [None]:
import numpy as np

In [None]:
import matplotlib as mpl
from matplotlib import pyplot as plt
from astropy.io import fits

<a id='convenience_tools'></a>
## Convenience tools

Environment paths and functions that make life easier.

First, set up a local CRDS directory. When the pipeline pulls a reference file from CRDS for the first time, it will write a copy to this directory. All subsequent reads of the reference file will redirect to the local directory instead of sending the file again over the network.

See https://jwst-pipeline.readthedocs.io/en/latest/jwst/user_documentation/reference_files_crds.html#crds
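If you do not already have a CRDS cache, one possible pattern is to fall back to a directory under your home folder only when `CRDS_PATH` is not yet defined. This is a minimal sketch: the `crds_cache` folder name is an arbitrary choice for the example, and any writable directory would do. These variables are normally set before the `jwst` package is imported.

```python
import os
from pathlib import Path

# Hypothetical default location; replace with any writable directory.
demo_crds_dir = Path.home() / "crds_cache"

# setdefault() leaves an existing value untouched, so a CRDS_PATH already
# defined in the shell takes precedence over this fallback.
os.environ.setdefault("CRDS_PATH", str(demo_crds_dir))
os.environ.setdefault("CRDS_SERVER_URL", "https://jwst-crds.stsci.edu")

print("CRDS_PATH =", os.environ["CRDS_PATH"])
```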

In [None]:
os.environ['CRDS_PATH'] = '/Volumes/agdisk/crds/'
# os.environ['CRDS_PATH'] = ''
os.environ['CRDS_SERVER_URL'] = 'https://jwst-crds.stsci.edu'

Advanced users: uncomment the cell below and specify the context if you want to use a specific combination of reference files.

In [None]:
# os.environ['CRDS_CONTEXT'] = 'jwst_1140.pmap'

In [None]:
# some plot formatting
mpl.rcParams['image.origin'] = "lower"

In [None]:
# you may need to run this command twice for plots to pop up correctly
# %matplotlib auto
%matplotlib inline

<a id="file_selection"></a>
## Collect Stage 1 product

By default, this notebook uses one of the results of the `calwebb_detector1-all_exposures.ipynb` notebook. If you wish to use a file on your own system, please replace the `rate_file` path in the cell below. 

If you have not run the Stage 1 notebook but you would still like to use this specific exposure as an example, you can retrieve it directly from MAST with the code snippet below.

<div class="alert alert-block alert-info">
Snippet for downloading Stage 1 ERS-1386 data:

```
from astroquery.mast import Observations
filename = "jw01386007001_04101_00001_mirimage_rateints.fits"
Observations.download_file(f"mast:JWST/product/{filename}", local_path= f"./stage1/input/{filename}")
rate_file = f"./stage1/input/{filename}"
```
    
</div>

In [None]:
rate_file = "stage1/output/jw01386007001_04101_00001_mirimage_rateints.fits"

Let's do a quick inspection of the file

In [None]:
# print some basic information
fits.info(rate_file)

In [None]:
# a simple plot
fig, ax = plt.subplots(1, 1)
img = np.nanmean(fits.getdata(rate_file, 1), axis=0)
imlims = dict(zip(['vmin', 'vmax'], np.nanquantile(img, [0.01, 0.99])))
ax.imshow(img, **imlims, origin='lower')

<a id='image2_step_by_step'></a>
## Run each step of calwebb_image2

We're going to save the output of each step separately, so the code below generates an output folder for each step. We will be running only the steps that are relevant to the MIRI coronagraphy pipeline. All the provided parameter values are the default values.

The steps, in order of execution, are:
- `bkg_subtract`
    - performs background subtraction
    - takes as input a datamodel and an optional list of background exposure filenames
    - returns a datamodel with the background exposures subtracted
    - step-specific arguments:
        - `sigma float [3.0]`: sigma-clipping threshold, in standard deviations
        - `maxiters int [None]`: number of clipping iterations to perform when combining multiple background images
        - `save_combined_background bool [False]`: whether to save the combined background image
        - `wfss_mmag_extract tuple [None]`: only for WFSS exposures; sets min mag for extracting sources
- `assign_wcs`
    - associates a WCS object with each exposure
    - keywords used: RA_REF, DEC_REF, V2_REF, V3_REF, ROLL_REF, RADESYS
    - no step-specific arguments    
    - reference files used:
        - DISTORTION
        - FILTEROFFSET
 
- `flat_field`
    - divides by a flat field reference image to mitigate pixel-to-pixel sensitivity differences
    - step-specific arguments:
        - `save_interpolated_flat bool [False]`: only for NIRSpec; save the flat field that was constructed on the fly
        - `user_supplied_flat str [None]`: path to a user-supplied flat-field image to use
        - `inverse bool [False]`: multiply instead of divide
    - reference files used:
        - FLAT
        
- `photom`
    - scales the data by a conversion factor between units of countrate to units of surface brightness 
    - step-specific arguments:
        - `inverse bool [False]`: divide instead of multiply
        - `source_type str [None]`: force processing for POINT or EXTENDED sources
        - `mrs_time_correction bool [True]`: turn on or off time-dependent correction for MRS sensitivity decay
    - reference files used:
        - PHOTOM

<a id='import_pipeline'></a>
### Import the pipeline steps

In [None]:
import jwst
jwst.__version__

In [None]:
from jwst import datamodels
from jwst.datamodels import dqflags

In [None]:
from jwst.pipeline import Image2Pipeline
from jwst.background import BackgroundStep
from jwst.assign_wcs import AssignWcsStep 
from jwst.flatfield import FlatFieldStep
from jwst.photom import PhotomStep
from jwst.resample import ResampleStep

We're going to write out the results of each step to disk, and also keep a copy in memory in the `results` dict generated in the cell below.

In [None]:
# Create a dictionary for storing the intermediate results.

# The get_pars() method returns a dictionary of allowed parameters for the pipeline stage, including
# a dictionary of parameters for each step. We will use this dictionary to get a list of steps that will
# index our results dictionary.
results = OrderedDict([(step, None) for step in Image2Pipeline().get_pars()['steps'].keys()])

In [None]:
print("Here is a list of the different steps:")
for step in results.keys():
    print("\t" + step)

Generate the output folders specific to this notebook

For this notebook, each step gets its own output folder: `stage2/output-steps/{step_name}`

In [None]:
for k in results.keys():
    p = Path(f"./stage2/output-steps/{k}")
    if not p.exists():
        p.mkdir(parents=True)
        print(str(p.resolve()), "made")
    else:
        print(f"{str(p)} found")
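The same folder layout can also be created without the explicit existence check. The sketch below shows this on a temporary directory so that it does not touch your working folder. The step names are hard-coded here for illustration, whereas the notebook takes them from the keys of `results`.

```python
import tempfile
from pathlib import Path

# Hard-coded for the example; the notebook derives these from its results dict.
demo_steps = ["bkg_subtract", "assign_wcs", "flat_field", "photom", "resample"]

with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp) / "stage2" / "output-steps"
    for name in demo_steps:
        # parents=True creates missing parent folders; exist_ok=True makes the
        # call a no-op when the folder is already there.
        (demo_root / name).mkdir(parents=True, exist_ok=True)
    demo_made = sorted(p.name for p in demo_root.iterdir())
    print(demo_made)
```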

ASDF datamodels are the native data format used by the pipeline, so we're going to use them here

In [None]:
init_dm = datamodels.open(rate_file)

Add the starting data to the results dict

In [None]:
results['init'] = init_dm

<a id='bkg_subtract'></a>
## Background subtraction

<div class="alert alert-block alert-info">
Note: for this notebook, where we explicitly choose to not combine multiple exposures, we skip the background subtraction step. The final data products from this notebook will have visible glowsticks.
</div>

https://jwst-pipeline.readthedocs.io/en/latest/jwst/background_step/index.html#background-step

The background subtraction step removes background signal by image-from-image subtraction. It takes as input one target exposure, to which the subtraction is applied, and a list of one or more background exposures.

There are 4 optional arguments:
- sigma: std for sigma-clipping
- maxiters: number of clipping iterations to perform when combining multiple background images
- save_combined_background: whether to save the combined background image
- wfss_mmag_extract: only for WFSS exposures; sets min mag for extracting sources
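To make the roles of `sigma` and `maxiters` concrete, here is a minimal NumPy sketch of combining several background frames with iterative sigma clipping and subtracting the result from a target frame. The `clipped_mean` helper and all the data are hypothetical. This illustrates the general technique and is not the pipeline's own implementation.

```python
import numpy as np

def clipped_mean(stack, sigma=3.0, maxiters=5):
    """Sigma-clipped mean along axis 0 (illustrative helper, not pipeline code)."""
    data = np.array(stack, dtype=float)
    for _ in range(maxiters):
        center = np.nanmedian(data, axis=0)
        spread = np.nanstd(data, axis=0)
        outliers = np.abs(data - center) > sigma * spread
        if not outliers.any():
            break  # nothing left to reject
        data[outliers] = np.nan  # rejected samples are ignored from now on
    return np.nanmean(data, axis=0)

# Synthetic data: 12 background frames near 5 DN/s, one with a bad pixel.
rng = np.random.default_rng(seed=1)
demo_bkgs = rng.normal(5.0, 0.1, size=(12, 4, 4))
demo_bkgs[2, 1, 1] = 500.0  # e.g. a cosmic-ray hit
demo_target = np.full((4, 4), 12.0)

demo_master = clipped_mean(demo_bkgs, sigma=3.0)
demo_subtracted = demo_target - demo_master
print(demo_master[1, 1], demo_subtracted[1, 1])
```

With only a handful of frames a single extreme value inflates the standard deviation enough to survive a 3-sigma cut, which is one reason the threshold is left adjustable.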

In [None]:
output_dir = Path("stage2/output-steps/bkg_subtract")

In [None]:
results['bkg_subtract'] = BackgroundStep.call(
    results['init'],
    # common
    save_results=True, 
    output_dir=str(output_dir),
    # For this case, we are going to skip it.
    skip=True
)

<a id='assign_wcs'></a>
## Assign WCS

https://jwst-pipeline.readthedocs.io/en/latest/jwst/assign_wcs/index.html#assign-wcs-step

This step associates a WCS object with each science exposure. The WCS object transforms positions in the detector frame to positions in a world coordinate frame - ICRS and wavelength.

There are no step-specific optional arguments for AssignWCS.

In [None]:
output_dir = "stage2/output-steps/assign_wcs"

results['assign_wcs'] = AssignWcsStep.call(
    results["bkg_subtract"],
    # common
    save_results=True, 
    output_dir=str(output_dir),
    # step-specific - None
)
    

In [None]:
dm = results['assign_wcs']

In [None]:
fig, ax = plt.subplots(1, 1, subplot_kw={'projection': dm.meta.wcs})
img = np.nanmean(dm.data, axis=0)
imlims = dict(zip(['vmin', 'vmax'], np.nanquantile(img, [0.01, 0.99])))
ax.imshow(img, **imlims, origin='lower')

overlay = ax.get_coords_overlay('icrs')
overlay.grid(color='white', ls='dotted')
overlay[0].set_axislabel('Right Ascension (J2000)')
overlay[1].set_axislabel('Declination (J2000)')

<a id='flatfield'></a>
## Flat Fielding

https://jwst-pipeline.readthedocs.io/en/latest/jwst/flatfield/index.html#flatfield-step

At its most basic level, this step flat-fields an input science dataset by dividing by a flat-field reference image.

Step-specific arguments:

- save_interpolated_flat (boolean, default=False)
    - A flag to indicate whether to save to a file the NIRSpec flat field that was constructed on-the-fly by the step. Only relevant for NIRSpec data.
- user_supplied_flat (string, default=None)
    - The name of a user-supplied flat-field reference file.
- inverse (boolean, default=False)
    - A flag to indicate whether the math operations used to apply the flat field should be inverted (i.e. multiply instead of divide).
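The arithmetic behind the step, including the effect of `inverse`, can be sketched on a toy array. The `apply_flat` helper and the numbers are hypothetical. The real step also updates the error and data-quality arrays, which this sketch ignores.

```python
import numpy as np

def apply_flat(science, flat, inverse=False):
    """Divide by the flat, or multiply when inverse=True (illustrative only)."""
    return science * flat if inverse else science / flat

demo_flat = np.array([[1.0, 0.8], [1.25, 1.0]])  # relative pixel response
demo_true = np.full((2, 2), 10.0)                # uniform illumination
demo_raw = demo_true * demo_flat                 # what the detector records

demo_corrected = apply_flat(demo_raw, demo_flat)
demo_roundtrip = apply_flat(demo_corrected, demo_flat, inverse=True)

# Dividing "before" by "after" recovers the flat that was applied.
demo_recovered_flat = demo_raw / demo_corrected
print(demo_corrected)
```

The last line of the computation is the same ratio that the third panel of the flat-fielding plot in this notebook displays.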
    
<div class="alert alert-block alert-info">
    Note: The discovery of the stray light "glowstick" anomaly impacting the coronagraphs has made it challenging to derive a proper in-flight flat field reference file, since the stray light is an additive effect. The current flat field reference files were derived pre-flight. For this reason, they appear to *add* the glowsticks, especially in the 1065 and 1140 coronagraphs, but this is only a result of the decreased sensitivity along the quadrant boundaries.
</div>

In [None]:
output_dir = "stage2/output-steps/flat_field"

results['flat_field'] = FlatFieldStep.call(
    results['assign_wcs'],
    # common
    save_results=True, output_dir=str(output_dir),
    # step-specific
    save_interpolated_flat = False,
    user_supplied_flat = None,
    inverse = False    
)

In [None]:
nrows, ncols = 1, 3

fig, axes = plt.subplots(nrows=nrows, ncols=ncols, figsize=(6*ncols, 6*nrows))
fig.suptitle("Flat-fielding")

before_img = np.nanmean(results['assign_wcs'].data, axis=0)
after_img = np.nanmean(results['flat_field'].data, axis=0)
vmin, vmax = np.nanquantile([before_img, after_img], [0.01, 0.99])


ax = axes.flat[0]
ax.set_title("Before")
ax.imshow(before_img, vmin=vmin, vmax=vmax)

ax = axes.flat[1]
ax.set_title("After")
imax = ax.imshow(after_img, vmin=vmin, vmax=vmax)

ax = axes.flat[2]
ax.set_title("Flat field (reconstructed)")
imax = ax.imshow(before_img/after_img)

<a id='photom'></a>
## photom

https://jwst-pipeline.readthedocs.io/en/latest/jwst/photom/index.html#photom-step

The photom step applies flux (photometric) calibrations to a data product to convert the data from units of countrate to surface brightness (or, in some cases described in the documentation linked above, to units of flux density).

Step-specific arguments:

- inverse (boolean, default=False)
    - A flag to indicate whether the math operations used to apply the correction should be inverted (i.e. divide the calibration data into the science data, instead of the usual multiplication).

- source_type (string, default=None)
    - Force the processing to use the given source type (POINT, EXTENDED), instead of using the information contained in the input data.

- mrs_time_correction (boolean, default=True)
    - A flag to indicate whether to turn on the time and wavelength dependent correction for MIRI MRS data.
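For imaging-type data the core of the calibration is a multiplication by a conversion factor, which `inverse` turns into a division. The sketch below uses a made-up factor of 0.5 (MJy/sr) per (DN/s). The real value comes from the PHOTOM reference file, and `apply_photom` is a hypothetical helper, not part of the pipeline.

```python
import numpy as np

def apply_photom(rate, conversion, inverse=False):
    """Scale count rates to surface brightness, or undo it when inverse=True."""
    return rate / conversion if inverse else rate * conversion

demo_factor = 0.5  # hypothetical (MJy/sr) per (DN/s); not a real MIRI value
demo_rate = np.array([[2.0, 4.0], [6.0, 8.0]])  # DN/s

demo_sb = apply_photom(demo_rate, demo_factor)                # MJy/sr
demo_back = apply_photom(demo_sb, demo_factor, inverse=True)  # DN/s again
print(demo_sb)
```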
    

In [None]:
output_dir = "stage2/output-steps/photom"

results['photom'] = PhotomStep.call(
    results['flat_field'],
    # common
    save_results=True, output_dir=str(output_dir),
    # step-specific
)
    

Photometric calibration rescales the data by a conversion factor, so the before and after images should show the same structure and differ only by that factor and in their units.

In [None]:
nrows, ncols = 1, 2

fig, axes = plt.subplots(nrows=nrows, ncols=ncols, figsize=(6*ncols, 6*nrows))
fig.suptitle("Photometric calibration\nsame numerical scale, different units")

before_img = np.nanmean(results['flat_field'].data, axis=0)
after_img = np.nanmean(results['photom'].data, axis=0)
vmin, vmax = np.nanquantile([after_img, before_img], [0.01, 0.99])

ax = axes[0]
ax.set_title("Before")
imax = ax.imshow(before_img, vmin=vmin, vmax=vmax)
cbar = fig.colorbar(imax, ax=ax, orientation='horizontal')
cbar.set_label(results['flat_field'].meta.bunit_data)

ax = axes[1]
ax.set_title("After")
imax = ax.imshow(after_img, vmin=vmin, vmax=vmax)
cbar = fig.colorbar(imax, ax=ax, orientation='horizontal')
cbar.set_label(results['photom'].meta.bunit_data)

<a id='resample'></a>
## resample

<div class="alert alert-block alert-info">
    Resampling is not run on coronagraphic Stage 2 data products; it is only run on 2-D images. If you wish to see the results of `resample`, you may run this notebook starting from the `rate.fits` Stage 1 data product instead of `rateints.fits`, and set `skip=False`.
    The Stage 2 `resample` step is only for user inspection. It outputs an "_i2d.fits" file that is not passed to the Stage 3 pipeline.
</div>

https://jwst-pipeline.readthedocs.io/en/latest/jwst/resample/index.html#resample-step

This routine will resample each input 2D image based on the WCS and distortion information, and will combine multiple resampled images into a single undistorted product.

The resample step has the following optional arguments that control the behavior of the processing and the characteristics of the resampled image.

- pixfrac (float, default=1.0)
  - The fraction by which input pixels are “shrunk” before being drizzled onto the output image grid, given as a real number between 0 and 1.

- kernel (str, default=’square’)
  - The form of the kernel function used to distribute flux onto the output image. Available kernels are square, gaussian, point, tophat, turbo, lanczos2, and lanczos3.

- pixel_scale_ratio (float, default=1.0)
  - Ratio of output to input pixel scale. A value of 0.5 means the output image would have 4 pixels sampling each input pixel. Ignored when pixel_scale or output_wcs are provided.

- pixel_scale (float, default=None)
  - Absolute pixel scale in arcsec. When provided, overrides pixel_scale_ratio. Ignored when output_wcs is provided.

- rotation (float, default=None)
  - Position angle of output image’s Y-axis relative to North. A value of 0.0 would orient the final output image to be North up. The default of None specifies that the images will not be rotated, but will instead be resampled in the default orientation for the camera with the x and y axes of the resampled image corresponding approximately to the detector axes. Ignored when pixel_scale or output_wcs are provided.

- crpix (tuple of float, default=None)
  - Position of the reference pixel in the image array in the x, y order. If crpix is not specified, it will be set to the center of the bounding box of the returned WCS object. When supplied from command line, it should be a comma-separated list of floats. Ignored when output_wcs is provided.

- crval (tuple of float, default=None)
  - Right ascension and declination of the reference pixel. Automatically computed if not provided. When supplied from command line, it should be a comma-separated list of floats. Ignored when output_wcs is provided.

- output_shape (tuple of int, default=None)
  - Shape of the image (data array) using “standard” nx first and ny second (as opposite to the numpy.ndarray convention - ny first and nx second). This value will be assigned to pixel_shape and array_shape properties of the returned WCS object. When supplied from command line, it should be a comma-separated list of integers nx, ny.
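The `pixel_scale_ratio` example above (0.5 gives 4 output pixels per input pixel) follows from pixel area scaling with the square of the pixel size. A small sketch, taking the ratio as output pixel size over input pixel size (the reading consistent with that example), and using a hypothetical helper name:

```python
def output_pixels_per_input_pixel(pixel_scale_ratio):
    """Number of output pixels covering the area of one input pixel."""
    if pixel_scale_ratio <= 0:
        raise ValueError("pixel_scale_ratio must be positive")
    # Linear size scales with the ratio, so area scales with its square.
    return 1.0 / pixel_scale_ratio**2

for demo_ratio in (1.0, 0.5, 2.0):
    print(demo_ratio, output_pixels_per_input_pixel(demo_ratio))
```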

In [None]:
output_dir = Path("stage2/output-steps/resample")

In [None]:
try:
    # Use the .call() classmethod, as for the other steps, and pass the photom output
    results['resample'] = ResampleStep.call(
        results['photom'],
        # common
        save_results = True,
        output_dir=str(output_dir),
        skip=True,
        # step-specific
        pixfrac = 1.0,
        kernel = 'square',
        fillval = 'INDEF',
        weight_type = 'ivm',
        output_shape = None,
        crpix = None,
        crval = None,
        rotation = None,
        pixel_scale_ratio = 1.0,
        pixel_scale = None,
        output_wcs = '',
        single = False,
        blendheaders = True,
        allowed_memory = None,
        in_memory = True
    )
except TypeError as e:
    print("The `resample` step needs an ImageModel, not a CubeModel")
    print(f"Error message:\n\t{e}")

<a id='close_out'></a>
## Close it out

Finally, save the calibrated product so that it ends up with the proper *cal{ints}.fits* filename.

In [None]:
img2 = Image2Pipeline(output_dir="stage2/output-steps")

# Save the output of the last step that was actually run (photom) as the calints product
img2.save_model(results['photom'], 
                suffix="calints",
                output_file=results['photom'].meta.filename)
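For reference, the Stage 1 to Stage 2 naming convention swaps the `rate`/`rateints` suffix for `cal`/`calints`. The pipeline's `save_model` takes care of this itself; the hypothetical helper below only sketches the mapping, for example to predict the output name when scripting.

```python
from pathlib import Path

# Stage 1 suffix -> Stage 2 suffix
DEMO_SUFFIX_MAP = {"rateints": "calints", "rate": "cal"}

def stage2_name(stage1_name):
    """Return the expected Stage 2 filename for a Stage 1 filename (sketch)."""
    path = Path(stage1_name)
    root, _, suffix = path.stem.rpartition("_")
    if suffix not in DEMO_SUFFIX_MAP:
        raise ValueError(f"unrecognized Stage 1 suffix in {stage1_name!r}")
    return f"{root}_{DEMO_SUFFIX_MAP[suffix]}{path.suffix}"

demo_name = stage2_name("jw01386007001_04101_00001_mirimage_rateints.fits")
print(demo_name)
```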

For more pipeline notebooks, see the examples at https://github.com/spacetelescope/jwebbinar_prep/tree/main/pipeline_inflight