<a id='top'></a>
# JWST Spectroscopic Data Calibration: Pipeline Stage 3
---
**Authors**:  
Jo Taylor (jotaylor@stsci.edu), Maria Pena-Guerrero (pena@stsci.edu)

Based on original notebooks by Bryan Hilbert (hilbert@stsci.edu)

**Latest Update**: June 10, 2022

<div class="alert alert-block alert-warning">
    <h3><u><b>Notebook Goals</b></u></h3>
<br>
Working with the Spectroscopic Stage 3 Calibration Pipeline, we will:
<ul>
    <li>Look at the different ways to call the pipeline</li>
    <li>Examine exactly what each pipeline step does to the science data</li>
    <li>Look in detail at the steps in common to all spectroscopic modes</li>
</ul>
<br>
<b><i>IMPORTANT</i></b>: The goal of this notebook is not to actually run the pipeline, since each step is only applied to specific JWST modes, but rather to illustrate <i>what</i> each step is doing and <i>how</i> you would run it if you need to. Any code below is provided to showcase how to run the pipeline or any steps, and is not intended to actually be executed.
</div>

## Table of Contents
* [Introduction](#intro)
* [Spectroscopy Modes](#spec_modes)
* [Pipeline Resources and Documentation](#resources)
   * [Installation](#installation)
   * [Reference Files](#reference_files)
* [Imports](#Imports_ID)
* [Association Files](#associations)
* [Methods for calling steps/pipelines](#calling_methods)
* [Parameter Reference Files](#parameter_reffiles)
* [calwebb_spec3](#spec3) 
   * [Run the entire pipeline](#spec3_at_once)
       * [Using the run() method](#run_method)
       * [Using the call() method](#call_method)
       * [Using the command line](#command_line)
   * [Run the individual pipeline steps](#spec3_step_by_step)
       * [The `Assign Moving Target WCS` step](#assign_mtwcs)
       * [The `Master Background Subtraction` step](#master_background)
       * [The `Exposure to Source` step](#exp_to_source)
       * [The `MIRI MRS Sky Matching` step](#mrs_imatch)
       * [The `Outlier Detection` step](#outlier_detection)
       * [The `Resample` step](#resample_spec)
       * [The `Cube Building` step](#cube_build)
       * [The `1D Extraction` step](#extract_1d)
       * [The `1D Combination` step](#combine_1d)

<a id='intro'></a>
## Introduction

This notebook covers the Stage 3 processing of the JWST spectroscopic data, also known as *calwebb_spec3*. This material should serve as a reference guide when you need to run *calwebb_spec3*, or any of its constituent steps, in the future. We will not address spectroscopic Time Series Observation (TSO) calibration, which is handled by a separate pipeline [*calwebb_tso3*](https://jwst-pipeline.readthedocs.io/en/latest/jwst/pipeline/calwebb_tso3.html#calwebb-tso3).

**_IMPORTANT_**: The goal of this notebook is not to actually run the pipeline, since each step is only applied to specific JWST modes, but rather to illustrate *what* each step is doing and *how* you would run it if you need to. Any code below is provided to showcase how to run the pipeline or any steps, and is not intended to actually be run.

The [*calwebb_spec3* pipeline](https://jwst-pipeline.readthedocs.io/en/stable/jwst/pipeline/calwebb_spec3.html) takes one or more calibrated slope images (`*_cal.fits` files) stored in an association file (detailed [below](#associations)) and outputs different types of files, depending on the spectroscopic mode in question. 

1D extracted spectral products, with suffix *_x1d*, are produced for all spectroscopic modes. Combined 1D spectral products, with suffix *_c1d*, are produced for WFSS and non-TSO SOSS modes. Resampled 2D products (*_s2d* files) are created for non-IFU modes, while resampled and combined 3D products (*_s3d* files) are created for IFU modes.

The steps performed in order to reach these final products vary with spectroscopic mode. Wide-Field Slitless Spectroscopy (WFSS) and non-TSO SOSS modes are minimally processed by *calwebb_spec3*, while NIRSpec and MIRI modes undergo more correction steps. 

All JWST spectroscopy mode data, except SOSS TSO data, are processed through the *calwebb_spec3* pipeline. The pipeline is a wrapper which will string together the appropriate steps for each exposure in the proper order.

<a id='spec_modes'></a>
## Spectroscopy Modes

The subset, and order, of steps run by *calwebb_spec3* depends on the mode of the input JWST exposure. There are eight different spectroscopic instrument mode combinations, listed below:
* NIRSpec FS = Fixed Slit
* NIRSpec MOS = Multi-Object Spectroscopy
* NIRSpec IFU = Integral Field Unit
* MIRI FS = LRS (Low Resolution Spectroscopy) Fixed Slit
* MIRI SL = LRS Slitless
* MIRI MRS = Medium Resolution Spectroscopy (IFU)
* NIRISS SOSS = Single Object Slitless Spectroscopy
* NIRISS and NIRCam WFSS = Wide-Field Slitless Spectroscopy

The list of all *calwebb_spec3* steps and which modes they apply to can be found [here](https://jwst-pipeline.readthedocs.io/en/latest/jwst/pipeline/calwebb_spec3.html#calwebb-spec3).

<a id='resources'></a>
## Pipeline Resources and Documentation

There are several different places to find information on installing and running the pipeline. This notebook will give a shortened description of the steps pulled from the detailed pipeline information pages, but to find more in-depth instructions use the links below.

* [JWST Documentation (JDox) for the Stage 3 pipeline](https://jwst-docs.stsci.edu/jwst-data-reduction-pipeline/stages-of-processing/calwebb_image3) including a short summary of what each step does.

* [High-level description of all pipeline stages and steps](https://jwst-pipeline.readthedocs.io/en/latest/jwst/pipeline/main.html)

* [`jwst` package documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/introduction.html) including how to run the pipeline, input/output files, etc.

* [`jwst` package GitHub repository, with installation instructions](https://github.com/spacetelescope/jwst/blob/master/README.md)

* [**Help Desk**](https://stsci.service-now.com/jwst?id=sc_cat_item&sys_id=27a8af2fdbf2220033b55dd5ce9619cd&sysparm_category=e15706fc0a0a0aa7007fc21e1ab70c2f): **If you have any questions or problems regarding the pipeline, submit a ticket to the Help Desk**

<a id='installation'></a>
### Installation

<div class="alert alert-block alert-info">
    During the JWebbinar, we will be working in a pre-existing environment where the <b>jwst</b> package has already been installed, so you won't need to install it yourself.
</div>

<div class="alert alert-block alert-warning">
If you wish to run this notebook outside of this JWebbinar, you will have to first install the <b>jwst</b> package.<br>

For more detailed instructions on the various ways to install the package, see the [installation instructions](https://github.com/spacetelescope/jwst/blob/master/README.md) on GitHub.

The easiest way to install the pipeline is via `pip`. Below we show how to create a new conda environment, activate that environment, and then install the latest released version of the pipeline. You can name your environment anything you like. In the lines below, replace `<env_name>` with your chosen environment name.

>`conda create -n <env_name> python`<br>
>`conda activate <env_name>`<br>
>`pip install jwst`

If you wish to install the development version of the pipeline, which is more recent than the latest released version but not as well tested:

>`conda create -n <env_name> python`<br>
>`conda activate <env_name>`<br>
>`pip install git+https://github.com/spacetelescope/jwst`
    
</div>

<a id='reference_files'></a>
### Reference Files

[Calibration reference files](https://jwst-docs.stsci.edu/data-processing-and-calibration-files/calibration-reference-files) are a collection of FITS and ASDF files that are used to remove instrumental signatures and calibrate JWST data. For example, the dark current reference file contains a multiaccum ramp of dark current signal to be subtracted from the data during the dark current subtraction step. 

When running a pipeline or pipeline step, the pipeline will automatically look for any required reference files in a pre-defined local directory. If the required reference files are not present, they will automatically be downloaded from the Calibration Reference Data System (CRDS) at STScI.

<div class="alert alert-block alert-info">
    During the JWebbinar, our pre-existing environment is set up to correctly use and store calibration reference files, and you do not need to set the environment variables below.
</div>
    
<div class="alert alert-block alert-warning">
If you wish to run this notebook outside of this JWebbinar, you will have to specify a local directory in which to store reference files, along with the server to use to download the reference files from CRDS. To accomplish this, there are two environment variables that should be set prior to calling the pipeline. These are the CRDS_PATH and CRDS_SERVER_URL variables. In the example below, reference files will be downloaded to the "crds_cache" directory under the home directory.

>`$ export CRDS_PATH=$HOME/crds_cache`<br>
>`$ export CRDS_SERVER_URL=https://jwst-crds-pub.stsci.edu`<br>
OR:<br>
`os.environ["CRDS_PATH"] = "/user/myself/crds_cache"`<br>
`os.environ["CRDS_SERVER_URL"] = "https://jwst-crds-pub.stsci.edu"`<br>

The first time you run the pipeline, the [CRDS server](https://jwst-pipeline.readthedocs.io/en/latest/jwst/introduction.html#crds) should download all of the context and reference files that are needed for that pipeline run, and dump them into the CRDS_PATH directory. Subsequent executions of the pipeline will first look to see if it has what it needs in CRDS_PATH and anything it doesn't have will be downloaded from the STScI cache. 
</div>

<div class="alert alert-block alert-warning">NOTE: Users at STScI should automatically have access to the Calibration Reference Data System (CRDS) cache for running the pipeline, and can skip setting these environment variables.</div>

[Top of Notebook](#top)

<a id='Imports_ID'></a>
## Imports

Import packages necessary for this notebook

In [None]:
# Packages that allow us to get information about objects:
import asdf
import json
import os
# Update to a path in your system (see details above at "Reference Files")
os.environ["CRDS_PATH"] = "./data/cal_ref_files"
os.environ["CRDS_SERVER_URL"] = "https://jwst-crds.stsci.edu"
# Astropy tools:
from astropy.io import fits

# JWST pipeline utilities
from jwst import datamodels
from jwst.associations import asn_from_list
from jwst.associations.lib.rules_level3_base import DMS_Level3_Base

# The entire calwebb_spec3 pipeline
from jwst.pipeline.calwebb_spec3 import Spec3Pipeline

# Individual steps that make up calwebb_spec3
from jwst.assign_mtwcs import AssignMTWcsStep
from jwst.master_background import MasterBackgroundStep
from jwst.exp_to_source import exp_to_source
from jwst.mrs_imatch import MRSIMatchStep
from jwst.outlier_detection import OutlierDetectionStep
from jwst.resample import ResampleSpecStep
from jwst.cube_build import CubeBuildStep
from jwst.extract_1d import Extract1dStep
from jwst.combine_1d import Combine1dStep

Check which version of the pipeline we are running:

In [None]:
import jwst
print(jwst.__version__)

Define the output directory.

In [None]:
# To make everything easier, all files saved by the pipeline
# and pipeline steps will be saved to the working directory.
output_dir = './'

# Make sure the output directory exists before saving any files
os.makedirs(output_dir, exist_ok=True)

[Top of Notebook](#top)

<a id='associations'></a>
## Association Files

The Stage 3 pipeline must be called using a JSON-formatted file called an ["association" file](https://jwst-pipeline.readthedocs.io/en/stable/jwst/associations/index.html). The association file presents your data files in organized groups. When retrieving your observations from MAST, you will be able to download the association files for your data along with the FITS files containing the observations.

<a id='diy_association'></a>
##### Creating your own association files

As we will see below, when calling the steps of the pipeline one at a time, you can provide each step with either the output object from the preceding step, or you can provide an association file. Generally the former is easier to do. 

Since each step outputs modified observation files, in order to provide an association file for each step, you'll have to make a new association file for each step that lists the new observation files. Here we will show an example of how to create a new association file for the Stage 3 pipeline or pipeline steps. Keep in mind that you may also simply make a copy of an existing association file and update the member filenames.

We'll use the [`asn_from_list()` function](https://jwst-pipeline.readthedocs.io/en/stable/api/jwst.associations.asn_from_list.asn_from_list.html#jwst.associations.asn_from_list.asn_from_list) to create the new association file designed to be used when running the entirety of the *calwebb_spec3* pipeline, rather than a single step. The same process is used to create an association file to be run through a single step as well. As input, we need the list of member files, as well as the product name. The product name will be prepended onto the output filenames by the pipeline or step that uses this association file.

In [None]:
# Define input cal files
input_files = ["myinput1_cal.fits", "myinput2_cal.fits", "myinput3_cal.fits"]

In [None]:
# Create the association object
out_asn = asn_from_list.asn_from_list(input_files, rule=DMS_Level3_Base, product_name="newoutput")

In [None]:
# Save the association to a json file
output_asn = os.path.join(output_dir, "manual_calwebb3_asn.json")
with open(output_asn, "w") as outfile:
    name, serialized = out_asn.dump(format='json')
    outfile.write(serialized)

You could see the contents of the association file by just printing the association object, `print(out_asn)`, but let's read it in to illustrate how to do that as well.

In [None]:
# Open the association file and load into a json object
with open(output_asn) as f_obj:
    asn_data = json.load(f_obj)

In [None]:
asn_data

Here we see that the association file begins with a few lines of data that give high-level information about the association. The most important entry here is the `asn_rule` field. Association files have different formats for the different stages of the pipeline. You should be sure that the `asn_rule` matches the pipeline that you will be running. In this case we'll be running the Stage 3 pipeline, and we see that the `asn_rule` mentions "Level3", which is what we want.

Beneath these lines, we see the `products` field. This field contains a list of dictionaries that specify the files that belong to this association, and the types of those files. When the Stage 3 pipeline is run on this association file, all files listed here will be run through the calibration steps.
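The layout described above can be explored with plain Python. The sketch below uses a hand-written stand-in for the JSON file rather than a real pipeline product; the field names (`asn_rule`, `products`, `members`, `expname`, `exptype`) follow the association format, while the file names are the placeholders used earlier in this notebook.

```python
import json

# Hand-written stand-in for an association file (illustrative only)
asn_text = """
{
  "asn_rule": "DMS_Level3_Base",
  "products": [
    {
      "name": "newoutput",
      "members": [
        {"expname": "myinput1_cal.fits", "exptype": "science"},
        {"expname": "myinput2_cal.fits", "exptype": "science"},
        {"expname": "mybackground.fits", "exptype": "background"}
      ]
    }
  ]
}
"""
asn = json.loads(asn_text)

# Check that the rule matches the pipeline stage we intend to run
is_level3 = "Level3" in asn["asn_rule"]

# Collect the science members across all products
science_files = [
    member["expname"]
    for product in asn["products"]
    for member in product["members"]
    if member["exptype"] == "science"
]
print(is_level3, science_files)
```

A quick check like this is a cheap way to confirm that a copied or hand-edited association still lists the files you expect before handing it to the pipeline.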

You may also want to supply other ancillary datasets when running *calwebb_spec3*. For instance, you may also include exposures of dedicated background targets for use in the master background subtraction step. Or you may supply source catalog files when processing WFSS data. To include ancillary data such as these, the input files should be a list of tuples where the 0th tuple element is the filename and 1st tuple element defines the filetype. You must also set the optional argument `with_exptype` to `True`. For example:

In [None]:
input_files = [("myinput1_cal.fits", "science"), 
               ("myinput2_cal.fits", "science"), 
               ("myinput3_cal.fits", "science"),
               ("mybackground.fits", "background"),
               ("mysourcecatalog.ecsv", "sourcecat")]

In [None]:
out_asn = asn_from_list.asn_from_list(input_files, rule=DMS_Level3_Base, product_name="newoutput", with_exptype=True)

In [None]:
out_asn

Another way to create an association file is to use the [asn_from_list command line tool](https://jwst-pipeline.readthedocs.io/en/stable/api/jwst.associations.asn_from_list.asn_from_list.html#jwst.associations.asn_from_list.asn_from_list). The first association file that we created above (science members only) can be created via the following command on the command line. **NOTE:** The command below only works if files matching `myinput?_cal.fits` exist in the working directory.

<div class="alert alert-block alert-info">
    
```
asn_from_list -o manual_calwebb3_asn.json --product-name newoutput myinput?_cal.fits
```  
</div>

You can see how creating a new association file for each step of the pipeline would become cumbersome. When running *calwebb_spec3* step-by-step, passing the preceding step's output object into each step is more convenient.

[Top of Notebook](#top)

<a id='calling_methods'></a>
## Methods for calling steps/pipelines

There are three common methods by which the pipeline or pipeline steps can be called. From within python, the `run()` and `call()` methods of the pipeline or step classes can be used. Alternatively, the `strun` command can be used from the command line. Examples of each method are shown below.

When using the `call()` method or `strun`, optional input parameters can be specified via [parameter reference files](#parameter_reffiles). When using the `run()` method, these parameters are instead specified within python.

<a id='parameter_reffiles'></a>
## Parameter Reference Files

When calling a pipeline or pipeline step using the `call()` method or the command line, [parameter reference files](https://jwst-pipeline.readthedocs.io/en/stable/jwst/stpipe/config_asdf.html#config-asdf-files) can be used to specify values for input parameters. These reference files are in [asdf](https://asdf.readthedocs.io/en/stable/) format and appear somewhat similar to json files when examined in a text editor. 

Versions of parameter reference files containing default parameter values for each step and pipeline are available in CRDS. When using the `call()` method, if you do not specify a parameter reference file name in the call, the pipeline or step will retrieve and use the appropriate file from CRDS, which will then run the pipeline or step with the parameter values in that file. If you provide the name of a parameter reference file, then the parameter values in that file will take precedence. For any parameter not specified in your parameter reference file, the pipeline will use the default value.

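When using `strun`, the first argument must be either the name of the pipeline or step class, or a parameter reference file; in the latter case, the class to run is read from the file.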
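The precedence rule for parameter values described above can be pictured as a dictionary merge. This is only an illustration of the ordering, not the pipeline's actual implementation; the parameter names come from this notebook, the values are placeholders rather than real CRDS defaults, and parameters from different steps are flattened into one dictionary for brevity.

```python
# Placeholder defaults standing in for a CRDS parameter reference file.
# These values are illustrative only, not the real pipeline defaults.
crds_defaults = {"grow": 1, "bkg_fit": None, "output_type": "band"}

# Parameters found in a user-supplied parameter reference file
user_params = {"bkg_fit": "poly"}


def resolve_parameters(defaults, overrides):
    """Return a copy of the defaults updated with user-supplied values."""
    resolved = dict(defaults)
    resolved.update(overrides)
    return resolved


resolved = resolve_parameters(crds_defaults, user_params)
print(resolved)
```

Only `bkg_fit` changes here, because it is the only parameter present in the user-supplied set; everything else keeps its default.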

Let's take a look at the contents of a parameter reference file. We'll open it using the asdf package, and use the `tree` attribute to see what's inside:

In [None]:
step = Spec3Pipeline()
spec3_param_reffile = os.path.join(output_dir, 'calwebb_spec3.asdf')
step.export_config(spec3_param_reffile)
spec3_reffile = asdf.open(spec3_param_reffile)

In [None]:
spec3_reffile.info(max_rows=None)
# or
# spec3_reffile.tree

The top part of the file contains various metadata entries about the file itself. Below that, you'll see a `'name'` entry, which lists `Spec3Pipeline` as the class to which these parameters apply. The next line contains the `parameters` entry, which lists parameters and values attached to the pipeline itself. Below this is the `steps` entry, which contains a list of dictionaries. Each dictionary refers to one step within the pipeline, and specifies parameters and values that apply to that step. If you look through these entries, you'll see the same parameters that can be set manually when using the `run()` method, shown [below](#run_method).

Here's one way to programmatically edit a parameter reference file, although it can be a bit tedious. It is often easier to open the ASDF file in a text editor and directly edit parameters.

In [None]:
for step_config in spec3_reffile["steps"]:
    step_params = step_config["parameters"]
    if step_config["name"] == "outlier_detection":
        # Set the radius (in pixels) from a bad pixel for neighbor rejection
        step_params["grow"] = 1
    elif step_config["name"] == "extract_1d":
        # Fit a polynomial to the background values for each column or row
        step_params["bkg_fit"] = "poly"
    elif step_config["name"] == "cube_build":
        # Change the method of building the data cube
        step_params["output_type"] = "multi"

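In [None]:
# Save the edited parameters to a new file, then close the original file
spec3_modified_paramfile = os.path.join(output_dir, "spec3_modified_paramfile.asdf")
spec3_reffile.write_to(spec3_modified_paramfile)
spec3_reffile.close()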

[Top of Notebook](#top)

---
<a id='spec3'></a>
## The calwebb_spec3 pipeline: spectroscopic processing

In the sections below, we will illustrate how to run the Stage 3 pipeline using an association file containing science and background exposures. The pipeline is a wrapper which will string together all of the appropriate steps in the proper order. 

After running the entire pipeline, we will go back to the original calibrated slope images and manually run them through each of the steps that comprise the Stage 3 pipeline. For each step we will describe in more detail what is going on and examine how the exposure files have changed.

See [Figure 1](https://jwst-docs.stsci.edu/jwst-data-reduction-pipeline/stages-of-processing/calwebb_spec3) on the calwebb_spec3 algorithm page for a map of which steps are performed depending on spectroscopic mode.

<a id='spec3_at_once'></a>
### Run the entire `calwebb_spec3` pipeline

In this section we show how to run the entire calwebb_spec3 pipeline with a single call using the `run()`, `call()`, and `strun` methods. In this case the pipeline code can determine which instrument was used to collect the data and runs the appropriate steps in the proper order.

You may set parameter values for some of the individual steps, save some outputs, etc, and then call the pipeline.


<a id='run_method'></a>
#### Call the pipeline using the run() method

When using the `run()` method to execute a pipeline (or step), the pipeline class is first instantiated without the data to be processed. Optional input parameters are specified using attributes of the class instance. Finally, the call to the `run()` method is made and the data are supplied.  See here for [more examples of the run() method](https://jwst-pipeline.readthedocs.io/en/stable/jwst/stpipe/call_via_run.html).

The `run()` method does not take any kind of parameter reference file as input. If you wish to set values for various parameters, you must do that manually. Below, we set several parameters in order to show how it's done. 

How do you know what parameters are available to be set and what their default values are? The `spec` property for individual steps will list them. The property is less useful for the pipelines themselves, as it does not show the parameters for the steps comprising the pipeline.
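As a minimal sketch, the `spec` property of a step class can be printed directly. The choice of `Extract1dStep` is arbitrary, and the import is guarded so that the cell is harmless in an environment where `jwst` is not installed.

```python
try:
    from jwst.extract_1d import Extract1dStep
except ImportError:
    Extract1dStep = None  # jwst is not installed in this environment

if Extract1dStep is not None:
    # Lists the step-specific parameters together with their default values
    print(Extract1dStep.spec)
```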

All steps and pipelines have several common parameters that can be set. 

* `save_results` specifies whether or not to save the output of that step/pipeline to a file. The default is False.
* `output_dir` is the directory into which the output files will be saved.
* `output_file` is the base filename to use for the saved result. Note that each step/pipeline will add a custom suffix onto output_file. 

<a id='spec3_using_run'></a>

<div class="alert alert-block alert-info">
Here's how you would run the entire <i>calwebb_spec3</i> pipeline. The output can be a little overwhelming. There will be multiple log entries printed to the screen for each step.
</div>

In [None]:
# Create an instance of the pipeline class
spec3 = Spec3Pipeline()

# Set some parameters that pertain to the entire pipeline
spec3.output_dir = output_dir
spec3.save_results = True

# Set some parameters that pertain to some of the individual steps
spec3.outlier_detection.grow = 1 # radius (in pixels) from a bad pixel for neighbor rejection
spec3.extract_1d.bkg_fit = 'poly' # Fit a polynomial to the background values for each column or row
spec3.cube_build.output_type = 'multi' # Change the method of building the data cube

# Call the run() method
# The line below is commented since we don't actually have a file to calibrate
# spec3.run(asn_file)

<a id='call_method'></a>
#### Call the pipeline using the call() method

When using the `call()` method, a single command will instantiate and run the pipeline (or step). The input data and optional parameter reference files are supplied in this single command. Unlike with the `run()` method, input parameters cannot be set on the instance after instantiation, so any desired values must be passed in the call itself. See here for [example usage of call() method](https://jwst-pipeline.readthedocs.io/en/stable/jwst/stpipe/call_via_call.html).

The commands below will call the pipeline using the `call()` method and will supply the parameter reference file. 

<div class="alert alert-block alert-info">

<b>Method #1:</b>
<br>
Provide the name of the observation file, the pipeline-specific input parameters, and the name of the parameter reference file that specifies step-specific parameters
</div>
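A sketch of what Method #1 could look like. The association file name `myinput_asn.json` is a placeholder, and the parameter reference file is assumed to be the `calwebb_spec3.asdf` file exported earlier in this notebook. As elsewhere, the call itself is commented out because there is no real data to calibrate.

```python
import os

# Placeholder inputs; substitute your own files
asn_file = "myinput_asn.json"
param_reffile = os.path.join("./", "calwebb_spec3.asdf")

# Pipeline-level parameters plus the parameter reference file
call_kwargs = {
    "output_dir": "./",
    "save_results": True,
    "config_file": param_reffile,
}

# The lines below are commented since we don't actually have a file to calibrate
# from jwst.pipeline import Spec3Pipeline
# result = Spec3Pipeline.call(asn_file, **call_kwargs)
```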

<div class="alert alert-block alert-info">

<b>Method #2:</b>
<br>
In this case, build a nested dictionary that specifies parameter values for various steps, and provide it in the call to call().
</div>
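A sketch of what Method #2 could look like, reusing the step parameters set in the `run()` example above. The association file name is a placeholder and the call is commented out because there is no real data to calibrate.

```python
# Placeholder input; substitute your own association file
asn_file = "myinput_asn.json"

# Nested dictionary: one entry per step, each holding that step's parameters
step_params = {
    "outlier_detection": {"grow": 1},
    "extract_1d": {"bkg_fit": "poly"},
    "cube_build": {"output_type": "multi"},
}

# The lines below are commented since we don't actually have a file to calibrate
# from jwst.pipeline import Spec3Pipeline
# result = Spec3Pipeline.call(asn_file, output_dir="./", save_results=True,
#                             steps=step_params)
```

Keeping the step parameters in a dictionary makes it easy to reuse the same settings across several associations.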

<a id='command_line'></a>
#### Call the pipeline from the command line

Calling a pipeline or step from the command line is similar to using the `call()` method. The data file to be processed, along with an optional parameter reference file and optional parameter/value pairs can be provided to the `strun` command. See here for [additional examples of command line calls](https://jwst-pipeline.readthedocs.io/en/stable/jwst/introduction.html?highlight=%22command%20line%22#running-from-the-command-line).

The cells below contain two different command line commands that use `strun` to call the calwebb_spec3 pipeline. 

<div class="alert alert-block alert-info">

<b>Method #1:</b>
<br>
We provide the name of the pipeline class, the observation file, and explicitly set pipeline- and step-specific parameters. You can see that the command quickly becomes quite large with the added parameter settings. 
    
```
    strun jwst.pipeline.Spec3Pipeline myinput_asn.json --save_results=True --output_dir='./' --steps.outlier_detection.grow=1 --steps.extract_1d.bkg_fit='poly'
```
</div>

<div class="alert alert-block alert-info">

<b>Method #2:</b>
<br>
This version of the command is much more succinct, as the parameter values to be set are all contained within the parameter reference file. The pipeline class is also contained in the parameter reference file, so there is no need to specify it in the command itself.
    
```
    strun spec3_modified_paramfile.asdf myinput_asn.json
```
</div>

[Top of Notebook](#top)

<a id='spec3_step_by_step'></a>
## Run the individual pipeline steps

In the sections below, we explain what each step does and illustrate how to run it. We show only the `run()` method of executing steps, but the `call()` or `strun` methods could be used with the same syntax as shown above.

<a id='assign_mtwcs'></a>
## The `Assign Moving Target WCS` step

#### Summary
Since a moving target will be located at different RA and Dec across multiple exposures within a Moving Target (MT) observation, this step modifies the WCS output frame in each exposure of the MT observation association. This way the WCS is centered at the average location of the target within the whole association, which results in the correct alignment of multiple exposures.

The step is executed at the beginning of the pipeline, so that all subsequent steps that rely on WCS information use the frame centered on the target.

#### Affected Modes
Assuming the target is moving, this step is run for the following modes:
- NIRSpec: all
- MIRI: all
- NIRISS: all
- NIRCam: all

#### Documentation
For further details please see the [full documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/assign_mtwcs/main.html).

#### Arguments
This step does not take any specific arguments.

#### Reference files used
This step uses no specific reference files.

**Run the step**
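A minimal sketch of how this step could be run with the `run()` method, following the pattern used for the full pipeline. The association file name is a placeholder, the import is guarded so the cell is harmless where `jwst` is not installed, and the `run()` call is commented out because there is no real data to process.

```python
# Placeholder input; substitute your own association file
asn_file = "myinput_asn.json"

try:
    from jwst.assign_mtwcs import AssignMTWcsStep
except ImportError:
    AssignMTWcsStep = None  # jwst is not installed in this environment

if AssignMTWcsStep is not None:
    # Instantiate the step and set the common parameters
    mtwcs_step = AssignMTWcsStep()
    mtwcs_step.output_dir = "./"
    mtwcs_step.save_results = True

    # The line below is commented since we don't actually have a file to calibrate
    # mtwcs_result = mtwcs_step.run(asn_file)
```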

[Top of Notebook](#top)

<a id='master_background'></a>
## The `Master Background Subtraction` step

#### Summary
This step is typically run during *calwebb_spec3* for all modes **except** NIRSpec MOS, for which the step is instead run in *calwebb_spec2*.

In general, the master background subtraction method works by taking a 1D background spectrum, interpolating it back into the 2D space of a science image, and then subtracting it. This allows for higher SNR background data to be used than what might be obtainable by doing direct image-from-image subtraction using only one or a few background images. The 1D master background spectrum can either be constructed on-the-fly by the calibration pipeline using available background data or supplied by the user.

The master background subtraction is different from the background subtraction step in the *calwebb_spec2* processing, which is an image-from-image subtraction and is the default for spectroscopic data. Alternatively, the user can choose to skip that step in *calwebb_spec2* and do the master background subtraction instead.

#### Affected Modes
This step is run for the following modes:
- NIRSpec: FS, IFU
- MIRI: all
- NIRISS: none
- NIRCam: none

#### Documentation
For further details please see the [full documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/master_background/description.html).

#### Arguments
This step has several [specific arguments](https://jwst-pipeline.readthedocs.io/en/latest/jwst/master_background/arguments.html).

#### Reference files used
This step uses no specific reference files.

**Run the step**
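A minimal sketch of how this step could be run with the `run()` method. The association file name is a placeholder. The `save_background` parameter is our reading of the arguments page linked above, so check it against the documentation for your installed version. The `run()` call is commented out because there is no real data to process.

```python
# Placeholder input; substitute your own association file
asn_file = "myinput_asn.json"

try:
    from jwst.master_background import MasterBackgroundStep
except ImportError:
    MasterBackgroundStep = None  # jwst is not installed in this environment

if MasterBackgroundStep is not None:
    # Instantiate the step and set the common parameters
    mbkg_step = MasterBackgroundStep()
    mbkg_step.output_dir = "./"
    mbkg_step.save_results = True
    # Assumed step argument: also save the computed 1D master background
    mbkg_step.save_background = True

    # The line below is commented since we don't actually have a file to calibrate
    # mbkg_result = mbkg_step.run(asn_file)
```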

[Top of Notebook](#top)

<a id='exp_to_source'> </a>
## The `Exposure to Source` step

#### Summary
This step reorganizes the Stage 2 exposure-based data to Stage 3 source-based data. It is only used when there is a known source list for the exposure data, which is required in order to reorganize the data by source. Hence it is only useable for NIRSpec MOS and FS, and WFSS data. The inputs are the calibrated products from *calwebb_spec2* that are ordered by exposure. 

The `exp_to_source` step re-arranges the data to return a python dictionary that contains all slits belonging to the same source. For example, if the input consists of a set of 3 exposure-based *_cal* files (from a 3-point nod dither pattern, for example), each one of which contains data for 5 defined sources, then the output consists of a set of 5 source-based *_cal* products (one per source), each of which contains the data from the 3 exposures for each source.
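The regrouping in the example above can be pictured with plain dictionaries. This toy sketch does not use the pipeline's data models: the exposure and source names are made up, and each "slit" is just a descriptive string. The real work is done by the `exp_to_source` function imported at the top of this notebook.

```python
from collections import defaultdict

n_exposures, n_sources = 3, 5

# Exposure-based layout: each exposure holds one slit per source
exposures = {
    f"exposure{e}": {
        f"source{s}": f"slit(exposure{e}, source{s})"
        for s in range(1, n_sources + 1)
    }
    for e in range(1, n_exposures + 1)
}


def regroup_by_source(exposure_products):
    """Collect the slits of every exposure under their source name."""
    sources = defaultdict(list)
    for slits in exposure_products.values():
        for source_name, slit in slits.items():
            sources[source_name].append(slit)
    return dict(sources)


by_source = regroup_by_source(exposures)
for source_name, slits in by_source.items():
    print(source_name, len(slits), "slits")
```

The three exposure-based products become five source-based products, each holding three slits.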

#### Affected Modes
This step is run for the following modes:
- NIRSpec: FS, MOS
- MIRI: none
- NIRISS: WFSS
- NIRCam: WFSS

#### Documentation
For further details please see the [full documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/exp_to_source/index.html#exp-to-source).

#### Arguments
This step does not take any specific arguments.

#### Reference files used
This step uses no specific reference files.

**Run the step**

[Top of Notebook](#top)

<a id='mrs_imatch'> </a>
## The `MIRI MRS Sky Matching` step

#### Summary
This step “matches” image intensities of several input 2D MIRI MRS images by fitting polynomials to cube intensities (cubes built from the input 2D images), in such a way as to minimize inter-image mismatches in intensity. The “background matching” polynomials are defined in the frame of world coordinates (e.g. RA, DEC, lambda).

By default, the matching "backgrounds" are not subtracted from the exposure data.

#### Affected Modes
This step is run for the following modes:
- NIRSpec: none
- MIRI: MRS
- NIRISS: none
- NIRCam: none

#### Documentation
For further details please see the [full documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/mrs_imatch/index.html#mrs-imatch-step).

#### Arguments
This step takes two optional arguments: `bkg_degree` (integer) which is the polynomial degree with default=1, and `subtract` (boolean) with default=`False`, which indicates if the computed matching "backgrounds" are to be subtracted from the image data.

#### Reference files used
This step uses no specific reference files.

**Run the step**

[Top of Notebook](#top)

<a id='outlier_detection'></a>
## The `Outlier Detection` step

#### Summary

This step uses the collection of input files to identify and flag any cosmic rays or other transient image artifacts that were not flagged by the jump step in *calwebb_detector1*. While the jump step looks for large pixel-based deviations in the signal from group to group within an integration, the outlier detection step looks for large sky-based deviations in the signal from exposure to exposure. If a given location on the sky shows no signal above the noise in 4 out of 5 exposures, but a bright source in the remaining exposure, the outlier detection step will flag, in the DQ map, the pixels containing the source in the fifth exposure. These pixels will be ignored when the exposures are later combined into a final product.
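
The 4-out-of-5 scenario above can be mimicked with a small sketch. It assumes the exposures are already aligned pixel by pixel and share a single known noise level, and it compares each pixel with the median across exposures. The real step is more involved, and the `flag_outliers` helper, the `snr` threshold, and the flag value used here are illustrative assumptions.

```python
from statistics import median

DO_NOT_USE = 1  # illustrative flag value for this sketch


def flag_outliers(exposures, noise, snr=5.0):
    """Return one DQ list per exposure, flagging pixels that deviate from
    the cross-exposure median by more than snr * noise."""
    n_pix = len(exposures[0])
    medians = [median(exp[i] for exp in exposures) for i in range(n_pix)]
    return [
        [DO_NOT_USE if abs(value - med) > snr * noise else 0 for value, med in zip(exp, medians)]
        for exp in exposures
    ]


# Five aligned toy exposures; the last one has a transient in pixel 2
exposures = [
    [1.0, 0.5, 0.8, 1.2],
    [0.9, 0.7, 1.1, 1.0],
    [1.1, 0.4, 0.9, 0.8],
    [1.0, 0.6, 1.0, 1.1],
    [0.8, 0.5, 50.0, 0.9],
]
dq = flag_outliers(exposures, noise=1.0)
print(dq[4])
```

Only the discrepant pixel in the fifth exposure is flagged; the same pixel in the other four exposures is left untouched.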

#### Affected Modes
This step is run for the following modes:
- NIRSpec: all
- MIRI: all
- NIRISS: none
- NIRCam: none

#### Documentation

[Full description](https://jwst-pipeline.readthedocs.io/en/stable/jwst/outlier_detection/main.html) of the step.

#### Arguments

There are [numerous optional arguments](https://jwst-pipeline.readthedocs.io/en/stable/jwst/outlier_detection/arguments.html) for this step, including several that apply to the resample step, which this step makes use of.


#### Reference files used

This step does not use any step-specific reference files.

**Run the step**

[Top of Notebook](#top)

<a id='resample_spec'> </a>
## The `Resample Spec` step

#### Summary
This step resamples each input 2D image based on the WCS and distortion information, and combines multiple resampled images into a single undistorted product.
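
As a rough one-dimensional analogy of "resample, then combine", the sketch below linearly interpolates two spectra onto a common wavelength grid and averages them. Linear interpolation is swapped in here purely for illustration; it is not the drizzle-style algorithm the step uses, and the helper names are hypothetical.

```python
def interpolate(x_new, x, y):
    """Linearly interpolate y(x) at x_new; x must be increasing and bracket x_new."""
    for (x0, y0), (x1, y1) in zip(zip(x, y), zip(x[1:], y[1:])):
        if x0 <= x_new <= x1:
            t = (x_new - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    raise ValueError("x_new lies outside the input range")


def resample_and_combine(spectra, grid):
    """Resample each (wavelength, flux) pair onto grid, then average point by point."""
    resampled = [[interpolate(w, wave, flux) for w in grid] for wave, flux in spectra]
    return [sum(column) / len(column) for column in zip(*resampled)]


# Two toy spectra sampled at different wavelengths
spectra = [
    ([1.0, 2.0, 3.0], [10.0, 20.0, 30.0]),
    ([0.5, 1.5, 2.5, 3.5], [7.0, 17.0, 27.0, 37.0]),
]
combined = resample_and_combine(spectra, grid=[1.0, 2.0, 3.0])
print(combined)
```

The key point carried over from the real step is the order of operations: every input is first placed on a shared output grid, and only then are the inputs combined.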

#### Affected Modes
This step is run for the following modes:
- NIRSpec: FS, MOS
- MIRI: FS
- NIRISS: none
- NIRCam: none

#### Documentation
[Full description](https://jwst-pipeline.readthedocs.io/en/stable/jwst/resample/main.html) of the step.

#### Arguments
There is a list of [optional Astrodrizzle-style](https://jwst-pipeline.readthedocs.io/en/stable/jwst/resample/arguments.html) input parameters that can be used to customize the resampling process.

#### Reference files used
This step uses the [`DRIZPARS`](https://jwst-pipeline.readthedocs.io/en/stable/jwst/resample/reference_files.html) reference file. This file contains Astrodrizzle-style keywords that can be used to control the details of the resampling.

**Run the step**

[Top of Notebook](#top)

<a id='cube_build'> </a>
## The `Cube Building` step

#### Summary
This step is applied to MIRI or NIRSpec IFU calibrated 2D images to produce 3D spectral cubes. The 2D disjointed IFU slice spectra are corrected for distortion and assembled into a rectangular cube with three orthogonal axes: two spatial and one spectral.

This step is run in both the *calwebb_spec2* and *calwebb_spec3* pipelines. In *calwebb_spec3*, where the input can be a collection of data from multiple exposures covering multiple bands, the default behavior is to create a set of single-band cubes. For MIRI, for example, this can mean separate cubes for bands 1A, 2A, 3A, 4A, 1B, 2B, …, 3C, 4C, depending on what’s included in the input. For NIRSpec this may mean multiple cubes, one for each grating+filter combination contained in the input collection. The *calwebb_spec3* pipeline calls `cube_build` with `output_type=band`. These IFU cubes have a linear wavelength dimension. To combine all the data covering several bands into one cube, the user can set `output_type=multi`; the resulting IFU cubes have a non-linear wavelength dimension.
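
The assembly of a rectangular cube with two spatial axes and one spectral axis can be pictured with a toy binning scheme. The sketch assumes the distortion correction has already produced a list of (wavelength, y, x, flux) points and simply averages the points that fall in each voxel of a linearly sampled grid. The actual step uses weighted combination methods, so the `build_cube` helper below is only an illustrative assumption.

```python
def build_cube(points, shape, origin, step):
    """Average point fluxes into a regular (wave, y, x) grid.
    Voxels that receive no points are left as None."""
    n_w, n_y, n_x = shape
    sums = [[[0.0] * n_x for _ in range(n_y)] for _ in range(n_w)]
    counts = [[[0] * n_x for _ in range(n_y)] for _ in range(n_w)]
    for wave, y, x, flux in points:
        # Index along each axis on a linearly sampled grid
        k, j, i = (int((c - o) // s) for c, o, s in zip((wave, y, x), origin, step))
        if 0 <= k < n_w and 0 <= j < n_y and 0 <= i < n_x:
            sums[k][j][i] += flux
            counts[k][j][i] += 1
    return [
        [
            [sums[k][j][i] / counts[k][j][i] if counts[k][j][i] else None for i in range(n_x)]
            for j in range(n_y)
        ]
        for k in range(n_w)
    ]


# Toy point cloud: (wavelength, y, x, flux)
points = [
    (5.05, 0.25, 0.25, 2.0),
    (5.05, 0.25, 0.30, 4.0),
    (5.15, 0.75, 0.75, 6.0),
]
cube = build_cube(points, shape=(2, 2, 2), origin=(5.0, 0.0, 0.0), step=(0.1, 0.5, 0.5))
print(cube[0][0][0], cube[1][1][1])
```

The first two points land in the same voxel and are averaged, while the third fills a voxel in the second wavelength plane.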

#### Affected Modes
This step is run for the following modes:
- NIRSpec: IFU
- MIRI: MRS
- NIRISS: none
- NIRCam: none

#### Documentation
For further details please see the [full documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/cube_build/main.html).

#### Arguments
There are a number of arguments that control the sampling size of the cube, as well as the type of data that is combined to create the cube. See all the [optional arguments](https://jwst-pipeline.readthedocs.io/en/stable/jwst/cube_build/arguments.html).

#### Reference files used
This step uses the [`CUBEPAR` reference file](https://jwst-pipeline.readthedocs.io/en/stable/jwst/cube_build/reference_files.html).


**Run the step**

[Top of Notebook](#top)

<a id='extract_1d'> </a>
## The `1D Extraction` step

#### Summary
This step extracts a 1D signal from a 2D or 3D dataset and writes spectral data to an output 1D extracted spectral data product. For all non-TSO modes, this is a file with suffix *_x1d*, and for TSO modes the suffix is *_x1dints*. 

For non-WFSS modes, a reference file is used to specify the location and size of the target and background extraction regions. For WFSS modes, the extraction regions are defined as the 2D subarray/cutout for each source.
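
A minimal box extraction with background subtraction might look like the sketch below. It assumes a rectified 2D image with the dispersion direction along columns, and it takes the source and background regions as plain row indices rather than from a reference file. The `box_extract` helper is hypothetical and much simpler than the real step.

```python
def box_extract(image, src_rows, bkg_rows):
    """For each column, subtract the mean of the background rows
    and sum the remaining signal over the source rows."""
    n_cols = len(image[0])
    spectrum = []
    for col in range(n_cols):
        background = sum(image[r][col] for r in bkg_rows) / len(bkg_rows)
        spectrum.append(sum(image[r][col] - background for r in src_rows))
    return spectrum


image = [
    [1.0, 1.0, 1.0],  # background row
    [3.0, 5.0, 4.0],  # source row
    [4.0, 6.0, 5.0],  # source row
    [1.0, 1.0, 1.0],  # background row
]
spectrum = box_extract(image, src_rows=[1, 2], bkg_rows=[0, 3])
print(spectrum)
```

Each output value corresponds to one wavelength bin (one column), which is how a 2D dataset collapses into a 1D spectrum.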

#### Affected Modes
This step is run for the following modes:
- NIRSpec: all
- MIRI: all
- NIRISS: all
- NIRCam: all

#### Documentation
For further details please see the [full documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/extract_1d/index.html#extract-1d-step).

#### Arguments
This step takes several [specific arguments](https://jwst-pipeline.readthedocs.io/en/latest/jwst/extract_1d/arguments.html).

#### Reference files used
This step uses the [`EXTRACT1D` and `APCORR` reference files](https://jwst-pipeline.readthedocs.io/en/stable/jwst/extract_1d/reference_files.html).


**Run the step**

[Top of Notebook](#top)

<a id='combine_1d'> </a>
## The `1D Spectral Combination` step

#### Summary
This step computes a weighted average of 1D spectra and writes the combined 1D spectrum output.

For each pixel of each input spectrum, the corresponding pixel in the output is identified (based on wavelength), and the input value multiplied by the weight is added to the output buffer. Pixels that are flagged (via the DQ column) with “DO_NOT_USE” will not contribute to the output. After all input spectra have been included, the output is normalized by dividing by the sum of the weights. The weight will typically be the integration time or the exposure time, but uniform (unit) weighting can be specified instead.

The output wavelengths will be increasing, regardless of the order of the input wavelengths. In the ideal case, all input spectra would have wavelength arrays that were very nearly the same. In this case, each output wavelength would be computed as the average of the wavelengths at the same pixel in all the input files. The combine_1d step is intended to handle a more general case where the input wavelength arrays may be offset with respect to each other, or they might not align well due to different distortions. All the input wavelength arrays will be concatenated and then sorted. The code then looks for “clumps” in wavelength, based on the standard deviation of a slice of the concatenated and sorted array of input wavelengths; a small standard deviation implies a clump. In regions of the spectrum where the input wavelengths overlap with somewhat random offsets and don’t form any clumps, the output wavelengths are computed as averages of the concatenated, sorted input wavelengths taken N at a time, where N is the number of overlapping input spectra at that point.
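
The weighted average described above can be sketched for the ideal case in which all input spectra already share the same wavelength grid. The wavelength "clump" logic for offset grids is not reproduced here. The `combine_spectra` helper, the toy exposure times, and the flag value are assumptions made for illustration.

```python
def combine_spectra(fluxes, dq_arrays, weights, do_not_use=1):
    """Weighted average per pixel, skipping inputs flagged DO_NOT_USE.
    Pixels with no valid input are set to NaN."""
    n_pix = len(fluxes[0])
    combined = []
    for i in range(n_pix):
        total = 0.0
        weight_sum = 0.0
        for flux, dq, weight in zip(fluxes, dq_arrays, weights):
            if dq[i] & do_not_use:
                continue  # flagged pixels do not contribute
            total += weight * flux[i]
            weight_sum += weight
        # Normalize by the sum of the weights that actually contributed
        combined.append(total / weight_sum if weight_sum > 0 else float("nan"))
    return combined


fluxes = [[2.0, 4.0, 6.0], [4.0, 8.0, 100.0]]
dq_arrays = [[0, 0, 0], [0, 0, 1]]  # last pixel of the second spectrum is flagged
weights = [100.0, 300.0]            # e.g. toy exposure times in seconds
combined = combine_spectra(fluxes, dq_arrays, weights)
print(combined)
```

Passing equal weights instead of exposure times would correspond to the uniform (unit) weighting option mentioned above.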

#### Affected Modes
This step is run for the following modes:
- NIRSpec: none
- MIRI: none
- NIRISS: all
- NIRCam: all

#### Documentation
For further details please see the [full documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/combine_1d/index.html#combine-1d-step).

#### Arguments
This step takes one argument: `exptime_key` (string), which identifies the metadata element (or FITS keyword) for the weight to apply to the input data. The default is the integration time. For further details see the [documentation](https://jwst-pipeline.readthedocs.io/en/latest/jwst/combine_1d/arguments.html).

#### Reference files used
This step does not use any step-specific reference files.

**Run the step**

[Top of Notebook](#top)