## Introduction to the JWST pipeline

This is a tutorial on running the JWST Science Calibration Pipeline (referred to as “the pipeline”) and individual pipeline steps.


### Outline


- Installation
- Pipeline Overview
- Running the pipeline or a single step on the command line
- Running the pipeline or a single step in a Python session
- Running a modified pipeline
- stpipe
- datamodels


## Installation

#### Initialize a conda environment

For this tutorial we will install `jwst` in a new conda environment called `jwst-test`, using Python 3.8. The tutorial assumes that conda is already installed.
The package installs its own dependencies, but Python and numpy must be installed in the conda environment first.

- Make and activate the new conda environment

*% conda create -n jwst-test python=3.8*

*% source activate jwst-test*

- Install numpy

*% pip install numpy==1.19*

- Another useful (not required) package is jupyter. It installs ipython and jupyter notebook.

*% pip install jupyter*

#### Install a public release of jwst from PyPI

The latest public release can be installed with one command.

*% pip install jwst*
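
To confirm which release ended up in the environment, one option is to query the package metadata from Python. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is made up for this example and is not part of `jwst`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the two packages this tutorial installs explicitly.
for name in ("jwst", "numpy"):
    found = installed_version(name)
    print(f"{name}: {found if found is not None else 'not installed'}")
```

Running this inside the `jwst-test` environment should print one line per package; a `not installed` entry usually means the wrong environment is active.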

#### Install the latest development code from the jwst repository

The source code for the JWST pipeline is hosted on GitHub in the spacetelescope organization:

https://github.com/spacetelescope/jwst

To install the latest code from GitHub, use the command:

*% pip install git+https://github.com/spacetelescope/jwst*


#### Calibration Reference Data System (CRDS) Setup

The pipeline uses reference data stored in CRDS. Each step retrieves the reference data it needs from CRDS at run time by matching certain keywords in the observation to keywords in the reference files. At STScI, CRDS and its cache are installed in a central location available to everyone. Outside STScI, we need to tell CRDS where to store its local cache and which server to use. This is done by setting two environment variables:

*% export CRDS_PATH=$HOME/crds_cache*

*% export CRDS_SERVER_URL=https://jwst-crds.stsci.edu*
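
When working in a Python session or notebook rather than a shell, the same two variables can be set through `os.environ`. This is a minimal sketch, assuming the cache path and server URL shown above; the usual advice is to set them before `jwst` or `crds` is imported, so this would go at the top of the session.

```python
import os
from pathlib import Path

# Defaults mirror the shell commands above. setdefault keeps any value that
# was already exported in the calling shell instead of overwriting it.
os.environ.setdefault("CRDS_PATH", str(Path.home() / "crds_cache"))
os.environ.setdefault("CRDS_SERVER_URL", "https://jwst-crds.stsci.edu")

print("CRDS_PATH       =", os.environ["CRDS_PATH"])
print("CRDS_SERVER_URL =", os.environ["CRDS_SERVER_URL"])
```

Using `setdefault` rather than plain assignment is a design choice: a value configured in the shell (for example the central cache at STScI) takes precedence over the defaults in the script.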

## Pipeline Overview

The JWST pipeline is broken into 3 stages. Each stage consists of a series of steps. This tutorial uses stages 1 and 2 of the pipeline as examples. 

- Stage 1: Detector-level corrections and ramp fitting for individual exposures

  The first stage, called `calwebb_detector1`, is applied nearly universally for all instruments and modes. It consists of detector-level corrections that are performed on a group-by-group basis, followed by ramp fitting. The output of stage 1 processing is a countrate image per exposure, or per integration for some modes. 


- Stage 2: Instrument-mode calibrations for individual exposures

  The second stage is split into separate modules for imaging and spectroscopic data, called `calwebb_image2` and `calwebb_spec2` respectively. It consists of additional instrument-level and observing-mode corrections and calibrations to produce fully calibrated exposures. The details differ for imaging and spectroscopic exposures, and there are some corrections that are unique to certain instruments or modes.
  
  
- Stage 3: Combining data from multiple exposures within an observation

  The third stage is divided into five separate modules depending on the observing mode. It consists of routines that work with multiple exposures and in most cases produce some kind of combined product. There are unique pipeline modules for stage 3 processing of imaging, spectroscopic, coronagraphic, AMI, and TSO observations.
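
As a compact summary of the three stages, the lookup below lists the pipeline module names per stage. The stage 1 and stage 2 names come from the text above. The stage 3 names are not given in this tutorial; they are listed here on the assumption that they follow the `calwebb_<mode>3` naming used in the pipeline documentation. The table and the helper `modules_for_stage` are illustrative only and are not part of the `jwst` package.

```python
# Stage number -> {observing mode: pipeline module name}.
# Stage 3 names are an assumption based on the calwebb_<mode>3 convention.
PIPELINE_STAGES = {
    1: {"all": "calwebb_detector1"},
    2: {"imaging": "calwebb_image2", "spectroscopic": "calwebb_spec2"},
    3: {
        "imaging": "calwebb_image3",
        "spectroscopic": "calwebb_spec3",
        "coronagraphic": "calwebb_coron3",
        "AMI": "calwebb_ami3",
        "TSO": "calwebb_tso3",
    },
}


def modules_for_stage(stage):
    """Return the sorted module names for one pipeline stage."""
    return sorted(PIPELINE_STAGES[stage].values())


for stage in sorted(PIPELINE_STAGES):
    print(f"Stage {stage}: {', '.join(modules_for_stage(stage))}")
```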

## Running the pipeline

#### Running from the command line

Individual steps and pipelines (each consisting of a series of steps) can be run from the command line using the `strun` command, passing either the class name or a configuration file, followed by an input file:

*% strun `class_name` `input_file`*

*% strun `configuration_file` `input_file`*

Each pipeline step has a configuration file. Before running the pipeline with configuration files, we need to "collect" them by running:

*% mkdir cfgs*

*% collect_pipeline_cfgs cfgs*

where `cfgs` is the destination directory where all configuration files are copied.

Using the stage 1 pipeline as an example:

*% strun cfgs/calwebb_detector1.cfg input_file.fits*

or, using the class name:

*% strun jwst.pipeline.Detector1Pipeline jw00017001001_01101_00001_nrca1_uncal.fits*

When a pipeline or step is executed in this manner (i.e. by referencing the class name), it will be run using a CRDS-supplied configuration merged with default values.
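
For batch work it can be convenient to launch `strun` from a script. The following is a minimal sketch using the standard-library `subprocess` module; the helper `build_strun_command` is hypothetical, and the pipeline is only launched when `strun` is on the `PATH` and the input file exists, so the snippet is safe to run in an environment without `jwst`.

```python
import shutil
import subprocess
from pathlib import Path


def build_strun_command(target, input_file):
    """Return the argument list for `strun <target> <input_file>`.

    `target` is either a class name or the path to a configuration file.
    """
    return ["strun", target, input_file]


input_file = "jw00017001001_01101_00001_nrca1_uncal.fits"
cmd = build_strun_command("jwst.pipeline.Detector1Pipeline", input_file)
print(" ".join(cmd))

# Launch only if strun is installed and the exposure is present locally.
if shutil.which("strun") is not None and Path(input_file).exists():
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
else:
    print("strun or the input file is not available; command not executed.")
```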

#### Running the pipeline with non-default values of parameters

If you want to use non-default parameter values, you can specify them as keyword arguments on the command line or set them in the appropriate configuration file.

To specify parameter values for an individual step when running a pipeline, use the syntax `--steps.<step_name>.<parameter>=value`. For example, to override the default selection of a dark current reference file from CRDS when running a pipeline:
    
*% strun jwst.pipeline.Detector1Pipeline jw00017001001_01101_00001_nrca1_uncal.fits --steps.dark_current.override_dark='my_dark.fits'*
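
The `--steps.<step_name>.<parameter>=value` pattern is regular enough to generate from a dictionary when several overrides are needed. This sketch only builds the argument strings; the function name `step_override_args` is invented for illustration and is not part of `jwst` or `stpipe`.

```python
def step_override_args(overrides):
    """Turn {step_name: {parameter: value}} into a list of --steps.* arguments."""
    args = []
    for step_name, params in overrides.items():
        for parameter, value in params.items():
            args.append(f"--steps.{step_name}.{parameter}={value}")
    return args


# Same override as the dark-current example in the text.
overrides = {"dark_current": {"override_dark": "my_dark.fits"}}
for arg in step_override_args(overrides):
    print(arg)
```

The resulting strings can be appended to an `strun` command line after the input file name.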
    

    

You can get a list of all the available arguments for a given pipeline or step by passing the `-h` (help) argument to `strun`:

*% strun dq_init.cfg -h*

*% strun jwst.pipeline.Detector1Pipeline -h*

#### Running the pipeline or a single step in a Python session

The pipeline or a single step can be executed from within a Python or IPython session by using the `call` method of the class:

```python
from jwst.pipeline import Detector1Pipeline

result = Detector1Pipeline.call('jw00017001001_01101_00001_nrca1_uncal.fits')
```

```python
from jwst.linearity import LinearityStep

result = LinearityStep.call('jw00001001001_01101_00001_mirimage_uncal.fits')
```
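
Non-default parameters can also be passed to `call`. The sketch below mirrors the dark-current override from the command-line section, assuming that `call` accepts a nested `steps` dictionary of the form `{step_name: {parameter: value}}` for pipelines, as described in the pipeline documentation for recent releases. The file `my_dark.fits` is a placeholder, and the run is skipped when `jwst` or the input files are not available.

```python
from pathlib import Path

INPUT_FILE = "jw00017001001_01101_00001_nrca1_uncal.fits"
DARK_FILE = "my_dark.fits"  # placeholder name for a user-supplied dark reference

# Nested dict: {step_name: {parameter: value}}, the Python counterpart of
# --steps.dark_current.override_dark on the command line.
step_parameters = {"dark_current": {"override_dark": DARK_FILE}}

try:
    from jwst.pipeline import Detector1Pipeline
except ImportError:
    Detector1Pipeline = None  # jwst is not installed in this environment

files_present = Path(INPUT_FILE).exists() and Path(DARK_FILE).exists()
if Detector1Pipeline is not None and files_present:
    # Assumption: the `steps` keyword is supported by the installed jwst version.
    result = Detector1Pipeline.call(INPUT_FILE, steps=step_parameters)
else:
    print("Skipping the run: jwst or the input files are not available.")
```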