# Running the _JWST_ Pipeline

### Environment set-up:

First, cd to the folder that contains the relevant uncal.fits file.

Then, in the tcsh shell, run the following four lines to create a new Python environment named 'jwst_env'. They also install the ipykernel package and register a kernel for 'jwst_env' so that it can be used in a Jupyter Notebook like this one. With that kernel selected, all later blocks in this notebook run within 'jwst_env', including any installs using pip, such as the one that follows.

In [None]:
conda create -n jwst_env python=3.11
conda activate jwst_env
conda install -c anaconda ipykernel
python -m ipykernel install --user --name=jwst_env

In [1]:
%pip install jwst

Collecting jwst
  Using cached jwst-1.18.1-cp311-cp311-macosx_10_9_x86_64.whl.metadata (32 kB)
Collecting asdf<5,>=3.3 (from jwst)
  Using cached asdf-4.2.0-py3-none-any.whl.metadata (12 kB)
Collecting astropy>=6.1 (from jwst)
  Using cached astropy-7.1.0-cp311-cp311-macosx_10_9_x86_64.whl.metadata (10 kB)
Collecting BayesicFitting>=3.2.2 (from jwst)
  Using cached BayesicFitting-3.2.3-py3-none-any.whl.metadata (17 kB)
Collecting crds>=12.0.3 (from jwst)
  Using cached crds-12.1.10-py3-none-any.whl.metadata (7.4 kB)
Collecting drizzle>=2.0.1 (from jwst)
  Using cached drizzle-2.0.1-cp311-cp311-macosx_10_9_x86_64.whl.metadata (10 kB)
Collecting gwcs<0.25.0,>=0.24.0 (from jwst)
  Using cached gwcs-0.24.0-py3-none-any.whl.metadata (6.3 kB)
Collecting numpy>=1.25 (from jwst)
  Using cached numpy-2.3.1-cp311-cp311-macosx_14_0_x86_64.whl.metadata (62 kB)
Collecting opencv-python-headless>=4.6.0.66 (from jwst)
  Using cached opencv_python_headless-4.12.0.88-cp37-abi3-macosx_13_0_x86_64.whl.me

In [2]:
%pip install jupyter

Collecting jupyter
  Using cached jupyter-1.1.1-py2.py3-none-any.whl.metadata (2.0 kB)
Collecting notebook (from jupyter)
  Using cached notebook-7.4.4-py3-none-any.whl.metadata (10 kB)
Collecting jupyter-console (from jupyter)
  Using cached jupyter_console-6.6.3-py3-none-any.whl.metadata (5.8 kB)
Collecting nbconvert (from jupyter)
  Using cached nbconvert-7.16.6-py3-none-any.whl.metadata (8.5 kB)
Collecting ipywidgets (from jupyter)
  Using cached ipywidgets-8.1.7-py3-none-any.whl.metadata (2.4 kB)
Collecting jupyterlab (from jupyter)
  Using cached jupyterlab-4.4.4-py3-none-any.whl.metadata (16 kB)
Collecting widgetsnbextension~=4.0.14 (from ipywidgets->jupyter)
  Using cached widgetsnbextension-4.0.14-py3-none-any.whl.metadata (1.6 kB)
Collecting jupyterlab_widgets~=3.0.15 (from ipywidgets->jupyter)
  Using cached jupyterlab_widgets-3.0.15-py3-none-any.whl.metadata (20 kB)
Collecting async-lru>=1.0.0 (from jupyterlab->jupyter)
  Using cached async_lru-2.0.5-py3-none-any.whl.metada

Note the % sign that is used before these commands. A single % marks an IPython line magic: %pip runs pip from within the notebook and installs into the environment of the current kernel. A double %% marks a cell magic, which applies to the entire block; for example, %%bash runs the whole block as command line prompts in a bash shell.

Note: do not use _pip install jwst --upgrade_ -- instead, uninstall and reinstall to ensure all dependency packages are also upgraded.
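To confirm that the notebook kernel really is running in 'jwst_env', and to see which version of jwst ended up installed, a short check along the following lines may help. This is a minimal sketch using only the standard library; the helper name `installed_version` is illustrative and is not part of jwst.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# The interpreter path should point inside the 'jwst_env' environment.
print("Python executable:", sys.executable)
print("jwst version:", installed_version("jwst"))
```

If the printed path does not contain 'jwst_env', the notebook is probably using a different kernel than the one registered above.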

### Setting environment variables for the Calibration References Data System (CRDS):

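Set the variables in a bash shell (started with the command 'bash', without the %%) or, as discussed previously, from the notebook using the following Jupyter block. Note that variables exported in a %%bash block exist only in that block's own shell process; they do not carry over to the notebook's Python kernel, so use 'os.environ' (shown later in this section) if the pipeline will be run from the notebook.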

In [27]:
%%bash
mkdir $HOME/crds_cache
export CRDS_PATH=$HOME/crds_cache
export CRDS_SERVER_URL=https://jwst-crds.stsci.edu

mkdir: /Users/jake/crds_cache: File exists


Alternatively, the two environment variables can be set in tcsh using:

In [None]:
setenv CRDS_PATH $HOME/crds_cache
setenv CRDS_SERVER_URL https://jwst-crds.stsci.edu

Using the command 'printenv', confirm that the two environment variables have been correctly set.  They must be defined before importing any _JWST_ or CRDS software packages.
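The same check can be made from inside a Python session. The sketch below is one possible way to do it, not part of the CRDS software; the names `REQUIRED_CRDS_VARS` and `missing_crds_vars` are made up for this illustration.

```python
import os

# The two variables this section asks for.
REQUIRED_CRDS_VARS = ("CRDS_PATH", "CRDS_SERVER_URL")


def missing_crds_vars(env=None):
    """Return the names of required CRDS variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_CRDS_VARS if not env.get(name)]


missing = missing_crds_vars()
if missing:
    print("Not set:", ", ".join(missing))
else:
    print("Both CRDS variables are set.")
```

Passing a dictionary as `env` makes it possible to try the function without touching the real environment.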

Additionally, do not modify the 'context' parameter to anything other than the default, as it is a .pars file that changes.

If the CRDS environment variables were not set before initializing Python, they can be set in the Python session using 'os.environ':

In [1]:
import os

# Set each CRDS variable only if it is not already defined.
if os.getenv('CRDS_PATH') is None:
    os.environ['CRDS_PATH'] = os.path.join(os.path.expanduser('~'), 'crds_cache')
else:
    print('Path exists at ' + os.environ['CRDS_PATH'])
if os.getenv('CRDS_SERVER_URL') is None:
    os.environ['CRDS_SERVER_URL'] = 'https://jwst-crds.stsci.edu'
else:
    print('Server URL exists as ' + os.environ['CRDS_SERVER_URL'])

### Stage 1: calwebb\_detector1:

This initial stage applies basic detector-level corrections to all exposure types, one exposure at a time.

To run the full pipeline in Python:

In [5]:
import jwst
import crds

In [31]:
import os
from jwst.pipeline import Detector1Pipeline

#file_name = 'EXAMPLE.FITS'
data_path = '/Users/jake/jwst_data/MAST_2025-07-08T0124/JWST'
# Keep only the uncalibrated exposures; the folder may hold other files.
uncal_files = [os.path.join(data_path, f) for f in os.listdir(data_path) if f.endswith('_uncal.fits')]
result = Detector1Pipeline.call(uncal_files[0])

2025-07-08 15:21:17,260 - CRDS - ERROR -  Error determining best reference for 'pars-darkcurrentstep'  =   No match found.
2025-07-08 15:21:17,265 - CRDS - INFO -  Fetching  /Users/jake/crds_cache/references/jwst/nircam/jwst_nircam_pars-jumpstep_0003.asdf    1.8 K bytes  (1 / 1 files) (0 / 1.8 K bytes)
2025-07-08 15:21:17,623 - stpipe - INFO - PARS-JUMPSTEP parameters found: /Users/jake/crds_cache/references/jwst/nircam/jwst_nircam_pars-jumpstep_0003.asdf
2025-07-08 15:21:17,641 - CRDS - INFO -  Fetching  /Users/jake/crds_cache/references/jwst/nircam/jwst_nircam_pars-detector1pipeline_0003.asdf    1.7 K bytes  (1 / 1 files) (0 / 1.7 K bytes)
2025-07-08 15:21:17,827 - stpipe - INFO - PARS-DETECTOR1PIPELINE parameters found: /Users/jake/crds_cache/references/jwst/nircam/jwst_nircam_pars-detector1pipeline_0003.asdf
2025-07-08 15:21:17,856 - stpipe.Detector1Pipeline - INFO - Detector1Pipeline instance created.
2025-07-08 15:21:17,858 - stpipe.Detector1Pipeline.group_scale - INFO - GroupSca

The first time this is run, the CRDS client downloads all of the context/reference files needed for that run from the CRDS server and places them in the directory indicated by CRDS_PATH.

It is not recommended to use the 'run' method to run a pipeline or step.  Instead, use the 'call' method.

BE PATIENT!

The 'pipeline' module contains all full pipeline stages.  For example, Stage 3 Imaging Processing is imported with the following command:

In [32]:
from jwst.pipeline import Image3Pipeline

For individual pipeline steps, import by name from their respective module.  For example,

In [33]:
from jwst.saturation import SaturationStep

Once imported, a pipeline or step is executed using the class's .call() method.  The input can be a string path to a file or an open DataModel object.  By default, no files are written out.

Additionally, by default, CRDS chooses the pipeline/step parameters and reference files based on information such as instrument, observing mode, date, etc.  Provided the current context was not changed from the default, these represent the best choice of parameters and reference files for the provided dataset.

To save the final pipeline output products to a file, include the 'save_results' parameter.  The 'result' definition is then:

In [None]:
result = Detector1Pipeline.call(file_name, save_results=True)

#### In the command line:

The command line interface for individual steps and pipelines uses the 'strun' command.  Just as in the Python environment, the CRDS environment variables must be defined before running any pipelines/steps.  Invoke 'strun', then specify a pipeline or step by its class name or alias, then name the input file.  Either name can be used: 'jwst.pipeline.Detector1Pipeline' is the class, and 'calwebb_detector1' is its alias.  For example,

In [None]:
$ strun calwebb_detector1 jw04501038001_03104_00001_nrca1_uncal.fits

An exit status of '0' indicates successful completion of the step or pipeline, while '1' means a general error has occurred and '64' indicates that no science data was found.
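When 'strun' is launched from a script, the exit status can be read back and translated into one of the meanings listed above. The following is a rough sketch using the standard `subprocess` module; `STRUN_STATUS`, `describe_exit_status`, and `run_command` are hypothetical names chosen for this example, and a stand-in command is used in place of 'strun' so the block runs without jwst installed.

```python
import subprocess
import sys

# Exit statuses as described above; any other value is reported as unrecognized.
STRUN_STATUS = {
    0: "successful completion",
    1: "general error",
    64: "no science data found",
}


def describe_exit_status(code):
    """Translate an exit status into a short description."""
    return STRUN_STATUS.get(code, f"unrecognized exit status {code}")


def run_command(command):
    """Run `command` (a list of strings); return (exit status, description)."""
    completed = subprocess.run(command)
    return completed.returncode, describe_exit_status(completed.returncode)


# Stand-in for strun: a tiny Python process that exits with status 64.
status, meaning = run_command([sys.executable, "-c", "raise SystemExit(64)"])
print(status, meaning)
```

For a real run, the command list would instead start with 'strun', followed by the pipeline alias and the input file name.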

Just like the python environment, 'save_results' can be specified:

In [None]:
$ strun calwebb_detector1 jw04501038001_03104_00001_nrca1_uncal.fits \
    --save_results=true

#### Logging Configuration

To save log information, first set up a configuration file to specify the desired level of logging messages.  This file must be in the same directory in which the pipeline is being run.

For example, if a file called 'pipeline-log.cfg' is created with the following contents,

In [None]:
[*]
handler = file:pipeline.log
level = INFO

then log information will be written to a file named 'pipeline.log' with the level of detail set to 'INFO' (other options include 'DEBUG,' 'WARNING,' 'ERROR,' and 'CRITICAL').  Only messages at or above the specified level will be displayed.
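The configuration file can also be generated from Python, which makes it easy to switch levels between runs. The sketch below simply reproduces the three-line format shown above; `VALID_LEVELS`, `log_config_text`, and `write_log_config` are illustrative names and are not provided by the pipeline.

```python
from pathlib import Path

# Level names listed in this section.
VALID_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def log_config_text(log_file="pipeline.log", level="INFO"):
    """Build the configuration text in the format shown above."""
    level = level.upper()
    if level not in VALID_LEVELS:
        raise ValueError(f"level must be one of {VALID_LEVELS}, got {level!r}")
    return f"[*]\nhandler = file:{log_file}\nlevel = {level}\n"


def write_log_config(cfg_path, log_file="pipeline.log", level="INFO"):
    """Write the configuration text to `cfg_path` and return the path."""
    path = Path(cfg_path)
    path.write_text(log_config_text(log_file, level))
    return path


# Preview the contents without writing anything to disk.
print(log_config_text(level="warning"))
```

Calling `write_log_config` with a file name in the directory where the pipeline is run creates the file itself.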

When running the pipeline in a Python environment, simply add the 'logcfg' parameter.  The 'result' definition from before is then:

In [None]:
result = Detector1Pipeline.call(file_name, logcfg="pipeline-log.cfg")

In the command line, a similar option is added:

In [None]:
$ strun calwebb_detector1 <input_file> \
    --logcfg=pipeline-log.cfg