<a id='top'></a>
# NIRSpec MOS pipeline processing

**Author**: James Muzerolle adapted by Cami Pacifici

Plotting function originally developed by Bryan Hilbert

**Latest Update**: 28 October 2021 by James, 8 August 2022 by Cami.

## Table of Contents
* [Introduction](#intro)
    * [Overview](#overview)
* [Imports](#imports)
* [Convenience functions](#func)
* [Pipeline processing flag](#flag)
* [Input simulations](#inputs)
* [Association files and metadata](#associations)
* [Run the calwebb_spec2 pipeline](#runspec2)

## Introduction <a id='intro'></a>

In this notebook, we will explore the stage 1 and 2 pipelines for NIRSpec MOS data. Here, we will focus on the mechanics of processing real data, including how to use associations for exposure specification and multi-exposure combination, the role of metadata, and what the primary data products at each stage look like. We will also see examples of how to interact and work with data models and metadata.

We are using pipeline version 1.7.+ for all data processing in this notebook. Most of the processing runs shown here use the default reference files from the Calibration Reference Data System (CRDS). Please note that pipeline software development is a continuous process, so results in some cases may be slightly different if using a different version. There are also a few known issues with some of the pipeline steps in this build that are expected to be fixed in the near future, though these do not significantly affect the products you will see here.
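Because results can depend on the installed pipeline version, it can help to record that version at the start of a run. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is introduced here for illustration and is not part of the `jwst` package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# 'jwst' is the distribution name of the pipeline package
jwst_version = installed_version("jwst")
if jwst_version is None:
    print("jwst is not installed in this environment")
else:
    print("jwst pipeline version:", jwst_version)
```

Comparing the printed value with the version quoted above is a quick way to judge whether small differences in the products are to be expected.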

## Imports <a id='imports'></a>

Import packages necessary for this notebook

In [None]:
#Modify the path to a directory on your machine
import os
os.environ["CRDS_PATH"] = "/path/to/my/crds/folder" # set appropriate path
os.environ["CRDS_SERVER_URL"] = "https://jwst-crds.stsci.edu"

import numpy as np

import glob

import zipfile
import urllib.request

import json

from shutil import copyfile

from astropy.io import fits
from astropy.utils.data import download_file
import astropy.units as u
from astropy import wcs
from astropy.wcs import WCS
from astropy.visualization import ImageNormalize, ManualInterval, LogStretch, LinearStretch, AsinhStretch

Set up matplotlib for plotting

In [None]:
import matplotlib.pyplot as plt
import matplotlib as mpl

# Use this version for non-interactive plots (easier scrolling of the notebook)
%matplotlib inline

# Use this version (outside of Jupyter Lab) if you want interactive plots
#%matplotlib notebook

# These gymnastics are needed to make the sizes of the figures
# be the same in both the inline and notebook versions
%config InlineBackend.print_figure_kwargs = {'bbox_inches': None}

mpl.rcParams['savefig.dpi'] = 80
mpl.rcParams['figure.dpi'] = 80

plt.rcParams.update({'font.size': 18})

Import JWST pipeline modules

In [None]:
# The calwebb_detector1 pipeline
from jwst.pipeline import Detector1Pipeline

# The calwebb_spec2 and calwebb_spec3 pipelines
from jwst.pipeline import Spec2Pipeline
from jwst.pipeline import Spec3Pipeline

# individual steps
from jwst.assign_wcs import AssignWcsStep
from jwst.assign_wcs import nirspec
from jwst.background import BackgroundStep
from jwst.imprint import ImprintStep
from jwst.msaflagopen import MSAFlagOpenStep
from jwst.extract_2d import Extract2dStep
from jwst.srctype import SourceTypeStep
from jwst.wavecorr import WavecorrStep
from jwst.flatfield import FlatFieldStep
from jwst.pathloss import PathLossStep
from jwst.photom import PhotomStep
from jwst.cube_build import CubeBuildStep
from jwst.extract_1d import Extract1dStep
from jwst.combine_1d import Combine1dStep

# data models
from jwst import datamodels

# associations
from jwst.associations import asn_from_list
from jwst.associations.lib.rules_level3_base import DMS_Level3_Base

## Define convenience functions and parameters <a id='func'></a>

In [None]:
# All files created by the steps in this notebook have been pre-computed and cached on Box

# first specify a desired local directory in which to place the downloaded data, as well as any offline processing you choose to run
output_dir = 'nirspec_files/'
if not os.path.exists(output_dir):
    os.makedirs(output_dir)

ziplink = 'https://stsci.box.com/shared/static/hydkpkl2ki8c9m6g62abzsog4dao0y3j.zip'
zipfilename = 'nirspec_mos_inflight.zip'
if not os.path.isfile(os.path.join(output_dir, zipfilename)):
    print('Downloading {}...'.format(zipfilename))
    demo_file = download_file(ziplink, cache=True)
    # Make a symbolic link using a local name for convenience
    os.symlink(demo_file, os.path.join(output_dir, zipfilename))
else:
    print('{} already exists, skipping download...'.format(zipfilename))

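# unzip (the context manager closes the archive when done)
with zipfile.ZipFile(os.path.join(output_dir, zipfilename), 'r') as zf:
    print(output_dir)
    zf.extractall(output_dir)

# move the extracted files up one level, directly into output_dir
os.system('mv '+output_dir+'nirspec_mos_inflight/* '+output_dir)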
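The shell `mv` call above assumes a Unix-like system. As a hedged alternative, the move can be sketched with the standard library only. The helper name `flatten_subdir` is hypothetical, and unlike `mv` it skips entries that already exist in the destination instead of overwriting them.

```python
import glob
import os
import shutil
import tempfile


def flatten_subdir(parent_dir, subdir_name):
    """Move every entry of parent_dir/subdir_name up into parent_dir.

    Entries whose names already exist in parent_dir are left in place.
    Returns the sorted list of names that were moved.
    """
    subdir = os.path.join(parent_dir, subdir_name)
    moved = []
    for path in glob.glob(os.path.join(subdir, '*')):
        name = os.path.basename(path)
        target = os.path.join(parent_dir, name)
        if os.path.exists(target):
            continue  # do not overwrite existing files
        shutil.move(path, target)
        moved.append(name)
    return sorted(moved)


# Demonstration on a throwaway directory so that no real data are touched
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'sub'))
    open(os.path.join(tmp, 'sub', 'example.txt'), 'w').close()
    moved_names = flatten_subdir(tmp, 'sub')
    demo_ok = os.path.isfile(os.path.join(tmp, 'example.txt'))
print(moved_names)
```

In this notebook the equivalent call would be `flatten_subdir(output_dir, 'nirspec_mos_inflight')`.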

In [None]:
def show_image(data_2d, vmin, vmax, xsize=19, ysize=10, title=None, aspect=1, scale='log', units='MJy/sr'):
    """Function to generate a 2D image of the data with a chosen scaling

    Parameters
    ----------
    data_2d : numpy.ndarray
        2D image to be displayed

    vmin : float
        Minimum signal value to use for scaling

    vmax : float
        Maximum signal value to use for scaling

    xsize, ysize : float
        Width and height of the figure, in inches

    title : str
        String to use for the plot title

    aspect : float
        Aspect ratio passed to imshow

    scale : str
        Specify scaling of the image. Can be 'log', 'linear', or 'Asinh'

    units : str
        Units of the data. Used for the annotation in the
        color bar. Use 'none' to omit the color bar
    """
    if scale == 'log':
        norm = ImageNormalize(data_2d, interval=ManualInterval(vmin=vmin, vmax=vmax),
                              stretch=LogStretch())
    elif scale == 'linear':
        norm = ImageNormalize(data_2d, interval=ManualInterval(vmin=vmin, vmax=vmax),
                              stretch=LinearStretch())
    elif scale == 'Asinh':
        norm = ImageNormalize(data_2d, interval=ManualInterval(vmin=vmin, vmax=vmax),
                              stretch=AsinhStretch())
    else:
        # fail early instead of hitting an undefined 'norm' below
        raise ValueError("scale must be 'log', 'linear', or 'Asinh'")
    fig = plt.figure(figsize=(xsize, ysize))
    ax = fig.add_subplot(1, 1, 1)
    im = ax.imshow(data_2d, origin='lower', norm=norm, aspect=aspect, cmap='gist_earth')

    if (units != 'none'):
        fig.colorbar(im, label=units)
    plt.xlabel('Pixel column')
    plt.ylabel('Pixel row')
    if title:
        plt.title(title)

## Pipeline processing flag <a id='flag'></a>

The pipeline and individual steps can take a long time to run, so all products shown here have also been pre-computed and are included in the downloaded data. The flag below controls whether the pipeline calls guarded by it are executed: leave it set to True to run them yourself, or set it to False to skip them and work from the pre-computed products.

In [None]:
runflag = True

## Input simulations or data <a id='inputs'></a>

## Detector1

Update the paths below to match the location of the files on your machine.

In [None]:
# the download cell moves the extracted files directly into output_dir
uncalfile1 = output_dir+'jw02736007001_03101_00002_nrs1_uncal.fits'
uncalfile2 = output_dir+'jw02736007001_03101_00002_nrs2_uncal.fits'

# run calwebb_detector1 on each detector's uncal file (skipped if runflag is False)
if runflag:
    for uncalfile in (uncalfile1, uncalfile2):
        detector1 = Detector1Pipeline()
        detector1.save_results = True
        detector1.output_dir = output_dir
        result = detector1(uncalfile)

First, let's take a look at the level 2a (rate) images to become familiar with the inputs.

In [None]:
# open the level 2a (rate) data models for this exposure, one per detector
ratefile1 = output_dir+'jw02736007001_03101_00002_nrs1_rate.fits' # for the NRS1 detector
dither1 = datamodels.open(ratefile1)
ratefile2 = output_dir+'jw02736007001_03101_00002_nrs2_rate.fits' # for the NRS2 detector
dither2 = datamodels.open(ratefile2)

# get the pixel data (the SCI extension of the fits file)
ratesci1 = dither1.data
ratesci2 = dither2.data

# display the images
show_image(ratesci1, 0, 1., xsize=19, ysize=19, units='DN/s', title='NRS1')
show_image(ratesci2, 0, 1., xsize=19, ysize=19, units='DN/s', title='NRS2')

# zoom in to show details of one slitlet
show_image(ratesci1[1550:1650, 850:1050], 0, 0.1, xsize=19, ysize=15, units='none', scale='Asinh', title='zoom of NRS1')

## Association files and metadata <a id='associations'></a>

In this example, we do not apply any background subtraction. Background exposures can be found in jw02736007001_03101_00003 and jw02736007001_03101_00004.

In [None]:
# show the contents of one of the association files
asn_file = output_dir+"jw02736-o007_20220712t161855_spec2_011_asn.json"
with open(asn_file) as f_obj:
    asn_data = json.load(f_obj)
asn_data
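To make the layout of an association easier to read, here is a minimal sketch that walks a small, hand-written spec2 association with the same structure: a list of products, each holding a list of members with an `expname` and an `exptype`. The file names in `example_asn` and the helper `members_by_type` are hypothetical and used only for illustration.

```python
import json

# A hand-written, hypothetical spec2 association with the same layout as a
# pipeline-generated file: a list of products, each with a list of members.
example_asn = {
    "asn_type": "spec2",
    "products": [
        {
            "name": "example_exposure_nrs1",
            "members": [
                {"expname": "example_exposure_nrs1_rate.fits", "exptype": "science"},
                {"expname": "example_nod2_nrs1_rate.fits", "exptype": "background"},
            ],
        }
    ],
}


def members_by_type(asn, exptype):
    """Return the expname of every member with the given exptype."""
    return [
        member["expname"]
        for product in asn["products"]
        for member in product["members"]
        if member["exptype"] == exptype
    ]


print(json.dumps(example_asn, indent=2))
print("science:", members_by_type(example_asn, "science"))
print("background:", members_by_type(example_asn, "background"))
```

The same helper can be applied to the dictionary loaded above, for example `members_by_type(asn_data, "science")`. An association without background subtraction would simply list no members of type `background`.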

One unique aspect of MOS data processing is that it requires additional metadata, such as shutter locations and source positions. The pipeline gets this information from an MSA "metafile" that is automatically generated, along with the raw "level 1b" files, using metadata taken from the PPSDB. The metadata is originally populated when the observation is created with the MSA planning tool in APT. The metafile is a FITS file containing two binary tables: one with MSA shutter information for each slitlet and nod, and the other with details about each observed source. Let's take a look at the metafile for this observation:

In [None]:
# open the MSA metafile
metafile = output_dir+'jw02736007001_01_msa.fits'
hdul = fits.open(metafile)
hdul.info()

In [None]:
# columns of the "SHUTTER_INFO" table
hdul['SHUTTER_INFO'].columns

In the "SHUTTER_INFO" table, the number of rows per slitlet is N_s x N_n, where N_s is the number of open shutters in each slitlet and N_n is the number of nods. In this case, N_s = 3 and N_n = 3, so there are 9 rows per slitlet. The pipeline needs these rows to determine which shutter of a slitlet contains the source in each nodded exposure.

In [None]:
# print table entries for the first two slitlets
print(hdul['SHUTTER_INFO'].data[:18])
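The row count per slitlet described above can be checked with a short tally. This is a sketch on synthetic slitlet IDs, not on the real table; the helper `rows_per_slitlet` is hypothetical, and the column name `slitlet_id` mentioned below is an assumption that should be confirmed against the column listing printed earlier.

```python
from collections import Counter


def rows_per_slitlet(slitlet_ids):
    """Count how many table rows belong to each slitlet ID."""
    return Counter(int(s) for s in slitlet_ids)


# Synthetic IDs for two slitlets, each with 3 shutters x 3 nods = 9 rows
n_shutters, n_nods = 3, 3
synthetic_ids = [2] * (n_shutters * n_nods) + [3] * (n_shutters * n_nods)

counts = rows_per_slitlet(synthetic_ids)
print(dict(counts))
```

With the metafile open, the equivalent call would be `rows_per_slitlet(hdul['SHUTTER_INFO'].data['slitlet_id'])`, assuming the column carries that name.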

In [None]:
# columns of the "SOURCE_INFO" table
hdul['SOURCE_INFO'].columns

The SOURCE_INFO table contains one row per source that was observed with an MSA configuration. The most important pieces of information here are the source RA and Dec coordinates and the stellarity parameter, which is used to determine whether the source should be processed as a point or an extended source.

In [None]:
# print table entries for the first 10 sources
print(hdul['SOURCE_INFO'].data[:10])
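As an illustration of how a stellarity value can translate into a source type, here is a minimal sketch. The threshold of 0.75 and the treatment of negative (unknown) values as point sources are assumptions based on the documented behavior of the `srctype` step, and should be checked against the pipeline version in use. The function `classify_source` is hypothetical and is not the pipeline's own code.

```python
def classify_source(stellarity, threshold=0.75):
    """Return 'POINT' or 'EXTENDED' for a given stellarity value.

    Assumed rule (verify against the srctype step documentation):
    negative values mean "unknown" and default to POINT, values above
    the threshold are POINT, and everything else is EXTENDED.
    """
    if stellarity < 0:
        return "POINT"
    return "POINT" if stellarity > threshold else "EXTENDED"


for value in (-1.0, 0.3, 0.9):
    print(value, classify_source(value))
```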

## Run the calwebb_spec2 pipeline <a id='runspec2'></a>

In [None]:
# run the calwebb_spec2 pipeline using an association file as input

if runflag:
    spec2 = Spec2Pipeline()
    spec2.save_results = True
    spec2.output_dir = output_dir
    result = spec2(asn_file)

In [None]:
# take a look at the results - open the level 2b files

callist = sorted(glob.glob(output_dir+"jw02736007001_03101_00002_nrs1_cal.fits"))
for calfile in callist:
    print(calfile)
    cal = datamodels.open(calfile) # this contains the calibrated unrectified 2D spectra
    root = calfile[:-9]
    s2d = datamodels.open(root+'_s2d.fits')  # this contains the calibrated *rectified* 2D spectra
    x1d = datamodels.open(root+'_x1d.fits')  # this contains the aperture-extracted 1D spectra
    
    for i, slit in enumerate(cal.slits):

        #if (slit.name == '31'):  # change this, or comment out, to see other slits
            print(slit.name)
        
            calsci = slit.data  # contains the pixel data from the cal file (SCI extension)
            s2dsci = s2d.slits[i].data  # contains the pixel data from the s2d file
    
            # determine the wavelength scale of the s2d data for plotting purposes
            # get the data model WCS object
            wcsobj = s2d.slits[i].meta.wcs
            y, x = np.mgrid[:s2dsci.shape[0], : s2dsci.shape[1]]  # grid of pixel x,y indices
            det2sky = wcsobj.get_transform('detector', 'world')  # the coordinate transform from detector space (pixels) to sky (RA, DEC in degrees)
            ra, dec, s2dwave = det2sky(x, y)  # RA, Dec, wavelength (microns) for each pixel
            s2dwaves = s2dwave[0, :]  # only need a single row of values since this is the rectified spectrum
            xtint = np.arange(100, s2dsci.shape[1], 100)
            xtlab = np.round(s2dwaves[xtint], 2)  # wavelength labels for the x-axis
        
            # get wavelength & flux from the x1d data model
            x1dwave = x1d.spec[i].spec_table.WAVELENGTH
            x1dflux = x1d.spec[i].spec_table.FLUX
  
            # plot the unrectified calibrated 2D spectrum
            show_image(calsci, -0.01, 0.01, aspect=5., scale='linear', units='MJy')
        
            # plot the rectified 2D spectrum
            show_image(s2dsci, -0.01, 0.01, aspect=5., scale='linear', units='MJy')
            plt.xticks(xtint, xtlab)
            plt.xlabel('wavelength (microns)')
        
            # plot the 1D extracted spectrum
            fig = plt.figure(figsize=(19, 8))
            plt.plot(x1dwave, x1dflux)
            plt.xlabel('wavelength (microns)')
            plt.ylabel('flux (Jy)')
            plt.show()
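The loop above derives the `s2d` and `x1d` file names by slicing nine characters (the length of `_cal.fits`) off the end of the `cal` file name. A slightly more defensive version of that step could look like the sketch below; the helper name `product_path` is hypothetical.

```python
def product_path(calfile, suffix):
    """Swap the trailing '_cal.fits' of a level 2b file name for another suffix."""
    tail = '_cal.fits'
    if not calfile.endswith(tail):
        raise ValueError("expected a file name ending in " + tail)
    return calfile[:-len(tail)] + '_' + suffix + '.fits'


calfile = 'nirspec_files/jw02736007001_03101_00002_nrs1_cal.fits'
s2d_path = product_path(calfile, 's2d')
x1d_path = product_path(calfile, 'x1d')
print(s2d_path)
print(x1d_path)
```

Raising an error for unexpected names avoids silently building a wrong path if the glob pattern is later changed to match other products.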