# MIRI MRS IFU Spectroscopy Part 2: 
# Defining and Extracting a Background Spectrum

Aug 2023

**Use case:** Reduce MRS Data With User Defined Master Background Step. This is particularly relevant if you did not obtain a Dedicated Background with your observations. While the pipeline will subtract a sky background derived from an annulus, the underlying background may be prohibitively complicated and the user may wish to measure their own background from elsewhere in the cube.<br>
**Data:** Publicly available science data for SN 1987A (Program 1232). For this notebook, we will follow the science workflow outlined by [Jones et al. 2023](https://ui.adsabs.harvard.edu/abs/2023arXiv230706692J/abstract).<br>
**Tools:** jwst, jdaviz, matplotlib, astropy.<br>
**Cross-instrument:** NIRSpec, MIRI.<br>
**Documentation:** This notebook is part of STScI's larger [post-pipeline Data Analysis Tools Ecosystem](https://jwst-docs.stsci.edu/jwst-post-pipeline-data-analysis) and can be [downloaded](https://github.com/spacetelescope/dat_pyinthesky/tree/main/jdat_notebooks/MRS_Mstar_analysis) directly from the [JDAT Notebook GitHub directory](https://github.com/spacetelescope/jdat_notebooks).<br>

### Introduction: Spectral extraction in the JWST calibration pipeline

The JWST calibration pipeline performs spectral extraction for all spectroscopic data using basic default assumptions that are tuned to produce accurately calibrated spectra for the majority of science cases. This default method is a simple fixed-width boxcar extraction, in which the spectrum is summed over a number of pixels along the cross-dispersion axis, over the valid wavelength range. An aperture correction is applied at each pixel along the spectrum to account for flux lost from the finite-width aperture.
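
To make the idea of a fixed-width boxcar extraction concrete, here is a minimal sketch on a toy spectral image. It is not the pipeline's implementation: the function name `boxcar_extract`, the Gaussian toy profile, and the toy aperture correction (simply the inverse of the enclosed flux fraction) are all assumptions made for illustration.

```python
import numpy as np


def boxcar_extract(image, center, half_width, aperture_correction=None):
    """Sum a 2D spectral image over a fixed-width window along the cross-dispersion axis.

    image is assumed to have shape (n_cross_dispersion, n_wavelength).
    """
    lo = max(center - half_width, 0)
    hi = min(center + half_width + 1, image.shape[0])
    # Sum the rows inside the aperture for every wavelength column
    spectrum = np.nansum(image[lo:hi, :], axis=0)
    if aperture_correction is not None:
        # Scale each wavelength element to recover flux outside the aperture
        spectrum = spectrum * aperture_correction
    return spectrum


# Toy image: Gaussian cross-dispersion profile (sigma = 2 pixels), unit total flux per column
rows = np.arange(21)[:, None]
profile = np.exp(-0.5 * ((rows - 10) / 2.0) ** 2)
profile /= profile.sum()
image = np.repeat(profile, 5, axis=1)

# A 7-pixel-wide aperture misses part of the flux in the profile wings
raw = boxcar_extract(image, center=10, half_width=3)

# Toy aperture correction: the inverse of the enclosed flux fraction (hypothetical)
apcorr = 1.0 / raw
corrected = boxcar_extract(image, center=10, half_width=3, aperture_correction=apcorr)
```

In this toy case `raw` falls short of the true flux of 1 per column, and `corrected` recovers it. In the pipeline the correction comes from a reference file, not from the data.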

The ``extract_1d`` step uses the following inputs for its algorithm:
- the spectral extraction reference file: this is a json-formatted file, available as a reference file from the [JWST CRDS system](https://jwst-crds.stsci.edu)
- the bounding box: the ``assign_wcs`` step attaches a bounding box definition to the data, which defines the region over which a valid calibration is available. We will demonstrate below how to visualize this region. 

However, the ``extract_1d`` step can also perform more complex spectral extractions, which require manually editing some parameters and re-running the pipeline step. 


### Aims

This notebook will demonstrate how to re-run the spectral extraction step with different settings to illustrate the capabilities of the JWST calibration pipeline. 


### Assumptions

We will demonstrate the spectral extraction methods on resampled, calibrated spectral images. The basic demo and two examples run on Level 3 data, in which the nod exposures have been combined into a single spectral image. Two examples will use the Level 2b data - one of the nodded exposures. 


### Test data

The data used in this notebook is an observation of the Type Ia supernova SN2021aefx, observed by Jha et al in PID 2072 (Obs 1). These data were taken with zero exclusive access period, and published in [Kwok et al 2023](https://ui.adsabs.harvard.edu/abs/2023ApJ...944L...3K/abstract). You can retrieve the data from [this Box folder](https://stsci.box.com/s/i2xi18jziu1iawpkom0z2r94kvf9n9kb), and we recommend you place the files in the ``data/`` folder of this repository, or change the directory settings in the notebook prior to running. 

You can of course use your own data instead of the demo data. 


### JWST pipeline version and CRDS context

This notebook was written using the calibration pipeline version 1.10.2. We set the CRDS context explicitly to 1089 to match the current latest version in MAST. If you use different pipeline versions or CRDS context, please read the relevant release notes ([here for pipeline](https://github.com/spacetelescope/jwst), [here for CRDS](https://jwst-crds.stsci.edu)) for possibly relevant changes.

### Contents

1. [The Level 3 data products](#l3data)
2. [The spectral extraction reference file](#x1dref)
3. [Example 1: Changing the aperture width](#ex1)
4. [Example 2: Changing the aperture location](#ex2)
5. [Example 3: Extraction with background subtraction](#ex3)
6. [Example 4: Tapered column extraction](#ex4)

## Import Packages

- `astropy.io` fits for accessing FITS files
- `os` for managing system paths
- `matplotlib` for plotting data
- `urllib` for downloading data
- `tarfile` for unpacking data
- `numpy` for basic array manipulation
- `jwst` for running JWST pipeline and handling data products
- `json` for working with json files
- `crds` for working with JWST reference files

In [None]:
# Set CRDS variables first
import os

os.environ['CRDS_CONTEXT'] = 'jwst_1089.pmap'
os.environ['CRDS_PATH'] = os.environ['HOME']+'/crds_cache'
os.environ['CRDS_SERVER_URL'] = 'https://jwst-crds.stsci.edu'
print(f'CRDS cache location: {os.environ["CRDS_PATH"]}')

In [None]:
import sys,os, pdb
# Basic system utilities for interacting with files
import glob
import time
import shutil
import warnings
import zipfile
import urllib.request
import requests

# Astropy utilities for opening FITS and ASCII files
from astropy.io import fits
from astropy.io import ascii
from astropy.utils.data import download_file
from regions import Regions
from astropy import units as u

from astroquery.mast import Observations

# Astropy utilities for making plots
from astropy.visualization import (LinearStretch, LogStretch, ImageNormalize, ZScaleInterval)

# Numpy for doing calculations
import numpy as np

# Matplotlib for making plots
import matplotlib.pyplot as plt
from matplotlib import rc

# Import the base JWST package
import jwst

# JWST pipelines (encompassing many steps)
from jwst.pipeline import Detector1Pipeline
from jwst.pipeline import Spec2Pipeline
from jwst.pipeline import Spec3Pipeline

# JWST pipeline utilities
from jwst import datamodels # JWST datamodels
from jwst.associations import asn_from_list as afl # Tools for creating association files
from jwst.associations.lib.rules_level2_base import DMSLevel2bBase # Definition of a Lvl2 association file
from jwst.associations.lib.rules_level3_base import DMS_Level3_Base # Definition of a Lvl3 association file
from jwst.datamodels import SpecModel, MultiSpecModel, IFUCubeModel


from stcal import dqflags # Utilities for working with the data quality (DQ) arrays

import shutil

# Import packages for multiprocessing.  These won't be used on the online demo, but can be
# very useful for local data processing unless/until they get integrated natively into
# the cube building code.  These need to be imported before anything else.

import multiprocessing
#multiprocessing.set_start_method('fork')
from multiprocessing import Pool
import os

# Set the maximum number of processes to spawn based on available cores
usage = 'all' # Either 'none' (single thread), 'quarter', 'half', or 'all' available cores

from specutils import Spectrum1D
from matplotlib.pyplot import cm

from jdaviz import Cubeviz

# Display the video
from IPython.display import HTML, YouTubeVideo


In [None]:
# Set parameters to be changed here.
# It should not be necessary to edit cells below this in general unless modifying pipeline processing steps.

import sys,os, pdb

# CRDS context (if overriding)
#%env CRDS_CONTEXT jwst_0771.pmap

# Point to where the uncalibrated FITS files are from the science observation
input_dir = './mastDownload/1232/uncal/'

# Point to where you want the output science results to go
output_dir = './output/87A/'

# Point to where the uncalibrated FITS files are from the background observation
# If no background observation, leave this blank
input_bgdir = ' '

# Point to where the output background observations should go
# If no background observation, leave this blank
output_bgdir = ' '

# Whether or not to run a given pipeline stage
# Science and background are processed independently through det1+spec2, and jointly in spec3

# Science processing
dodet1=True
dospec2=True
dospec3=True

# Background processing
dodet1bg=True
dospec2bg=True

# If there is no background folder, ensure we don't try to process it
if input_bgdir.strip() == '':  # strip() so a blank entry such as ' ' also counts as empty
    dodet1bg=False
    dospec2bg=False

In [None]:
## Output subdirectories to keep science data products organized
## Note that the pipeline might complain about this as it is intended to work with everything in a single
## directory, but it nonetheless works fine for the examples given here.
det1_dir = os.path.join(output_dir, 'stage1/') # Detector1 pipeline outputs will go here
#spec2_dir = os.path.join(output_dir, 'stage2/') # Spec2 pipeline outputs will go here
spec2_dir = os.path.join(output_dir, 'stage2/') # Spec2 pipeline outputs will go here
spec2_bgdir = ' '
#spec3_dir = os.path.join(output_dir, 'stage3/') # Spec3 pipeline outputs will go here
spec3_dir = os.path.join(output_dir, 'stage3/') # Spec3 pipeline outputs will go here

# We need to check that the desired output directories exist, and if not create them
if not os.path.exists(det1_dir):
    os.makedirs(det1_dir)
if not os.path.exists(spec2_dir):
    os.makedirs(spec2_dir)
if not os.path.exists(spec3_dir):
    os.makedirs(spec3_dir)

In [None]:
# Output subdirectories to keep background data products organized
det1_bgdir = os.path.join(output_bgdir, 'stage1/') # Detector1 pipeline outputs will go here
spec2_bgdir = os.path.join(output_bgdir, 'stage2/') # Spec2 pipeline outputs will go here

# We need to check that the desired output directories exist, and if not create them
if output_bgdir.strip() != '':  # strip() so a blank entry such as ' ' does not create directories
    if not os.path.exists(det1_bgdir):
        os.makedirs(det1_bgdir)
    if not os.path.exists(spec2_bgdir):
        os.makedirs(spec2_bgdir)
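
The directory bookkeeping above can also be wrapped in a small helper. The sketch below is an optional alternative, not part of the original workflow: the function name `ensure_stage_dirs` is hypothetical, and it writes into a temporary directory so it can be run without touching your data.

```python
import os
import tempfile


def ensure_stage_dirs(base_dir, stages=("stage1", "stage2", "stage3")):
    """Create one subdirectory per stage under base_dir.

    Returns a dict mapping stage name to path, or an empty dict when
    base_dir is blank (for example when there is no background observation).
    """
    if not base_dir or not base_dir.strip():
        return {}
    paths = {}
    for stage in stages:
        path = os.path.join(base_dir, stage)
        # exist_ok=True makes the call safe to repeat
        os.makedirs(path, exist_ok=True)
        paths[stage] = path
    return paths


# Demonstration in a throwaway location
demo_root = tempfile.mkdtemp()
science_dirs = ensure_stage_dirs(os.path.join(demo_root, "output", "87A"))
background_dirs = ensure_stage_dirs(" ")  # blank entry: nothing is created
```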

In [None]:
def checkKey(mapping, key):
    """Report whether `key` is in `mapping`, print its value if present, and return a bool."""
    if key in mapping:
        print("Present, value =", mapping[key])
        return True
    print("Not present")
    return False

# 2. Use Cubeviz to make mask

#### This step shows how to interactively define a region to be used for extracting a background. If you skip this step, you can continue running the notebook from Step 3.

In [None]:
# Video showing how to define an annulus background around SN 1987A using the cells below

HTML('<iframe width="700" height="500" src="https://data.science.stsci.edu/redirect/JWST/jwst-data_analysis_tools/MRS_1987A/region_mask1.mov" frameborder="0" allowfullscreen></iframe>')

In [None]:
cubefile = "/astro/armin/data/ofox/1232/output/87A/stage3/Level3_ch1-2-3-4-shortmediumlong_s3d.fits"

In [None]:
from jdaviz import Cubeviz
cubeviz = Cubeviz()
cubeviz.load_data(cubefile, data_label='SN1987A')
cubeviz.show()

In [None]:
# Get the collapsed cube, ideally from Cubeviz, but otherwise download from pre-defined files.

collapse_cube = cubeviz.app.get_data_from_viewer("uncert-viewer") # All data loaded in the uncertainty viewer
if checkKey(collapse_cube,"collapsed") is True:
    collapse_cube = cubeviz.app.get_data_from_viewer("uncert-viewer","collapsed") # The collapsed cube
    collapse_cube.write(spec3_dir+"collapsed_cube.fits",overwrite=True)
else:
    print("No Collapsed Cube in Cubeviz.")
    if os.path.isfile(spec3_dir+'/'+"collapsed_cube.fits"):
        print('File exists. Deleting '+spec3_dir+'/'+"collapsed_cube.fits")
        os.remove(spec3_dir+'/'+"collapsed_cube.fits")
    print("Downloading to "+spec3_dir)
    url = 'https://data.science.stsci.edu/redirect/JWST/jwst-data_analysis_tools/MRS_1987A/collapsed_cube.fits'
    urllib.request.urlretrieve(url, './collapsed_cube.fits')
    shutil.move("collapsed_cube.fits",spec3_dir)

In [None]:
# Get the ellipse region, ideally from Cubeviz, but otherwise download from pre-defined files.

regions = cubeviz.get_interactive_regions()
if checkKey(regions,"Subset 1") is True:
    regions['Subset 1'].write('my_elipse.reg', overwrite=True)
else:
    print("No Background Region From Cubeviz.")
    if os.path.isfile("./my_elipse.reg"):
        print('File exists. Deleting ./my_elipse.reg')
        os.remove("./my_elipse.reg")
    print("Downloading...")
    fname = "./my_elipse.reg"
    url = 'https://data.science.stsci.edu/redirect/JWST/jwst-data_analysis_tools/MRS_1987A/my_elipse.txt'
    urllib.request.urlretrieve(url, fname)
    #fn = x = download_file(url, cache=False)
    #reg = Regions.read(fn, format='ds9')[0]


# 3. Apply the mask to the weights extension of the data cube

In [None]:
# Read in data cube as a JWST data model
spec_model_cube = IFUCubeModel(cubefile)

In [None]:
# Print the source and aperture type being used in the header of the file (this can be Extended or Point). 
# For SN 1987A, we have an EXTENDED source

spec_model_cube.find_fits_keyword('SRCTYPE')
spec_model_cube.find_fits_keyword('SRCTYAPT')

print(spec_model_cube.meta.target.source_type)
print(spec_model_cube.meta.target.source_type_apt)

In [None]:
# If necessary, you can edit the cube header. We don't need to change anything in this case.
# But you might want to if you have a point source, yet want to extract a user specified background spectrum.
# The pipeline extracts EXTENDED and POINT sources differently.

spec_model_cube.meta.target.source_type = 'EXTENDED'
spec_model_cube.meta.target.source_type_apt = 'EXTENDED'

In [None]:
# Read in previously extracted region

reg = Regions.read('./my_elipse.reg', format='ds9')[0]
print(reg)

In [None]:
# Create a weight map using the region mask

tmp_wgts = spec_model_cube.weightmap.copy()  # copy so the model's weightmap is not edited in place
mask = reg.to_mask('exact')
x1 = int(reg.center.x-reg.width/2.)
x2 = x1+mask.shape[1]
y1 = int(reg.center.y-reg.height/2.)
y2 = y1+mask.shape[0]

# Note: the region shape is slightly different from the shape of the mask that gets generated.
# Taking x2 and y2 from the mask shape keeps the slice below the same size as the mask.

# Start from one wavelength plane (index 13) and set every spaxel with non-zero weight to 1.
mask2d = tmp_wgts[13,:,:]
mask2d[mask2d>0] = 1.

# We want the inverse of the region mask (1 outside the ellipse, 0 inside), so we subtract
# the region mask (1 inside the ellipse) from the all-ones footprint in mask2d.
mask2d[y1:y2,x1:x2] = mask2d[y1:y2,x1:x2]-mask.data

# Clean up small residuals from rounding and from almost fully covered edge pixels
mask2d[mask2d<0.1] = 0.
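
The same "everything except the ellipse" logic can be sketched with NumPy alone, which may help when checking what the mask should look like. This is an illustration under simplifying assumptions: the function name `inverse_ellipse_mask` is hypothetical, the ellipse is axis-aligned, and pixels are either fully in or fully out (no fractional edge coverage, unlike the `'exact'` mode used above).

```python
import numpy as np


def inverse_ellipse_mask(shape, center_xy, width, height, valid=None):
    """Return a 2D mask that is 1 outside an axis-aligned ellipse and 0 inside it.

    width and height are full axis lengths in pixels. Pixels where `valid`
    is False (outside the cube footprint) are also set to 0.
    """
    ny, nx = shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    xc, yc = center_xy
    inside = ((xx - xc) / (width / 2.0)) ** 2 + ((yy - yc) / (height / 2.0)) ** 2 <= 1.0
    mask = np.ones(shape, dtype=float)
    mask[inside] = 0.0
    if valid is not None:
        mask[~valid] = 0.0
    return mask


# Toy footprint: 50 x 60 spaxels, with the first five columns lying outside the cube
valid = np.ones((50, 60), dtype=bool)
valid[:, :5] = False
bkg_mask = inverse_ellipse_mask((50, 60), center_xy=(30, 25), width=20, height=12, valid=valid)
```

Spaxels with value 1 in `bkg_mask` are the ones that would contribute to the background; the source region and the area outside the footprint are excluded.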

In [None]:
# Visualize the 2D Mask

from astropy.visualization import simple_norm

norm = simple_norm(mask2d, 'sqrt', min_cut=0, max_cut=0.5)

fig = plt.figure(figsize=(10, 10))
ax = fig.add_subplot()
ax.set_title("Masked Data")
ax.imshow(mask2d, norm=norm, origin="lower")
plt.show()

In [None]:
# Create a 3D weightmap from the 2D map. Mask all NaN values, too.

# Copy the broadcast result so that each wavelength plane has its own memory. Writing into the
# broadcast view directly would zero a spaxel in every plane whenever it is NaN in any one plane.
mask3d = np.broadcast_to(mask2d, spec_model_cube.weightmap.shape).copy()
mask3d[np.isnan(spec_model_cube.data)] = 0

# Spectrum1D expects the spectral axis last, so move it from axis 0 to the end: (wave, y, x) -> (y, x, wave)
tmp_wgt_cube = np.moveaxis(mask3d, 0, -1)
plotcube = Spectrum1D(tmp_wgt_cube*u.dimensionless_unscaled)
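
The step from a 2D mask to a 3D weight cube can be checked on toy arrays. The sketch below assumes the cube axis order is (wavelength, y, x); the array values are made up for illustration. It shows that, once the broadcast result is copied, a NaN in one wavelength plane only zeroes the weight in that plane.

```python
import numpy as np

# Toy 2D background mask (1 = use spaxel, 0 = ignore) and a toy cube with one NaN
mask2d = np.array([[1.0, 1.0, 0.0],
                   [1.0, 0.0, 1.0]])
data = np.ones((4, 2, 3))          # (wavelength, y, x)
data[2, 0, 0] = np.nan             # bad value in a single wavelength plane

# np.broadcast_to returns a read-only view in which all planes share memory,
# so take a copy before editing individual planes
weights = np.broadcast_to(mask2d, data.shape).copy()
weights[np.isnan(data)] = 0.0

# Reorder to (y, x, wavelength) for tools that expect the spectral axis last
weights_for_display = np.moveaxis(weights, 0, -1)
```

After this, `weights[2, 0, 0]` is 0 while the same spaxel keeps weight 1 in the other planes, and `mask2d` itself is unchanged.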

In [None]:
# Visualize the 3D Cube in Cubeviz

cubeviz2 = Cubeviz()
cubeviz2.load_data(plotcube, data_label='SN1987A MASK')
cubeviz2.show()

In [None]:
# Define the weightmap in the original cube as the new 3D Mask 
# This will tell the pipeline which spaxels to use for extraction

spec_model_cube.weightmap = mask3d
spec_model_cube.save(spec3_dir+'87A_skycube.fits',overwrite=True)

# 4. Extract Background Spectrum using Extract1dStep

In [None]:
# Set 1D extraction parameters

def runex(filename, outdir, outputfile):
    ex1d = jwst.extract_1d.Extract1dStep()
    ex1d.output_dir = outdir
    ex1d.save_results = True
    ex1d.subtract_background = False
    ex1d.output_file = outputfile
    ex1d(filename)

In [None]:
# We will extract a 1D spectrum from the cube created above with a weightmap defined by the region mask
# This extraction will create an average of all the spaxels in each frame that are not masked
# We will use this as our master background to subtract from the entire cube

cubefile_p1  = spec3_dir+'87A_skycube.fits'
outputfile = spec3_dir+'87A_bg'
runex(cubefile_p1,spec3_dir,outputfile=outputfile)

### A single background spectrum is necessary in this workflow. But in some workflows, you may wish to work with more than one background region. This is further illustrated in Notebook 3 of this series on SN 1987A.
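
As a rough illustration of what "an average of all the spaxels in each frame that are not masked" means, here is a NumPy sketch on a toy cube. It is not the `Extract1dStep` algorithm: the function name `masked_mean_spectrum` is hypothetical, and the pipeline's actual output units and normalization depend on the source type and reference files.

```python
import numpy as np


def masked_mean_spectrum(cube, weights):
    """Weighted mean over the spatial axes of a (wavelength, y, x) cube.

    Spaxels with zero weight or non-finite data are ignored. Planes with
    no usable spaxels return NaN.
    """
    cube = np.asarray(cube, dtype=float)
    weights = np.asarray(weights, dtype=float)
    good = np.isfinite(cube) & (weights > 0)
    w = np.where(good, weights, 0.0)
    total = np.sum(np.where(good, cube, 0.0) * w, axis=(1, 2))
    norm = np.sum(w, axis=(1, 2))
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(norm > 0, total / norm, np.nan)


# Toy cube: a background of 1, 2, 3 in the three planes plus a bright central source
cube = np.ones((3, 4, 4)) * np.array([1.0, 2.0, 3.0])[:, None, None]
cube[:, 1:3, 1:3] += 100.0

# Weight map that excludes the central source
weights = np.ones((3, 4, 4))
weights[:, 1:3, 1:3] = 0.0

bkg = masked_mean_spectrum(cube, weights)
```

Because the source spaxels carry zero weight, `bkg` reflects only the background level in each plane.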

# 5. Rerun Stage 3 With Master Background Turned On

In [None]:
# Define a function that will call the spec3 pipeline with our desired set of parameters
# This is designed to run on an association file
def runspec3(filename):
    
    crds_config = Spec3Pipeline.get_config_from_reference(filename)
    spec3 = Spec3Pipeline.from_config_section(crds_config)
    spec3.output_dir = spec3_dir
    spec3.save_results = True
    spec3.cube_build.output_file = '87A_bg_sub' # Custom output name
    spec3.cube_build.output_type = 'multi' # 'band', 'channel', or 'multi' type cube output
    spec3.outlier_detection.threshold_percent = 98.5 # optimized threshold number
    spec3.master_background.user_background=spec3_dir+'87A_bg_extract1dstep.fits' # Master Background Extracted Above
    spec3.master_background.force_subtract=True
    
    spec3(filename)
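
Conceptually, master background subtraction removes one 1D spectrum from every spatial element. The sketch below illustrates that idea directly on a toy cube. Note the assumption: the pipeline's `master_background` step works on the calibrated exposures before cube building, not on a finished cube, so this is only an illustration of the arithmetic. The function name `subtract_background` and all array values are hypothetical.

```python
import numpy as np


def subtract_background(cube, cube_wave, bkg_wave, bkg_flux):
    """Subtract a 1D background spectrum from every spaxel of a (wavelength, y, x) cube.

    The background is linearly interpolated onto the cube wavelength grid first.
    """
    bkg_on_grid = np.interp(cube_wave, bkg_wave, bkg_flux)
    return cube - bkg_on_grid[:, None, None]


# Toy background that is linear in wavelength, sampled on a finer grid than the cube
cube_wave = np.linspace(5.0, 6.0, 6)
bkg_wave = np.linspace(4.9, 6.1, 25)
bkg_flux = 2.0 + 0.5 * bkg_wave

# Toy cube: the same background everywhere, plus a source of 10 in the central spaxel
cube = np.broadcast_to((2.0 + 0.5 * cube_wave)[:, None, None], (6, 3, 3)).copy()
cube[:, 1, 1] += 10.0

residual = subtract_background(cube, cube_wave, bkg_wave, bkg_flux)
```

After subtraction the empty spaxels sit at zero and the central spaxel keeps only the source flux. A residual that is systematically negative away from the source would indicate oversubtraction.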


### Developer Note: Right now, this association file can only be created manually. 
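
For readers who want to write such a file by hand, the sketch below builds a minimal Level 3 association as a Python dictionary and serializes it to JSON. Treat the details as assumptions: the key names follow the general layout of JWST Level 3 associations, the exposure file names are hypothetical placeholders, and the result has not been checked against the file downloaded in the next cell, so compare the two before relying on it.

```python
import json


def build_spec3_association(product_name, cal_files):
    """Return a minimal spec3-style association dictionary listing science exposures."""
    return {
        "asn_type": "spec3",
        "asn_rule": "DMS_Level3_Base",
        "program": "noprogram",
        "asn_id": "a3001",
        "asn_pool": "none",
        "target": "none",
        "products": [
            {
                "name": product_name,
                "members": [
                    {"expname": fname, "exptype": "science"} for fname in cal_files
                ],
            }
        ],
    }


# Hypothetical Stage 2 output names; replace with your own *_cal.fits files
cal_files = ["exposure_1_cal.fits", "exposure_2_cal.fits"]
asn = build_spec3_association("Level3", cal_files)

# Serialize; write asn_text to a .json file to use it with the pipeline
asn_text = json.dumps(asn, indent=4)
```

Every member is listed as `science` here because the background is supplied separately through the `user_background` parameter rather than through background exposures in the association.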

In [None]:
# Define the association file used for the background subtraction.
# Download the background subtract json file.

asnfile_bg_sub = 'spec2_l3asn_bg_sub.json'
if os.path.isfile(asnfile_bg_sub):
    print('File exists. Deleting '+asnfile_bg_sub)
    os.remove(asnfile_bg_sub)
print("Downloading "+asnfile_bg_sub)
url = 'https://data.science.stsci.edu/redirect/JWST/jwst-data_analysis_tools/MRS_1987A/spec2_l3asn_bg_sub.json'
urllib.request.urlretrieve(url, asnfile_bg_sub)

In [None]:
if dospec3:
    runspec3(asnfile_bg_sub)
else:
    print('Skipping Spec3 processing')

In [None]:
datacube = spec3_dir+'87A_bg_sub_ch1-2-3-4-shortmediumlong_s3d.fits'
datacube

In [None]:
# Visualize Background Subtracted Cube
# Note that this is not a perfect subtraction. As we will see in the next notebook, we oversubtract the background.
# This is likely caused by poorly chosen spaxels. A more careful selection of background regions should result in a flatter background post-subtraction.

cubeviz3 = Cubeviz()
cubeviz3.load_data(datacube, data_label='SN1987A BG SUB')
cubeviz3.show()