<a id="top"></a>
# MIRI MRS Spectroscopy of a Late M Star

**Use case:** Extract spatial-spectral features from an IFU cube and measure their attributes.<br>
**Data:** KMOS datacube of point sources in the LMC from Jones et al. (in prep).<br>
**Tools:** specutils, spectral_cube, photutils, astropy, aplpy, scipy.<br>
**Cross-instrument:** MIRI<br>
**Documentation:** This notebook is part of STScI's larger [post-pipeline Data Analysis Tools Ecosystem](https://jwst-docs.stsci.edu/jwst-post-pipeline-data-analysis).<br>

**Note**: Ultimately, this notebook will include simulated MIRI data cubes of point sources with spectra
representative of late M type stars, obtained using MIRISim (https://wiki.miricle.org//bin/view/Public/MIRISim_Public)
and run through the JWST pipeline (https://jwst-pipeline.readthedocs.io/en/latest/).

## Introduction

This notebook analyzes one star represented by a dusty SED corresponding to the ISO SWS spectrum of
W Per from Kraemer et al. (2002) and Sloan et al. (2003) to cover the MRS spectral range 5-28 microns.  Analysis of JWST spectral cubes requires extracting spatial-spectral features of interest and measuring their attributes. 

The first part of the notebook reads in a datacube generated at Stage 3 of the JWST pipeline (or near-IR data from KMOS as a representative example of an IR data cube), then automatically detects all point sources in the cube and extracts a spectrum for each one, summed over its spatial region.  The analysis uses `photutils` to detect sources in the continuum image and an aperture mask applied with `spectral-cube` to extract the spectrum of each point source in the data cube.

The second part of the notebook will perform data analysis using `specutils`.  Specifically, it will fit a model photosphere/blackbody to the spectra.  Then it will calculate the centroid, line-integrated flux and equivalent width of each dust and molecular feature. 
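As a rough illustration of the three measurements named above, the sketch below computes them with plain `numpy` on a synthetic spectrum. The Gaussian feature, the flat continuum level and the helper names `trapezoid` and `measure_feature` are assumptions made for this example only; the notebook itself relies on the routines in `specutils.analysis`.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal integral of y(x), written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def measure_feature(wave, flux, continuum):
    """Return (line_flux, centroid, equivalent_width) for one spectral feature.

    wave, flux and continuum are 1D arrays on the same wavelength grid.
    """
    excess = flux - continuum                                # continuum-subtracted flux
    line_flux = trapezoid(excess, wave)                      # line-integrated flux
    centroid = trapezoid(wave * excess, wave) / line_flux    # flux-weighted mean wavelength
    # Equivalent width as the integral of (1 - F/Fc): negative for emission features
    eq_width = trapezoid(1.0 - flux / continuum, wave)
    return line_flux, centroid, eq_width

# Synthetic emission feature (hypothetical values): Gaussian at 10 microns on a flat continuum
wave = np.linspace(9.0, 11.0, 2001)                 # microns
continuum = np.full_like(wave, 2.0)                 # Jy
flux = continuum + np.exp(-0.5 * ((wave - 10.0) / 0.1) ** 2)

line_flux, centroid, eq_width = measure_feature(wave, flux, continuum)
print(line_flux, centroid, eq_width)
```

For this Gaussian the integrated flux should come out close to the analytic value 0.1·√(2π) ≈ 0.2507 Jy micron, the centroid close to 10 microns, and the equivalent width close to minus half the line flux because the continuum level is 2 Jy.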

## To Do:
- Replace the KMOS data cube with a JWST/MIRI simulation of an M star run through the JWST pipeline.
- Make a function to extract spectra from the datacube using an aperture.
- Replace the blackbody fit to the photosphere part of the spectra with a stellar photosphere model.
- Make sure errors have been propagated correctly in the calculation of centroids, line-integrated fluxes and
equivalent widths.
- Make a simple function within the `specutils` framework to fit a continuum and measure centroids, line-integrated fluxes and
equivalent widths of broad solid-state and molecular features.

## Imports

In [None]:
# Import useful python packages
import numpy as np

# Import packages to display images inline in the notebook
import matplotlib.pyplot as plt    
%matplotlib inline   

# Set general plotting options
params={'legend.fontsize':'18','axes.labelsize':'18',
        'axes.titlesize':'18','xtick.labelsize':'18',
        'ytick.labelsize':'18','lines.linewidth':2,'axes.linewidth':2,'animation.html': 'html5'}
plt.rcParams.update(params)
plt.rcParams.update({'figure.max_open_warning': 0})

In [None]:
# Import astropy packages 
from astropy import units as u
from astropy.io import ascii
from astropy.wcs import WCS
from astropy.table import Table, vstack
from astropy.stats import sigma_clipped_stats
from astropy.nddata import StdDevUncertainty
from astropy.io import fits # added by BAS on 8 April 2021

# Import packages to deal with spectralcubes
from spectral_cube import SpectralCube

# To find stars in the MRS spectralcubes and do aperture photometry
from photutils.detection import DAOStarFinder
from photutils.aperture import CircularAperture

# To deal with 1D spectrum
from specutils import Spectrum1D
from specutils.fitting import fit_generic_continuum
from specutils.manipulation import box_smooth, extract_region, SplineInterpolatedResampler
from specutils.analysis import line_flux, centroid, equivalent_width
from specutils.spectra import SpectralRegion

# To make nice plots with WCS axis
import aplpy

# To fit a curve to the data
from scipy.optimize import curve_fit

## Set paths to the Data and Outputs

For now use KMOS data cube of YSOs in the LMC from Jones et al in prep.

TODO: Update with MIRISim JWST pipeline processed data in future iterations.

In [None]:
# import other stuff

from jwst.pipeline import Detector1Pipeline
from jwst.pipeline import Spec2Pipeline
from jwst.pipeline import Spec3Pipeline
from jwst.extract_1d import Extract1dStep
import json
import glob
from jwst.associations.lib.rules_level3_base import DMS_Level3_Base
from jwst.associations import asn_from_list
import crds
from jdaviz.app import Application
import asdf
from photutils.aperture import aperture_photometry
import os

In [None]:
# do Stage 1 pipeline

allshortfiles=glob.glob('/filepath/and/filenames_of_the_MIRIFUSHORT_mirisim_simulation_files_here.fits')
alllongfiles=glob.glob('/filepath/and/filenames_of_the_MIRIFULONG_mirisim_simulation_files_here.fits')

pipe1short = Detector1Pipeline()

# run calwebb_detector1 on the MIRIFUSHORT data separate from MIRIFULONG data, as it saves time this way
for shortfile in allshortfiles:
    print(shortfile)
    baseshort,remaindershort = shortfile.split('.')
    
    beforestuffshort,dateafterstuffshort = shortfile.split('2021_')
    #Since there are 3 simulations folders, each corresponding to SHORT, MEDIUM, and LONG band mirisim
    #  simulations, we need a way to identify the pipeline's products resulting from these input simulations,  
    #  so we identify based on the number string in the folder name giving month and day and time of  
    #  simulation creation.  There may be a better way to do this, but this works for now.
    datestringshort,afterstuffshort = dateafterstuffshort.split('_mirisim')
    
    pipe1short.refpix.skip = True
    pipe1short.output_file = baseshort+datestringshort
    
    pipe1short.run(shortfile)

pipe1long = Detector1Pipeline()

for longfile in alllongfiles:
    print(longfile)
    baselong,remainderlong = longfile.split('.')
    
    beforestufflong,dateafterstufflong = longfile.split('2021_')
    #same as above
    datestringlong,afterstufflong = dateafterstufflong.split('_mirisim')
    
    pipe1long.refpix.skip = True
    pipe1long.output_file = baselong+datestringlong
    
    pipe1long.run(longfile)
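
The `split('.')` and `split('2021_')` calls above only unpack cleanly when the path contains exactly one period and exactly one `2021_`. As a more defensive sketch using only the standard library, the helper below builds the same kind of output name. The function name `output_stem`, the `year` argument and the example path are hypothetical; the pattern assumes the simulation folder name has the form `2021_<date string>_mirisim`, as the cell above implies.

```python
import os
import re

def output_stem(path, year="2021"):
    """Return the file stem with the simulation date string appended."""
    stem, _extension = os.path.splitext(path)       # works even if the path holds several '.'
    match = re.search(rf"{year}_(.+?)_mirisim", path)
    if match is None:
        raise ValueError(f"no '{year}_..._mirisim' date string found in {path!r}")
    return stem + match.group(1)

# Hypothetical path: the leading './' would break a plain split('.')
example = "./sims/2021_0413_1201_mirisim/det_images/det_image_seq1_MIRIFUSHORT_12SHORTexp1.fits"
print(output_stem(example))
```

The returned string could then be assigned to `output_file` in place of `baseshort+datestringshort`.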

In [None]:
# do Stage 2 pipeline

allshortfiles2 = glob.glob('det_image_*_MIRIFUSHORT_*_rate.fits')
alllongfiles2 = glob.glob('det_image_*_MIRIFULONG_*_rate.fits')

for short2file in allshortfiles2:
    print(short2file)
    pipe2short = Spec2Pipeline()
    base2short,remainder2short = short2file.split('.')
    
    pipe2short.straylight.skip = True# skip stray light step
    pipe2short.cube_build.skip = True# skip cube building step (for now, but see my comment in next cell)
    pipe2short.extract_1d.skip = True# skip extract_1d step
    pipe2short.output_file = base2short
    
    pipe2short.run(short2file)

for long2file in alllongfiles2:
    print(long2file)
    pipe2long = Spec2Pipeline()
    base2long,remainder2long = long2file.split('.')
    
    pipe2long.straylight.skip = True
    pipe2long.cube_build.skip = True
    pipe2long.extract_1d.skip = True
    pipe2long.output_file = base2long
    
    pipe2long.run(long2file)

In [None]:
# B. Sargent, 10 May 2021
# The first time you run this notebook, skip this cell and proceed to the next cell.
#   Then, when you have a cube produced from stage 3 of the pipeline, open the cube 
#   outside of this notebook and find the RA and Dec of the point source in the cube.  
#   Come back to this cell and set those RA and Dec equal to targra and targdec in this 
#   cell, respectively, run this cell, then re-do the next 2 cells to re-do the stage 3 
#   pipeline reduction.
#
# In the future, perhaps that awkward process could be replaced.  Specifically, I am 
#   thinking that, further down in this notebook, Libby has photutils code that can 
#   identify a point source in the data.  That could be done here on a single cube
#   produced by the preceding cell that does the stage 2 reduction, and the pixel 
#   coordinates returned could be used to compute the RA and Dec of the source, then 
#   that RA and Dec could be used here instead of having to open the stage 3 cube 
#   outside of this notebook to get the source's RA and Dec like is currently needed.

all_files = glob.glob('det_image_*_cal.fits')
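targra = None   # Replace None with the RA (decimal degrees) found from the cube made the first time you run the stage 3 pipeline.
targdec = None  # Replace None with the Dec (decimal degrees) found from the cube made the first time you run the stage 3 pipeline.
if targra is None or targdec is None:
    raise ValueError('Set targra and targdec before running this cell.')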
for thisfile in all_files:
    base,remainder = thisfile.split('.')
    outfilename=base+'_fix.'+remainder
    print(outfilename)
    
    with fits.open(thisfile) as hduthis:
        hduthis['SCI'].header['SRCTYPE']='POINT'# This tells the extract_1d step to extract a point source.
        hduthis[0].header['TARG_RA']=targra
        hduthis[0].header['TARG_DEC']=targdec
        hduthis.writeto(outfilename,overwrite=True)

In [None]:
# set up needed reference file for stage 3

file_all_list = glob.glob('det_image_*_cal_fix.fits')

asnall = asn_from_list.asn_from_list(file_all_list, rule=DMS_Level3_Base,product_name='combine_dithers_all_exposures')

asnallfile='for_spec3_all.json'
with open(asnallfile, 'w') as fpall:
    fpall.write(asnall.dump()[1])

In [None]:
# do stage 3 pipeline

pipe3ss = Spec3Pipeline()
pipe3ss.master_background.skip = True
pipe3ss.mrs_imatch.skip = True
pipe3ss.outlier_detection.skip = True
pipe3ss.resample_spec.skip = True
pipe3ss.combine_1d.skip = True
#pipe3ss.cube_build.output_type='multi'# un-comment this if you want a cube spanning the entire MRS wavelength range
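# use_source_posn and subtract_background are parameters of the extract_1d step and take booleans
pipe3ss.extract_1d.use_source_posn = True
pipe3ss.extract_1d.subtract_background = True
#pipe3ss.extract_1d.apply_apcorr = False# un-comment this if you do not want the aperture correction applied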
pipe3ss.output_file = 'allspec3'
pipe3ss.run(asnallfile)

In [None]:
wlall=[]
fnuall=[]
dfnuall=[]
bandall=[]

speclist=[]

allch_extractfile='combine_dithers_all_exposures_ch1-2-3-4-mediumlongshort-_x1d.fits'
with fits.open(allch_extractfile) as hduallch:
    wlallch=hduallch['EXTRACT1D'].data['wavelength']
    fnuallch=hduallch['EXTRACT1D'].data['flux']
    dfnuallch=hduallch['EXTRACT1D'].data['error']

# Read the extracted 1D spectrum of each of the 12 MRS sub-bands (4 channels x 3 grating settings),
# wrap it in a Spectrum1D object and plot it on top of the all-channel spectrum.
bandcolors = ['purple', 'blue', 'green', 'yellow', 'orange', 'red']
bandlist = [(ch, grating) for ch in (1, 2, 3, 4) for grating in ('short', 'medium', 'long')]

plt.plot(wlallch,fnuallch,'.',color='black',markersize=1)
for iband, (ch, grating) in enumerate(bandlist):
    band_extractfile = f'combine_dithers_all_exposures_ch{ch}-{grating}_x1d.fits'
    with fits.open(band_extractfile) as hduband:
        wlband = hduband['EXTRACT1D'].data['wavelength']
        fnuband = hduband['EXTRACT1D'].data['flux']
        dfnuband = hduband['EXTRACT1D'].data['error']
    # Wrap the errors in StdDevUncertainty so that specutils can propagate them
    bandspec = Spectrum1D(spectral_axis=wlband*u.micron, flux=fnuband*u.Jy,
                          uncertainty=StdDevUncertainty(dfnuband),
                          meta={'band': f'ch{ch}{grating}'})
    speclist.append(bandspec)
    plt.plot(wlband, fnuband, '.', color=bandcolors[iband % len(bandcolors)], markersize=1)
plt.xlabel('wavelength (microns)')
plt.ylabel('flux (Jy)')
plt.xlim(4.0,29.0)
plt.ylim(0.0,0.15)
plt.title('Spectrum from extract_1d in Stage 3')
plt.tight_layout()
plt.savefig('spectrum.jpg')
plt.show()

In [None]:
app = Application(configuration='cubeviz')# when doing Cubeviz, first run this block.
app

In [None]:
ch1long_cubefile='combine_dithers_all_exposures_ch1-long_s3d.fits'
#ch1long_cubefile='combine_dithers_all_exposures_ch1-2-3-4-shortlongmedium-_s3d.fits'
app.load_data(ch1long_cubefile)# When this block is run, the spectrum will appear in the Cubeviz viewer above.

In [None]:
app.get_viewer('spectrum-viewer').show()# This just shows the spectrum part.

In [None]:
spec_input = app.get_data_from_viewer('spectrum-viewer')
# This is the spectrum - wavelength and flux

In [None]:
#Before you run this cell, draw a region (like a circle) in one of the cube viewers in Cubeviz above.
#  Otherwise, you get an error.
spec_agb = app.get_data_from_viewer('spectrum-viewer', 'Subset 1')
spec_agb

plt.figure(2)
plt.plot(spec_agb.spectral_axis,spec_agb.flux)
plt.show()

In [None]:
# Set up an input directory where relevant data is located (used further down to read the AGB star spectrum)
data_in_path = "https://data.science.stsci.edu/redirect/JWST/jwst-data_analysis_tools/MRS_Spectroscopy_Late_M_Star/"

data_cube_file = 'combine_dithers_all_exposures_ch1-long_s3d.fits'
#'combine_dithers_all_exposures_ch1-2-3-4-shortlongmedium-_s3d.fits'
# Apparently the cell that is 2 cells from now that reads in the cube does not like the MRS cube that spans 
#   all the MRS wavelengths!  So, instead, I have here a cube for only 1 of the MRS bands, which it seems 
#   to like better.

#data_in_path + "NGC346_K_2c_COMBINED_CUBE_Y551.fits"

# Path to output directory
data_out_path = "./"

# Setup an output directory to save the extracted 1D spectra
outdir_spectra = data_out_path + '/spectra/'

In [None]:
# Housekeeping carried over from the KMOS version of this notebook:
# define the good wavelength range (in microns) from which to make the data cube.
# The KMOS grating ranges are kept for reference; Kgrating now holds the range used for the MRS cube.
#YJgrating = [1.02, 1.358] # microns
#Hgrating = [1.44, 1.85]   # microns
Kgrating = [7.6, 8.6]  # microns (the KMOS K-band range was [2.1, 2.42])

## Load and Display the Data cube

**Developer note**  The `SpectralCube` package is designed for sub-mm/radio data, so it expects a beam!
It is nevertheless preferred here over the other available packages because of its functionality and ease of use.
JWST NIRSpec and MIRI both have IFU modes that give data cubes (with two positional dimensions and one spectral
dimension) as the final pipeline product, as do many ground-based telescopes; none of these data cubes has a beam.


https://spectral-cube.readthedocs.io/en/stable/index.html

In [None]:
cube = SpectralCube.read(data_cube_file, hdu=1)  
print(cube)

In [None]:
# Cube dimensions and trimming 
# Data order in cube is (n_spectral, n_y, n_x)

# Trim the ends of the cube where the data quality is poor
subcube = cube.spectral_slab(Kgrating[0] * u.micron, Kgrating[1] * u.micron)

# Rename subcube to equal cube - done in case step above is not necessary
cube = subcube

# Chop out the NaN borders
cmin = cube.minimal_subcube()
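
For readers without `spectral-cube` at hand, the effect of `minimal_subcube()` can be approximated in plain `numpy` by cropping to the smallest box that still contains finite data. This is only an illustrative stand-in, not the library's implementation, and the helper name `trim_nan_borders` is hypothetical.

```python
import numpy as np

def trim_nan_borders(cube):
    """Crop a (n_spectral, n_y, n_x) array to the smallest box holding finite data."""
    finite = np.isfinite(cube)
    slices = []
    for axis in range(cube.ndim):
        other_axes = tuple(a for a in range(cube.ndim) if a != axis)
        has_data = finite.any(axis=other_axes)      # True where this index holds any finite value
        index = np.flatnonzero(has_data)
        if index.size == 0:
            raise ValueError("cube contains no finite values")
        slices.append(slice(index[0], index[-1] + 1))
    return cube[tuple(slices)]

# Toy cube: a block of valid data surrounded by NaN borders on every axis
demo = np.full((4, 6, 7), np.nan)
demo[1:3, 2:5, 1:6] = 1.0
trimmed = trim_nan_borders(demo)
print(trimmed.shape)   # (2, 3, 5)
```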

In [None]:
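# Make a continuum image (collapse the cube along the wavelength axis)
# Note: many mathematical operations are available; the median is preferred
# Only the [10:30, 10:30] spatial cut-out is used here; use cmin.median(axis = 0) for the full frame
cont_img = cmin[:, 10:30, 10:30].median(axis = 0)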

cont_img_sub = cont_img[10:30, 10:30]
# Extract the target name
#name_long = cont_img.header["OBJECT"]
#name, _ = name_long.split("/")
name = 'AGB'

In [None]:
# Quick plot of the continuum image now that the NaN borders have been removed
plt.imshow(cont_img.value)
plt.tight_layout()
plt.show()

In [None]:
#Plot the continuum in WCS
F = aplpy.FITSFigure(cont_img.hdu, north = True)
F.show_colorscale()
F.add_label(0.1, 0.9, name, relative = True, size = 22, weight = 'bold')
F.axis_labels.set_font(size = 22)
F.tick_labels.set_font(size = 18, stretch = 'condensed')

## Now to detect the point source in the datacube and extract and plot the spectra for each source

**Developer note** Finding a way to streamline the process of detecting sources within a data cube and extracting their
spectra would be extremely valuable.

For data cubes like those from the JWST/MIRI MRS, information on the point sources in the FOV will be necessary, as will
 a source-subtracted data cube (see the `PampelMuse` software for an example of how spectral extraction is implemented
  for near-IR data cubes like MUSE).

Note that these backgrounds of diffuse emission can be quite complex.

On these source-subtracted data cubes (see `SUBTRES` in `PampelMuse`) I would like to produce moment maps
(https://casa.nrao.edu/Release3.4.0/docs/UserMan/UserManse41.html) and Position-Velocity (PV) diagrams
(https://casa.nrao.edu/Release4.1.0/doc/UserMan/UserManse42.html).

### 1) Use `Photutils` to detect stars/point sources in the continuum image

The first step of the analysis is to identify those sources for which it is feasible to extract spectra from the IFU
data. Ideally we can estimate the signal-to-noise ratio (S/N) for all sources in the cube, do a number of checks to
determine the status of every source and loop through these (brightest first) to extract the spectra.
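
One way to read the "estimate the S/N, then loop brightest first" idea is sketched below with plain `numpy`. The function name `rank_sources_by_snr`, the `min_snr` cut and the use of the nearest-pixel value as the signal are assumptions made for illustration; the image is assumed to be background-subtracted already.

```python
import numpy as np

def rank_sources_by_snr(image, positions, background_std, min_snr=3.0):
    """Return (index, snr) pairs for sources above min_snr, brightest first.

    positions is a sequence of (x, y) pixel centroids. The S/N used here is
    simply the nearest-pixel value divided by the background scatter.
    """
    ranked = []
    for index, (x, y) in enumerate(positions):
        peak = image[int(round(y)), int(round(x))]   # arrays are indexed [y, x]
        snr = peak / background_std
        if snr >= min_snr:
            ranked.append((index, float(snr)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)

# Toy background-subtracted image with three candidate sources
image = np.zeros((20, 20))
image[5, 4] = 12.0
image[10, 15] = 40.0
image[17, 2] = 1.0
order = rank_sources_by_snr(image, [(4, 5), (15, 10), (2, 17)], background_std=2.0)
print(order)   # [(1, 20.0), (0, 6.0)]
```

The faintest candidate falls below the cut and is dropped, so the extraction loop would visit source 1 and then source 0.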

### 2) Extract the spectra from the datacube using `SpectralCube`

**Note** There are multiple ways of extracting spectra from datacubes. The simplest is to slice the cube along a single
pixel, but this is not ideal for point sources, which should cover multiple pixels.
Here I use *Aperture Extraction*. 

- The flux from each point source is obtained via a circular aperture. This requires making a circular mask and
applying it to the data to produce a masked subcube.

- A background is measured using a square/rectangular aperture, sliced in pixel coordinates to produce a sub-cube.

- An annulus surrounding the point source is used to measure the local background. 

- Predefined regions from DS9 etc. can be used to create a mask [`Not used here`].

*If you have only a small number of data cubes, selecting the source extraction region and background region manually
using `cubeviz` would be useful here.*

A mathematical operation (e.g. `max`, `mean`, `sum`, `median`, `std`) should then be applied to the region in the aperture.
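
The aperture and annulus masks described above can be sketched with plain `numpy` as follows. The helper names `radius_map` and `extract`, the toy cube and the chosen radii are hypothetical; in the notebook the masks are applied to the cube with `SpectralCube.with_mask` instead.

```python
import numpy as np

def radius_map(shape, x_cen, y_cen):
    """Distance of every pixel of a (n_y, n_x) frame from the centre (x_cen, y_cen)."""
    yy, xx = np.indices(shape, dtype=float)
    return np.hypot(yy - y_cen, xx - x_cen)

def extract(cube, mask, reducer=np.nansum):
    """Apply reducer over the masked spaxels of each wavelength plane."""
    return reducer(cube[:, mask], axis=1)

cube = np.ones((3, 11, 11))                     # toy cube ordered (n_spectral, n_y, n_x)
radius = radius_map(cube.shape[1:], x_cen=5, y_cen=5)

aperture = radius <= 2                          # circular source aperture
annulus = (radius > 3) & (radius <= 4)          # local background annulus

source_spectrum = extract(cube, aperture)                    # sum inside the aperture
background_level = extract(cube, annulus, np.nanmedian)      # median per pixel in the annulus
print(aperture.sum(), annulus.sum(), source_spectrum, background_level)
```

Swapping `reducer` for `np.nanmax`, `np.nanmean` or `np.nanstd` gives the other operations listed above.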

Below I show a few different options from the simple to the complex, which takes into account the background emission
within the data cube. Taking into account the background may not always be the preferred method but the option should
always be available when using an aperture extraction.

#### Steps to find the background

1) Define a background region either as an annulus or as a rectangle away from the source

2) Find the median of all the background pixels to account for variations 

3) Find number of pixels in background and number of pixels in the point source aperture

4) Find the sum of all the pixels in the point source aperture

5) Correct for the background: subtract the median background per pixel, multiplied by the number of pixels in the star aperture, from the summed star flux
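
The five steps above can be sketched in a few lines of `numpy`. The function name `background_corrected_spectrum`, the rectangular masks and the toy flux levels are assumptions for illustration, not the notebook's own implementation.

```python
import numpy as np

def background_corrected_spectrum(cube, source_mask, background_mask):
    """Aperture sum minus (median background per pixel) x (pixels in the aperture)."""
    bkg_per_pixel = np.nanmedian(cube[:, background_mask], axis=1)    # step 2
    n_source_pix = np.count_nonzero(source_mask)                       # step 3
    source_sum = np.nansum(cube[:, source_mask], axis=1)               # step 4
    return source_sum - bkg_per_pixel * n_source_pix                   # step 5

# Toy cube: two wavelength planes with background levels of 1 and 2 per pixel
cube = np.ones((2, 9, 9)) * np.array([1.0, 2.0])[:, None, None]

source_mask = np.zeros((9, 9), dtype=bool)
source_mask[3:6, 3:6] = True                    # 9-pixel source aperture
background_mask = np.zeros((9, 9), dtype=bool)
background_mask[0:2, :] = True                  # step 1: rectangle away from the source

cube[:, source_mask] += 5.0                     # source adds 5 per pixel in both planes

corrected = background_corrected_spectrum(cube, source_mask, background_mask)
print(corrected)   # [45. 45.]
```

Both planes recover the 9 × 5 = 45 flux units injected into the aperture, even though the background level differs between them.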



**Advanced Developer Note** Using Aperture Extraction to obtain the spectra for each source in the data cube is still
very simplistic. It should be noted that the MIRI aperture changes as a function of wavelength, and the steps above do
not account for this.
A good example of software that extracts point sources from datacubes is `PampelMuse`, by Sebastian Kamann. 
https://gitlab.gwdg.de/skamann/pampelmuse; https://ui.adsabs.harvard.edu/abs/2013A%26A...549A..71K/abstract
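
A minimal sketch of an aperture that follows the wavelength is given below. It assumes that the appropriate radius grows linearly with wavelength, as expected for a diffraction-limited PSF; the function name `scaled_aperture_sum` and the reference radius and wavelength are hypothetical values chosen for the example.

```python
import numpy as np

def scaled_aperture_sum(cube, wavelengths, x_cen, y_cen, r_ref, wave_ref):
    """Sum each plane inside a circle whose radius scales linearly with wavelength."""
    n_wave, n_y, n_x = cube.shape
    yy, xx = np.indices((n_y, n_x), dtype=float)
    radius = np.hypot(yy - y_cen, xx - x_cen)
    spectrum = np.empty(n_wave)
    for k, wave in enumerate(wavelengths):
        r_aperture = r_ref * wave / wave_ref        # assumed linear scaling
        spectrum[k] = np.nansum(cube[k][radius <= r_aperture])
    return spectrum

cube = np.ones((2, 11, 11))                         # toy cube, one flux unit per pixel
waves = np.array([5.0, 10.0])                       # microns
spec = scaled_aperture_sum(cube, waves, x_cen=5, y_cen=5, r_ref=1.0, wave_ref=5.0)
print(spec)   # [ 5. 13.]
```

The aperture holds 5 pixels at 5 microns and 13 pixels at 10 microns, so any background subtraction has to use the per-plane pixel count rather than a single number.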

An `optimal spectrum extraction` procedure would take into account the varying PSF through the cube, to produce an
accurate spectrum with the maximum possible signal-to-noise ratio. This weights the extracted data by the S/N of each
pixel (Horne 1986) and would be ideal when there is a complex background or for extracting spatially blended sources.
For small cubes it is best to fit a PSF profile to all resolved sources simultaneously, but this might not be possible
in larger data sets.
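
The weighting idea can be sketched in `numpy` using the profile-weighted estimator of Horne (1986), flux = Σ(P·D/V) / Σ(P²/V), where P is the normalised spatial profile, D the background-subtracted data and V the variance. The function name `optimal_extract`, the fixed Gaussian profile and the unit variance are assumptions for the example; a real extraction would use a wavelength-dependent PSF model and pixel masking.

```python
import numpy as np

def optimal_extract(cube, variance, profile):
    """Profile-weighted flux and its variance for each wavelength plane.

    cube and variance have shape (n_wave, n_y, n_x); profile has shape (n_y, n_x)
    and is normalised to sum to 1 over the spatial axes.
    """
    weights = profile / variance                        # broadcasts over wavelength
    norm = np.sum(profile * weights, axis=(1, 2))       # sum of P^2 / V
    flux = np.sum(weights * cube, axis=(1, 2)) / norm   # sum of P D / V, normalised
    flux_var = 1.0 / norm
    return flux, flux_var

# Toy example: noise-free Gaussian point source on a 9 x 9 grid
yy, xx = np.indices((9, 9), dtype=float)
profile = np.exp(-0.5 * ((yy - 4.0) ** 2 + (xx - 4.0) ** 2) / 1.5 ** 2)
profile /= profile.sum()

true_flux = np.array([10.0, 20.0])
cube = true_flux[:, None, None] * profile
variance = np.ones_like(cube)

flux, flux_var = optimal_extract(cube, variance, profile)
print(flux, flux_var)
```

With unit variance per pixel, a plain sum over all 81 pixels would have a variance of 81, whereas the weighted estimate has a smaller variance because faint outer pixels contribute less.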

**Advanced Developer Note 2** In dense fields like globular clusters, with a significant number of unresolved sources,
 or in embedded star-forming clusters, a more advanced treatment of the background would be necessary. For instance,
 using a homogeneous grid across the field of view, with parameters controlling the bin size, would be ideal. If a
 variable background is not accounted for in a PSF extraction, systematic residuals would be present in the data
 wherever the background is over- or underestimated.
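
A gridded background of this kind could look like the following `numpy` sketch, which takes the median in square bins and expands it back to the image size. The function name `gridded_background` and the `bin_size` parameter are hypothetical, and a production version would also interpolate between bins and mask the sources.

```python
import numpy as np

def gridded_background(image, bin_size):
    """Median background measured on a coarse grid, expanded to the image shape."""
    n_y, n_x = image.shape
    background = np.empty_like(image, dtype=float)
    for y0 in range(0, n_y, bin_size):
        for x0 in range(0, n_x, bin_size):
            cell = image[y0:y0 + bin_size, x0:x0 + bin_size]
            background[y0:y0 + bin_size, x0:x0 + bin_size] = np.nanmedian(cell)
    return background

# Toy image: background step from 1 to 3 across the field plus one bright point source
image = np.ones((8, 8))
image[:, 4:] = 3.0
image[1, 1] = 100.0

background = gridded_background(image, bin_size=4)
residual = image - background
print(residual[1, 1], np.count_nonzero(residual))   # 99.0 1
```

The median keeps the point source from biasing its own bin, so the residual image contains only the source.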



## Detect, extract and plot 1D spectrum of each source in the cube 

### First automatically identify all the point sources in the cube using `photutils`

In [None]:
# Make an array to store results of the source detection within the data cube
name_val = []
source_val = []
ra_val =[]
dec_val =[]

In [None]:
# Crop out Edges and take absolute value of the continuum image
#cont_img = cont_img[1:13, 1:13]

# Find the background in the collapsed datacube
mean, median, std = sigma_clipped_stats(cont_img.value, sigma = 2.0)

# Get a list of sources using a dedicated source detection algorithm
# Find sources at least 3 sigma above the background (typically)

daofind = DAOStarFinder(fwhm = 2.0, threshold = 3. * std)
sources = daofind(cont_img.value - median) 
# DAOStarFinder returns None when no sources are found
n_sources = 0 if sources is None else len(sources)
print("\n Number of sources in field: ", n_sources)

### If point sources are present in the cube extract and plot the spectrum of each source

#### In the cell below we:

1) Extract a spectrum for each detected object using aperture photometry and a circular masked region.

2) Make an estimate of the background in the datacube using both an annulus around each source and a box region away
from the source. This box and annulus are hard-coded and not ideal for other datasets or multiple cubes.

3) Generate a background-corrected spectrum.

4) Plot the spectrum and its various background-corrected versions. 

5) Convert the spectra into Jy.

6) Write each of the spectra to a file. (They could be put into a `specutils` `Spectrum1D` object at this stage but I
have not done this here.) This file is loaded by all other routines to do analysis on the data.
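
For step 5, the conversion from a flux density per unit wavelength to Jy follows F_ν = F_λ λ² / c. The standard-library sketch below writes this out for input in erg / (Angstrom cm2 s), the unit named in the cell that follows; the function name `flam_to_jy` and the example values are hypothetical. The notebook performs the same conversion with `u.spectral_density`, and cubes in other units (for example surface brightness) would need a different conversion.

```python
C_ANGSTROM_PER_S = 2.99792458e18    # speed of light in Angstrom / s
CGS_FNU_TO_JY = 1.0e23              # 1 erg s^-1 cm^-2 Hz^-1 = 1e23 Jy

def flam_to_jy(flam, wave_micron):
    """Convert F_lambda in erg / (Angstrom cm2 s) to F_nu in Jy."""
    wave_angstrom = wave_micron * 1.0e4
    fnu_cgs = flam * wave_angstrom ** 2 / C_ANGSTROM_PER_S    # erg / (Hz cm2 s)
    return fnu_cgs * CGS_FNU_TO_JY

# Hypothetical example: 1e-15 erg / (Angstrom cm2 s) at 2.2 microns
fnu_jy = flam_to_jy(1.0e-15, 2.2)
print(fnu_jy)   # about 0.0161 Jy
```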

In [None]:
if sources is not None and len(sources) > 0:
    print()            
    for col in sources.colnames:
        sources[col].info.format = '%.8g'  # for consistent table output

    print(sources)                  

    # From the list of sources in the field get their RA and DEC (ICRS)
    print()

    # Positions in pixels
    positions = Table([sources['xcentroid'], sources['ycentroid']])

    # Instantiate WCS object
    w = WCS(cont_img.header)

    # Convert to RA & Dec (ICRS)
    radec_lst = w.pixel_to_world(sources['xcentroid'], sources['ycentroid'])
    
    #-----------------------------------------------------  
    # We are now entering a loop which does multiple processing steps on each 
    # point source detected in the cube
    
    for count_s, _ in enumerate(sources):
        print(radec_lst[count_s].to_string('hmsdms'))
        name_val.append(name)
        source_val.append(count_s)
        ra_val.append(radec_lst[count_s].ra.deg)
        dec_val.append(radec_lst[count_s].dec.deg)
        
        #-----------------------------------------------------           
        # Aperture Extract spectrum of point source - using a circular aperture

        # Size of frame 
        ysize_pix = cmin.shape[1]
        xsize_pix = cmin.shape[2]

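        # Set up the centroid pixel for the source.
        # The centroids were measured on cont_img, which is the [10:30, 10:30]
        # cut-out of cmin, so add the cut-out offset to express them in the
        # pixel frame of cmin that is used for the masks below.
        crop_offset_pix = 10
        ycent_pix = sources['ycentroid'][count_s] + crop_offset_pix
        xcent_pix = sources['xcentroid'][count_s] + crop_offset_pix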

        # Make an aperture radius for source
        # If made into a function this value should not be hardcoded
        aperture_rad_pix = 2

        # Make a masked array for the aperture
        yy, xx = np.indices([ysize_pix,xsize_pix], dtype = 'float')
        radius = ((yy-ycent_pix)**2 + (xx-xcent_pix)**2)**0.5

        # Select pixels within the aperture radius
        mask = radius <= aperture_rad_pix

        # Make a masked cube
        maskedcube = cmin.with_mask(mask)

        # Pixels in aperture
        pix_in_ap = np.count_nonzero(mask == 1)

        # Extract the spectrum from only the circular aperture - use sum
        spectrum = maskedcube.sum(axis = (1,2))

        # Extract the noise spectrum for the source
        noisespectrum = maskedcube.std(axis = (1,2))

        # Measure a spectrum from the background - Use an annulus around the source
        # NOTE: Hardcoded values in for annulus size - improve

        # Select pixels within an annulus
        an_mask = (radius > aperture_rad_pix + 1) & (radius <= aperture_rad_pix + 2)

        # Make a masked cube
        an_maskedcube = cmin.with_mask(an_mask)

        # Extract the background spectrum from only the annulus
        bkg_spectrum = an_maskedcube.median(axis = (1,2))

        # Background corrected spectrum - annulus
        corr_sp = spectrum - (bkg_spectrum * pix_in_ap)

        # Try measuring a spectrum from the background -> Use a box away from source.
        # NOTE: Hardcoded values in for box region - improve

        bkgcube = cmin[:, 13:16, 13:16]#[: , 1:3, 10:13]
        bkgbox_spectrum = bkgcube.median(axis = (1,2))
        bkg_img = bkgcube.median(axis = 0)

        # Background corrected spectrum - box
        corr_sp_box = spectrum - (bkgbox_spectrum * pix_in_ap)
                
        #-----------------------------------------------------               
        # Plot the spectrum extracted from circular aperture via: a sum extraction
        
        plt.figure(figsize = (10,5))

        plt.plot(maskedcube.spectral_axis.value, spectrum.value,
                 label = 'Source')
        plt.plot(maskedcube.spectral_axis.value, corr_sp.value,
                 label = 'Bkg Corr')
        plt.plot(maskedcube.spectral_axis.value, corr_sp_box.value,
                 label = 'Bkg Corr box')

        plt.xlabel('Wavelength (microns)')
        plt.ylabel(spectrum.unit)

        plt.gcf().text(0.5, 0.85, name, fontsize = 14, ha = 'center')

        plt.gcf().text(0.5, 0.80, radec_lst[count_s].to_string('decimal'),
                       ha = 'center', fontsize=14)

        plt.legend(frameon = False, fontsize = 'medium')
        plt.tight_layout()
        plt.show()
        plt.close()
        
        #-----------------------------------------------------  
        # Convert flux from erg / (Angstrom cm2 s) to Jy 
        
        spectrumJy = spectrum.to(
            u.Jy, equivalencies = u.spectral_density(maskedcube.spectral_axis))

        corr_sp_Jy = corr_sp.to(
            u.Jy, equivalencies = u.spectral_density(maskedcube.spectral_axis))

        corr_sp_box_Jy = corr_sp_box.to(
            u.Jy, equivalencies= u.spectral_density(maskedcube.spectral_axis))

        noiseSp_Jy = noisespectrum.to(
            u.Jy, equivalencies = u.spectral_density(maskedcube.spectral_axis))

        #-----------------------------------------------------
        # Save each extracted spectrum to a file

        # Set an output name
        spec_outname = name + "_" + str(count_s) + "_" + "spec"

        # Make output table 
        specdata_tab = Table([maskedcube.spectral_axis, corr_sp_Jy, noiseSp_Jy,
                              spectrumJy, corr_sp_box_Jy],
                             names=['wave_mum', 'cspec_Jy', 'err_fl_Jy',
                                    'spec_Jy', 'cSpec_box_Jy'])

        # Write the file
        # ascii.write(specdata_tab, outdir_spectra + spec_outname +".csv",
        #             format = 'csv', overwrite = True)

    #-----------------------------------------------------
    # Do aperture photometry on the sources - Only if using sum of image

    # Take the list of star positions from DAOFIND and use it to define apertures.
    # Stack the centroids into an (N, 2) array of (x, y) pairs so the positions
    # are unambiguous for any number of sources.
    positions_pix = np.transpose((sources['xcentroid'], sources['ycentroid']))

    apertures = CircularAperture(positions_pix, r = 2.)   # Aperture radius = 2 pixels
    
    #-----------------------------------------------------
    # As a check to make sure all obvious point sources have been identified
    # plot the cube with the NaN borders removed and overplot the apertures
    # for the extracted sources
    plt.figure()

    plt.subplot(1, 2, 1)
    plt.imshow(cont_img.value, cmap='Greys', origin='lower')
    apertures.plot(color='blue', lw=1.5, alpha=0.5)

    plt.subplot(1, 2, 2)
    plt.imshow(cont_img.value, origin='lower')

    plt.tight_layout()
    plt.show()
    plt.close()

else:  
    # Plot the cube with the NaN borders removed 
    plt.figure()
    plt.imshow(cont_img.value, origin='lower')
    plt.tight_layout()
    plt.show()
    plt.close()

In [None]:
# Make table of extracted sources
source_extspec_tab = Table([name_val, source_val, ra_val, dec_val], 
                           names = ("name", "source_no", "ra", "dec"))
print(source_extspec_tab)   

## Data analysis of the extracted spectra using `specutils`
Because JWST flight data are not yet available, we instead use the ISO SWS spectrum of a dusty AGB star, a cool M star.

In [None]:
# Set the path to the spectrum file (here the SWS spectrum of the dusty AGB star)
dusty_AGB_spec_file = data_in_path + '63702662.txt'

spectra_file = dusty_AGB_spec_file

In [None]:
# Read in the spectrum (saved as a text file) and do some housekeeping
data = ascii.read(spectra_file)

if data.colnames[0] == 'col1':
    data['col1'].name = 'wave_mum'
    data['col2'].name = 'cspec_Jy'            
    data['col3'].name = 'err_fl_Jy'         

wav = data['wave_mum'] * u.micron  # Wavelength: microns
fl = data['cspec_Jy'] * u.Jy       # Fnu:  Jy
efl = data['err_fl_Jy'] * u.Jy     # Error flux: Jy

# Make a 1D spectrum object
spec = Spectrum1D(spectral_axis = wav, flux = fl, uncertainty = StdDevUncertainty(efl))

**Note** A spectrum made up of multiple spectral segments may include a spectral order column in its file. In
many instances these orders are not correctly stitched together because of issues with background subtraction and flux
calibration. The order column must be read into `Spectrum1D` along with the spectrum so that corrections and scaling
can be applied to each segment individually to fix the jumps between them.
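
A minimal sketch of the per-segment scaling described in the note, assuming two segments that overlap in wavelength. The helper name `scale_to_overlap` and the median-ratio approach are illustrative assumptions, not part of `specutils`, and the arrays are synthetic.

```python
import numpy as np


def scale_to_overlap(wave_a, flux_a, wave_b, flux_b):
    """Return the factor that scales segment B onto segment A.

    Hypothetical helper: the factor is the median flux ratio A/B inside
    the wavelength range covered by both segments.
    """
    lo = max(wave_a.min(), wave_b.min())
    hi = min(wave_a.max(), wave_b.max())
    if lo >= hi:
        raise ValueError("segments do not overlap")
    # Sample segment A onto segment B's grid inside the overlap
    in_overlap = (wave_b >= lo) & (wave_b <= hi)
    flux_a_on_b = np.interp(wave_b[in_overlap], wave_a, flux_a)
    return np.median(flux_a_on_b / flux_b[in_overlap])


# Synthetic example: segment B is 10% too faint relative to segment A
wave_a = np.linspace(5.0, 10.0, 101)
wave_b = np.linspace(9.0, 15.0, 121)
flux_a = 2.0 + 0.1 * wave_a
flux_b = 0.9 * (2.0 + 0.1 * wave_b)

factor = scale_to_overlap(wave_a, flux_a, wave_b, flux_b)
flux_b_scaled = factor * flux_b   # segment B now lines up with segment A
```

A multiplicative factor suits flux-calibration offsets; a background offset would instead call for an additive correction.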

In [None]:
# Apply a 5 pixel boxcar smoothing to the spectrum
spec_bsmooth = box_smooth(spec, width = 5)   

# Plot the spectrum & smoothed spectrum to inspect features 
plt.figure(figsize = (8,4))
plt.plot(spec.spectral_axis, spec.flux, label = 'Source')
plt.plot(spec.spectral_axis, spec_bsmooth.flux, label = 'Smoothed')
plt.xlabel('Wavelength (microns)')
plt.ylabel("Flux ({:latex})".format(spec.flux.unit))

plt.legend(frameon = False, fontsize = 'medium')
plt.tight_layout()
plt.show()
plt.close()

### Fit a continuum - find the best-fitting template (stellar photosphere model or blackbody)

**Note** - Ideally the photosphere would be fitted with a set of Phoenix models, but this does not currently work.
`template_comparison` may be a good function to use with the Phoenix models, which have been set up to
interface with `pysynphot`.

For now, a blackbody is used instead.

- For AGB stars with a photosphere component, fit a stellar photosphere model or a blackbody to the short-wavelength
end of the spectrum.
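
For reference, the blackbody used here is the Planck function per unit frequency, rewritten in terms of wavelength with $\nu = c/\lambda$:

$$B_\nu(T) = \frac{2 h \nu^3}{c^2}\,\frac{1}{e^{h\nu/kT} - 1} = \frac{2 h c}{\lambda^3}\,\frac{1}{e^{hc/\lambda k T} - 1}$$

In SI units this is W m$^{-2}$ Hz$^{-1}$ sr$^{-1}$. When the model $A\,B_\nu(T)$ is fitted to fluxes in Jy, the free scale factor $A$ can be read as absorbing both the source solid angle and the factor of $10^{26}$ that converts W m$^{-2}$ Hz$^{-1}$ to Jy, under the assumption that the star radiates as a uniform blackbody.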

In [None]:
def blackbody_Fnu(lam, T, A):
    """ Blackbody as a function of wavelength (um) and temperature (K).
        Returns the Planck function in f_nu units, scaled by the factor A.
        # [Y Jy] = 1.0E+23 * [X erg/cm^2/s/Hz] = 1.0E+26 * [X Watts/m^2/Hz]
    """
    from scipy.constants import h, k, c
    lam = 1e-6 * lam                                              # convert to metres
    bb_nu = 2*h*c / (lam**3 * (np.exp(h*c / (lam*k*T)) - 1))      # units of W/m^2/Hz/Steradian ; f_nu units
    return A * bb_nu

In [None]:
# Only want to fit to a small wavelength range at the start of the spectra
phot_fit_region = [3.0, 9.4]  # Microns

# Trim the spectrum to the region showing a stellar photosphere
sub_region_phot = SpectralRegion([(phot_fit_region[0], phot_fit_region[1])] * u.micron)
sub_spectrum_phot = extract_region(spec, sub_region_phot)

In [None]:
# fit BB to the data
def phot_fn(wa, T1, A):
    return blackbody_Fnu(wa, T1, A) 

popt, pcov = curve_fit(phot_fn, sub_spectrum_phot.spectral_axis.value,
                       sub_spectrum_phot.flux.value, p0=(3000, 10000),
                       sigma=sub_spectrum_phot.uncertainty.quantity)

# Get the best fitting parameter value and their 1 sigma errors
best_t1, best_a1 = popt
sigma_t1, sigma_a1 = np.sqrt(np.diag(pcov))

ybest = blackbody_Fnu(spec.spectral_axis.value, best_t1, best_a1)

print ('Parameters of best-fitting model:')
print ('T1 = %.2f +/- %.2f' % (best_t1, sigma_t1))

degrees_of_freedom = len(sub_spectrum_phot.spectral_axis.value) - 2

resid = (sub_spectrum_phot.flux.value - phot_fn(sub_spectrum_phot.spectral_axis.value, *popt)) \
        / sub_spectrum_phot.uncertainty.quantity

chisq = np.dot(resid, resid)

print('Reduced chi2 = %.2f' % (chisq.value / degrees_of_freedom))

In [None]:
# Plot the spectrum & the model fit to the short wavelength region of the data.
plt.figure(figsize = (8,4))
plt.plot(spec.spectral_axis, spec.flux, label = 'Source')
plt.plot(spec.spectral_axis, ybest, label = 'BB')
plt.xlabel('Wavelength (microns)')
plt.ylabel("Flux ({:latex})".format(spec.flux.unit))
plt.title("Spectrum with blackbody fit")
plt.legend(frameon = False, fontsize = 'medium')
plt.tight_layout()
plt.show()
plt.close()

# Now subtract the BB and plot the underlying dust continuum
plt.figure(figsize = (8,4))
plt.plot(spec.spectral_axis, spec.flux.value - ybest, label = 'Dust spectra')
plt.axhline(0, color='r', linestyle = 'dashdot', alpha=0.5)
plt.xlabel('Wavelength (microns)')
plt.ylabel("Flux ({:latex})".format(spec.flux.unit))
plt.title("Continuum-subtracted spectrum")
plt.legend(frameon = False, fontsize = 'medium')
plt.tight_layout()
plt.show()
plt.close()

### With the dust continuum isolated, look for features and measure their properties

We want to find:
- Equivalent width
- Integrated line flux
- Optical depth
- Centroids = wavelength with half the flux on either side

#### As an example, let's focus on the amorphous silicate 10 micron region.

**Method - used repeatedly**

- Fit a spline to the photosphere-continuum-subtracted spectrum, excluding the feature from the fit.
- Trim the spectrum to that wavelength region, as the spline now has a different size from the full wavelength range
of the spectrum.
- Make continuum-subtracted and continuum-normalised spectra.
- Convert the flux units from Jy to W/m^2/wavelength so that the line integration gives convenient units.
- Determine the feature line flux in units of W/m^2 and the feature centroid. Use the continuum-subtracted spectrum.
- Determine the feature equivalent width. Use the continuum-normalised spectrum.
- Make sure errors have been propagated correctly.
- Store these results in a table.
- Several molecular and dust features are normally present in the spectra. Repeat for each feature.

**Note**
This seems like a long-winded way to do this. Is there a simpler approach?

> For instance, a tool is needed that takes four wavelengths, fits a line using the data from lam0 to lam1 and lam2 to
>lam3, then passes the continuum-subtracted spectrum for line integration from lam1 to lam2 with error propagation.
>This is required several times for dust features, but with the current `Spectrum1D` framework it takes many steps to
>write manually and is beyond tedious after doing it for 2 features, let alone 20+. A similar framework is also needed
>for the integrated line centroid with uncertainty and for the extracted equivalent width.
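
A minimal sketch of what such a four-wavelength tool could look like, written with plain `numpy` arrays rather than `Spectrum1D`. The function name `measure_feature` is hypothetical. The centroid here is the flux-weighted mean wavelength, the equivalent width is positive for absorption, and the line-flux uncertainty propagates only the per-pixel flux errors (the uncertainty of the continuum fit is ignored).

```python
import numpy as np


def measure_feature(wave, flux, err, lam0, lam1, lam2, lam3):
    """Measure a feature between lam1 and lam2 (hypothetical helper).

    A straight-line continuum is fitted to the windows lam0-lam1 and
    lam2-lam3, then the feature is integrated between lam1 and lam2.
    Returns (line_flux, line_flux_err, centroid, equivalent_width).
    """
    side = ((wave >= lam0) & (wave <= lam1)) | ((wave >= lam2) & (wave <= lam3))
    slope, intercept = np.polyfit(wave[side], flux[side], 1)

    feat = (wave >= lam1) & (wave <= lam2)
    w = wave[feat]
    continuum = slope * w + intercept
    consub = flux[feat] - continuum

    # Trapezoid-rule weights, so each integral is sum(weights * integrand)
    dw = np.diff(w)
    weights = np.zeros_like(w)
    weights[:-1] += 0.5 * dw
    weights[1:] += 0.5 * dw

    line_flux = np.sum(weights * consub)
    line_flux_err = np.sqrt(np.sum((weights * err[feat]) ** 2))
    centroid = np.sum(weights * w * consub) / line_flux
    equivalent_width = np.sum(weights * (1.0 - flux[feat] / continuum))
    return line_flux, line_flux_err, centroid, equivalent_width


# Synthetic check: Gaussian emission feature on a sloping continuum
wave = np.linspace(8.0, 15.0, 701)
amp, mu, sigma = 2.0, 10.5, 0.4
flux = 1.0 + 0.05 * wave + amp * np.exp(-0.5 * ((wave - mu) / sigma) ** 2)
err = np.full_like(wave, 0.01)

line_flux, line_flux_err, centroid, ew = measure_feature(
    wave, flux, err, 8.0, 8.1, 14.9, 15.0)
```

Looping this function over a table of `(lam0, lam1, lam2, lam3)` values would cover many features without repeating the individual steps.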

In [None]:
# Fit a spline to the 10 micron feature to isolate it.

bbsub_spectra = spec - ybest     # continuum subtracted spectra - Dust only

# Fit a local continuum between the flux densities at: 8.0 - 8.1 & 14.9 - 15.0 microns
# (i.e. excluding the line itself)

sw_region = 8.0   #lam0
sw_line = 8.1     #lam1
lw_line = 14.9    #lam2
lw_region = 15.0  #lam3

# Zoom in on the line complex & extract
line_reg_10 = SpectralRegion([(sw_region*u.um, lw_region*u.um)])
line_spec = extract_region(bbsub_spectra, line_reg_10)

# Fit a local continuum - exclude the actual dust feature when doing the fit

lgl_fit = fit_generic_continuum(line_spec, 
                                exclude_regions = SpectralRegion([(sw_line*u.um, 
                                                                   lw_line*u.um)])) 

# Determine Y values of the line continuum
line_y_continuum = lgl_fit(line_spec.spectral_axis) 

#-----------------------------------------------------------------
# Generate a continuum subtracted and continuum normalised spectra

line_spec_norm = line_spec / line_y_continuum

line_spec_consub = line_spec - line_y_continuum

#-----------------------------------------------------------------
# Plot the dust feature & continuum fit to the region

plt.figure(figsize = (8, 4))

plt.plot(line_spec.spectral_axis, line_spec.flux.value,
         label = 'Dust spectra 10 micron region')

plt.plot(line_spec.spectral_axis, line_y_continuum, label = 'Local continuum')

plt.xlabel('Wavelength (microns)')
plt.ylabel("Flux ({:latex})".format(spec.flux.unit))
plt.title(r"10$\mu$m feature plus local continuum")
plt.legend(frameon = False, fontsize = 'medium')
plt.tight_layout()
plt.show()
plt.close()

#-----------------------------------------------------------------
# Plot the continuum subtracted 10 micron feature

plt.figure(figsize = (8,4))

plt.plot(line_spec_consub.spectral_axis, line_spec_consub.flux,
         label = 'continuum subtracted')

plt.xlabel('Wavelength (microns)')
plt.ylabel("Flux ({:latex})".format(spec.flux.unit))
plt.title(r"Continuum subtracted 10$\mu$m feature")
plt.tight_layout()
plt.show()
plt.close()

In [None]:
# Calculate the Line flux; Line Centroid; Equivalent width
# NOTE: Where are errors computed with these functions?

line_centroid = centroid(line_spec_consub, SpectralRegion(sw_line*u.um, lw_line*u.um))

line_flux_val = line_flux(line_spec_consub, SpectralRegion(sw_line*u.um, lw_line*u.um))

equivalent_width_val = equivalent_width(line_spec_norm)

# Hack to convert the line flux value into more conventional units
# Necessary as spectra has mixed units: f_nu+lambda
line_flux_val = (line_flux_val * u.micron).to(u.W * u.m**-2 * u.micron,
                                              u.spectral_density(line_centroid)) / u.micron

print("Line_centroid: {:.6} ".format(line_centroid))
print("Integrated line_flux: {:.6} ".format(line_flux_val))
print("Equivalent width: {:.6} ".format(equivalent_width_val))


**Developer note** The hack in the cell above is necessary because the line flux computed by `specutils` is returned
in units of Jy micron, which is hard to convert into conventional units within the current `specutils` framework.
Line fluxes should be in units of W/m^2. A simple way to convert the flux and its associated error to other units
when dealing with a 1D spectral object with "mixed" spectral x and y axis units seems necessary.
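
The arithmetic behind this conversion can be sketched in plain Python. The helper name `jy_micron_to_w_m2` is hypothetical, and the sketch assumes that $F_\lambda = F_\nu\,c/\lambda^2$ can be evaluated at a single wavelength such as the line centroid. For a feature as broad as the 10 micron silicate band this is only approximate; converting the spectrum to $F_\lambda$ before integrating avoids the approximation.

```python
# Speed of light in m/s (exact SI value)
C_M_PER_S = 299_792_458.0


def jy_micron_to_w_m2(line_flux_jy_um, wavelength_um):
    """Convert an integrated line flux from Jy micron to W / m^2.

    Assumes F_lambda = F_nu * c / lambda^2 evaluated at one wavelength
    (for example the line centroid).
    """
    flux_si = line_flux_jy_um * 1e-26 * 1e-6      # W m^-2 Hz^-1 m
    wavelength_m = wavelength_um * 1e-6           # m
    return flux_si * C_M_PER_S / wavelength_m**2  # W m^-2


# 1 Jy micron at 10 microns
example = jy_micron_to_w_m2(1.0, 10.0)
```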

In [None]:
# Compute the optical depth of the 10 micron feature
tau = -(np.log(line_spec.flux.value / line_y_continuum.value))

optdepth_spec = Spectrum1D(spectral_axis = line_spec.spectral_axis,
                           flux = tau*(u.Jy/u.Jy))
        

**Developer note** Trying to put the optical depth into a `Spectrum1D` object without units results in an error.
Optical depth is unitless, so `(u.Jy/u.Jy)` is used as a workaround; `u.dimensionless_unscaled` is the explicit `astropy` equivalent.
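
A minimal `numpy` sketch of the optical depth calculation with a first-order uncertainty, assuming a noise-free continuum. The helper name `optical_depth` is hypothetical. Note that the logarithm is undefined wherever the flux-to-continuum ratio is zero or negative, which can happen in a continuum-subtracted spectrum.

```python
import numpy as np


def optical_depth(flux, continuum, flux_err=None):
    """Return tau = -ln(flux / continuum) and, optionally, its uncertainty.

    First-order propagation with a noise-free continuum gives
    sigma_tau = flux_err / |flux|.
    """
    flux = np.asarray(flux, dtype=float)
    continuum = np.asarray(continuum, dtype=float)
    tau = -np.log(flux / continuum)
    if flux_err is None:
        return tau, None
    return tau, np.abs(np.asarray(flux_err, dtype=float) / flux)


# Two pixels at 50% and 25% of the continuum level
tau, tau_err = optical_depth([5.0, 2.5], [10.0, 10.0], flux_err=[0.5, 0.5])
```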

In [None]:
# Plot the optical depth of the 10 micron region vs wavelength
plt.figure(figsize = (10,6))
plt.plot(optdepth_spec.spectral_axis, optdepth_spec.flux)
plt.xlabel("Wavelength ({:latex})".format(spec.spectral_axis.unit))
plt.ylabel('Tau') 
plt.tight_layout()
plt.show()
plt.close()

**Note** At this point repeat *all* the steps above to isolate solid-state features, e.g. the forsterite feature
at approximately 13.3 microns.

#### Now try looking for low crystalline silicate features at 23, 28, 33 microns in the spectra.

In [None]:
bbsub_spectra = spec - ybest  # photosphere continuum subtracted spectra

spline_points = [20.0, 21.3, 22.0, 24.4, 25.5, 33.8, 35.9] * u.micron  

fluxc_resample = SplineInterpolatedResampler()

# Generate a spline fit to the dust continuum
spline_spec = fluxc_resample(bbsub_spectra, spline_points) 

In [None]:
# Plot the underlying dust continuum and spline fit
plt.figure(figsize = (8,4))
plt.plot(bbsub_spectra.spectral_axis, bbsub_spectra.flux.value, label = 'Dust spectra')
plt.plot(spline_spec.spectral_axis, spline_spec.flux.value, label = 'Spline spectra')

plt.axhline(0, color='r', linestyle='dashdot', alpha=0.5)

plt.xlabel('Wavelength (microns)')
plt.ylabel("Flux ({:latex})".format(spec.flux.unit))
plt.title("Continuum-subtracted spectrum with spline")
plt.legend(frameon = False, fontsize = 'medium')
plt.tight_layout()
plt.show()
plt.close()

# Plot a zoom of the dust continuum and spline fit over the spline region
plt.figure(figsize = (8,4))
plt.plot(bbsub_spectra.spectral_axis, bbsub_spectra.flux.value, label = 'Dust spectra')
plt.plot(spline_spec.spectral_axis, spline_spec.flux.value, label = 'Spline spectra')

plt.xlim(spline_points[0].value, spline_points[-1].value)

plt.xlabel('Wavelength (microns)')
plt.ylabel("Flux ({:latex})".format(spec.flux.unit))
plt.title("Zoom of continuum-subtracted spectrum with spline")
plt.legend(frameon = False, fontsize = 'medium')
plt.tight_layout()
plt.show()
plt.close()

**Developer note** After fitting a spline to a sub-region, the spectrum and the spline no longer have the same shape,
so `bbsub_spectra.flux.value - spline_spec.flux.value` now breaks. The spectrum would need to be trimmed to the spline
size before looking closely for low-contrast dust features and again measuring their properties (see above). A wrapper
that avoids repeating the same steps over and over would be nice.
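
One way around the shape mismatch, sketched with plain `numpy` arrays: evaluate the continuum on the spectrum's own wavelength grid and keep only the pixels between the first and last anchor point. The helper name `subtract_anchor_continuum` is hypothetical, and `np.interp` (linear interpolation) stands in for the spline here.

```python
import numpy as np


def subtract_anchor_continuum(wave, flux, anchor_wave, anchor_flux):
    """Subtract a continuum drawn through anchor points (hypothetical helper).

    Only the part of the spectrum between the first and last anchor is
    returned, so the spectrum and continuum always share one shape.
    np.interp is linear interpolation, used here in place of a spline.
    """
    inside = (wave >= anchor_wave[0]) & (wave <= anchor_wave[-1])
    continuum = np.interp(wave[inside], anchor_wave, anchor_flux)
    return wave[inside], flux[inside] - continuum


# Synthetic example using the anchor wavelengths from the spline cell
anchors = np.array([20.0, 21.3, 22.0, 24.4, 25.5, 33.8, 35.9])
wave = np.linspace(15.0, 40.0, 251)
flux = 3.0 + 0.2 * wave                  # featureless linear "dust continuum"
anchor_flux = np.interp(anchors, wave, flux)

wave_trim, residual = subtract_anchor_continuum(wave, flux, anchors, anchor_flux)
```

Because the continuum is evaluated on the trimmed grid itself, the subtraction cannot fail on mismatched array sizes, and the residual can be passed straight to the feature measurements.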

## Additional Resources

- [PampelMuse](https://gitlab.gwdg.de/skamann/pampelmuse)
- [CASA](https://casa.nrao.edu/Release3.4.0/docs/UserMan/UserManse41.html)

## About this notebook
**Author:** Olivia Jones, Project Scientist, UK ATC.<br>
**Updated On:** 2020-08-11<br>
**Later Updated On:** 2021-May-10 by B. Sargent, STScI Scientist, Space Telescope Science Institute

***

[Top of Page](#top)