# q3dfit example notebook: rest-frame mid-IR, Spitzer IRS data of 2MASX J15561599+3951374, a.k.a. IRAS F15545+4000.

<h3><font color='teal'>Installation of the environment and the package are described <a href="https://q3dfit.readthedocs.io/">here</a>. </font></h3>

This Jupyter notebook allows you to run Q3Dfit, a PSF decomposition and spectral analysis package tailored for JWST NIRSpec and MIRI IFU observations. 

Q3Dfit is developed as a science-enabling data product by the Early Release Science Team #1335 Q3D. You can find more information about this ERS program **Q3D** [here](https://wwwstaff.ari.uni-heidelberg.de/dwylezalek/q3d.html) and [here](https://www.stsci.edu/jwst/science-execution/approved-programs/dd-ers/program-1335).

The software is based on the existing package IFSFIT developed by Dave Rupke (see [ADS link](https://ui.adsabs.harvard.edu/abs/2017ApJ...850...40R/abstract)).

The following notebook will guide you through the initialization procedure and will then perform the analysis. 

## Table of Contents

* [1. Initialization](#chapter1)
    * [1.0. Setting up the directory tree](#chapter1_0)
    * [1.1. Initializing the fit](#chapter1_1)
    * [1.2. Setting up the data and models](#chapter1_2)
    * [1.3. Setting up the fitting parameters](#chapter1_3)
        * [1.3.1. Emission-line parameters](#chapter1_3_1)
        * [1.3.2. Continuum parameters](#chapter1_3_2)
* [2. Run fit](#chapter2)
* [3. Plot fit results](#chapter3)

## 1. Initialization <a class="anchor" id="chapter1"></a>

In [None]:
import os.path
import numpy as np
%load_ext autoreload
%autoreload 2
%matplotlib widget

In [None]:
# Be sure to set the path to q3dfit correctly.
# For instance:
#import sys
#sys.path.append('/Users/jwstuser/q3dfit/')
#import sys
#sys.path.append("../")

### 1.0. Setting up the directory tree <a class="anchor" id="chapter1_0"></a>

Define the directories in which the data cube(s) that you want to analyse are stored and the output directories. We recommend creating a working directory that you name after your target, in which all outputs from q3dfit will be saved. Then download test data.

In [None]:
# Base directory (book-keeping)
volume = 'Spitzer-example/'
# prefix label for output files
label = 'j1556'
# Input directory
indir = volume
if not os.path.exists(indir):
    os.makedirs(indir)
# Output directory
outdir = volume
if not os.path.exists(outdir):
    os.makedirs(outdir)
# Initialization file (q3di.npy) directory
initdir = volume
# Output logfile
logfile = os.path.join(outdir, label+'-fitlog.txt')

Download data from public Box folder. <font color='red'> Note: This also downloads the `.cf` configuration file, which specifies which spectral components to include in the MIR fitting. The format of this file is currently under development and will in the future be superseded by a more readable and convenient option. For a current description of the continuum fitting parameters in the `.cf` file, please see the documentation: https://q3dfit.readthedocs.io/en/latest/ </font>

In [None]:
# make tuples of urls and download filenames
# infile = 1x1 mock data cube: containing only 1 single spaxel with a Spitzer spectrum
# cf = config file
infile_tup = ('https://rhodes.box.com/shared/static/spe9pc4kbwylw2khwcca0cimoaks6puh.fits', '22128896_mock_cube.fits')
cf_tup = ('https://rhodes.box.com/shared/static/6502fu97kxwky9zl5t2t8gnn6n2fi2y9.cf', '22128896.cf')
# download files; by default don't force overwrite and take first element of output
from q3dfit.jnb import download_files
infile = download_files(infile_tup, indir, force=False)[0]
cfinfile = download_files(cf_tup, indir, force=False)[0]
# add subdirectory to filenames
infile = os.path.join(indir, infile)
cfinfile = os.path.join(indir, cfinfile)

### 1.1. Initializing the fit <a class="anchor" id="chapter1_1"></a>

The initial parameters of the fit are stored in an object of class `q3din`. Each parameter or attribute of this class controls some aspect of the fit process. We start by instantiating the class. The only required parameters at the outset are the input data cube and label; the label is used for output file naming. 

The default JWST pipeline output has data, variance, and data quality in extensions 1, 2, and 3, respectively. Our processed cube has a different set of extensions, which we specify below in `argsreadcube`.

In [None]:
from q3dfit.q3din import q3din
q3di = q3din(infile, label, outdir=outdir, logfile=logfile)

Here's a list of the fit parameters that are automatically set:

In [None]:
q3di.__dict__

### 1.2. Setting up the data and models <a class="anchor" id="chapter1_2"></a>

Some general information about your cube. `argsreadcube` is a dictionary of attributes sent to the `Cube` class.
- For non-JWST data, set `wmapext` to `None`. The WMAP extension is a [3-D weight image](https://jwst-pipeline.readthedocs.io/en/latest/jwst/data_products/science_products.html) giving the relative weights of the output spaxels. Our data are from Spitzer, so the mock cube has no WMAP extension.
- Microns are the wavelength unit used internally, but `q3dfit` can accept input/output in Å.
- `q3dfit` does calculations in f$_\lambda$ space, but assumes input units of MJy/sr, the JWST default. Other input flux units can be specified. In this case, the input spectrum is in Jy (set with `fluxunit_in` below). The output flux units will be in erg/s/cm$^2$/$\mu$m.
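As a rough illustration of moving from f$_\nu$ to f$_\lambda$ space, the sketch below converts flux densities in Jy to erg/s/cm$^2$/$\mu$m using $f_\lambda = f_\nu\,c/\lambda^2$. The helper `jy_to_flambda` is hypothetical and not part of `q3dfit`, which performs its own conversion internally.

```python
import numpy as np

C_MICRON_PER_S = 2.99792458e14  # speed of light in micron/s
JY_TO_CGS = 1.0e-23             # 1 Jy expressed in erg/s/cm^2/Hz


def jy_to_flambda(wave_micron, fnu_jy):
    """Convert f_nu [Jy] to f_lambda [erg/s/cm^2/micron] (hypothetical helper)."""
    wave_micron = np.asarray(wave_micron, dtype=float)
    fnu_jy = np.asarray(fnu_jy, dtype=float)
    # f_lambda = f_nu * c / lambda^2, with lambda in micron and c in micron/s
    return fnu_jy * JY_TO_CGS * C_MICRON_PER_S / wave_micron**2


wave = np.array([5.0, 10.0, 20.0])   # micron
fnu = np.array([1.0, 1.0, 1.0])      # Jy
flam = jy_to_flambda(wave, fnu)
print(flam)
```

A flat f$_\nu$ spectrum falls as $\lambda^{-2}$ in f$_\lambda$, which is why the two representations look so different across the 5 to 30 $\mu$m range.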




In [None]:
from q3dfit.readcube import Cube
q3di.argsreadcube = {'wmapext': None,
                     'wavext': 4,
                     'waveunit_in': 'Angstrom',
                     'fluxunit_in': 'Jy',
                     'fluxnorm': 1e-12}
cube = q3di.load_cube()

Let's plot the spectrum to see how it looks. The arguments are the column and row, counted from 1 (unity-offset).

In [None]:
import matplotlib.pyplot as plt
fig = plt.figure(figsize=[10,4])
spec_test = cube.specextract(1, 1, radius=0, ylim=(0,3))

Name and systemic redshift of the galaxy. `zsys_gas` is an input for calculating velocity maps in `q3dpro` and for initializing the arrays of initial guesses below. In this case, the spectrum has already been shifted to the rest frame.

In [None]:
q3di.name = 'F15545+4000'
q3di.zsys_gas = 0.0

Wavelength range over which to fit data. The user can also specify sets of regions to ignore in the fit.

In [None]:
q3di.fitrange = np.array([5.42, 29.98])  # micron
#q3di.cutrange = np.array([,])
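To make the effect of a fit range plus ignored regions concrete, here is a minimal sketch of how such a selection could be expressed as a boolean mask. The helper `fit_mask` and the assumed `cutrange` layout (one or more `[lower, upper]` pairs) are illustrative assumptions, not the `q3dfit` implementation.

```python
import numpy as np


def fit_mask(wave, fitrange, cutrange=None):
    """Boolean mask of wavelengths used in the fit (hypothetical helper)."""
    wave = np.asarray(wave, dtype=float)
    mask = (wave >= fitrange[0]) & (wave <= fitrange[1])
    if cutrange is not None:
        # Accept a single [lo, hi] pair or a list of such pairs (assumed layout)
        for lo, hi in np.atleast_2d(cutrange):
            mask &= ~((wave >= lo) & (wave <= hi))
    return mask


wave = np.linspace(5.0, 30.0, 26)          # 5, 6, ..., 30 micron
mask = fit_mask(wave, fitrange=[5.42, 29.98], cutrange=[[9.0, 10.0]])
print(wave[mask])
```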

### 1.3. Setting up the fitting parameters <a class="anchor" id="chapter1_3"></a>

#### 1.3.1. Emission-line parameters <a class="anchor" id="chapter1_3_1"></a>

What lines do you want to fit? You can choose from the linelists [here](https://github.com/Q3D/q3dfit/tree/main/data/linelists), or in `q3dfit/data/linelists/`.

In [None]:
lines = ['H2_00_S5', '[ArII]6.99', '[ArIII]8.99', '[NeII]12.81', 
         '[NeIII]15.56', 'H2_00_S3', 'H2_00_S1', '[SIII]18.71']

This block sets up initial conditions for the emission-line fit to each spaxel. This initialization method adds a number of new attributes to the object. Emission lines are set to a common redshift and velocity dispersion, set to `q3di.zsys_gas` and 50 km/s by default. However, different sets of emission lines can have different velocities and linewidths by specifying different lines to which to tie particular emission lines. Different initial conditions can also be set on a spaxel-by-spaxel and/or line-by-line basis. The default number of velocity components is 1.

In [None]:
q3di.init_linefit(lines, linetie='[NeII]12.81')
q3di.__dict__.keys()

Because these lines are not well-resolved spectrally, we change the default initial conditions in sigma.

In [None]:
for i in lines:
    q3di.siginit_gas[i][:,:,0] = 1000.
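For a sense of scale, the sketch below converts a velocity dispersion in km/s to a Gaussian sigma in wavelength units, assuming the non-relativistic relation $\sigma_\lambda = \lambda\,\sigma_v / c$. The helper name is hypothetical. At 12.81 $\mu$m, an initial guess of 1000 km/s corresponds to roughly 0.043 $\mu$m.

```python
C_KMS = 299792.458  # speed of light in km/s


def sigma_kms_to_micron(wave_micron, sigma_kms):
    """Gaussian sigma in micron for a velocity sigma in km/s (non-relativistic)."""
    return wave_micron * sigma_kms / C_KMS


# Compare the 50 km/s default with the 1000 km/s initial guess used here
for sig in (50.0, 1000.0):
    print(sig, sigma_kms_to_micron(12.81, sig))
```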

`siglim_gas` sets lower and upper bounds for the Gaussian width (sigma) of the emission line. These limits can be set globally, for all spaxels and components, by defining a 2-element array. The limits can also be set for individual spaxels (but all components) by defining an (Ncol x Nrow x 2) array.

In [None]:
q3di.siglim_gas = np.array([5., 4000.])

# Spaxel-by-spaxel limits
# siglim_gas = np.ndarray((dx, dy, 2))
# siglim_gas[:, :, :] = np.array([5., 1000.])
# siglim_gas[13, 10, :] = np.array([5., 500.])
# q3di.siglim_gas = siglim_gas
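The spaxel-by-spaxel form can be built with plain NumPy broadcasting. The following is a minimal, runnable sketch with made-up cube dimensions (`ncols`, `nrows` are assumptions for illustration); it only constructs the array and does not touch `q3di`.

```python
import numpy as np

ncols, nrows = 20, 15                        # hypothetical cube dimensions
siglim = np.empty((ncols, nrows, 2))
siglim[:, :, :] = np.array([5.0, 1000.0])    # broadcast default limits to every spaxel
siglim[13, 10, :] = np.array([5.0, 500.0])   # tighter upper limit for one spaxel
print(siglim.shape, siglim[0, 0], siglim[13, 10])
```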

The routine `checkcomp` automatically discards components that it deems insignificant after each fit. It applies both a significance cut on flux and a check for linewidths that are too large. If components are removed, the fit is re-run. The `sigcut` parameter determines the level of the significance cut. `ignore` is a list of lines to ignore when performing the significance cut. Component checking can be disabled by setting `q3di.checkcomp = False`.

In [None]:
q3di.checkcomp = False
#q3di.argscheckcomp['sigcut'] = 3.
#q3di.argscheckcomp['ignore']= ['H2_00_S5']
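The two tests described above can be sketched as follows. This is an illustration only: the helper `significant`, its argument names, and the exact form of the criteria are assumptions, and the actual `checkcomp` logic may differ.

```python
import numpy as np


def significant(flux, flux_err, sigma, sigcut=3.0, sigmax=4000.0):
    """True where a component passes both cuts (hypothetical helper)."""
    flux = np.asarray(flux, dtype=float)
    flux_err = np.asarray(flux_err, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    snr_ok = flux >= sigcut * flux_err     # flux significance cut
    width_ok = sigma < sigmax              # reject components that are too broad
    return snr_ok & width_ok


keep = significant(flux=[10.0, 2.0, 8.0],
                   flux_err=[1.0, 1.0, 1.0],
                   sigma=[300.0, 300.0, 4000.0])
print(keep)
```

In this toy case only the first component survives: the second fails the flux cut and the third sits at the assumed upper width limit.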

#### Spectral resolution convolution

If no convolution is desired: do not set `spect_convol` (or set it to `{}`, or `None`).

If convolution is desired: `spect_convol` is a dictionary with two optional tags.
- `ws_instrum`: This specifies the desired convolution method. The syntax is: `{INSTRUMENT:[GRATING]}`. The values for `INSTRUMENT` and `GRATING` for pre-defined dispersion files should mirror the filename syntax in `q3dfit/data/dispersion_files/`. E.g., for file `jwst_miri_ch1a_disp.fits`, `INSTRUMENT=jwst_miri` and `GRATING=ch1a`. (Case is irrelevant.) For convolution with a constant value of spectral resolution [R], Δλ FWHM in [$\mu$m], or velocity in [km/s], set `INSTRUMENT = flat` and `GRATING = ` a string containing `R`, `dlam`, or `dvel` and the corresponding numerical quantity. More than one instrument and/or grating can be set.
- `dispdir`: Directory in which to find the dispersion files. If not set, the default `q3dfit` directory is searched.

Examples: 
1. flat R=500: `spect_instrum = {'flat':['R500']}`
2. flat velocity FWHM = 30km/s: `spect_instrum = {'flat':['dvel30']}`
3. flat Δλ FWHM = 4 Å: `spect_instrum = {'flat':['dlam0.0004']}`
4. JWST NIRSPEC / G140M: `spect_instrum = {'JWST_NIRSPEC':['G140M']}`
5. Spitzer IRS SH+LH: `spect_instrum = {'Spitzer_IRS':['ch1_sh','ch1_lh']}`

Note in the final example that two gratings are set.
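To clarify what the three flat options mean physically, the sketch below turns a grating string such as `'R500'`, `'dvel30'`, or `'dlam0.0004'` into a FWHM in $\mu$m at a given wavelength. The parser `flat_fwhm_micron` is hypothetical and written only for illustration; it is not how `q3dfit` reads these strings.

```python
import re

C_KMS = 299792.458  # speed of light in km/s


def flat_fwhm_micron(grating, wave_micron):
    """FWHM in micron at wave_micron for 'R500', 'dvel30', 'dlam0.0004', etc.

    Hypothetical parser for illustration; not q3dfit's own implementation.
    """
    match = re.fullmatch(r"(R|dvel|dlam)([0-9.eE+-]+)", grating, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized flat grating string: {grating!r}")
    kind, value = match.group(1).lower(), float(match.group(2))
    if kind == "r":            # R = lambda / delta-lambda
        return wave_micron / value
    if kind == "dvel":         # velocity FWHM in km/s
        return wave_micron * value / C_KMS
    return value               # 'dlam' is already a wavelength FWHM in micron


for g in ("R500", "dvel30", "dlam0.0004"):
    print(g, flat_fwhm_micron(g, 10.0))
```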

In [None]:
q3di.spect_convol['ws_instrum'] = {'spitzer_irs_ch0':['sl2','sl1',], 'spitzer_irs_ch2':['ll1','ll2']}

##### Creating convolution files (optional)

To create a dispersion file, use one of the following methods. The last two use subclasses of the `dispersion` class, one for the instrument/grating file format and one for the constant-dispersion format.

1. Create a `dispersion` object and use the `dispersion.write()` method. For example:

```
dispEx1 = dispersion()
dispEx1.write('/dispdir/disp.fits', wave=np.linspace(5.,10.,50), type='R', disp=np.full(50, 500.))
```

2. Create a `InstGratDispersion` object to attach instrument and grating information to the object and define the output filename in the `q3dfit` format. Use the `InstGratDispersion.writeInstGrat()` method.

```
dispEx2 = InstGratDispersion('Keck_ESI', 'echellette', dispdir='/dispdir/')
dispEx2.writeInstGrat(wave=np.linspace(5.,10.,50), type='dvel', disp=np.full(50, 30.))
```

3. Create a `FlatDispersion` object and use the `FlatDispersion.writeFlat()` method. This requires only a single value for the dispersion quantity and also defines the filename automatically.
```
dispEx3 = FlatDispersion(0.0004, 'dlam', wave=np.linspace(5.,10.,50))
dispEx3.writeFlat(dispdir='/dispdir/')
```

#### Options to `lmfit` and `scipy.optimize.least_squares`
`q3dfit` uses the `fit` method of the [`Model` class](https://lmfit.github.io/lmfit-py/model.html#lmfit.model.Model) of `lmfit` to call [`scipy.optimize.least_squares`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.least_squares.html). Both the method and the function have options that can be changed in the `q3dfit` call. To do so, add key/value pairs to the `argslinefit` dictionary, which in turn is an attribute of the `q3di` object.

The options to the `fit` method in `lmfit` that can currently be changed are the following:
- `max_nfev`: maximum number of function evaluations before the fit aborts
- `iter_cb`: if this is set to "per_iteration", the value of every model parameter at each function evaluation is printed to `stdout`

Most parameters of `least_squares` can be changed in this way, unless they are specifically set by `lmfit`. Examples which have been tested include:
- `x_scale`: jac
- `tr_solver`: lsmr
- `loss`: soft_l1
- `ftol`, `gtol`, `xtol`

In [None]:
#q3di.argslinefit['method'] = 'leastsq'
#q3di.argslinefit['iter_cb'] = 'per_iteration'
# As an example, to change the criteria for fit convergence from the defaults of 1.e-8 to 1.e-10:
q3di.argslinefit['ftol'] = 1.e-10
q3di.argslinefit['gtol'] = 1.e-10
q3di.argslinefit['xtol'] = 1.e-10
q3di.argslinefit['x_scale'] = 'jac'
q3di.argslinefit['tr_solver'] = 'lsmr'
# ... and the "suitable step length for the forward-difference approximation of the Jacobian. Normally the actual step length will be sqrt(epsfcn)*x".
# This applies only to scipy.optimize.leastsq.
#q3di.argslinefit['epsfcn'] = 1.e-15
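To see what these options do outside of `q3dfit`, the sketch below passes the same tolerances, `x_scale`, and `tr_solver` directly to `scipy.optimize.least_squares` for a toy single-Gaussian fit. The model, data, and starting values are invented for illustration and are unrelated to the Spitzer spectrum.

```python
import numpy as np
from scipy.optimize import least_squares


def gaussian(x, amp, center, sigma):
    """Single Gaussian line profile."""
    return amp * np.exp(-0.5 * ((x - center) / sigma) ** 2)


def residuals(params, x, y):
    """Model minus data, as expected by least_squares."""
    return gaussian(x, *params) - y


truth = np.array([2.0, 12.81, 0.05])     # amplitude, center [micron], sigma [micron]
x = np.linspace(12.5, 13.1, 200)
y = gaussian(x, *truth)                  # noise-free toy data

result = least_squares(
    residuals, x0=[1.5, 12.80, 0.07], args=(x, y),
    method="trf", ftol=1e-10, gtol=1e-10, xtol=1e-10,
    x_scale="jac", tr_solver="lsmr",
)
print(result.x, result.nfev)
```

Tightening the tolerances generally costs extra function evaluations, so `result.nfev` is a useful number to watch when experimenting with these settings.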

#### 1.3.2. Continuum parameters <a class="anchor" id="chapter1_3_2"></a>

We next initialize the continuum. As part of this, we give it the name of our continuum fitting function. (See Rupke et al. 2017 for more details on the methodology of `q3dfit` when separating a quasar from its host galaxy.)

In [None]:
q3di.init_contfit('questfit')
q3di.__dict__.keys()

`q3dfit` first masks emission lines before fitting the continuum. This sets the default mask value in km/s for each velocity component for the first fit. During the second fit, the mask value is set automatically using the best-fit linewidths determined from the first fit.

In [None]:
q3di.maskwidths_def = 4000.
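As an illustration of velocity-based line masking, the sketch below flags pixels within a given velocity of each line center. It assumes the mask value acts as a half-width on either side of the line, which is an assumption made here for illustration; the helper `line_mask` is hypothetical and not the `q3dfit` routine.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s


def line_mask(wave, line_centers, maskwidth_kms):
    """True for pixels within +/- maskwidth_kms of any line center (hypothetical)."""
    wave = np.asarray(wave, dtype=float)
    masked = np.zeros(wave.shape, dtype=bool)
    for center in line_centers:
        velocity = C_KMS * (wave - center) / center   # non-relativistic velocity offset
        masked |= np.abs(velocity) <= maskwidth_kms
    return masked


wave = np.linspace(12.0, 16.0, 401)                   # micron, 0.01 micron steps
masked = line_mask(wave, line_centers=[12.81, 15.56], maskwidth_kms=4000.0)
print(masked.sum(), "of", wave.size, "pixels masked")
```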

The continuum fitting parameters specified here are for the case of general MIR fitting. The mid-IR continuum fitting includes features that depend on redshift. These are specified as "stellar" redshift for compatibility with stellar template fitting, even though they refer in this case to the redshift of the mid-IR dust features. The input spectrum has in this case already been shifted to rest wavelengths.

In [None]:
q3di.argscontfit['config_file'] = cfinfile
q3di.argscontfit['convert2Flambda'] = True
q3di.argscontfit['plot_decomp'] = True
q3di.argscontfit['outdir'] = outdir

Optional arguments to `lmfit`. These are the tolerances for determining fit convergence, described in further detail [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.least_squares.html).

In [None]:
argslmfit = dict()
argslmfit['ftol'] = 1.e-10
argslmfit['gtol'] = 1.e-10
argslmfit['xtol'] = 1.e-10
argslmfit['x_scale'] = 'jac'
argslmfit['tr_solver'] = 'lsmr'
q3di.argscontfit['argslmfit'] = argslmfit

If you want to run `q3dfit` in batch mode, run the cell below, which saves `q3di` to an `npy` file. Then, at your Python command line, point `q3di` to that file and run `q3dfit` with
<pre><code>q3di = '/path/to/the/npy/file/q3di.npy'
from q3dfit.q3df import q3dfit
q3dfit(q3di,cols=cols,rows=rows)</code></pre>
N.B.: When running `q3dfit` using multiple cores (`ncores=N` in the call to `q3dfit`), the input has to be specified in this way; i.e., as a string giving the location of this .npy file.

In [None]:
q3di_npy = 'q3di.npy'
np.save(os.path.join(initdir, q3di_npy), q3di)
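Saving a Python object with `np.save` stores it as a pickled zero-dimensional object array. The sketch below shows the round trip with a plain dictionary standing in for `q3di` (the stand-in and file name are assumptions for illustration); reading it back requires `allow_pickle=True`, and `.item()` unwraps the stored object.

```python
import os
import tempfile

import numpy as np

# Toy stand-in for the q3di object; a plain dict keeps the sketch self-contained
settings = {"label": "j1556", "fitrange": [5.42, 29.98]}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "settings.npy")
    np.save(path, settings)                            # pickled 0-d object array
    loaded = np.load(path, allow_pickle=True).item()   # unwrap the stored object

print(loaded["label"], loaded["fitrange"])
```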

## 2. Run fit <a class="anchor" id="chapter2"></a>

Choose columns and rows to fit. Ranges are specified as two-element lists specifying the first and last spaxel. Because there is only one spaxel in this case, we don't actually have to specify the rows and columns, but we'll do it to illustrate the syntax.

In [None]:
cols = 1
rows = 1
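The range syntax can be pictured with the following sketch, which expands a scalar or a two-element `[first, last]` list into the full set of spaxels. The helpers are hypothetical, and one-based, inclusive indexing is assumed here.

```python
def expand_range(spec):
    """Expand a scalar or [first, last] pair of one-based indices into a list."""
    if isinstance(spec, int):
        return [spec]
    first, last = spec
    return list(range(first, last + 1))   # inclusive of the last spaxel


def spaxel_list(cols, rows):
    """All (col, row) pairs covered by the two range specifications."""
    return [(c, r) for c in expand_range(cols) for r in expand_range(rows)]


print(spaxel_list(1, 1))
print(spaxel_list([2, 4], [7, 8]))
```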

Run the fit. Choose `quiet=False` for verbose output. An output object for each spaxel, of class `q3dout`, is saved to a numpy binary file labeled with prefix `q3di.label` and suffix `_col_row.npy`. See note above on multicore processing.

In [None]:
from q3dfit.q3df import q3dfit
q3dfit(q3di,cols=cols,rows=rows, quiet=False)

## 3. Plot fit results <a class="anchor" id="chapter3"></a>

Load the output of a fit.

In [None]:
cols = 1
rows = 1
from q3dfit.q3dout import load_q3dout
q3do = load_q3dout(q3di, cols, rows)

Set up the line plot parameters using a dictionary.

* `nx`: Number of subplots in the horizontal direction (default = 1)
* `ny`: Number of subplots in the vertical direction (default = 1)
* Required: choose one option for centering the plot
    - `line`: a string list of line labels
    - `center_obs`: a float list of wavelengths of each subplot center, in the observed (plotted) frame
    - `center_rest`: a float list of wavelengths of each subplot center, in the rest frame, which are converted to obs. frame
* `size`: float list of widths in wavelength space of each subplot (default = 300 Å if not specified)

In [None]:
argsplotline = dict()
argsplotline['nx'] = 3
argsplotline['ny'] = 2
argsplotline['line'] = ['[ArII]6.99', '[ArIII]8.99', '[NeII]12.81', 
                        '[NeIII]15.56', 'H2_00_S1', '[SIII]18.71']
argsplotline['size'] = [3., 3., 3., 3., 3., 3.]
argsplotline['figsize'] = [10,8]
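To illustrate how centers and sizes translate into plotted wavelength windows, the sketch below computes lower and upper limits for each subplot. It assumes `size` is the full width of a window centered on the line and that rest-frame centers are shifted by $(1+z)$; the helper `subplot_windows` is hypothetical and not part of `q3dfit`.

```python
import numpy as np


def subplot_windows(center_rest, size, z=0.0):
    """(lower, upper) observed-frame wavelength limits per subplot (hypothetical)."""
    center_obs = np.asarray(center_rest, dtype=float) * (1.0 + z)
    half = 0.5 * np.asarray(size, dtype=float)
    return np.column_stack([center_obs - half, center_obs + half])


windows = subplot_windows(center_rest=[6.99, 12.81, 18.71],
                          size=[3.0, 3.0, 3.0], z=0.0)
print(windows)
```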

Run the plot method. The output can be saved as a jpg by specifying `savefig=True`. A default filename is used, which can be overridden by specifying `outfile=file`. The output file will have the suffix `_lin` attached, so that the actual filename will be "file_lin.jpg".

In [None]:
q3do.plot_line(q3di,plotargs=argsplotline)

The continuum plot can be changed by specifying several parameters. In this case, we have chosen to output a log/log plot of f$_\nu$ vs. wavelength.

In [None]:
argscontplot = dict()
#argscontplot['xstyle'] = 'lin'
#argscontplot['ystyle'] = 'lin'
#argscontplot['fluxunit_out'] = 'flambda'
argscontplot['mode'] = 'dark'
argscontplot['figsize'] = [10,8]

Run two methods. The first computes the continuum values to plot, and the second does the plotting.

In [None]:
q3do.sepcontpars(q3di)
q3do.plot_cont(q3di, plotargs=argscontplot)