# Example: Using MIRAGE to Generate Wide Field Slitless Exposures

This notebook shows how to use Mirage to create Wide Field Slitless Spectroscopy (WFSS) data, beginning with an APT file. This can be done for NIRCam or NIRISS.

*Table of Contents:*
* [Getting Started](#getting_started)
* [Create input yaml files from an APT proposal](#yaml_from_apt)
* [Make WFSS simulated observations](#make_wfss)
   * [Provide multiple yaml files](#multiple_yamls)
   * [Provide a single yaml file and an hdf5 file containing SED curves of the sources](#yaml_plus_hdf5)
   * [Outputs](#wfss_outputs)
* [Make imaging simulated observations](#make_imaging)
   * [Outputs](#imaging_outputs)

---
<a id='getting_started'></a>
## Getting Started

<div class="alert alert-block alert-warning">
**Important:** 
Before proceeding, ensure you have set the MIRAGE_DATA environment variable to point to the directory that contains the reference files associated with MIRAGE.
</div>
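
Checking the variable before importing Mirage can save a confusing failure later. The sketch below confirms that an environment variable is set and names an existing directory. `check_directory_variable` is a hypothetical helper written for this notebook, not a Mirage function.

```python
import os


def check_directory_variable(name, environ=None):
    """Return the directory named by environment variable `name`.

    Raises EnvironmentError if the variable is unset or does not point
    to an existing directory. Hypothetical helper, not part of Mirage.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise EnvironmentError('{} is not set'.format(name))
    # Expand "~" and "$VAR" in case the value was set without shell expansion
    path = os.path.expandvars(os.path.expanduser(value))
    if not os.path.isdir(path):
        raise EnvironmentError('{} points to a missing directory: {}'.format(name, path))
    return path


# Report the problem rather than stopping the notebook
try:
    print('MIRAGE_DATA =', check_directory_variable('MIRAGE_DATA'))
except EnvironmentError as err:
    print('Problem with MIRAGE_DATA:', err)
```

The same helper can be pointed at `CRDS_PATH` if you configure CRDS through environment variables.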

<div class="alert alert-block alert-info">
**Dependencies:**<br>

1) Install GRISMCONF from https://github.com/npirzkal/GRISMCONF<br>

2) Install NIRCAM_Gsim from https://github.com/npirzkal/NIRCAM_Gsim. This is the disperser software, which works for both NIRCam and NIRISS.
</div>

<div class="alert alert-block alert-info">
**Link to CRDS:**<br>
    Make sure that you are pointing to an installation of CRDS. If working outside of the STScI network, CRDS can be configured by setting two environment variables:

From the command line:

export CRDS_PATH=$HOME/crds_cache

export CRDS_SERVER_URL=https://jwst-crds.stsci.edu

OR:

In [None]:
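# Within python
# Note: python does not expand "$HOME" inside a string, so build the path explicitly
#import os
#os.environ["CRDS_PATH"] = os.path.join(os.path.expanduser("~"), "crds_cache")
#os.environ["CRDS_SERVER_URL"] = "https://jwst-crds.stsci.edu"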

In [None]:
import os
from glob import glob
import pkg_resources
import yaml

from astropy.io import fits
from astropy.visualization import simple_norm, imshow_norm
import h5py
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

from mirage import imaging_simulator
from mirage import wfss_simulator
from mirage.yaml import yaml_generator

In [None]:
TEST_DATA_DIRECTORY = os.path.normpath(os.path.join(pkg_resources.resource_filename('mirage', ''),
                                                    '../examples/wfss_example_data'))

---
<a id='yaml_from_apt'></a>
## Create a series of yaml files from an [APT](https://jwst-docs.stsci.edu/display/JPP/JWST+Astronomers+Proposal+Tool+Overview) proposal

With your proposal file open in APT, export the "xml" and "pointing" files. These will serve as the inputs to the yaml file generator function.

In [None]:
# Input files from APT
xml_file = os.path.join(TEST_DATA_DIRECTORY, 'niriss_wfss_example.xml')
pointing_file = os.path.join(TEST_DATA_DIRECTORY, 'niriss_wfss_example.pointing')

List the source catalogs to use when creating the simulations. The catalog input can be either a single string, in which case it is assumed that the same catalog is used for all observations, or a list of catalogs, in which case it is assumed that there is one catalog for each [observation](https://jwst-docs.stsci.edu/display/JPP/JWST+Astronomers+Proposal+Tool+Overview#JWSTAstronomersProposalToolOverview-Observations) in the APT file.

In [None]:
catalogs = {'niriss': os.path.join(TEST_DATA_DIRECTORY, 'point_sources.cat')}

Other parameters can be input via the params keyword. Currently for each keyword, only a single value is accepted, and this value is applied to all observations. We will be expanding to allow different values for different observations soon.

In [None]:
params = {'PAV3': 0.}

Provide the output directory for the yaml files themselves, as well as the output directory where you want the simulated files to eventually be saved. This information will be placed in the yaml files.

In [None]:
# Create a series of Mirage input yaml files
# using the APT files
yaml_output_dir = '/where/to/put/yaml/files'
simulations_output_dir = '/where/to/put/simulated/data'
yam = yaml_generator.SimInput(xml_file, pointing_file, catalogs=catalogs, verbose=True,
                                  output_dir=yaml_output_dir,
                                  simdata_output_dir=simulations_output_dir,
                                  parameter_defaults=params, datatype='raw')
# If you are on the STScI network and can see central store, setting
# use_linearized_darks to True will save time. Otherwise set to False,
# and the linearized darks will be constructed during the run
yam.use_linearized_darks = True
yam.create_inputs()

One yaml file will be created for each exposure. The naming convention of the files follows that for [JWST exposure filenames](https://jwst-docs.stsci.edu/display/JDAT/File+Naming+Conventions+and+Data+Products). For example, the first exposure in proposal number 12345, Observation 3, Visit 2, assuming it is made using NIRCam (the A1 detector in this case), will be named jw12345003002_01101_00001_nrca1_uncal.fits. The corresponding yaml file shares the same root name.
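
To make the naming convention concrete, here is a minimal sketch that splits such a filename into its fields with a regular expression. It assumes the standard JWST layout (program, observation, visit, then visit group, parallel sequence and activity, then exposure number and detector). The group names and the `parse_exposure_name` helper are labels invented for this example, not part of Mirage or the JWST pipeline.

```python
import os
import re

# Pattern for names such as jw12345003002_01101_00001_nrca1_uncal.fits
# or jw00042001001_0110c_00009_nis.yaml. Group names are this sketch's own.
EXPOSURE_NAME = re.compile(
    r'^jw(?P<program>\d{5})(?P<observation>\d{3})(?P<visit>\d{3})'
    r'_(?P<visit_group>\d{2})(?P<parallel_sequence>\d)(?P<activity>[0-9a-z]{2})'
    r'_(?P<exposure>\d{5})'
    r'_(?P<detector>[a-z0-9]+)'
    r'(?:_(?P<suffix>[a-z0-9]+))?'      # e.g. "uncal"; absent for yaml files
    r'\.(?P<extension>fits|yaml)$'
)


def parse_exposure_name(filename):
    """Return a dict of the fields encoded in a JWST-style exposure filename."""
    match = EXPOSURE_NAME.match(os.path.basename(filename))
    if match is None:
        raise ValueError('Unrecognized exposure filename: {}'.format(filename))
    return match.groupdict()


fields = parse_exposure_name('jw12345003002_01101_00001_nrca1_uncal.fits')
for key, value in fields.items():
    print('{}: {}'.format(key, value))
```

All values are returned as strings so that leading zeros, and the letters that can appear in the activity field, are preserved.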

Check which yaml files are for WFSS and which are for imaging.

In [None]:
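# Sort so that the file order is reproducible between runs
yaml_files = sorted(glob(os.path.join(yam.output_dir, "jw*.yaml")))

yaml_WFSS_files = []
yaml_imaging_files = []
for f in yaml_files:
    # safe_load needs no Loader argument, and the with-block closes the file
    with open(f) as infile:
        my_dict = yaml.safe_load(infile)
    if my_dict["Inst"]["mode"] == "wfss":
        yaml_WFSS_files.append(f)
    elif my_dict["Inst"]["mode"] == "imaging":
        yaml_imaging_files.append(f)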
    
print("WFSS files:",len(yaml_WFSS_files))
print("Imaging files:",len(yaml_imaging_files))

Each output yaml file contains details on the simulation.

In [None]:
with open(yaml_WFSS_files[0], 'r') as infile:
    parameters = yaml.safe_load(infile)
for key in parameters:
    for level2_key in parameters[key]:
        print('{}: {}: {}'.format(key, level2_key, parameters[key][level2_key]))

---
<a id='make_wfss'></a>
## Make WFSS simulated observations

Create simulated data from the WFSS yaml files. This is accomplished using the **wfss_simulator** module, which wraps around the various stages of Mirage. There are several input options available for the **wfss_simulator**.

* Provide a single yaml file and an hdf5 file containing SED curves of the sources
* Provide multiple yaml files

<a id='multiple_yamls'></a>
### Provide multiple yaml files

Here, we provide multiple yaml files as input. In this case, Mirage will create a direct (undispersed) seed image for each yaml file. For each source, Mirage will construct a continuum spectrum by interpolating the filtered magnitudes in the direct images. This continuum spectrum will then be placed in the dispersed seed image, which will then be combined with a dark current exposure in order to create the final simulated exposure.

NOTE: In this case, all of the supplied yaml files MUST have the same pointing!
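
The pointing requirement can be checked before starting a long run. The sketch below compares parameter dictionaries that have already been read from the yaml files. It assumes the pointing is stored under `Telescope` with the keys `ra`, `dec` and `rotation`; inspect the printout of your own yaml file to confirm those names. `same_pointing` is a hypothetical helper, not a Mirage function.

```python
import math


def same_pointing(parameter_sets, tolerance=1e-6):
    """Return True if every parameter dictionary has the same pointing.

    Assumes each dictionary has a 'Telescope' section containing
    'ra', 'dec' and 'rotation' (an assumption of this sketch).
    """
    keys = ('ra', 'dec', 'rotation')
    reference = [float(parameter_sets[0]['Telescope'][key]) for key in keys]
    for params in parameter_sets[1:]:
        values = [float(params['Telescope'][key]) for key in keys]
        for value, ref in zip(values, reference):
            # Purely absolute comparison, in the units used by the yaml file
            if not math.isclose(value, ref, rel_tol=0.0, abs_tol=tolerance):
                return False
    return True


# Stand-in dictionaries with made-up values, in place of loaded yaml files
first = {'Telescope': {'ra': 53.1, 'dec': -27.8, 'rotation': 0.0}}
second = {'Telescope': {'ra': 53.1, 'dec': -27.8, 'rotation': 0.0}}
print('Same pointing:', same_pointing([first, second]))
```

In practice, each dictionary would come from reading one of the yaml files with `yaml.safe_load`.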

In [None]:
test_yaml_files = ['jw00042001001_01104_00003_nis.yaml', 'jw00042001001_01107_00005_nis.yaml',
                   'jw00042001001_0110c_00009_nis.yaml']
test_yaml_files = [os.path.join(yaml_output_dir, yfile) for yfile in test_yaml_files]

* If an appropriate (linearized, or linearized and cut to the proper number of groups) dark current exposure already exists, the dark current preparation step can be skipped by providing the name of the dark file in **override_dark**.

* The **save_dispersed_seed** option will save the dispersed seed image to a fits file. 

* The name of the fits file can be given in the **disp_seed_filename** keyword or, if that is left as None, Mirage will create a filename using the input yaml filename.

* If **extrapolate_SED** is set to True, then the continuum calculated by Mirage will be extrapolated to cover the necessary wavelengths if the filters in the input yaml files do not span the entire wavelength range.

* If the **source_stamps_file** is set to the name of an [hdf5](https://www.h5py.org/) file, then the disperser will save 2D stamp images of the dispersed spectral orders for each target. These are intended as aids for spectral extraction. (**NOTE that turning this option on will lead to significantly longer run times for Mirage, as so much more data will be generated.**) 
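
To illustrate what interpolating and extrapolating a continuum means, the sketch below builds a continuum from flux densities at a few filter wavelengths and optionally extends it linearly beyond the sampled range. It is a simplified illustration with made-up numbers, not Mirage's actual algorithm. The function name and the choice to mark uncovered wavelengths with NaN are assumptions of this example.

```python
import numpy as np


def continuum_from_filters(filter_wavelengths, filter_fluxes, grid, extrapolate=True):
    """Linearly interpolate filter flux densities onto a wavelength grid.

    Outside the range spanned by the filters, values are either linearly
    extrapolated from the nearest pair of filters or set to NaN.
    """
    w = np.asarray(filter_wavelengths, dtype=float)
    f = np.asarray(filter_fluxes, dtype=float)
    order = np.argsort(w)
    w, f = w[order], f[order]
    grid = np.asarray(grid, dtype=float)

    # np.interp holds the end values constant outside the sampled range
    continuum = np.interp(grid, w, f)
    blue = grid < w[0]
    red = grid > w[-1]
    if not extrapolate:
        continuum[blue | red] = np.nan
    elif w.size >= 2:
        blue_slope = (f[1] - f[0]) / (w[1] - w[0])
        red_slope = (f[-1] - f[-2]) / (w[-1] - w[-2])
        continuum[blue] = f[0] + blue_slope * (grid[blue] - w[0])
        continuum[red] = f[-1] + red_slope * (grid[red] - w[-1])
    return continuum


# Made-up flux densities at three filter wavelengths (microns)
example_grid = np.arange(0.8, 2.3, 0.1)
example_continuum = continuum_from_filters([1.15, 1.50, 2.00],
                                           [1.0e-16, 1.2e-16, 1.1e-16],
                                           example_grid)
print(example_continuum)
```

With only one or two filters, the extrapolated portion rests on very little information, so it is worth inspecting the resulting spectrum before relying on it.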

In [None]:
m = wfss_simulator.WFSSSim(test_yaml_files, override_dark=None, save_dispersed_seed=True,
                           extrapolate_SED=True, disp_seed_filename=None, source_stamps_file=None)
m.create()

<a id='yaml_plus_hdf5'></a>
### Provide a single yaml file and an hdf5 file containing SED curves of the sources

In this case, a single WFSS mode yaml file is provided as input to Mirage, along with an [hdf5](https://www.h5py.org/) file. This file contains a Spectral Energy Distribution (SED) curve for each target, in units of F_lambda. The advantage of this input scenario is that you are not limited to simple continuum spectra for your targets: emission and absorption features can be added.

The disperser software will then use the SED along with the segmentation map in the direct seed image to place spectra into the dispersed seed image. In the cell below, we show a simple example of how to create an hdf5 file with SEDs. In this case the spectrum is flat, with no emission or absorption features.

In [None]:
target_1_wavelength = np.arange(1.0, 5.5, 0.1)
target_1_flux = np.repeat(1e-16, len(target_1_wavelength))
wavelengths = [target_1_wavelength]
fluxes = [target_1_flux]

# To add fluxes for more targets
target_2_wavelength = np.arange(0.8, 5.4, 0.05)
target_2_flux = np.repeat(1.4e-16, len(target_2_wavelength))
wavelengths.append(target_2_wavelength)
fluxes.append(target_2_flux)
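
Because the SED is just a pair of wavelength and flux arrays, spectral features can be added with ordinary array arithmetic. The sketch below adds a Gaussian emission line to a flat continuum. The helper name, line center, width and strength are illustrative values chosen for this example, and the wavelength grid is sampled finely enough to resolve the line.

```python
import numpy as np


def add_gaussian_line(wavelength, flux, center, sigma, peak):
    """Return flux with a Gaussian feature added.

    center and sigma are in the same units as wavelength (microns here).
    A positive peak gives an emission line, a negative peak an absorption line.
    """
    wavelength = np.asarray(wavelength, dtype=float)
    profile = peak * np.exp(-0.5 * ((wavelength - center) / sigma) ** 2)
    return np.asarray(flux, dtype=float) + profile


# Flat continuum sampled every 0.005 microns, with one emission line at 2.0 microns
line_wavelength = np.arange(1.0, 5.5, 0.005)
line_continuum = np.full(line_wavelength.shape, 1e-16)
line_flux = add_gaussian_line(line_wavelength, line_continuum,
                              center=2.0, sigma=0.02, peak=5e-16)
print('Peak flux: {:.2e}'.format(line_flux.max()))
```

The resulting arrays can be appended to the `wavelengths` and `fluxes` lists in the same way as the flat spectra above.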

Currently Mirage and the disperser software assume that wavelengths are in units of microns and fluxes are in units of F_lambda. In the future Mirage may begin checking units. If you wish to add information about the units of the wavelengths and fluxes, that can be done by setting attributes of each dataset as it is created. See the example below where the file **test_sed_file.hdf5** is created.

In [None]:
wavelength_units = 'microns'
flux_units = 'flam'

In [None]:
sed_file = 'test_sed_file.hdf5'
with h5py.File(sed_file, "w") as file_obj:
    for i in range(len(fluxes)):
        dset = file_obj.create_dataset(str(i+1), data=[wavelengths[i], fluxes[i]], dtype='f',
                                       compression="gzip", compression_opts=9)
        dset.attrs[u'wavelength_units'] = wavelength_units
        dset.attrs[u'flux_units'] = flux_units

In [None]:
# Input the SED file along with a WFSS mode yaml file to Mirage
m = wfss_simulator.WFSSSim(test_yaml_files[1], override_dark=None, save_dispersed_seed=True,
                           extrapolate_SED=True, disp_seed_filename=None, SED_file=sed_file)
m.create()

<a id='wfss_outputs'></a>
### Outputs

Regardless of whether the **wfss_simulator** is called with multiple yaml files or a yaml and an hdf5 file, the outputs will be the same. The final output will be **jw\*uncal.fits** (or **jw\*linear.fits**, depending on whether raw or linear outputs are specified in the yaml files) files in your output directory. These files are in DMS format and can be fed directly into the **calwebb_detector1** pipeline for further calibration, if desired.

The seed image is also saved, as an intermediate output. This seed image is a noiseless rate image of the same scene in the final output file. The seed image can be thought of as an ideal version of the scene that excludes (most) detector effects.

#### Examine the dispersed seed image

In [None]:
with fits.open(m.disp_seed_filename) as seedfile:
    dispersed_seed = seedfile[1].data

In [None]:
fig, ax = plt.subplots(figsize=(10, 10))
norm = simple_norm(dispersed_seed, stretch='log', min_cut=0.25, max_cut=10)
cax = ax.imshow(dispersed_seed, norm=norm)
cbar = fig.colorbar(cax)
plt.show()

#### Examine the final output file

In [None]:
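# The final file is written to the simulated data output directory
final_file = os.path.join(simulations_output_dir, 'jw00042001001_01107_00005_nis_uncal.fits')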
with fits.open(final_file) as hdulist:
    data = hdulist['SCI'].data
    hdulist.info()

In [None]:
fig, ax = plt.subplots(figsize=(10, 10))
norm = simple_norm(data[0, 8, :, :], stretch='log', min_cut=5000, max_cut=50000)
cax = ax.imshow(data[0, 8, :, :], norm=norm)
cbar = fig.colorbar(cax)
plt.show()
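
The `SCI` array indexed above as `data[0, 8, :, :]` has four axes: integration, group, y and x. The sketch below uses a small synthetic array in place of real data to show that indexing, and how to difference two groups to see the signal accumulated between them. The array contents and the `group_difference` helper are made up for illustration.

```python
import numpy as np

# Synthetic stand-in for hdulist['SCI'].data:
# 1 integration, 10 groups, 4 x 4 pixels, gaining 100 counts per group
nints, ngroups, ny, nx = 1, 10, 4, 4
group_levels = (np.arange(1, ngroups + 1) * 100).astype(np.uint16)
ramp = np.broadcast_to(group_levels[None, :, None, None],
                       (nints, ngroups, ny, nx)).copy()


def group_difference(data, integration=0, first_group=0, last_group=-1):
    """Return the 2D difference between two groups of one integration.

    Values are cast to float first, since subtracting unsigned integers
    can wrap around instead of going negative.
    """
    last = data[integration, last_group].astype(float)
    first = data[integration, first_group].astype(float)
    return last - first


# data[0, 8] is shorthand for data[0, 8, :, :]
single_group = ramp[0, 8]
accumulated = group_difference(ramp)
print(single_group.shape, accumulated.mean())
```

Differencing groups removes the constant pedestal in each pixel, which often makes faint sources easier to see than in a single group.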

---
<a id='make_imaging'></a>
## Make imaging simulated observations

Just as the **wfss_simulator** module creates WFSS observations, the **imaging_simulator** module creates imaging data. It can be used to create the direct (NIRCam and NIRISS) and Out of Field (NIRCam) exposures that accompany WFSS observations, as well as the shortwave channel data for NIRCam, which is always imaging while the longwave detector is observing through the grism.

In [None]:
for yaml_imaging_file in yaml_imaging_files[0:1]:
    print("Imaging simulation for {}".format(yaml_imaging_file))
    img_sim = imaging_simulator.ImgSim()
    img_sim.paramfile = yaml_imaging_file
    img_sim.create()

<a id='imaging_outputs'></a>
### Outputs

As with WFSS outputs, the **imaging_simulator** will create **jw\*uncal.fits** or **jw\*linear.fits** files, depending on which was specified in the associated yaml files.

#### Examine the seed image

In [None]:
with fits.open(img_sim.seedimage) as seedfile:
    dispersed_seed = seedfile[1].data

In [None]:
fig, ax = plt.subplots(figsize=(10, 10))
norm = simple_norm(img_sim.seedimage, stretch='log', min_cut=0.25, max_cut=1000)
cax = ax.imshow(img_sim.seedimage, norm=norm)
cbar = fig.colorbar(cax)
plt.show()

#### Examine the output file

In [None]:
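# The final file is written to the simulated data output directory
final_file = os.path.join(simulations_output_dir, 'jw00042001001_01109_00007_nis_uncal.fits')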
with fits.open(final_file) as hdulist:
    data = hdulist['SCI'].data
    hdulist.info()

In [None]:
fig, ax = plt.subplots(figsize=(10, 10))
norm = simple_norm(data[0, 4, :, :], stretch='log', min_cut=5000, max_cut=50000)
cax = ax.imshow(data[0, 4, :, :], norm=norm)
cbar = fig.colorbar(cax)
plt.show()