# Example: Using MIRAGE to Generate Moving Target Exposures
### (i.e. Solar System target observations)

This notebook shows an example of how to simulate observations of a non-sidereal target. In this case, JWST tracks the non-sidereal target during the exposure, causing sidereal targets to move over the course of the exposure.

The `mirage` simulator is broken up into four basic stages:

1. **Creation of yaml-formatted input files**.<br>
   Calls to Mirage generally require one [yaml input](https://mirage-data-simulator.readthedocs.io/en/latest/example_yaml.html) file. This file specifies
   details about the instrument set-up, and source catalogs and reference files
   to use. Each yaml file specifies exposure details for a single exposure
   in a single detector.<br><br>

2. **Creation of a "seed image".**<br>
   This is generally a noiseless countrate image that contains signal
   only from the astronomical sources to be simulated. Currently, the 
   mirage package contains code to produce a seed image starting
   from object catalogs.<br>
   Note that the much larger amount of data in a
   seed image containing moving targets means that this step will be significantly
   slower than when generating a simple seed image for a sidereal observation.<br><br>
   
3. **Dark current preparation.**<br>
   The simulated data will be created by adding the simulated sources
   in the seed image to a real dark current exposure. This step
   converts the dark current exposure to the readout pattern
   and subarray size requested by the user.<br><br>
   
4. **Observation generation.**<br>
   This step converts the seed image into an exposure of the requested
   readout pattern and subarray size. It also adds cosmic rays and 
   Poisson noise, as well as other detector effects (IPC, crosstalk, etc).
   This exposure is then added to the dark current exposure from step 3.<br><br>

---
## Getting Started

<div class="alert alert-block alert-warning">
**Important:** 
Before proceeding, ensure you have set the MIRAGE_DATA environment variable to point to the directory that contains the reference files associated with MIRAGE.
<br/><br/>
If you want JWST pipeline calibration reference files to be downloaded in a specific directory, you should also set the CRDS_DATA environment variable to point to that directory. This directory will also be used by the JWST calibration pipeline during data reduction.
<br/><br/>
You may also want to set the CRDS_SERVER_URL environment variable to https://jwst-crds.stsci.edu. This is not strictly necessary, and Mirage will do it for you if you do not set it. However, if you import the crds package, or any package that imports the crds package, you should set this environment variable first in order to avoid an error.
</div>

*Table of Contents:*
* [Imports](#imports)
* [Create Source Catalogs](#make_catalogs)
* [Generating `yaml` files](#make_yaml)
* [Create Simulated Data](#run_steps_together)
* [Running Simulation Steps Independently](#run_steps_independently)
* [Simulating Multiple Exposures](#mult_sims)

---
<a id='imports'></a>
# Imports

In [None]:
# Set the MIRAGE_DATA environment variable if it is not
# set already. This is for users at STScI.
import os

In [None]:
#os.environ["MIRAGE_DATA"] = "/my/mirage_data/"
#os.environ["CRDS_DATA"] = "/user/myself/crds_cache"
#os.environ["CRDS_SERVER_URL"] = "https://jwst-crds.stsci.edu"
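
A quick way to see which of these variables are currently set is sketched below. The helper name `missing_env_vars` is hypothetical and not part of Mirage; it only inspects the environment and changes nothing.

```python
import os

# Environment variables discussed in the note above.
CHECKED_VARS = ["MIRAGE_DATA", "CRDS_DATA", "CRDS_SERVER_URL"]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


missing = missing_env_vars(CHECKED_VARS)
if missing:
    print("Not set: " + ", ".join(missing))
```

Only MIRAGE_DATA is described above as required; the other two are optional, so a report that they are unset is not necessarily a problem.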

In [None]:
from astropy.io import fits
from glob import glob
import matplotlib.pyplot as plt
import numpy as np
import pkg_resources
from scipy.stats import sigmaclip
%matplotlib inline

In [None]:
# mirage imports
from mirage import imaging_simulator
from mirage.catalogs import catalog_generator
from mirage.seed_image import catalog_seed_image
from mirage.dark import dark_prep
from mirage.ramp_generator import obs_generator
from mirage.yaml.yaml_generator import SimInput

In [None]:
TEST_DATA_DIRECTORY = os.path.normpath(os.path.join(pkg_resources.resource_filename('mirage', ''),
                                                    '../examples/movingtarget_example_data'))

In [None]:
if not os.path.isdir(TEST_DATA_DIRECTORY):
    print("WARNING: test data directory does not exist!")

In [None]:
output_dir = './'

In [None]:
if not os.path.isdir(output_dir):
    print("WARNING: output directory does not exist!")

---
<a id='make_catalogs'></a>
# Create Source Catalogs

The first task in preparing to create simulated data is to create source catalogs. Mirage supports several types of catalogs, with a different catalog for each type of source (e.g. point sources, galaxies). See the [catalog documentation](https://mirage-data-simulator.readthedocs.io/en/stable/catalogs.html) for details.

For this example, for our target we use the ephemeris for Mars (in order to maximize velocity and make the motion easy to see in a short exposure). However, for simplicity we will use a point source in place of Mars' disk. We will also include background stars in order to show the motion of background sources.

### Create non-sidereal catalog

First, create the source catalog containing our target. For this we will use Mirage's non-sidereal catalog type. By using the non-sidereal catalog, we tell Mirage that we wish to have JWST track this source during the exposure. The motion of the non-sidereal source can be specified either with manually entered velocities or with a [JPL Horizons](https://ssd.jpl.nasa.gov/horizons.cgi) formatted ephemeris file. In this example, we will use an ephemeris file.

In [None]:
ephemeris_file = os.path.join(TEST_DATA_DIRECTORY, 'mars_ephemeris.txt')

In [None]:
non_sidereal_catalog = os.path.join(output_dir, 'mars_nonsidereal.cat')

In [None]:
# Create the catalog. Since we are using an ephemeris, there is no need to
# specify the RA, Dec, nor velocity of the source. All will be retrieved from
# the ephemeris file.
ns = catalog_generator.NonSiderealCatalog(object_type=['pointSource'], ephemeris_file=[ephemeris_file])

Add the source magnitudes to the catalog. Note that the magnitude values required by Mirage are magnitudes in the NIRCam/NIRISS filters of interest, so we cannot get these from the ephemeris file. Also, Mirage does not yet support source magnitudes that change with time. 

In [None]:
mag = 14.

In [None]:
# Be sure to add magnitude columns for all filters you wish to simulate. In this case
# the APT file uses filters F150W and F356W
ns.add_magnitude_column([mag], magnitude_system='abmag', instrument='nircam', filter_name='f150w')
ns.add_magnitude_column([mag], magnitude_system='abmag', instrument='nircam', filter_name='f356w')
ns.save(non_sidereal_catalog)

In [None]:
ns.table

### Create catalog of background stars

See the [Catalog Generation Tools](https://github.com/spacetelescope/mirage/blob/master/examples/Catalog_Generation_Tools.ipynb) example notebook for more details on creating source catalogs, including the use of 2MASS/GAIA/WISE/Besancon queries.

In [None]:
point_source_catalog = os.path.join(output_dir, 'background_point_sources.cat')

In [None]:
base_ra = 25.7442083  # degrees
base_dec = 6.4404722  # degrees
cat_width = 93.  # arcseconds
cat_width /= 3600.

In [None]:
# Let's just randomly scatter some stars in the area
ra_val = np.random.uniform(base_ra - cat_width, base_ra + cat_width, 50)
dec_val = np.random.uniform(base_dec - cat_width, base_dec + cat_width, 50)
mags = np.random.uniform(17, 20, 50)

In [None]:
# Set the first background source to be ~140 pixels from our non-sidereal target
# This will make it easier to see the difference between the two in the 
# resulting simulated data
ra_val[0] = 25.74248611078
dec_val[0] = 6.438749978
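
As a rough check on the "~140 pixels" figure, the separation between the two positions can be computed directly. This is a sketch under two assumptions that are not stated in the notebook: that `base_ra` and `base_dec` approximate the position of the non-sidereal target, and that the NIRCam long-wave pixel scale is about 0.063 arcsec per pixel.

```python
import math

# Values copied from the cells above.
base_ra, base_dec = 25.7442083, 6.4404722        # degrees
star_ra, star_dec = 25.74248611078, 6.438749978  # degrees

# Assumption: approximate NIRCam long-wave pixel scale, arcsec per pixel.
PIXEL_SCALE = 0.063


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Small-angle separation in arcseconds between two positions in degrees."""
    mean_dec = math.radians(0.5 * (dec1 + dec2))
    # RA offsets shrink by cos(dec) when converted to an angle on the sky.
    delta_ra = (ra1 - ra2) * math.cos(mean_dec)
    delta_dec = dec1 - dec2
    return math.hypot(delta_ra, delta_dec) * 3600.


sep = separation_arcsec(base_ra, base_dec, star_ra, star_dec)
sep_pix = sep / PIXEL_SCALE
print(f"{sep:.2f} arcsec, about {sep_pix:.0f} pixels")
```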

Note that all Mirage source catalogs must have an "index" column that assigns a number to each source. You cannot have multiple sources with the same index number, even across catalogs (because these index numbers will be used to populate the segmentation map). Since the non-sidereal catalog contains one source (with an index number of 1), we start the index numbers in this catalog at 2.

In [None]:
ptsrc = catalog_generator.PointSourceCatalog(ra=ra_val, dec=dec_val, starting_index=2)
ptsrc.add_magnitude_column(mags, magnitude_system='abmag', instrument='nircam', filter_name='f150w')
ptsrc.add_magnitude_column(mags, magnitude_system='abmag', instrument='nircam', filter_name='f356w')
ptsrc.save(point_source_catalog)

In [None]:
ptsrc.table
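
Because index numbers must be unique across catalogs, a small check can catch collisions before running the simulator. The helper below is a hypothetical sketch, not a Mirage function; it works on any sequences of index values.

```python
from collections import Counter


def duplicated_indexes(*index_columns):
    """Return sorted index values that appear more than once across all columns."""
    counts = Counter(int(value) for column in index_columns for value in column)
    return sorted(value for value, n in counts.items() if n > 1)


# Stand-in values: one non-sidereal source (index 1) and 50 background
# stars numbered from 2, matching the catalogs created above.
print(duplicated_indexes([1], range(2, 52)))
```

With the catalogs in this notebook, the call would presumably be `duplicated_indexes(ns.table['index'], ptsrc.table['index'])`, assuming the column is named `index` as described above.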

---
<a id='make_yaml'></a>
# Generating input yaml files

The easiest way to construct input yaml files is to begin with a proposal in [APT](https://jwst-docs.stsci.edu/jwst-astronomers-proposal-tool-overview). In this example, we use an APT file in the examples/movingtarget_example_data directory. Mirage does not use the APT file directly, but instead uses the exported xml and pointing files.

In [None]:
xml_file = os.path.join(TEST_DATA_DIRECTORY, 'mars_example.xml')
pointing_file = xml_file.replace('.xml', '.pointing')

Due to the large number of ancillary files output by Mirage, it is often helpful to store the yaml files in their own directory, separate from the outputs of the simulator itself.

In [None]:
yaml_output_dir = './'
simdata_output_dir = './'

Inputs into the yaml generator function include the source catalogs, as well as a number of other options detailed on the [yaml_generator documentation page](https://mirage-data-simulator.readthedocs.io/en/stable/yaml_generator.html). See that page for more information.

In [None]:
# Catalogs must be put in a nested dictionary with target names (from the APT file) as
# the top level keys, and catalog types as the second level keys. 
cats = {'MARS': {'moving_target_to_track': non_sidereal_catalog,
                    'point_source': point_source_catalog},
        'ONE': {'point_source': point_source_catalog}}
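
Since the yaml generator is given file paths, it can help to confirm that every catalog in the nested dictionary exists on disk first. The function below is a hypothetical helper, not part of Mirage, shown here with a stand-in dictionary that has the same layout as `cats`.

```python
import os


def missing_catalog_files(catalogs):
    """Return (target, catalog_type, path) for each catalog file not on disk."""
    missing = []
    for target, by_type in catalogs.items():
        for cat_type, path in by_type.items():
            if not os.path.isfile(path):
                missing.append((target, cat_type, path))
    return missing


# Stand-in dictionary with the same layout as `cats`; the path is made up.
example = {'MARS': {'point_source': 'no_such_catalog.cat'}}
for target, cat_type, path in missing_catalog_files(example):
    print(f"{target}/{cat_type}: {path} not found")
```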

In [None]:
# Dates can be specified using a date-only or a datetime string for each observation in
# the proposal. In this case, with a fast-moving target, we will use datetime strings. Keys
# for this dictionary are the observation numbers from the proposal.
dates = {'001': '2020-09-25T00:00:00.0'}

In [None]:
# Now run the yaml_generator and create the yaml files
y = SimInput(input_xml=xml_file, pointing_file=pointing_file, catalogs=cats,
             dates=dates, dateobs_for_background=False, datatype='raw',
             output_dir=yaml_output_dir, simdata_output_dir=simdata_output_dir)
y.create_inputs()

In [None]:
# List the newly-created yaml files
y.yaml_files
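
The list of yaml files can be long for a multi-detector proposal. The sketch below picks out the files for one detector, assuming the file names end in `_<detector>.yaml` as in the file name used later in this notebook. The function name and the second file name in the stand-in list are hypothetical.

```python
import os


def yaml_files_for_detector(yaml_files, detector):
    """Return the yaml files whose name ends with _<detector>.yaml."""
    suffix = f"_{detector.lower()}.yaml"
    return [f for f in yaml_files if os.path.basename(f).lower().endswith(suffix)]


# Stand-in list; in this notebook the input would be y.yaml_files.
names = ['./jw00042001001_01101_00001_nrcb5.yaml',
         './jw00042001001_01101_00001_nrcb1.yaml']
print(yaml_files_for_detector(names, 'NRCB5'))
```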

---
<a id='run_steps_together'></a>
# Create Simulated Data

### The imaging simulator class

The `imaging_simulator.ImgSim` class is a wrapper around the three main steps of the simulator (detailed in the [Running simulator steps independently](#run_steps_independently) section below). This convenience class is useful when creating simulated imaging mode data.

For this example we'll simulate the first exposure in the NRCB5 detector. This should have our target relatively close to the center of the detector.

In [None]:
# Specify the yaml input file to use
yamlfile = os.path.join(yaml_output_dir, 'jw00042001001_01101_00001_nrcb5.yaml')

In [None]:
# Run all steps of the imaging simulator for the yaml file
m = imaging_simulator.ImgSim()
m.paramfile = yamlfile
m.create()

### Examine the Output

In [None]:
def show(array, title, min=0, max=1000):
    plt.figure(figsize=(12, 12))
    plt.imshow(array, clim=(min, max), origin='lower')
    plt.title(title)
    plt.colorbar().set_label('DN/s')

In [None]:
def show_mult(array1, array2, array3, title, min=0, max=1000):
    fig = plt.figure(figsize=(18, 18))
    a = fig.add_subplot(131)
    aplt = plt.imshow(array1, clim=(min, max), origin='lower')
    b = fig.add_subplot(132)
    bplt = plt.imshow(array2, clim=(min, max), origin='lower')
    plt.title(title)
    c = fig.add_subplot(133)
    cplt = plt.imshow(array3, clim=(min, max), origin='lower')

#### Noiseless Seed Image

This image is an intermediate product. It contains only the signal from the astronomical sources and background. There are no detector effects, nor cosmic rays added to this count rate image.

In this case, the seed image has 4 dimensions rather than the 2 dimensions that it is for sidereal targets. This is because the moving sources lead to a seed image that is different in each group of each integration. So let's look at just the final frame of one integration of the seed image.

We'll also zoom in, to make the motion of the background targets more visible. The non-sidereal target is in the upper left corner and appears as a normal PSF. The background star whose coordinates we specified manually when creating the point source catalog is smeared, since the telescope was not tracking at the sidereal rate.

In [None]:
# First, look at the noiseless seed image. Zoom in to make the smeared
# background sources obvious. 
show(m.seedimage[0,-1,850:1100,750:1000], 'Seed Image', max=25000)

#### Final Output Product

Examine the raw output. First, look at a single group, which contains noise and detector artifacts. By zooming in we can minimize the appearance of these effects.

In [None]:
y_base = os.path.basename(yamlfile)
raw_base = y_base.replace('.yaml', '_uncal.fits')
raw_file = os.path.join(simdata_output_dir, raw_base)
with fits.open(raw_file) as hdulist:
    raw_data = hdulist['SCI'].data
print(raw_data.shape)

In [None]:
show(raw_data[0, -1, 850:1100,750:1000], "Final Group", max=15000)

Many of the instrumental artifacts can be removed by looking at the difference between two groups. Raw data values are integers, so first make the data floats before doing the subtraction.

In [None]:
show(1. * raw_data[0, -1, 850:1100,750:1000] - 1. * raw_data[0, 0, 850:1100,750:1000],
     "Last Minus First Group", max=20000)

This raw data file is now ready to be run through the [JWST calibration pipeline](https://jwst-pipeline.readthedocs.io/en/stable/) from the beginning.

---
<a id='run_steps_independently'></a>
# Running simulation steps independently

## First generate the "seed image" 
This is generally a 2D noiseless countrate image that contains only simulated astronomical sources. However, when creating data using non-sidereal tracking, or sidereal tracking where a moving target (e.g. asteroid, KBO) is in the field of view, the seed image will in fact be a 4D seed ramp, with a separate frame for each group of each integration.

A seed image is generated based on a `.yaml` file that contains all the necessary parameters for simulating data. An example `.yaml` file is shown at the [bottom of this notebook](#yaml_example).

In [None]:
# yaml file that contains the parameters of the
# data to be simulated
yamlfile = os.path.join(yaml_output_dir, 'jw00042001001_01101_00001_nrcb5.yaml')

In [None]:
cat = catalog_seed_image.Catalog_seed()
cat.paramfile = yamlfile
cat.make_seed()

### Look at the seed image

In [None]:
# In this case, the seed image is 4D rather than the
# 2D that it is for sidereal targets.
# So let's look at just the final frame of the seed image

# The non-sidereal target is in the upper left corner of this cutout
# and appears as a normal PSF. All of the background stars are
# smeared, since the telescope was not tracking at the sidereal rate.
show(cat.seedimage[0, -1, 850:1100, 750:1000],'Seed Image',max=25000)

In [None]:
# Look at the first, middle, and last frames of the seed image
# so we can see the background sources moving relative to the target,
# and the stationary non-sidereal source getting brighter as exposure
# time increases.
show_mult(cat.seedimage[0, 0, 850:1100, 750:1000],
          cat.seedimage[0, 3,850:1100, 750:1000],
          cat.seedimage[0, -1, 850:1100, 750:1000], 'Seed Images',max=25000)

## Prepare the dark current exposure
This will serve as the base of the simulated data.
This step will linearize the dark current (if it 
is not already), and reorganize it into the 
requested readout pattern and number of groups.

In [None]:
d = dark_prep.DarkPrep()
d.paramfile = yamlfile
d.prepare()

### Look at the dark current 
For this, we will look at an image of the final group
minus the first group

In [None]:
exptime = d.linDark.header['NGROUPS'] * cat.frametime
diff = (d.linDark.data[0, -1, 850:1100, 750:1000] - d.linDark.data[0, 0, 850:1100,750:1000]) / exptime
show(diff,'Dark Current Countrate',max=0.1)

## Create the final exposure
Turn the seed image into an exposure of the 
proper readout pattern, and combine it with the
dark current exposure. Cosmic rays and other detector
effects are added. 

The output can be either this linearized exposure, or
a 'raw' exposure where the linearized exposure is 
"unlinearized" and the superbias and 
reference pixel signals are added, or the user can 
request both outputs. This is controlled from
within the yaml parameter file.

In [None]:
obs = obs_generator.Observation()
obs.linDark = d.prepDark
obs.seed = cat.seedimage
obs.segmap = cat.seed_segmap
obs.seedheader = cat.seedinfo
obs.paramfile = yamlfile
obs.create()

### Examine the final output image
Look at the last group minus the first group

In [None]:
with fits.open(obs.raw_output) as h:
    lindata = h[1].data
    header = h[0].header

In [None]:
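# The target is difficult to see in this view.
# Raw data values are integers, so convert to floats before subtracting.
exptime = header['EFFINTTM']
first_group = lindata[0, 0, 850:1100, 750:1000].astype(float)
last_group = lindata[0, -1, 850:1100, 750:1000].astype(float)
diffdata = (last_group - first_group) / exptime
show(diffdata, 'Simulated Data', min=0, max=200)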

In [None]:
# Show on a log scale, to bring out the presence of the dark current
# Noise in the CDS image makes for a lot of pixels with values < 0,
# which makes this kind of an ugly image. Add an offset so that
# everything is positive and the noise is visible
offset = 2.
plt.figure(figsize=(12,12))
plt.imshow(np.log10(diffdata + offset), clim=(0.001,np.log10(80)), origin='lower')
plt.title('Simulated Data')
plt.colorbar().set_label('DN/s')

---
<a id='mult_sims'></a>
# Simulating Multiple Exposures

### Each yaml file will simulate an exposure for a single pointing using a single detector.

To simulate an exposure using multiple detectors, you must have multiple yaml files. Consider this cumbersome example:
```python
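def make_sim(paramfile):
    '''Run all steps of the imaging simulator for one yaml file
    '''
    m = imaging_simulator.ImgSim()
    m.paramfile = paramfile
    m.create()

yaml_a1 = 'sim_param_A1.yaml'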
yaml_a2 = 'sim_param_A2.yaml'
yaml_a3 = 'sim_param_A3.yaml'
yaml_a4 = 'sim_param_A4.yaml'
yaml_a5 = 'sim_param_A5.yaml'

make_sim(yaml_a1)
make_sim(yaml_a2)
make_sim(yaml_a3)
make_sim(yaml_a4)
make_sim(yaml_a5)
```

This can be performed more efficiently, either in series or in parallel:

### In Series
```python
paramlist = [yaml_a1,yaml_a2,yaml_a3,yaml_a4,yaml_a5]

def many_sim(paramlist):
    '''Function to run many simulations in series
    '''
    for paramfile in paramlist:
        m = imaging_simulator.ImgSim()
        m.paramfile = paramfile
        m.create()

many_sim(paramlist)
```
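
Rather than typing each file name by hand, the list of yaml files could also be collected from the output directory. This is a sketch, assuming the yaml files follow the `jw*.yaml` naming seen earlier in this notebook; `find_yaml_files` is a hypothetical helper, not a Mirage function.

```python
import os
from glob import glob


def find_yaml_files(directory, pattern='jw*.yaml'):
    """Return a sorted list of yaml files in `directory` matching `pattern`."""
    return sorted(glob(os.path.join(directory, pattern)))


paramlist = find_yaml_files('./')
print(f"Found {len(paramlist)} yaml files")
```

Sorting keeps the processing order stable from run to run, which makes it easier to resume after an interruption.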

### In Parallel

Since each `yaml` simulation does not depend on the others, we can parallelize the process to speed things up:
```python
# Need to test this. May need a wrapper since the 
# imaging simulator is a class

from multiprocessing import Pool

n_procs = 5 # number of cores available

with Pool(n_procs) as pool:
    pool.map(make_sim, paramlist)
```