# NIRCam Preimaging: MIRAGE Simulations

**Use case:** Simulation of NIRCam pre-imaging for NIRSpec.<br>
**Data:** JWST simulated NIRCam data from MIRAGE; LMC.<br>
**Tools:**  mirage, jwst, astropy, grismconf, nircam_gsim.<br>
**Cross-instrument:** NIRCam. <br>
**Documentation:** This notebook is part of STScI's larger [post-pipeline Data Analysis Tools Ecosystem](https://jwst-docs.stsci.edu/jwst-post-pipeline-data-analysis).<br>

## Introduction


This notebook gives step-by-step instructions for simulating images of the JWST LMC astrometric calibration field. The NIRCam images are simulated using the software [MIRAGE](https://jwst-docs.stsci.edu/jwst-other-tools/mirage-data-simulator). The observation is designed in APT, and the APT output is used as input to MIRAGE.

This notebook must be executed from an environment that has MIRAGE installed. Follow the instructions on the [Installing MIRAGE webpage](https://mirage-data-simulator.readthedocs.io/en/latest/install.html) before executing this Jupyter Notebook.

### MIRAGE Tutorials

This notebook provides an example of running MIRAGE in a specific science use case. For a broader tutorial on running MIRAGE, we suggest reviewing [Jwebbinar Number 10](https://www.stsci.edu/jwst/science-execution/jwebbinars).



In [None]:
import os
from glob import glob
import shutil
import yaml
import zipfile
import urllib.request

os.environ["PYSYN_CDBS"] = "./grp/redcat/trds/"
synphot_folder = './grp'

synExist = os.path.exists(synphot_folder)
if not synExist:
    os.makedirs(synphot_folder)
    
# mirage imports
from mirage.imaging_simulator import ImgSim
from mirage.seed_image import catalog_seed_image
from mirage.dark import dark_prep
from mirage.ramp_generator import obs_generator
from mirage.yaml import yaml_generator
from mirage.reference_files import downloader

from astropy.table import Table
from astropy.io import fits


In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

## Setting things up

After activating the environment with MIRAGE and starting a Jupyter Notebook session, we begin by defining the working directory.

In [None]:
path='./'  # write here your working directory

os.chdir(path)

In [None]:
pwd

*Developer Note:*
Find a way to install the MIRAGE data for the testing CI. Right now the data size is too large.

MIRAGE is accompanied by a set of reference files that are used to construct the simulated data. Here we define the `MIRAGE_DATA` environment variable, which points to the directory containing those reference files.
For users at STScI, this is the location of MIRAGE data:

In [None]:
os.environ['MIRAGE_DATA'] = './mirage_data/'
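
Before starting a long simulation, it can help to confirm that `MIRAGE_DATA` points at a real directory. The helper below is a minimal sketch; `check_mirage_data` is a hypothetical name and not part of MIRAGE, and it only checks that the directory exists, not that its contents are complete.

```python
import os


def check_mirage_data(env=None):
    """Return the absolute MIRAGE_DATA path, or raise if it is unset or missing.

    Hypothetical helper, not part of MIRAGE.
    """
    env = os.environ if env is None else env
    data_dir = env.get('MIRAGE_DATA')
    if not data_dir:
        raise RuntimeError('MIRAGE_DATA is not set')
    data_dir = os.path.abspath(data_dir)
    if not os.path.isdir(data_dir):
        raise FileNotFoundError(f'MIRAGE_DATA directory not found: {data_dir}')
    return data_dir
```

Calling `check_mirage_data()` with no argument reads the live environment; passing a plain dictionary makes the check easy to try out without touching `os.environ`.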


### Download reference files

This will take a long time, and you will need about 100 GB of disk space.

If the user is outside of STScI then the reference files must be downloaded using the "downloader" module. Please follow the instructions in https://mirage-data-simulator.readthedocs.io/en/latest/reference_files.html and create an appropriate MIRAGE_DATA location. 

In [None]:
download_path = './'

In [None]:
downloader.download_reffiles(download_path, instrument='FGS', dark_type='linearized', skip_darks=False, single_dark=True, skip_cosmic_rays=False, skip_psfs=False)

In [None]:
downloader.download_reffiles(download_path, instrument='NIRCam', dark_type='linearized', skip_darks=False, single_dark=True, skip_cosmic_rays=False, skip_psfs=False)

# Download Data

In [None]:
boxlink = 'https://data.science.stsci.edu/redirect/JWST/jwst-data_analysis_tools/preimaging_notebooks/preimaging.zip'
boxfile = './preimaging.zip'

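# Download and unpack the zip file if it is not already present
if not os.path.exists(boxfile):
    urllib.request.urlretrieve(boxlink, boxfile)

    # the context manager closes the archive once extraction is done
    with zipfile.ZipFile(boxfile, 'r') as zf:
        zf.extractall()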
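
The cell above only extracts the archive when it also has to download it. If the zip file is already on disk but was never unpacked, a separate extraction check may be useful. The following is a sketch under the assumption that the archive unpacks into a `preimaging` folder (suggested by the paths used later in this notebook); `extract_if_needed` is a hypothetical helper name.

```python
import os
import zipfile


def extract_if_needed(zip_path, marker_dir, dest='.'):
    """Extract zip_path into dest unless dest/marker_dir already exists.

    Returns True if the archive was extracted, False if it was skipped.
    """
    if os.path.isdir(os.path.join(dest, marker_dir)):
        return False
    with zipfile.ZipFile(zip_path, 'r') as zf:
        zf.extractall(dest)
    return True
```

In this notebook the call would look like `extract_if_needed('./preimaging.zip', 'preimaging')`.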

## Generating input yaml files

We begin the simulation using the programme's APT file. Two files must be exported from APT: the xml file and the pointing file. They are used as input to the yaml_generator, which generates a series of yaml input files.

Both files should be reachable from the working directory. Here they are read from the `preimaging` folder.


In [None]:
# Specify the xml and pointing files exported from APT
xml_file = os.path.join('preimaging', 'NRC21_pid1069_2018_rev2.xml')
pointing_file = os.path.join('preimaging', 'NRC21_pid1069_2018_rev2.pointing')

Additional optional data to be included.

In [None]:
# Optionally set the telescope roll angle (PAV3) for the observations
pav3=0.0

# Define the output directory
output_dir = path

In this example we create NIRCam images based on a catalogue (all_filters_lmc.cat) of point sources. This catalogue contains the AB magnitude of each source in the following six filters: F070W, F150W, F200W, F277W, F356W, and F444W. 

The dictionary of catalogs must use the APT target names as keys, for example `LMC-ASTROMETRIC-FIELD`. Full details on yaml_generator input options are given here: https://mirage-data-simulator.readthedocs.io/en/latest/yaml_generator.html


This is what the input catalogue looks like: space-separated values, with commented metadata lines followed by an uncommented header line.

``` 
# position_RA_Dec
# abmag
# 
# 
index x_or_RA y_or_Dec nircam_f070w_magnitude nircam_f150w_magnitude nircam_f200w_magnitude nircam_f277w_magnitude nircam_f356w_magnitude nircam_f444w_magnitude
1 80.386396453731 -69.468909240644 21.63889 21.59946 21.93288 22.51786 22.99632 23.4255
2 80.385587687224 -69.469200540277 20.42033 20.05396 20.32926 20.92191 21.37946 21.83321
3 80.38036547567 -69.470930464875 21.8158 21.86888 22.2175 22.8008 23.28381 23.7064
4 80.388130492656 -69.468453170293 21.11582 20.8028 21.08802 21.67932 22.14077 22.59048
5 80.388935773363 -69.468195831029 21.76617 21.80178 22.14757 22.73117 23.21336 23.63717
```
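
To make the layout concrete, here is a small sketch that parses catalogue lines of this form using only the standard library. It is not how MIRAGE reads catalogues; `read_catalog_lines` is a hypothetical helper, the sample is trimmed to the first two rows and two filters of the listing above, and the first column is assumed to be the integer `index`.

```python
def read_catalog_lines(lines):
    """Parse catalogue lines into a list of dicts keyed by column name.

    Lines starting with '#' are treated as comments; the first remaining
    line is taken as the header.
    """
    rows = [ln.split() for ln in lines
            if ln.strip() and not ln.lstrip().startswith('#')]
    header, data = rows[0], rows[1:]
    catalog = []
    for values in data:
        row = {name: float(value) for name, value in zip(header, values)}
        row[header[0]] = int(values[0])  # assumed: first column is an integer index
        catalog.append(row)
    return catalog


sample = """\
# position_RA_Dec
# abmag
#
#
index x_or_RA y_or_Dec nircam_f070w_magnitude nircam_f150w_magnitude
1 80.386396453731 -69.468909240644 21.63889 21.59946
2 80.385587687224 -69.469200540277 20.42033 20.05396
""".splitlines()

catalog = read_catalog_lines(sample)

# Smaller magnitude means brighter, so the brightest F150W source is the minimum
brightest = min(catalog, key=lambda row: row['nircam_f150w_magnitude'])
print(brightest['index'], brightest['nircam_f150w_magnitude'])
```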

For more information on building catalogues, see the MIRAGE catalogue generation notebook:

https://github.com/spacetelescope/mirage/blob/master/examples/Catalog_Generation_Tools.ipynb

In [None]:
# Source catalogs to be used
cat_dict = { 'LMC-ASTROMETRIC-FIELD': {'nircam': {'point_source': 'preimaging/all_filters_lmc.cat'} ,
                                          'fgs': {'point_source': 'dummy.cat'} } ,
             '2 LMC-ASTROMETRIC-FIELD': {'nircam': {'point_source': 'preimaging/all_filters_lmc.cat'} ,
                                          'fgs': {'point_source': 'dummy.cat'} } }
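
Since the catalogue dictionary is nested (target, then instrument, then source type), a quick walk over it can report which catalogue files are not on disk. This is a sketch only; `missing_catalogs` is a hypothetical helper, and whether a missing entry such as `dummy.cat` actually matters to MIRAGE is not something this check can tell you.

```python
import os


def missing_catalogs(cat_dict):
    """Return the sorted catalogue paths in cat_dict that do not exist on disk."""
    missing = set()
    for instruments in cat_dict.values():        # one entry per APT target name
        for catalogs in instruments.values():    # one entry per instrument
            for cat_path in catalogs.values():   # one entry per source type
                if not os.path.isfile(cat_path):
                    missing.add(cat_path)
    return sorted(missing)
```

Running `missing_catalogs(cat_dict)` before the yaml generator gives an early list of paths to check.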

## Running the yaml_generator
This will create a collection of yaml files that will be used as input when creating the simulated data. There will be one yaml file for each detector and exposure, so many files can be created if your programme has lots of exposures or dithers. This LMC programme will generate 528 files using six NIRCam filters and the JWST FGS.

In [None]:
# Run the yaml generator

yam = yaml_generator.SimInput(xml_file, pointing_file, 
                              catalogs=cat_dict, 
                              verbose=True,
                              simdata_output_dir=output_dir,
                              output_dir=output_dir,
                              roll_angle=pav3, 
                              # 'linear,raw' requests both the linearized ramp and the raw (uncal) file
                              datatype='linear,raw') 

yam.use_linearized_darks = True
yam.create_inputs()
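
With one yaml file per detector and exposure, a quick tally per detector is a simple sanity check on what the generator produced. The sketch below assumes the yaml names end in `_<detector>.yaml`, by analogy with the output file names used later in this notebook; `count_by_detector` and the example names are hypothetical.

```python
import os
from collections import Counter


def count_by_detector(yaml_names):
    """Count yaml files per detector, assuming names end in '_<detector>.yaml'."""
    counts = Counter()
    for name in yaml_names:
        stem = os.path.splitext(os.path.basename(name))[0]
        counts[stem.rsplit('_', 1)[-1]] += 1  # last underscore-separated token
    return counts


# Made-up names following the assumed pattern
example = ['jw01069001001_01101_00004_nrcb4.yaml',
           'jw01069001001_01101_00004_nrcb3.yaml',
           'jw01069001001_01101_00005_nrcb4.yaml']
print(count_by_detector(example))
```

On real output, `count_by_detector(glob('jw*yaml'))` would give the per-detector totals.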

## Organizing files according to filter

These notebooks will generate a large amount of data, and it is useful to keep it organized in subdirectories:

- yaml: all the yaml files, organized according to filter
- mirage_output: linear and uncal files
- pipeline_level1: rate files
- pipeline_level2: cal files

In [None]:
path = os.getcwd()
files = glob('jw*yaml')
allfiles = glob('jw*')

# create the output sub directories if they do not exist yet
for subdir in ('mirage_output', 'pipeline_level1', 'pipeline_level2', 'yaml'):
    os.makedirs(os.path.join(path, subdir), exist_ok=True)

Here we store the yaml files in the yaml directory, organized according to filter. The cell below will fail if the files have already been relocated. If you want to intentionally redo this step, please manually remove the previous files from the output directory.

In [None]:
# we organize files according to filter
for yamlfile in files:

    with open(yamlfile, 'r') as stream: #open the yaml file in read mode
        doc = yaml.load(stream, Loader=yaml.FullLoader)
        
        filtname = doc['Readout']['filter'] #read the filter keyword
        if not os.path.exists(os.path.join(path,'yaml',filtname.lower())):
            os.mkdir(os.path.join(path,'yaml',filtname.lower()))
    
    filetomove = yamlfile  
    input_file = filetomove
    output_file = os.path.join(path,'yaml',filtname.lower()) 
    
    print('input  = ',input_file)
    print('output = ',output_file)
    
    shutil.move(input_file, output_file) #move the file to the corresponding sub directory
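
As an alternative to removing files by hand before a re-run, the move can be made tolerant of files that are already in place. This is a minimal sketch, not part of the original workflow; `move_if_absent` is a hypothetical helper that leaves any existing destination file untouched.

```python
import os
import shutil


def move_if_absent(src, dest_dir):
    """Move src into dest_dir unless a file with the same name is already there.

    Returns True if the file was moved, False if it was skipped.
    """
    os.makedirs(dest_dir, exist_ok=True)
    target = os.path.join(dest_dir, os.path.basename(src))
    if os.path.exists(target):
        return False
    shutil.move(src, target)
    return True
```

Note that a skipped file stays in the working directory, so stale copies still need to be cleaned up if the yaml files were regenerated.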


# Execute MIRAGE and create simulated data

Now that the yaml files have been generated, we can execute MIRAGE using them as input parameters and generate the NIRCam images.

As an example, let us choose filter F150W. We are going to simulate all of the images that were observed using filter F150W. The variable "listname" contains the names of the yaml files that we want to process through MIRAGE. There are 128 F150W yaml files.  

### This step will take a long time to run. To decrease the run-time, we will only process Exposure 0004. You can change the filter to process all files if desired.

In [None]:
# input parameters

filtname = 'f150w'

cwd = os.getcwd()
filter_pattern = os.path.join(cwd,'yaml',filtname.lower(),'jw01069001001*0004*yaml') 
files = glob(filter_pattern)[:]
listname = files
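
The selection relies on shell-style wildcards, and `glob` applies the same matching rules as `fnmatch`. The sketch below shows which names the pattern keeps; the file names are made up for illustration, modelled on the output names used later in this notebook.

```python
from fnmatch import fnmatch

pattern = 'jw01069001001*0004*yaml'

# Hypothetical yaml names for two exposures and two detectors
names = [
    'jw01069001001_01101_00004_nrcb4.yaml',
    'jw01069001001_01101_00005_nrcb4.yaml',
    'jw01069001001_01101_00004_nrca1.yaml',
]

selected = [name for name in names if fnmatch(name, pattern)]
print(selected)
```

Because `*0004*` matches that digit string anywhere after the prefix, a stricter pattern such as `jw01069001001_*_00004_*.yaml` may be safer if other parts of the file name could also contain `0004`.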

In [None]:
# copy the F150W yaml files back in the working directory
for yamlfile in files:
    input_file = yamlfile         
    output_file = cwd 
    print('input  = ',input_file)
    print('output = ',output_file)
    shutil.copy(input_file, output_file) #this copies over filter files

In [None]:
# listname is already a Python list of yaml paths, so it can be used directly
yaml_list = [str(name) for name in listname]

print(yaml_list)

files = yaml_list
paramlist = yaml_list

From each yaml file, MIRAGE will produce a noiseless seed image, a "raw" [(level 1b) file](https://jwst-pipeline.readthedocs.io/en/stable/jwst/data_products/science_products.html?highlight=uncal#uncalibrated-raw-data-uncal), and a linearized ramp (equivalent to the output of the linearity correction step of the [calwebb_detector1 pipeline](https://jwst-pipeline.readthedocs.io/en/stable/jwst/pipeline/calwebb_detector1.html)).

In [None]:
for yamlfile in files:
    print('---------------------PROCESSING: ',yamlfile,'  -------------------------------')
    
    # run Mirage
    sim = ImgSim()
    sim.paramfile = yamlfile
    sim.create()
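
When many yaml files are processed in one go, a single failure stops the whole loop. One possible variation, sketched here with hypothetical helper names (`run_all`, `run_mirage`), is to record failures and carry on. The `run_mirage` function uses only the `ImgSim` calls already shown above.

```python
def run_all(yaml_files, simulate):
    """Call simulate(path) for each yaml file; return (succeeded, failed) lists.

    failed holds (path, exception) pairs so problems can be inspected later.
    """
    succeeded, failed = [], []
    for path in yaml_files:
        try:
            simulate(path)
        except Exception as err:  # keep going so one bad file does not stop the batch
            failed.append((path, err))
        else:
            succeeded.append(path)
    return succeeded, failed


def run_mirage(path):
    """Run MIRAGE on one yaml file, using the same calls as the loop above."""
    from mirage.imaging_simulator import ImgSim  # imported lazily; requires MIRAGE
    sim = ImgSim()
    sim.paramfile = path
    sim.create()
```

A call such as `succeeded, failed = run_all(files, run_mirage)` would then process the whole list and leave the failures available for a later retry.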


## Examine the output
Here we display the output files generated by MIRAGE. The UNCAL file is the raw uncalibrated file. 

### Seed image
The seed image contains only the signal from the astronomical sources and background. No detector effects or cosmic rays have been added to this count rate image.


In [None]:
def show(array, title, min=0, max=1000):
    # display a 2D image with a fixed colour scale between min and max
    plt.figure(figsize=(12, 12))
    plt.imshow(array, clim=(min, max))
    plt.title(title)
    plt.colorbar().set_label('DN/s')

In [None]:
seed_file = 'jw01069001001_01101_00004_nrcb4_uncal_F150W_CLEAR_final_seed_image.fits'

with fits.open(seed_file) as hdulist:
    seed_data = hdulist[1].data
print(seed_data.shape)
show(seed_data,'Seed Image',max=5)

### Linear file example
MIRAGE generates the linear and uncalibrated files. Here we display an example linear file. 

In [None]:
linear_file = 'jw01069001001_01101_00004_nrcb4_linear.fits'
with fits.open(linear_file) as hdulist:
    linear_data = hdulist['SCI'].data
print(linear_data.shape)

In [None]:
# this image has five groups
# we display the last group
show(linear_data[0, 4, :, :], "Final Group linear file", max=250)

### Raw uncalibrated file example
First let us display a single group, which is dominated by noise and detector artifacts.

In [None]:
raw_file = 'jw01069001001_01101_00004_nrcb4_uncal.fits'
with fits.open(raw_file) as hdulist:
    raw_data = hdulist['SCI'].data
print(raw_data.shape)

In [None]:
# the image has five groups. Here we display the last group
show(raw_data[0, 4, :, :], "Final Group uncal file", max=15000)

Many of the instrumental artifacts can be removed by looking at the difference between two groups. Raw data values are integers, so first make the data floats before doing the subtraction.

In [None]:
show(1. * raw_data[0, 4, :, :] - 1. * raw_data[0, 0, :, :], "Last Minus First Group uncal file", max=200)
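
The reason for converting to floats first can be seen with a toy example. The sketch below assumes the raw values are stored as unsigned 16-bit integers (check `raw_data.dtype` in your own session); the two small arrays are stand-ins for real groups, not MIRAGE output.

```python
import numpy as np

# Two tiny stand-in "groups"; uint16 is an assumption about the raw data type
first_group = np.array([[1000, 1200]], dtype=np.uint16)
last_group = np.array([[1500, 1100]], dtype=np.uint16)

# Integer arithmetic: a negative difference cannot be represented and wraps around
wrapped = last_group - first_group

# Float arithmetic: the difference keeps its sign
correct = last_group.astype(float) - first_group.astype(float)

print(wrapped)
print(correct)
```

In the second pixel the later group is lower than the earlier one, so the unsigned subtraction yields a large positive number instead of a small negative one. Multiplying by `1.` as in the cell above has the same effect as `astype(float)`.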