# Synthetic image creation for MOSviz pipeline data

**Motivation**: The synthetic dataset we've received from the JWST data pipeline team for use in MOSviz contains simulated spectra from NIRSpec, but no simulated photometry from NIRCam. We'd like to have test imagery to display in MOSviz alongside the 2D and 1D spectra we already possess. We've learned that the pipeline team has no plans to produce any, so we attempt to piece together our own.

**Goal**: Populate an image of background noise with properly scaled galaxy cutouts sourced from an HST image and placed at their respective WCS locations. These galaxies' real spectra don't necessarily correspond to those in our dataset, but at this point we care more about the veneer of having photometry to match with our spectra.

**Execution**: We pull our galaxy cutouts and catalog information from the Hubble Deep Field image. ~~ASTRODEEP's image of the [Abell 2744 Parallel](http://astrodeep.u-strasbg.fr/ff/?img=JH140?cm=grayscale) | [MACS J0416.1-2403 Parallel](http://astrodeep.u-strasbg.fr/ff/?ffid=FF_M0416PAR&id=1264&cm=grayscale)~~. We sought to use [Artifactory](https://bytesalad.stsci.edu/ui/repos/tree/General/jwst-pipeline%2Fdev%2Ftruth) to obtain a range of RA/Dec over which to project our synthetic image, but that information was absent. Instead, we place the image over a manually chosen RA/Dec range and place the cutouts in randomly selected locations within that field of view.

**Issues**:
- We wanted to scrape RA/Dec information from the data pipeline products to get a range of coordinates over which to scale our synthetic image, but it appears that the pipeline's data products don't have `"TARG_RA"` or `"TARG_DEC"` keywords in their headers.
    - The data products also don't appear to have WCS information. We don't strictly need it to achieve this notebook's goals, but it would be convenient to have.
- There doesn't appear to be an observation with level 2 data in `jwst-pipeline/truth/test_nirspec_mos_spec2` and level 3 data in `jwst-pipeline/truth/test_nirspec_mos_spec3`. All observations are either level 2 only or level 3 only.
- _(Resolved)_ A good number of the cutouts from the first couple of field images we tested had intrusions from other galaxies due to crowding. We settled on the Hubble Deep Field as a good source image, but if we hadn't, we might have considered using galaxies modeled with Sersic profiles to get cleaner cutouts to inject into our synthetic image.

*Writers: Robel Geda and O. Justin Otor*

In [None]:
from astropy.io import fits
from astropy.nddata import block_reduce, Cutout2D
from astropy.stats import sigma_clipped_stats, sigma_clip
from astropy.table import Table, join
from astropy.wcs import WCS
from glob import glob

import matplotlib.pyplot as plt
import numpy as np

### Generate galaxy cutouts

In [None]:
# download the image containing sources to be cut out later
image_fits = fits.open('https://archive.stsci.edu/pub/hlsp/hdf/v2/mosaics/x4096/f814_mosaic_v2.fits')
image_header = image_fits[0].header
image_data = image_fits[0].data

image_data.shape

In [None]:
# download those sources' locations in the image and sort them by brightness
source_info1 = Table.read('https://archive.stsci.edu/pub/hlsp/hdf/wfpc_hdfn_v2catalog/HDFN_wfpc_v2generic.cat',
                          format='ascii')
source_info2 = Table.read('https://archive.stsci.edu/pub/hlsp/hdf/wfpc_hdfn_v2catalog/HDFN_f814_v2.cat',
                          format='ascii')
sources = join(source_info1, source_info2)

# confirm that both tables contain the same objects in the same order
(source_info1['NUMBER'] == source_info2['NUMBER']).sum() == len(source_info1) == len(source_info2)

In [None]:
# sort sources by flux contained within 71.1 pixel diameter of source (11)
sources.sort('FLUX_APER_11', reverse=True)
#sources.sort('KRON_RADIUS', reverse=True)
#sources.sort('FWHM_IMAGE')

# filter out likely stars and sources with negative flux
sources = sources[(sources['CLASS_STAR'] < .5)
                  & (sources['FLUX_APER_8'] > 0)]

In [None]:
sources[:5]

In [None]:
# convert the sources' WCS locations to in-image pixel values
image_wcs = WCS(image_fits[0].header)
sources_x, sources_y = image_wcs.world_to_pixel_values(sources['ALPHA_J2000'],
                                                       sources['DELTA_J2000'])

In [None]:
# save a list of good cutouts for later use
cutout_list = []
first_source = 0
catalog_size = 20
downsample_factor = 2
patch_length = 100

for x, y in list(zip(sources_x, sources_y))[first_source:]:
    # use pixel locations to cut a source from the image
    cutout = Cutout2D(image_data, (x, y),
                      patch_length * downsample_factor).data
    
    # bin by downsample_factor to increase field of view
    cutout = block_reduce(cutout, downsample_factor)
    
    # skip any cutouts that extend past the image border
    if (  np.all(cutout[-1] <= 0) or np.all(cutout[0] <= 0)
          or np.all(cutout[:,-1] <= 0) or np.all(cutout[:,0] <= 0)  ):
        continue
        
    # save and plot the new cutout
    cutout_list.append(cutout)
    
    plt.imshow(cutout, vmin=-1e-5, vmax=image_data.std(),
               origin='lower', cmap='bone')
    plt.show()
    
    if len(cutout_list) == catalog_size:
        break

In [None]:
clipped_mean, clipped_median, clipped_stddev = sigma_clipped_stats(image_data, sigma=3.)

### Extract destination RA/Dec from spectra files

In [None]:
# save level 3 spectra FITS header information
# (is there a way to do so using the URL?)
# 'https://bytesalad.stsci.edu/ui/repos/tree/General/jwst-pipeline%2Fdev%2Ftruth%2Ftest_nirspec_mos_spec3%2Fjw00626-o030_s00000_nirspec_f170lp-g235m_s2d.fits'
x1d_header = fits.getheader('/Users/jotor/Downloads/jw00626-o030_s00000_nirspec_f170lp-g235m_x1d.fits')
s2d_header = fits.getheader('/Users/jotor/Downloads/jw00626-o030_s00000_nirspec_f170lp-g235m_s2d.fits')

In [None]:
for k, v in x1d_header.items():
    if 'or' in k.lower():
        print(k, v)

In [None]:
(x1d_header['TARG_RA'], x1d_header['TARG_DEC'],
 s2d_header['TARG_RA'], s2d_header['TARG_DEC'])

In [None]:
# no WCS data... not what we want
WCS(x1d_header)

In [None]:
# search for RA/Dec information from Artifactory observation files
x1d_header_list = [fits.getheader(file) for file in glob('/Users/jotor/Downloads/jw00626*x1d.fits')]

# all the same... not what we want
ras, decs = np.array([[h['TARG_RA'], h['TARG_DEC']] for h in x1d_header_list]).T
ras, decs

In [None]:
# since all information is the same, we randomly generate our own RAs/decs
# (ranges based on NIRSpec MSA's on-sky projection size of 3.6x3.4 arcmins)
np.random.seed(19)
ras = np.random.uniform(0, 1/15, catalog_size)
decs = np.random.uniform(-1/30, 1/30, catalog_size)

### Create synthetic image

In [None]:
# create synthetic image onto which cutouts will be pasted
synth_img_size = 1000
synth_image = np.zeros((synth_img_size, synth_img_size))

In [None]:
# add noise
synth_image += np.random.normal(loc=clipped_mean, scale=clipped_stddev*8,
                                size=synth_image.shape)
# synth_image += np.random.normal(loc=image_data.mean(), scale=image_data.std(),
#                                size=synth_image.shape)

In [None]:
plt.imshow(synth_image, cmap='bone')
plt.show()

### Fill out new WCS object for `synth_image`

In [None]:
synth_wcs = WCS(naxis=2)
synth_wcs

In [None]:
ra_bounds = np.array([ras.max(), ras.min()])
dec_bounds = np.array([decs.max(), decs.min()])

In [None]:
delta_ra = ras.max() - ras.min()
delta_dec = decs.max() - decs.min()

In [None]:
# set minimum FOV as maximum range in RA or dec
if delta_ra > delta_dec:
    min_image_fov = abs(delta_ra * np.cos(np.pi/180 * dec_bounds.sum()/2))
else:
    min_image_fov = delta_dec
    
min_image_fov

In [None]:
# scale this FOV by pixels
pix_scale = min_image_fov / synth_img_size

# add a buffer to the borders
pix_scale *= 1.5
pix_scale

In [None]:
synth_wcs.wcs.ctype = ['RA---TAN', 'DEC--TAN']

# match value of center pixel of detector to value of FOV's central coordinate in the sky
synth_wcs.wcs.crpix = [synth_img_size / 2, synth_img_size / 2]
synth_wcs.wcs.crval = [ra_bounds.sum() / 2, dec_bounds.sum() / 2]

# distance (in sky coordinates) traversed by one pixel length in each dimension
synth_wcs.wcs.cdelt = [-pix_scale, pix_scale]

synth_wcs

In [None]:
# convert source RAs/decs from real coordinates to pixels 
ras_pix, decs_pix = np.round(synth_wcs.world_to_pixel_values(ras, decs)).astype(int)
ras_pix, decs_pix

### Populate `synth_image` with the cutouts

In [None]:
cutout_half_delta = patch_length // 2

for i in range(len(ras_pix)):
    # NumPy indexes arrays as [row, column], i.e. [y (Dec), x (RA)]
    synth_image[decs_pix[i] - cutout_half_delta : decs_pix[i] + cutout_half_delta,
                ras_pix[i] - cutout_half_delta : ras_pix[i] + cutout_half_delta] += cutout_list[i]
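
When pasting patches into an image, two details are easy to get wrong: NumPy arrays are indexed `[row, column]`, which is `[y, x]`, and a patch near the border can run off the array. The sketch below handles both on a toy image. The helper name `paste_cutout` is hypothetical, and clipping at the edges is an assumption about the desired behaviour; the loop above relies on the buffered pixel scale to keep every cutout inside the image.

```python
import numpy as np


def paste_cutout(image, cutout, x_center, y_center):
    """Add `cutout` into `image` in place, centred on (x_center, y_center).

    Parts of the cutout that fall outside the image are dropped.
    """
    height, width = cutout.shape
    y0 = y_center - height // 2
    x0 = x_center - width // 2

    # overlap between the cutout footprint and the image, in image pixels
    iy0, iy1 = max(y0, 0), min(y0 + height, image.shape[0])
    ix0, ix1 = max(x0, 0), min(x0 + width, image.shape[1])
    if iy0 >= iy1 or ix0 >= ix1:
        return image  # no overlap at all

    # rows (y) come first, columns (x) second
    image[iy0:iy1, ix0:ix1] += cutout[iy0 - y0:iy1 - y0, ix0 - x0:ix1 - x0]
    return image


demo_image = np.zeros((10, 20))
paste_cutout(demo_image, np.ones((4, 4)), x_center=15, y_center=2)  # fully inside
paste_cutout(demo_image, np.ones((4, 4)), x_center=0, y_center=9)   # clipped at a corner
```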

In [None]:
fig, ax = plt.subplots(figsize=(10,10))
ax.imshow(synth_image, vmin=0, vmax=synth_image.std()*3, origin='lower', cmap='bone')
#plt.imshow(synth_image, vmin=0, vmax=synth_image.mean()*3, origin='lower', cmap='bone')
plt.show()

In [None]:
fits.writeto('synthetic_HDF_more.fits', synth_image,
             header=synth_wcs.to_header(), overwrite=True)