# How to run the pipeline on a set of MIRI imager data with defaults and update a few parameters

This example notebook will demonstrate how to process a MIRI imager data set through the stage one, two and three imager pipelines. 

The pipeline documentation can be found here: https://jwst-pipeline.readthedocs.io/en/latest/

**The pipeline code and install directions are available on GitHub: https://github.com/spacetelescope/jwst**

The steps in this notebook are:


  1)  Read in list of uncalibrated data (uncal.fits) files.

  2)  Process through calwebb_detector1.

  3)  Process ramp fit (rate.fits) files through calwebb_image2.

  4)  Create an association file for the calibrated files (cal.fits).

  5)  Run the calibrated files through calwebb_image3 using the association file.

  6)  Look at the output catalogs and compare sources on image.
    
Setup needed before getting started.

* Place notebook and data into the directory where you wish to process your data.

* Install pipeline (and pip install any missing modules you encounter when you try to run).


There will also be a set of useful plots and image displays throughout the notebook that will help examine data quality as you proceed.  


This notebook was written in October 2023 and was meant to work with pipeline version 1.12.0. Future versions of the pipeline should work, but backward compatibility of notebooks is not guaranteed with all Python package updates.

## Information on Running the pipeline

This notebook uses the .call() method of running the pipeline in Python. With this method, the pipeline retrieves and uses any parameter reference file that applies to your data. These files set certain parts of the code to run or be skipped, and select the parameters that the instrument teams have chosen as preferred defaults for most science cases. If you run with this method and do not set any custom parameters, you will replicate what the automated pipeline produces and populates in MAST. Like other reference files such as flats and darks, these parameter reference files can be found in CRDS.

If you wish to customize the parameters, you can read the documentation in the ReadTheDocs link posted above. Each step in the pipeline has a section in the documentation that explains the parameters relevant to that step. To set the parameters in your code as you re-run data, you set up a configuration dictionary (examples are shown in this notebook), and that custom configuration overrides the default values that would have come from the parameter reference files. Only the parameters listed in your custom dictionary are overridden, while the rest remain set to the default values.
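As a rough illustration of that override behavior (this is not the pipeline's own code), the sketch below layers a custom dictionary on top of a set of defaults. The function name `apply_overrides` and the default values are hypothetical, made up for the example; only the parameter names `rejection_threshold` and `find_showers` come from this notebook.

```python
import copy


def apply_overrides(defaults, custom):
    """Return a copy of `defaults` with values from `custom` layered on top.

    Both arguments map step name -> {parameter name: value}.
    """
    merged = copy.deepcopy(defaults)
    for step, params in custom.items():
        merged.setdefault(step, {}).update(params)
    return merged


# Hypothetical defaults, standing in for values from a parameter reference file
defaults = {"jump": {"rejection_threshold": 4.0, "find_showers": False}}

# Custom dictionary in the same nested shape as the `steps=` argument used later
custom = {"jump": {"find_showers": True}}

merged = apply_overrides(defaults, custom)
print(merged["jump"])
```

Only `find_showers` changes in the merged result; `rejection_threshold` keeps its (hypothetical) default because it was not listed in the custom dictionary.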

Most parameters that would need customization come in the stage three pipeline, but the jump step in the first pipeline stage, calwebb_detector1, also has a more extensive set of customizable parameters, and reading the documentation for this step could prove useful. (More details at the calwebb_detector1 stage in this notebook.)

In order to run the pipeline and retrieve the necessary reference files from CRDS, two CRDS variables need to be set. CRDS_PATH tells the pipeline where to save and retrieve reference files. This should be set to a local directory where new reference files will be downloaded, so that frequently used reference files can be reused without pulling large files across your internet connection again. The other variable that needs to be set is CRDS_SERVER_URL, which tells the pipeline where to look for any new reference files. It should be set to https://jwst-crds.stsci.edu. In this notebook, these are set using the os.environ statements below.

NOTE: you should store only CRDS reference files in your CRDS cache directory. This is in case you need to delete and redo your CRDS cache area at any point, and you don't want to chance erasing anything that should not be deleted.

os.environ['CRDS_PATH'] = '/local_crds_path/crds_cache/'

os.environ['CRDS_SERVER_URL'] = 'https://jwst-crds.stsci.edu'
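If you prefer to wrap this setup in a function, a minimal sketch might look like the following. The helper name `configure_crds` is hypothetical and not part of the jwst or crds packages; it only sets the two environment variables described above and makes sure the cache directory exists.

```python
import os
from pathlib import Path


def configure_crds(cache_dir, server_url="https://jwst-crds.stsci.edu"):
    """Set the two CRDS environment variables and create the cache directory.

    Call this before importing any jwst or crds modules.
    """
    cache_path = Path(cache_dir).expanduser()
    cache_path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    os.environ["CRDS_PATH"] = str(cache_path)
    os.environ["CRDS_SERVER_URL"] = server_url
    return cache_path
```

For example, `configure_crds('~/crds_cache')` would place the cache under your home directory.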

### Using datamodels

This notebook will show how to use datamodels in working with your data. 
The basic steps involve reading a file into a datamodel type and then opening the individual named extensions as needed.

Header data is stored in the .meta extension.

Science data is stored in the .data extension.

Data quality flags can be accessed with .groupdq (in RampModel data files that allow accessing dq flags of each group in the data) or .dq (in ImageModel data files after calwebb_detector1 is run).

For more in-depth information on datamodels and how to use them: https://stdatamodels.readthedocs.io/en/latest/jwst/datamodels/index.html#data-models
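Data quality values are bit flags, so a single integer can carry several flags at once. The sketch below shows the bitwise test used later in this notebook on plain integers. The names `FLAGS`, `decode_dq` and `has_flag` are hypothetical helpers for illustration; in real use the name-to-value mapping comes from `dqflags.pixel`. The values for DO_NOT_USE (1) and JUMP_DET (4) are the ones quoted in this notebook, and SATURATED (2) is included as an assumption.

```python
# Bit values for three flags. DO_NOT_USE (1) and JUMP_DET (4) are quoted in this
# notebook; SATURATED (2) is an assumption added for illustration.
FLAGS = {"DO_NOT_USE": 1, "SATURATED": 2, "JUMP_DET": 4}


def decode_dq(value, flags=FLAGS):
    """Return the names of all flags whose bit is set in `value`."""
    return [name for name, bit in flags.items() if value & bit]


def has_flag(value, name, flags=FLAGS):
    """Return True if the named flag is set in `value`."""
    return (value & flags[name]) > 0


print(decode_dq(5))             # 5 = 1 + 4, so two flags are set
print(has_flag(4, "JUMP_DET"))
```

The same `value & flag > 0` pattern works element-wise on whole `.dq` or `.groupdq` arrays.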

### Import statements
Import modules that will be needed to read in data, run the pipeline and display any plots or visualizations.

Packages that must be installed:
* jwst
* jupyter

### Set CRDS Path info

The CRDS path information needs to be set before importing any crds or jwst packages. The jwst installation instructions here: https://jwst-pipeline.readthedocs.io/en/latest/getting_started/quickstart.html show how to set up the CRDS information in a shell setup file (such as .bash_profile), but if the paths are not set in a setup file, they can be set in a notebook using os.environ as shown below.

In [None]:
# Changes for the Science Platform environment
# Preparing cached data
import os
import glob

preloaded_fits_dir = "/home/shared/preloaded-fits/jwebbinar_31/miri/"
for filename in glob.glob(os.path.join(preloaded_fits_dir, "*.fits")):
    basename = os.path.basename(filename)
    if not os.path.exists(basename):
        os.symlink(filename, basename)

In [None]:
# Set CRDS path info
import os

os.environ['CRDS_PATH'] = os.environ['HOME']+'/crds_cache/' 
os.environ['CRDS_SERVER_URL'] = 'https://jwst-crds.stsci.edu'

print('CRDS cache location: {}'.format(os.environ['CRDS_PATH']))

In [None]:
#import jwst pipeline modules
import jwst
from jwst.pipeline import Detector1Pipeline, Image2Pipeline, Image3Pipeline # pipeline modules
from jwst import datamodels
from jwst.datamodels import RampModel, ImageModel, dqflags # Data models and dq values

# Needed for associations
from jwst import associations
from jwst.associations.lib.rules_level3_base import DMS_Level3_Base
from jwst.associations import asn_from_list

# Other modules/functions to work with and examine data
import numpy as np
import os
import matplotlib
import matplotlib.pyplot as plt
import glob

from astropy.io import ascii, fits
from astropy.visualization import simple_norm, ImageNormalize, ManualInterval, SqrtStretch  # last three are used by overlay_catalog
from astropy.modeling import models, fitting

from astropy.stats import sigma_clipped_stats
from astropy.convolution import Gaussian1DKernel, convolve
from astropy.wcs import WCS
from astropy.coordinates import SkyCoord
from astropy import table
#from astropy.table import Table

from tweakwcs import JWSTgWCS

# Box download imports 
from astropy.utils.data import download_file
from pathlib import Path
from shutil import copy, move  # copy() is used by get_box_files below
from os.path import splitext

import crds

In [None]:
# Print the version of the JWST pipeline.

print(jwst.__version__)

### Load helper script for overlaying a catalog over the image



In [None]:
# Helper script to plot an image and overlay catalog sources

def overlay_catalog(
    data_2d,
    catalog,
    flux_limit=0,
    vmin=0,
    vmax=10,
    title=None,
    units="MJy/str",
    dmap="binary",
):
    """Function to generate a 2D image of the data,
    with sources overlaid.

    data_2d : numpy.ndarray
        2D image to be displayed

    catalog : astropy.table.Table
        Table of sources

    flux_limit : float
        Minimum signal threshold to overplot sources from catalog.
        Sources below this limit will not be shown on the image.

    vmin : float
        Minimum signal value to use for scaling

    vmax : float
        Maximum signal value to use for scaling

    title : str
        String to use for the plot title

    units : str
        Units of the data. Used for the annotation in the
        color bar

    dmap : str
        Name of the matplotlib colormap used to display the image
    """
    norm = ImageNormalize(
        data_2d, interval=ManualInterval(vmin=vmin, vmax=vmax), stretch=SqrtStretch()
    )
    fig, ax = plt.subplots(figsize=(12, 10))
    im = ax.imshow(data_2d, origin="lower", norm=norm, cmap=plt.get_cmap(dmap))

    for row in catalog:
        if row["aper_total_flux"].value > flux_limit:
            plt.plot(
                row["xcentroid"],
                row["ycentroid"],
                marker="o",
                markersize=3,
                color="red",
            )

    plt.xlabel("X pixel")
    plt.ylabel("Y pixel")

    fig.colorbar(im, label=units)
    fig.tight_layout()
    plt.subplots_adjust(left=0.15)

    if title:
        plt.title(title)

### Read in data

For this notebook, the data is stored in Box and will be loaded into the working directory.
You can also just put the notebook and data into a single directory, or include path information as you work with the data. 

The data provided here as an example come from PID 1040, and consist of five dither positions that make up a single tile of a larger LMC mosaic. The data is taken in filter F770W, and each exposure is six frames and a single integration.


In [None]:
# Read in dataset from Box

def get_box_files(file_list):
    for box_url,file_name in file_list:
        if 'https' not in box_url:
            box_url = 'https://stsci.box.com/shared/static/' + box_url
        downloaded_file = download_file(box_url, timeout=600)
        print(downloaded_file)
        if Path(file_name).suffix == '':
            ext = splitext(box_url)[1]
            file_name += ext
        print(file_name)
        copy(downloaded_file, file_name)


# F770W data of PID 1040 (LMC), one tile of full mosaic, taken in May 2022   
file_urls = ['https://stsci.box.com/shared/static/yk2kfwxr1dv7mcc84zvi02t7mq8e5wc7.fits',
             'https://stsci.box.com/shared/static/icedk709owq6op5mttv4y8k0bce58wu7.fits',
             'https://stsci.box.com/shared/static/kq4h4pb95yse3k1ryb0d1vcs69ay4679.fits',
             'https://stsci.box.com/shared/static/z1wbjh0y3dzuh2i5heejo0sehwqu2p2w.fits',
             'https://stsci.box.com/shared/static/q01wh0evq7w7klqy566cbm1qlpuys7h7.fits'] 

uncalfiles = ['jw01040001005_03103_00001_mirimage_uncal.fits',
              'jw01040001005_03103_00002_mirimage_uncal.fits',
              'jw01040001005_03103_00003_mirimage_uncal.fits',
              'jw01040001005_03103_00004_mirimage_uncal.fits',
              'jw01040001005_03103_00005_mirimage_uncal.fits']



In [None]:
# Run commands to load the data from box into current directory

# box_download_list = [(url,name) for url,name in zip(file_urls,uncalfiles)]


# get_box_files(box_download_list)

# print(uncalfiles)

In [None]:
# To just gather all uncalfiles in a specific folder: Do not do this if you have multiple datasets in a single folder

#uncalfiles = glob.glob('*uncal.fits')
#print(uncalfiles)

In [None]:
# If files are already in the directory with the data, and you only need the list of files for processing, use these lines.

uncalfiles = ['jw01040001005_03103_00001_mirimage_uncal.fits',
             'jw01040001005_03103_00002_mirimage_uncal.fits',
             'jw01040001005_03103_00003_mirimage_uncal.fits',
             'jw01040001005_03103_00004_mirimage_uncal.fits',
             'jw01040001005_03103_00005_mirimage_uncal.fits']

### Look at your data and header parameters

Read your data files into a RampModel data model to examine some header parameters.
Read a sample data file into a model to display the last frame in the ramp to look at the scene in the data.

In [None]:
# Look at some header parameters from your data

print('File name, Instrument, Subarray,  Filter,  Nints,  Ngroups')

for file in uncalfiles: 

    imfile = RampModel(file) # Read files into datamodel

    header = imfile.meta # Read the meta data into a variable called 'header'
    
    # Read in the header keywords
    name = header.filename
    inst = header.instrument.name
    subarray = header.subarray.name
    filt = header.instrument.filter
    nints = header.exposure.nints
    ngroups = header.exposure.ngroups

    print(name, inst, subarray, filt, nints, ngroups)

In [None]:
# If you need to find a specific header value to view, and you know the FITS header keyword, you 
# can use this code to find the datamodel equivalent of the keyword.

imfile.find_fits_keyword('DATE-OBS')


In [None]:
# set up sample file name for image viewing and up the ramp pixel plotting

scifile = uncalfiles[0]
print(scifile)

# Split filename to get the base section of the name without the specific stage of processing.
samplefile, remainder = scifile.split('_uncal.')
print(samplefile)
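The same base name is reused throughout the notebook to build the names of the rate, jump and cal files. A small helper along these lines can make that explicit. The function name `stage_filename` is hypothetical and only assumes that the input ends in `_uncal.<extension>`, as the example files here do.

```python
def stage_filename(uncal_name, stage):
    """Swap the '_uncal' suffix of a file name for another stage suffix."""
    base, sep, ext = uncal_name.rpartition("_uncal.")
    if not sep:
        raise ValueError(f"{uncal_name!r} does not end in '_uncal.<ext>'")
    return f"{base}_{stage}.{ext}"


name = "jw01040001005_03103_00001_mirimage_uncal.fits"
print(stage_filename(name, "rate"))
print(stage_filename(name, "cal"))
```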


In [None]:
# Read sample data file into a datamodel
uncal_im = RampModel(samplefile+'_uncal.fits')

In [None]:
# Look at the last frame of one of the data files to visualize the field of view

fig, ax = plt.subplots(figsize=(20,20))

# Set up image
cax = ax.imshow(uncal_im.data[0, -1, :, :], cmap='Greys', origin='lower', vmin=2500,vmax=4500)

# Set up colorbar
cb = fig.colorbar(cax)
cb.ax.set_ylabel('counts',fontsize=14)

#Set labels 
ax.set_xlabel('X',fontsize=16)
ax.set_ylabel('Y',fontsize=16)
ax.set_title(scifile, fontsize=16)
plt.tight_layout()

In [None]:
# Plot a pixel up the ramp 

xpos = 590
ypos = 289

integration = 0  # This is single integration data, but this works with multi integration data to look at a single int out of the exposure

plt.xlabel('Frame number')
plt.ylabel('Pixel value in counts')

plt.plot(uncal_im.data[integration, :, ypos, xpos])

## Run calwebb_detector1 on your data

Documentation link for calwebb_detector1: https://jwst-pipeline.readthedocs.io/en/latest/jwst/pipeline/calwebb_detector1.html

Set a few parameters for the jump step (most steps are fine to run with defaults).
Loop through and run all files through calwebb_detector1 to get 'rate.fits' files that contain slope fit images.

When setting up the pipeline stage to run, using the .call() method requires setting up dictionaries for each of the steps that will have parameters set that are different from the defaults.

If you wish to track what pixels are flagged as jumps, you can set the jump step parameter 'save_results' to True and examine the output '*jump.fits' files.

#### The detector1 pipeline will take the longest amount of runtime in this notebook. This example uses a small dataset, but if you have large data (either in large numbers of frames/integrations or number of exposures), this can take a while to run, so be prepared for that.

The jump step has the largest number of adjustable parameters in calwebb_detector1. If you choose to turn on shower or snowball correction code (showers for MIRI, snowballs for NIR), then it would be a good idea to see what the adjustable options are. There are default values defined in the parameter reference files, but those may not be ideal for specific modes or science cases, so check the full list of parameters if your results seem less than optimal. And keep in mind there are different parameters for snowball and shower code, so be sure to read the documentation. https://jwst-pipeline.readthedocs.io/en/latest/jwst/jump/arguments.html
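To build some intuition for what a rejection threshold does, here is a heavily simplified, single-pixel sketch of flagging outliers in the group-to-group differences of a ramp. It is not the pipeline's algorithm, which uses a proper noise model and additional logic. The function `flag_jumps`, its `noise` argument and the ramp values are all hypothetical.

```python
from statistics import median


def flag_jumps(ramp, noise, rejection_threshold=5.0):
    """Flag groups whose step from the previous group is an outlier.

    ramp : list of float, pixel values for each group in one integration
    noise : float, assumed 1-sigma uncertainty on a group-to-group difference
    Returns a list of booleans, one per group (the first group is never flagged).
    """
    if len(ramp) < 2:
        return [False] * len(ramp)
    diffs = [b - a for a, b in zip(ramp, ramp[1:])]
    typical = median(diffs)  # robust estimate of the normal step between groups
    flags = [False]
    for d in diffs:
        flags.append(abs(d - typical) > rejection_threshold * noise)
    return flags


ramp = [100.0, 110.0, 120.0, 130.0, 240.0, 250.0, 260.0]  # made-up ramp with one jump
print(flag_jumps(ramp, noise=5.0))
```

In this toy version, raising `rejection_threshold` flags fewer groups and lowering it flags more, which is the trade-off to keep in mind when adjusting the real parameter.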

In [None]:
# Set a few Jump step parameters, for versions 1.10.0 and later (build version 9.2)
# find_showers turns the cr shower code on for MIRI. Set to False to skip this code. (Default is currently False.)
# Use mostly code defaults for the rest of the parameters.

rej_thresh = 5

expand_large_events = False  # This parameter is used for NIR instruments
find_showers = True  # This parameter is used for MIRI only

In [None]:
# Set a few parameters and run each file through the Detector1 Pipeline

for file in uncalfiles:
    base, remainder = file.split('_uncal.')
    print(base)

    # Set up dictionaries for step parameters
    cfg = dict()
    cfg['jump'] = {} # Set up dictionary for the various jump parameters
    cfg['jump']['save_results'] = True # The jump output file is useful if you want to track which frames have jumps flagged
    cfg['jump']['rejection_threshold'] = rej_thresh
    cfg['jump']['expand_large_events'] = expand_large_events
    cfg['jump']['find_showers'] = find_showers

    # .call() is a class method, so no pipeline instance needs to be created first
    Detector1Pipeline.call(file, steps=cfg, save_results=True, output_file=base + '.fits')

### Look at output rate image

Plot one of the output rate images to see what the image looks like. The code here will also mark any pixels with a DQ value of DO_NOT_USE in blue. These blue pixels are considered bad in some way and will not contribute to the final combined image. You can see an image of an example flat field mask here: https://jwst-docs.stsci.edu/jwst-mid-infrared-instrument/miri-instrument-features-and-caveats. You will notice some of the same shorted columns and bad pixels that are flagged blue in our example below.

In [None]:
# Load and look at a single rate image

# Read the rate file into an ImageModel datamodel
ratefiles = [ele.replace('uncal', 'rate') for ele in uncalfiles]
rate_im = ImageModel(samplefile+'_rate.fits')

In [None]:
# Look at the averaged slope image, and plot the DO_NOT_USE dq flagged pixels in blue


# mask out DO_NOT_USE values of 1
masked_rate_im = np.ma.masked_where((rate_im.dq & dqflags.pixel['DO_NOT_USE'] > 0), rate_im.data)

cmap = matplotlib.colormaps["Greys"].copy()  # Can be any registered colormap name; copy it so the original is not modified
cmap.set_bad(color='blue') # color to mark all DO_NOT_USE pixels


fig, ax = plt.subplots(figsize=(20,20))

# Set up image
cax = ax.imshow(masked_rate_im, cmap=cmap, origin='lower', vmin=10,vmax=25.0)

# Set up colorbar
cb = fig.colorbar(cax)
cb.ax.set_ylabel('counts/sec',fontsize=14)

#Set labels 
ax.set_xlabel('X',fontsize=16)
ax.set_ylabel('Y',fontsize=16)
ax.set_title(ratefiles[0], fontsize=16)
plt.tight_layout()


### Look at a few specific ramps and whether any jumps were flagged

Read in the output jump.fits file (output specifically in the jump step), and choose a few pixels in the image to plot up the ramp and look for jumps. The code below plots the selected pixels up the ramp and places a black dot to show where pixels were flagged up the ramp as jumps.

In [None]:
# Load jump output image and examine the same pixel up the ramp to see if any jumps were flagged.

# Read the jump file into a RampModel datamodel
jumpfile = samplefile+'_jump.fits'
jumpim = RampModel(jumpfile)


In [None]:
# Choose a set of pixels to plot up the ramp and check for jumps

xvals = [590, 715, 942, 574]
yvals = [289, 278, 154, 607]

In [None]:
# Cycle through pixel values and plot ramp and list dq values
# This is plotted for a single integration of a cube 

# The black dot is used to represent a pixel (in a single group) that has been flagged as a jump
# The value for a jump detection in the dq array is 4, but you can search for it with the name 'JUMP_DET' rather than the value.

nframes = jumpim.meta.exposure.ngroups
frames = np.arange(nframes)

# set up titles for plot
plt.title(jumpfile)
plt.xlabel('Frame number')
plt.ylabel('value up the ramp in counts')

i=0
# Loop through x,y values
for x, y in zip(xvals, yvals):
    # get locations of flagged pixels within the ramps
    # The groupdq extension is the one that tracks all flagged pixels up the ramps.
    jumps = jumpim.groupdq[integration, :, y, x] & dqflags.pixel['JUMP_DET'] > 0
    
    ramp = jumpim.data[integration, :, y, x]

    # plot ramps of selected pixels and flagged jumps
    plt.plot(ramp, label='ramp '+str(i) )  # Plot ramps
    plt.plot(frames[jumps], ramp[jumps], color='k', marker='o', linestyle='None') # Plot any jumps up the ramp
    i=i+1
    
    # Print dq values for each group up the ramp for each pixel; jumps will include a value of 4
    print(jumpim.groupdq[integration, :, y, x])
    
plt.legend()  

## Run calwebb_image2

Documentation link for calwebb_image2: https://jwst-pipeline.readthedocs.io/en/latest/jwst/pipeline/calwebb_image2.html

Run the output of detector1 (rate files) through calwebb_image2 to obtain a set of calibrated files (cal files). Use default parameters here.


In [None]:
# Run Calwebb_image2 on output files from detector1    
    
print('There are ', len(ratefiles), ' images.')
    
callist = []

# cycle through files
for im in ratefiles:

    calfile = Image2Pipeline.call(im, save_results=True)

    callist.append(calfile)

print(callist)

In [None]:
# Plot rate and cal images near each other to view any differences
# Read the cal image into an ImageModel datamodel
cal_im = ImageModel(samplefile+'_cal.fits')

# Set up two plots, stacked vertically (2 rows, 1 column)
fig, ax = plt.subplots(2, 1,figsize=[15,15])
cmap='Greys'


ax[0].set_title('Rate file')
ax[1].set_title('Cal file')

cset1 = ax[0].imshow(rate_im.data[:,:],cmap=cmap, origin='lower', vmin=10, vmax=24)
ax[0].set_xlabel('X',fontsize=14)
ax[0].set_ylabel('Y',fontsize=14)
cb1 = fig.colorbar(cset1, ax=ax[0])
cb1.ax.set_ylabel('counts/sec',fontsize=14)


cset2 = ax[1].imshow(cal_im.data[:,:],cmap=cmap, origin='lower', vmin=3, vmax=6)
ax[1].set_xlabel('X',fontsize=14)
ax[1].set_ylabel('Y',fontsize=14)
cb2 = fig.colorbar(cset2, ax=ax[1])
cb2.ax.set_ylabel('MJy/str',fontsize=14)
plt.tight_layout()

## Run calwebb_image3 on cal.fits files from calwebb_image2

First create an association file for the calibrated files if you do not already have an association file, and then use that file to process the set through calwebb_image3.


In [None]:
# use asn_from_list to create association table
miri_asn_name = 'miri_F770W_pid1040_combined_scaled' # name of output asn file

calfiles = glob.glob('jw01040*_cal.fits') # get a list of calibrated files

# Create association file
asn = asn_from_list.asn_from_list(calfiles, rule=DMS_Level3_Base, product_name=miri_asn_name)

# dump association table to a .json file for use in image3

miri_asn_file = miri_asn_name+'.json'
with open(miri_asn_file, 'w') as outfile:
    outfile.write(asn.dump()[1])
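It can be useful to check which files ended up in an association before running the stage three pipeline. The sketch below reads the member names from association JSON, assuming the usual layout of a `products` list whose entries have a `name` and a list of `members` (each with `expname` and `exptype`). The helper `list_members` is hypothetical, and the JSON text is a trimmed, hand-written stand-in; real association files carry additional keys that are omitted here.

```python
import json


def list_members(asn):
    """Return (product name, [member file names]) pairs from an association dict."""
    return [
        (product["name"], [member["expname"] for member in product["members"]])
        for product in asn["products"]
    ]


# Trimmed, hand-written stand-in for the contents of an association file
asn_text = """
{
  "products": [
    {
      "name": "miri_F770W_pid1040_combined_scaled",
      "members": [
        {"expname": "jw01040001005_03103_00001_mirimage_cal.fits", "exptype": "science"},
        {"expname": "jw01040001005_03103_00002_mirimage_cal.fits", "exptype": "science"}
      ]
    }
  ]
}
"""

for name, members in list_members(json.loads(asn_text)):
    print(name, len(members))
```

With a real file, you would pass the result of `json.load()` on the opened .json file instead of `json.loads(asn_text)`.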

### Set up parameters for calwebb_image3

Set up a dictionary for the parameters you wish to change. The parameters set here are an example set of parameters.

To learn more about the parameters that can be set for each step in the pipeline, check the 'Step Arguments' section of the pipeline documentation for each step in calwebb_image3.

Pipeline module for calwebb_image3 documentation: https://jwst-pipeline.readthedocs.io/en/latest/jwst/pipeline/calwebb_image3.html

Click on the step you wish to explore, and then the Step Arguments.

A few parameters that a user might want to spend some time testing are resample step parameters kernel and weight_type, and outlier_detection step scale parameters. The scale values demonstrated in this notebook are twice the default values, and have been shown to improve the combined output image for some science data. If your photometry is off, try testing these parameters.

The tweakreg parameters are mostly needed for image alignment. If your images don't seem well aligned, check those parameters.


Some images can have bad WCS values due to large pointing errors. This can show up between different dithers, and those images might need adjustments before they will correctly align with the other images in the association. If your mosaic shows problems like double point sources or evidence that one image is misaligned from the others, you can examine the individual images to figure out what corrections need to be applied to the misaligned image before it can be combined into a mosaic.

Some tools exist to help with this issue. 
Adjust_wcs tool: https://jwst-pipeline.readthedocs.io/en/latest/jwst/tweakreg/utils.html - This allows you to provide offset/rotation/scale values to fix the WCS header of the cal fits files. You can set parameters here to adjust the x/y offset, rotation or scale values to bring the outlier image back into better alignment with the other images.

The JHAT tools: https://github.com/arminrest/jhat have also been developed to help users fix bad WCS issues.



In [None]:
# Set desired parameters for calwebb_image3 steps

# This section of parameters is for aligning to an external catalog, in this case GAIADR3.
# parameters with abs_* names are used to refer to the absolute catalog parameters (gaia or other external catalog)
abs_refcat = 'GAIADR3'   #String indicating what astrometric catalog should be used. Currently supported options: ‘GAIADR1’, ‘GAIADR2’, ‘GAIADR3’, a path to an existing reference catalog, None, or ‘’.
abs_minobj = 15  # number of objects that must match to GAIA catalog; not all fields will have many Gaia sources for comparison
save_abs_catalog = True # Save out the gaia catalog that matches to your dataset 

# You can also set parameters to match between files in your own dataset
minobj = 25 # Number of stars that must match between your individual dithers to be considered a match

# These are example parameters to demonstrate how to change values, but the values have been good options in specific science cases.
# A gaussian kernel might be better for PSF centroiding than other options
# Doubling the scale used from the default was helpful in the absolute flux calibration work
resamp_kernel = 'gaussian'
scale = '1.0 0.8'
weight_type = 'exptime'
#pix_scale = '0.06'


In [None]:
# Run calwebb_image3 (or Image3Pipeline) on calibrated data using association table.

# Set up the configuration to change any parameters
cfg = dict() 

cfg['tweakreg'] = {} # set up empty dictionary for multiple parameters to be set per step
cfg['tweakreg']['abs_refcat'] = abs_refcat
cfg['tweakreg']['abs_minobj'] = abs_minobj
cfg['tweakreg']['save_abs_catalog'] = save_abs_catalog 
cfg['tweakreg']['minobj'] = minobj
cfg['tweakreg']['save_catalogs'] = True

#cfg['skymatch'] = {'skip' : True}  # Syntax if you wish to skip a step within a pipeline module

cfg['resample']={}  # set up empty dictionary for multiple parameters to be set per step
cfg['resample']['kernel'] = resamp_kernel
cfg['resample']['weight_type'] = weight_type
#cfg['resample']['pixel_scale'] = pix_scale

cfg['outlier_detection'] = {'scale' : scale}  # Can set single parameters with this syntax
                     
output = Image3Pipeline.call(miri_asn_file, steps=cfg, save_results=True)

#### Look at the output combined mosaic and overlay the source catalog to see what sources were found

You will notice that the Four Quadrant phase mask (4QPM) regions of the image are visible in previous stages of the pipeline (rate files, cal files, etc.), but are not part of the combined mosaic. While you can see those regions in earlier stages of processing, they should not be used, as there are extra optical elements in place, and data taken for the imager is not valid in those regions. Data in the coronagraph regions is only valid when taken in the correct subarray with the appropriate coronagraphic filter in place.

In [None]:
# Use a single definition of vmin and vmax to use in the image displays, so all mosaics will be plotted with the same color scale
display_vals = [4.0,7.0]

In [None]:
# Look at the final i2d image (combined mosaic)

# Read your mosaic image into an ImageModel datamodel
miri_mosaic_file =  miri_asn_name + '_i2d.fits'
miri_mosaic = ImageModel(miri_mosaic_file)

fig, ax = plt.subplots(figsize=(20,20))

# Set up image
cax = ax.imshow(miri_mosaic.data, cmap='Greys', origin='lower', vmin=display_vals[0],vmax=display_vals[1])

# Set up colorbar
cb = fig.colorbar(cax)
cb.ax.set_ylabel('MJy/str',fontsize=14)

#Set labels 
ax.set_xlabel('X',fontsize=16)
ax.set_ylabel('Y',fontsize=16)
ax.set_title('Final MIRI mosaic', fontsize=16)
plt.tight_layout()


Keep in mind that the source finding algorithm finds peaks with a star-like profile, but not all the sources it finds will be real. MIRI imaging mosaics tend to have false positives near image edges, so examine the catalog against the image carefully to decide whether the individual sources are real or not.
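One crude first cut for the edge problem is to set aside sources that fall within some margin of the image border and inspect those separately. The sketch below does this for a rectangular image. The function `away_from_edges`, the margin value and the centroids are hypothetical, and because the valid footprint of a mosaic is usually not a clean rectangle, this is only a starting point and not a substitute for looking at the image.

```python
def away_from_edges(x, y, shape, margin=10):
    """Return indices of sources at least `margin` pixels inside a rectangular image.

    x, y : sequences of pixel centroids
    shape : (ny, nx) tuple, as returned by a 2D array's .shape
    """
    ny, nx = shape
    return [
        i
        for i, (xi, yi) in enumerate(zip(x, y))
        if margin <= xi <= nx - 1 - margin and margin <= yi <= ny - 1 - margin
    ]


x = [3.0, 50.0, 98.5]  # made-up centroids
y = [40.0, 60.0, 20.0]
print(away_from_edges(x, y, shape=(100, 100), margin=10))
```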

In [None]:
# Look at catalog table that shows all columns, but subset of rows

# Get catalog output from source_catalog step of calwebb_image3
miri_catalog_file = miri_asn_name + '_cat.ecsv'

# Read in catalog from source_catalog step
print('Source catalog output file ', miri_catalog_file)

cat_data = table.Table.read(miri_catalog_file, format='ascii', comment='#')

miri_x = cat_data['xcentroid']
miri_y = cat_data['ycentroid']
cat_data

In [None]:
# Look at mosaic data and sources found with source_catalog

fig, ax = plt.subplots(figsize=(20,20))

# Set up image
cax = ax.imshow(miri_mosaic.data, cmap=cmap, origin='lower', vmin=display_vals[0],vmax=display_vals[1])
ax.scatter(miri_x, miri_y,lw=1, s=10,color='red') # overplot source positions

# Set up colorbar
cb = fig.colorbar(cax)
cb.ax.set_ylabel('MJy/str',fontsize=14)

#Set labels 
ax.set_xlabel('X',fontsize=16)
ax.set_ylabel('Y',fontsize=16)
ax.set_title('Final MIRI mosaic', fontsize=16)
plt.tight_layout()


### Explore Gaia catalog (output from tweakreg step) and see how well the Gaia catalog and JWST data line up

Read in the Gaia catalog that was output from the tweakreg step (if using GAIADR3, the catalog name should be fit_gaiadr3_ref.ecsv), convert the RA, Dec coordinates to pixel positions, and then overplot those on the image.

You can also get the pixel positions from the full image3 pipeline and plot those positions with Gaia to see what sources match and what sources were only found with either Gaia or the JWST data.
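Beyond comparing the two catalogs by eye, you can pair them up by position. The sketch below is a simple greedy nearest-neighbor match in pixel space. The function `match_sources`, the tolerance and the positions are hypothetical and for illustration only; for large catalogs a dedicated cross-matching tool would be a better choice.

```python
from math import hypot


def match_sources(ref_xy, cat_xy, tol=2.0):
    """Pair each reference position with the nearest unused catalog position.

    Only pairs separated by at most `tol` pixels are kept, and each catalog
    source is used at most once (greedy, in reference order).
    Returns (matches, unmatched_ref, unmatched_cat), where matches is a list of
    (ref_index, cat_index) tuples.
    """
    matches = []
    used = set()
    for i, (rx, ry) in enumerate(ref_xy):
        best, best_dist = None, tol
        for j, (cx, cy) in enumerate(cat_xy):
            if j in used:
                continue
            dist = hypot(cx - rx, cy - ry)
            if dist <= best_dist:
                best, best_dist = j, dist
        if best is not None:
            matches.append((i, best))
            used.add(best)
    matched_ref = {i for i, _ in matches}
    unmatched_ref = [i for i in range(len(ref_xy)) if i not in matched_ref]
    unmatched_cat = [j for j in range(len(cat_xy)) if j not in used]
    return matches, unmatched_ref, unmatched_cat


gaia_xy = [(10.0, 10.0), (50.0, 50.0)]  # made-up positions
miri_xy = [(10.5, 9.8), (80.0, 20.0), (200.0, 200.0)]
print(match_sources(gaia_xy, miri_xy))
```

Once pixel positions are available for both catalogs, the inputs can be built with `list(zip(x_values, y_values))` for each one.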

In [None]:
# Read in gaia catalog and plot gaia sources on image

# Get Gaia catalog (or other external output catalog) from tweakreg
gaia_cat_name = glob.glob('*gaiadr3*')
print('Gaia catalog name ', gaia_cat_name)

# Glob creates a matching list. Choose first element of list to display
gaia_cat = table.Table.read(gaia_cat_name[0], format='ascii', comment='#', delimiter=' ')

gaia_cat # display catalog



In [None]:
# Gaia catalog has RA and DEC positions, so convert those to x,y detector coordinates

world_to_detector = miri_mosaic.meta.wcs.get_transform('world', 'detector')
xgaia,ygaia = world_to_detector(gaia_cat['RA'], gaia_cat['DEC'])

In [None]:
# Plot the Gaia sources on the output combined mosaic

fig, ax = plt.subplots(figsize=(20,20))

# Set up image
cax = ax.imshow(miri_mosaic.data, cmap=cmap, origin='lower', vmin=display_vals[0],vmax=display_vals[1])
ax.scatter(xgaia, ygaia,lw=1, s=10,color='red') # overplot source positions

# Set up colorbar
cb = fig.colorbar(cax)
cb.ax.set_ylabel('MJy/str',fontsize=14)

#Set labels 
ax.set_xlabel('X',fontsize=16)
ax.set_ylabel('Y',fontsize=16)
ax.set_title('Final MIRI mosaic', fontsize=16)
plt.tight_layout()

#### Plot Gaia catalog against Source catalog output to see how well sources match position on stars

Plot a zoomed in portion of mosaic so you can see whether the two catalogs match position or whether there is any evidence of source positions being mis-aligned in your image.

In [None]:
# Show zoomed in region to see if stars look like point sources and aren't smeared out or doubled

xmin = 725
xmax = 825
ymin = 550
ymax = 650

gaiazoom = np.where((xgaia > xmin) & (xgaia < xmax) & (ygaia > ymin) & (ygaia < ymax))
print(gaiazoom)
subx = xgaia[gaiazoom] - xmin
suby = ygaia[gaiazoom] - ymin

mirizoom = np.where((miri_x > xmin) & (miri_x < xmax) & (miri_y > ymin) & (miri_y < ymax))
print(mirizoom)
subxmiri = miri_x[mirizoom] - xmin
subymiri = miri_y[mirizoom] - ymin


plt.figure(figsize=(20,20))
plt.xlabel('X coordinate of subimage')
plt.ylabel('Y coordinate of subimage')

plt.imshow(miri_mosaic.data[ymin:ymax,xmin:xmax], origin='lower', cmap='Greys', vmin=display_vals[0],vmax=display_vals[1])

plt.scatter(subx, suby,lw=1, s=15,color='red')
plt.scatter(subxmiri, subymiri,lw=1, s=5,color='yellow')
#plt.colorbar()

print('Gaia sources are marked in red, MIRI sources from the pipeline in yellow')

If you see several stars with red dots overlaid with yellow dots, then your output catalog aligns well with Gaia. If you see misalignment or doubled stars in your images, then you should try to correct the WCS and re-run calwebb_image3.