# NRC-23 - Image Quality Verification by Filter   

## Notebook: Perform PSF photometry multiprocessing

**Author**: Matteo Correnti, STScI Scientist II
<br>
**Created**: October, 2021
<br>
**Last Updated**: February, 2022

## Table of contents
1. [Introduction](#intro)<br>
2. [Setup](#setup)<br>
    2.1 [Python imports](#py_imports)<br>
    2.2 [PSF FWHM dictionary](#psf_fwhm)<br>
3. [Import images to analyze](#data)<br>
    3.1 [Convert image units and apply pixel area map](#convert_data)<br>
4. [Perform PSF photometry](#psf_phot)<br>
    4.1 [Calculate the background](#bkg)<br>
    4.2 [Find sources in the image](#find)<br>
    4.3 [PSF Photometry parameters](#psf_param)<br>

1.<font color='white'>-</font>Introduction <a class="anchor" id="intro"></a>
------------------

This notebook shows how to perform PSF photometry on the images using multiprocessing. PSF photometry provides accurate positions for the sources in the field of view, and the output catalogs are used to measure possible position offsets between different filters (using the notebook `NRC-23_filter_offset.ipynb`).

**Dependencies**: before running this notebook, create the synthetic model PSFs using the notebook `NRC-23_webbpsf.ipynb`, or use library empirical PSFs.

2.<font color='white'>-</font>Setup <a class="anchor" id="setup"></a>
------------------

In this section we import all the necessary Python packages.

### 2.1<font color='white'>-</font>Python imports<a class="anchor" id="py_imports"></a> ###

In [None]:
import os 

os.environ['WEBBPSF_PATH'] = '/grp/jwst/ote/webbpsf-data'

import sys
import time
import copy

import glob as glob

import numpy as np

import pandas as pd

from astropy.io import fits
from astropy.visualization import simple_norm
from astropy.modeling.fitting import LevMarLSQFitter
from astropy.table import Table, QTable

from photutils.background import MMMBackground, MADStdBackgroundRMS
from photutils.detection import DAOStarFinder
from photutils.psf import DAOGroup, IterativelySubtractedPSFPhotometry

from webbpsf.utils import to_griddedpsfmodel

import multiprocessing as mp
from multiprocessing import Pool

### 2.2<font color='white'>-</font>PSF FWHM dictionary<a class="anchor" id="psf_fwhm"></a> ###

The dictionary contains the NIRCam point spread function (PSF) FWHM values (in pixels) from the [NIRCam Point Spread Function](https://jwst-docs.stsci.edu/near-infrared-camera/nircam-predicted-performance/nircam-point-spread-functions) JDox page. The FWHM values are calculated from the analysis of the expected NIRCam PSFs simulated with [WebbPSF](https://www.stsci.edu/jwst/science-planning/proposal-planning-toolbox/psf-simulation-tool).
The FWHM is a parameter used in the finding algorithm to exclude spurious detections.

**Note**: this dictionary needs to be updated once the FWHM values measured for each detector during commissioning become available.

In [None]:
filters = ['F070W', 'F090W', 'F115W', 'F140M', 'F150W2', 'F150W', 'F162M', 'F164N', 'F182M',
           'F187N', 'F200W', 'F210M', 'F212N', 'F250M', 'F277W', 'F300M', 'F322W2', 'F323N',
           'F335M', 'F356W', 'F360M', 'F405N', 'F410M', 'F430M', 'F444W', 'F460M', 'F466N', 'F470N', 'F480M']

psf_fwhm = [0.987, 1.103, 1.298, 1.553, 1.628, 1.770, 1.801, 1.494, 1.990, 2.060, 2.141, 2.304, 2.341, 1.340,
            1.444, 1.585, 1.547, 1.711, 1.760, 1.830, 1.901, 2.165, 2.179, 2.300, 2.302, 2.459, 2.507, 2.535, 2.574]

dict_utils = {filters[i]: {'psf fwhm': psf_fwhm[i]} for i in range(len(filters))}
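
The same lookup table can also be built with `zip`, which makes it easy to check that the two lists stay aligned when a filter is added or removed. This is a minimal sketch using a shortened, three-filter list; the helper name `build_fwhm_table` is hypothetical and not part of the notebook.

```python
def build_fwhm_table(filters, fwhm_values):
    """Map each filter name to its PSF FWHM, checking that the inputs align."""
    if len(filters) != len(fwhm_values):
        raise ValueError('filters and fwhm_values must have the same length')
    return {name: {'psf fwhm': fwhm} for name, fwhm in zip(filters, fwhm_values)}


# Shortened example using three of the values listed above
example_table = build_fwhm_table(['F070W', 'F200W', 'F444W'], [0.987, 2.141, 2.302])
print(example_table['F200W']['psf fwhm'])
```

The explicit length check turns a silently misaligned table into an immediate error.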

3.<font color='white'>-</font>Import images to analyze<a class="anchor" id="data"></a>
------------------

In [None]:
# define the right path for the directory containing the Level-2 (*cal.fits) images

images_dir = '../Simulation/Pipeline_Outputs/Level2_Outputs'
images = sorted(glob.glob(os.path.join(images_dir, "*cal.fits")))

### 3.1<font color='white'>-</font>Convert image units and apply pixel area map<a class="anchor" id="convert_data"></a> ###

The Level-2 and Level-3 images from the pipeline are in units of MJy/sr (hence a surface brightness). The actual unit of the image can be checked from the header keyword **BUNIT**. The scalar conversion constant is stored in the header keyword **PHOTMJSR**, which gives the conversion from DN/s to MJy/sr. To convert the data back to DN/s, set `convert=True` in the function below.

For images that have not been transformed into a distortion-free frame (i.e. not drizzled), a correction must be applied to account for the different on-sky pixel size across the field of view. A pixel area map (PAM), which is an image where each pixel value describes that pixel's area on the sky relative to the native plate scale, is used for this correction. In stage 2 of the JWST pipeline, the PAM is copied into an image extension called **AREA** in the science data product. To apply the PAM correction, set the parameter `pam=True` in the function below.
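
The two corrections amount to a division by a scalar followed by a pixel-by-pixel multiplication. A minimal sketch on a synthetic 2x2 array is shown here; the `photmjsr` value and the pixel area map are made-up numbers chosen only for illustration.

```python
import numpy as np

# Hypothetical values, for illustration only
photmjsr = 2.0                       # (MJy/sr) per (DN/s)
data_sb = np.array([[4.0, 8.0],
                    [2.0, 6.0]])     # surface brightness in MJy/sr
area = np.array([[1.00, 1.02],
                 [0.98, 1.00]])      # pixel area relative to the nominal value

data_dn = data_sb / photmjsr         # convert back to DN/s
data_corrected = data_dn * area      # apply the pixel area map

print(data_corrected)
```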

**Note**: We retrieve the NIRCam detector and filter from the image header. When the **PUPIL** keyword is not `CLEAR`, its value is used as the filter name. For the LW channels, we convert the detector name from the header, **NRCBLONG** (**NRCALONG**), to **NRCB5** (**NRCA5**). We also convert all the detector names to lowercase. This allows us to use the correct PSF model (previously created using the `NRC-23_webbpsf.ipynb` notebook and saved as FITS files) in the PSF photometry routine.

In [None]:
def convert_data(image, convert=True, pam=True):
    """Return the image data, the lowercase detector name and the filter name."""

    # Open the file in a context manager so that it is closed after reading
    with fits.open(image) as im:
        data_sb = im[1].data
        imh = im[1].header

        f = im[0].header['FILTER']
        d = im[0].header['DETECTOR']
        p = im[0].header['PUPIL']

        # Rename the LW detectors to match the naming of the PSF model files
        if d == 'NRCBLONG':
            d = 'NRCB5'
        elif d == 'NRCALONG':
            d = 'NRCA5'

        det = d.lower()

        # Use the pupil element as the filter name unless the pupil is CLEAR
        if p == 'CLEAR':
            filt = f
        else:
            filt = p

        if convert:
            # Convert from surface brightness (MJy/sr) back to DN/s
            data = data_sb / imh['PHOTMJSR']
        else:
            # Keep data in MJy/sr
            data = data_sb

        if pam:
            # Apply the pixel area map (AREA extension)
            data = data * im[4].data

    return data, det, filt
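
The header handling inside `convert_data` can be checked without opening a file. The sketch below reproduces the same naming rules on plain strings; the function name `psf_labels` is hypothetical and only serves this illustration.

```python
def psf_labels(detector, filter_name, pupil):
    """Return the (detector, filter) labels used to select the PSF model."""
    # LW detectors are renamed to the A5/B5 convention, then lowercased
    long_to_short = {'NRCALONG': 'NRCA5', 'NRCBLONG': 'NRCB5'}
    det = long_to_short.get(detector, detector).lower()

    # Use the pupil element unless the pupil wheel is set to CLEAR
    filt = filter_name if pupil == 'CLEAR' else pupil
    return det, filt


print(psf_labels('NRCALONG', 'F444W', 'CLEAR'))  # ('nrca5', 'F444W')
print(psf_labels('NRCB1', 'F150W2', 'F162M'))    # ('nrcb1', 'F162M')
```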

4.<font color='white'>-</font>Perform PSF photometry<a class="anchor" id="psf_phot"></a>
------------------

### 4.1<font color='white'>-</font>Calculate the background<a class="anchor" id="bkg"></a> ###

We adopt the [MMMBackground](https://photutils.readthedocs.io/en/stable/api/photutils.background.MMMBackground.html#photutils.background.MMMBackground) class as the background estimator. It calculates the background over the whole image using the DAOPHOT MMM algorithm, a mode estimator of the form `(3 * median) - (2 * mean)`.
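
The mode estimator itself is simple arithmetic. The sketch below applies the formula to a short list of made-up pixel values using only the standard library. It omits the sigma clipping that Photutils applies by default before computing the statistics, so it illustrates the formula and is not a replacement for `MMMBackground`.

```python
from statistics import fmean, median


def mmm_mode(values):
    """Mode estimate of the form (3 * median) - (2 * mean)."""
    return 3.0 * median(values) - 2.0 * fmean(values)


# Made-up pixel values: the bright outlier pulls the mean above the median
pixels = [1.0, 2.0, 3.0, 4.0, 10.0]
print(mmm_mode(pixels))  # 3 * 3.0 - 2 * 4.0 = 1.0
```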

### 4.2<font color='white'>-</font>Find sources in the image<a class="anchor" id="find"></a> ###

To find sources in the image, we use the [DAOStarFinder](https://photutils.readthedocs.io/en/stable/api/photutils.detection.DAOStarFinder.html) class.

[DAOStarFinder](https://photutils.readthedocs.io/en/stable/api/photutils.detection.DAOStarFinder.html) detects stars in an image using the DAOFIND ([Stetson 1987](https://ui.adsabs.harvard.edu/abs/1987PASP...99..191S/abstract)) algorithm. DAOFIND searches images for local density maxima that have a peak amplitude greater than `threshold` (approximately; threshold is applied to a convolved image) and have a size and shape similar to the defined 2D Gaussian kernel.

**Important parameters**:

* `threshold`: The absolute image value above which to select sources.
* `fwhm`: The full-width half-maximum (FWHM) of the major axis of the Gaussian kernel in units of pixels.
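
Because `threshold` is an absolute image value, one common way to choose it is as a multiple of the background noise. The Gaussian kernel width follows from the FWHM through `sigma = FWHM / (2 * sqrt(2 * ln 2))`. The sketch below shows both calculations with the standard library. The `nsigma` factor and the pixel values are assumptions for illustration; the notebook itself passes a fixed threshold value.

```python
import math
from statistics import median


def mad_std(values):
    """Robust standard deviation estimated from the median absolute deviation."""
    values = list(values)
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return 1.4826 * mad  # scale factor for normally distributed data


def fwhm_to_sigma(fwhm):
    """Convert a Gaussian FWHM to its standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


background_pixels = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up values
nsigma = 5.0                                   # assumed detection level
threshold = nsigma * mad_std(background_pixels)

print(threshold)
print(fwhm_to_sigma(2.141))  # F200W FWHM in pixels
```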

### 4.3<font color='white'>-</font>PSF Photometry parameters<a class="anchor" id="psf_param"></a> ###

For general information on PSF photometry with Photutils see [here](https://photutils.readthedocs.io/en/stable/psf.html).

Photutils provides three classes to perform PSF Photometry: [BasicPSFPhotometry](https://photutils.readthedocs.io/en/stable/api/photutils.psf.BasicPSFPhotometry.html#photutils.psf.BasicPSFPhotometry), [IterativelySubtractedPSFPhotometry](https://photutils.readthedocs.io/en/stable/api/photutils.psf.IterativelySubtractedPSFPhotometry.html#photutils.psf.IterativelySubtractedPSFPhotometry), and [DAOPhotPSFPhotometry](https://photutils.readthedocs.io/en/stable/api/photutils.psf.DAOPhotPSFPhotometry.html#photutils.psf.DAOPhotPSFPhotometry). Together these provide the core workflow to make photometric measurements given an appropriate PSF (or other) model.

[BasicPSFPhotometry](https://photutils.readthedocs.io/en/stable/api/photutils.psf.BasicPSFPhotometry.html#photutils.psf.BasicPSFPhotometry) implements the minimum tools for model-fitting photometry. At its core, this involves finding sources in an image, grouping overlapping sources into a single model, fitting the model to the sources, and subtracting the models from the image. In DAOPHOT parlance, this is essentially running the “FIND, GROUP, NSTAR, SUBTRACT” once.

[IterativelySubtractedPSFPhotometry](https://photutils.readthedocs.io/en/stable/api/photutils.psf.IterativelySubtractedPSFPhotometry.html#photutils.psf.IterativelySubtractedPSFPhotometry) (adopted here) is similar to [BasicPSFPhotometry](https://photutils.readthedocs.io/en/stable/api/photutils.psf.BasicPSFPhotometry.html#photutils.psf.BasicPSFPhotometry), but it adds a parameter called `niters`, which is the number of iterations for which the loop “FIND, GROUP, NSTAR, SUBTRACT, FIND…” is performed. This class enables photometry in a scenario where there is significant overlap between stars of quite different brightness. For instance, the detection algorithm may not be able to detect a faint star very close to a bright one in the first iteration, but the faint star will be detected in the next iteration after the brighter stars have been fit and subtracted. Like [BasicPSFPhotometry](https://photutils.readthedocs.io/en/stable/api/photutils.psf.BasicPSFPhotometry.html#photutils.psf.BasicPSFPhotometry), it does not include implementations of the stages of this process, but it provides the structure in which those stages run.
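
The benefit of iterating can be seen in a one-dimensional toy model. The sketch below is not the Photutils implementation: it uses a Gaussian profile, a strict local-maximum finder and a crude "fit" that takes the peak value as the amplitude, all of which are assumptions made for illustration. A faint star sitting in the wing of a bright one is missed in the first pass and recovered in the second, after the bright star has been subtracted.

```python
import math


def gaussian_psf(x, x0, sigma=1.5):
    """Unit-amplitude 1D Gaussian profile centred on x0."""
    return math.exp(-0.5 * ((x - x0) / sigma) ** 2)


def find_peaks(signal, threshold):
    """Indices of strict local maxima brighter than threshold."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i] > threshold
            and signal[i] > signal[i - 1]
            and signal[i] > signal[i + 1]]


def iterative_detection(signal, threshold, niters):
    """Repeat find-and-subtract niters times; return (iteration, position) pairs."""
    residual = list(signal)
    detections = []
    for iteration in range(niters):
        peaks = find_peaks(residual, threshold)
        if not peaks:
            break
        # Crude "fit": take the peak value as the amplitude of each source
        amplitudes = {x0: residual[x0] for x0 in peaks}
        for x0, amp in amplitudes.items():
            detections.append((iteration, x0))
            residual = [r - amp * gaussian_psf(x, x0) for x, r in enumerate(residual)]
    return detections


# A bright star at x=10 with a faint neighbour at x=12 hidden in its wing
signal = [100.0 * gaussian_psf(x, 10) + 20.0 * gaussian_psf(x, 12) for x in range(25)]

print(iterative_detection(signal, threshold=10.0, niters=1))  # [(0, 10)]
print(iterative_detection(signal, threshold=10.0, niters=2))  # [(0, 10), (1, 12)]
```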

**Important parameters**:

* `finder`: class used to find stars in the image. We use [DAOStarFinder](https://photutils.readthedocs.io/en/stable/api/photutils.detection.DAOStarFinder.html).

* `group_maker`: clustering algorithm used to label the sources according to groups. We use [DAOGroup](https://photutils.readthedocs.io/en/stable/api/photutils.psf.DAOGroup.html#photutils.psf.DAOGroup). Its `group_stars` method divides an entire star list into sets of distinct, self-contained groups of mutually overlapping stars. It accepts as input a list of stars and determines which stars are close enough to be capable of adversely influencing each other's profile fits. [DAOGroup](https://photutils.readthedocs.io/en/stable/api/photutils.psf.DAOGroup.html#photutils.psf.DAOGroup) accepts one parameter, `crit_separation`, which is the distance, in units of pixels, such that any two stars separated by less than this distance will be placed in the same group.

* `fitter`: algorithm to fit the sources simultaneously for each group. We use an astropy fitter, [LevMarLSQFitter](https://docs.astropy.org/en/stable/api/astropy.modeling.fitting.LevMarLSQFitter.html#astropy.modeling.fitting.LevMarLSQFitter). 

* `niters`: number of iterations for which the "psf photometry" loop described above is performed.

* `fitshape`: Rectangular shape around the center of a star which will be used to collect the data to do the fitting. 

* `aperture_radius`: The radius (in units of pixels) used to compute initial estimates for the fluxes of sources.
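
To illustrate what `crit_separation` does, the sketch below groups sources with a simple union-find: any two sources closer than the critical separation end up in the same group, including through chains of neighbours. This is a simplified stand-in written for this example, not the DAOGroup code, and the positions are made up.

```python
import math


def group_sources(positions, crit_separation):
    """Label sources so that any two closer than crit_separation share a group."""
    parent = list(range(len(positions)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) < crit_separation:
                parent[find(j)] = find(i)

    # Relabel the roots as 1, 2, ... in order of first appearance
    labels = {}
    return [labels.setdefault(find(i), len(labels) + 1) for i in range(len(positions))]


positions = [(10.0, 10.0), (13.0, 10.0), (16.0, 10.0), (50.0, 50.0)]
print(group_sources(positions, crit_separation=4.0))  # [1, 1, 1, 2]
print(group_sources(positions, crit_separation=2.0))  # [1, 2, 3, 4]
```

The first and third sources are 6 pixels apart, yet they share a group because each is within 4 pixels of the second one.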

In [None]:
def psf_phot(image, th, ap_radius, fitshape, fov, npsfs):

    fitter = LevMarLSQFitter()
    mmm_bkg = MMMBackground()
    
    data, det, filt = convert_data(image, convert=True, pam=True)
    
    # FWHM (in pixels) of the PSF for this filter
    fwhm = dict_utils[filt]['psf fwhm']
    #print('FWHM for filter {f}:'.format(f=filt), fwhm)
    
    daofind = DAOStarFinder(threshold=th, fwhm=fwhm)
    
    # Stars closer than 5 x FWHM are fitted together as one group
    daogroup = DAOGroup(5.0 * fwhm)
    
    fov = str(fov)
    npsfs = str(npsfs)
    
    outname = 'PSF_'+filt+'_fov'+fov+'_npsfs'+npsfs+'_'+det+'.fits'
    #print('File for PSF model:', outname)
    psf = to_griddedpsfmodel(os.path.join(psf_dir, outname))
    
    psf_model = psf.copy()
    
    #print('Performing the PSF photometry --- Detector {d}, filter {f}'.format(f=filt, d=str.upper(det)))
    
    phot = IterativelySubtractedPSFPhotometry(finder=daofind, group_maker=daogroup,
                                              bkg_estimator=mmm_bkg, psf_model=psf_model,
                                              fitter=fitter,
                                              niters=2, fitshape=fitshape, aperture_radius=ap_radius, 
                                              extra_output_cols=('sharpness', 'roundness2'))
    result = phot(data)
    print('Number of sources detected for detector {0}, filter {1}:'.format(str.upper(det), filt), len(result))
    print('')
    
    residual_image = phot.get_residual_image()
        
    filename = str(image)
    num = str(filename[-20:-15])
    
    hdu = fits.PrimaryHDU(residual_image)
    hdul = fits.HDUList([hdu])
    
    residual_outname = 'residual_%s_%s_%s.fits' % (det, filt, num)

    hdul.writeto(os.path.join(res_dir, residual_outname), overwrite=True) 

    outname = 'phot_%s_%s_%s.pkl' % (det, filt, num)
    #print('Photometry catalog output name:', outname)
    tab = result.to_pandas()
    tab.to_pickle(os.path.join(output_phot_dir, outname))

    return
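
The slice `filename[-20:-15]` picks out the exposure number only when the detector part of the file name has five characters (for example `nrca1`). A split-based alternative is sketched below. It assumes file names of the form `<...>_<exposure>_<detector>_cal.fits`; the example names are made up, and the helper `exposure_number` is hypothetical.

```python
import os


def exposure_number(path):
    """Return the field that precedes the detector name in a *_cal.fits file name."""
    parts = os.path.basename(path).split('_')
    return parts[-3]


sw_name = 'jw01072001001_01101_00001_nrca1_cal.fits'
lw_name = 'jw01072001001_01101_00001_nrcalong_cal.fits'

print(sw_name[-20:-15], exposure_number(sw_name))  # 00001 00001
print(lw_name[-20:-15], exposure_number(lw_name))  # 01_nr 00001
```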

**Note**: because the filters have different throughputs, different thresholds may be needed depending on the images analyzed. In that case, we can create a dictionary of thresholds (similar to what is done in `NRC-23_aperture_photometry.ipynb`, for example) and perform PSF photometry on images grouped by filter.
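
One possible shape for such a dictionary is sketched below. The threshold values, the default and the image names are hypothetical placeholders, and the helper names are not part of the notebook.

```python
from collections import defaultdict

# Hypothetical per-filter thresholds, with a fallback for unlisted filters
default_threshold = 10
thresholds = {'F070W': 5, 'F444W': 20}


def threshold_for(filt):
    """Return the detection threshold to use for a given filter."""
    return thresholds.get(filt, default_threshold)


def group_by_filter(image_filters):
    """Group (image, filter) pairs into a {filter: [images]} dictionary."""
    groups = defaultdict(list)
    for image, filt in image_filters:
        groups[filt].append(image)
    return dict(groups)


pairs = [('image_1.fits', 'F070W'), ('image_2.fits', 'F200W'), ('image_3.fits', 'F070W')]
for filt, group in group_by_filter(pairs).items():
    print(filt, threshold_for(filt), group)
```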

In [None]:
tic = time.perf_counter()

# PSF parameters:

oversample = 4 
fov = 11
npsfs = 16
distorted = True

# PSF photometry parameters:

th = 10
ap_radius = 3.5
fitshape = (11, 11)


if distorted:

    psf_dir = 'PSF_MODELS/Distorted/Fov{}px_numPSFs{}_oversample{}'.format(fov, npsfs, oversample)
    
else:
    
    psf_dir = 'PSF_MODELS/Undistorted/Fov{}px_numPSFs{}_oversample{}'.format(fov, npsfs, oversample)

output_phot_dir = 'PSF_PHOT_OUTPUT/numPSFs{}_Th{}_fitshape{}x{}'.format(npsfs, th, fitshape[0], fitshape[1])

if not os.path.exists(output_phot_dir):
    os.makedirs(output_phot_dir)

res_dir = 'RESIDUAL_IMAGES/numPSFs{}_Th{}_fitshape{}x{}'.format(npsfs, th, fitshape[0], fitshape[1])

if not os.path.exists(res_dir):
    os.makedirs(res_dir)

ncpu = mp.cpu_count()
print('Total CPU available:', ncpu)
nimages = len(images)
print('Number of images to process:', nimages)
nsplit = min(nimages, ncpu)

with Pool(processes=nsplit) as pool:
    
    p = pool.starmap(psf_phot, [(image, th, ap_radius, fitshape, fov, npsfs) for image in images])
    
    pool.close()
    pool.join()
        
toc = time.perf_counter()
print('Time needed to perform photometry:', '%.2f' % ((toc - tic) / 3600), 'hours')
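
`psf_phot` reads `psf_dir`, `res_dir` and `output_phot_dir` as global variables. This works when the worker processes inherit the parent's state (the `fork` start method) but may not work under `spawn`. A possible alternative is to bind the settings explicitly with `functools.partial` and to fall back to a serial loop when only one worker is requested. The sketch below uses a hypothetical stand-in function, `process_image`, in place of the real photometry routine.

```python
from functools import partial
from multiprocessing import Pool


def process_image(image, threshold, output_dir):
    """Hypothetical stand-in for psf_phot: return the catalog path it would write."""
    return '{}/phot_{}_th{}.pkl'.format(output_dir, image, threshold)


def run_all(images, threshold, output_dir, nworkers):
    """Process every image, in parallel when more than one worker is requested."""
    worker = partial(process_image, threshold=threshold, output_dir=output_dir)
    nworkers = min(nworkers, len(images))
    if nworkers <= 1:
        # Serial fallback: same results, no worker processes
        return [worker(image) for image in images]
    with Pool(processes=nworkers) as pool:
        return pool.map(worker, images)


if __name__ == '__main__':
    print(run_all(['image_a', 'image_b'], threshold=10, output_dir='out', nworkers=1))
```

Passing the directories as arguments also makes the worker easier to test on a single image before launching the full pool.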