# Region Cataloger

This notebook takes a DS9 region file and calculates the photometry in the regions
for the specified image types.

## What this Notebook Does

The gist of the notebook is as follows:

1. A set of *HST* filters is read; the photometry will be measured in each of them.
1. A DS9 region file is read; the photometry will be measured in each of its regions.
1. Both are passed to the `RegionCataloger` class, which does the following:
    1. Queries the [MAST Database](https://mast.stsci.edu/portal/Mashup/Clients/Mast/Portal.html) for all the images that contain the regions.
    1. Downloads the discovered images.
    1. Calculates the photometry for each region on each discovered image.

## User Guide

As a user, follow these steps:

1. Leave the code unchanged except in the indicated cell below.
1. If running on Google Colab, uncomment and run the Pre-Processing cell below.
1. Define a DS9 region file of the Nucleated Dwarfs to be measured.
    1. The region file only needs to be created with one filter.
    1. The regions must all be *Circles*.
    1. **The regions should be labeled with text.** See/load the example region file.
    1. The region file should be saved in DS9 format in world (not image) coordinates (preferably with the ICRS system).
1. Add/Upload the region file to the directory where the notebook exists.
1. Specify the name of the region file in the appropriate cell below.
1. If needed, change the filter set listed below.
1. Run the notebook and add the results to the Google Sheet.
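
The region-file requirements above can be checked before running the notebook. The sketch below is a simplified sanity check, not a full DS9 parser: it assumes the coordinate system sits on its own line and that labels use the `text={...}` form. The function name `check_region_text` and the example coordinates, radii, and labels are hypothetical.

```python
import re

# Hypothetical example file; the coordinates, radii, and labels are made up.
EXAMPLE_REGION_TEXT = """\
# Region file format: DS9 version 4.1
global color=green dashlist=8 3 width=1
icrs
circle(195.1359,28.0587,7.27") # text={A}
circle(195.1290,28.0535,2.58") # text={C}
"""

WORLD_FRAMES = {"icrs", "fk5", "fk4", "galactic", "ecliptic"}
PIXEL_FRAMES = {"image", "physical"}
SHAPE_RE = re.compile(r"^(\w+)\s*\(")
TEXT_RE = re.compile(r"text=\{[^}]+\}")


def check_region_text(text: str) -> list[str]:
    """Return a list of problems found in DS9 region text (empty if none)."""
    problems = []
    frame = None
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines, comments, and the global-properties line
        if not line or line.startswith("#") or line.lower().startswith("global"):
            continue
        # A bare coordinate-system keyword sets the frame for later regions
        if line.lower() in WORLD_FRAMES | PIXEL_FRAMES:
            frame = line.lower()
            continue
        shape = SHAPE_RE.match(line)
        if shape is None:
            continue  # not a region line this simple check understands
        if shape.group(1).lower() != "circle":
            problems.append(f"line {number}: region is not a circle")
        if TEXT_RE.search(line) is None:
            problems.append(f"line {number}: region has no text label")
    if frame not in WORLD_FRAMES:
        problems.append("coordinate system is missing or not a world system")
    return problems


print(check_region_text(EXAMPLE_REGION_TEXT))
```

An empty list means the file passed these three checks. The notebook itself still relies on the `regions` package to read the file.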

### Pre-Processing Directives

<code style="background:yellow;color:black">If running on Google Colab</code>, uncomment the line below (remove the leading `# `) and run it.

In [1]:
# %pip install astropy astroquery regions photutils tqdm

<div class="alert alert-block alert-warning"><b>Warning:</b> Do not change any of the code below except in the single identified cell.</div>

## Imports

Here is a list of what is imported below:

- `pathlib`: Handles file paths, and reads and creates folders.
- `functools`: Caches the results of class methods.
- `typing`: Declares the types going into and coming out of Python functions.
- `tqdm`: Displays progress bars.
- `warnings`: Controls which warnings get printed.
- `time`: Provides `sleep`, used for a brief pause when an image is skipped.
- `numpy`: Performs numerical processing.
- `astropy`: Handles astronomical processing and file reading.
- `astroquery`: Queries and downloads files from astronomical databases.
- `regions`: Reads and converts DS9 region files.
- `photutils`: Performs photometry.

In [50]:
# Python Imports
from pathlib import Path
from functools import lru_cache
from typing import Union, Iterable, List, NoReturn
from tqdm.notebook import tqdm
from warnings import catch_warnings, simplefilter
from time import sleep

In [3]:
# Numerical Imports
import numpy as np

In [4]:
# Astropy Imports
from astropy import units as u
from astropy.io import fits
from astropy.wcs import WCS
from astropy.table import Table, QTable, Column, vstack
from astropy.coordinates import SkyCoord

# Astroquery Imports
from astroquery.mast import Observations

# Regions
from regions import Regions

# PhotUtils
from photutils.aperture import ApertureStats, SkyCircularAperture as SkyCA
from photutils.background import Background2D

## Typing

In [5]:
PathLike = Union[str, Path]

## Functions

## Classes

This is where the magic happens. Each of these Python classes has basic functionality to streamline the process.

- `ImageSpecification`: This is a way to standardize the telescope, instrument, detector, and filter that is to be used.
- `ExtendedRegions`: This extends the `regions.Regions` class to add some functionality to streamline the process.
- `ImageGetter`: This is what looks at the region files and downloads the associated image files.
- `RegionCataloger`: This takes the images and regions and calculates the photometry on each.

In [6]:
# Image Type Class
class ImageSpecification:
    """Class to keep the image type associated with a space telescope."""

    # Class Constructor
    def __init__(
            self, telescope: str, instrument: str, detector: str, filter: str,
            base_path: PathLike = Path('.')
        ):
        """Initializes the ImageSpecification with telescope, instrument, detector, and filter."""
        self.telescope = telescope
        self.instrument = instrument
        self.detector = detector
        self.filter = filter
        self.base_path = Path(base_path)

    # Default Representation
    def __repr__(self):
        """Returns a string representation of the ImageSpecification."""
        return f"{self.telescope}-{self.instrument}-{self.detector}-{self.filter}"

    # String Representation
    def __str__(self):
        """Returns a string representation of the ImageSpecification."""
        return self.__repr__()

    # Hash Function
    def __hash__(self):
        """Returns a hash of the ImageSpecification."""
        return hash((self.telescope, self.instrument, self.detector, self.filter))

    # Make Directory
    def make_directory(self) -> None:
        """Creates a directory for the image type."""

        # Create the directory if it does not exist
        self.cache_path.mkdir(parents=True, exist_ok=True)

    # Get the MAST Instrument
    @property
    def mast_instrument(self) -> str:
        """Returns the MAST instrument name."""
        return f"{self.instrument}/{self.detector}"

    # Get the Cache Directory
    @property
    def cache_path(self) -> Path:
        """Returns the cache directory for the image type."""
        # Cheap to compute, so no caching is needed (an lru_cache on a method
        # is shared by all instances of the class)
        return self.base_path / self.telescope / self.instrument / self.detector / self.filter

    # From FITS Header
    @classmethod
    def from_header(cls, header : fits.Header) -> 'ImageSpecification':
        """Creates an ImageSpecification from a FITS header."""

        # Get the Filter Set (de-duplicated in header order so the joined name is stable)
        filtKeys = ('FILTER', 'FILTER1', 'FILTER2')
        filtSet = dict.fromkeys(header.get(key) for key in filtKeys)
        filters = [filt for filt in filtSet if filt is not None and 'clear' not in filt.lower()]

        # Return the ImageSpecification
        return cls(
            telescope=header.get('TELESCOP'),
            instrument=header.get('INSTRUME'),
            detector=header.get('DETECTOR'),
            filter='-'.join(filters)
        )

    # From a FITS File
    @classmethod
    def from_fits(cls, fits_file: PathLike) -> 'ImageSpecification':
        """Creates an ImageSpecification from a FITS file."""
        return cls.from_header(fits.getheader(fits_file, ext=0))

    # From String
    @classmethod
    def from_string(cls, img_type_str: str) -> 'ImageSpecification':
        """Creates an ImageSpecification from a string (expected in the format
        'Telescope-Instrument-Detector-Filter').
        """

        # Split the string and validate the parts
        parts = img_type_str.split('-')
        if len(parts) != 4:
            raise ValueError("Image type string must be in the format 'Telescope-Instrument-Detector-Filter'")
        return cls(*parts)
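
As a rough illustration of the naming convention used by `ImageSpecification`, the standalone sketch below reproduces with plain `str` and `pathlib` operations what `from_string`, `mast_instrument`, and `cache_path` compute for one specification. It does not use the class itself, and the variable names are chosen only for this example.

```python
from pathlib import Path

# Specification string in the 'Telescope-Instrument-Detector-Filter' format
spec = "HST-ACS-WFC-F814W"

# from_string splits on '-' and expects exactly four parts
telescope, instrument, detector, filt = spec.split("-")

# mast_instrument joins the instrument and detector with a slash
mast_instrument = f"{instrument}/{detector}"

# cache_path nests one folder per part under the base path ('.' by default)
cache_path = Path(".") / telescope / instrument / detector / filt

print(mast_instrument)
print(cache_path.as_posix())
```

One consequence of this format: `from_header` joins multiple filters with `-`, so a specification built from a header with two non-clear filters produces a string with five parts, which `from_string` rejects.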

In [7]:
# Extending the default Regions class to handle SkyCoord more easily
class ExtendedRegions(Regions):
    """Extends the Regions class for easier SkyCoord handling."""

    # Have to Call the Parent Read Method due to the RegionsRegistry
    @classmethod
    def read(
        cls, filename: PathLike, format: str=None, cache: bool=False, **kwargs
    ) -> 'ExtendedRegions':
        """Reads regions from a file and returns an ExtendedRegions object."""

        # Read the regions using the parent class method
        regions = Regions.read(filename, format=format, cache=cache, **kwargs)

        # Return an instance of ExtendedRegions
        return cls(regions)

    # Make a Getter for the Coordinates
    @property
    @lru_cache
    def coordinates(self) -> SkyCoord:
        """Returns the SkyCoord object of the regions."""
        return SkyCoord(
            [reg.center for reg in self]
        )

    # Calculate the Center and Extent of the Regions
    def get_center_and_extent(
            self, extentAdd : u.Quantity=10*u.arcsec
            ) -> tuple[SkyCoord, u.Quantity]:
        """Calculates the center and extent of the regions."""

        # Get the Center
        center = SkyCoord(
            ra=self.coordinates.ra.mean(),
            dec=self.coordinates.dec.mean(),
            frame=self.coordinates.frame
        )

        # Get the Extent (Max Sep)
        extent = center.separation(self.coordinates).max().to(u.arcmin)

        # Return the Center and Extent
        return center, extent + extentAdd
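
To make the search geometry concrete, the sketch below repeats the center-and-extent calculation with the standard library only: the center is the mean RA and Dec, and the extent is the largest separation from that center plus a padding. The haversine formula stands in for `SkyCoord.separation`, and the function names and coordinates are made up for this example.

```python
import math


def separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    hav = (math.sin((dec2 - dec1) / 2) ** 2
           + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(hav)))


def center_and_extent(coords, pad_arcsec=10.0):
    """Return the (ra, dec) center in degrees and the search radius in arcmin."""
    ra_c = sum(ra for ra, _ in coords) / len(coords)
    dec_c = sum(dec for _, dec in coords) / len(coords)
    max_sep = max(separation_deg(ra_c, dec_c, ra, dec) for ra, dec in coords)
    # Convert the largest separation to arcmin and add the padding
    return (ra_c, dec_c), max_sep * 60.0 + pad_arcsec / 60.0


# Two made-up region centers, one degree apart in declination
center, radius_arcmin = center_and_extent([(195.0, 28.0), (195.0, 29.0)])
print(center, round(radius_arcmin, 3))
```

A plain mean of the RA values misbehaves for regions that straddle RA = 0°/360°, so this approach is best suited to fields away from that boundary.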

In [8]:
# Image Getter Class
ImageSpecTypes = Union[ImageSpecification, Iterable[ImageSpecification]]
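class ImageGetter:
    """Class to query MAST and download the images for a set of image types."""

    # Constructor
    def __init__(self, img_types: ImageSpecTypes):
        """Initializes the ImageGetter with one or more image types."""
        # Accept a single ImageSpecification as well as an iterable of them
        if isinstance(img_types, ImageSpecification):
            img_types = [img_types]
        self.img_types = list(img_types)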

    # Download the Images
    def download_images(
            self, regions: ExtendedRegions, cache: bool = True
        ) -> dict[ImageSpecification, Union[Table, None]]:
        """Downloads images for the specified regions and image types."""

        # Get the Center and Extent of the Regions
        center, extent = regions.get_center_and_extent()

        # Query MAST for Observations
        obs_table = Observations.query_region(
            center, radius=extent
        )

        # Loop through each Image Type
        responses = {}
        for img_type in tqdm(self.img_types, desc="Downloading Images for Each Filter"):

            # Make the Directory
            img_type.make_directory()

            # Filter the Observations
            # https://masttest.stsci.edu/api/v0/_productsfields.html
            filt_table = Observations.filter_products(
                obs_table,
                intentType='science',
                obs_collection=img_type.telescope,
                instrument_name=img_type.mast_instrument,
                filters=img_type.filter,
                calib_level=[3],
                project='HAP',
                provenance_name=['HAP-SVM'],
                dataproduct_type='image'
            )

            # Download the Images
            # https://mast.stsci.edu/api/v0/_c_a_o_mfields.html
            if len(filt_table):
                responses[img_type] = Observations.download_products(
                    filt_table['obsid'], download_dir=img_type.cache_path,
                    productSubGroupDescription=['DRC'],
                    project='HAP-SVM',
                    calib_level=[3],
                    flat=True,
                    cache=cache
                )
            else:
                responses[img_type] = None

        # Return the Responses
        return responses

In [51]:
# Make the Region Cataloger Class
class RegionCataloger:
    """Class to handle the cataloging of regions and downloading images."""

    # Constructor
    def __init__(self, regions: ExtendedRegions, img_types: ImageSpecTypes):
        """Initializes the RegionCataloger with regions and image types."""

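        # Store the Inputs (accepting a single ImageSpecification or an iterable)
        self.regions = regions
        if isinstance(img_types, ImageSpecification):
            img_types = [img_types]
        self.img_types = list(img_types)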

        # Make an ImageGetter
        self.image_getter = ImageGetter(self.img_types)

        # Download the Images
        self.responses = self._download_images(cache=True)

        # Calculate the Photometry
        self.photometry = {}
        for img_type, response in tqdm(self.responses.items(), desc="Calculating Photometry for Each Filter"):
            # Calculate the Photometry for each Image Spec
            self.photometry[img_type] = self._calculate_photometry(img_type, response)

    # Download Images for Regions
    def _download_images(
            self, cache: bool = True
        ) -> dict[ImageSpecification, Union[Table, None]]:
        """Calls the ImageGetter to download images for the specified regions."""
        return self.image_getter.download_images(self.regions, cache=cache)

    # Calculate Photometry for a given Image Type
    def _calculate_photometry(
            self, img_type: ImageSpecification, response: Union[Table, None]
        ) -> Union[QTable, None]:
        """Calculates the photometry for a given image type."""

        # If no response, return None
        if response is None:
            return None
        else:
            # Calculate the Photometry by Calling Photutils
            return self._aperture_photometry(img_type, response)

    # Call Photutils to Calculate the Photometry
    def _aperture_photometry(
            self, img_type: ImageSpecification, response: Table
        ) -> QTable:
        """Calculates the aperture photometry for a given image type."""

        # Get Regions Centroid
        center = self.regions.get_center_and_extent()[0]

        # Get Region IDs
        ids = [reg.meta.get('text', str(i)) for i, reg in enumerate(self.regions)]
        ids = Column(ids, name='id')

        # Loop through the Possible Images
        photTables = []
        for fileName in tqdm(response['Local Path'], desc=f"Processing {img_type} Images", leave=False):

            # Open the Image
            with fits.open(fileName) as hdul:
                image_data = hdul['SCI'].data
                header = hdul['SCI'].header
                wcs = WCS(header)

            # If the WCS does not contain the Regions, continue
            if not wcs.footprint_contains(center):
                sleep(1e-6)
                continue

            # Get the Background
            with catch_warnings():
                simplefilter("ignore")
                bkg = Background2D(image_data, (64, 64), filter_size=(3, 3))

            # Loop over Regions
            stats = []
            for reg in tqdm(self.regions, desc=f"Processing Regions for {Path(fileName).name}", leave=False):

                # Get the Aperture
                aper = SkyCA(reg.center, r=reg.radius)

                # Get the Stats for the Aperture
                stats.append(ApertureStats(
                    image_data - bkg.background, aper, wcs=wcs
                ))

            # Convert the Stats to a QTable
            statsTable = QTable(vstack(
                [stat.to_table(columns=[
                    'id', 'xcentroid', 'ycentroid', 'sky_centroid',
                    'sum', 'sum_aper_area'
                ]) for stat in stats],
                metadata_conflicts='silent'
            ))

            # Convert Area to arcsec2
            pixScale = wcs.proj_plane_pixel_scales()[0] / u.pix
            statsTable['sum_aper_area']  *= pixScale**2
            statsTable['sum_aper_area'] <<= u.arcsec**2

            # Convert Sky Centroid to Sexagesimal
            statsTable['sky_centroid'] = statsTable['sky_centroid'].to_string(
                decimal=False, sep=':', precision=8
            )

            # Store Table
            photTables.append(statsTable)

        # If there is only one table, get it. Otherwise, get the means
        if len(photTables) == 1:
            photTable = photTables[0]
        elif len(photTables) == 0:
            photTable = QTable()
        else:
            # Get first table as a base
            photTable = photTables[0].copy()

            # Get the Mean of Each Column
            # This is overkill, as some columns will have the same data in each,
            # but it's easier to just do this
            with catch_warnings():
                simplefilter("ignore")  # Ignore where columns are NaN everywhere (like error cols)
                for col in photTable.colnames:
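                    # Skip the IDs and the string-formatted sky centroids,
                    # which cannot be averaged numerically
                    if col in ('id', 'sky_centroid'):
                        continue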
                    photTable[col] = np.nanmean(
                        [table[col] for table in photTables],
                        axis=0
                    )

        # Add the IDs
        photTable['id'] = ids

        # Return the Photometry Table
        return photTable

    # Get the AB Zeropoint
    @staticmethod
    def get_ab_zeropoint(
            photflam: float, photplam:float
    ) -> u.Quantity:
        """Calculates the AB Zeropoint from photflam and photplam."""

        # Calculate the Zeropoint
        zeropoint = -2.5*np.log10(photflam) - 5*np.log10(photplam) - 2.408

        # Return the Zeropoint in AB Magnitudes
        return zeropoint * u.ABmag
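
The zeropoint formula in `get_ab_zeropoint` can be applied to a catalog row as in the sketch below, which uses only the standard library. The `PHOTFLAM` and `PHOTPLAM` values are placeholders for illustration and should be read from the `SCI` header of the actual image. The sketch also assumes that the `sum` column is a count rate in electrons per second, which the notebook does not state.

```python
import math


def ab_zeropoint(photflam: float, photplam: float) -> float:
    """AB zeropoint from the PHOTFLAM and PHOTPLAM header keywords."""
    return -2.5 * math.log10(photflam) - 5 * math.log10(photplam) - 2.408


# Placeholder header values, for illustration only
PHOTFLAM = 7.0e-20
PHOTPLAM = 8045.0

# One catalog row: aperture sum (assumed e-/s) and aperture area (arcsec^2)
aper_sum = 13.602761052406429
aper_area = 165.90529088485636

zeropoint = ab_zeropoint(PHOTFLAM, PHOTPLAM)

# Magnitude of the flux inside the aperture
magnitude = -2.5 * math.log10(aper_sum) + zeropoint

# Mean surface brightness inside the aperture, in mag / arcsec^2
surface_brightness = magnitude + 2.5 * math.log10(aper_area)

print(f"ZP = {zeropoint:.3f}, m = {magnitude:.3f}, mu = {surface_brightness:.3f}")
```

Dividing the flux by the aperture area before taking the logarithm is what adds the `2.5 * log10(area)` term to the magnitude.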

## The Catalog

<div class="alert alert-block alert-success"><b>Note:</b> The cell below is the only cell you should need to change. Likely, the `ImageSpecification` list will be the same for your entire project. Therefore, after you add the new region file to your directory, you will need to indicate the new file name below. That is, you will need to change the value assigned to `REGION_FILE_NAME`.</div>

In [52]:
# Default Image Specifications
DEF_IMG_SPECS = [
    ImageSpecification('HST', 'WFC3', 'UVIS', 'F336W'),
    ImageSpecification('HST', 'ACS', 'WFC', 'F475W'),
    ImageSpecification('HST', 'ACS', 'WFC', 'F814W')
]

# Region File Name
REGION_FILE_NAME = 'ExampleRegions.reg'

In [53]:
# Load the Regions
regs = ExtendedRegions.read(REGION_FILE_NAME)

In [54]:
# Get the catalogs for the Image Types and Regions
catalogs = RegionCataloger(regs, DEF_IMG_SPECS)

Downloading Images for Each Filter:   0%|          | 0/3 [00:00<?, ?it/s]

INFO: Found cached file HST/WFC3/UVIS/F336W/hst_14182_15_wfc3_uvis_f336w_icwr15_drc.fits with expected size 282726720. [astroquery.query]
INFO: Found cached file HST/WFC3/UVIS/F336W/hst_14182_16_wfc3_uvis_f336w_icwr16_drc.fits with expected size 282726720. [astroquery.query]
INFO: Found cached file HST/WFC3/UVIS/F336W/hst_14182_19_wfc3_uvis_f336w_icwr19_drc.fits with expected size 282726720. [astroquery.query]
INFO: Found cached file HST/WFC3/UVIS/F336W/hst_14182_20_wfc3_uvis_f336w_icwr20_drc.fits with expected size 282726720. [astroquery.query]
INFO: Found cached file HST/ACS/WFC/F475W/hst_10861_02_acs_wfc_f475w_j9ty02_drc.fits with expected size 306984960. [astroquery.query]
INFO: Found cached file HST/ACS/WFC/F475W/hst_10861_03_acs_wfc_f475w_j9ty03_drc.fits with expected size 312134400. [astroquery.query]
INFO: Found cached file HST/ACS/WFC/F475W/hst_10861_09_acs_wfc_f475w_j9ty09_drc.fits with expected size 307105920. [astroquery.query]
INFO: Found cached file HST/ACS/WFC/F475W/hst_

Calculating Photometry for Each Filter:   0%|          | 0/3 [00:00<?, ?it/s]

Processing HST-WFC3-UVIS-F336W Images:   0%|          | 0/4 [00:00<?, ?it/s]

Processing Regions for hst_14182_20_wfc3_uvis_f336w_icwr20_drc.fits:   0%|          | 0/7 [00:00<?, ?it/s]

Processing HST-ACS-WFC-F475W Images:   0%|          | 0/4 [00:00<?, ?it/s]

Processing Regions for hst_10861_02_acs_wfc_f475w_j9ty02_drc.fits:   0%|          | 0/7 [00:00<?, ?it/s]

Processing HST-ACS-WFC-F814W Images:   0%|          | 0/4 [00:00<?, ?it/s]

Processing Regions for hst_10861_02_acs_wfc_f814w_j9ty02_drc.fits:   0%|          | 0/7 [00:00<?, ?it/s]

### Display the Catalogs

In [55]:
# Display the Catalogs
for img_spec, phot_table in catalogs.photometry.items():
    if phot_table is not None:
        print(f"Photometry for {img_spec}")
        phot_table.pprint_all(show_unit=True)
        print('\n\n')
    else:
        print(f"No photometry data available for {img_spec}")

Photometry for HST-WFC3-UVIS-F336W
 id     xcentroid          ycentroid                  sky_centroid                    sum           sum_aper_area   
                                                                                                       arcsec2      
--- ------------------ ------------------ ------------------------------------ ------------------ ------------------
  A  902.0656086955607 1947.2414647035516 195:08:09.38521667 28:03:31.43999625 13.602761052406429 165.90529088485636
  C 1456.4450761701255 1475.6869635522148 195:07:44.49330357 28:03:12.75968815 3.6081527649137675  20.92791324682017
  E 2589.4542283572673  1749.397438619467 195:06:53.62686322 28:03:23.60569804  7.587032591838318  73.71538850084968
  F 3118.2220747506685 2280.6071687164417 195:06:29.88547197 28:03:44.65119274 11.766401937490405  67.05541179199317
  G  2833.452586310629  3950.917682302877 195:06:42.66763202 28:04:50.82959323 2.5740665639977056  34.94150632297272
  B 280.87612418299767 3875.6

In [43]:
tmp = phot_table['sky_centroid']

In [37]:
tmp.to_string(decimal=False, sep=':', precision=8)

['195:08:09.20954907 28:03:31.34915542',
 '195:07:44.48521971 28:03:13.07534642',
 '195:06:54.12665215 28:03:24.19648177',
 '195:06:29.77721619 28:03:44.79775273',
 '195:06:43.62398260 28:04:51.17014804',
 '195:08:37.49843921 28:04:48.15510066',
 '195:07:27.89583744 28:04:00.90812720']
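
The strings above give both coordinates as degrees:arcminutes:arcseconds. The first field of the right ascension is 195, so it is in degrees rather than hours. A minimal sketch for converting one such entry back to decimal degrees follows, using hypothetical helper names and only the standard library.

```python
def sexagesimal_to_degrees(text: str) -> float:
    """Convert a 'D:M:S' string (degrees:arcmin:arcsec) to decimal degrees."""
    # Take the sign from the string so that values like '-00:30:00' stay negative
    sign = -1.0 if text.strip().startswith("-") else 1.0
    degrees, minutes, seconds = (abs(float(part)) for part in text.split(":"))
    return sign * (degrees + minutes / 60.0 + seconds / 3600.0)


def parse_sky_centroid(entry: str) -> tuple[float, float]:
    """Split an 'RA Dec' entry and convert both parts to decimal degrees."""
    ra_text, dec_text = entry.split()
    return sexagesimal_to_degrees(ra_text), sexagesimal_to_degrees(dec_text)


ra_deg, dec_deg = parse_sky_centroid('195:08:09.20954907 28:03:31.34915542')
print(f"{ra_deg:.6f} {dec_deg:.6f}")
```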