# Cataloguing the data

The primary use of the galfind `Data` object is the creation of photometric catalogues for public and personal use. Once produced, these catalogues can be loaded into the `Catalogue` class to derive specific properties, which we cover in the [next section](../catalogue/catalogue.rst). In this first galfind release, catalogues can be produced using SExtractor only, although we aim to support other forced photometry codes in the near future.

The cataloguing procedure involves many different steps that have been explained in previous notebooks in this section. We outline the steps here.

1. Instantiate a blank `Data` object from the reduced imaging
2. Produce segmentation maps for each band using SExtractor
3. Mask the data (i.e. image edges, stars, artefacts) either manually or automatically
4. Perform forced photometry in a set of given apertures in either a single band or an inverse-variance weighted stack of bands
5. Aperture correct the fluxes based on a given model or empirical PSF
6. Calculate local depths for each source
7. Determine appropriate flux errors based on these depths, accounting for the correlated image noise
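
To make the "inverse-variance weighted stack" in step 4 concrete, here is a minimal sketch of the standard per-pixel formula using NumPy. It is not galfind's own implementation: the function name `inverse_variance_stack` and the toy arrays are hypothetical, and it assumes each band comes with a 1σ error map of the same shape as its image.

```python
import numpy as np

def inverse_variance_stack(images, errors):
    """Stack images using per-pixel inverse-variance weights.

    Hypothetical helper for illustration only (not part of galfind).
    images, errors: sequences of 2D arrays with identical shapes,
    where errors holds the 1-sigma uncertainty of each pixel.
    """
    images = np.asarray(images, dtype=float)
    errors = np.asarray(errors, dtype=float)
    weights = 1.0 / errors**2            # w_i = 1 / sigma_i^2
    weight_sum = weights.sum(axis=0)
    stacked = (weights * images).sum(axis=0) / weight_sum
    stacked_err = np.sqrt(1.0 / weight_sum)
    return stacked, stacked_err

# toy 2x2 "bands": the second is noisier, so it contributes less
img_a = np.full((2, 2), 1.0)
img_b = np.full((2, 2), 3.0)
err_a = np.full((2, 2), 1.0)
err_b = np.full((2, 2), 2.0)
stack, stack_err = inverse_variance_stack([img_a, img_b], [err_a, err_b])
```

Deeper (lower-noise) bands dominate the stack, which is why a stacked detection image is generally more sensitive than any single contributing band.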

We will create a `Data` object and associated .fits catalogue following these 7 steps in two ways, long (example 1) and short (example 2).

## Example 1: Producing photometric catalogues

To start with, we will load the same JOF `Data` object we have seen in previous examples.

In [1]:
from pathlib import Path
from astropy.table import Table
from copy import deepcopy
import astropy.units as u

from galfind import Stacked_Band_Data, Data
from galfind.Data import morgan_version_to_dir

survey = "JOF"
version = "v11"
instrument_names = ["NIRCam"]
aper_diams = [0.32] * u.arcsec
forced_phot_band = ["F277W", "F356W", "F444W"]
min_flux_pc_err = 10.
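# instantiate the Data object from the reduced imaging (step 1)
JOF_data_long = Data.from_survey_version(
    survey, 
    version, 
    instrument_names = instrument_names, 
    version_to_dir_dict = morgan_version_to_dir,
    aper_diams = aper_diams,
    forced_phot_band = forced_phot_band
)
# mask the data (step 3 in the list above)
JOF_data_long.mask()
# produce segmentation maps with SExtractor (step 2 in the list above)
JOF_data_long.segment()
# perform forced photometry in the chosen apertures (step 4)
JOF_data_long.perform_forced_phot()
# aperture correct the fluxes (step 5)
JOF_data_long.append_aper_corr_cols()
# calculate local depths (step 6)
JOF_data_long.run_depths()
# determine flux errors from the local depths (step 7)
JOF_data_long.append_loc_depth_cols(min_flux_pc_err = min_flux_pc_err)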


Reading GALFIND config file from: /nvme/scratch/work/austind/GALFIND/galfind/../configs/galfind_config.ini


INFO:galfind:Aperture corrections for ACS_WFC loaded from /nvme/scratch/work/austind/GALFIND/galfind/Aperture_corrections/ACS_WFC_aper_corr.txt
INFO:galfind:Aperture corrections for WFC3_IR loaded from /nvme/scratch/work/austind/GALFIND/galfind/Aperture_corrections/WFC3_IR_aper_corr.txt
INFO:galfind:Aperture corrections for NIRCam loaded from /nvme/scratch/work/austind/GALFIND/galfind/Aperture_corrections/NIRCam_aper_corr.txt
INFO:galfind:Aperture corrections for MIRI loaded from /nvme/scratch/work/austind/GALFIND/galfind/Aperture_corrections/MIRI_aper_corr.txt
INFO:galfind:Aperture corrections for ACS_WFC loaded from /nvme/scratch/work/austind/GALFIND/galfind/Aperture_corrections/ACS_WFC_aper_corr.txt
INFO:galfind:Aperture corrections for WFC3_IR loaded from /nvme/scratch/work/austind/GALFIND/galfind/Aperture_corrections/WFC3_IR_aper_corr.txt
INFO:galfind:Aperture corrections for NIRCam loaded from /nvme/scratch/work/austind/GALFIND/galfind/Aperture_corrections/NIRCam_aper_corr.txt
IN

Now we will search the GALFIND_WORK directory for the individual forced photometry catalogues for each band and the resulting catalogue/README to ensure they exist and have been created correctly.

In [2]:
# search for photometric catalogue
if Path(JOF_data_long.phot_cat_path).is_file():
    print("Photometric catalogue exists at the expected path.")
    # open the photometric catalogue
    phot_cat = Table.read(JOF_data_long.phot_cat_path)
    print(phot_cat)
else:
    print("Photometric catalogue does not exist at the expected path.")

# # search for README
# readme_path = JOF_data_long.phot_cat_path.replace(".fits", "_README.txt")
# if Path(readme_path).is_file():
#     print("README exists at the expected path.")
#     # print the README
#     with open(readme_path, "r") as f:
#         print(f.read())
# else:
#     print("README does not exist at the expected path.")

Photometric catalogue exists at the expected path.
  NUMBER     X_IMAGE   ... FLUXERR_APER_F444W_loc_depth_10pc_Jy
               pix     ...                                     
---------- ----------- ... ------------------------------------
         1   9219.8145 ...                9.334392262202193e-07
         2   6108.1621 ...                7.172339419546474e-07
         3   7386.6421 ...               3.1515218602729543e-07
         4   8138.5957 ...                3.234000112590365e-07
         5   8047.3140 ...               1.1086978875416937e-07
         6   9006.2188 ...                4.827971234428365e-08
         7    765.8052 ...                5.650294316943437e-07
         8   1647.3230 ...                1.901099027593121e-07
         9   6701.1450 ...               1.7682059984292007e-07
        10     82.3648 ...                2.371577141482952e-08
       ...         ... ...                                  ...
     16326  10042.8691 ...                 7.28102017
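
The `_10pc` suffix in the flux error column name above reflects the `min_flux_pc_err = 10.` argument. One plausible reading, which is an assumption here and not confirmed by this page, is that the depth-based error is floored at a fixed percentage of the measured flux. The sketch below shows that floor; the function name `apply_min_flux_pc_err` and the toy values are hypothetical and not part of galfind.

```python
import numpy as np

def apply_min_flux_pc_err(flux, flux_err, min_flux_pc_err=10.0):
    """Floor flux errors at a percentage of the absolute flux.

    Hypothetical helper: ASSUMES min_flux_pc_err acts as a lower
    limit on the fractional flux error.
    """
    flux = np.asarray(flux, dtype=float)
    flux_err = np.asarray(flux_err, dtype=float)
    floor = np.abs(flux) * min_flux_pc_err / 100.0
    return np.maximum(flux_err, floor)

# toy fluxes and depth-based errors in Jy
flux = np.array([1.0e-6, 5.0e-8, 2.0e-7])
depth_err = np.array([5.0e-8, 2.0e-8, 3.0e-8])
floored_err = apply_min_flux_pc_err(flux, depth_err, min_flux_pc_err=10.0)
```

Under this assumption only the brightest toy source has its error raised (to 10% of its flux); the fainter sources keep their depth-based errors.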

Let's have a look at how this changes the `Data` print statement.

In [3]:
print(JOF_data_long)

****************************************
DATA OBJECT:
----------
SURVEY: JOF
VERSION: v11
****************************************
MULTIPLE_FILTER
----------
FACILITY: JWST
INSTRUMENT: NIRCam
FILTERS: ['F090W', 'F115W', 'F150W', 'F162M', 'F182M', 'F200W', 'F210M', 'F250M', 'F277W', 'F300M', 'F335M', 'F356W', 'F410M', 'F444W']
****************************************
****************************************



For safety, once the path to the photometric catalogue has been loaded into the `Data` object, the forced photometry cannot be re-run from that object. This prevents confusion between products stemming from the previous catalogue and those from a newly created one. To be clear, the `overwrite` parameter that we have been using states only whether the pre-existing paths should be overwritten with the new data, and NOT whether the data stored in the object should be updated. Preventing the paths stored in a particular object from being overwritten does not, however, entirely prevent the outputs of methods run from those paths from changing: the information is not cached in the object, but is extracted from the data products when required. Let's try re-producing this SExtractor forced photometry catalogue in the same object, this time using the F356W filter for selection, to see what error message galfind gives.

In [4]:
JOF_data_long.perform_forced_phot(forced_phot_band = "F356W")

CRITICAL:galfind:MASTER Photometric catalogue already exists!
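
The distinction between protecting an object's stored path and protecting the file on disk can be illustrated with a toy model. `ToyData` below is a hypothetical stand-in, not the galfind `Data` class, and it raises an exception where galfind logs a critical message. It shows only the general pattern: the first object refuses to re-run, yet a second object with `overwrite=True` can still change the file the first one points to.

```python
from pathlib import Path
import tempfile

class ToyData:
    """Hypothetical stand-in for illustration; NOT the galfind Data class."""

    def __init__(self):
        self.phot_cat_path = None

    def perform_forced_phot(self, out_path, selection_band, overwrite=False):
        # refuse to re-run once a catalogue path is stored in this object
        if self.phot_cat_path is not None:
            raise RuntimeError("Photometric catalogue already loaded")
        out_path = Path(out_path)
        # overwrite controls the file on disk, not the object state
        if overwrite or not out_path.is_file():
            out_path.write_text(f"selection band: {selection_band}\n")
        self.phot_cat_path = out_path

with tempfile.TemporaryDirectory() as tmp_dir:
    cat_path = Path(tmp_dir) / "toy_catalogue.txt"
    first = ToyData()
    first.perform_forced_phot(cat_path, "F277W+F356W+F444W")
    try:
        first.perform_forced_phot(cat_path, "F356W", overwrite=True)
        rerun_blocked = False
    except RuntimeError:
        rerun_blocked = True
    # a fresh object can still overwrite the file that `first` points to
    second = ToyData()
    second.perform_forced_phot(cat_path, "F356W", overwrite=True)
    contents_seen_by_first = first.phot_cat_path.read_text()
```

Because nothing is cached in the toy object, `first` now reads the contents written by `second`, which mirrors the caveat described above.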


## Example 2: Running the Data pipeline

There is one last class method of the `Data` object that we haven't covered yet: `Data.pipeline()`, which again takes just the `survey` and `version` inputs. This class method is what the EPOCHS pipeline uses; it essentially chains together the cataloguing steps from the previous notebooks, skipping any that have already been executed. For further details, please read the previous notebooks in this section if you have not already done so.

In [5]:
# load the data object (short version)
JOF_data_short = Data.pipeline(
    survey, 
    version, 
    instrument_names = instrument_names, 
    version_to_dir_dict = morgan_version_to_dir,
    aper_diams = aper_diams,
    forced_phot_band = forced_phot_band,
    min_flux_pc_err = min_flux_pc_err
)

# ensure the two data objects are the same
assert JOF_data_short == JOF_data_long

INFO:galfind:Aperture corrections for NIRCam loaded from /nvme/scratch/work/austind/GALFIND/galfind/Aperture_corrections/NIRCam_aper_corr.txt


INFO:galfind:Loaded aper_diams=<Quantity [0.32] arcsec> for F277W+F356W+F444W
INFO:galfind:Combined mask for <galfind.Data.Stacked_Band_Data object at 0x7f67f5128730> already exists at /raid/scratch/work/austind/GALFIND_WORK/Masks/JOF/combined/JOF_F277W+F356W+F444W_auto.fits
Calculating depths:   0%|          | 0/15 [00:00<?, ?it/s]
INFO:galfind:Calculated/loaded depths for JOF v11 NIRCam
INFO:galfind:Local depth columns already exist in /raid/scratch/work/austind/GALFIND_WORK/Catalogues/v11/NIRCam/JOF/(0.32)as/JOF_MASTER_Sel-F277W+F356W+F444W_v11.fits


Note that the two implementations are the same only if the default galfind pipeline parameters are used. Any deviation in masking, segmentation, performing forced photometry, running depths, or choice of PSF will produce differences between these two `Data` objects.

Fantastic! You've stuck it out to the end of the `Data` class documentation. Feel free to move on to the next section, which covers the galfind [Catalogue](../catalogue/catalogue.rst) class.