# Cataloguing the data

## Example 1: Forced photometry and producing photometric catalogues

The primary use of the galfind `Data` object is to create photometric catalogues for public and personal use. Once produced, these catalogues can be loaded into the `Catalogue` class to derive specific properties, which we cover in the [next section](../catalogue/catalogue.rst). This first galfind release supports producing them with SExtractor only, although we aim to add other forced-photometry codes in the near future.

The `Data.perform_forced_phot()` method is used to:
1. Perform forced photometry using a specific selection band, including the possibility of using stacked data
2. Produce a photometric catalogue with aperture (and Kron) fluxes in a range of chosen aperture sizes, along with other derived properties (currently only those from SExtractor)
3. Create a README for the catalogue which is updated at runtime to describe what is included

To start, after loading the same JOF `Data` object we have seen in previous examples, we will make a catalogue using `forced_phot_band="F444W"`.

In [2]:
from pathlib import Path
from astropy.table import Table
from copy import deepcopy

from galfind import Stacked_Band_Data, Data
from galfind.Data import morgan_version_to_dir

survey = "JOF"
version = "v11"
instrument_names = ["NIRCam"]
JOF_data = Data.from_survey_version(
    survey, 
    version, 
    instrument_names = instrument_names, 
    version_to_dir_dict = morgan_version_to_dir,
)
JOF_data.perform_forced_phot(forced_phot_band = "F444W")

__init__ imports took 2.3365020751953125e-05s
Reading GALFIND config file from: /nvme/scratch/work/austind/GALFIND/galfind/../configs/galfind_config.ini


TypeError: 'Band_Data' object is not iterable

Now we will search the GALFIND_WORK directory for the individual forced photometry catalogues for each band and the resulting catalogue/README to ensure they exist and have been created correctly.

In [None]:
# search for photometric catalogue
if Path(JOF_data.phot_cat_path).is_file():
    print("Photometric catalogue exists at the expected path.")
    # open the photometric catalogue
    phot_cat = Table.read(JOF_data.phot_cat_path)
    print(phot_cat)
else:
    print("Photometric catalogue does not exist at the expected path.")

# search for README
readme_path = JOF_data.phot_cat_path.replace(".fits", "_README.txt")
if Path(readme_path).is_file():
    print("README exists at the expected path.")
    # print the README
    # the with block closes the file automatically
    with open(readme_path, "r") as f:
        print(f.read())
else:
    print("README does not exist at the expected path.")

# search for forced photometry catalogues for each included filter
for band_data in JOF_data:
    if Path(band_data.forced_phot_cat_path).is_file():
        print(f"Forced photometry catalogue for {band_data.filter_name} exists at the expected path.")
    else:
        print(f"Forced photometry catalogue for {band_data.filter_name} does not exist at the expected path.")
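
The README path in the cell above is built with `str.replace`, which would also rewrite a `.fits` substring appearing anywhere else in the path. As a minimal sketch, assuming the same `<catalogue stem>_README.txt` naming, the path can instead be derived with `pathlib`, alongside a small helper that lists missing products. The names `readme_path_for`, `missing_products`, and the example path are hypothetical and not part of galfind.

```python
from pathlib import Path


def readme_path_for(cat_path):
    """Return the README path sitting next to a catalogue file.

    Assumes the '<catalogue stem>_README.txt' naming used in the cell above.
    """
    cat_path = Path(cat_path)
    # with_name swaps only the final path component, so a ".fits" substring
    # elsewhere in the path (e.g. in a directory name) is left untouched
    return cat_path.with_name(cat_path.stem + "_README.txt")


def missing_products(paths):
    """Return the subset of paths that do not point at an existing file."""
    return [Path(p) for p in paths if not Path(p).is_file()]


# hypothetical catalogue path, for illustration only
example_cat = Path("catalogues.fits_dir") / "JOF_phot.fits"
print(readme_path_for(example_cat))
```

Passing the catalogue, README, and per-band paths to `missing_products` would return only those that still need to be produced.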

Let's have a look at how this changes the `Data` print statement.

In [None]:
print(JOF_data)

For safety reasons, once the (path to the) photometric catalogue has been loaded into the `Data` object, forced photometry cannot be re-run on that object. This prevents confusion between products derived from the previous catalogue and those from a newly loaded one. To be clear, the `overwrite` parameter that we have been using only states whether the products at pre-existing paths should be overwritten with new data, NOT whether the data stored in the object should be updated. Preventing the stored paths from changing in a particular object does not, however, fully protect the outputs of methods that read from those paths: the information is not cached in the object but extracted from the data products when required. Let's try to re-produce the SExtractor forced photometry catalogue in the same object, this time using the F356W filter for selection, to see what error message galfind gives.

In [None]:
JOF_data.perform_forced_phot(forced_phot_band = "F356W")
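
The guard described above can be pictured with a toy class. This is only a sketch of the general pattern: `ToyData`, its file naming, and the choice of `RuntimeError` are assumptions made for illustration, and the exception type and message that galfind actually produces are not shown here.

```python
class ToyData:
    """Hypothetical stand-in for a data container; not the galfind `Data` class."""

    def __init__(self):
        # no photometric catalogue has been attached yet
        self.phot_cat_path = None

    def perform_forced_phot(self, forced_phot_band):
        # refuse to re-run once a catalogue path is stored on this object
        if self.phot_cat_path is not None:
            raise RuntimeError(
                f"Catalogue already loaded from {self.phot_cat_path}; "
                "create a new object to use a different selection band."
            )
        # placeholder for the real work: only record where the output would live
        self.phot_cat_path = f"toy_catalogue_{forced_phot_band}.fits"
        return self.phot_cat_path


toy = ToyData()
toy.perform_forced_phot("F444W")
try:
    toy.perform_forced_phot("F356W")
except RuntimeError as err:
    print(f"Refused: {err}")
```

Because the guard lives on the object, a fresh instance is free to run with a different selection band.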

Let's now create a second object (plus a third, deep-copied before any forced photometry is run, for use later in the notebook) and try performing the forced photometry once again.

In [None]:
JOF_data_2 = Data.from_survey_version(
    survey, 
    version, 
    instrument_names = instrument_names, 
    version_to_dir_dict = morgan_version_to_dir,
)
JOF_data_3 = deepcopy(JOF_data_2)

This time, we will attempt to produce a photometric catalogue selecting from a stack of F277W, F356W, and F444W, as done in the EPOCHS series and in a host of other high-redshift studies. To do this, we will need to create a `Stacked_Band_Data` object, as seen before in example 3 of the [Data class introduction notebook](data_intro.ipynb).

In [None]:
select_filters = ["F277W", "F356W", "F444W"]
stacked_NIRCam_LW = Stacked_Band_Data.from_band_data_arr(JOF_data_2[select_filters])
print(stacked_NIRCam_LW)

## Example 2: Running the Data pipeline

There is one last class method of the `Data` object that we haven't covered yet: `Data.pipeline()`, which again takes just `survey` and `version` inputs. This class method is used in the EPOCHS pipeline and essentially chains together the cataloguing steps from the previous notebooks, skipping any that have already been executed. For further details, please read the previous notebooks in this section if you have not already done so.

In [1]:
# imports
from galfind import Data

survey = "JOF"
version = "v11"

# load the data object (short version)
data_short = Data.pipeline(survey, version)

# load the data object (long version)
data_long = Data.from_survey_version(survey, version)
data_long.PSF_homogenize("F444W")
data_long.segment("sextractor")
data_long.perform_forced_phot(["F277W", "F356W", "F444W"])
data_long.mask()
data_long.run_depths()

# ensure the two data objects are the same
assert data_short == data_long

# show the data object attributes
print(data_short)

__init__ imports took 0.9078459739685059s
Reading GALFIND config file from: /nvme/scratch/work/austind/GALFIND/galfind/../configs/galfind_config.ini


SyntaxError: invalid syntax (Data.py, line 462)
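
The "chain the steps, skip what is already done" behaviour described for `Data.pipeline()` can be sketched generically as follows. This is an illustration of the idea only: `run_pipeline` is a hypothetical helper, the step functions are empty placeholders named after the long-form calls above, and tracking completion with a set of names is an assumption (how galfind decides that a step has already run is not covered here).

```python
def run_pipeline(steps, completed):
    """Run each (name, function) step in order, skipping names in `completed`.

    Returns the list of step names that were actually executed.
    """
    executed = []
    for name, func in steps:
        if name in completed:
            # this step's products already exist, so do not repeat the work
            continue
        func()
        completed.add(name)
        executed.append(name)
    return executed


# hypothetical step names mirroring the long-form calls above; the lambdas do nothing
steps = [
    ("PSF_homogenize", lambda: None),
    ("segment", lambda: None),
    ("perform_forced_phot", lambda: None),
    ("mask", lambda: None),
    ("run_depths", lambda: None),
]
completed = {"PSF_homogenize", "segment"}  # pretend these ran in an earlier session
print(run_pipeline(steps, completed))
print(run_pipeline(steps, completed))
```

The first call runs only the three outstanding steps, and the second finds nothing left to do, which is the property that makes re-running such a pipeline cheap.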