# Check input files

## Inspired by Getting started with Delight and LSST


- author : Sylvie Dagoret-Campagne
- affiliation : IJCLab/IN2P3/CNRS
- creation date : 2024-11-02
- last update :  2024-11-02



**test delight.interfaces.rail**: an adaptation of the original SDSS tutorial and of the Getting started tutorial.


- Run at NERSC with the **desc-python** Python kernel.


Instructions for setting up a **desc-python** environment:
- https://confluence.slac.stanford.edu/display/LSSTDESC/Getting+Started+with+Anaconda+Python+at+NERSC


This environment is a clone of the **desc-python** environment, to which the packages listed in the requirements can be added by following the instructions here:
- https://github.com/LSSTDESC/desc-python/wiki/Add-Packages-to-the-desc-python-environment

We will use the parameter file "tmps/parametersTestRail.cfg".
It contains a description of the bands and data to be used.
In this example we will generate mock data for the ugrizy LSST bands,
fit each object with our GP using the ugi bands only, and see how well it predicts the rz bands.
This is an example of filling in or predicting missing bands quickly, in a fully Bayesian way,
with a flexible SED model, via our photo-z GP.

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats
import sys, os, h5py

sys.path.append("../..")
from delight.io import *
from delight.utils import *
from delight.photoz_gp import PhotozGP

In [None]:
from delight.interfaces.rail.makeConfigParam import makeConfigParam

In [None]:
# path of the config parameter file
param_path = "."
configfullfilename = "./parametersTest.cfg"

- **makeConfigParam** generates a long string defining the required parameters.

In [None]:
params = parseParamFile(configfullfilename, verbose=False)

In [None]:
params
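
The full parameter dictionary is long, so it can help to look only at the entries for one catalog. The following is a minimal sketch, assuming `params` behaves like a plain Python dict with string keys (as the later cells suggest when they read `params["training_catFile"]`). The stand-in dictionary `params_demo`, its values, and the helper `select_by_prefix` are hypothetical and only serve the illustration.

```python
# Hypothetical stand-in for the dictionary returned by parseParamFile.
# The file paths and the "example_other_key" entry are illustrative only.
params_demo = {
    "training_catFile": "data/training.txt",
    "target_catFile": "data/target.txt",
    "example_other_key": "value",
}


def select_by_prefix(params, prefix):
    """Return the entries of `params` whose key starts with `prefix`."""
    return {key: value for key, value in params.items() if key.startswith(prefix)}


training_params = select_by_prefix(params_demo, "training_")
for key, value in sorted(training_params.items()):
    print(f"{key} = {value}")
```

With the real dictionary, the call would be `select_by_prefix(params, "training_")` or `select_by_prefix(params, "target_")`.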

## Conclusion
Don't be too harsh with the results of the standard template fitting or the new methods, since both have many parameters that can be optimized!

If the results above make sense, i.e. the redshifts are reasonable for both methods on the mock data, then you can start modifying the parameter files and creating catalog files containing actual data! I recommend using fewer than 20k galaxies for training, and 1000 to 10k galaxies for the delight-apply script at the moment. Future updates will address this limitation.

## Test compatibility between the text file and the HDF5 file

In [None]:
def test_file_same(file_txt, file_hdf, prefix):
    """Load the same catalog from its text and HDF5 versions.

    Returns the pair (arr_txt, arr_h5) so that the caller can compare them.
    """
    try:
        arr_txt = np.loadtxt(file_txt)
    except Exception as inst:
        print(f">>>> could not read text file {file_txt} ::", inst)
        # re-raise instead of calling exit(), which would stop the notebook kernel
        raise
    try:
        arr_h5 = readdataarrayh5(file_hdf, prefix=prefix)
    except Exception as inst:
        print(f">>>> could not read {file_hdf} (missing file or bad prefix) ::", inst)
        raise

    # the caller compares the arrays, e.g. with np.array_equal or np.allclose
    return arr_txt, arr_h5

In [None]:
file_txt = params["training_" + "catFile"]
file_hdf = getFilePathh5(params, prefix="training_", ftype="catalog")
print(file_txt, file_hdf)
arr_txt, arr_h5 = test_file_same(file_txt, file_hdf, prefix="training_")
np.allclose(arr_txt, arr_h5, rtol=1e-12)
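
Whether a text file can match a binary file at `rtol=1e-12` depends on how many digits were written to the text file. The sketch below illustrates this with NumPy only, under stated assumptions: the array is random stand-in data, the text round trip goes through an in-memory buffer rather than the real catalog files, and `atol=0.0` is passed so that only the relative tolerance matters (the default `atol` of `np.allclose` is `1e-8`).

```python
import io

import numpy as np

# Stand-in data: random values, not a real Delight catalog.
rng = np.random.default_rng(0)
reference = rng.uniform(1.0, 10.0, size=(5, 3))


def text_round_trip(array, fmt):
    """Write `array` as text with format `fmt`, then read it back."""
    buffer = io.StringIO()
    np.savetxt(buffer, array, fmt=fmt)
    buffer.seek(0)
    return np.loadtxt(buffer)


full = text_round_trip(reference, "%.18e")      # enough digits for float64
truncated = text_round_trip(reference, "%.6e")  # only 7 significant digits

full_ok = np.allclose(full, reference, rtol=1e-12, atol=0.0)
truncated_ok = np.allclose(truncated, reference, rtol=1e-12, atol=0.0)
worst_rel_diff = np.max(np.abs(truncated - reference) / np.abs(reference))

print("full precision matches:", full_ok)
print("truncated text matches:", truncated_ok)
print("worst relative difference after truncation:", worst_rel_diff)
```

If the comparison on the real catalogs returns `False`, printing the largest relative difference in the same way shows whether the mismatch is only a matter of text precision.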

In [None]:
arr_h5.shape

In [None]:
file_txt = params["target_" + "catFile"]
file_hdf = getFilePathh5(params, prefix="target_", ftype="catalog")
print(file_txt, file_hdf)
arr_txt, arr_h5 = test_file_same(file_txt, file_hdf, prefix="target_")
np.allclose(arr_txt, arr_h5, rtol=1e-12)

In [None]:
arr_txt.shape

In [None]:
arr_h5.shape

In [None]:
np.argwhere(np.isnan(arr_h5))
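
The NaN check matters for the comparisons above because `np.allclose` treats NaN entries as unequal unless `equal_nan=True` is passed. A minimal sketch with a small hand-made array (not catalog data):

```python
import numpy as np

# Two identical arrays that both contain a NaN at row 0, column 1.
a = np.array([[1.0, np.nan], [3.0, 4.0]])
b = a.copy()

nan_positions = np.argwhere(np.isnan(a))      # (row, column) index of each NaN
strict = np.allclose(a, b)                    # NaN != NaN, so this fails
lenient = np.allclose(a, b, equal_nan=True)   # NaNs at the same place count as equal

print(nan_positions, strict, lenient)
```

If `np.argwhere(np.isnan(arr_h5))` returns a non-empty result, rerunning the comparison with `equal_nan=True` tells whether the NaN entries alone explain a failed match.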

In [None]:
plt.hist(arr_h5[:, 13], bins=100);