<img align="left" src = "images/linea.png" width=140 style="padding: 20px"> 
<img align="left" src = "images/rubin.png" width=180 style="padding: 30px"> 

# Photo-z Server
## Tutorial Notebook 2 - Training Sets


Contact author: [Julia Gschwend](mailto:julia@linea.org.br)

Last verified run: **2024-Jul-22**

### Introduction 

Welcome to the PZ Server tutorials. If you are reading this notebook for the first time, we recommend starting with the introduction notebook, `0_introduction.ipynb`, which is available in this same repository. 


### Imports and Setup

In [None]:
from pzserver import PzServer 
import matplotlib.pyplot as plt
%reload_ext autoreload 
%autoreload 2

In [None]:
# pz_server = PzServer(token="<your token>", host="pz-dev") # "pz-dev" is the temporary host for test phase  

For convenience, the token can be saved in a file named `token.txt` (which is already listed in the `.gitignore` file of this repository). 

In [None]:
# Read the token from a local file; strip() drops any trailing newline or spaces
with open('token.txt', 'r') as file:
    token = file.read().strip()
pz_server = PzServer(token=token, host="pz-dev") # "pz-dev" is the temporary host for test phase  

### Product types 

The PZ Server API provides Python classes with useful methods to handle particular product types. Let's recap the product types available:   

In [None]:
pz_server.display_product_types()

## Training Sets 
    
In the context of the PZ Server, Training Sets are defined as the product of spatially matching a given Spec-z Catalog (single survey or compilation) to the photometric data, in this case the LSST Objects Catalog. The PZ Server API offers a tool called _Training Set Maker_ that lets users build customized Training Sets based on the Spec-z Catalogs available. Please see the companion Jupyter Notebook `pz_tsm_tutorial.ipynb` for details.   


_Note 1: The training set is commonly split into two or more subsets for photo-z validation purposes. If the Training Set owner has previously defined which objects belong to each subset (training and validation/test sets), this information must be available as an extra column in the table or as clear instructions in the data product description for reproducing the subset separation._

  
_Note 2: The PZ Server only supports catalog-level Training Sets. Image-based Training Sets, e.g., for deep-learning algorithms, are not supported yet._


Mandatory column: 
* Spectroscopic (or true) redshift - `float`

Other expected columns:
* Object ID from LSST Objects Catalog - `integer`
* Observables: magnitudes (and/or colors, or fluxes) from LSST Objects Catalog - `float`
* Observable errors: magnitude errors (and/or color errors, or flux errors) from LSST Objects Catalog - `float`
* Right ascension [degrees] - `float`
* Declination [degrees] - `float`
* Quality Flag - `integer`, `float`, or `string`
* Subset Flag - `integer`, `float`, or `string`
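
The column layout listed above, together with the subset flag mentioned in Note 1, can be illustrated with a small synthetic table. This is only a sketch under stated assumptions: the column names (`objectId`, `redshift`, `magerr_i_lsst`, `quality_flag`, `subset_flag`), the flag values, and the 70/30 split are hypothetical choices made for illustration, not names or rules required by the PZ Server.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)
n_objects = 10

# Synthetic table with hypothetical column names (illustration only).
training_set = pd.DataFrame({
    "objectId": np.arange(1, n_objects + 1),              # integer ID
    "ra": rng.uniform(50.0, 70.0, n_objects),             # degrees
    "dec": rng.uniform(-45.0, -30.0, n_objects),          # degrees
    "redshift": rng.uniform(0.0, 3.0, n_objects),         # mandatory column
    "mag_i_lsst": rng.uniform(18.0, 25.0, n_objects),     # observable
    "magerr_i_lsst": rng.uniform(0.01, 0.2, n_objects),   # observable error
    "quality_flag": np.full(n_objects, 4),                # arbitrary example value
})

# Check that the mandatory redshift column exists and holds floats.
if "redshift" not in training_set.columns:
    raise ValueError("Missing mandatory column: redshift")
if not pd.api.types.is_float_dtype(training_set["redshift"]):
    raise TypeError("The redshift column is expected to be float")

# Record a reproducible 70/30 split as an extra column (assumed fractions).
shuffled = rng.permutation(n_objects)
n_train = (7 * n_objects) // 10
subset = np.full(n_objects, "validation", dtype=object)
subset[shuffled[:n_train]] = "training"
training_set["subset_flag"] = subset

print(training_set["subset_flag"].value_counts())
```

Storing the split in the table itself, rather than only describing it in the product description, means that anyone who downloads the product recovers exactly the same subsets.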



#### PZ Server Pipelines
Training Sets can be uploaded by users on the PZ Server website or via the `pzserver` library. They can also be created by the PZ Server's pipeline "Training Set Maker" (under development), which spatially cross-matches a Spec-z Catalog previously registered in the system with an Object table from a given LSST Data Release available in the Brazilian IDAC. Any Training Set built by the pipeline is automatically registered as a regular user-generated data product and is no different from an uploaded one. 
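
To make the idea of a spatial cross-match concrete, here is a minimal brute-force sketch using only NumPy. It is not the Training Set Maker implementation: the function names, the toy coordinates, and the 1 arcsecond matching radius are all assumptions made for illustration. For each spec-z object it finds the nearest photometric object on the sky and accepts the pair only if the separation is within the radius.

```python
import numpy as np


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = np.sin((dec2 - dec1) / 2.0)
    sin_dra = np.sin((ra2 - ra1) / 2.0)
    a = sin_ddec**2 + np.cos(dec1) * np.cos(dec2) * sin_dra**2
    return np.degrees(2.0 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0))))


def nearest_neighbor_match(specz_ra, specz_dec, phot_ra, phot_dec, radius_arcsec=1.0):
    """Hypothetical helper: nearest photometric neighbor for each spec-z object."""
    specz_ra = np.asarray(specz_ra, dtype=float)
    specz_dec = np.asarray(specz_dec, dtype=float)
    phot_ra = np.asarray(phot_ra, dtype=float)
    phot_dec = np.asarray(phot_dec, dtype=float)

    # Pairwise separation matrix with shape (n_specz, n_phot).
    sep = angular_separation_deg(
        specz_ra[:, None], specz_dec[:, None], phot_ra[None, :], phot_dec[None, :]
    )
    nearest = np.argmin(sep, axis=1)
    nearest_sep_arcsec = sep[np.arange(len(specz_ra)), nearest] * 3600.0
    matched = nearest_sep_arcsec <= radius_arcsec
    return nearest, nearest_sep_arcsec, matched


# Toy coordinates in degrees (made up for this example).
specz_ra, specz_dec = [10.0, 20.0], [-30.0, -40.0]
phot_ra, phot_dec = [10.0001, 15.0, 20.01], [-30.0, -35.0, -40.0]

nearest, sep_arcsec, matched = nearest_neighbor_match(
    specz_ra, specz_dec, phot_ra, phot_dec, radius_arcsec=1.0
)
print(nearest, sep_arcsec, matched)
```

The pairwise matrix scales as the product of the two catalog sizes, so this approach only suits small examples; catalogs of realistic size need a spatial index.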



In [None]:
train_goldenspike = pz_server.get_product(9)

In [None]:
train_goldenspike.display_metadata()

Display basic statistics

In [None]:
train_goldenspike.data.describe()

The training set object has a very basic plot method for quick visualization of catalog properties. For advanced interactive data visualization tips, we recommend the notebook [**DP02_06b_Interactive_Catalog_Visualization.ipynb**](https://github.com/rubin-dp0/tutorial-notebooks/blob/main/DP02_06b_Interactive_Catalog_Visualization.ipynb) from Rubin Observatory's DP0.2 [tutorial-notebooks repository](https://github.com/rubin-dp0/tutorial-notebooks/tree/main). 

In [None]:
train_goldenspike.plot(mag_name="mag_i_lsst")

--- 

### User feedback 

Is something important missing? [Click here to open an issue in the PZ Server library repository on GitHub](https://github.com/linea-it/pzserver/issues/new). 