# Demo: RAIL Evaluation 

_Sam Schmidt, Alex Malz, Julia Gschwend_ ([julia@linea.gov.br](mailto:julia@linea.gov.br))

The purpose of this notebook is to demonstrate the metrics scripts that will be applied to the photo-$z$ PDF catalogs produced by the PZ working group. The first implementation of the _evaluation_ module is a refactoring of the algorithms used in [Schmidt et al. 2020](https://arxiv.org/pdf/2001.03621.pdf), available in the GitHub repository [PZDC1paper](https://github.com/LSSTDESC/PZDC1paper).

To run this code, you must install qp and keep the notebook in the same directory as `metrics.py` and `sample.py`, both of which are imported below. You must also install some run-of-the-mill Python packages: matplotlib, numpy, scipy, and skgof.




In [None]:
import numpy as np
import matplotlib.pyplot as plt


#import warnings
#warnings.filterwarnings('ignore')

from sample import Sample
from metrics import Metrics

%matplotlib inline
%load_ext autoreload
%autoreload 2

<font color='red'>WARNING: error when importing skgof -> No module named 'scipy._lib.six' </font>

# 1. Sample  

To compute the photo-z metrics of a given test sample, it is necessary to read the output of a photo-z code containing the galaxies' photo-z PDFs. Let's use the toy data available in `tests/data/` (**test_dc2_training_9816.hdf5** and **test_dc2_validation_9816.hdf5**) and the configuration file available in `examples/configs/FZBoost.yaml` to generate a small sample of photo-z PDFs using the **FZBoost** algorithm available in RAIL's _estimation_ module.

### Run FZBoost

Go to the directory `<your_path>/RAIL/examples/` and run the command `python main.py configs/FZBoost.yaml`.
The photo-z output file (the input for this notebook) will be written to `<your_path>/RAIL/examples/results/FZBoost/test_FZBoost.hdf5`.

<font color='red'>The new version of RAIL will write the output of the codes as qp files rather than the old-format hdf5 files (see Sam's message on Slack about RAIL issue #33). TO DO: update the read() function of class Data. </font>

In [None]:
my_path = '/Users/julia/github/RAIL' # replace with the path to your own RAIL directory
pdfs_file = my_path + '/examples/results/FZBoost/test_FZBoost.hdf5'
valid_file = my_path + '/tests/data/test_dc2_validation_9816.hdf5'

Let's create a Sample object containing both the PDFs and true redshifts for each photo-z code.

In [None]:
sample = Sample(pdfs_file, valid_file, name="FZBoost")

In [None]:
print(sample)

### PDFs

PDFs of 5 galaxies for illustration.

In [None]:
#gals = np.random.choice(len(ztrue), 5)
gals = [540, 2256, 12175, 17802, 19502]
colors = sample.plot_pdfs(gals)

### Validation plots

Traditional validation plots (point colors follow the PDFs above)

In [None]:
sample.plot_old_valid(gals=gals, colors=colors)

# 2. Metrics

The following metrics are computed from the photo-z PDFs. Let's create a Metrics object to access the metrics and plots of interest.

In [None]:
metrics = Metrics(sample)

### 2.1 PIT

The first metric we calculate is the Probability Integral Transform (PIT), 
\begin{equation*}
\mathrm{PIT}(p_{i}(z);\ z_{i})\ =\ \int_{-\infty}^{z_{i}}\ p_{i}(z)\ dz,
\end{equation*}
for every galaxy $i$ in the catalog. For instance, the PIT values of the 5 PDFs above are:

In [None]:
metrics.pit[gals]
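As an illustration of the definition above, here is a minimal standalone sketch that computes PIT values for PDFs tabulated on a shared redshift grid. It is not the implementation behind `metrics.pit`; the function `pit_from_grid`, the Gaussian toy PDFs, and the grid are all assumptions made for this example.

```python
import numpy as np
from scipy.stats import norm


def pit_from_grid(z_grid, pdfs, z_true):
    """PIT of each gridded PDF: its CDF evaluated at the true redshift.

    Hypothetical helper for illustration; `pdfs` has shape (n_gal, n_grid).
    """
    dz = np.diff(z_grid)
    # Cumulative trapezoidal integral of each PDF along the grid.
    steps = 0.5 * (pdfs[:, 1:] + pdfs[:, :-1]) * dz
    cdfs = np.concatenate(
        [np.zeros((pdfs.shape[0], 1)), np.cumsum(steps, axis=1)], axis=1
    )
    cdfs /= cdfs[:, -1:]  # enforce CDF = 1 at the upper grid edge
    # Linear interpolation of each CDF at that galaxy's true redshift.
    return np.array(
        [np.interp(zt, z_grid, cdf) for zt, cdf in zip(z_true, cdfs)]
    )


# Toy data (assumed): three Gaussian PDFs with sigma = 0.1.
z_grid = np.linspace(0.0, 3.0, 301)
means = np.array([0.5, 1.0, 1.5])
pdfs = norm.pdf(z_grid[None, :], loc=means[:, None], scale=0.1)
z_true = np.array([0.5, 1.1, 1.3])

pit = pit_from_grid(z_grid, pdfs, z_true)
print(pit)  # close to 0.5, 0.84 and 0.02
```

A true redshift at the PDF's median gives PIT = 0.5, while true redshifts in the tails push the PIT toward 0 or 1.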

#### 2.1.1 PIT outlier rate

The PIT outlier rate is a global metric defined as the fraction of galaxies in the sample with extreme PIT values ($\mathrm{PIT} < 10^{-4}$ or $\mathrm{PIT} > 0.9999$).
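The definition above amounts to a simple count. The sketch below uses a hypothetical helper `pit_outlier_rate` and made-up PIT values; it is not necessarily how the Metrics class computes the rate.

```python
import numpy as np


def pit_outlier_rate(pit, pit_min=1e-4, pit_max=0.9999):
    """Fraction of PIT values below pit_min or above pit_max (hypothetical helper)."""
    pit = np.asarray(pit, dtype=float)
    return np.mean((pit < pit_min) | (pit > pit_max))


# Made-up PIT values: the first and fourth are extreme.
pit_values = np.array([0.00001, 0.2, 0.5, 0.99995, 0.7])
rate = pit_outlier_rate(pit_values)
print(rate)  # 2 outliers out of 5 galaxies
```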

### 2.2 PIT and QQ plots

In [None]:
plt.figure(figsize=[12,3])
metrics.plot_pit(bins=20, sp=131)
metrics.plot_pit(bins=60, sp=132)
metrics.plot_pit(bins=100, sp=133)
plt.subplots_adjust()

In [None]:
metrics.qq_vectors[0][0:10]

In [None]:
metrics.qq_vectors[1][0:10]

In [None]:
plt.figure(figsize=(4, 6))
metrics.plot_qq()
plt.subplots_adjust()

# 3. Metrics



### 3.2 CDE Loss

### 3.3 Kolmogorov-Smirnov  


Next, we calculate the Kolmogorov-Smirnov (KS) test statistic,
\begin{equation*}
\mathrm{KS}(\{p_{i}(z)\}_{N};\ \{z_{i}\}_{N})\ =\ \max_{PIT}\left[ \left| CDF(\{PIT(p_{i}(z);\ z_{i})\}_{N}) - CDF(\{z_{i}\}_{N}) \right| \right],
\end{equation*}
on the distribution of PIT values, which should be uniform if the PDFs are perfect.
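As a rough sketch of testing PIT values for uniformity, one option is `scipy.stats.kstest` with the uniform distribution on [0, 1] as the reference. This is an assumption about how one might do it, not the notebook's own implementation, and the PIT samples below are synthetic.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Synthetic PIT values (assumed): an ideal uniform sample, and a sample
# concentrated around 0.5, as overly broad PDFs would produce.
pit_uniform = rng.uniform(size=2000)
pit_peaked = rng.beta(2.0, 2.0, size=2000)

# One-sample KS test against the U(0, 1) reference distribution.
ks_uniform = stats.kstest(pit_uniform, "uniform")
ks_peaked = stats.kstest(pit_peaked, "uniform")

print(ks_uniform.statistic, ks_uniform.pvalue)
print(ks_peaked.statistic, ks_peaked.pvalue)
```

The statistic is the largest vertical distance between the empirical CDF of the PIT values and the diagonal, so the peaked sample yields the larger value.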

### 3.4 Cramer-von Mises

Similarly, we calculate the Cramer-von Mises (CvM) test statistic,
\begin{equation*}
\mathrm{CvM}(\{p_{i}(z)\}_{N};\ \{z_{i}\}_{N})\ =\ \int_{-\infty}^{\infty}\ \left(CDF(\{PIT(p_{i}(z);\ z_{i})\}_{N})\ -\ CDF(\{z_{i}\}_{N})\right)^{2}\ \mathrm{d}CDF(\{z_{i}\}_{N}),
\end{equation*}
on the distribution of PIT values, which should be uniform if the PDFs are perfect.
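A comparable sketch for the CvM statistic uses `scipy.stats.cramervonmises`, which is available in SciPy 1.6 and later. As before, the synthetic PIT samples and the choice of SciPy routine are assumptions for illustration, not the notebook's implementation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Synthetic PIT values (assumed): ideal uniform versus peaked around 0.5.
pit_uniform = rng.uniform(size=2000)
pit_peaked = rng.beta(2.0, 2.0, size=2000)

# One-sample Cramer-von Mises test against the U(0, 1) reference distribution.
cvm_uniform = stats.cramervonmises(pit_uniform, "uniform")
cvm_peaked = stats.cramervonmises(pit_peaked, "uniform")

print(cvm_uniform.statistic, cvm_uniform.pvalue)
print(cvm_peaked.statistic, cvm_peaked.pvalue)
```

Unlike KS, which reports only the largest deviation, CvM integrates the squared deviation over the whole PIT range, so it responds to discrepancies spread across the distribution.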

### 3.5 Anderson-Darling 

And the Anderson-Darling (AD) test statistic,
\begin{equation*}
\mathrm{AD}(\{p_{i}(z)\}_{N};\ \{z_{i}\}_{N})\ =\ \int_{-\infty}^{\infty}\frac{\left(CDF(\{PIT(p_{i}(z);\ z_{i})\}_{N})\ -\ CDF(\{z_{i}\}_{N})\right)^{2}}{CDF(\{z_{i}\}_{N})\ \left(1\ -\ CDF(\{z_{i}\}_{N})\right)}\ \mathrm{d}CDF(\{z_{i}\}_{N}),
\end{equation*}
on the distribution of PIT values, which should be uniform if the PDFs are perfect. However, for this test, we cut the ends of the distribution, which represent catastrophic outliers.
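The sketch below illustrates one way to compute an AD statistic against a uniform reference after cutting the ends of the PIT distribution. It uses the standard computing formula $A^{2} = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\left[\ln u_{(i)} + \ln(1 - u_{(n+1-i)})\right]$ on the retained, rescaled values. The helper `ad_uniform` and the cut values 0.01 and 0.99 are assumptions chosen for illustration; the cuts actually used by the metrics code are not stated here.

```python
import numpy as np


def ad_uniform(pit, pit_min=0.0, pit_max=1.0):
    """Anderson-Darling statistic of PIT values against a uniform reference.

    Hypothetical helper: keeps values strictly inside (pit_min, pit_max),
    rescales them to (0, 1), and applies the standard A^2 formula.
    """
    pit = np.asarray(pit, dtype=float)
    kept = pit[(pit > pit_min) & (pit < pit_max)]
    u = np.sort((kept - pit_min) / (pit_max - pit_min))
    n = u.size
    i = np.arange(1, n + 1)
    # u[::-1] is u_(n+1-i); log1p(-x) computes ln(1 - x) accurately.
    return -n - np.sum((2 * i - 1) * (np.log(u) + np.log1p(-u[::-1]))) / n


rng = np.random.default_rng(3)
pit_uniform = rng.uniform(size=2000)
pit_peaked = rng.beta(2.0, 2.0, size=2000)

# Illustrative cuts (assumed) that drop the extreme ends of the distribution.
ad_ok = ad_uniform(pit_uniform, pit_min=0.01, pit_max=0.99)
ad_bad = ad_uniform(pit_peaked, pit_min=0.01, pit_max=0.99)
print(ad_ok, ad_bad)
```

The cut matters because the AD weight $1/[CDF(1 - CDF)]$ grows without bound near 0 and 1, so a few catastrophic outliers at the extremes would otherwise dominate the statistic.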
