# producing $p(z | photometry)$ for ELAsTiCC

_Alex Malz (GCCL@RUB)_

The goal here is to generate mock photo-$z$ posteriors for host galaxies. 
Ideally, we want them to contain no assumptions not present in the $p(z, photometry)$ space from which they were drawn.
That's not really feasible...

TODO: explain why we can't do this

The next best thing is to aim for realistic complexity and to make assumptions as similar as possible to those of the underlying $p(z, photometry)$ model, by using [`pzflow`](https://github.com/jfcrenshaw/pzflow).

In [None]:
import corner
import numpy as np
import pandas as pd

import scipy.stats as sps

In [None]:
import matplotlib as mpl
mpl.rcParams['text.usetex'] = False
mpl.rcParams['mathtext.rm'] = 'serif'
mpl.rcParams['font.family'] = 'serif'
mpl.rcParams['font.serif'] = 'DejaVu Serif'
# mpl.rcParams['axes.titlesize'] = 16
# mpl.rcParams['axes.labelsize'] = 14
# mpl.rcParams['savefig.dpi'] = 250
# mpl.rcParams['savefig.format'] = 'pdf'
# mpl.rcParams['savefig.bbox'] = 'tight'
import matplotlib.pyplot as plt
plt.rcParams["font.family"] = "serif"
plt.rcParams["mathtext.fontset"] = "dejavuserif"

## the hostlibs

In [None]:
hl_heads = {'SNIa_GHOST': 18,
            'SNII_GHOST': 18, 
            'SNIbc_GHOST': 18, 
            'UNMATCHED_KN_SHIFT_GHOST_ABS': 18,
            'UNMATCHED_COSMODC2_GHOST': 19}

Let's pick one hostlib for now.

In [None]:
pick_one = 0

In [None]:
which_hl = list(hl_heads.keys())[pick_one]  # dict.keys is a method and its view is not indexable
hl_path = '/global/cfs/cdirs/lsst/groups/TD/SN/SNANA/SURVEYS/LSST/ROOT/PLASTICC_DEV/HOSTLIB/'+which_hl+'_PHOTOZ.HOSTLIB'
# skip 18 lines unless cosmodc2, then 19
df = pd.read_csv(hl_path, skiprows=hl_heads[which_hl], delimiter=' ', header=0)

`pzflow` needs a grid upon which to evaluate redshift posteriors. 
We use a fine grid now but will compress it for the alert stream later.
We can also check the redshift distribution of the hostlib.

TODO: investigate the prevalence at $z \sim 3$ and maybe ask to re-run?
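One way to carry out that check is to histogram the true redshifts and look at the fraction landing in the highest bin. The sketch below is only an illustration: `redshift_counts` is a hypothetical helper, the bin edges are an arbitrary choice, and `mock_ztrue` is synthetic data standing in for `df['ZTRUE']`, not real hostlib values.

```python
import numpy as np

def redshift_counts(ztrue, edges):
    """Histogram true redshifts and report the fraction in the highest bin."""
    counts, _ = np.histogram(ztrue, bins=edges)
    frac_top = counts[-1] / max(counts.sum(), 1)
    return counts, frac_top

# Synthetic stand-in for df['ZTRUE'] (an assumption, NOT real hostlib values):
# a broad population plus an artificial pile-up just below z = 3.
rng = np.random.default_rng(0)
mock_ztrue = np.concatenate([rng.uniform(0.01, 2.5, 900), np.full(100, 2.99)])

edges = np.linspace(0., 3., 7)  # six bins of width 0.5
counts, frac_top = redshift_counts(mock_ztrue, edges)
print(counts, frac_top)
```

With the real hostlib, `mock_ztrue` would be replaced by `df['ZTRUE'].values`; a large `frac_top` would flag the pile-up near $z \sim 3$.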

In [None]:
zgrid = np.logspace(-3., np.log10(3.), 300)

In [None]:
nhost = len(df)
# nhost = 100
df = df[:nhost]

In [None]:
# mini = True
# weirdo_galid = 10443470676

## forecasting-level photo-z

First, create the likelihood: a truncated Gaussian centered on each host's true redshift, with width $\sigma (1 + z)$.

In [None]:
sigma = 0.02

In [None]:
true_locs = df['ZTRUE'].values.reshape((nhost, 1))

In [None]:
# sps.norm(loc=[[0.], [1.], [2.]], scale=1).pdf([0.5,1.,1.5,2])
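# truncnorm's a and b are in units of scale relative to loc, so convert the z-range [0, 3]
like_scale = sigma * (1. + true_locs)
likelihood = sps.truncnorm(-true_locs / like_scale, (3. - true_locs) / like_scale, loc=true_locs, scale=like_scale)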
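
A note on the parameterization: `scipy.stats.truncnorm` takes its bounds `a` and `b` in units of `scale` relative to `loc`, not as absolute values. The sketch below assumes the intended support is $0 \leq z \leq 3$ and shows the conversion. The helper name `truncnorm_on_interval` and the three example redshifts are made up for illustration.

```python
import numpy as np
import scipy.stats as sps

def truncnorm_on_interval(loc, scale, lo=0., hi=3.):
    """Truncated normal whose support is the absolute interval [lo, hi]."""
    loc = np.asarray(loc, dtype=float)
    scale = np.asarray(scale, dtype=float)
    a = (lo - loc) / scale  # standardized lower bound
    b = (hi - loc) / scale  # standardized upper bound
    return sps.truncnorm(a, b, loc=loc, scale=scale)

sigma = 0.02
ztrue = np.array([[0.01], [1.0], [2.9]])  # illustrative values only
dist = truncnorm_on_interval(ztrue, sigma * (1. + ztrue))
support_lo, support_hi = dist.support()

# For comparison: passing 0 and 3 directly truncates to [loc, loc + 3 * scale].
naive = sps.truncnorm(0., 3., loc=1.0, scale=0.04)
naive_lo, naive_hi = naive.support()
```

Here `support_lo` and `support_hi` come out as 0 and 3 for every host, while the naive call is supported on roughly $[1.0, 1.12]$.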

Then draw a point estimate from the likelihood.

In [None]:
# np.shape(likelihood.pdf(zgrid))
obs_locs = likelihood.rvs()

Then make the forecasting posterior, centered on the point estimate.

In [None]:
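post_scale = sigma * (1. + obs_locs)  # truncnorm bounds are in units of scale relative to loc
posterior = sps.truncnorm(-obs_locs / post_scale, (3. - obs_locs) / post_scale, loc=obs_locs, scale=post_scale)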

Evaluate the deciles (10% through 90%) of each posterior.

In [None]:
quants = np.linspace(0., 1., 11)[1:-1]

In [None]:
zquants = posterior.ppf(quants)

In [None]:
df[['Z10', 'Z20', 'Z30', 'Z40', 'Z50', 'Z60', 'Z70', 'Z80', 'Z90']] = zquants
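
The quantile table relies on broadcasting: a frozen distribution with parameters of shape `(nhost, 1)` evaluated at 9 quantile levels returns an `(nhost, 9)` array, one row per host. The sketch below illustrates this with a plain Gaussian and three made-up locations, both of which are assumptions for the example rather than the posterior used above. It also derives the column names from the quantile levels.

```python
import numpy as np
import scipy.stats as sps

quants = np.linspace(0., 1., 11)[1:-1]  # 0.1, 0.2, ..., 0.9
colnames = [f"Z{int(round(100 * q))}" for q in quants]  # 'Z10', ..., 'Z90'

# Three illustrative hosts; a column vector so that ppf broadcasts per host.
locs = np.array([[0.5], [1.0], [1.5]])
mock_post = sps.norm(loc=locs, scale=0.02 * (1. + locs))

zq = mock_post.ppf(quants)  # shape (3, 9): rows are hosts, columns are quantiles
print(colnames)
print(zq.shape)
```

Each row of `zq` increases from left to right, and the middle column is the median.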

In [None]:
logp50 = posterior.logpdf(posterior.median())

In [None]:
df['logP50'] = np.ravel(logp50)  # flatten (nhost, 1) to one value per row

save

In [None]:
df.to_csv('/global/cfs/cdirs/lsst/groups/TD/SN/SNANA/SURVEYS/LSST/ROOT/PLASTICC_DEV/HOSTLIB/zquants/'+which_hl+'with_pz.csv')

## manual quantiles from pzflow model

In [None]:
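def quant_to_pdf(zgrid, qvals, zvals, zanch):
    raise NotImplementedError  # TODO: reconstruct a PDF on zgrid from quantiles

def pdf_to_quant():
    raise NotImplementedError  # TODO: evaluate quantiles of a gridded PDF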
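
As a rough idea of what these conversions could look like, here is a minimal sketch under several assumptions. The functions are named `sketch_pdf_to_quant` and `sketch_quant_to_pdf` to make clear they are not the intended implementations, the `zanch` argument is omitted because its role is not specified here, and the reconstruction is a simple piecewise-constant density between adjacent quantiles, which discards the tails outside the first and last quantile. The Gaussian test PDF is made up.

```python
import numpy as np

def sketch_pdf_to_quant(zgrid, pdf, qvals):
    """Quantiles of a gridded PDF via its trapezoid-rule CDF."""
    steps = 0.5 * (pdf[1:] + pdf[:-1]) * np.diff(zgrid)
    cdf = np.concatenate([[0.], np.cumsum(steps)])
    cdf /= cdf[-1]  # normalize so the CDF ends at 1
    return np.interp(qvals, cdf, zgrid)  # invert the CDF by interpolation

def sketch_quant_to_pdf(zgrid, qvals, zvals):
    """Piecewise-constant PDF between adjacent quantiles; zero outside them."""
    dens = np.diff(qvals) / np.diff(zvals)  # probability mass / interval width
    idx = np.searchsorted(zvals, zgrid, side='right') - 1
    inside = (idx >= 0) & (idx < len(dens))
    pdf = np.zeros_like(zgrid, dtype=float)
    pdf[inside] = dens[idx[inside]]
    return pdf

# Illustrative round trip on a made-up Gaussian PDF.
zgrid = np.linspace(0., 3., 3001)
pdf = np.exp(-0.5 * ((zgrid - 1.0) / 0.05) ** 2)
qvals = np.linspace(0., 1., 11)[1:-1]
zvals = sketch_pdf_to_quant(zgrid, pdf, qvals)
pdf_back = sketch_quant_to_pdf(zgrid, qvals, zvals)
```

In this sketch `pdf_back` integrates to about 0.8 rather than 1, since only the mass between the 10% and 90% quantiles is represented; a real implementation would need some treatment of the tails.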