# n(z) histograms from Spectroscopic Reduction

This tutorial is a worked example of how to make an n(z) histogram
for a production run of the spectroscopic pipeline. It shows you how the per-tile per-night per-spectrograph
"zbest" redshift files are organized, and gives you some details about working
with commissioning (CMX) and survey validation (SV) target bits and "fibermaps".

In [None]:
#- Basic imports
import sys, os, glob
import numpy as np
import fitsio
from astropy.table import Table

from desitarget.cmx.cmx_targetmask import cmx_mask
from desitarget.sv1.sv1_targetmask import desi_mask

import matplotlib.pyplot as plt

## Spectro pipeline production directories

Official DESI spectro pipeline runs are located at NERSC under `/global/cfs/cdirs/desi/spectro/redux`,
grouped by a "specprod" name. For example:
* "daily" : the daily spectroscopic reductions after the data transfer from KPNO.
* "andes" : production run from May 2020.
* "blanc" : production run from Jan 2021.
* ...

Let's use the "blanc" run. If you want to browse the outputs on the web, see
https://data.desi.lbl.gov/desi/spectro/redux/blanc .  Search the DESI wiki for "collaboration username" (in quotes) to find
the authentication info.

In [None]:
specprod = 'blanc'
specprod_dir = os.path.join(os.environ['DESI_SPECTRO_REDUX'], specprod)
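
The cell above raises a `KeyError` if `DESI_SPECTRO_REDUX` is not set, which can happen outside the standard DESI environment. Below is a minimal sketch of a fallback, assuming the NERSC path quoted above is a sensible default. The helper name `get_specprod_dir` and its `environ` argument are hypothetical and not part of any DESI package.

```python
import os

#- Default taken from the NERSC location quoted in this tutorial
#- (an assumption; adjust it for any other site)
DEFAULT_REDUX = '/global/cfs/cdirs/desi/spectro/redux'

def get_specprod_dir(specprod, environ=None):
    """Return the directory of a production run (hypothetical helper)."""
    if environ is None:
        environ = os.environ
    #- Fall back to the default instead of raising KeyError
    redux = environ.get('DESI_SPECTRO_REDUX', DEFAULT_REDUX)
    return os.path.join(redux, specprod)

#- Passing an empty mapping simulates an unset environment variable
print(get_specprod_dir('blanc', environ={}))
```

Passing the environment as an argument keeps the helper easy to test without modifying `os.environ`.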

Andes did not include a combined redshift catalog across all tiles and all nights (sorry...),
and this tutorial takes the same per-file approach for blanc: we load the individual "zbest" files for each tile.
These are grouped as `tiles/TILEID/NIGHT/zbest-SPECTROGRAPH-TILEID-NIGHT.fits`, where NIGHT is in YEARMMDD format.
See https://desi.lbl.gov/trac/wiki/TargetSelectionWG/SV0 for a human-friendly summary of what was observed when.

In [None]:
zbfiles = sorted(glob.glob(specprod_dir+'/tiles/*/20*/zbest*.fits'))
print('{} zbest files found'.format(len(zbfiles)))

In [None]:
#- Print a few of them as an example
zbfiles[0::50]
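
The file names encode the spectrograph, tile, and night, so they can be parsed without opening the files. This is a sketch that assumes all three fields are purely numeric and that the night has eight digits, following the naming pattern described above. The function name `parse_zbest_filename` and the example path are made up for illustration.

```python
import os
import re

#- Pattern follows the zbest-SPECTROGRAPH-TILEID-NIGHT.fits convention;
#- numeric fields and an 8-digit night are assumptions
_zbest_re = re.compile(r'zbest-(\d+)-(\d+)-(\d{8})\.fits$')

def parse_zbest_filename(filename):
    """Return (spectrograph, tileid, night) as ints, or None if no match."""
    match = _zbest_re.search(os.path.basename(filename))
    if match is None:
        return None
    spectrograph, tileid, night = (int(x) for x in match.groups())
    return spectrograph, tileid, night

#- Made-up path, not a real tile or night
example = '/some/dir/tiles/12345/20200101/zbest-0-12345-20200101.fits'
print(parse_zbest_filename(example))
```

Applied to `zbfiles`, this would let you count files per tile or per night before reading any data.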

Even a relatively small production like this one has a lot of data, which is why you should
get used to working at NERSC (e.g. via Jupyter) instead of always starting by downloading the data
locally.

In [None]:
!du -hs $specprod_dir/tiles

## Reading the zbest files

Now let's loop over all the zbest files, load their redshifts table (ZBEST), plus their "FIBERMAP" table that gives
further information about each target, and accumulate their good redshifts.

### Target selection bitmasks

Target selection bitmasks identify which targets are which.

Since we have a combination of commissioning (CMX) and early survey validation (SV0) tiles,
as well as the post-COVID restart of survey validation (SV1), we need to adjust the bitmasks
depending on the input.

For more details on working with targeting bits, see
https://github.com/desihub/desitarget/blob/master/doc/nb/target-selection-bits-and-bitmasks.ipynb
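
To illustrate how a bitmask test works before applying it to real data, here is a small sketch in plain NumPy. The bit assignments below are hypothetical and chosen only for illustration. The real values come from the `cmx_mask` and `desi_mask` objects in desitarget.

```python
import numpy as np

#- Hypothetical bit assignments; NOT the real desitarget values
LRG = 1 << 0
ELG = 1 << 1
QSO = 1 << 2

#- One integer per target; a target can have several bits set at once
target_bits = np.array([LRG, ELG, LRG | QSO, 0, QSO])

#- A target belongs to a class if the bitwise AND with that mask is non-zero
is_lrg = (target_bits & LRG) != 0
is_qso = (target_bits & QSO) != 0

#- OR-ing masks together selects targets in either class
is_lrg_or_elg = (target_bits & (LRG | ELG)) != 0

print(is_lrg)
print(is_qso)
print(is_lrg_or_elg)
```

The third target has both the LRG and QSO bits set, so it passes both tests. This is why the selections in this tutorial are not mutually exclusive.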

In [None]:
#- redshift lists that we will extend
zbgs = list()
zlrg = list()
zelg = list()
zqso = list()
nstar = 0   #- Sorry, MWS, we're just going to count stars

#- Since a TARGETID could be observed more than once, keep track of ones that we've already
#- seen before, and only take the first redshift that we find
previous_targetids = list()

num_zbfiles = len(zbfiles)
for i, filename in enumerate(zbfiles):
    if i%100 == 0:
        print(f'{i}/{num_zbfiles}')

    #- The ZBEST HDU contains the redshift fits
    zb = fitsio.read(filename, 'ZBEST')
    
    #- The FIBERMAP HDU contains information about which targets are assigned to which positioners
    fm = fitsio.read(filename, 'FIBERMAP')

    #- ZBEST has one entry per target, while FIBERMAP has one entry per target per exposure
    #- Get one FIBERMAP row per target, sorted by TARGETID, so that the targeting
    #- bitmask column can be row-matched to ZBEST (checked by the assert below)
    targetid, ii = np.unique(fm['TARGETID'], return_index=True)
    assert np.all(zb['TARGETID'] == fm['TARGETID'][ii])
    
    # Choose CMX/SV0 or SV1 bitmasks on the fly.
    if 'SV1_DESI_TARGET' in fm.dtype.names:
        targetcol = 'SV1_DESI_TARGET'
        bgsMask = desi_mask.mask('BGS_ANY')
        lrgMask = desi_mask.mask('LRG')
        elgMask = desi_mask.mask('ELG')
        qsoMask = desi_mask.mask('QSO')
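    elif 'CMX_TARGET' in fm.dtype.names:
        targetcol = 'CMX_TARGET'
        bgsMask = cmx_mask.mask('SV0_BGS')
        lrgMask = cmx_mask.mask('MINI_SV_LRG|SV0_LRG')
        elgMask = cmx_mask.mask('MINI_SV_ELG|SV0_ELG')
        qsoMask = cmx_mask.mask('MINI_SV_QSO|SV0_QSO')
    else:
        #- Neither targeting column is present; skip rather than reuse stale masks
        print(f'Skipping {filename}: no SV1_DESI_TARGET or CMX_TARGET column')
        continue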
    
    desi_target = fm[targetcol][ii]

    #- ZWARN==0 means no warnings means redrock thinks it's a good redshift
    #- identify ZWARN==0 per spectral classification
    isGal = (zb['SPECTYPE'] == 'GALAXY') & (zb['ZWARN'] == 0)
    isQSO = (zb['SPECTYPE'] == 'QSO') & (zb['ZWARN'] == 0)
    isStar = (zb['SPECTYPE'] == 'STAR') & (zb['ZWARN'] == 0)
    
    #- Good targets that we haven't seen in a previous zbest file (e.g. from an earlier night)
    #- np.isin is the recommended replacement for the older np.in1d
    isNew = np.isin(targetid, previous_targetids, invert=True) & (zb['ZWARN'] == 0)

    #- Count stars
    nstar += np.count_nonzero(isStar & isNew)

    #- collect redshifts
    isBGS = isNew & isGal & ((desi_target & bgsMask) != 0)
    isLRG = isNew & isGal & ((desi_target & lrgMask) != 0)
    isELG = isNew & isGal & ((desi_target & elgMask) != 0)
    isQSO = isNew & isQSO & ((desi_target & qsoMask) != 0)

    zbgs.append(zb['Z'][isBGS])
    zlrg.append(zb['Z'][isLRG])
    zelg.append(zb['Z'][isELG])
    zqso.append(zb['Z'][isQSO])

    #- Keep track of ones we've seen before
    previous_targetids.extend(targetid[isNew])

#- convert those into 1D arrays of redshifts per target class
zbgs = np.concatenate(zbgs)
zlrg = np.concatenate(zlrg)
zelg = np.concatenate(zelg)
zqso = np.concatenate(zqso)
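
The two bookkeeping steps inside the loop, collapsing the per-exposure FIBERMAP to one row per target and skipping targets already seen, can be checked on toy arrays. The TARGETIDs and bit values below are made up. The sketch assumes, as the `assert` in the loop does, that the redshift table is sorted by TARGETID.

```python
import numpy as np

#- Toy fibermap: two exposures of the same three targets (made-up values)
fm_targetid = np.array([30, 10, 20, 30, 10, 20])
fm_bits     = np.array([ 4,  1,  2,  4,  1,  2])

#- Toy redshift table: one row per target, sorted by TARGETID
zb_targetid = np.array([10, 20, 30])

#- np.unique returns the sorted unique values plus the index of the
#- first occurrence of each one in the input array
targetid, ii = np.unique(fm_targetid, return_index=True)
assert np.all(zb_targetid == fm_targetid[ii])
bits = fm_bits[ii]      #- now row-matched to the redshift table

#- Pretend TARGETID 20 was already collected from an earlier file
previous_targetids = [20]
is_new = np.isin(targetid, previous_targetids, invert=True)

print(targetid[is_new], bits[is_new])
```

With many files, keeping the seen TARGETIDs in a Python list makes each membership test scan the whole list. Accumulating them in a NumPy array or a `set` is a possible optimization if the loop becomes slow.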

In [None]:
print('# Star:', nstar)
print('# BGS: ', len(zbgs))
print('# LRG: ', len(zlrg))
print('# ELG: ', len(zelg))
print('# QSO: ', len(zqso))

## Let's make some plots

In [None]:
plt.figure(figsize=(8,6))
plt.subplot(411)
n = plt.hist(zbgs, 35, (0, 3.5), color='C4')[0]
plt.text(3.6, int(0.8*np.max(n)), f'{len(zbgs)} BGS', ha='right')
plt.title(f'{specprod} n(z)')

plt.subplot(412)
n = plt.hist(zlrg, 35, (0, 3.5), color='C3')[0]
plt.text(3.6, int(0.8*np.max(n)), f'{len(zlrg)} LRG', ha='right')

plt.subplot(413)
n = plt.hist(zelg, 35, (0, 3.5), color='C2')[0]
plt.text(3.6, int(0.8*np.max(n)), f'{len(zelg)} ELG', ha='right')

plt.subplot(414)
n = plt.hist(zqso, 35, (0, 3.5), color='C0')[0]
plt.text(3.6, int(0.8*np.max(n)), f'{len(zqso)} QSO', ha='right')

plt.xlabel('redshift')
plt.tight_layout()
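
If you want the binned counts themselves rather than a figure, `np.histogram` accepts the same binning as the plots above. The sketch below uses fake, uniformly distributed redshifts as a stand-in for an array such as `zlrg`. The variable names are illustrative only.

```python
import numpy as np

#- Fake redshifts standing in for an array such as zlrg
rng = np.random.default_rng(0)
z = rng.uniform(0.0, 3.5, size=1000)

#- Same binning as the plots: 35 bins over 0 <= z <= 3.5
counts, edges = np.histogram(z, bins=35, range=(0, 3.5))

#- Bin centers are handy for line plots or for saving n(z) to a table
centers = 0.5 * (edges[:-1] + edges[1:])

#- Fraction of objects per bin, which makes target classes comparable
fraction = counts / counts.sum()

print(len(counts), counts.sum())
```

Normalizing to a fraction per bin makes it easier to compare the shapes of n(z) for target classes with very different total counts.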

## Exercises
  * Make a radial velocity histogram for stars instead of just counting them.
  * Explore the contents of the FIBERMAP and ZBEST HDUs to make plots such as:
    * fraction of ZWARN==0 (good) redshifts vs. FLUX_R
    * g-r color vs. redshift
    * histogram of the number of exposures per target
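
As a starting point for the radial velocity exercise, here is a sketch that assumes the stellar `Z` values are small enough for the non-relativistic approximation v = c z to hold. The helper name `radial_velocity` and the example redshifts are made up for illustration.

```python
import numpy as np

C_KM_S = 299792.458  #- speed of light in km/s

def radial_velocity(z):
    """Approximate radial velocity in km/s for small redshifts, v = c*z."""
    return C_KM_S * np.asarray(z, dtype=float)

#- Made-up stellar redshifts; negative values are approaching stars
zstar = np.array([-1.0e-4, 0.0, 2.0e-4])
print(radial_velocity(zstar))
```

To use it, you would collect `zb['Z'][isStar & isNew]` in the main loop in the same way the galaxy and QSO redshifts are collected, then histogram the converted velocities.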