# Core Concepts

## SURVEY and PROGRAM

DESI observations are organized by
  * **SURVEY** = Phases of DESI observations with a common goal, e.g. "main" or "sv1"
  * **PROGRAM** = Subsets of SURVEYs split by observing conditions, e.g. "dark" or "bright"

**Why this matters**: Data processing groups data by SURVEY and PROGRAM on disk,
and does not combine data across SURVEY and PROGRAM even if it is the same
object on the sky.  This keeps the different scientific goals of DESI
independent of each other.  When analyzing DESI data, you need to know what
SURVEY and PROGRAM you care about.
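
As a minimal sketch of what this means in practice, the toy rows below show the same target appearing under two different SURVEY and PROGRAM combinations. The analysis picks one combination explicitly rather than mixing them. The column names, target IDs, and redshifts are made up for illustration and are not real DESI data or a real DESI file format.

```python
# Toy rows; all values are made up for illustration, not real DESI data.
rows = [
    {"targetid": 1001, "survey": "sv1",  "program": "dark",   "z": 0.812},
    {"targetid": 1001, "survey": "main", "program": "dark",   "z": 0.815},
    {"targetid": 2002, "survey": "main", "program": "bright", "z": 0.153},
]

# Choose the SURVEY and PROGRAM once, then keep only the matching rows.
survey, program = "main", "dark"
selected = [r for r in rows if r["survey"] == survey and r["program"] == program]

for r in selected:
    print(r["targetid"], r["survey"], r["program"], r["z"])
```

Target 1001 has two independent entries, one per SURVEY; only the "main"/"dark" entry survives the selection.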

The primary **SURVEY**s and **PROGRAM**s in DESI are:

| SURVEY | Purpose |
| :---- | :--- |
| Survey Validation 1 (sv1) | Tune cuts for target selection; extra high S/N data |
| Survey Validation 3 (sv3) | Many overlapping observations to get all targets on a given patch of sky ("highly complete") |
| Main (main)               | The core cosmology program of DESI |

| PROGRAM | Purpose |
| :----  | :--- |
| dark   | Best observing conditions for faintest targets: ELG, LRG, QSO |
| bright | Moon up / poor seeing / poor transparency: BGS, MWS |
| backup | Very bad conditions: bright stars |



## Tiles and Healpix

A DESI "tile" is a specific pointing of the telescope and fiber positioners to
observe a specific set of objects on the sky.  Tiles are associated with a
SURVEY and PROGRAM.  Tiles are observed with one or more exposures on
one or more nights until they have achieved a consistent signal-to-noise (S/N) goal
set by the SURVEY+PROGRAM. Since a single tile cannot observe all the targets on
a given patch of sky, the DESI tiles overlap so that if a given target is not observed
on one tile, it gets another chance on a future overlapping tile.

Some targets are observed on multiple tiles to get more S/N than they would get on
a single tile, e.g. Lyman-alpha QSOs at z>2.  In this case we want to coadd data
across tiles. Some science studies also want all spectra in a single patch of sky
and it would be a pain to have to look up and read N>>1 separate tile files just to
get those spectra.  For these reasons, data processing also groups spectra by
"healpix", which is a diamond-shaped tessellation of the sky.  All spectra in a given
(healpix, survey, program) are grouped together into files and coadded.

**Why this matters**: If you want the highest S/N data combined across all observations, you want to use the healpix data.  If you need to track performance vs. time or are working with
custom observations on a special tile, you want the tile data.
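
The difference between the two groupings can be sketched with a few toy observations. The tile IDs, healpix numbers, and target IDs below are invented for illustration, and the dictionary keys are assumptions rather than real DESI column names.

```python
from collections import defaultdict

# Toy observations; every value here is made up for illustration.
observations = [
    {"targetid": 1001, "tileid": 10, "healpix": 500, "survey": "main", "program": "dark"},
    {"targetid": 1001, "tileid": 11, "healpix": 500, "survey": "main", "program": "dark"},
    {"targetid": 3003, "tileid": 11, "healpix": 501, "survey": "main", "program": "dark"},
]

by_tile = defaultdict(list)     # one group per tile
by_healpix = defaultdict(list)  # one group per (healpix, survey, program)
for obs in observations:
    by_tile[obs["tileid"]].append(obs["targetid"])
    key = (obs["healpix"], obs["survey"], obs["program"])
    by_healpix[key].append(obs["targetid"])

print(dict(by_tile))
print(dict(by_healpix))
```

Target 1001 is split across tiles 10 and 11 in the tile grouping, but both of its observations land in the same (healpix, survey, program) group, which is the set that would be coadded.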

**Digging Deeper**: The [DESI_petal_healpix_rosette](https://github.com/desihub/tutorials/blob/main/getting_started/DESI_petal_healpix_rosette.ipynb) tutorial explores these ideas in more detail including reading and plotting targets grouped by tiles vs. healpix.

## Petals, Spectrographs, Cameras, and Fibers

The DESI focal plane is divided into 10 separate "petals".
Each petal has 500 fibers which map to a single "spectrograph".
The petal number [0-9] is the same as the spectrograph number [0-9]
and in practice these are used interchangeably.
The `10*500=5000` DESI fibers are mapped to the spectrographs such that
```
PETAL = SPECTROGRAPH = FIBER//500
```

Each spectrograph has 3 "cameras" which split the light by blue (b), red (r), and near-infrared (z) bands.  These cameras are named by the band+spectrograph, e.g. "b0", "r1", "z9".
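
A short sketch of the fiber-to-petal mapping and the camera naming described above. The helper names `petal_of_fiber` and `camera_names` are hypothetical, not part of any DESI package, and the fiber range 0..4999 is inferred from the 10 petals of 500 fibers each.

```python
def petal_of_fiber(fiber):
    """Return the petal (= spectrograph) number for a fiber in 0..4999."""
    if not 0 <= fiber < 5000:
        raise ValueError(f"fiber {fiber} is outside 0..4999")
    return fiber // 500


def camera_names(spectrograph):
    """Return the camera names (band + spectrograph number) for one spectrograph."""
    return [f"{band}{spectrograph}" for band in "brz"]


for fiber in (0, 499, 500, 4999):
    petal = petal_of_fiber(fiber)
    print(fiber, petal, camera_names(petal))
```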


**Caveat**: if you get involved in hardware operations, there is a different numbering
scheme for the hardware spectrographs `smN` developed while they were being manufactured,
before they were plugged in to the petals.  Most people do not need to know about this
distinction.

## Spectra and Catalogs

The core DESI data are *spectra*, i.e. flux vs. wavelength.
When we measure quantities from spectra, like the redshift or the flux in emission lines,
these measurements can be grouped into tables in *catalogs*.  Many analyses can be
performed on catalogs generated by others without ever needing to read the much larger spectra files.

## Mountains and spectroscopic productions

DESI data processing runs are named after mountains, alphabetically increasing with time.
A given mountain or "spectroscopic production run" (specprod) represents a self-consistent processing of the data with a set of code tags.

Productions are located at NERSC under
```
/global/cfs/cdirs/desi/spectro/redux/$SPECPROD
```

It is good practice for all of your scripts and notebooks to set the production directory once at the very top instead of hardcoding e.g. "kibo" in many places.  This makes it easier to switch from one production to a newer one, e.g. "loa".  It's even better to build this path from the environment variable `$DESI_ROOT` (`/global/cfs/cdirs/desi` at NERSC) so that you can copy a subset of the data to your laptop or home institution and the same scripts still work without a bunch of NERSC-specific hardcoded paths.

For example:

In [5]:
import os
specprod = 'loa'
desi_root = os.environ['DESI_ROOT']
datadir = f'{desi_root}/spectro/redux/{specprod}'
print(f'Using data in {datadir}')

Using data in /global/cfs/cdirs/desi/spectro/redux/loa


If you follow that pattern in all of your notebooks and scripts, it will be much easier to re-run your
analysis on future productions.
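
One possible way to package this pattern is a small helper, sketched below. The function name `production_dir` is hypothetical, and falling back to the NERSC location when `$DESI_ROOT` is unset is an assumption made for this example; you may prefer to fail loudly instead.

```python
import os


def production_dir(specprod, default_root="/global/cfs/cdirs/desi"):
    """Build the production directory from $DESI_ROOT.

    Falls back to default_root (the NERSC location) if $DESI_ROOT is not set;
    that fallback is a choice made for this sketch.
    """
    desi_root = os.environ.get("DESI_ROOT", default_root)
    return os.path.join(desi_root, "spectro", "redux", specprod)


datadir = production_dir("loa")
print(f"Using data in {datadir}")
```

Switching productions then means changing the single `specprod` argument at the top of a script or notebook.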