# Introduction to Survey Simulations

The goal of this notebook is to introduce you to the outputs available from DESI "survey simulations". These are the fastest type of DESI simulation to run and only involve the following components:
 - Simulated stochastic weather (seeing, transparency, ...). See [DESI-3087](https://desi.lbl.gov/DocDB/cgi-bin/private/ShowDocument?docid=3087) for details.
 - Afternoon planning, which updates tile observing priorities and schedules fiber assignment.
 - Next tile selector, which determines which tile to observe next, based on recent progress and current weather.
 
The software for these components is mostly in the [desisurvey](https://desisurvey.readthedocs.io) and [surveysim](https://surveysim.readthedocs.io) packages. Note that survey simulations operate at the level of tiles, not targets: they never generate spectra or redshifts and do not refer to any input catalog. For a recent overview of the different DESI simulation types, see [DESI-3377](https://desi.lbl.gov/DocDB/cgi-bin/private/ShowDocument?docid=3377).

This tutorial focuses on using the outputs of a survey simulation. After working with the outputs from some existing simulations, you might want to run your own survey simulations; a tutorial for that is [here](http://surveysim.readthedocs.io/en/latest/tutorial.html). For other tutorials, covering topics such as simulating your own DESI spectra, see [this list](https://github.com/desihub/tutorials/blob/master/README.md).

For general questions and suggestions on this tutorial, email desi-data@desi.lbl.gov. For more specific suggestions or bug reports, please [create a github issue](https://github.com/desihub/tutorials/issues).

## Getting Started

This notebook is optimized for use with the jupyter-dev service at NERSC, which provides pre-installed DESI software running in a jupyter notebook. If this is your first time using jupyter-dev at NERSC, follow [these instructions](https://desi.lbl.gov/trac/wiki/Computing/JupyterAtNERSC) to get it configured.

If you prefer to work on your laptop, you will need to [install the necessary DESI software locally](https://desi.lbl.gov/trac/wiki/Pipeline/GettingStarted/Laptop).

**If you are working through this notebook in a live jupyter session, I recommend removing all the output below for a more interactive experience.** Use the "Cell > Current Outputs > Clear" menu item.

**There are several exercises below for you to work on once you master the basics.**

#### DESI Version Compatibility

- 2017-12-04 : tested using the `DESI master` kernel on jupyter-dev with the `surveysim2017/depth_0m/` outputs.
- 2018-03-30 : tested using the `DESI 18.3` kernel on jupyter-dev with the `surveysim2017/depth_0m/` outputs (which were generated with an earlier version of the code).
- 2018-07-20 : tested using the `DESI 18.6` kernel on jupyter-dev with the `surveysim2017/depth_0m/` outputs (which were generated with an earlier version of the code).
- 2018-10-15 : tested using the `DESI 18.7` kernel on jupyter-dev with the `surveysim2017/depth_0m/` outputs (which were generated with an earlier version of the code).

### Load Modules

Import numpy and matplotlib and draw plots directly to the notebook:

In [None]:
%pylab inline

Import the `desisurvey` modules we need below:

In [None]:
import desisurvey.progress
import desisurvey.utils
import desisurvey.plots

Ignore expected harmless warnings (or don't run these lines if you prefer to see them):

In [None]:
import warnings, matplotlib.cbook, astropy._erfa.core
warnings.filterwarnings('ignore', category=matplotlib.cbook.mplDeprecation)
warnings.filterwarnings('ignore', category=astropy._erfa.core.ErfaWarning)

### Find Simulation Outputs

Identify which survey simulation you want to study by setting the `$DESISURVEY_OUTPUT` environment variable:
 - `depth_0m`: Simulates a simple depth-first survey strategy for one random weather realization.
 - `baseline_1m`: Simulates the baseline survey strategy described in DESI-doc-1767-v3 for one random weather realization.
 
The `0m` and `1m` in the name refer to the fiber assignment (FA) cadence policy (see [DESI-3194](https://desi.lbl.gov/DocDB/cgi-bin/private/ShowDocument?docid=3194) for details):
 - `m` indicates that FA is run on a monthly cadence (during the 7-night full moon shutdowns).  Other options are `d` (daily) and `q` (quarterly = every 3 full moons).
 - `0` or `1` indicates the number of complete (daily/monthly/quarterly) cycles that must elapse after a tile is completely covered by earlier tiles (so that decisions about reobserving QSOs, etc., can be made) before it will have its fibers assigned. A delay of `0` indicates that FA is run as soon as possible, at the next cycle.
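
The naming convention above can be decoded programmatically. The following is a minimal sketch, assuming names always have the form `<strategy>_<delay><cadence>`; `parse_simulation_name` is a hypothetical helper written for illustration and is not part of `desisurvey`.

```python
import re

# Cadence letters as described in the list above.
CADENCES = {'d': 'daily', 'm': 'monthly', 'q': 'quarterly'}

def parse_simulation_name(name):
    """Split a name like 'depth_0m' into (strategy, delay, cadence).

    Hypothetical helper for illustration; not part of desisurvey.
    """
    match = re.fullmatch(r'(\w+)_(\d+)([dmq])', name)
    if match is None:
        raise ValueError('Unrecognized simulation name: {!r}'.format(name))
    strategy, delay, letter = match.groups()
    return strategy, int(delay), CADENCES[letter]

print(parse_simulation_name('depth_0m'))
print(parse_simulation_name('baseline_1m'))
```

The sketch accepts any non-negative integer delay, although only `0` and `1` are described here.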
 
Note that `$DESISURVEY_OUTPUT` is only read the first time you use a `desisurvey` function, so the easiest way to make a change below take effect is to restart the jupyter kernel and re-run the initial cells.

In [None]:
import os
os.environ['DESISURVEY_OUTPUT'] = '/global/projecta/projectdirs/desi/datachallenge/surveysim2017/depth_0m/'

## Survey Progress

The main output from a survey simulation is a FITS file that records the simulated survey progress. Progress is organized around **tiles** and **exposures**.  Tiles are predefined ([DESI-717](https://desi.lbl.gov/DocDB/cgi-bin/private/ShowDocument?docid=717)) to cover the whole survey footprint in 8 dithered passes. Each tile is observed with one or more exposures.  Multiple exposures of a tile are sometimes required to:
 - Split a long exposure to minimize the impact of cosmic rays.
 - Continue an exposure that is terminated early due to a program change (or dawn).
 - Continue an exposure that is found to have insufficient signal to noise after pipeline processing.
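
To make the first reason concrete, here is a minimal sketch of splitting a long exposure into equal parts. Both the equal-split rule and the `max_exptime` limit are assumptions made for illustration; the rule actually used by the simulation is not described in this notebook.

```python
import math

def split_exposure(total_exptime, max_exptime):
    """Split a total exposure time [s] into equal parts no longer than max_exptime.

    Hypothetical rule for illustration only.
    """
    num_parts = max(1, math.ceil(total_exptime / max_exptime))
    return [total_exptime / num_parts] * num_parts

# A 2400 s tile with an assumed 1000 s limit needs three exposures.
print(split_exposure(2400, 1000))
# A 900 s tile fits in a single exposure.
print(split_exposure(900, 1000))
```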

After setting `$DESISURVEY_OUTPUT`, load the corresponding progress using:

In [None]:
progress = desisurvey.progress.Progress(restore='progress.fits')

The returned object has some useful summary methods and attributes, for example:

In [None]:
print('Survey runs {} to {} and observes {} tiles with {} exposures'
      .format(
          desisurvey.utils.get_date(progress.first_mjd),
          desisurvey.utils.get_date(progress.last_mjd),
          progress.num_tiles, progress.num_exp))

Note that progress uses MJD timestamps internally, which can be converted to dates using [`desisurvey.utils.get_date()`](http://desisurvey.readthedocs.io/en/latest/api.html?highlight=get_date#desisurvey.utils.get_date).
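
If you want to sanity-check an MJD value without the DESI software, the conversion itself is simple: MJD counts days since 1858-11-17 00:00 UTC. The sketch below uses only the standard library and ignores leap seconds. It is not a substitute for `get_date()`, which may apply its own convention for assigning an observing night to a calendar date.

```python
import datetime

# MJD 0 corresponds to 1858-11-17 00:00 UTC.
MJD_EPOCH = datetime.datetime(1858, 11, 17)

def mjd_to_datetime(mjd):
    """Convert an MJD value to a naive UTC datetime (leap seconds ignored)."""
    return MJD_EPOCH + datetime.timedelta(days=mjd)

print(mjd_to_datetime(51544.0))   # start of the year 2000
print(mjd_to_datetime(58849.5))   # noon UTC on 2020-01-01
```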

In [None]:
def progress_report(progress=progress):
    for program, passes in dict(DARK=(0,1,2,3), GRAY=(4,), BRIGHT=(5,6,7)).items():
        stats = progress.completed(only_passes=passes, as_tuple=True)
        print('Observed {:6.1f} / {} tiles ({:.1f}%) of the {} program '
              .format(*stats, program))
        
progress_report()
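
The grouping of passes into programs used in `progress_report()` above is handy in other places too, for example when splitting a table by program. As a small sketch, the same grouping can be inverted into a pass-to-program lookup; the names `PROGRAM_PASSES`, `PASS_PROGRAM` and `program_of` are introduced here for illustration and are not part of `desisurvey`.

```python
# Same grouping of passes into programs as in progress_report() above.
PROGRAM_PASSES = dict(DARK=(0, 1, 2, 3), GRAY=(4,), BRIGHT=(5, 6, 7))

# Invert the grouping into a pass -> program lookup.
PASS_PROGRAM = {
    pass_num: program
    for program, passes in PROGRAM_PASSES.items()
    for pass_num in passes
}

def program_of(pass_num):
    """Return the program name for a pass number in the range 0-7."""
    return PASS_PROGRAM[pass_num]

print([program_of(p) for p in range(8)])
```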

The progress after the survey completes (100%!) is not very interesting, but you can also see progress over any time interval using the `copy_range()` method:

In [None]:
help(progress.copy_range)

Use this method to get a progress report after the first year of the survey:

In [None]:
year1 = progress.copy_range(progress.first_mjd, progress.first_mjd + 365)
progress_report(year1)

The progress object is internally organized as an [astropy table](http://docs.astropy.org/en/stable/table/) with one row per tile:

In [None]:
progress._table[:3]

The structure of this table is designed around the needs of the operations software, but some user-oriented views are provided by two methods:

In [None]:
help(progress.get_summary)

In [None]:
help(progress.get_exposures)

Create these views now. The next sections show how to use them.

In [None]:
summary = progress.get_summary()

In [None]:
exposures = progress.get_exposures()

## Progress Summary

The summary table has one row per tile containing summary statistics of all exposures (if any) of that tile:

In [None]:
summary[:3]

The primary metric used to set the goal total exposure time for each tile is signal-to-noise ratio (SNR) for a set of predefined "threshold targets":
 - DARK & GRAY programs: ELGs with integrated \[OII\] flux of 8e-17 erg/(s cm^2)
 - BRIGHT program: BGS targets with r=19.5 and no emission lines
 
Plot the ratio of actual / goal SNR for each tile:

In [None]:
plt.hist(summary['snr2frac'], range=(0.75, 1.25), bins=25)
plt.xlabel('Tile SNR(actual) / SNR (goal)')
plt.axvline(np.median(summary['snr2frac']), c='r');

Plot the corresponding total exposure times, which show two peaks: one for the BRIGHT program and one for the DARK+GRAY programs:

In [None]:
plt.hist(summary['exptime'] / 60, range=(0, 60), bins=30)
plt.xlabel('Tile Total Exposure Time [min]')
plt.axvline(np.median(summary['exptime'] / 60), c='r');

To plot the distribution of any column's values over the sky, separately for each of the 8 passes, use `plot_sky_passes`:

In [None]:
help(desisurvey.plots.plot_sky_passes)

For example, to see the distributions of SNR(actual) / SNR(goal) over the sky after year 1 (this function takes ~30s to run):

In [None]:
year1_summary = year1.get_summary()
desisurvey.plots.plot_sky_passes(
    year1_summary['ra'], year1_summary['dec'], year1_summary['pass'],
    year1_summary['snr2frac'], label='Tile SNR(actual) / SNR (goal)');

For more control of sky plots like this, see [this tutorial](https://github.com/desihub/desiutil/blob/master/doc/nb/SkyMapExamples.ipynb) on the lower-level `desiutil.plots` functions being used.

The following columns summarize the afternoon planning and scheduling of fiber assignment (FA):
 - `covered`: Date the tile is first covered by previous layers and thus eligible for FA.
 - `available`: Date the tile first has fibers assigned.
 - `planned`: Date the tile is first included in the observing plan.
 
All dates are specified as an integer number of days from the survey start date (defined by [this utility function](http://desisurvey.readthedocs.io/en/latest/api.html#desisurvey.utils.day_number)).  As an example, plot the number of days into the survey that each tile is covered:

In [None]:
desisurvey.plots.plot_sky_passes(
    summary['ra'], summary['dec'], summary['pass'],
    summary['covered'], label='Day When Tile is Covered');

Note that the depth-first strategy has all tiles planned (=0) at the start of the survey, but other strategies have more complex dependencies between different regions of the sky in each pass.
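
The day-number columns described above are easier to interpret as calendar dates. Here is a minimal sketch of that conversion using only the standard library. The start date below is a made-up placeholder, not the configured survey start date, and `day_to_date` is a hypothetical helper (the `desisurvey.utils.day_number` function linked above goes in the opposite direction).

```python
import datetime

def day_to_date(day_number, start_date):
    """Convert an integer day offset from start_date into a calendar date."""
    return start_date + datetime.timedelta(days=int(day_number))

# Placeholder start date for illustration only.
start = datetime.date(2020, 1, 1)

print(day_to_date(0, start))    # the start date itself
print(day_to_date(45, start))   # 45 days into the survey
```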

### Exercises

In [None]:
# Plot a histogram of the number of exposures of each tile in the full survey.

In [None]:
# Plot histograms of snr2frac after year-1 separately for the DARK, GRAY, BRIGHT programs.

In [None]:
# Create all-sky plots of the mean airmass that each tile was observed at in the full survey.

In [None]:
# Study the tile "overhead", defined as 86400 * (mjd_max - mjd_min) - exptime.
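
As a starting point for the last exercise, here is the overhead formula applied to a single made-up tile. The factor 86400 converts the MJD difference from days to seconds so that it can be compared with `exptime`. This sketch assumes that `mjd_min` and `mjd_max` bracket all exposures of the tile; the numbers are invented for illustration.

```python
SECONDS_PER_DAY = 86400

def tile_overhead(mjd_min, mjd_max, exptime):
    """Return the time [s] between mjd_min and mjd_max not spent exposing."""
    return SECONDS_PER_DAY * (mjd_max - mjd_min) - exptime

# Invented values: a 2700 s span containing 2400 s of exposure time.
print(tile_overhead(58850.25, 58850.28125, 2400.0))
```

The same expression works unchanged on whole table columns, since the arithmetic is applied element by element to arrays.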

## Exposures List

The exposures list is a table with rows corresponding to each simulated exposure, in increasing time order, with columns for their simulated observing conditions. Note that column names are all UPPER CASE.

In [None]:
exposures[:3]

To see the distribution of individual exposure times (and compare with the total exposure time plot above), use:

In [None]:
plt.hist(exposures['EXPTIME'] / 60, range=(0, 25), bins=25)
plt.xlabel('Individual Exposure Time [min]')
plt.axvline(np.median(exposures['EXPTIME'] / 60), c='r');

To see the distribution of atmospheric seeing during the simulated survey, use:

In [None]:
plt.hist(exposures['SEEING'], bins=25)
plt.xlabel('Per-Exposure FWHM Seeing [arcsec]')
plt.axvline(np.median(exposures['SEEING']), c='r');

To study the correlation between exposure time and seeing in the first DARK pass, use:

In [None]:
pass1 = exposures[exposures['PASS'] == 0]
plt.scatter(pass1['EXPTIME'] / 60, pass1['SEEING'], c=pass1['AIRMASS'], lw=0, s=5);
plt.colorbar().set_label('Airmass')
plt.xlabel('Exposure Time [min]')
plt.ylabel('Atmospheric FWHM Seeing [arcsec]');
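
A scatter plot shows the trend visually; a single number such as the Pearson correlation coefficient can complement it. The sketch below implements the coefficient from its definition using only the standard library and applies it to made-up values, not simulation output.

```python
import math

def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Made-up values with an exactly linear relation, so r is 1 up to rounding.
exptime_min = [10.0, 12.0, 14.0, 16.0, 18.0]
seeing = [0.8, 0.9, 1.0, 1.1, 1.2]
print(pearson_r(exptime_min, seeing))
```

With the arrays in this notebook, `np.corrcoef(pass1['EXPTIME'], pass1['SEEING'])[0, 1]` should give the equivalent quantity.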

### Exercises

In [None]:
# Study the correlation between exposure time and moon altitude (which is underestimated in these simulations)

In [None]:
# Plot histograms of the number of exposures per night in each program.

In [None]:
# Study how often GRAY and BRIGHT exposures are taken with no moon in the sky.

In [None]:
# Study which of the 3 moon parameters correlates most strongly with exposure time.