In [None]:
__author__ = 'Benjamin Weaver <benjamin.weaver@noirlab.edu>'
__version__ = '20210301'
__datasets__ = ['sdss_dr16']
__keywords__ = ['HowTo','spectra','query','visualization','SDSS']

# How to Plot Spectral Data with Prospect, Specutils and the Astro Data Lab Spectrum Service

## Table of Contents

* [Goals](#Goals)
* [Summary](#Summary)
* [Disclaimer & Attribution](#Disclaimer-&-Attribution)
* [Imports & Setup](#Imports-&-Setup)
* [Authentication](#Authentication)
* [SDSS/eBOSS spPlate file](#SDSS/eBOSS-spPlate-file)
  - [Find Plate 2955 in the Spectrum Service](#Find-Plate-2955-in-the-Spectrum-Service)
  - [Spectrum Identifiers](#Spectrum-Identifiers)
  - [Spectrum Service: Flux versus Wavelength](#Spectrum-Service:-Flux-versus-Wavelength)
  - [Redshift and other Metadata](#Redshift-and-other-Metadata)
    * [Redshift Catalog](#Redshift-Catalog)
    * [Target Catalog](#Target-Catalog)
* [Launch the Visualization](#Launch-the-Visualization)
  - [Visualization Interface Details](#Visualization-Interface-Details)
* [SDSS Spectra and Redshifts from Files](#SDSS-Spectra-and-Redshifts-from-Files)
* [References & Resources](#References-&-Resources)

## Goals

Obtain SDSS spectra using the Astro Data Lab spectrum service, convert data to [specutils](https://github.com/astropy/specutils) objects as needed, and use [Prospect](https://github.com/desihub/prospect) to display the data.

## Summary

Prospect is an *interactive* spectrum visualization tool that is actively used by the [DESI](https://www.desi.lbl.gov/) project for visual inspection of commissioning and survey validation spectra.  Rather than a simple *flux* versus *wavelength* plot, the interactive spectrum display provides pan, zoom, known spectral lines, targeting and other catalog information, metadata, *etc.*  The [Bokeh](https://bokeh.org) library handles the low-level interactivity.

We can leverage the generic spectrum container objects provided by [specutils](https://github.com/astropy/specutils) to allow Prospect to plot data from other surveys.  In this example, we are using [SDSS DR16](https://www.sdss.org/dr16/).

In the future we will extend this functionality to additional instruments.  For example, we are excited by the possibility of using Prospect and the Astro Data Lab spectrum service for Gemini spectra.  And, of course, once DESI data become public, both the spectrum service and this tool will be ready.

## Disclaimer & Attribution

If you use this notebook for your published science, please acknowledge the following:

* Astro Data Lab concept paper: Fitzpatrick et al., "The NOAO Data Laboratory: a conceptual overview", SPIE, 9149, 2014, http://dx.doi.org/10.1117/12.2057445
* Astro Data Lab disclaimer: https://datalab.noirlab.edu/disclaimers.php

## Imports & Setup

This cell performs all necessary imports for this notebook.

In [None]:
import os
import sys
import time
from getpass import getpass
#
# Numpy, astropy, etc.
#
import numpy as np
import astropy.units as u
from astropy.table import Table
from astropy.nddata import InverseVariance
from specutils import Spectrum1D, SpectrumCollection, SpectrumList
#
# Prospect imports
#
from prospect.specutils import read_spPlate, read_spZbest
from prospect.viewer import plotspectra
#
# Astro Data Lab imports
#
from dl import specClient as spec  # primary spectral data client interface
from dl import storeClient as sc   # needed to use virtual storage
from dl import queryClient as qc   # needed to query Astro Data Lab catalogs
from dl import authClient as ac    # needed for login authentication

start_time = time.time()  # save start time of notebook run

## Authentication

Much of the functionality of the spectrum service can be accessed without explicitly logging into Astro Data Lab (the service then uses an anonymous login).  Some capabilities, however, such as saving the results of your queries to your virtual storage space, require a login (*i.e.* you will need a [registered user account](https://datalab.noao.edu/help/register.php?qa=register)).

If you need to log in to Astro Data Lab, uncomment the `ac.login()` command and respond according to the prompts.  If you have previously logged into Astro Data Lab, this cell will simply print your active user name.

In [None]:
# ac.login(input("Enter user name: (+ENTER) "),getpass("Enter password: (+ENTER) "))
ac.whoAmI()

## SDSS/eBOSS spPlate file

SDSS spectra are stored per-plate in spPlate files.  These contain 640 spectra for the original SDSS spectrograph or 1000 spectra for the BOSS/eBOSS spectrograph.  All spectra in an individual spPlate file have a common wavelength solution, so a single spPlate file can be represented by a [Spectrum1D](https://specutils.readthedocs.io/en/stable/api/specutils.Spectrum1D.html#specutils.Spectrum1D) object.  The Prospect visualization package can then take this Spectrum1D object as input data.

### Find Plate 2955 in the Spectrum Service

In the cells below, we will find spectra corresponding to SDSS plate 2955, but using the spectrum service instead of reading the spectra from a file.  This starts by finding the coordinates of the plate center.

In [None]:
plate = 2955
q = f"SELECT ra, dec FROM sdss_dr16.platex WHERE plate = {plate}"
r = qc.query(sql=q, fmt='array')
r

### Spectrum Identifiers

SDSS plates are circular, so we'll do a cone search around the plate center found above.  This gets us the spectrum identifiers we need to retrieve the spectra themselves.

In [None]:
sdss_ids = spec.query(r[0], r[1], 1.5, fmt='array', out='', constraint=f'plate = {plate} ORDER BY specobjid LIMIT 50')
sdss_ids
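
To make the geometry of the cone search concrete, the following sketch computes the angular separation between a plate center and a target with the haversine formula and compares it with the 1.5 degree search radius used above.  It is a stand-alone illustration, not the spectrum service's implementation; `angular_separation_deg` and `in_cone` are hypothetical helper names, and the coordinates in the usage lines are made up.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two (RA, Dec) points given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula: numerically stable for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def in_cone(center_ra, center_dec, ra, dec, radius_deg=1.5):
    """True if (ra, dec) lies within radius_deg of the cone center."""
    return angular_separation_deg(center_ra, center_dec, ra, dec) <= radius_deg


# Made-up coordinates: one target inside the cone, one outside.
inside = in_cone(150.0, 2.0, 151.0, 2.0)
outside = in_cone(150.0, 2.0, 152.0, 2.0)
```

Because a cone search is purely geometric, the additional `plate = ...` constraint in the cell above is what restricts the results to spectra actually taken with this plate.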

### Spectrum Service: Flux versus Wavelength

Now we fetch the spectra.  The option `align=True` directs the service to return all spectra as a single `Spectrum1D` object with a common wavelength axis for all spectra.  Since all spectra on a single SDSS plate have a common wavelength axis anyway, this is a simple operation.

In [None]:
%%time
sdss = spec.getSpec(sdss_ids, fmt='Spectrum1D', align=True)
sdss
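
To illustrate what a common wavelength axis means, the sketch below builds a log-linear wavelength grid of the kind SDSS uses, where log10(wavelength) increases by a fixed step per pixel, and pairs it with several flux rows.  The starting value, the step and the array sizes are illustrative assumptions rather than values read from plate 2955, and plain lists stand in for the `Spectrum1D` object.

```python
def log_linear_wavelengths(coeff0, coeff1, npix):
    """Wavelengths in Angstrom with log10(wavelength) = coeff0 + coeff1 * pixel."""
    return [10 ** (coeff0 + coeff1 * i) for i in range(npix)]


# Assumed, illustrative values; a real grid comes from the data themselves.
wavelength = log_linear_wavelengths(coeff0=3.58, coeff1=1.0e-4, npix=5)

# Three spectra sharing the one wavelength axis:
# every row has exactly one flux value per wavelength pixel.
flux = [[1.0] * len(wavelength),
        [2.0] * len(wavelength),
        [0.5] * len(wavelength)]

# On a log-linear grid the ratio of neighbouring wavelengths is constant.
ratios = [b / a for a, b in zip(wavelength, wavelength[1:])]
```

With a shared axis, the flux for many spectra can be stored as a two-dimensional array (spectrum by pixel) next to a single one-dimensional wavelength array.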

### Redshift and other Metadata

The Prospect viewer requires certain spectral metadata that are normally provided with SDSS spectra--*i.e.*, these metadata are stored alongside the flux *versus* wavelength data.  There are two categories of metadata:

1. Redshift catalog: this contains the redshift (a measure of how far away the object is) and classification (what type of object it is) for each spectrum, as identified by the SDSS pipeline.
2. Target catalog: this contains the information needed for taking a spectrum of the object in the first place.  This includes information about the object as identified in SDSS *imaging* (RA, Dec, optical magnitude), as well as the rationale for taking a spectrum of the object.

In the cells below, we will extract the necessary metadata from the `specobj` and `photoplate` tables and reformat them slightly to match the appearance of these data as they would be extracted from SDSS files.

#### Redshift Catalog

This query extracts the redshift and classification information from the `specobj` table.

In [None]:
q = "SELECT specobjid, class, subclass, z, zerr, rchi2diff, zwarning FROM sdss_dr16.specobj WHERE specobjid IN ({0}) ORDER BY specobjid".format(','.join(map(str, sdss_ids.tolist())))
sdss_z = qc.query(sql=q, fmt='table')
for c in ('specobjid', 'class', 'subclass', 'z', 'zerr', 'rchi2diff', 'zwarning'):
    if c == 'zerr':
        sdss_z.rename_column(c, 'Z_ERR')
    else:
        sdss_z.rename_column(c, c.upper())
sdss_z
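
The two string manipulations in the cell above, building the `IN (...)` list and mapping database column names to the names expected downstream, can be isolated as in the sketch below.  The helper names `in_clause` and `zcat_column_name` are hypothetical, the identifiers are made up, and the integer check is an added safeguard that is not part of the original cell.

```python
def in_clause(ids):
    """Comma-separated list of integer identifiers for use inside SQL IN (...)."""
    checked = []
    for i in ids:
        # Reject anything that is not an exact integer,
        # so that only digits reach the query text.
        if int(i) != i:
            raise ValueError(f"not an integer identifier: {i!r}")
        checked.append(str(int(i)))
    return ','.join(checked)


def zcat_column_name(name):
    """Upper-case a column name, with the special case zerr -> Z_ERR used above."""
    return 'Z_ERR' if name == 'zerr' else name.upper()


columns = ('specobjid', 'class', 'subclass', 'z', 'zerr', 'rchi2diff', 'zwarning')
query = ("SELECT " + ', '.join(columns)
         + " FROM sdss_dr16.specobj WHERE specobjid IN ("
         + in_clause([3, 1, 2]) + ") ORDER BY specobjid")
renamed = [zcat_column_name(c) for c in columns]
```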

#### Target Catalog

This query extracts photometric targeting information from the `photoplate` table.  The data obtained need to be reformatted to match the ["plugmap" metadata](https://data.sdss.org/datamodel/files/BOSS_SPECTRO_REDUX/RUN2D/PLATE4/spPlate.html#hdu5head) that is stored with SDSS spectrum files, including spPlate files.

In [None]:
q = """SELECT s.specobjid, s.targetobjid, s.ra, s.dec,
    s.primtarget, s.sectarget,
    s.boss_target1,
    s.ancillary_target1, s.ancillary_target2,
    s.eboss_target0, s.eboss_target1, s.eboss_target2,
    p.u, p.g, p.r, p.i, p.z
FROM sdss_dr16.specobj AS s
LEFT JOIN sdss_dr16.photoplate AS p
ON s.bestobjid = p.objid
WHERE specobjid IN ({0})
ORDER BY specobjid""".format(','.join(map(str, sdss_ids.tolist())))
sdss_plugmap = qc.query(sql=q, fmt='table')
for c in ('ra', 'dec', 'primtarget', 'sectarget',
          'boss_target1',
          'ancillary_target1', 'ancillary_target2',
          'eboss_target0', 'eboss_target1', 'eboss_target2'):
    sdss_plugmap.rename_column(c, c.upper())
sdss_plugmap['OBJID'] = np.vstack((np.bitwise_and(sdss_plugmap['targetobjid'] >> 32, 2**16 - 1).data,
                                   np.bitwise_and(sdss_plugmap['targetobjid'] >> 48, 2**11 - 1).data,
                                   np.bitwise_and(sdss_plugmap['targetobjid'] >> 29, 2**3 - 1).data,
                                   np.bitwise_and(sdss_plugmap['targetobjid'] >> 16, 2**12 - 1).data,
                                   np.bitwise_and(sdss_plugmap['targetobjid'], 2**16 - 1).data)).T
sdss_plugmap['MAG'] = np.vstack((np.where(np.isnan(sdss_plugmap['u'].data), 0, sdss_plugmap['u'].data),
                                 np.where(np.isnan(sdss_plugmap['g'].data), 0, sdss_plugmap['g'].data),
                                 np.where(np.isnan(sdss_plugmap['r'].data), 0, sdss_plugmap['r'].data),
                                 np.where(np.isnan(sdss_plugmap['i'].data), 0, sdss_plugmap['i'].data),
                                 np.where(np.isnan(sdss_plugmap['z'].data), 0, sdss_plugmap['z'].data))).T

sdss.meta['plugmap'] = sdss_plugmap
sdss_plugmap
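
The shifts and masks in the cell above unpack five fields from the 64-bit `targetobjid`.  The sketch below does the same for a single Python integer and checks it with a round trip.  `pack_objid` is a hypothetical helper written only for the check, the bit positions are taken directly from the shifts and masks above, and the field values in the example are made up.

```python
from collections import namedtuple

ObjID = namedtuple('ObjID', ['run', 'rerun', 'camcol', 'field', 'obj'])

# (shift, width in bits) for each field, in the same order as the cell above.
LAYOUT = {'run': (32, 16), 'rerun': (48, 11), 'camcol': (29, 3),
          'field': (16, 12), 'obj': (0, 16)}


def unpack_objid(objid):
    """Split a 64-bit identifier into its component fields."""
    return ObjID(**{name: (objid >> shift) & ((1 << width) - 1)
                    for name, (shift, width) in LAYOUT.items()})


def pack_objid(run, rerun, camcol, field, obj):
    """Inverse of unpack_objid for the five fields (all other bits left at zero)."""
    values = {'run': run, 'rerun': rerun, 'camcol': camcol, 'field': field, 'obj': obj}
    packed = 0
    for name, (shift, width) in LAYOUT.items():
        if not 0 <= values[name] < (1 << width):
            raise ValueError(f"{name}={values[name]} does not fit in {width} bits")
        packed |= values[name] << shift
    return packed


# Made-up field values, packed and then unpacked again.
example = pack_objid(run=756, rerun=301, camcol=3, field=100, obj=42)
fields = unpack_objid(example)
```

The notebook cell applies the same shifts and masks to a whole column at once with `np.bitwise_and`, which is why it produces one five-element row per spectrum.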

## Launch the Visualization

Once all the spectral data are assembled, the visualization is launched with a single call to `plotspectra()` in the cell below.  The primary input is the Spectrum1D object created above.  Prospect expects a certain scaling on the input flux, so `sdss.new_flux_unit()` handles that (this changes the numerical scale factor on the flux unit, not the flux unit itself).  Other inputs:

* `zcatalog`: the [redshift table](#Redshift-Catalog) created above.
* `model`: the model spectrum used by the SDSS pipeline to obtain the redshift and classification.
* `notebook`: instructs Prospect to use notebook display mode (as opposed to standalone HTML).
* `title`, `model_from_zcat`, etc.: these options control certain display options that are not directly related to the input data or that need to be switched off for SDSS data.  A [full description is available](https://desi-prospect.readthedocs.io/en/stable/api.html#prospect.viewer.plotspectra).

Once the cell is run, it will take several seconds for the data to load and render.  Then you will be able to pan and zoom on the primary flux *versus* wavelength display, page through different spectra, show known emission or absorption lines, etc.  [More details are below the display.](#Visualization-Interface-Details)

In [None]:
plotspectra(sdss.new_flux_unit(u.Unit('1e-17 erg/(cm**2 s Angstrom)')),
            zcatalog=sdss_z,
            model=(sdss.spectral_axis.value, sdss.meta['model'].to(u.Unit('1e-17 erg/(cm**2 s Angstrom)'))),
            notebook=True, title=f'SDSS Plate {plate}', model_from_zcat=False, with_coaddcam=False,
            mask_type='PRIMTARGET', with_thumb_tab=False, with_vi_widgets=False)
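
The rescaling applied to the flux and the model in the cell above changes only the numbers, not the physical quantity.  The arithmetic can be sketched in plain Python as below; `rescale` is a hypothetical helper, the flux values are made up, and in the notebook itself the conversion is done by the unit machinery rather than by hand.

```python
def rescale(values, from_scale, to_scale):
    """Re-express values given in units of from_scale as values in units of to_scale."""
    factor = from_scale / to_scale
    return [v * factor for v in values]


# Made-up fluxes in erg/(cm**2 s Angstrom), i.e. with a scale factor of 1.
flux_cgs = [2.5e-17, 1.0e-16, 4.0e-18]

# The same fluxes expressed in units of 1e-17 erg/(cm**2 s Angstrom).
flux_scaled = rescale(flux_cgs, from_scale=1.0, to_scale=1.0e-17)
```

Values of order unity, rather than of order 1e-17, are what end up on the flux axis of the display.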

### Visualization Interface Details

* The primary visualization area consists of three plots:
  - The flux versus wavelength in the large area on the left.
    * Pan and zoom functions can be adjusted by clicking on the icons on the upper right.
    * Display of values can be toggled by clicking on the names in the legend in the upper right.  For example, to hide the noise spectrum, click on 'noise' in the legend.
  - The zoom display on the lower right.  This plot is automatically updated when the mouse is hovered or moved over the large spectrum display on the left.
  - The targeting image on the upper right.  This shows the image corresponding to the object.  The images are provided by the [Legacy Survey](https://www.legacysurvey.org/), and are actually deeper than SDSS images.  Clicking on the image opens the Legacy Survey Viewer.
* Target information table: this table displays information related to the *photometric object* associated with the spectrum.
  - `TARGETID`: For SDSS data, this tuple of numbers corresponds to the photometric imaging run, photometric processing version, camera column ("camcol"), imaging field (within the run), and object ID within the field.
  - Target class: This contains information about why the photometric object was selected for spectroscopic observation.  For SDSS data this is currently a dummy value, but will change in future versions of Prospect.
  - `mag_u`, ...: These are the photometric magnitudes in SDSS *u*, *g*, *r*, *i*, *z* filters.
* Reset X-Y range button: reset the display to the default pan and zoom values.
* OII-zoom button / Undo: pan and zoom the display to the position of the \[OII\] emission line.  This is not necessarily relevant for SDSS data.  "Undo" restores the previous pan/zoom values.
* Spectrum slider (< / >): Use these buttons or the slider to page through the loaded spectra.
* Pipeline fit (table): Displays the classification and redshift obtained by the SDSS processing pipeline.
  - `SPECTYPE`: The type of object, typically 'GALAXY', 'QSO', 'STAR'.
  - `SUBTYPE`: A more specific classification of `SPECTYPE`, for example a 'STAR' may be 'K5' or a 'GALAXY' may be 'STARFORMING'.
  - `Z`: The redshift.
  - `ZERR`: The uncertainty on `Z`.
  - `ZWARN`: If the processing pipeline produced errors, this value will be non-zero.
  - $\Delta\chi^2$: the goodness of fit of the fitted model relative to the second-best model.
* Redshift box (orange): Use this box to adjust the redshift of the *model* on the display.
  - Rough and fine tuning adjust the displayed redshift in units of hundredths and ten-thousandths, respectively.
  - Reset to z_pipe: restore the displayed redshift to the redshift value obtained from the SDSS pipeline.
  - Copy z to VI: this button is disabled.
* Gaussian Sigma: SDSS spectra are inherently faint and noisy.  Smoothing will reduce the "jaggedness" caused by noise.  This is especially useful when comparing to the fitted model or investigating spectral lines.
* Emission lines / Absorption lines: Toggle the display of known spectral lines.  If "Show only major lines" is checked, only the spectral lines most likely to be visible are displayed.
* Obs / Rest: Display flux as a function of observed wavelength or rest wavelength, respectively.
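
Two of the controls above correspond to simple operations on the data: switching from observed to rest wavelength divides the observed wavelengths by (1 + z), and Gaussian smoothing replaces each flux value by a weighted average of its neighbours.  The sketch below illustrates both in plain Python.  It is not Prospect's implementation; the function names, the kernel truncation at four sigma, the edge handling and the sample values are all assumptions made for the example.

```python
import math


def to_rest_frame(wavelength_obs, z):
    """Convert observed wavelengths to rest-frame wavelengths for redshift z."""
    return [w / (1.0 + z) for w in wavelength_obs]


def gaussian_smooth(flux, sigma):
    """Smooth flux with a Gaussian kernel whose width sigma is given in pixels."""
    if sigma <= 0:
        return list(flux)
    half = int(math.ceil(4 * sigma))  # assumed truncation of the kernel
    offsets = range(-half, half + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    smoothed = []
    for i in range(len(flux)):
        total = weight = 0.0
        for k, w in zip(offsets, kernel):
            j = i + k
            # Near the edges, average only over the pixels that exist.
            if 0 <= j < len(flux):
                total += w * flux[j]
                weight += w
        smoothed.append(total / weight)
    return smoothed


# A line observed at 7454 Angstrom in a z = 1 object lies at 3727 Angstrom in the rest frame.
rest = to_rest_frame([7454.0], z=1.0)

# A single-pixel spike is spread over its neighbours and lowered.
spiky = [0.0] * 10 + [1.0] + [0.0] * 10
smooth = gaussian_smooth(spiky, sigma=2.0)
```

A larger sigma suppresses more noise but also broadens and lowers narrow spectral lines, so the setting is a trade-off.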

## SDSS Spectra and Redshifts from Files

For reference, and to compare timing, below is how one would fetch spectra and redshift metadata from file-based storage.  First, obtain the path to the files in the Astro Data Lab file service.

In [None]:
spectro_redux = 'sdss_dr16://' + os.path.join('sdss', 'spectro', 'redux')
run2d = '26'
plate = 2955
mjd = '54562'
sdss_spectra = os.path.join(spectro_redux, run2d, str(plate), f'spPlate-{plate:d}-{mjd}.fits')
sdss_redshifts = os.path.join(spectro_redux, run2d, str(plate), f'spZbest-{plate:d}-{mjd}.fits')
print(sdss_spectra)
print(sdss_redshifts)
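
The path construction above can be wrapped in a small function.  The sketch below is one possible variant: `sdss_file_uri` is a hypothetical name, `posixpath.join` is used in place of `os.path.join` so that forward slashes are produced on every operating system, and the plate number is zero-padded to four digits on the assumption that this matches SDSS file naming for plates below 1000 (for plate 2955 the result is identical to the cell above).

```python
import posixpath

SPECTRO_REDUX = 'sdss_dr16://' + posixpath.join('sdss', 'spectro', 'redux')


def sdss_file_uri(kind, run2d, plate, mjd):
    """Build the URI of an spPlate or spZbest file."""
    if kind not in ('spPlate', 'spZbest'):
        raise ValueError(f"unsupported file type: {kind!r}")
    plate_str = f'{plate:04d}'  # assumption: four-digit zero padding
    return posixpath.join(SPECTRO_REDUX, str(run2d), plate_str,
                          f'{kind}-{plate_str}-{mjd}.fits')


spectra_uri = sdss_file_uri('spPlate', run2d='26', plate=2955, mjd='54562')
redshift_uri = sdss_file_uri('spZbest', run2d='26', plate=2955, mjd='54562')
```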

Next, read the spPlate file.  This automatically loads the [targeting metadata](#Target-Catalog).

In [None]:
%%time
sdss_file = read_spPlate(sc.get(sdss_spectra, mode='fileobj'), limit=50)
sdss_file
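
`%%time` is an IPython cell magic, so it is only available inside a notebook or IPython session.  For comparing the two access methods in a plain Python script, a small context manager built on `time.perf_counter()` is one option.  The name `timed` and the label text are hypothetical, the helper records wall-clock time only, and the summation merely stands in for the call being timed.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Measure the wall-clock time of the enclosed block and store it in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed('placeholder work', timings):
    total = sum(range(100000))  # stand-in for the call being timed

for label, seconds in timings.items():
    print(f'{label}: {seconds:.3g} sec')
```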

Read the spZbest file, which contains the [redshift catalog](#Redshift-Catalog).

In [None]:
sdss_z_file, sdss_model_file = read_spZbest(sc.get(sdss_redshifts, mode='fileobj'), limit=50)

Launch the visualization.  Reading from files produces slightly different scaling on the flux values, but otherwise the options are the same as the example above.

In [None]:
plotspectra(sdss_file,
            zcatalog=sdss_z_file,
            model=(sdss_model_file.spectral_axis.value, sdss_model_file.flux.value,),
            notebook=True, title=f'SDSS Plate {plate}', model_from_zcat=False, with_coaddcam=False,
            mask_type='PRIMTARGET', with_thumb_tab=False, with_vi_widgets=False)

## References & Resources

* [Getting Started with Spectral Data](https://github.com/noaodatalab/specserver/blob/master/doc/03_GettingStartedWithSpectra.ipynb)

In [None]:
end_time = time.time()
print('Notebook run time: %.3g sec' % (end_time - start_time))