In [None]:
__author__ = 'Benjamin Weaver <benjamin.weaver@noirlab.edu>'
__version__ = '20210125'
__datasets__ = ['sdss_dr16']
__keywords__ = ['HowTo','spectra','query','visualization','SDSS']

# How to Plot Spectral Data with Prospect, Specutils and the Data Lab Spectrum Service

## Table of Contents

* [Goals](#Goals)
* [Summary](#Summary)
* [Disclaimer & Attribution](#Disclaimer-&-Attribution)
* [Imports & Setup](#Imports-&-Setup)
* [Authentication](#Authentication)
* [SDSS/eBOSS spPlate file](#SDSS/eBOSS-spPlate-file)
  - [Find Plate 2955 in the Spectrum Service](#Find-Plate-2955-in-the-Spectrum-Service)
  - [Spectrum Identifiers](#Spectrum-Identifiers)
  - [Spectrum Service: Flux versus Wavelength](#Spectrum-Service:-Flux-versus-Wavelength)
  - [Redshift and other Metadata](#Redshift-and-other-Metadata)
* [Launch the Visualization](#Launch-the-Visualization)
* [SDSS Spectra and Redshifts from Files](#SDSS-Spectra-and-Redshifts-from-Files)
* [References & Resources](#References-&-Resources)

## Goals

Obtain SDSS spectra using the Data Lab spectrum service, convert data to [specutils](https://github.com/astropy/specutils) objects as needed, and use [Prospect](https://github.com/desihub/prospect) to display the data.

## Summary

Prospect is an *interactive* spectrum visualization tool that the [DESI](https://www.desi.lbl.gov/) project is actively using for visual inspection of commissioning and survey validation spectra.  Rather than a simple *flux* versus *wavelength* plot, the interactive display provides pan, zoom, known spectral lines, targeting and other catalog information, metadata, *etc.*  The [Bokeh](https://bokeh.org) library handles the low-level interactivity.

We can leverage the generic spectrum container objects provided by [specutils](https://github.com/astropy/specutils) to allow Prospect to plot data from other surveys.  In this example, we are using [SDSS DR16](https://www.sdss.org/dr16/).

In the future we will extend this functionality to additional instruments.  For example, we are excited by the possibility of using Prospect and the Data Lab spectrum service for Gemini spectra.  And, of course, once DESI data become public, both the spectrum service and this tool will be ready.

## Disclaimer & Attribution

If you use this notebook for your published science, please acknowledge the following:

* Data Lab concept paper: Fitzpatrick et al., "The NOAO Data Laboratory: a conceptual overview", SPIE, 9149, 2014, http://dx.doi.org/10.1117/12.2057445
* Data Lab disclaimer: https://datalab.noirlab.edu/disclaimers.php

## Imports & Setup

This cell performs all necessary imports for this notebook.

**TODO**: final clean-up needed when prospect is installed on the production server.

In [None]:
import os
import sys
import time
#
# Manually install prospect for now.
#
sys.path.insert(0, os.path.join(os.environ['HOME'], 'Documents', 'Code', 'git', 'desihub', 'prospect', 'py'))
#
# Numpy, astropy, etc.
#
import numpy as np
import astropy.units as u
from astropy.table import Table
from astropy.nddata import InverseVariance
from specutils import Spectrum1D, SpectrumCollection, SpectrumList
#
# Prospect imports
#
from prospect.specutils import read_spPlate, read_spZbest
from prospect.plotspecutils import plotspectra
#
# Data Lab imports
#
from dl import specClient as spec  # primary spectral data client interface
from dl import storeClient as sc   # needed to use virtual storage
from dl import queryClient as qc   # needed to query Data Lab catalogs
from dl import authClient as ac    # needed for login authentication

start_time = time.time()  # save start time of notebook run

## Authentication

Much of the functionality of the spectrum service can be accessed without explicitly logging into Data Lab (the service then uses an anonymous login). Some capabilities, however, such as saving the results of your queries to your virtual storage space, require a login (*i.e.* you will need a [registered user account](https://datalab.noao.edu/help/register.php?qa=register)).

If you need to log in to Data Lab, uncomment the `ac.login()` command and respond according to the prompts.  If you have previously logged into Data Lab, this cell will simply print your active user name.

In [None]:
from getpass import getpass  # needed only if you uncomment ac.login() below
# ac.login(input("Enter user name: (+ENTER) "), getpass("Enter password: (+ENTER) "))
ac.whoAmI()

## SDSS/eBOSS spPlate file

SDSS spectra are stored per-plate in spPlate files.  These contain 640 spectra for the original SDSS spectrograph or 1000 spectra for the BOSS/eBOSS spectrograph.  All spectra in an spPlate file share a common wavelength solution, so an spPlate file can be represented by a single [Spectrum1D](https://specutils.readthedocs.io/en/stable/api/specutils.Spectrum1D.html#specutils.Spectrum1D) object.  This object is an input to the visualization system, so we need outputs from the spectrum service that match the inputs to the visualization.
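
To illustrate why one spectral axis is enough for a whole plate, the sketch below builds a log-linear wavelength grid of the kind written into spPlate headers, where the wavelength of pixel `i` is `10**(COEFF0 + COEFF1 * i)`. The coefficient values and the pixel count are illustrative assumptions, not values read from plate 2955, and `loglam_grid` is a hypothetical helper that is not part of Prospect or specutils.

```python
def loglam_grid(coeff0, coeff1, npix):
    """Return wavelengths in Angstrom for a log-linear wavelength solution."""
    return [10.0 ** (coeff0 + coeff1 * i) for i in range(npix)]

# Illustrative values only (assumptions, not read from a real spPlate header).
wave = loglam_grid(coeff0=3.58, coeff1=1.0e-4, npix=4000)

# Every spectrum on the plate would be sampled at these same wavelengths,
# so a 2-D flux array of shape (n_spectra, npix) needs only one spectral axis.
print(len(wave), round(wave[0], 1), round(wave[-1], 1))
```

Because the grid is shared, the flux for all fibers can be stored as one two-dimensional array alongside a single one-dimensional wavelength array.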

### Find Plate 2955 in the Spectrum Service

Here we find spectra corresponding to SDSS plate 2955, but using the spectrum service instead of reading the spectra from a file.

In [None]:
plate = 2955
# Note: a trailing semicolon in the query no longer seems to be accepted,
# so the older form below is kept only for reference.
# q = "SELECT ra, dec FROM sdss_dr16.platex WHERE plate = %d;" % plate
q = "SELECT ra, dec FROM sdss_dr16.platex WHERE plate = {0:d}".format(plate)
r = qc.query(sql=q, fmt='array')
r

### Spectrum Identifiers

SDSS plates are circular, so we'll do a cone search around the nominal plate center.  This gets us the spectrum identifiers we need to retrieve the spectra themselves.

In [None]:
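# Reuse the plate variable instead of hard-coding the plate number in the constraint.
sdss_ids = spec.query(r[0], r[1], 1.5, fmt='array', out='',
                      constraint='plate={0:d} ORDER BY specobjid LIMIT 50'.format(plate))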
sdss_ids
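
As a rough picture of what a cone search selects, the sketch below computes the great-circle separation between a position and a center and keeps positions within a radius. This is not how the spectrum service implements its search; `angular_separation` and `in_cone` are hypothetical helpers, and the coordinates are made up.

```python
import math

def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(math.sqrt(min(1.0, a))))

def in_cone(ra, dec, ra0, dec0, radius):
    """True if (ra, dec) lies within radius degrees of (ra0, dec0)."""
    return angular_separation(ra, dec, ra0, dec0) <= radius

center = (180.0, 30.0)  # made-up plate center, not plate 2955
print(in_cone(180.5, 30.5, *center, radius=1.5))
print(in_cone(183.0, 30.0, *center, radius=1.5))
```

The 1.5 degree radius used in the query above plays the role of `radius` here.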

### Spectrum Service: Flux versus Wavelength

Now we fetch the spectra.  The returned data will be compacted into a single `Spectrum1D` object.

In [None]:
%%time
sdss = spec.getSpec(sdss_ids, fmt='Spectrum1D', align=True)
sdss

### Redshift and other Metadata

We also need target and redshift information, which is stored in the SpecObj catalog.  Although all information is extractable from the database, we need to reformat it to match the expectations of Prospect.

In [None]:
q = "SELECT specobjid, class, subclass, z, zerr, rchi2diff, zwarning FROM sdss_dr16.specobj WHERE specobjid IN ({0}) ORDER BY specobjid".format(','.join(map(str, sdss_ids.tolist())))
sdss_z = qc.query(sql=q, fmt='table')
for c in ('specobjid', 'class', 'subclass', 'z', 'zerr', 'rchi2diff', 'zwarning'):
    if c == 'zerr':
        sdss_z.rename_column(c, 'Z_ERR')
    else:
        sdss_z.rename_column(c, c.upper())
sdss_z
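
The two string manipulations in the cell above, building the `IN (...)` list and renaming columns, can be sketched in isolation. The helper names `prospect_name` and `in_clause` are hypothetical, and the identifiers are made up rather than real `specobjid` values.

```python
def prospect_name(column):
    """Map a lower-case database column name to the upper-case name used above."""
    return 'Z_ERR' if column == 'zerr' else column.upper()

def in_clause(ids):
    """Join integer identifiers into a comma-separated list for SQL IN (...)."""
    return ','.join(str(int(i)) for i in ids)

columns = ('specobjid', 'class', 'subclass', 'z', 'zerr', 'rchi2diff', 'zwarning')
renamed = [prospect_name(c) for c in columns]
example_ids = [101, 102, 103]  # made-up identifiers
print(renamed)
print(in_clause(example_ids))
```

Only `zerr` needs special handling because its target name, `Z_ERR`, is not a plain upper-casing of the database column name.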

In [None]:
q = """SELECT s.specobjid, s.targetobjid, s.ra, s.dec,
    s.primtarget, s.sectarget,
    s.boss_target1,
    s.ancillary_target1, s.ancillary_target2,
    s.eboss_target0, s.eboss_target1, s.eboss_target2,
    p.u, p.g, p.r, p.i, p.z
FROM sdss_dr16.specobj AS s
LEFT JOIN sdss_dr16.photoplate AS p
ON s.bestobjid = p.objid
WHERE specobjid IN ({0})
ORDER BY specobjid""".format(','.join(map(str, sdss_ids.tolist())))
sdss_plugmap = qc.query(sql=q, fmt='table')
for c in ('ra', 'dec', 'primtarget', 'sectarget',
          'boss_target1',
          'ancillary_target1', 'ancillary_target2',
          'eboss_target0', 'eboss_target1', 'eboss_target2'):
    sdss_plugmap.rename_column(c, c.upper())
sdss_plugmap['OBJID'] = np.vstack((np.bitwise_and(sdss_plugmap['targetobjid'] >> 32, 2**16 - 1).data,
                                   np.bitwise_and(sdss_plugmap['targetobjid'] >> 48, 2**11 - 1).data,
                                   np.bitwise_and(sdss_plugmap['targetobjid'] >> 29, 2**3 - 1).data,
                                   np.bitwise_and(sdss_plugmap['targetobjid'] >> 16, 2**12 - 1).data,
                                   np.bitwise_and(sdss_plugmap['targetobjid'], 2**16 - 1).data)).T
sdss_plugmap['MAG'] = np.vstack((np.where(np.isnan(sdss_plugmap['u'].data), 0, sdss_plugmap['u'].data),
                                 np.where(np.isnan(sdss_plugmap['g'].data), 0, sdss_plugmap['g'].data),
                                 np.where(np.isnan(sdss_plugmap['r'].data), 0, sdss_plugmap['r'].data),
                                 np.where(np.isnan(sdss_plugmap['i'].data), 0, sdss_plugmap['i'].data),
                                 np.where(np.isnan(sdss_plugmap['z'].data), 0, sdss_plugmap['z'].data))).T

sdss.meta['plugmap'] = sdss_plugmap
sdss_plugmap
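
The `OBJID` column above is built by unpacking bit fields from the 64-bit `targetobjid`. The sketch below applies the same shifts and masks to a plain Python integer. The field names (run, rerun, camcol, field, object) are our reading of those shifts and masks, `decode_objid` and `encode_objid` are hypothetical helpers, and the example value is constructed for illustration rather than taken from the catalog.

```python
def decode_objid(objid):
    """Split a 64-bit identifier into (run, rerun, camcol, field, obj)."""
    run = (objid >> 32) & (2**16 - 1)
    rerun = (objid >> 48) & (2**11 - 1)
    camcol = (objid >> 29) & (2**3 - 1)
    field = (objid >> 16) & (2**12 - 1)
    obj = objid & (2**16 - 1)
    return run, rerun, camcol, field, obj

def encode_objid(run, rerun, camcol, field, obj):
    """Inverse of decode_objid; any other bits in the identifier are left at zero."""
    return (rerun << 48) | (run << 32) | (camcol << 29) | (field << 16) | obj

example = encode_objid(run=1234, rerun=301, camcol=3, field=56, obj=789)
print(decode_objid(example))
```

The tuple order matches the column order of the stacked array in the cell above.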

## Launch the Visualization

Once all the spectral data are assembled, the visualization is launched with a single call to `plotspectra()`.

In [None]:
plotspectra(sdss.new_flux_unit(u.Unit('1e-17 erg/(cm**2 s Angstrom)')),
            zcatalog=sdss_z,
            model=(sdss.spectral_axis.value, sdss.meta['model']),
            notebook=True, title='SDSS Plate {0:d}'.format(plate),
            model_from_zcat=False, with_coaddcam=False, mask_type='PRIMTARGET',
            with_thumb_tab=False, with_vi_widgets=False)

## SDSS Spectra and Redshifts from Files

For reference, and to compare timing, below is how one would fetch spectra and redshift metadata from file-based storage.

In [None]:
os.environ['SPECTRO_REDUX'] = 'sdss_dr16://' + os.path.join('sdss', 'spectro', 'redux')
run2d = '26'
plate = 2955
mjd = '54562'
sdss_spectra = os.path.join(os.environ['SPECTRO_REDUX'], run2d, str(plate), f'spPlate-{plate:d}-{mjd}.fits')
sdss_redshifts = os.path.join(os.environ['SPECTRO_REDUX'], run2d, str(plate), f'spZbest-{plate:d}-{mjd}.fits')
print(sdss_spectra)
print(sdss_redshifts)
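
The two paths above differ only in the file prefix, so the construction can be sketched as one function. `plate_file` is a hypothetical helper; it uses `posixpath` so that the separators are always forward slashes, which is an assumption about what the storage service expects.

```python
import posixpath

def plate_file(prefix, redux, run2d, plate, mjd):
    """Build the path to a per-plate file such as spPlate or spZbest."""
    return posixpath.join(redux, str(run2d), str(plate),
                          f'{prefix}-{plate:d}-{mjd}.fits')

redux = 'sdss_dr16://' + posixpath.join('sdss', 'spectro', 'redux')
print(plate_file('spPlate', redux, '26', 2955, '54562'))
print(plate_file('spZbest', redux, '26', 2955, '54562'))
```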

In [None]:
%%time
sdss_file = read_spPlate(sc.get(sdss_spectra, mode='fileobj'), limit=50)
sdss_file

In [None]:
sdss_z_file, sdss_model_file = read_spZbest(sc.get(sdss_redshifts, mode='fileobj'), limit=50)

In [None]:
plotspectra(sdss_file,
            zcatalog=sdss_z_file,
            model=(sdss_model_file.spectral_axis.value, sdss_model_file.flux.value),
            notebook=True, title='SDSS Plate {0:d}'.format(plate),
            model_from_zcat=False, with_coaddcam=False, mask_type='PRIMTARGET',
            with_thumb_tab=False, with_vi_widgets=False)

## References & Resources

* [Getting Started with Spectral Data](https://github.com/noaodatalab/specserver/blob/master/doc/03_GettingStartedWithSpectra.ipynb)

In [None]:
end_time = time.time()
print('Notebook run time: %.3g sec' % (end_time - start_time))