# Using Nested-Pandas with Astronomical Spectra

In astronomy, a spectrum is a measurement (or combination of measurements) of an object that shows the intensity of light emitted over a range of energies. In this tutorial, we'll walk through a simple example of working with spectra from the Sloan Digital Sky Survey (SDSS), showing in particular how they can be represented as a `NestedFrame`.

First, we'll use `astroquery` and `astropy` to download a handful of spectra from SDSS:

In [None]:
from astroquery.sdss import SDSS
from astropy import coordinates as coords
import astropy.units as u
import nested_pandas as npd

# Query SDSS for a set of objects with spectra
pos = coords.SkyCoord("0h8m10.63s +14d50m23.3s", frame="icrs")
xid = SDSS.query_region(pos, radius=3 * u.arcmin, spectro=True)
xid_ndf = npd.NestedFrame(xid.to_pandas())
xid_ndf

This initial query returns a set of objects with spectra (as specified by the `spectro=True` flag). To actually retrieve the spectra, we can do the following:

In [None]:
# Query SDSS for the corresponding spectra
sp = SDSS.get_spectra(matches=xid)
sp

The result is a list of FITS `HDUList` objects, one per spectrum. From this point, there are a few ways we could move towards a nested-pandas representation. The most straightforward is to build a "flat" spectra table, where we gather the wavelength, flux, and error values from every spectrum into a single combined table and record which object each row belongs to.

In [None]:
import numpy as np

# Build a flat spectrum dataframe

# Initialize some empty arrays to hold the flat data
wave = np.array([])
flux = np.array([])
err = np.array([])
index = np.array([])
# Loop over each spectrum, adding its data to the arrays
for i, hdu in enumerate(sp):
    wave = np.append(wave, 10 ** hdu["COADD"].data.loglam)  # * u.angstrom
    flux = np.append(flux, hdu["COADD"].data.flux * 1e-17)  # * u.erg/u.second/u.centimeter**2/u.angstrom
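    # ivar is the inverse variance, so the 1-sigma error is 1 / sqrt(ivar)
    # (pixels with ivar == 0 come out as inf)
    err = np.append(err, 1 / np.sqrt(hdu["COADD"].data.ivar) * 1e-17)  # * flux.unit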

    # We'll need to set an index to keep track of which rows correspond
    # to which object
    index = np.append(index, i * np.ones(len(hdu["COADD"].data.loglam)))

# Build a NestedFrame from the arrays
# Use int64 for the index; int8 would overflow beyond 127 spectra
flat_spec = npd.NestedFrame(dict(wave=wave, flux=flux, err=err), index=index.astype(np.int64))
flat_spec
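
Repeated `np.append` calls copy the growing array on every iteration, so for larger samples it may be cheaper to gather the per-spectrum pieces and concatenate once. The sketch below illustrates that pattern on synthetic arrays standing in for the `COADD` columns (`loglam`, `flux`, `ivar`); the `fake_spectra` list and its values are made up for illustration and are not SDSS data. It also assumes that `ivar == 0` marks a pixel without a usable measurement, and assigns those pixels NaN rather than an infinite error.

```python
import numpy as np

# Synthetic stand-ins for the COADD columns of two spectra (not real SDSS data)
fake_spectra = [
    {
        "loglam": np.array([3.60, 3.61, 3.62]),
        "flux": np.array([1.0, 2.0, 3.0]),
        "ivar": np.array([4.0, 0.0, 16.0]),
    },
    {
        "loglam": np.array([3.70, 3.71]),
        "flux": np.array([5.0, 6.0]),
        "ivar": np.array([1.0, 25.0]),
    },
]

# Concatenate each column once instead of appending inside a loop
loglam = np.concatenate([s["loglam"] for s in fake_spectra])
flux = np.concatenate([s["flux"] for s in fake_spectra]) * 1e-17
ivar = np.concatenate([s["ivar"] for s in fake_spectra])
wave = 10**loglam

# The 1-sigma error is 1 / sqrt(ivar); pixels with ivar == 0 stay NaN
err = np.full(ivar.shape, np.nan)
good = ivar > 0
err[good] = 1e-17 / np.sqrt(ivar[good])

# Repeat each spectrum's position once per pixel to label the rows
lengths = [len(s["loglam"]) for s in fake_spectra]
index = np.repeat(np.arange(len(fake_spectra)), lengths)
```

Arrays built this way can be passed to `npd.NestedFrame` in the same way as in the cell above.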

From here, we can simply nest our flat table within our original query result:

In [None]:
spec_ndf = xid_ndf.add_nested(flat_spec, "coadd_spectrum").set_index("objid")
spec_ndf

We can see that each object now has a `coadd_spectrum` nested column containing its full spectrum.

In [None]:
# Look at one of the spectra
spec_ndf.iloc[1].coadd_spectrum

We now have our spectra nested, and can proceed to do any filtering and analysis as normal within nested-pandas.


In [None]:
import matplotlib.pyplot as plt

# Plot a spectrum
spec = spec_ndf.iloc[1].coadd_spectrum

plt.plot(spec["wave"], spec["flux"])
plt.xlabel("Wavelength (Å)")
plt.ylabel(r"Flux (erg/s/cm$^2$/Å)")