# Specutils Analysis

![Specutils: An Astropy Package for Spectroscopy](data/specutils_logo.png)


This notebook provides an overview of some of the spectral analysis capabilities of the Specutils Astropy coordinated package.  While this notebook is intended as an interactive introduction to specutils at the time of its writing, the canonical source of information for the package is the latest version's documentation: 

https://specutils.readthedocs.io

Note that the below assumes you have knowledge of the material in the [overview notebook](Specutils_overview.ipynb).  If this is not the case you may wish to review that notebook before proceeding here.

## Imports

We start with some fundamental imports for working with specutils and simple visualization of spectra:

In [None]:
import numpy as np

import astropy.units as u

import specutils
from specutils import Spectrum1D, SpectralRegion
specutils.__version__

In [None]:
# for plotting:
%matplotlib inline
import matplotlib.pyplot as plt


# for showing quantity units on axes automatically:
from astropy.visualization import quantity_support
quantity_support();

## Sample Spectrum and SNR

For use below, we also load the sample SDSS spectrum downloaded in the [overview notebook](Specutils_overview.ipynb).  See that notebook if you have not yet downloaded this spectrum.

In [None]:
from urllib.request import urlretrieve

sdss_spectrum_path = 'data/sdss_spectrum.fits'

url = 'https://data.sdss.org/sas/dr16/sdss/spectro/redux/26/spectra/1323/spec-1323-52797-0012.fits'
urlretrieve(url, sdss_spectrum_path)

sdss_spec = Spectrum1D.read(sdss_spectrum_path, format='SDSS-III/IV spec')
plt.step(sdss_spec.wavelength, sdss_spec.flux);

This example file already has uncertainties, but they are initially in inverse variance form.  We convert that to standard deviation form to simplify some of the operations below and, for the same reason, unmask the whole spectrum.

In [None]:
sdss_spec.uncertainty.uncertainty_type

In [None]:
from astropy.nddata import StdDevUncertainty

sdss_spec.uncertainty = StdDevUncertainty(sdss_spec.uncertainty.quantity**-0.5)
sdss_spec.uncertainty.uncertainty_type
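
The conversion above is just $\sigma = \mathrm{ivar}^{-1/2}$ applied element by element. As a minimal sketch with plain NumPy (the inverse-variance values here are made up for illustration and are not taken from the SDSS file):

```python
import numpy as np

# Hypothetical inverse-variance values (1 / sigma**2), for illustration only
ivar_demo = np.array([4.0, 25.0, 100.0])

# Standard deviation is the inverse square root of the inverse variance
sigma_demo = ivar_demo ** -0.5
print(sigma_demo)
```

Keep in mind that a pixel with zero inverse variance maps to an infinite standard deviation under this conversion, so such pixels may need special handling in your own data.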

In [None]:
sdss_spec.mask[:] = False

With these uncertainties, it is straightforward to compute one of the fundamental quantifications of a spectrum, the whole-spectrum signal-to-noise ratio:

In [None]:
from specutils import analysis

analysis.snr(sdss_spec)
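
To build intuition for what this number summarizes, here is a rough NumPy sketch. It assumes the definition described in the specutils documentation (the mean over pixels of flux divided by uncertainty); the arrays are made up, and the library handles units, masks, and regions that this sketch ignores.

```python
import numpy as np

# Made-up flux values and per-pixel standard deviations (same arbitrary units)
flux_demo = np.array([10.0, 12.0, 8.0, 11.0])
sigma_demo = np.array([2.0, 2.0, 2.0, 2.0])

# Assumed definition: average of the per-pixel signal-to-noise ratios
snr_demo = np.mean(flux_demo / sigma_demo)
print(snr_demo)
```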

# Spectral Regions

Most analyses of a spectrum require specifying a part of that spectrum - e.g., a spectral line.  Because such regions may have value independent of a particular spectrum, they are represented as objects distinct from a given spectrum object.  Below we outline a few ways such regions are specified.

In [None]:
ha_region = SpectralRegion((6563-50)*u.AA, (6563+50)*u.AA)
ha_region

Regions can also be raw pixel values (although of course this is more applicable to a specific spectrum):

In [None]:
pixel_region = SpectralRegion(2100*u.pixel, 2600*u.pixel)
pixel_region

Additionally, *multiple* regions can be in the same `SpectralRegion` object. This is useful for e.g. measuring multiple spectral features in one call:

In [None]:
HI_wings_region = SpectralRegion([(1.44*u.GHz, 1.43*u.GHz), (1.41*u.GHz, 1.4*u.GHz)])
HI_wings_region

While regions are useful for a variety of analysis steps, fundamentally they can be used to extract sub-spectra from larger spectra:

In [None]:
from specutils.manipulation import extract_region

subspec = extract_region(sdss_spec, pixel_region)
plt.step(subspec.wavelength, subspec.flux)

analysis.snr(subspec)
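
Conceptually, extracting a region amounts to keeping only the pixels whose spectral-axis values fall between the region bounds. The sketch below mimics that idea with a boolean mask in plain NumPy. It is only an illustration with made-up arrays; how `extract_region` treats pixels at the region edges may differ from this simple inclusive cut.

```python
import numpy as np

# Made-up spectral axis with 1 Angstrom steps and a flat flux array
wave_demo = np.linspace(6500.0, 6600.0, 101)
flux_demo = np.ones_like(wave_demo)

# Hypothetical region bounds in the same units as the spectral axis
lower_demo, upper_demo = 6540.0, 6560.0

# Keep only pixels inside the region (inclusive at both ends)
inside = (wave_demo >= lower_demo) & (wave_demo <= upper_demo)
sub_wave, sub_flux = wave_demo[inside], flux_demo[inside]
print(sub_wave.size, sub_wave[0], sub_wave[-1])
```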

# Line Measurements

While line-fitting (detailed more below) is a good choice for high signal-to-noise spectra or when detailed kinematics are desired, more empirical measures are often used in the literature for noisier spectra or just simpler analysis procedures. Specutils provides a set of functions to provide these sorts of measurements, as well as similar summary statistics about spectral regions.  The [analysis part of the specutils documentation](https://specutils.readthedocs.io/en/latest/analysis.html) provides a full list and detailed examples of these, but here we demonstrate some example cases.

Note: these line measurements generally assume your spectrum is continuum-subtracted or continuum-normalized. Some spectral pipelines do this for you, but often this is not the case.  For our examples here we will do this step "by-eye", but for a more detailed discussion of continuum modeling, see the next section.  Based on the above plot we estimate a continuum level for the area of the SDSS spectrum around the H-alpha emission line, and use basic math to construct the continuum-normalized and continuum-subtracted spectra.

In [None]:
# a reasonable by-eye continuum level for the H-alpha area of the spectrum
sdss_continuum = 205*subspec.flux.unit

sdss_halpha_contsub = extract_region(sdss_spec, ha_region) - sdss_continuum

plt.axhline(0, c='k', ls=':')
plt.step(sdss_halpha_contsub.wavelength, sdss_halpha_contsub.flux)
plt.ylim(-50, 50)

With the continuum level identified, we can now make some measurements of the spectral lines that are apparent by eye - in particular we will focus on the H-alpha emission line. While there are techniques for identifying the line automatically (see the fitting section below), here we assume we are doing "quick-look" procedures where manual identification is possible. 

In the cell below, change the values for `LOWER` and `UPPER` to make a spectral region that just encompasses the H-alpha line (the middle of the three lines). You may find it useful to change the values, re-run the cell, and change again to "home in" on the right numbers.

In [None]:
LOWER = 6562 * u.angstrom
UPPER = 6575 * u.angstrom
halpha_lines_region = SpectralRegion(LOWER, UPPER)

plt.step(sdss_halpha_contsub.wavelength, sdss_halpha_contsub.flux)

yl1, yl2 = plt.ylim()
plt.fill_between([halpha_lines_region.lower, halpha_lines_region.upper], 
                 yl1, yl2, alpha=.2)
plt.ylim(yl1, yl2)

You can now call a variety of analysis functions on the continuum-subtracted spectrum to estimate various properties of the line (you can see the full list of relevant analysis functions [in the analysis part of the specutils docs](https://specutils.readthedocs.io/en/stable/analysis.html#functions)):

In [None]:
analysis.centroid(sdss_halpha_contsub, halpha_lines_region)

In [None]:
analysis.fwhm(sdss_halpha_contsub, halpha_lines_region)

In [None]:
analysis.line_flux(sdss_halpha_contsub, halpha_lines_region)
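
To make these three measurements more concrete, the following sketch computes analogous quantities by hand for a synthetic, continuum-subtracted Gaussian line. All numbers are made up, and the formulas are the textbook ones (trapezoidal integral, flux-weighted mean, Gaussian FWHM); the specutils functions may differ in detail, for example `fwhm` measures the width from the data rather than assuming a Gaussian shape.

```python
import numpy as np

# Synthetic continuum-subtracted emission line (all values are made up)
wave = np.linspace(6540.0, 6590.0, 501)          # Angstrom
amp, mu, sig = 40.0, 6566.0, 2.0
f_line = amp * np.exp(-0.5 * ((wave - mu) / sig) ** 2)

# Line flux: trapezoidal integral of flux over wavelength
line_flux_demo = np.sum(0.5 * (f_line[1:] + f_line[:-1]) * np.diff(wave))

# Centroid: flux-weighted mean wavelength
centroid_demo = np.sum(f_line * wave) / np.sum(f_line)

# FWHM of a Gaussian follows directly from its standard deviation
fwhm_demo = 2.0 * np.sqrt(2.0 * np.log(2.0)) * sig

print(line_flux_demo, centroid_demo, fwhm_demo)
```

For a pure Gaussian the integrated flux should approach `amp * sig * sqrt(2 * pi)`, which gives a handy sanity check when you try this on your own spectra.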

Equivalent width, being a continuum-dependent property, can either be computed directly from the spectrum if the continuum level is given, or measured on a continuum-normalized spectrum. The latter is mainly useful if the continuum is non-uniform over the line being measured.

In [None]:
analysis.equivalent_width(sdss_spec, sdss_continuum, regions=halpha_lines_region)

In [None]:
sdss_halpha_contnorm = sdss_spec / sdss_continuum
analysis.equivalent_width(sdss_halpha_contnorm, regions=halpha_lines_region)

## Exercise

Load one of the spectrum datasets you made in the overview exercises into this notebook (i.e., your own dataset, a downloaded one, or the blackbody with an artificially added spectral feature).  Make a flux or width measurement of a line in that spectrum directly.  Is anything odd?

In [None]:
from astropy.modeling import models

wl = np.linspace(4000, 12000, 1024)*u.angstrom
flux = models.BlackBody(temperature=5800*u.K)(wl)

noise = np.random.randn(flux.size)*1e-6 * flux.unit
bbspec = Spectrum1D(spectral_axis=wl, flux=flux+noise, uncertainty=StdDevUncertainty(1e-6*np.ones_like(flux)))

gmodel = models.Gaussian1D(amplitude=bbspec.flux.mean(), mean=6000, stddev=50)
gflux = gmodel(bbspec.spectral_axis.value)

bbspec_w_gaussian = bbspec + Spectrum1D(spectral_axis=bbspec.spectral_axis, flux=gflux)
plt.step(bbspec_w_gaussian.spectral_axis, bbspec_w_gaussian.flux)

gregion = SpectralRegion(5900*u.angstrom, 6100*u.angstrom)
plt.axvline(gregion.lower, c='k')
plt.axvline(gregion.upper, c='k')

In [None]:
analysis.line_flux(bbspec_w_gaussian, gregion)

In [None]:
# Is that right?  Let's compare to a flat chunk of the size of the line region
200*3e-5

In [None]:
# It's too big! I.e., it includes the continuum

# Continuum Subtraction

While continuum-fitting for spectra is sometimes thought of as an "art" as much as a science, specutils provides the tools to do a variety of approaches to continuum-fitting, without making a specific recommendation about what is "best" (since it is often very data-dependent).  More details are available [in the relevant specutils doc section](https://specutils.readthedocs.io/en/latest/fitting.html#continuum-fitting), but here we outline the two basic options as it stands: an "often good-enough" function, and a more customizable tool that leans on the [`astropy.modeling`](http://docs.astropy.org/en/stable/modeling/index.html) models to provide its flexibility.

### The "often good-enough" way

The `fit_generic_continuum` function is often sufficient for reasonably well-behaved continua, particularly for "quick-look" or similar applications where high precision is not that critical.  The function yields a continuum model, which can be evaluated at any spectral axis value:

In [None]:
from specutils.fitting import fit_generic_continuum

In [None]:
generic_continuum = fit_generic_continuum(sdss_spec)

generic_continuum_evaluated = generic_continuum(sdss_spec.spectral_axis)

plt.step(sdss_spec.spectral_axis, sdss_spec.flux)
plt.plot(sdss_spec.spectral_axis, generic_continuum_evaluated)
plt.ylim(100, 300);

(Note that in some versions of astropy/specutils you may see a warning that the "Model is linear in parameters" upon executing the above cell. This is not a problem unless performance is a serious concern, in which case more customization is required.)

With this model in hand, continuum-subtracted or continuum-normalized spectra can be produced using basic spectral manipulations:

In [None]:
sdss_gencont_sub = sdss_spec - generic_continuum(sdss_spec.spectral_axis)
sdss_gencont_norm = sdss_spec / generic_continuum(sdss_spec.spectral_axis)

ax1, ax2 = plt.subplots(2, 1)[1]

ax1.step(sdss_gencont_sub.wavelength, sdss_gencont_sub.flux)
ax1.set_ylim(-50, 50)
ax1.axhline(0, color='k', ls=':')  # continuum should be at flux=0

ax2.step(sdss_gencont_norm.wavelength, sdss_gencont_norm.flux)
ax2.set_ylim(0, 2)
ax2.axhline(1, color='k', ls='--');  # continuum should be at flux=1

### The customizable way

The `fit_continuum` function operates similarly to `fit_generic_continuum`, but is meant for you to provide your favorite continuum model rather than being tailored to a specific continuum model. To see the list of models, see the [astropy.modeling documentation](http://docs.astropy.org/en/stable/modeling/index.html).

In [None]:
from specutils.fitting import fit_continuum
from astropy.modeling import models

For example, suppose you want to use a 3rd-degree Chebyshev polynomial as your continuum model. You can use `fit_continuum` to get an object that behaves the same as for `fit_generic_continuum`:

In [None]:
chebdeg3_continuum = fit_continuum(sdss_spec, models.Chebyshev1D(3))

# evaluate the fitted Chebyshev model on the spectrum's own spectral axis
chebdeg3_continuum_evaluated = chebdeg3_continuum(sdss_spec.spectral_axis)

plt.step(sdss_spec.spectral_axis, sdss_spec.flux)
plt.plot(sdss_spec.spectral_axis, chebdeg3_continuum_evaluated)
plt.ylim(100, 300);

This then provides total flexibility.  For example, you can also try other polynomials like higher-degree Hermite polynomials:

In [None]:
hermdeg7_continuum = fit_continuum(sdss_spec, models.Hermite1D(degree=7))
hermdeg17_continuum = fit_continuum(sdss_spec, models.Hermite1D(degree=17))

plt.step(sdss_spec.spectral_axis, sdss_spec.flux)
plt.plot(sdss_spec.spectral_axis, chebdeg3_continuum(sdss_spec.spectral_axis))
plt.plot(sdss_spec.spectral_axis, hermdeg7_continuum(sdss_spec.spectral_axis))
plt.plot(sdss_spec.spectral_axis, hermdeg17_continuum(sdss_spec.spectral_axis))
plt.ylim(150, 250);

This immediately demonstrates the tradeoffs in polynomial fitting: while the high-degree polynomials capture the wiggles of the spectrum better than the low, they also *over*-fit near the strong emission lines.

## Exercise

Try combining the `SpectralRegion` and continuum-fitting functionality to only fit the parts of the spectrum that *are* continuum (i.e. not including emission lines).  Can you do better?

In [None]:
LOWER = 6562 * u.angstrom
UPPER = 6575 * u.angstrom
halpha_lines_region_ex = SpectralRegion(LOWER, UPPER)

In [None]:
better_hermdeg17_continuum = fit_continuum(sdss_spec, models.Hermite1D(degree=17), exclude_regions=halpha_lines_region_ex)

In [None]:
plt.step(sdss_spec.spectral_axis, sdss_spec.flux)
plt.plot(sdss_spec.spectral_axis, hermdeg17_continuum(sdss_spec.spectral_axis))
plt.plot(sdss_spec.spectral_axis, better_hermdeg17_continuum(sdss_spec.spectral_axis))
plt.ylim(100, 300);

In [None]:
# Apparent that the fit is better around the Halpha line
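
The reason excluding the line helps can be seen in a small NumPy-only sketch. It fits a 3rd-degree Chebyshev polynomial to a synthetic spectrum (a gently sloped continuum plus one emission line, all made up) twice: once on every pixel and once with a hypothetical window around the line masked out. This uses `numpy.polynomial.Chebyshev.fit` as a stand-in and is not how `fit_continuum` is implemented.

```python
import numpy as np
from numpy.polynomial import Chebyshev

# Synthetic spectrum: sloped continuum plus a Gaussian emission line (made up)
wave = np.linspace(6513.0, 6613.0, 201)
true_cont = 200.0 + 0.05 * (wave - 6563.0)
f_spec = true_cont + 100.0 * np.exp(-0.5 * ((wave - 6563.0) / 3.0) ** 2)

# Fit every pixel, emission line included
fit_all = Chebyshev.fit(wave, f_spec, deg=3)

# Fit only pixels outside a hypothetical exclusion window around the line
keep = (wave < 6543.0) | (wave > 6583.0)
fit_masked = Chebyshev.fit(wave[keep], f_spec[keep], deg=3)

# Compare each fit to the true continuum (200.0) at the line center
err_all = fit_all(6563.0) - 200.0
err_masked = fit_masked(6563.0) - 200.0
print(err_all, err_masked)
```

In this noise-free setup the masked fit should recover the continuum almost exactly, while the unmasked fit is pulled upward by the line.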

## Exercise

Using the spectrum from the previous exercise, first subtract a continuum, then re-do your measurement.  Is it better?

In [None]:
generic_continuum_bbspec = fit_generic_continuum(bbspec) # we cheat a smidge and fit on the line-less version
bbspec_w_gaussian_contsub = bbspec_w_gaussian - generic_continuum_bbspec(bbspec_w_gaussian.spectral_axis)

plt.step(bbspec_w_gaussian_contsub.spectral_axis, bbspec_w_gaussian_contsub.flux)

In [None]:
analysis.line_flux(bbspec_w_gaussian_contsub, gregion)

In [None]:
# Much better number than before!

# Line-Fitting

In addition to the more empirical measurements described above, `specutils` provides tools for doing spectral line fitting. The approach is akin to that for continuum modeling: models from [astropy.modeling](http://docs.astropy.org/en/stable/modeling/index.html) are fit to the spectrum, and either the fitted models or their parameters can then be used directly.

In [None]:
from specutils import fitting

The fitting machinery must first be given guesses for line locations. This process can be automated using functions designed to identify lines (more detail on the options is [in the docs](https://specutils.readthedocs.io/en/latest/fitting.html#line-finding)).  For data sets where these algorithms are not ideal, you may substitute your own (i.e., skip this step and start with line location guesses). 

Here we identify the three lines near the Halpha region in our SDSS spectrum, finding the lines above about a $\sim 3 \sigma$ flux threshold.  They are then output as an astropy Table:

In [None]:
halpha_lines = fitting.find_lines_threshold(sdss_halpha_contsub, 3)

plt.step(sdss_halpha_contsub.spectral_axis, sdss_halpha_contsub.flux, where='mid')
for line in halpha_lines:
    plt.axvline(line['line_center'], color='k', ls=':')

halpha_lines

(If you see a warning about the signal-to-noise, you can ignore it, or follow the instructions it gives to suppress the warning. It is occurring because our cutout has a lot of real flux so it *could* be the case instead that we forgot to subtract the continuum.)
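
For readers who want to substitute their own line finder, the sketch below shows one very simple threshold-based approach in plain NumPy: flag pixels above a multiple of the noise, group contiguous flagged pixels, and report the peak of each group. This is a simplified stand-in with made-up data and an assumed constant noise level, not the algorithm used by `find_lines_threshold`.

```python
import numpy as np

# Synthetic continuum-subtracted spectrum with three emission lines (made up)
wave = np.linspace(6513.0, 6613.0, 401)
noise_sigma = 2.0                                  # assumed per-pixel noise level
centers_true = [6550.0, 6565.0, 6585.0]
amps_true = [15.0, 40.0, 20.0]
f_lines = sum(a * np.exp(-0.5 * ((wave - c) / 1.5) ** 2)
              for a, c in zip(amps_true, centers_true))

# Flag pixels that exceed 3 times the noise level
above = f_lines > 3 * noise_sigma

# Split the flagged pixel indices into contiguous runs, one per candidate line
idx = np.flatnonzero(above)
runs = np.split(idx, np.flatnonzero(np.diff(idx) > 1) + 1)

# Report the wavelength of the brightest pixel in each run
found = [wave[run[np.argmax(f_lines[run])]] for run in runs]
print(found)
```

A finder like this only handles emission features and will merge lines whose flagged pixels touch, which is one reason the fitting stage that follows matters for blended lines.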

Now for each of these lines, we need to fit a model. Sometimes it is sufficient to simply create a model whose center is at the line and excise the appropriate area around the line to do a line estimate.  This is not *too* sensitive to the size of the region, at least for well-separated lines like these.  The result is a list of models that carry with them the details of the fit:

In [None]:
halpha_line_models = []
for line in halpha_lines:
    line_region = SpectralRegion(line['line_center']-5*u.angstrom,
                                 line['line_center']+5*u.angstrom)
    line_spectrum = extract_region(sdss_halpha_contsub, line_region)
    # workaround: rebuild the spectrum from its flux, spectral axis, and uncertainty
    line_spectrum = Spectrum1D(flux=line_spectrum.flux, spectral_axis=line_spectrum.spectral_axis, uncertainty=line_spectrum.uncertainty)
    line_estimate = models.Gaussian1D(mean=line['line_center'])
    line_model = fitting.fit_lines(line_spectrum, line_estimate)
    
    halpha_line_models.append(line_model)
    
plt.step(sdss_halpha_contsub.spectral_axis, sdss_halpha_contsub.flux, where='mid')
for line_model in halpha_line_models:
    evaluated_model = line_model(sdss_halpha_contsub.spectral_axis)
    plt.plot(sdss_halpha_contsub.spectral_axis, evaluated_model)  
    
halpha_line_models

For more complicated models or fits it may be better to use the `estimate_line_parameters` function instead of manually creating e.g. a `Gaussian1D` model and setting the center.  An example of this pattern is given below.

Note that we provide a default `Gaussian1D` model to the `estimate_line_parameters` function below.  This function makes reasonable guesses for `Gaussian1D`, `Voigt1D`, and `Lorentz1D`, the most common line profiles used for spectral lines, but may or may not work for other models.  See [the relevant docs section](https://specutils.readthedocs.io/en/latest/fitting.html#parameter-estimation) for more details.

In this example we also show a *joint* fit of all three lines at the same time.  While the difference may seem subtle, in cases of blended lines this typically provides much better fits:

In [None]:
halpha_line_estimates = []
for line in halpha_lines:
    line_region = SpectralRegion(line['line_center']-3*u.angstrom,
                                 line['line_center']+3*u.angstrom)
    line_spectrum = extract_region(sdss_halpha_contsub, line_region)
    line_estimate = fitting.estimate_line_parameters(line_spectrum, models.Gaussian1D())
    
    halpha_line_estimates.append(line_estimate)

# this could be done more flexibly with a for loop but we are explicit here for simplicity
combined_model_estimate = halpha_line_estimates[0] + halpha_line_estimates[1] + halpha_line_estimates[2]
combined_model_estimate
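
To give a feel for what a data-driven initial guess can look like, here is a NumPy sketch that derives Gaussian starting parameters from simple moments of the data: the peak flux for the amplitude, the flux-weighted mean for the center, and the flux-weighted second moment for the width. Treat it as one common recipe under made-up data; whether `estimate_line_parameters` uses exactly these estimators is an assumption, so consult the docs linked above.

```python
import numpy as np

# Synthetic continuum-subtracted Gaussian line (all values are made up)
wave = np.linspace(6553.0, 6579.0, 261)
amp_true, mu_true, sig_true = 40.0, 6566.0, 2.0
f_line = amp_true * np.exp(-0.5 * ((wave - mu_true) / sig_true) ** 2)

# Amplitude guess: the peak flux in the cutout
amp_guess = f_line.max()

# Center guess: flux-weighted mean wavelength
mean_guess = np.sum(f_line * wave) / np.sum(f_line)

# Width guess: square root of the flux-weighted second central moment
std_guess = np.sqrt(np.sum(f_line * (wave - mean_guess) ** 2) / np.sum(f_line))

print(amp_guess, mean_guess, std_guess)
```

Moment-based guesses like these degrade when the cutout contains noise with negative flux values or a neighboring line, which is why the cutout regions above are kept narrow.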

In [None]:
combined_model = fitting.fit_lines(sdss_halpha_contsub, combined_model_estimate)

plt.step(sdss_halpha_contsub.spectral_axis, sdss_halpha_contsub.flux, where='mid')
plt.plot(sdss_halpha_contsub.spectral_axis, 
         combined_model(sdss_halpha_contsub.spectral_axis))  
    
combined_model
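
Why a joint fit helps for blended lines can be illustrated with a deliberately simplified NumPy sketch. Here the centers and widths of two overlapping Gaussians are held fixed (an assumption made only so the problem becomes linear), and just the amplitudes are fit, once for a single line in isolation and once for both together. The nonlinear fit performed by `fit_lines` is more general than this.

```python
import numpy as np

def gauss(x, mu, sig):
    """Unit-amplitude Gaussian profile."""
    return np.exp(-0.5 * ((x - mu) / sig) ** 2)

# Two blended lines with made-up true amplitudes of 10 and 6
wave = np.linspace(6540.0, 6600.0, 301)
g1 = gauss(wave, 6563.0, 3.0)
g2 = gauss(wave, 6568.0, 3.0)
f_blend = 10.0 * g1 + 6.0 * g2

# Independent fit: the first amplitude is fit as if the second line were absent
amp1_alone = np.dot(g1, f_blend) / np.dot(g1, g1)

# Joint fit: both amplitudes are solved for together by linear least squares
design = np.column_stack([g1, g2])
(amp1_joint, amp2_joint), *_ = np.linalg.lstsq(design, f_blend, rcond=None)

print(amp1_alone, amp1_joint, amp2_joint)
```

The independent fit attributes part of the neighboring line's flux to the first line and so overestimates its amplitude, while the joint fit separates the two contributions.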

## Exercise

Fit a spectral feature from your own spectrum using the fitting methods outlined above. Try the different line profile types (Gaussian, Lorentzian, or Voigt).  If you are using the blackbody spectrum (where you know the "true" answer for the spectral line), compare your answer to the true answer.

In [None]:
plt.step(bbspec_w_gaussian.spectral_axis, bbspec_w_gaussian.flux);

In [None]:
line_region = SpectralRegion(5800*u.angstrom, 6200*u.angstrom)
line_spectrum = extract_region(bbspec_w_gaussian_contsub, line_region)
line_estimate = fitting.estimate_line_parameters(line_spectrum, models.Gaussian1D())
line_estimate

In [None]:
line_model = fitting.fit_lines(line_spectrum, line_estimate)
line_model

In [None]:
plt.step(line_spectrum.spectral_axis, line_spectrum.flux);
model_wl = np.linspace(line_spectrum.spectral_axis[0], line_spectrum.spectral_axis[-1], 100)
plt.plot(model_wl, line_model(model_wl))

In [None]:
# Bonus: we can fit the *combined* spectrum since it's a blackbody with a single line:

In [None]:
# amplitude of 1 for the bb is what we used to create it so that's a good estimate
combined_estimated_model = line_estimate + models.BlackBody(temperature=5800*u.K) 

plt.step(bbspec_w_gaussian.spectral_axis, bbspec_w_gaussian.flux);
model_wl = np.linspace(bbspec_w_gaussian.spectral_axis[0], bbspec_w_gaussian.spectral_axis[-1], 100)
plt.plot(model_wl, combined_estimated_model(model_wl))
combined_estimated_model

In [None]:
# notice this is using the *line spectrum* to focus on getting the line part right
full_spec_model = fitting.fit_lines(line_spectrum, combined_estimated_model)
full_spec_model

In [None]:
plt.step(bbspec_w_gaussian.spectral_axis, bbspec_w_gaussian.flux);
model_wl = np.linspace(bbspec_w_gaussian.spectral_axis[0], bbspec_w_gaussian.spectral_axis[-1], 100)
plt.plot(model_wl, full_spec_model(model_wl))

In [None]:
# huzzah!  the fit gives a pretty close answer given the noise
gmodel, full_spec_model.left