# SPLAT Tutorials: Working with Spectral Datasets

## Authors
Adam Burgasser

## Version date
27 May 2023

## Learning Goals
* Select sets of spectra from the SPLAT library (splat.searchLibrary)
* Analyze these sets together (splat.classifyByStandard, splat.measureIndexSet)
* Plot batches of spectra (splat.plot.plotSpectrum, splat.plot.plotBatch) 

## Keywords
spectral archive, spectral analysis, indices, classification

## Companion Content
None

## Summary
In this tutorial, we will see how to select subsets of spectra from the SPLAT library and run some basic analyses on them, with results that can be saved to spreadsheets.


In [None]:
# main splat import
import splat
import splat.plot as splot

# other useful imports
import matplotlib.pyplot as plt
import numpy as np
import pandas
import astropy.units as u
%matplotlib inline


# Selecting sets of spectra

Here we're going to see how to select sets of spectra using splat.searchLibrary(). It is generally faster to read in a spreadsheet of the sources you are interested in before you actually read in the spectral data. Below are several ways of selecting subsets of spectra.

In [None]:
splat.searchLibrary?

In [None]:
# selecting by spectral type
dp = splat.searchLibrary(spt='T5')
dp

In [None]:
# selecting by spectral type range
dp = splat.searchLibrary(spt=['L5','L8'])
dp

In [None]:
# selecting by spectral type range and signal-to-noise (value given is minimum S/N)
dp = splat.searchLibrary(spt=['L5','L8'],snr=50)
dp

In [None]:
# selecting by OPTICAL spectral type range and signal-to-noise (value given is minimum S/N)
dp = splat.searchLibrary(opt_spt=['L5','L8'],snr=50)
dp

In [None]:
# select young L dwarfs
dp = splat.searchLibrary(opt_spt=['L0','L9'],young=True)
dp

In [None]:
# select metal-poor L dwarfs
dp = splat.searchLibrary(opt_spt=['L0','L9'],subdwarf=True)
dp

In [None]:
# select giants
dp = splat.searchLibrary(giant=True)
dp
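
Because searchLibrary() returns a pandas table, a selection can be trimmed further with ordinary DataFrame operations before any spectra are read. The sketch below uses a small stand-in table rather than a real query. DATA_KEY appears later in this tutorial, but the MEDIAN_SNR column name and all of the values are assumptions for illustration, so check `dp.columns` on a real result first.

```python
import pandas as pd

# stand-in for a searchLibrary() result; all values are made up
dp = pd.DataFrame({
    'DATA_KEY': [101, 102, 103, 104],
    'MEDIAN_SNR': [35.0, 120.0, 60.0, 15.0],   # assumed column name
})

# keep the two rows with the highest signal-to-noise
best = dp.sort_values('MEDIAN_SNR', ascending=False).head(2)
best_keys = best['DATA_KEY'].tolist()
print(best_keys)
```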

# Reading in the spectra

Once you've identified the spectra you want, you can read them in based on the spreadsheet info or splat.getSpectrum(). Be sure you have a manageable list!

In [None]:
# select metal-poor L dwarfs
# then read in using the data key
dp = splat.searchLibrary(opt_spt=['L0','L9'],subdwarf=True)
splist = []
for i in dp['DATA_KEY']:
    splist.append(splat.Spectrum(i))
    print('Read in spectrum of {}'.format(splist[-1].name))
splist

In [None]:
# do the same but read in by filename
dp = splat.searchLibrary(opt_spt=['L0','L9'],subdwarf=True)
splist = []
for f in dp['DATA_FILE']:
    splist.append(splat.Spectrum(file=f))
    print('Read in spectrum of {}'.format(splist[-1].name))
splist

In [None]:
# the same syntax can be used to read in a list of spectra using splat.getSpectrum()
splist = splat.getSpectrum(opt_spt=['L0','L9'],subdwarf=True)
splist
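
Since every spectrum is read from disk, it can help to check the size of a selection before looping over it. Here is a minimal sketch of such a guard. The names `read_limited`, `max_spectra`, and `fake_loader` are hypothetical and not part of SPLAT; in this notebook the loader could be something like `lambda k: splat.Spectrum(k)`.

```python
def read_limited(keys, loader, max_spectra=50):
    """Apply loader to every key, raising first if the list exceeds max_spectra."""
    keys = list(keys)
    if len(keys) > max_spectra:
        raise ValueError(
            '{} spectra selected, limit is {}; narrow the search'.format(
                len(keys), max_spectra))
    return [loader(k) for k in keys]

# stand-in loader so the sketch runs without SPLAT installed
fake_loader = lambda key: 'spectrum-{}'.format(key)

small = read_limited([10001, 10002, 10003], fake_loader, max_spectra=5)
print(small)
```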

# Measurements on samples of spectra

We can add measurements to the pandas spreadsheet created by searchLibrary(), which is a convenient way to manage and save analyses.

In [None]:
# let's measure the classifications of our sources
dp = splat.searchLibrary(opt_spt=['L0','L9'],subdwarf=True)
dp['SPEX_SPT'] = ['']*len(dp)
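# note the use of enumerate here, which gives us the row position i
col = dp.columns.get_loc('SPEX_SPT')
for i,f in enumerate(dp['DATA_FILE']):
    sp = splat.Spectrum(file=f)
    spt,spt_e = splat.classifyByStandard(sp,method='kirkpatrick')
    # assign by row and column position in one step (avoids chained indexing)
    dp.iloc[i,col] = spt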
dp['SPEX_SPT']
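
Writing into a single cell with `dp['col'].iloc[i] = value` is chained indexing, which pandas may apply to a temporary copy and which triggers a SettingWithCopyWarning in many versions. Here is a small sketch of the single-step alternative on a stand-in table; the file names and classifications are made up.

```python
import pandas as pd

dp = pd.DataFrame({'DATA_FILE': ['a.fits', 'b.fits', 'c.fits']})
dp['SPEX_SPT'] = [''] * len(dp)

results = ['sdL1.0', 'sdL3.5', 'sdL0.0']   # made-up classifications
col = dp.columns.get_loc('SPEX_SPT')       # integer position of the column
for i, spt in enumerate(results):
    dp.iloc[i, col] = spt                  # row and column selected in one step

print(dp['SPEX_SPT'].tolist())
```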

In [None]:
# another way of doing this
dp = splat.searchLibrary(opt_spt=['L0','L9'],subdwarf=True)
spts = []
# no row index is needed here, so we can loop over the file names directly
for f in dp['DATA_FILE']:
    sp = splat.Spectrum(file=f)
    spts.append(splat.classifyByStandard(sp,method='kirkpatrick')[0])
dp['SPEX_SPT'] = spts
dp['SPEX_SPT']
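
Once the measurements are in the table, saving them to a spreadsheet is a single pandas call. Below is a minimal sketch using a stand-in table and a temporary directory. The file name `subdwarf_types.csv` is arbitrary, and `DataFrame.to_excel` is an alternative if an Excel writer such as openpyxl is installed.

```python
import os
import tempfile
import pandas as pd

# stand-in table; file names and types are made up
dp = pd.DataFrame({
    'DATA_FILE': ['a.fits', 'b.fits'],
    'SPEX_SPT': ['sdL1.0', 'sdL3.5'],
})

outdir = tempfile.mkdtemp()
outfile = os.path.join(outdir, 'subdwarf_types.csv')
dp.to_csv(outfile, index=False)    # index=False leaves out the row numbers

# read it back to confirm the round trip
check = pd.read_csv(outfile)
print(check)
```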

In [None]:
# here's how you can measure many indices on the spectra and store them to your pandas dataframe
dp = splat.searchLibrary(opt_spt=['L0','L9'],subdwarf=True)

# first figure out what indices we're measuring
# the names of the indices are in the keys
sp = splat.Spectrum(file=dp['DATA_FILE'].iloc[0])
ind = splat.measureIndexSet(sp)
# convert to a list so it can be reused for column selection below
indices = list(ind.keys())

# add these to the dataframe
for i in indices: dp[i] = np.zeros(len(dp))
    
# now measure all of the spectra
for i,f in enumerate(dp['DATA_FILE']):
    sp = splat.Spectrum(file=f)
    ind = splat.measureIndexSet(sp)
    for indname in indices:
        # assign by row and column position in one step (avoids chained indexing)
        dp.iloc[i,dp.columns.get_loc(indname)] = ind[indname][0]

# print out the values you've measured
dp[indices]
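
The loop above keeps only the first element of each entry returned by measureIndexSet(). An alternative sketch collects one row per spectrum and builds the table once, keeping the second element as well. The dictionaries here are made-up stand-ins with the same name -> (first, second) shape. Treating the second element as the uncertainty is an assumption, and the `_E` suffix is just a naming choice.

```python
import pandas as pd

# made-up stand-ins for the dictionaries returned for two spectra
measurements = [
    {'H2O-J': (0.82, 0.01), 'CH4-J': (0.91, 0.02)},
    {'H2O-J': (0.75, 0.02), 'CH4-J': (0.88, 0.01)},
]

rows = []
for ind in measurements:
    row = {}
    for name, entry in ind.items():
        row[name] = entry[0]           # measured value
        row[name + '_E'] = entry[1]    # assumed to be the uncertainty
    rows.append(row)

index_table = pd.DataFrame(rows)
print(index_table)
```

The resulting table can then be attached to the source list with `pandas.concat([dp.reset_index(drop=True), index_table], axis=1)`.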


# Plotting batches of spectra

Here are some examples of plotting samples of spectra using either plotSpectrum() or plotBatch(); you can see more examples at this page: https://spl-toolkit.readthedocs.io/en/latest/splat_plot/

In [None]:
# learn more about these functions
splot.plotSpectrum?

In [None]:
# learn more about these functions
splot.plotBatch?

In [None]:
# read in batch of spectra
splist = splat.getSpectrum(opt_spt=['L0','L9'],subdwarf=True)

In [None]:
# now plot them all using plotSpectrum with the multiplot option
splot.plotSpectrum(splist,multiplot=True)

In [None]:
# let's clean this up a bit by making a 2x2 grid
splot.plotSpectrum(splist,multiplot=True,layout=[2,2])

In [None]:
# the normalization is not so great here, so let's first normalize the spectra in a certain range
# and then set the y-axis range
for sp in splist: sp.normalize([0.9,1.4])
splot.plotSpectrum(splist,multiplot=True,layout=[2,2],yrange=[-0.05,1.2])
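
To see why normalizing over a fixed wavelength range puts spectra on a common y-axis, here is a NumPy sketch of one plausible definition: divide by the peak flux inside the range. This is an assumption about what a range-based normalization does, not a copy of SPLAT's implementation. The helper `normalize_in_range` is hypothetical, and the wavelength and flux arrays are synthetic.

```python
import numpy as np

def normalize_in_range(wave, flux, wave_range):
    """Scale flux so that its maximum within wave_range equals 1."""
    lo, hi = wave_range
    in_range = (wave >= lo) & (wave <= hi)
    if not in_range.any():
        raise ValueError('no wavelength points inside the requested range')
    return flux / np.nanmax(flux[in_range])

wave = np.linspace(0.8, 2.4, 161)                            # micron, synthetic grid
flux = 3.0e-15 * np.exp(-0.5 * ((wave - 1.25) / 0.3) ** 2)   # synthetic bump near 1.25 micron
scaled = normalize_in_range(wave, flux, [0.9, 1.4])
print(scaled.max())
```

After this scaling, every spectrum peaks at 1 within the chosen range, so a shared y-axis range such as [-0.05,1.2] shows them all at a comparable height.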

In [None]:
# now let's add some details, including the legend giving the name of the source
# and labeling L dwarf features; we'll also save this out as a multi-page pdf file
names = [sp.name for sp in splist]
splot.plotSpectrum(splist,multiplot=True,layout=[2,2],yrange=[-0.05,1.2],legend=names,features=['h2o','feh','co'],telluric=True,grid=True,multipage=True,file='myplot.pdf')
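
With a 2x2 layout each page holds four panels, so the number of pages in a multi-page file follows from the number of spectra. Here is a quick sketch of that arithmetic. The helper `paginate` is hypothetical and only illustrates the grouping, not how SPLAT builds its pages internally.

```python
import math

def paginate(items, layout=(2, 2)):
    """Split items into consecutive groups of rows*cols, one group per page."""
    per_page = layout[0] * layout[1]
    return [items[i:i + per_page] for i in range(0, len(items), per_page)]

names = ['source {}'.format(i) for i in range(1, 11)]   # ten made-up names
pages = paginate(names)
n_pages = math.ceil(len(names) / 4)                     # same count from the arithmetic
print(len(pages), n_pages, [len(p) for p in pages])
```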


In [None]:
# plotBatch does many of these tasks in a compact way; here's the baseline call
splot.plotBatch(splist)


In [None]:
# now with the same options as before
# NOTE: ignore the warning messages here
splot.plotBatch(splist,features=['h2o','feh','co'],telluric=True,grid=True,yrange=[-0.05,1.2],output='myplot.pdf')


In [None]:
# plotBatch has a nice feature in that it can automatically classify spectra
# NOTE: the scaling on this doesn't seem to be working properly right now!
splot.plotBatch(splist,classify=True,normalize=True)


In [None]:
# here's an example of comparing all of our sources to one particular comparison source, the sdL0.0 standard
# The subdwarf standards are contained in the splat.STDS_SD_SPEX variable
splat.initializeStandards(allstds=True)
comptype = 'sdL0.0'
spcomp = splat.STDS_SD_SPEX[comptype]
spcomp.normalize([0.9,1.4])
names = ['{} vs {}'.format(sp.name,comptype) for sp in splist]

splot.plotSpectrum(splist,multiplot=True,layout=[2,2],yrange=[-0.05,1.2],legend=names,comparison=spcomp,colorComparison='r')
