# STag Demonstration

The following iPython Jupyter notebook gives a step-by-step demonstration of how to use STag to get the tag probabilities and the predicted class for a spectra.

# Setup

The first step is to read in the beta values for each of the tags as well as an example spectrum (this can be modified to read in an appropriate spectrum of your choice).


In [1]:
import beta_reader
import numpy as np
import os 

path = os.getcwd()
beta = beta_reader.beta_reader(path)
spectra = '%s/DES15C2aty_C2_combined_150917_v03_b00.fits' % path
name = 'DES15C2aty'
z = 0.149

# Pre-processing

In order to use STag, spectra need to be pre-processed appropriately. This involves filtering, de-redshifting, binning, continuum removal, apodisation, and scaling.

All of these steps are handled by the spectra_preprocessing package, which largely uses methods made for the software [DASH](https://github.com/daniel-muthukrishna/astrodash).

In [2]:
import spectra_preprocessing as sp
from astropy.io import fits

#Read in the fits file of the spectra and extract the flux and wavelength
fits_file = spectra
table = fits.open(fits_file)
flux = table[0].data
w0 = table[0].header['CRVAL1']
dw = table[0].header['CDELT1']
p0 = table[0].header['CRPIX1']
nlam = len(flux)
wave = w0+dw*(np.arange(nlam, dtype='d')-p0)
table.close()

full = np.column_stack((wave, flux))

#Initialise for pre-processing
preProcess = sp.PreProcessing(full, 2500, 10000, 1024)

#Do the pre-processing steps
sfWave, sfFlux, minInd, maxInd, sfZ, sfArea = preProcess.two_column_data(z, smooth=6, minWave=2500, maxWave=10000)

#Do scaling                                                                            
flux_pro = sfFlux/sfArea

# Cutting the Spectra

Many of the tags use specific wavelength ranges of the spectrum rather than the whole thing and so we create multiple instances of the original spectrum cut at the corresponding wavelengths for each tag. 

In [3]:
class feature_data(object):
    """a class for holding the wavelength and flux for a specific tag."""
    def __init__(self, label):
        self.label = label
        self.wavelength = []
        self.flux = []

cuts = np.genfromtxt('%s/cuts.txt' % path, dtype=int)
#Silicon 6355
si_6355_tag = feature_data('Si 6355')
si_6355_tag.wavelength = cuts[0]
si_6355_tag.flux = flux_pro[si_6355_tag.wavelength[0]:si_6355_tag.wavelength[1]]

#Calcium H&K
ca_tag = feature_data('Ca')
ca_tag.wavelength = cuts[2]
ca_tag.flux = flux_pro[ca_tag.wavelength[0]:ca_tag.wavelength[1]]

#Iron 4924
fe_4924_tag = feature_data('Fe 4924')
fe_4924_tag.wavelength = cuts[4]
fe_4924_tag.flux = flux_pro[fe_4924_tag.wavelength[0]:fe_4924_tag.wavelength[1]]

#Sulphur
s_tag = feature_data('S')
s_tag.wavelength = cuts[5]
s_tag.flux = flux_pro[s_tag.wavelength[0]:s_tag.wavelength[1]]

#Hydrogen alpha
ha_tag = feature_data('HA')
ha_tag.wavelength = cuts[6]
ha_tag.flux = flux_pro[ha_tag.wavelength[0]:ha_tag.wavelength[1]]

#5876 absorption feature
line_a_tag = feature_data('Na I')
line_a_tag.wavelength = cuts[8]
line_a_tag.flux = flux_pro[line_a_tag.wavelength[0]:line_a_tag.wavelength[1]]

#Helium 6450
he_6450_tag = feature_data('He 6450')
he_6450_tag.wavelength = cuts[9]
he_6450_tag.flux = flux_pro[he_6450_tag.wavelength[0]:he_6450_tag.wavelength[1]]

#Iron 5018
fe_5018_tag = feature_data('Fe 5018')
fe_5018_tag.wavelength = cuts[11]
fe_5018_tag.flux = flux_pro[fe_5018_tag.wavelength[0]:fe_5018_tag.wavelength[1]]

#Iron 5170
fe_5170_tag = feature_data('Fe 5170')
fe_5170_tag.wavelength = cuts[12]
fe_5170_tag.flux = flux_pro[fe_5170_tag.wavelength[0]:fe_5170_tag.wavelength[1]]

#Hydrogen gamma
hg_tag = feature_data('HG')
hg_tag.wavelength = cuts[13]
hg_tag.flux = flux_pro[hg_tag.wavelength[0]:hg_tag.wavelength[1]]

#Silicon 4000
si_4000_tag = feature_data('Si 4000')
si_4000_tag.wavelength = cuts[14]
si_4000_tag.flux = flux_pro[si_4000_tag.wavelength[0]:si_4000_tag.wavelength[1]]

#Hydrogen beta
hb_tag = feature_data('HB')
hb_tag.wavelength = cuts[15]
hb_tag.flux = flux_pro[hb_tag.wavelength[0]:hb_tag.wavelength[1]]

# Tagging

With spectra pre-processed and the necessary cuts made, we can now get the tag probabilities of the spectra and add them to an array ready to be given to the trained classifier.

In [12]:
from tagging import log_reg_two

final = np.zeros([1,14])

#Get Hydrogen alpha tag probabilities
HA_result = log_reg_two(ha_tag.flux, beta[0])
final[0][0] = HA_result

#Get Hydrogen beta tag probabilities
HB_result = log_reg_two(hb_tag.flux, beta[11])
final[0][1] = HB_result

#Get Hydrogen gamma tag probabilities
HG_result = log_reg_two(hg_tag.flux, beta[9])
final[0][2] = HG_result

#Get Silicon 4000 tag probabilities
Si_4000_result = log_reg_two(si_4000_tag.flux, beta[10])
final[0][3] = Si_4000_result

#Get Silicon 6355 tag probabilities
Si_6355_result = log_reg_two(si_6355_tag.flux, beta[1])
final[0][4] = Si_6355_result

#Get Sulphur tag probabilities
S_result = log_reg_two(s_tag.flux, beta[5])
final[0][5] = S_result

#Get Helium 6450 tag probabilities
He_6450_result = log_reg_two(he_6450_tag.flux, beta[6])
final[0][6] = He_6450_result

#Get feature at 5876 tag probabilities
line_abs_result = log_reg_two(line_a_tag.flux, beta[2])
final[0][7] = line_abs_result

#Silicon 5876
si_5876 = ((Si_6355_result + Si_4000_result)/2) * line_abs_result
final[0][8] = si_5876

#Helium 5876
he_5876 = (He_6450_result/1) * line_abs_result
final[0][9] = he_5876

#Get Calcium tag probabilities
Ca_result = log_reg_two(ca_tag.flux, beta[3])
final[0][10] = Ca_result

#Get Fe 4924 tag probabilities
Fe_4924_result = log_reg_two(fe_4924_tag.flux, beta[4])
final[0][11] = Fe_4924_result

#Get Fe 5018 tag probabilities
Fe_5018_result = log_reg_two(fe_5018_tag.flux, beta[7])
final[0][12] = Fe_5018_result

#Get Fe 5170 tag probabilities
Fe_5170_result = log_reg_two(fe_5170_tag.flux, beta[8])
final[0][13] = Fe_5170_result

# Tag Probabilities

One of the key features of STag is that all of the tags have probabilties, which can be accessed on demand.

In [13]:
tag_names = ['HA      ','HB      ','HG      ','Si 4000 ','Si 6355 ','S       ','He 6450 ', '5876    ','Si 5876 ','He 5876 ','Ca      ','Fe 4924 ','Fe 5018 ','Fe 5170 ']
for i in range(0,len(tag_names)):
    print("{0:s} {1:5.2f}".format(tag_names[i],final[0][i]))

HA        0.00
HB        0.00
HG        0.26
Si 4000   0.71
Si 6355   0.99
S         0.01
He 6450   0.01
5876      0.87
Si 5876   0.74
He 5876   0.01
Ca        1.00
Fe 4924   0.99
Fe 5018   0.00
Fe 5170   0.00


# Classifying

We can now make our predictions for the class of the supernova by using the trained model. Since we are using softmax, we use 'np.argmax' to select the class with the highest probability, though one can see the probabilities of all the classes by printing 'class_prob'.

The predicted class is given a number, which corresponds to one of the 5 possible classes:

0 = Type Ia-norm

1 = Type Ia-csm

2 = Type Ib-norm

3 = Type Ic-norm

4 = Type II

In [15]:
import keras
v = keras.__version__
from packaging import version
if version.parse(v) < version.parse('2.5.0'):
    print("You may need to update Keras")

to_classify = np.zeros([1,9])

#Hydrogen alpha
to_classify[0][0] = HA_result

#Hydrogen beta
to_classify[0][6] = HB_result

#Hydrogen gamma
to_classify[0][4] = HG_result

#Silicon 4000
to_classify[0][5] = Si_4000_result

#Silicon 6355
to_classify[0][4] = Si_6355_result

#Sulphur
to_classify[0][2] = S_result

#Helium 6450
to_classify[0][3] = He_6450_result

#Silicon 5876
to_classify[0][7] = si_5876

#Helium 5876
to_classify[0][8] = he_5876

#Load in the trained model
model = keras.models.load_model('%s/STag Version II.h5' % path)

#Make classification prediction
class_prob = model.predict(to_classify)
preds = np.argmax(class_prob, axis=-1)
print("SN %s (with redshift %.3f) predicted class is %d with a %.3f probability " %  (name,z,preds,class_prob[0][preds]))

SN DES15C2aty (with redshift 0.149) predicted class is 0 with a 0.999 probability 


# Closing Remarks

One can use STag by following the steps outlined in this notebook, and with slight modifications one can adapt this code to run on multiple spectra rather than one at a time. 

Note that the classifying model used has only been trained on the 10 tags shown in this notebook, if one wishes to add additional tags then the model will need to be trained again. A more detailed description of how the tags have been made and how the model was built can be found in our paper: https://arxiv.org/abs/2108.10497

Version: This is STag 2.0, detailed change logs can be found in the readme file.