# JWST Data Analysis
# Young Stellar Objects in the Large Magellanic Cloud: Part 1

##### For the first part, I read in an ALMA 13CO data cube and make figures.

##### For the second part, I use Spitzer/IRS spectra of a known early-stage YSO.

##### The purpose of this notebook is to use photutils to automatically detect point sources and extract photometry.

##### Another purpose is to use specutils to find the important lines and decide whether a source is a YSO based on which lines are present or absent.

##### Adding %matplotlib notebook to the beginning of the notebook did not fix the issue of the GUI not being interactive.

### Import things you need

In [None]:
from astropy import units as u
from astropy.wcs import WCS
from astropy import constants as const
from astropy.io import ascii, fits
from astropy.nddata import StdDevUncertainty
from astropy.modeling import models
from astropy.table import Table, Column, vstack
from astropy.stats import sigma_clipped_stats

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from scipy.optimize import curve_fit
from spectral_cube import SpectralCube
from photutils.detection import DAOStarFinder
from photutils.aperture import CircularAperture

from specutils import Spectrum1D, SpectralRegion
from specutils.analysis import snr, line_flux, centroid, equivalent_width
from specutils.fitting import fit_generic_continuum, fit_continuum, find_lines_threshold,  find_lines_derivative, estimate_line_parameters, fit_lines
from specutils.manipulation import noise_region_uncertainty, box_smooth, extract_region, gaussian_smooth, SplineInterpolatedResampler
from specutils.spectra import SpectralRegion

from astrodendro import Dendrogram, ppv_catalog

### Check versions of imported things

In [None]:
import aplpy
import astrodendro
import astropy
import jwst
import matplotlib
import photutils
import scipy
import specutils
import spectral_cube

print("AplPY: {}".format(aplpy.__version__))
print("Astrodendro: {}".format(astrodendro.__version__))
print("Astropy: {}".format(astropy.__version__))
print("JWST: {}".format(jwst.__version__))
print("Matplotlib: {}".format(matplotlib.__version__))
print("Numpy: {}".format(np.__version__))
print("Pandas: {}".format(pd.__version__))
print("Photutils: {}".format(photutils.__version__))
print("Scipy: {}".format(scipy.__version__))
print("Specutils: {}".format(specutils.__version__))
print("SpectralCube: {}".format(spectral_cube.__version__))

### Set plot parameters

In [None]:
params={'legend.fontsize':'18',
        'axes.labelsize':'18',
        'axes.titlesize':'18',
        'xtick.labelsize':'18',
        'ytick.labelsize':'18',
        'lines.linewidth':2,
        'axes.linewidth':2,
        'animation.html': 'html5'}
plt.rcParams.update(params)
plt.rcParams.update({'figure.max_open_warning': 0})

### Set path to data

##### I am using ALMA 13CO from a star formation region in the LMC.

In [None]:
#data_cube_file='/Users/inayak/Desktop/Sprint/LMC_13CO.fits'
data_cube_file='https://data.science.stsci.edu/redirect/JWST/jwst-data_analysis_tools/MIRI_IFU_YSOs_in_the_LMC/LMC_13CO.fits'

### Setup output directory for figures and spectra

In [None]:
output_images='/Users/inayak/Desktop/Sprint/Images'
output_spectra='/Users/inayak/Desktop/Sprint/Spectra'

### Load and display data cube

In [None]:
cube = SpectralCube.read(data_cube_file, hdu=1)  
print(cube)

### Trim the cube

##### I am not sure about JWST MRS cubes, but in ALMA cubes the first and last channels have poor data quality, so you can trim the cube to make it smaller if you want. I used the whole cube in this example.
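
##### A minimal sketch of what trimming does, using a toy numpy array rather than a real `SpectralCube`: channels outside a chosen window are dropped along the spectral axis. The velocity values, their units, and the array sizes are assumptions for illustration only.

```python
import numpy as np

# Hypothetical channel velocities (km/s) for a toy cube with 10 channels
velocity = np.linspace(230.0, 275.0, 10)

# Toy cube with shape (channel, y, x)
cube = np.arange(10 * 2 * 2, dtype=float).reshape(10, 2, 2)

# Keep only the channels inside the chosen velocity window
keep = (velocity >= 240.0) & (velocity <= 265.0)
subcube = cube[keep]

print(cube.shape, "->", subcube.shape)
```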

In [None]:
#subcube = cube.spectral_slab(240 * (u.m/u.s), 265 * (u.m/u.s)) 
#cube=subcube
cmin = cube.minimal_subcube()
cube.allow_huge_operations=True

### Make a 2D image from the 3D cube either with average, median, or sum

##### This is to look for point sources to do aperture photometry with in the next step.

##### I chose the sum so that low-level ALMA 13CO emission, which would otherwise be missed, shows up in the image.
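
##### A toy illustration of the three collapse options, using a plain numpy array with made-up values instead of the ALMA cube. Emission that appears in only one channel survives the sum, is diluted by the mean, and disappears in the median.

```python
import numpy as np

# Toy cube with shape (channel, y, x); one pixel has emission in a single channel
cube = np.zeros((20, 4, 4))
cube[7, 2, 2] = 5.0

img_sum = cube.sum(axis=0)            # total emission along the spectral axis
img_mean = cube.mean(axis=0)          # diluted by the number of channels
img_median = np.median(cube, axis=0)  # ignores emission present in few channels

print(img_sum[2, 2], img_mean[2, 2], img_median[2, 2])  # 5.0 0.25 0.0
```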

In [None]:
#cont_img = cmin.median(axis=0)
#cont_img = cmin.mean(axis=0)
cont_img = cmin.sum(axis=0)

### Plot the image to do a visual check

##### Check that emission is visible, to make sure you have the correct data cube.

In [None]:
fig = plt.figure()
plt.imshow(cont_img.value)
plt.tight_layout()
plt.show()

### Plot the summed image with WCS coordinates and save the figure

In [None]:
name='13CO'
F = aplpy.FITSFigure(cont_img.hdu, north=True)
F.show_colorscale()
F.add_label(0.1, 0.9, name, relative=True, size=22, weight='bold')
F.axis_labels.set_font(size=22)
F.tick_labels.set_font(size=18, stretch='condensed')
F.save(output_images+"_"+name+".pdf", dpi=300) 

### Identify all point sources using photutils

In [None]:
#Empty array to store values
name_val = []
source_val = []
ra_val =[]
dec_val =[]

#Find mean, median, and standard deviation of the summed image
mean, median, std = sigma_clipped_stats(cont_img.value, sigma=2.0)

### Get a list of all point sources
##### Note that a threshold of 3*std is usual for finding sources above the noise level, but that gives 247 point sources here. So I used 6*std, which gives 4 point sources, to make sure this step works.
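
##### A rough sketch of why the threshold matters, on a synthetic noise image with four injected bright pixels. The `clipped_stats` helper is a simple numpy stand-in written for this example, not astropy's `sigma_clipped_stats`, and counting pixels above a threshold is only a proxy for what `DAOStarFinder` does.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.normal(0.0, 1.0, size=(64, 64))   # hypothetical noise-only image
for y, x in [(10, 10), (20, 40), (45, 15), (50, 50)]:
    image[y, x] += 20.0                       # four hypothetical bright pixels

def clipped_stats(data, sigma=2.0, maxiters=5):
    """Simple iterative sigma clipping (illustrative helper)."""
    good = data.ravel()
    for _ in range(maxiters):
        med, std = np.median(good), good.std()
        good = good[np.abs(good - med) < sigma * std]
    return good.mean(), np.median(good), good.std()

mean, median, std = clipped_stats(image)

# Count pixels above each threshold after subtracting the background level
n_above_3 = np.count_nonzero(image - median > 3 * std)
n_above_6 = np.count_nonzero(image - median > 6 * std)
print("pixels above 3*std:", n_above_3)
print("pixels above 6*std:", n_above_6)
```

##### The higher threshold keeps the injected pixels while rejecting most noise peaks, which is the trade-off behind choosing 6*std above.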

In [None]:
daofind = DAOStarFinder(fwhm=2.0, threshold=6*std)
sources = daofind(cont_img.value - median) 
print("\n  Number of sources in field: ", len(sources))

### Extract and plot spectrum of all sources
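
##### A toy numpy version of the aperture and annulus masks built in the cell below. The background subtraction at the end is an assumption about how the annulus could be used; the notebook itself builds the annulus cube but does not subtract it. All values are made up.

```python
import numpy as np

# Toy cube: 5 channels of an 11x11 image with a flat background of 1.0
cube = np.ones((5, 11, 11))
cube[:, 5, 5] += 10.0                      # hypothetical point source at x=5, y=5

ycent, xcent, r_ap = 5.0, 5.0, 2.0         # centroid and aperture radius in pixels
yy, xx = np.indices(cube.shape[1:], dtype=float)
radius = np.hypot(yy - ycent, xx - xcent)  # distance of each pixel from the centroid

ap_mask = radius <= r_ap                                # circular aperture
an_mask = (radius > r_ap + 1) & (radius <= r_ap + 2)    # background annulus

n_ap = np.count_nonzero(ap_mask)                        # pixels in the aperture
source_spec = cube[:, ap_mask].sum(axis=1)              # summed spectrum in the aperture
bkg_per_pix = np.median(cube[:, an_mask], axis=1)       # background level per pixel
net_spec = source_spec - bkg_per_pix * n_ap             # background-subtracted spectrum

print(n_ap, net_spec)
```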

In [None]:
#Make table be consistent with RA and DEC coordinates
if len(sources) > 0:
    print()            
    for col in sources.colnames:    
        sources[col].info.format = '%.8g'

    print(sources)  
    
    #Convert xcentroid and ycentroid to RA and DEC coordinates
    positions = Table([sources['xcentroid'], sources['ycentroid']])                
    w = WCS(cont_img.header)                                                       
    radec_lst  = w.pixel_to_world(sources['xcentroid'], sources['ycentroid'])

    #Aperture extract spectrum of point source using a circular aperture
    for countS, _ in enumerate(sources):
        print(radec_lst[countS].to_string('hmsdms'))                  #Print the RA and Dec in hms dms values        
        name_val.append(name)
        source_val.append(countS)
        ra_val.append(radec_lst[countS].ra.deg)
        dec_val.append(radec_lst[countS].dec.deg)
    
        #Size of frame 
        ysize_pix = cmin.shape[1]
        xsize_pix = cmin.shape[2]

        #Set up some centroid pixel for the source 
        ycent_pix = sources['ycentroid'][countS]
        xcent_pix = sources['xcentroid'][countS]

        #Make an aperture radius for the source. The user can set this based on their own science expertise.
        apertureRad_pix = 2

        #Make a masked array for the aperture
        yy, xx = np.indices([ysize_pix,xsize_pix], dtype='float')                    #Check ycentpix, xcentpix are in correct order 
        radius = ((yy-ycent_pix)**2 + (xx-xcent_pix)**2)**0.5                        #Make a circle in the frame

        mask = radius <= apertureRad_pix                                             #Select pixels within the aperture radius
        maskedcube = cmin.with_mask(mask)                                            #Make a masked cube
        pixInAp = np.count_nonzero(mask == 1)                                        #Pixels in aperture

        spectrum = maskedcube.sum(axis=(1,2))                                        #Extract the spectrum from only the circular aperture - use sum
        noisespectrum = maskedcube.std(axis=(1, 2))                                  #Extract the noise spectrum for the source 

        #Measure a spectrum from the background. Use an annulus around the source.
        an_mask = (radius > apertureRad_pix + 1) & (radius <= apertureRad_pix + 2)   #Select pixels within an annulus
        an_maskedcube = cmin.with_mask(an_mask)                                      #Make a masked cube
    
        #Plot the spectrum extracted from circular aperture via: a sum extraction 
        
        fig = plt.figure(figsize=(10,5))
        plt.plot(maskedcube.spectral_axis.value,spectrum.value, label='Source')       #Source spectrum 

        plt.xlabel('Frequency [Hz]')
        plt.ylabel('Kelvin') 

        plt.gcf().text(0.5, 0.85,name, fontsize=14, ha='center')
        plt.gcf().text(0.5, 0.80,radec_lst[countS].to_string('decimal'), ha='center', fontsize=14)

        plt.legend(frameon=False, fontsize='medium')
        plt.tight_layout()
        plt.show()
        plt.close()
    
        #Do a visual check to see if all point sources have been identified. 
        #Although the end result should be that this is not necessary. 
        #Visually, the eye can miss sources. 
        #Also as a science user, I would like the algorithm to tell me what is and isn't a point source mathematically.
        #There should be some criteria that distinguish point sources from extended sources so that the user doesn't have to check visually.
        #Of the four sources, two are noise spectra because it detected noise spikes on the border of the ALMA cube.
        positions_pix = (sources['xcentroid'], sources['ycentroid'])
    
        apertures = CircularAperture(positions_pix, r=2.)
        fig = plt.figure()            

        plt.subplot(1, 2, 1)
        plt.imshow(cont_img.value, cmap='Greys', origin='lower')
        apertures.plot(color='blue', lw=1.5, alpha=0.5)

        plt.subplot(1, 2, 2)
        plt.imshow(cont_img.value, origin='lower')

        plt.tight_layout()
        plt.show()
        plt.close()
    

### Make a table for the extracted source

In [None]:
sourceExtSpecTab = Table([name_val, source_val, ra_val, dec_val], 
                   names=("name", "source_no", "ra", "dec"))
print(sourceExtSpecTab)   

#Write table of extracted spectra for bookkeeping
#ascii.write(sourceExtSpecTab, output_spectra+"/YSOsourcesSpec_list.csv", format='csv', overwrite=True) 

### Use Spitzer IRS YSO Spectra from Here on Out For Science Test Cases

##### (1)Look for lines in Spectra.
###### Ice features in the 5-7 micron range from H2O, NH3, CH3OH, HCOOH, and H2CO are difficult to identify because of confusion with PAH emission. The 15.2 micron CO2 ice absorption is easier to identify.
###### More evolved YSOs will have PAH and fine-structure features at 6.2 micron, 7.7 micron, 8.6 micron, 11.3 micron, and 12.7 micron. But PAH and fine-structure features could also mean a more evolved HII region rather than an embedded YSO.
###### H2 emission is expected from YSOs. Both PDRs and shocks lead to H2 emission near YSO environments. 

##### (2)Look at ice features, PAH feature, and silicate features in more detail.

##### (3)Identify YSOs.

In [None]:
#Set Path to YSO1 Data
#YSO1='/Users/inayak/Desktop/Sprint/YSO1.txt'
YSO1='https://data.science.stsci.edu/redirect/JWST/jwst-data_analysis_tools/MIRI_IFU_YSOs_in_the_LMC/YSO1.txt'

#Set Path to YSO2 Data
YSO2='https://data.science.stsci.edu/redirect/JWST/jwst-data_analysis_tools/MIRI_IFU_YSOs_in_the_LMC/YSO2.txt'

In [None]:
# Read in the spectrum and plot it for an initial visual check
data = ascii.read(YSO1)

if data.colnames[0] == 'col1':
    data['col1'].name = 'wave_mum'
    data['col2'].name = 'cSpec_Jy'            
    data['col3'].name = 'errFl_Jy'         

wav = data['wave_mum'] * u.micron                                                  # Wavelength: microns
fl  = data['cSpec_Jy'] * u.Jy                                                      # Fnu:  Jy
efl  = data['errFl_Jy'] * u.Jy                                                     # Error flux: Jy

spec = Spectrum1D(spectral_axis=wav, flux=fl, uncertainty=StdDevUncertainty(efl))  # Make a 1D spectrum object
fig = plt.figure(figsize=(8,4))
plt.plot(spec.spectral_axis, spec.flux, label='spectrum')                
plt.xlabel('Wavelength (microns)')
plt.ylabel("Flux ({:latex})".format(spec.flux.unit))

plt.legend(frameon=False, fontsize='medium')
plt.tight_layout()
plt.show()
plt.close()

### Fit a generic continuum

##### The continuum seems to be overestimated at 5-7 micron and again at 17-25 micron when fitting a generic continuum without excluding any wavelengths. This leads to misidentification of emission and absorption lines.

##### Instead, I fit one continuum to the 5-13 micron range and another to the 13-35 micron range. This gives a much better continuum-subtracted spectrum.

##### It would be better if I can fit one continuum or use the spline fitting function.
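
##### A minimal sketch of fitting a continuum while excluding wavelength windows, using a plain numpy polynomial on synthetic data. This is not the specutils `fit_generic_continuum` call used in the cell below; the `fit_continuum_poly` helper, the linear continuum, and the absorption dip are all assumptions made for illustration.

```python
import numpy as np

# Synthetic spectrum: linear continuum plus one hypothetical absorption dip
wave = np.linspace(5.0, 35.0, 301)               # micron
continuum_true = 0.5 + 0.02 * wave
flux = continuum_true.copy()
dip = (wave > 14.5) & (wave < 15.5)
flux[dip] -= 0.2

def fit_continuum_poly(wave, flux, exclude, degree=1):
    """Fit a polynomial to the points outside the (lo, hi) windows in `exclude`."""
    keep = np.ones(wave.shape, dtype=bool)
    for lo, hi in exclude:
        keep &= ~((wave >= lo) & (wave <= hi))
    coeffs = np.polyfit(wave[keep], flux[keep], degree)
    return np.polyval(coeffs, wave)

# Long-wavelength continuum: ignore everything below 13 micron and the dip itself
cont_long = fit_continuum_poly(wave, flux, exclude=[(5.0, 13.0), (14.5, 15.5)])

# Continuum-subtracted spectrum for the long-wavelength half only
long_side = wave >= 13.0
contsub_long = flux[long_side] - cont_long[long_side]
print(contsub_long.min())
```

##### Because the dip is excluded from the fit, it does not drag the continuum down, and the subtracted spectrum recovers the full depth of the feature.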

In [None]:
#Calculate S/N
sig2noise = np.round(snr(spec), 2)

#Fit one continuum for wavelengths greater than 13 microns excluding a couple lines
to_exclude_1 = [(5.0, 13.0)*u.micron, (14.5, 15.5)*u.micron, (17.0, 18.0)*u.micron]                 #Define lines/regions to exclude
exclude_region_1 = SpectralRegion(to_exclude_1)                                                       # Make a specutils region

#Another continuum fit for wavelengths under 13 microns excluding the silicate absorption feature
to_exclude_2 = [(7.0, 11.0)*u.micron, (13.0, 35.0)*u.micron]
exclude_region_2 = SpectralRegion(to_exclude_2)

continuum_model1 = fit_generic_continuum(spec, exclude_regions=exclude_region_1)                     # Generate the first continuum
continuum_model2 = fit_generic_continuum(spec, exclude_regions=exclude_region_2)                     # Generate the second continuum
y_continuum_1 = continuum_model1(spec.spectral_axis)                                                 # Evaluate the first continuum on the spectral axis
y_continuum_2 = continuum_model2(spec.spectral_axis)                                                 # Evaluate the second continuum on the spectral axis

#Generate a continuum subtracted and continuum normalised spectra. Both needed for later analysis. 
spec_norm2 =  spec / y_continuum_2
count_axis=0
for i in range(0,len(spec.spectral_axis)):
    if spec.spectral_axis[i].value < 13.0:
        count_axis=count_axis+1
spec_contsub1 = spec[count_axis:len(spec.spectral_axis)] - y_continuum_1[count_axis:len(spec.spectral_axis)]
spec_contsub2 = spec[0:count_axis] - y_continuum_2[0:count_axis]


In [None]:
# Plot the continuum fits and the spectrum
fig = plt.figure(figsize=(10,6))
plt.plot(spec.spectral_axis, spec.flux, label='Source')                                      # Source spectrum 
plt.plot(spec.spectral_axis, y_continuum_1, label='Continuum Fit to lambda > 13 microns')    # Continuum (lambda > 13 microns)
plt.plot(spec.spectral_axis, y_continuum_2, label='Continuum Fit to lambda < 13 microns')    # Continuum (lambda < 13 microns)

plt.xlabel("Wavelength ({:latex})".format(spec.spectral_axis.unit))
plt.ylabel("Flux ({:latex})".format(spec.flux.unit))

plt.legend(frameon=False, fontsize='medium')
plt.tight_layout()
plt.show()
plt.close()

In [None]:
# Plot the continuum-subtracted spectrum

fig = plt.figure(figsize=(10,6))
plt.plot(spec_contsub1.spectral_axis, spec_contsub1.flux,color='black')
plt.plot(spec_contsub2.spectral_axis, spec_contsub2.flux,color='black')
plt.axhline(y=0.0, color='r', linestyle='-')

plt.xlabel("Wavelength ({:latex})".format(spec.spectral_axis.unit))
plt.ylabel("Flux ({:latex})".format(spec.flux.unit))

plt.tight_layout()
plt.show()
plt.close()

### Look For Emission and Absorption Lines

##### Common YSO lines include PAH emission lines, ice absorption lines, and silicate absorption lines.
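
##### A rough numpy illustration of threshold-based line finding on a continuum-subtracted spectrum: points well above the noise are flagged as emission and points well below as absorption. This is only a conceptual stand-in, not the specutils `find_lines_threshold` implementation; the spikes and the uniform uncertainty are made up.

```python
import numpy as np

wave = np.linspace(13.0, 20.0, 71)   # micron
flux = np.zeros_like(wave)           # continuum-subtracted, so the baseline is zero
flux[20] = 1.0                       # hypothetical emission spike
flux[50] = -0.8                      # hypothetical absorption spike
sigma = 0.1                          # assumed uniform 1-sigma uncertainty
noise_factor = 5

# Flag points more than noise_factor * sigma away from zero
emission_wave = wave[flux > noise_factor * sigma]
absorption_wave = wave[flux < -noise_factor * sigma]

print("emission at:", emission_wave)
print("absorption at:", absorption_wave)
```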

In [None]:
# Find emission and absorption lines in the continuum-subtracted spectra.

if sig2noise > 5:
    lines = find_lines_threshold(spec_contsub1, noise_factor=5) # Selects lines 5 x spectrum uncertainty for wavelengths greater than 13 microns
    if len(lines) >0:
        emissionlines = lines[lines['line_type'] == 'emission']   # Grab a list of the emission lines
        abslines = lines[lines['line_type'] == 'absorption']      # Grab a list of the absorption lines
        print("Number of emission lines found:", len(emissionlines))
        print("Number of absorption lines found:", len(abslines))
    else:
        emissionlines = [0]
        abslines = [0]
        print("No lines found at wavelengths greater than 13 microns!")
        
    lines = find_lines_threshold(spec_contsub2, noise_factor=5) # Selects lines 5 x spectrum uncertainty for wavelengths less than 13 microns
    if len(lines) >0:
        emissionlines2 = lines[lines['line_type'] == 'emission']   # Grab a list of the emission lines
        abslines2 = lines[lines['line_type'] == 'absorption']      # Grab a list of the absorption lines
        print("Number of emission lines found:", len(emissionlines2))
        print("Number of absorption lines found:", len(abslines2))
    else:
        emissionlines2 = [0]
        abslines2 = [0]
        print("No lines found at wavelengths less than 13 microns!")
    
# This will print out two emission line lists and two absorption line lists because I broke the spectrum up into two parts for the two fitted continua.

In [None]:
#Extract emission lines greater than 13 microns

if len(emissionlines) >= 1 and emissionlines[0] != 0:
    emissionlines['gauss_line_center'] = 0. *u.micron
    emissionlines['gauss_line_amp'] = 0. * u.jansky
    emissionlines['gauss_line_stddev'] = 0. *u.micron
    emissionlines['gauss_line_FWHM'] = 0. *u.micron
    emissionlines['gauss_line_area'] = np.log10(1e-20)  * u.W /u.m**2

    emissionlines['no_AltRestWav']  = 0             # Number of possible lines which could be the feature
    emissionlines['RestWav']     = 0.               # RestWavelength of closest lab measured line
    emissionlines['diff_ft_wav'] = 0.               # Diff. in initial line wav estimate and the fitted Gaussian
    emissionlines['line_suspect'] = 0               # Set == 1 if large difference in wavelength position
    emissionlines['line'] = "                 "     # For storing the line name
    

    # Loop through all the found emission lines in the spectra
    for idx, emlines in enumerate(emissionlines):

        #Look at the region surrounding the found lines from the original smoothed spectrum
        sw_line = emissionlines["line_center"][idx].value-0.001
        lw_line = emissionlines["line_center"][idx].value+0.001     
        line_region =  SpectralRegion(sw_line*u.um, lw_line*u.um)

        #Find the S/N ratio of the line region
        #line_snr = np.round(snr(spec, line_region), 2)
        line_cnr = emissionlines["line_center"][idx]
        #print("The line center is: ", line_cnr)
        #print("The S/N of the line region is:  ", line_snr)

        #Extract the line from the original spectrum and refit a continuum (excluding the line itself)
        #line_spec =  extract_region(spec, line_region)   # line spectrum
print(emissionlines)

#Extract emission lines less than 13 microns

if len(emissionlines2) >= 1 and emissionlines2[0] != 0:
    emissionlines2['gauss_line_center'] = 0. *u.micron
    emissionlines2['gauss_line_amp'] = 0. * u.jansky
    emissionlines2['gauss_line_stddev'] = 0. *u.micron
    emissionlines2['gauss_line_FWHM'] = 0. *u.micron
    emissionlines2['gauss_line_area'] = np.log10(1e-20)  * u.W /u.m**2

    emissionlines2['no_AltRestWav']  = 0             # Number of possible lines which could be the feature
    emissionlines2['RestWav']     = 0.               # RestWavelength of closest lab measured line
    emissionlines2['diff_ft_wav'] = 0.               # Diff. in initial line wav estimate and the fitted Gaussian
    emissionlines2['line_suspect'] = 0               # Set == 1 if large difference in wavelength position
    emissionlines2['line'] = "                 "     # For storing the line name
    

    #Loop through all the found emission lines in the spectra

    for idx, emlines2 in enumerate(emissionlines2):

        #Look at the region surrounding the found lines from the original smoothed spectrum
        sw_line = emissionlines2["line_center"][idx].value-0.01
        lw_line = emissionlines2["line_center"][idx].value+0.01     
        line_region =  SpectralRegion(sw_line*u.um, lw_line*u.um)

        #Find the S/N ratio of the line region
        #line_snr = np.round(snr(spec, line_region), 2)
        line_cnr = emissionlines2["line_center"][idx]
        #print("The line center is: ", line_cnr)
        #print("The S/N of the line region is:  ", line_snr)

        # Extract the line from the original spectrum and refit a continuum (excluding the line itself)
        #line_spec2 =  extract_region(spec, line_region)   # line spectrum        
         
print(emissionlines2)

In [None]:
#Extract absorption lines greater than 13 microns

if len(abslines) >= 1 and abslines[0] != 0:
    abslines['gauss_line_center'] = 0. *u.micron
    abslines['gauss_line_amp'] = 0. * u.jansky
    abslines['gauss_line_stddev'] = 0. *u.micron
    abslines['gauss_line_FWHM'] = 0. *u.micron
    abslines['gauss_line_area'] = np.log10(1e-20)  * u.W /u.m**2

    abslines['no_AltRestWav']  = 0             # Number of possible lines which could be the feature
    abslines['RestWav']     = 0.               # RestWavelength of closest lab measured line
    abslines['diff_ft_wav'] = 0.               # Diff. in initial line wav estimate and the fitted Gaussian
    abslines['line_suspect'] = 0               # Set == 1 if large difference in wavelength position
    abslines['line'] = "                 "     # For storing the line name
    

    #Loop through all the found absorption lines in the spectra

    for idx, absorplines in enumerate(abslines):

        #Look at the region surrounding the found lines from the original smoothed spectrum
        sw_line = abslines["line_center"][idx].value-0.01
        lw_line = abslines["line_center"][idx].value+0.01     
        line_region =  SpectralRegion(sw_line*u.um, lw_line*u.um)

        #Find the S/N ratio of the line region
        #line_snr_abs = np.round(snr(spec, line_region), 2)
        line_cnr_abs = abslines["line_center"][idx]
        #print("The line center is: ", line_cnr_abs)
        #print("The S/N of the line region is:  ", line_snr_abs)

        #Extract the line from the original spectrum and refit a continuum (excluding the line itself)
        #line_spec_abs =  extract_region(spec, line_region)   # line spectrum
print(abslines)

#Extract absorption lines less than 13 microns

if len(abslines2) >= 1 and abslines2[0] != 0:
    abslines2['gauss_line_center'] = 0. *u.micron
    abslines2['gauss_line_amp'] = 0. * u.jansky
    abslines2['gauss_line_stddev'] = 0. *u.micron
    abslines2['gauss_line_FWHM'] = 0. *u.micron
    abslines2['gauss_line_area'] = np.log10(1e-20)  * u.W /u.m**2

    abslines2['no_AltRestWav']  = 0             # Number of possible lines which could be the feature
    abslines2['RestWav']     = 0.               # RestWavelength of closest lab measured line
    abslines2['diff_ft_wav'] = 0.               # Diff. in initial line wav estimate and the fitted Gaussian
    abslines2['line_suspect'] = 0               # Set == 1 if large difference in wavelength position
    abslines2['line'] = "                 "     # For storing the line name
    

    #Loop through all the found absorption lines in the spectra

    for idx, absorplines2 in enumerate(abslines2):

        #Look at the region surrounding the found lines from the original smoothed spectrum
        sw_line = abslines2["line_center"][idx].value-0.01
        lw_line = abslines2["line_center"][idx].value+0.01     
        line_region =  SpectralRegion(sw_line*u.um, lw_line*u.um)

        #Find the S/N ratio of the line region
        #line_snr_abs = np.round(snr(spec, line_region), 2)
        line_cnr_abs = abslines2["line_center"][idx]
        #print("The line center is: ", line_cnr_abs)
        #print("The S/N of the line region is:  ", line_snr_abs)

        #Extract the line from the original spectrum and refit a continuum (excluding the line itself)
        #line_spec_abs2 =  extract_region(spec, line_region)   # line spectrum

print(abslines2)

### Look for emission and absorption lines

##### PAH emission feature: 6.2, 7.7, 8.6, 11.3, 12.0, 12.7, 14.2, 16.2
##### Silicate absorption feature: 10.0, 18.0, 23.0
##### CO2 ice absorption feature: 15.3
##### Other ice features: CO (4.67 micron), H2O + HCOOH (6 micron),  CH3OH (6.89 micron), CH4 (7.7 micron)
##### If none of these 8 emission lines and 8 absorption lines exist, then this is not a YSO.
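
##### The matching step in the cells below can also be sketched in vectorized form. The `match_lines` helper and the detected line centers are hypothetical, written only for this example; the tolerance plays the same role as the +/-0.05 and +/-0.2 micron windows used in the loops.

```python
import numpy as np

def match_lines(detected, reference, tol):
    """Return the reference wavelengths that have a detected line within tol."""
    detected = np.asarray(detected, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if detected.size == 0:
        return np.array([])
    # Rows are detected lines, columns are reference lines
    close = np.abs(detected[:, None] - reference[None, :]) < tol
    return reference[close.any(axis=0)]

pah_reference = [6.2, 7.7, 8.6, 11.3, 12.0, 12.7, 14.2, 16.2]   # micron
detected_centers = [6.22, 6.24, 11.28, 15.25]                   # made-up detections

pah_found = match_lines(detected_centers, pah_reference, tol=0.05)
print(pah_found)   # [ 6.2 11.3]
```

##### Returning reference wavelengths rather than detected centers means that two detections of the same feature collapse into one entry.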

In [None]:
#First Look for PAH emission features

PAH_emission = [6.2, 7.7, 8.6, 11.3, 12.0, 12.7, 14.2, 16.2] * u.micron            #list of known YSO PAH emission lines
len_PAH_list = len(PAH_emission)

PAH_emission_detected=[]                                                           #empty array to store detected lines
line_cnr_list = emissionlines["line_center"]                                       #list of emission lines extracted from first spectrum
len_line_list = len(line_cnr_list)
line_cnr_list2 = emissionlines2["line_center"]                                     #list of emission lines extracted from second spectrum
len_line_list2 = len(line_cnr_list2)

count_pah1=0                                                                       #counting how many PAH emission lines exist in first spectrum
for i in range(0,len_PAH_list):
    for j in range (0,len_line_list):
        if line_cnr_list[j].value-0.05 < PAH_emission.value[i] < line_cnr_list[j].value+0.05 :
            count_pah1=count_pah1+1
count_pah2=0                                                                       #counting how many PAH emission lines exist in second spectrum
for i in range(0,len_PAH_list):
    for j in range (0,len_line_list2):
        if line_cnr_list2[j].value-0.05 < PAH_emission.value[i] < line_cnr_list2[j].value+0.05 :
            count_pah2=count_pah2+1

PAH_emission_detected=[0.0]*(count_pah1+count_pah2)                                 #empty array to store detected lines
count_pah=0 #restart count
for i in range(0,len_PAH_list):
    for j in range (0,len_line_list):
        if line_cnr_list[j].value-0.05 < PAH_emission.value[i] < line_cnr_list[j].value+0.05 :
            PAH_emission_detected[count_pah]= line_cnr_list[j]
            count_pah=count_pah+1
    for j in range (0,len_line_list2):
        if line_cnr_list2[j].value-0.05 < PAH_emission.value[i] < line_cnr_list2[j].value+0.05 :
            PAH_emission_detected[count_pah]= line_cnr_list2[j]
            count_pah=count_pah+1
            
print(PAH_emission_detected)
print(count_pah,"PAH emission lines detected in spectrum")
if count_pah==0:
    PAH_emission_detected=[0.0]*u.micron

In [None]:
#Look for Silicate absorption features

sil_absorption = [10.0, 18.0, 23.0] * u.micron                                                   #list of known YSO silicate absorption lines
len_sil_list = len(sil_absorption)

sil_absorption_detected=[]                                                                       #empty array to store detected lines
line_cnr_list = abslines["line_center"]                                                          #list of absorption lines extracted from first spectrum
len_line_list = len(line_cnr_list)
line_cnr_list2 = abslines2["line_center"]                                                        #list of absorption lines extracted from second spectrum
len_line_list2 = len(line_cnr_list2)

count_sil1=0                                                                                     #counting how many silicate absorption lines exist in first spectrum
for i in range(0,len_sil_list):
    for j in range (0,len_line_list):
        if line_cnr_list[j].value-0.2 < sil_absorption.value[i] < line_cnr_list[j].value+0.2 :   #absorption features are wider than emission features therefore +/-0.2
            count_sil1=count_sil1+1
count_sil2=0                                                                                     #counting how many silicate absorption lines exist in second spectrum
for i in range(0,len_sil_list):
    for j in range (0,len_line_list2):
        if line_cnr_list2[j].value-0.2 < sil_absorption.value[i] < line_cnr_list2[j].value+0.2 : #absorption features are wider than emission features therefore +/-0.2
            count_sil2=count_sil2+1
            
sil_absorption_detected=[0.0]*(count_sil1+count_sil2)                                            #empty array to store detected lines
count_sil=0                                                                                      #restart count
for i in range(0,len_sil_list):
    for j in range (0,len_line_list):
        if line_cnr_list[j].value-0.2 < sil_absorption.value[i] < line_cnr_list[j].value+0.2 :
            sil_absorption_detected[count_sil]= sil_absorption[i]
            count_sil=count_sil+1
    for j in range (0,len_line_list2):
        if line_cnr_list2[j].value-0.2 < sil_absorption.value[i] < line_cnr_list2[j].value+0.2 :
            sil_absorption_detected[count_sil]= sil_absorption[i]
            count_sil=count_sil+1
        
if count_sil==0:
    sil_absorption_detected=[0.0]*u.micron
          
#Remove multiple detection of same line
test_list=[0.0]*count_sil

if count_sil > 0:
    for i in range(0,count_sil):
        test_list[i]=sil_absorption_detected[i].value
    drop_dups_sil  = pd.Series(test_list).drop_duplicates().tolist()
    print(len(drop_dups_sil),"silicate absorption lines detected in spectrum")
else:
    print("0 silicate absorption lines detected in spectrum")

In [None]:
#Look for ice absorption features

ice_absorption = [4.67, 6.0, 6.9, 7.7, 15.3] * u.micron                                          #list of known YSO ice absorption lines
len_ice_list = len(ice_absorption)

ice_absorption_detected=[]                                                                       #empty array to store detected lines
line_cnr_list = abslines["line_center"]                                                          #list of absorption lines extracted from first spectrum
len_line_list = len(line_cnr_list)
line_cnr_list2 = abslines2["line_center"]                                                        #list of absorption lines extracted from second spectrum
len_line_list2 = len(line_cnr_list2)

count_ice1=0                                                                                     #counting how many ice absorption lines exist in first spectrum
for i in range(0,len_ice_list):
    for j in range (0,len_line_list):
        if line_cnr_list[j].value-0.2 < ice_absorption.value[i] < line_cnr_list[j].value+0.2 :   
            count_ice1=count_ice1+1
count_ice2=0                                                                                     #counting how many ice absorption lines exist in second spectrum
for i in range(0,len_ice_list):
    for j in range (0,len_line_list2):
        if line_cnr_list2[j].value-0.2 < ice_absorption.value[i] < line_cnr_list2[j].value+0.2 : 
            count_ice2=count_ice2+1
            
ice_absorption_detected=[0.0]*(count_ice1+count_ice2)                                            #empty array to store detected lines
count_ice=0                                                                                      #restart count
for i in range(0,len_ice_list):
    for j in range (0,len_line_list):
        if line_cnr_list[j].value-0.2 < ice_absorption.value[i] < line_cnr_list[j].value+0.2 :
            ice_absorption_detected[count_ice]= ice_absorption[i]
            count_ice=count_ice+1
    for j in range (0,len_line_list2):
        if line_cnr_list2[j].value-0.2 < ice_absorption.value[i] < line_cnr_list2[j].value+0.2 :
            ice_absorption_detected[count_ice]= ice_absorption[i]
            count_ice=count_ice+1

if count_ice==0:
    ice_absorption_detected=[0.0]*u.micron
          
#Remove multiple detection of same line
test_list=[0.0]*count_ice

if count_ice > 0:
    for i in range(0,count_ice):
        test_list[i]=ice_absorption_detected[i].value
    drop_dups_ice  = pd.Series(test_list).drop_duplicates().tolist()
    print(len(drop_dups_ice),"ice absorption lines detected in spectrum")
else:
    print("0 ice absorption lines detected in spectrum")

In [None]:
#If no PAH, silicate, or ice lines are found, print that this is not a YSO
if PAH_emission_detected[0].value==0 and sil_absorption_detected[0].value==0 and ice_absorption_detected[0].value==0:
    print("This is not a YSO.")
#Else find out if YSO 1 (youngest, most embedded YSO), YSO 2, or YSO 3.
else:
    print("This is a YSO.")
    #drop_dups_ice only exists when count_ice > 0, so test count_ice directly
    if count_ice > 0:
        for i in range(0,count_ice):
            if 15.0 < ice_absorption_detected[i].value < 15.4:
                print("This is a Class 1 YSO.")
                break                                    #report the class once even if the line was matched twice
    #a zero first entry is the placeholder for "no silicate lines detected"
    if (count_ice == 0) and (sil_absorption_detected[0].value != 0):
        print("This is a Class 2 YSO.")


##### Still left to do:
##### Need to search for common atomic emission lines because that is the feature for YSO 3.
##### If the source is a YSO but cannot be categorized, print a message telling the user to take a closer look to determine the YSO stage.
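
##### One way to add the fallback message is to collect the decision into a single function. This is a minimal sketch: `classify_yso` and its arguments are hypothetical names, and the rules only mirror the checks already used in this notebook (an ice feature between 15.0 and 15.4 micron for Class 1, silicate absorption without ice for Class 2). The atomic-line test for YSO 3 is left out because it is not implemented yet.

```python
def classify_yso(n_pah, n_sil, ice_lines):
    """Return a label from the number of PAH and silicate features and a
    list of detected ice wavelengths in microns (hypothetical helper)."""
    if n_pah == 0 and n_sil == 0 and len(ice_lines) == 0:
        return "This is not a YSO."
    if any(15.0 < w < 15.4 for w in ice_lines):
        return "This is a Class 1 YSO."
    if len(ice_lines) == 0 and n_sil > 0:
        return "This is a Class 2 YSO."
    # A YSO that matches none of the rules above
    return "This is a YSO, but a closer look is needed to determine its stage."

print(classify_yso(n_pah=2, n_sil=1, ice_lines=[15.2]))
print(classify_yso(n_pah=3, n_sil=0, ice_lines=[]))
```

##### Returning a single label also avoids printing the same class more than once when a line is matched in both spectra.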

### Plot the spectra and label the emission and absorption features

##### Still need to remove multiple detections of the same line before plotting.

In [None]:
fig = plt.figure(figsize=(10,6))
plt.plot(spec_contsub1.spectral_axis, spec_contsub1.flux,color='black')
plt.plot(spec_contsub2.spectral_axis, spec_contsub2.flux,color='black')

plt.xlabel("Wavelength ({:latex})".format(spec.spectral_axis.unit))
plt.ylabel("Flux ({:latex})".format(spec.flux.unit))
plt.axhline(y=0.0, color='r', linestyle='-')

max_emission=np.max(spec_contsub2.flux.value)

if len(PAH_emission_detected) >0 :
    for i in range(0,len(PAH_emission_detected)):
        x=[PAH_emission_detected[i].value, PAH_emission_detected[i].value]
        y=[max_emission*3, max_emission*4]
        plt.plot(x,y,'r')
    plt.plot(x,y,'r',label='PAH emission')    

if len(sil_absorption_detected) >0 :
    for i in range(0,len(sil_absorption_detected)):
        x=[sil_absorption_detected[i].value, sil_absorption_detected[i].value]
        y=[max_emission*3, max_emission*4]
        plt.plot(x,y,'blue')
    plt.plot(x,y,'blue',label='silicate absorption')
    
if len(ice_absorption_detected) >0 :
    for i in range(0,len(ice_absorption_detected)):
        x=[ice_absorption_detected[i].value, ice_absorption_detected[i].value]
        y=[max_emission*3, max_emission*4]
        plt.plot(x,y,'g')
    plt.plot(x,y,'g',label='ice absorption')

plt.xlim(5,35)
plt.legend(frameon=False, fontsize='medium')
plt.tight_layout()
plt.show()
plt.close()

# Young Stellar Objects in the Large Magellanic Cloud: Part 2

##### Use an ALMA data cube to show how a dendrogram can be used to study the hierarchical nature of CO gas clumps and possible extended emission that might be detected using MRS.
##### Use the list of YSOs and the list of dendrogram clumps to correlate YSO properties (mass, luminosity) with gas clump properties (mass, mass density, size).
##### Dendrograms can be used to do a virial analysis (radius versus linewidth) of gas clumps to determine if clouds have extra energetics that cannot be accounted for.
##### This can be a test case for anyone interested in using dendrograms for their own science, even if it is not related to YSOs in the LMC.

In [None]:
#Compute the dendrogram.
image = fits.getdata('/Users/inayak/Desktop/Sprint/LMC_13CO.fits')
d = Dendrogram.compute(image, min_value=12, min_delta=1., min_npix=10,verbose=True)

In [None]:
#Plot the dendrogram
v = d.viewer()
v.show()

### Make a catalog

##### This catalog already calculates sizes, total fluxes, and linewidths of the clumps.

In [None]:
metadata = {}
metadata['data_unit'] = u.K
metadata['spatial_scale'] =  0.2 * u.arcsec    #need to confirm spatial scale, it might not be 0.2 arcsec
metadata['beam_major'] =  1.6387 * u.arcsec
metadata['beam_minor'] =  1.4163 * u.arcsec
metadata['wavelength'] = 0.00136 * u.m

In [None]:
cat = ppv_catalog(d, metadata)

In [None]:
#Use the catalog to make Radius versus RMS Velocity Plot
fig = plt.figure(figsize=(10,6))
plt.plot(cat['radius'], cat['v_rms'],'*')

plt.xlabel("Radius [arcsec]")
plt.ylabel("RMS Velocity [pixel]")

plt.tight_layout()
plt.show()
plt.close()

In [None]:
#Use the catalog to make Radius versus RMS Velocity Plot of Just the Leaves (i.e. the small high density cores in which star formation takes place)

trunk=d.trunk
leaves=d.leaves
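
##### A possible way to separate leaves from branches in the catalog is sketched below. It uses stand-in NumPy arrays so that it runs on its own. With the real objects, the assumption is that the catalog's `_idx` column holds the structure index and that each structure in `d.leaves` exposes it as `.idx`. The helper name `leaf_mask` and all numbers are made up for illustration.

```python
import numpy as np

def leaf_mask(catalog_idx, leaf_idx):
    """Boolean mask, True for catalog rows whose structure index is a leaf."""
    return np.isin(np.asarray(catalog_idx), np.asarray(leaf_idx))

# Stand-in arrays; with the real objects these would be (assumption):
#   catalog_idx = cat['_idx']
#   leaf_idx    = [s.idx for s in d.leaves]
catalog_idx = np.array([0, 1, 2, 3, 4])
leaf_idx = [1, 3, 4]
radius = np.array([5.0, 1.2, 3.4, 0.8, 1.0])   # made-up values, arcsec
v_rms = np.array([2.5, 0.9, 1.7, 0.6, 0.7])    # made-up values, pixels

is_leaf = leaf_mask(catalog_idx, leaf_idx)
leaf_radius, leaf_v_rms = radius[is_leaf], v_rms[is_leaf]
branch_radius, branch_v_rms = radius[~is_leaf], v_rms[~is_leaf]
print(leaf_radius, leaf_v_rms)
```

##### The two subsets can then be passed to `plt.plot` with different markers to compare leaves and branches on the same radius versus RMS velocity axes.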

##### Still left to do:

##### Plot just the leaves and just the branches
##### Check units and plot the Larson r-sigma relation to show expectation. The difference will allow you to see how much turbulence exists in the local ISM.
##### Relate YSO properties to clump properties
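
##### The expected size-linewidth relation can be sketched as follows. The assumptions are loud ones: an LMC distance of 50 kpc, the Larson (1981) form sigma = 1.10 L^0.38 (sigma in km/s, L in pc), and the catalog radius used as the size L, which is a simplification. The function names are hypothetical and the input radii are made up.

```python
import numpy as np

LMC_DISTANCE_PC = 50_000.0        # assumed distance to the LMC (~50 kpc)
ARCSEC_PER_RADIAN = 206264.806

def arcsec_to_pc(theta_arcsec, distance_pc=LMC_DISTANCE_PC):
    """Small-angle conversion from angular size to physical size in pc."""
    return np.asarray(theta_arcsec, dtype=float) * distance_pc / ARCSEC_PER_RADIAN

def larson_sigma(size_pc, coeff=1.10, exponent=0.38):
    """Expected velocity dispersion in km/s for a size in pc (Larson 1981 form)."""
    return coeff * np.asarray(size_pc, dtype=float) ** exponent

radius_pc = arcsec_to_pc([1.0, 4.0])      # made-up radii in arcsec
expected = larson_sigma(radius_pc)
print(radius_pc, expected)
```

##### To compare with the catalog, `v_rms` would first have to be converted from pixels to km/s using the channel width of the cube, which is not stated in this notebook.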