Identifying Young Stellar Object in Multi-dimensional Magnitude Space

JordanWu1997/YSO_Identifier

YSO Identifier: Identifying Young Stellar Objects in Multi-dimensional Magnitude Space

Context

1. Introduction

Young Stellar Objects (YSOs) are young stars at an early stage of evolution, comprising protostars and pre-main-sequence stars. Identifying YSOs is important for deriving statistical properties such as the star formation rate (SFR), which helps constrain star formation theories. In this work, we take the indirect approach: we construct a pipeline that classifies astronomical objects as evolved stars, stars, galaxies, or YSOs based solely on their photometric measurements in multiple bands. The classification relies on the regions that evolved stars, stars, and galaxies populate in multi-dimensional magnitude space; a source is classified as a YSO if it falls in none of those regions.

1.1 YSO Identification

There are two major approaches to YSO identification, direct and indirect:

  • Direct approach: Find objects with features of YSOs
    • Spectroscopy (pros: accurate; cons: NOT very efficient)
  • Indirect approach: Remove objects that are not YSOs
    • Evans et al. 2007: Color-color diagram (CCD), Color-magnitude diagram (CMD)
    • Hsieh and Lai 2013: Multi-D magnitude space (this approach is adopted by this work)
    • Chiu et al. 2021: Machine Learning

1.2 How Magnitude Space Works

Each location in magnitude space corresponds to a type of spectral energy distribution (SED), which reflects the composition of an object. Classifying objects in magnitude space is therefore equivalent to classifying them by their SEDs, and hence by their composition.

Cartoon_MMD_and_SED
Blue and green dots in 3D magnitude space correspond to different types of 3-band SEDs

Locations along the faint direction (the diagonal direction) share the same SED shape but differ in overall magnitude. They can be viewed as the same type of object appearing at different brightness because of distance.

Cartoon_SED_AND_MMD_diagonal_probe
Green dots within the orange probe along the faint direction can be viewed as the same type of object
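The link between distance and the faint direction can be checked with the standard distance modulus, m = M + 5 log10(d / 10 pc). The sketch below is an illustration only: the absolute magnitudes are made-up values, extinction is ignored, and `apparent_mags` and `sed_shape` are hypothetical helpers that are not part of this repository.

```python
import numpy as np

def apparent_mags(absolute_mags, distance_pc):
    """Apply the distance modulus m = M + 5 log10(d / 10 pc) to every band."""
    absolute_mags = np.asarray(absolute_mags, dtype=float)
    return absolute_mags + 5.0 * np.log10(distance_pc / 10.0)

def sed_shape(mags):
    """SED shape, expressed as magnitudes relative to the first band."""
    mags = np.asarray(mags, dtype=float)
    return mags - mags[0]

# Made-up 3-band absolute magnitudes for one object (illustration only)
absolute = [2.0, 1.5, 0.0]
near = apparent_mags(absolute, 100.0)    # object placed at 100 pc
far = apparent_mags(absolute, 1000.0)    # same object placed at 1000 pc

# Every band is fainter by the same amount: a step along the diagonal
offset = far - near
print(np.allclose(offset, 5.0))                          # True
print(np.allclose(sed_shape(near), sed_shape(far)))      # True
```

Because the offset is identical in every band, the object moves along the diagonal of magnitude space while its SED shape stays fixed.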

However, YSOs and galaxies have similar compositions (both are made of stars and dust), so SED shape alone cannot separate them. Their distances are very different, though: most observable YSOs lie within the Milky Way, whereas most galaxies lie far outside it. We therefore use the difference in brightness to separate them. This method has a caveat: because the separation is based on brightness, we may miss very faint YSOs, and very bright galaxies may contaminate the YSO sample. Fortunately, very bright galaxies are few.

Cartoon_SED_AND_MMD_YSO_AND_Galaxy
Green and blue dots within the orange probe indicate galaxies and YSOs, respectively, separated by their difference in brightness

2. Method

In this work, we use object samples to define object-populated regions in multi-D magnitude space. An object is classified as an evolved star, star, galaxy, or YSO according to the object-populated region in which it lies. The concept of multi-D magnitude space was first proposed by Hsieh & Lai 2013; this work extends it to higher dimensions.

Multi-D_magnitude_space
Multi-D magnitude space in this work (2D magnitude space schematics)

2.1 Find Object-populated Region

Hsieh & Lai 2013 use a multi-D array to represent the whole multi-D magnitude space, which requires an enormous amount of RAM. To solve this problem, we change the storage from a multi-D array to a 2D array made up of the locations of boundary points. We first project all object samples along the faint direction (as shown in the previous section, points along this direction have identical SED shapes) to find all SED shapes in the samples. Then, for each SED shape, we find the brightest and the faintest sample and store them as the bright-end and faint-end boundaries, respectively. We assume that object-populated regions are always continuous, so the bright-end and faint-end boundaries together define the object-populated region of the samples. For the samples used in this work, please check ./tables/README.md.

Cartoon_Finding_Boundary
Probe the green samples with the orange probe to find both the bright-end and faint-end boundaries
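The boundary search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in ./probe_model: the function name `probe_boundaries`, the choice of the first band as the reference for the projection, and the sample values are all hypothetical.

```python
import numpy as np

def probe_boundaries(mags, binsize):
    """Find the bright-end and faint-end boundary for each SED shape.

    mags is an (n_objects, n_bands) array of magnitudes. The result maps
    each SED shape (bin indices relative to the first band) to a tuple
    (bright_end, faint_end) of first-band bin indices.
    """
    bins = np.floor(np.asarray(mags, dtype=float) / binsize).astype(int)
    boundaries = {}
    for row in bins:
        # Projection along the faint direction (assumed: subtract the first band)
        shape = tuple(int(v) for v in row - row[0])
        position = int(row[0])
        bright_end, faint_end = boundaries.get(shape, (position, position))
        # Smaller magnitude means brighter, so the bright end is the minimum
        boundaries[shape] = (min(bright_end, position), max(faint_end, position))
    return boundaries

# Made-up 2-band samples: three share one SED shape, one has another
samples = [[10.0, 9.0], [12.0, 11.0], [15.0, 14.0], [11.0, 8.0]]
regions = probe_boundaries(samples, binsize=1.0)
print(regions)  # {(0, -1): (10, 15), (0, -3): (11, 11)}
```

Only two numbers are stored per observed SED shape, so memory grows with the number of distinct shapes in the samples instead of with the full volume of the magnitude space.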

2.2 Classification Pipeline

Input objects are first binned to save computation time; their locations in multi-D magnitude space are then compared with the object-populated regions probed with the method in the previous section. Note that we also define bright and faint regions for objects that fall outside the region of interest (where all samples are located) and assign them the object types bright and faint. Because of their brightness or faintness, we suggest that bright objects are YSOs and faint objects are galaxies. For a more detailed description, please check ./classification/README.md.

Cartoon_Classification_Pipeline_2D_MMD
Classification pipeline of this work, shown with 2D magnitude space schematics
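As a rough sketch of the comparison step, the snippet below bins an object, looks up its SED shape in each region, and falls back to YSO when no region contains it. It makes several simplifying assumptions that are not taken from the repository: a single bin size for all regions, regions stored as a mapping from SED shape to (bright_end, faint_end), a fixed checking order, and no handling of the bright and faint object types. All names and values are hypothetical.

```python
import numpy as np

def in_region(mag_bins, region):
    """Check whether a binned object lies between the boundaries of its SED shape."""
    shape = tuple(int(v) for v in mag_bins - mag_bins[0])
    if shape not in region:
        return False
    bright_end, faint_end = region[shape]
    return bright_end <= int(mag_bins[0]) <= faint_end

def classify(mags, binsize, regions):
    """Return the label of the first region containing the object, else 'YSO'."""
    mag_bins = np.floor(np.asarray(mags, dtype=float) / binsize).astype(int)
    for label, region in regions.items():  # dicts keep insertion order
        if in_region(mag_bins, region):
            return label
    return "YSO"

# Made-up regions for a 2-band space (illustration only)
regions = {
    "evolved_star": {(0, -3): (5, 8)},
    "star": {(0, 0): (4, 16)},
    "galaxy": {(0, -2): (14, 18)},
}

print(classify([15.2, 13.1], 1.0, regions))  # galaxy
print(classify([10.3, 8.4], 1.0, regions))   # YSO
print(classify([9.5, 9.7], 1.0, regions))    # star
```

The first two objects have the same SED shape; only the fainter one falls inside the galaxy boundaries, so the brighter one ends up as a YSO.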

2.3 Isolated Object and Reclassification Process

Since the multi-D magnitude space is huge and it is hard to observe all SED shapes in practice, some regions contain no observed samples. Such a region is called an isolated region, because the corresponding SED shapes are missing from the samples, and objects located in it are called isolated objects.

Isolated_Region
Isolated region defined in this work, i.e. the region for which we have no samples

To make the most of the samples, we introduce a reclassification process for isolated objects. It classifies an isolated object using the boundary points with the most similar SED as a reference, which is equivalent to interpolating or extrapolating from our samples. Note that we apply this process only to galaxy samples in this work.

Reclassification_Process
Reclassification process and detailed criteria
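The idea of borrowing the boundaries of the most similar SED can be sketched as a nearest-neighbour lookup. The detailed criteria used by this work are given in the figure and are not reproduced here; the Euclidean distance between binned SED shapes, the function names, and the example region are assumptions made for illustration.

```python
import numpy as np

def nearest_shape(shape, known_shapes):
    """Return the known SED shape closest to `shape` (Euclidean distance in bins)."""
    known = np.asarray(known_shapes, dtype=float)
    distance = np.linalg.norm(known - np.asarray(shape, dtype=float), axis=1)
    return tuple(int(v) for v in known[np.argmin(distance)])

def reclassify_as_galaxy(mag_bins, galaxy_region):
    """Test an isolated object against the boundaries of the most similar SED shape."""
    mag_bins = np.asarray(mag_bins)
    shape = mag_bins - mag_bins[0]
    reference = nearest_shape(shape, list(galaxy_region))
    bright_end, faint_end = galaxy_region[reference]
    return bright_end <= int(mag_bins[0]) <= faint_end

# Made-up galaxy region: SED shape -> (bright_end, faint_end) in first-band bins
galaxy_region = {(0, -2): (14, 18), (0, -5): (16, 19)}

# The shape (0, -3) was never sampled, so the nearest shape (0, -2) is used
print(reclassify_as_galaxy([15, 12], galaxy_region))  # True
print(reclassify_as_galaxy([10, 7], galaxy_region))   # False
```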

3. TL;DR How to use this tool?

3.1. Preparation

3.1.1. Install Required Python Packages (Python 3)

python3 -m pip install -r ./requirements.txt

3.1.2. Prepare Sample Catalogs

This work needs three sample catalogs: evolved stars, stars, and galaxies. We provide them in the ./tables directory. Note, however, that the template star catalog is too large (~120 MB) for GitHub, so we provide a script that lets users generate it themselves. You can skip this section if you want to use your own sample catalogs. For the sample catalog format, please check ./tables/README.md.

cd ./tables # Make sure you are in the tables directory
chmod u+x ./generate_star_sample_catalog.sh
./generate_star_sample_catalog.sh
cd ..

3.1.3. Check Parameters

The Python object Model stores the parameters for multi-dimensional magnitude space. For more details, please check the ./model.py file. Use vim or any editor you like to check the variables in Model.

vim ./model.py

3.2 Probe Object-populated Region

Probe object samples in multi-dimensional magnitude space to get the object-populated regions. By default, we probe the evolved star, star, and galaxy samples from the input sample catalogs in the ./tables directory with bin sizes of 1.0, 0.5, and 0.2 magnitude, respectively. For more details about input/output/module files, please check the ./probe_model directory. Use vim or any editor you like to check the inputs.

vim ./run_probe_model.py

Please check the following 1D lists in main(), especially if you are using your own sample catalogs. Note that lists 1 and 2 should have the same length.

  1. input_catalog_list: input catalog list for samples (e.g. evolved star, star, and galaxy)
  2. input_name_list: input catalog name list (this would be later used as output model name)
  3. binsize_model_list: bin size list (bin size used to probe multi-D space)

Once the inputs have been checked, run

python3 ./run_probe_model.py

3.3 Run Classification

Choose either of the following ways to run classification. For more details about input/output/module files, please check the ./classification directory.

3.3.1. With Interactively Input Catalogs

python3 ./run_classification.py interactively

3.3.2. With Preset Input Catalogs

Recommended if you have many catalogs to classify. Note that you have to assign models (e.g. evolved star, star, galaxy, and bin size) for every input catalog. Use vim or any editor you like to check the inputs.

vim ./run_classification.py

Please check the following 1D lists in main() to make sure the inputs are correct, especially if you are using your own models generated from your own sample catalogs. Note that lists 1 to 5 should have the same length.

  1. catalog_list: input catalog list
  2. evolved_star_model_list: evolved star model name list
  3. star_model_list: star model name list
  4. galaxy_model_list: galaxy model name list
  5. binsize_model_list: bin size list
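Since the five lists must line up entry by entry, a small check before running can catch a mismatch early. The helper below is a hypothetical sketch and is not part of ./run_classification.py; the catalog and model names are placeholders, not files shipped with this repository.

```python
def build_jobs(catalog_list, evolved_star_model_list, star_model_list,
               galaxy_model_list, binsize_model_list):
    """Pair each input catalog with its models and bin size, checking list lengths."""
    lists = [catalog_list, evolved_star_model_list, star_model_list,
             galaxy_model_list, binsize_model_list]
    lengths = [len(entry) for entry in lists]
    if len(set(lengths)) != 1:
        raise ValueError(f"Lists 1 to 5 must have the same length, got {lengths}")
    return list(zip(*lists))

# Placeholder names for illustration only
jobs = build_jobs(
    catalog_list=["cloud_A.tbl", "cloud_B.tbl"],
    evolved_star_model_list=["evolved_star", "evolved_star"],
    star_model_list=["star", "star"],
    galaxy_model_list=["galaxy", "galaxy"],
    binsize_model_list=[1.0, 0.5],
)
for catalog, evolved_star, star, galaxy, binsize in jobs:
    print(catalog, evolved_star, star, galaxy, binsize)
```

Each tuple in `jobs` then holds everything needed for one catalog, which makes a missing model or bin size visible before any classification starts.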

Once the inputs have been checked, run

python3 ./run_classification.py

3.4 Visualization

For more details about input/output/module files, please check the ./make_plot directory.

3.4.1 Magnitude-Magnitude Diagram (MMD)

vim ./plot_sample_MMD.py # Check input catalogs
python3 ./plot_sample_MMD.py

MMD_sample

vim ./plot_result_MMD.py # Check input catalogs
python3 ./plot_result_MMD.py

MMD_result

3.4.2 Spectral Energy Distribution (SED) in Magnitude

vim ./plot_sample_SED.py # Check input catalogs
python3 ./plot_sample_SED.py

SED_this_work_sample

3.4.3 Venn Diagram for Models

vim ./plot_model_venn_diagram.py # Check input catalogs
python3 ./plot_model_venn_diagram.py

VD_WO_projection VD_WI_projection
