Alexey Mints edited this page Dec 10, 2020 · 7 revisions

The (outdated) original version of this page is here

UniDAM

A Unified tool to estimate Distances, Ages, and Masses (UniDAM) is described in Mints & Hekker (2017) and Mints & Hekker (2018).

Installation

Installation instructions

Input data

Model file

The model file contains a set of stellar models (so far only from PARSEC; others are in development). It is a FITS file that contains two tables. An example of the structure is given below:

No.    Name      Ver    Type      Cards   Dimensions   Format
  0  PRIMARY       1 PrimaryHDU       5   ()      
  1  parsec_models.fits    1 BinTableHDU     40   2871540R x 14C   [E, E, E, E, E, E, E, E, E, E, E, E, I, E]   
  2                1 BinTableHDU     10   177R x 1C   [E]   

Columns in the first BinTableHDU are arbitrary, except for the last two, which must be the stage label (integer) and the model weight (float). In the current version the columns are: ..... The second BinTableHDU contains the log(age) grid used in the first table.

Input file

The input file is an astropy-readable table with the following columns:

Column Units
id
T K
logg $\log(cm\,s^{-2})$
feh dex
dT K
dlogg $\log(cm\,s^{-2})$
dfeh dex
Jmag mag
e_Jmag mag
Hmag mag
e_Hmag mag
Kmag mag
e_Kmag mag
W1mag mag
e_W1mag mag
W2mag mag
e_W2mag mag

Magnitudes here are for 2MASS and AllWISE. In the more general case, two groups of columns can be used (the id column must always be present):

  • parameter columns, in pairs of parameter and dparameter for a value and its uncertainty. These columns are then listed in the model_columns parameter of the configuration file. These parameters must be present in the model file with exactly the same name.
  • photometry columns, in pairs of Xmag and e_Xmag for magnitude and uncertainty in band X. These columns can further be referred to in the configuration file parameter band_columns. The band X should be present in the model file columns.
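
The naming rules above can be sketched as a small helper that lists the columns an input table is expected to contain. This is only an illustration: the function names are hypothetical and are not part of UniDAM, and the two lists are taken from the model_columns and band_columns values of the configuration example.

```python
def required_input_columns(model_columns, band_columns):
    """Return the input-file column names implied by the two config lists."""
    columns = ["id"]  # the id column must always be present
    for name in model_columns:
        columns += [name, "d" + name]  # value and its uncertainty
    for band in band_columns:
        columns += [band + "mag", "e_" + band + "mag"]  # magnitude and its uncertainty
    return columns


def missing_columns(header, model_columns, band_columns):
    """List the required columns that are absent from a table header."""
    present = set(header)
    return [c for c in required_input_columns(model_columns, band_columns)
            if c not in present]


expected = required_input_columns(["T", "logg", "feh"], ["J", "H", "K", "W1", "W2"])
print(expected)
```

Calling missing_columns with the header of your own table before a run is a cheap way to catch a misspelled column name.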

If you know the parallax (e.g. from Gaia), you can add the following columns (column names can be different, but this should be reflected in the configuration file):

Column Units
parallax arcsec
parallax_error arcsec
extinction mag
extinction_error mag

The extinction value is a soft upper limit on $A_K$: the extinction prior is flat below the limit and decreases exponentially above it, with a width given by extinction_error.
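
A minimal sketch of such a prior is given below. The exact functional form is an assumption (a plain exponential with e-folding scale equal to extinction_error, left unnormalized); the code inside UniDAM may differ in detail, and the function name is hypothetical.

```python
import math


def extinction_prior(a_k, limit, width):
    """Unnormalized prior: flat up to `limit`, exponential decay above it.

    Assumed form, not copied from the UniDAM source.
    """
    if a_k <= limit:
        return 1.0
    return math.exp(-(a_k - limit) / width)


# Illustrative values: extinction = 0.10 mag, extinction_error = 0.05 mag.
for a_k in (0.05, 0.10, 0.15, 0.20):
    print(a_k, round(extinction_prior(a_k, limit=0.10, width=0.05), 4))
```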

Other columns are possible, but they will be ignored.

Configuration file

The configuration file follows the format of Python's configparser module. Here is an example of the configuration file:

[general]
# Path to the model file
model_file=PARSEC.fits

# Store stacked full PDF data in distance modulus, log(age) and 
# 2D (distance modulus vs log(age)) PDFs
dump_pdf=True

# Folder name for debug dump data (activated by -d switch, default=dump)
dump_prefix=dump_gaia

# Distance prior values are:
# 0: no prior,
# 1: volume prior (default)
# 2: decreasing exponential density prior (experimental)
distance_prior=1

# These columns are passed to output without changes.
keep_columns=

# Maximum difference between model and observed values in sigmas (default=4)
# Setting higher values can slow down the calculations dramatically with little effect on the results.
# Setting lower values speeds things up, but affects the outcome.
max_param_err=4.

# Whether to allow extinction to be negative (see discussion in sec 4.4 of Mints&Hekker 2017).
allow_negative_extinction=False

# This refers to columns in the model_file. 
# Input file columns should have the following columns: 
# For each column A from model_columns 
# 1) a column A with a value
# 2) a column dA with the uncertainty
model_columns=T,logg,feh

# For each band column A there should be 
# 1) a column Amag with a value
# 2) a column e_Amag with the uncertainty
band_columns=J,H,K,W1,W2

# These columns will be used to derive output parameters
# Fitted column names are taken from the model file.
# any of [distance_modulus,distance,extinction,parallax] can also be used.
fitted_columns=stage,age,mass,distance_modulus,distance,extinction,parallax

# If parallax is known, parallax and extinction sections below are required.
parallax_known=True

[parallax]
column=parallax
err_column=parallax_error

[extinction]
column=extinction
err_column=extinction_error
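
Because the file uses the configparser format, it can be checked with the standard library before starting a run. The sketch below parses a shortened copy of the example above; the helper split_list and the section check are illustrative only and do not claim to reproduce how UniDAM reads its configuration.

```python
import configparser

EXAMPLE = """
[general]
model_file=PARSEC.fits
dump_pdf=True
distance_prior=1
keep_columns=
max_param_err=4.
model_columns=T,logg,feh
band_columns=J,H,K,W1,W2
parallax_known=True

[parallax]
column=parallax
err_column=parallax_error

[extinction]
column=extinction
err_column=extinction_error
"""


def split_list(value):
    """Turn a comma-separated option into a list, dropping empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


config = configparser.ConfigParser()
config.read_string(EXAMPLE)
general = config["general"]

model_columns = split_list(general["model_columns"])
band_columns = split_list(general["band_columns"])
keep_columns = split_list(general.get("keep_columns", ""))
max_param_err = general.getfloat("max_param_err", fallback=4.0)

# The [parallax] and [extinction] sections are required when parallax_known is set.
if general.getboolean("parallax_known", fallback=False):
    for section in ("parallax", "extinction"):
        if not config.has_section(section):
            raise ValueError("missing section [%s]" % section)

print(model_columns, band_columns, keep_columns, max_param_err)
```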

Running UniDAM

>./unidam_runner.py --help
usage: unidam_runner.py [-h] -i INPUT [-o OUTPUT] -c CONFIG [--id ID] [-t]
                        [-C CONFIG_OVERRIDE] [--parallax-zero PARALLAX_ZERO]
                        [-d | -p]

Tool to estimate distances to stars.

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        Input file name (any astropy-readable table)
  -o OUTPUT, --output OUTPUT
                        Output file name
  -c CONFIG, --config CONFIG
                        Config file name
  --id ID               Run for just a single ID or a comma-separated list of
                        IDs
  -t, --time            Add timing output (only in parallel mode)
  -C CONFIG_OVERRIDE    Override config params, in the form -C PARAM=VALUE.
                        Can be used multiple times.
  --parallax-zero PARALLAX_ZERO
                        Parallax zero point value
  -d, --dump-results    Dump model data for each star
  -p, --parallel        Run in parallel (uses OMP_NUM_THREADS if given,
                        otherwise 2 threads)

Batch mode

Run

python3 unidam_runner.py -i input_file -o output_file -c config_file

where config_file is the configuration file (e.g. unidam_pdf.conf). Add the -p flag for parallel execution; the number of threads is set by the OMP_NUM_THREADS shell variable (2 if it is not set).
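
When batch runs are launched from a script, the command line and the thread count can be assembled as in the sketch below. The flags are those listed in the help output; the helper names and the file names input.fits and output.fits are placeholders chosen for illustration.

```python
import os
import sys


def build_batch_command(input_file, output_file, config_file, parallel=False):
    """Assemble the argument list for a batch run of unidam_runner.py."""
    command = [sys.executable, "unidam_runner.py",
               "-i", input_file, "-o", output_file, "-c", config_file]
    if parallel:
        command.append("-p")
    return command


def build_environment(threads=None):
    """Copy the current environment, optionally setting OMP_NUM_THREADS."""
    env = dict(os.environ)
    if threads is not None:
        env["OMP_NUM_THREADS"] = str(threads)
    return env


command = build_batch_command("input.fits", "output.fits", "unidam_pdf.conf",
                              parallel=True)
env = build_environment(threads=4)
print(" ".join(command))
# To launch the run: subprocess.run(command, env=env, check=True)
```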

Debug mode

Run

python3 unidam_runner.py -d --id IDs -i input_file -o output_file -c config_file

where IDs is a comma-separated list of IDs from the input file and config_file is the configuration file (e.g. unidam_pdf.conf). Two files are created for each ID:

  • {dump_prefix}/dump_{ID}.dat -- an ASCII table with parameters for all models that fit.
  • {dump_prefix}/dump_{ID}.json -- JSON file with extended data for each solution.

dump_prefix is taken from the configuration file (default: dump).
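
The file naming can be sketched as follows, which is handy for checking that a debug run produced everything it should. The helper dump_paths and the IDs 101 and 102 are hypothetical; only the path pattern comes from the description above.

```python
from pathlib import Path


def dump_paths(dump_prefix, ids):
    """Map each ID to the (.dat, .json) files a debug run is expected to write."""
    folder = Path(dump_prefix)
    return {star_id: (folder / ("dump_%s.dat" % star_id),
                      folder / ("dump_%s.json" % star_id))
            for star_id in ids}


id_argument = "101,102"  # hypothetical value passed via --id
paths = dump_paths("dump_gaia", id_argument.split(","))
for star_id, (dat_file, json_file) in paths.items():
    # exists() reports whether the run has already written the file
    print(star_id, dat_file, json_file, dat_file.exists())
```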

Output description:

Column name Units Description
id Unique ID of the star from the input data
stage Stage number (I, II or III)
uspdf_priority Priority order of a given USPDF (starting from 0)
uspdf_weight Weight of a given USPDF $V_m$
total_uspdfs Number of USPDF with $V_m > 0.03$
p_best Probability for a best-fitting model
p_sed p-value from $\chi^2$ SED fit
quality Quality flag (see below)
distance_modulus_smooth mag Smoothing used in distance modulus PDF calculations
extinction_smooth mag Smoothing used in extinction PDF calculations
extinction_zero Fraction of the PDF with zero extinction
For every param in the fitted_columns list
param_mean Mean value of param PDF
param_err Standard deviation of param PDF
param_mode Mode of param PDF
param_median Median of param PDF
param_fit Letter indicating the fitted function
param_par Parameters of the fitted function
Distance modulus - logarithm of age relation:
dm_age_slope Slope of the relation
dm_age_intercept Intercept of the relation
dm_age_scatter Scatter of the relation
dm_age_mad Median absolute deviation of the relation
Distance modulus - logarithm of mass relation:
dm_mass_slope Slope of the relation
dm_mass_intercept Intercept of the relation
dm_mass_scatter Scatter of the relation
dm_mass_mad Median absolute deviation of the relation

See Mints & Hekker (2017) for the column descriptions. All _fit columns contain a letter designating the best-fitting function.

List of the functions used:

  • G - Gaussian (omitted in the current version, but can appear in old output files). $f(x) = e^{-\frac{(x-p_1)^2}{2 p_2^2}}$

  • T - Truncated Gaussian $$ f(x) = \begin{cases} e^{-\frac{(x-p_1)^2}{2 p_2^2}} & \textrm{if } p_3 < x < p_4; \\ 0 & \textrm{otherwise} \end{cases} $$

  • L - Laplace $$ f(x) = \begin{cases} e^{-(x - p_1) / p_2} & \textrm{if } p_3 < x < p_4; \\ 0 & \textrm{otherwise} \end{cases} $$

  • S - Skew-gaussian

  • F - Straight line $$ f(x) = \begin{cases} p_1 x + p_2 & \textrm{if } p_3 < x < p_4; \\ 0 & \textrm{otherwise} \end{cases} $$

  • P - Student's t-distribution

The function is selected based on the symmetric KL divergence. Note that all functions are unnormalized.
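
To illustrate the selection criterion, the sketch below compares a toy binned PDF with two candidate fits and picks the one with the smaller symmetric KL divergence. It assumes the common symmetrization D(p||q) + D(q||p) on distributions normalized to unit sum; the variant, binning and normalization used inside UniDAM are not specified here, and the candidate labels A and B are arbitrary.

```python
import math


def kl_divergence(p, q):
    """D_KL(p || q) for two discrete distributions on the same grid."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Sum of the two directed divergences (one common symmetrization)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


def normalize(values):
    """Scale a list of non-negative values to unit sum."""
    total = sum(values)
    return [v / total for v in values]


# Toy binned PDF and two candidate fits, all strictly positive.
pdf = normalize([1.0, 4.0, 6.0, 4.0, 1.0])
candidates = {
    "A": normalize([1.2, 3.8, 6.0, 3.8, 1.2]),
    "B": normalize([3.0, 3.0, 3.0, 3.0, 3.0]),
}
scores = {name: symmetric_kl(pdf, fit) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores)
```

Note that this simple version requires every candidate to be non-zero wherever the PDF is non-zero; truncated functions would need extra care at the bin edges.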

When working with results, the following procedure is recommended to obtain the PDF:

  1. Use the get_param function from unidam.utils.local. It returns a distribution class f and an array of parameters p.
  2. Run y = f(x, *p), where x are the values at which you want to evaluate the PDF.
  3. The PDF for this solution is then y = y * uspdf_weight / y.sum(), where uspdf_weight is the weight of the USPDF for that solution.
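
Steps 2 and 3 can be sketched as follows. Since the call signature of get_param is not documented here, the example does not call it; instead it uses a local stand-in for the truncated Gaussian (letter T) and made-up values for the parameters and uspdf_weight. With real output, f and p would come from get_param.

```python
import numpy as np


def truncated_gaussian(x, p1, p2, p3, p4):
    """Unnormalized truncated Gaussian (letter T in the list above)."""
    y = np.exp(-(x - p1) ** 2 / (2.0 * p2 ** 2))
    return np.where((x > p3) & (x < p4), y, 0.0)


# Stand-ins for what get_param would return; all numbers are made up.
f = truncated_gaussian
p = [9.5, 0.2, 9.0, 10.1]  # p_1 .. p_4
uspdf_weight = 0.7

x = np.linspace(8.5, 10.5, 201)  # grid on which to evaluate the PDF
y = f(x, *p)                     # step 2: unnormalized values
y = y * uspdf_weight / y.sum()   # step 3: scale so that the sum equals the USPDF weight

print(y.sum())
```

After this scaling, the PDFs of all solutions for one star can be added on a common grid, each contributing in proportion to its weight.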