# Defining Column Names for atlas_objects

Here we define and document the columns used in the atlas_objects table.
Currently the columns created by the rockAtlas code (https://github.com/thespacedoctor/rockAtlas) describe the detections of an asteroid in the ATLAS o and c filters (Tonry et al. 2018) and the phase curve fit using the two-parameter HG system (Bowell et al. 1989).

We now add columns for fits using other phase curve models, and to record the metrics associated with each fit.
The parameters (and fitting metrics) for each model, for each filter, must be recorded in the atlas_objects table as unique columns.
We use the format:

phase_curve_{1}_{2}_{3}

Where the terms to be inserted into the string are defined as follows:

1. Model Parameter or Fit Metric
2. Model Shorthand
3. Filter: o or c
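
As a minimal sketch of the naming format above, the helper below joins the three terms into a column name. The function name `phase_curve_column` is hypothetical and is not part of rockAtlas; it assumes that a blank shorthand is simply skipped, which reproduces the existing HG column names.

```python
def phase_curve_column(term, shorthand, filt):
    """Build a column name of the form phase_curve_{1}_{2}_{3}.

    term: model parameter or fit metric, e.g. "H_err" or "OC_std"
    shorthand: model shorthand, e.g. "3M10" (may be blank)
    filt: ATLAS filter, "o" or "c"
    """
    if filt not in ("o", "c"):
        raise ValueError("filter must be 'o' or 'c'")
    parts = ["phase_curve", term]
    if shorthand:  # assumption: a blank shorthand is left out of the name
        parts.append(shorthand)
    parts.append(filt)
    return "_".join(parts)


print(phase_curve_column("H_err", "3M10", "o"))  # phase_curve_H_err_3M10_o
print(phase_curve_column("G", "", "c"))          # phase_curve_G_c
```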

## Model Names and Parameters

We use the phase curve models that are supplied with the sbpy package (https://sbpy.readthedocs.io/en/latest/sbpy/photometry.html).
Each model describes the phase curve with an absolute magnitude ('H') and one or more slope ('G') parameters.
Each of these parameters will have an associated uncertainty/error from the fit.
We use a unique shorthand to distinguish each model, which references the paper in which the model is defined.

|Name | Parameters | Shorthand | Source|
|---|:---|---|:---|
|HG | H, H_err, G, G_err | B89 (or blank*) | Bowell et al. 1989|
|HG1G2 | H, H_err, G1, G1_err, G2, G2_err | 3M10 | 3 parameter Muinonen et al. 2010|
|HG12 | H, H_err, G12, G12_err| 2M10 | 2 parameter Muinonen et al. 2010|
|HG12_Pen16 | H, H_err, G12, G12_err | P16 | Penttilä et al. 2016|

*To be consistent with the existing columns in the atlas_objects table, the HG system can go without a shorthand marker (it is the default phase curve model).


N.B. should we add the linear phase function?

## Fit Metrics

|Name | Description |
|---|:---|
|N_fit | number of data points used for the fit|
|N_nights | the number of individual nights of observation used for the fit (number of unique integer MJDs for the fit data points)|
|N_iter | the number of iterations required to cut outlying data points (CHOOSE FROM 1 mag, 2 sigma, 3 sigma clipping) until the change in H,G parameters is <0.01|
|alpha_min | the smallest phase angle used in the fit (degrees)|
|alpha_max | the largest phase angle used in the fit (degrees)|
|N_alpha_low | the number of data points used in the fit with phase angle < 5 degrees|
|nfev | number of scipy.optimize.leastsq function evaluations |
|ierr | scipy.optimize.leastsq integer flag, returned as ier (1, 2, 3 or 4 -> solution found) |
|N_mag_err | Number of data points used in the fit with magnitude error less than threshold (0.1 mag) |
|OC_mean | Mean value of the observed - calculated residuals of the data points that were fit (mag)|
|OC_std | Standard deviation of the residuals (mag)|
|OC_range | Range of the residuals (mag)|
|N_cut | Number of data points cut during fit |
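
Several of the metrics in the table follow directly from the arrays of data points that went into a fit. The sketch below shows one way they could be computed with numpy. The function name `summarise_fit` and its arguments are hypothetical, the thresholds are the ones quoted in the table, and OC_std is assumed here to be the population standard deviation (numpy's default, ddof=0).

```python
import numpy as np


def summarise_fit(mjd, alpha, mag_err, residuals, alpha_low=5.0, mag_err_max=0.1):
    """Return a dict of fit metrics for the data points used in one fit.

    mjd: observation times (MJD)
    alpha: phase angles (degrees)
    mag_err: magnitude uncertainties (mag)
    residuals: observed - calculated magnitudes (mag)
    """
    mjd = np.asarray(mjd, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    mag_err = np.asarray(mag_err, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    return {
        "N_fit": int(len(alpha)),
        # number of unique integer MJDs among the fitted points
        "N_nights": int(len(np.unique(np.floor(mjd)))),
        "alpha_min": float(alpha.min()),
        "alpha_max": float(alpha.max()),
        "N_alpha_low": int(np.sum(alpha < alpha_low)),
        "N_mag_err": int(np.sum(mag_err < mag_err_max)),
        "OC_mean": float(residuals.mean()),
        "OC_std": float(residuals.std()),  # assumption: ddof=0
        "OC_range": float(residuals.max() - residuals.min()),
    }


# Made-up data points, for illustration only
metrics = summarise_fit(
    mjd=[59000.2, 59000.3, 59002.5, 59010.1],
    alpha=[2.0, 2.1, 4.5, 12.0],
    mag_err=[0.05, 0.08, 0.12, 0.03],
    residuals=[0.1, -0.1, 0.2, -0.2],
)
for name, value in metrics.items():
    print(name, value)
```

The remaining metrics (N_iter, nfev, ierr, N_cut) depend on the fitting and clipping procedure itself, so they are not covered by this sketch.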

N.B. the column names currently do not indicate the type of data clipping used in the fitting process.
We must either pick one beforehand and use it exclusively, or add it as another descriptor in the column name (this will greatly increase the number of columns!).

The following columns record information on individual apparitions.

|Name | Description |
|---|:---|
|N_apparitions | Number of apparitions found in all the data, using either the change in solar elongation or a JPL query |
|phase_curve_med_H_app_o | The HG model is fitted to the o filter data of each apparition. This parameter is the median of the fitted H values|
|phase_curve_med_std_app_o | The std of each apparition's data about the fitted HG model is recorded. This parameter is the median of those std values|
|phase_angle_app_range_o | The phase angle range of each apparition is recorded. This parameter is the maximum difference in phase angle range between apparitions|
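
The per-apparition summaries above can be sketched as follows, given the results of fitting each apparition separately. The function `apparition_summary` and the input values are hypothetical. The sketch also assumes that "maximum difference in phase angle range" means the largest apparition range minus the smallest; the source does not spell this out.

```python
import numpy as np


def apparition_summary(H_app, std_app, alpha_app):
    """Summarise per-apparition HG fits for one object in the o filter.

    H_app: fitted H for each apparition (mag)
    std_app: std of the residuals about the fit for each apparition (mag)
    alpha_app: list of arrays, the phase angles observed in each apparition (degrees)
    """
    # phase angle range covered by each apparition
    ranges = np.array([np.max(a) - np.min(a) for a in alpha_app])
    return {
        "phase_curve_med_H_app_o": float(np.median(H_app)),
        "phase_curve_med_std_app_o": float(np.median(std_app)),
        # assumption: largest range minus smallest range
        "phase_angle_app_range_o": float(ranges.max() - ranges.min()),
    }


# Made-up values for three apparitions, for illustration only
summary = apparition_summary(
    H_app=[15.20, 15.35, 15.28],
    std_app=[0.08, 0.12, 0.10],
    alpha_app=[np.array([2.0, 10.0]), np.array([5.0, 20.0]), np.array([1.0, 4.0])],
)
print(summary)
```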


To make changes, run this notebook to update atlas_objects_fields.txt.

Then run create_sql_atlas_objects_phase.py to create the SQL query, followed by connect_create_atlas_objects_phase.py to run that query on the remote database.

Finally, dtypes_atlas_phase_fits.txt needs to be manually updated; it is used when loading the atlas_phase_fits table with tools/database_tools.py.

In [1]:
import numpy as np

In [2]:
# These are the existing columns in the atlas_objects table (HG fit only)
current_columns=["dateLastModified", "detection_count", "detection_count_c",
"detection_count_o", "last_detection_mjd", "last_photometry_update_date_c",
"last_photometry_update_date_o", "mpc_number", "name", "orbital_elements_id",
"phase_angle_range_c", "phase_angle_range_o", "phase_curve_G_c",
"phase_curve_G_err_c", "phase_curve_G_err_o", "phase_curve_G_o",
"phase_curve_H_c", "phase_curve_H_err_c", "phase_curve_H_err_o",
"phase_curve_H_o", "phase_curve_refresh_date_c", "phase_curve_refresh_date_o",
"primaryId", "updated"]
# for c in current_columns:
#     print(c)

In [3]:
filters = ["o","c"]
model_names = ["HG", "HG1G2", "HG12", "HG12_Pen16"]
model_short = ["_B89","_3M10","_2M10","_P16"] # use shorthand for all models
#model_short = ["","_3M10","_2M10","_P16"] # do not use shorthand for HG to be consistent with existing column names
phase_curve = [["H","G"],["H","G1","G2"],["H","G12"],["H","G12"]]
phase_curve_err = [["H_err","G_err"],["H_err","G1_err","G2_err"],["H_err","G12_err"],["H_err","G12_err"]]
fit_metrics = ["OC_mean","OC_std","OC_range","skew","kurtosis","KS_D","KS_p"]
app_metrics = ["N_fit","alpha_min","N_alpha_low","N_mag_err","N_data_app"]
apparition_columns = ["N_apparitions","app_ind","app_start_mjd","fit_slope"]

In [4]:
# drop these HG columns as we are updating them with a "B89" tag
drop_cols=["phase_curve_G_c",
"phase_curve_G_err_c", "phase_curve_G_err_o", "phase_curve_G_o",
"phase_curve_H_c", "phase_curve_H_err_c", "phase_curve_H_err_o",
"phase_curve_H_o"]
current_columns=np.array(current_columns)
print(current_columns)
current_columns=current_columns[~np.isin(current_columns,drop_cols)]
print(current_columns)

['dateLastModified' 'detection_count' 'detection_count_c'
 'detection_count_o' 'last_detection_mjd' 'last_photometry_update_date_c'
 'last_photometry_update_date_o' 'mpc_number' 'name' 'orbital_elements_id'
 'phase_angle_range_c' 'phase_angle_range_o' 'phase_curve_G_c'
 'phase_curve_G_err_c' 'phase_curve_G_err_o' 'phase_curve_G_o'
 'phase_curve_H_c' 'phase_curve_H_err_c' 'phase_curve_H_err_o'
 'phase_curve_H_o' 'phase_curve_refresh_date_c'
 'phase_curve_refresh_date_o' 'primaryId' 'updated']
['dateLastModified' 'detection_count' 'detection_count_c'
 'detection_count_o' 'last_detection_mjd' 'last_photometry_update_date_c'
 'last_photometry_update_date_o' 'mpc_number' 'name' 'orbital_elements_id'
 'phase_angle_range_c' 'phase_angle_range_o' 'phase_curve_refresh_date_c'
 'phase_curve_refresh_date_o' 'primaryId' 'updated']


In [5]:
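# Build the new column names for every model and filter
new_columns=[]
for i,mod in enumerate(model_short):
    for fil in filters:
        # model parameters, e.g. phase_curve_H_B89_o
        for hg in phase_curve[i]:
            column="phase_curve_{}{}_{}".format(hg,mod,fil)
            new_columns.append(column)
        # parameter uncertainties, e.g. phase_curve_H_err_B89_o
        for hg_err in phase_curve_err[i]:
            column="phase_curve_{}{}_{}".format(hg_err,mod,fil)
            new_columns.append(column)
        # fit metrics are recorded per model and per filter
        for met in fit_metrics:
            column="phase_curve_{}{}_{}".format(met,mod,fil)
            new_columns.append(column)
        # these metrics carry no model shorthand, so the same names recur for
        # each model; the duplicates are removed by np.unique in the next cell
        for met in app_metrics:
            column="phase_curve_{}_{}".format(met,fil)
            new_columns.append(column)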

In [6]:
# print(len(current_columns),len(new_columns),len(current_columns)+len(new_columns))
all_columns=np.unique(np.append(current_columns,new_columns))
all_columns=np.append(all_columns,apparition_columns)
# print(all_columns)
for a in all_columns:
    print(a)
print(len(all_columns))

dateLastModified
detection_count
detection_count_c
detection_count_o
last_detection_mjd
last_photometry_update_date_c
last_photometry_update_date_o
mpc_number
name
orbital_elements_id
phase_angle_range_c
phase_angle_range_o
phase_curve_G12_2M10_c
phase_curve_G12_2M10_o
phase_curve_G12_P16_c
phase_curve_G12_P16_o
phase_curve_G12_err_2M10_c
phase_curve_G12_err_2M10_o
phase_curve_G12_err_P16_c
phase_curve_G12_err_P16_o
phase_curve_G1_3M10_c
phase_curve_G1_3M10_o
phase_curve_G1_err_3M10_c
phase_curve_G1_err_3M10_o
phase_curve_G2_3M10_c
phase_curve_G2_3M10_o
phase_curve_G2_err_3M10_c
phase_curve_G2_err_3M10_o
phase_curve_G_B89_c
phase_curve_G_B89_o
phase_curve_G_err_B89_c
phase_curve_G_err_B89_o
phase_curve_H_2M10_c
phase_curve_H_2M10_o
phase_curve_H_3M10_c
phase_curve_H_3M10_o
phase_curve_H_B89_c
phase_curve_H_B89_o
phase_curve_H_P16_c
phase_curve_H_P16_o
phase_curve_H_err_2M10_c
phase_curve_H_err_2M10_o
phase_curve_H_err_3M10_c
phase_curve_H_err_3M10_o
phase_curve_H_err_B89_c
phase_curve_

In [7]:
with open("atlas_objects_app_fields.txt", "w") as outfile:
    outfile.write("\n".join(all_columns))