# Statistics of star measurements

The purpose of this notebook is to estimate what fraction of stars in a given field have measurements in one, two, or three filters.

To get a straightforward estimate, we use the field index from the field's crossmatch table, which records each star's ID in the reference image for each dataset and filter. We treat a non-zero ID as a rough indicator that the star would have produced valid lightcurve measurements in that filter.

In [8]:
from pyDANDIA import crossmatch
import numpy as np

We will use the data from ROME-FIELD-01 as a proxy, since it should be reasonably consistent with the other fields. 

In [3]:
xmatch_file = '/Users/rstreet/ROME/ROME-FIELD-01/DR1/ROME-FIELD-01_field_crossmatch.fits'

xmatch = crossmatch.CrossMatchTable()
xmatch.load(xmatch_file,log=None)

In [4]:
xmatch.field_index

field_id,ra,dec,quadrant,quadrant_id,gaia_source_id,ROME-FIELD-01_lsc-doma-1m0-05-fa15_ip_index,ROME-FIELD-01_lsc-doma-1m0-05-fa15_rp_index,ROME-FIELD-01_lsc-doma-1m0-05-fa15_gp_index,ROME-FIELD-01_lsc-domb-1m0-09-fa03_ip_index,ROME-FIELD-01_coj-domb-1m0-03-fa11_gp_index,ROME-FIELD-01_coj-domb-1m0-03-fa11_rp_index,ROME-FIELD-01_coj-domb-1m0-03-fa11_ip_index,ROME-FIELD-01_coj-doma-1m0-11-fa12_gp_index,ROME-FIELD-01_coj-doma-1m0-11-fa12_rp_index,ROME-FIELD-01_coj-doma-1m0-11-fa12_ip_index,ROME-FIELD-01_cpt-doma-1m0-10-fa16_gp_index,ROME-FIELD-01_cpt-doma-1m0-10-fa16_rp_index,ROME-FIELD-01_cpt-doma-1m0-10-fa16_ip_index,ROME-FIELD-01_cpt-domb-1m0-13-fa14_gp_index,ROME-FIELD-01_cpt-domb-1m0-13-fa14_rp_index,ROME-FIELD-01_cpt-domb-1m0-13-fa14_ip_index,ROME-FIELD-01_cpt-domc-1m0-12-fa06_rp_index,ROME-FIELD-01_cpt-domc-1m0-12-fa06_ip_index,ROME-FIELD-01_cpt-domc-1m0-12-fa06_gp_index
int64,float64,float64,int64,int64,str19,int64,int64,int64,int64,int64,int64,int64,int64,int64,int64,int64,int64,int64,int64,int64,int64,int64,int64,int64
1,267.61861696019145,-29.829605383706895,4,1,4056436121079692032,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,267.70228408545813,-29.83032824102953,4,2,4056435567040767488,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,267.9873108673885,-29.829734325692858,3,1,4056444539267431040,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,267.9585073984874,-29.83002538112054,3,2,4056444878525743616,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,267.9623466389135,-29.82994179424344,3,3,4056444917224603904,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,267.943683356543,-29.830113202355186,3,4,4056444951538372736,6,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
7,267.90449275089594,-29.830465810573223,3,5,4056445084668331520,7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
8,267.9504950018423,-29.830247462548577,3,6,4056444951538344320,8,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
9,267.9778110411362,-29.83012645385565,3,7,4056444156948804608,9,9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
10,267.7950771349625,-29.830849947501875,4,3,4056447116201138176,10,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


First we extract the columns which provide the star indices in the reference images in each of the different datasets, for each filter. 

In [26]:
g_cols = [col for col in xmatch.field_index.columns if '_gp_' in col]
r_cols = [col for col in xmatch.field_index.columns if '_rp_' in col]
i_cols = [col for col in xmatch.field_index.columns if '_ip_' in col]
g_cols

['ROME-FIELD-01_lsc-doma-1m0-05-fa15_gp_index',
 'ROME-FIELD-01_coj-domb-1m0-03-fa11_gp_index',
 'ROME-FIELD-01_coj-doma-1m0-11-fa12_gp_index',
 'ROME-FIELD-01_cpt-doma-1m0-10-fa16_gp_index',
 'ROME-FIELD-01_cpt-domb-1m0-13-fa14_gp_index',
 'ROME-FIELD-01_cpt-domc-1m0-12-fa06_gp_index']
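
As a quick check on the column selection, the sketch below groups a few of the column names shown above by filter. It assumes every index column follows the `<dataset>_<filter>_index` pattern seen in the table; `group_index_columns` is a hypothetical helper, not part of pyDANDIA, and it matches on the suffix rather than on a substring.

```python
# Column names copied from the field index shown above (a subset, for illustration).
columns = [
    'field_id', 'ra', 'dec',
    'ROME-FIELD-01_lsc-doma-1m0-05-fa15_ip_index',
    'ROME-FIELD-01_lsc-doma-1m0-05-fa15_rp_index',
    'ROME-FIELD-01_lsc-doma-1m0-05-fa15_gp_index',
    'ROME-FIELD-01_lsc-domb-1m0-09-fa03_ip_index',
]

def group_index_columns(colnames, filters=('gp', 'rp', 'ip')):
    """Hypothetical helper: map each filter to its '<dataset>_<filter>_index' columns."""
    return {f: [c for c in colnames if c.endswith('_' + f + '_index')]
            for f in filters}

cols_by_filter = group_index_columns(columns)
for f, cols in cols_by_filter.items():
    print(f, len(cols))
```

Matching on the suffix avoids accidentally picking up any other column whose name happens to contain `_gp_`, although no such column appears in this table.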

Now we sum the star indices across all columns for each filter. The result is non-zero for any star that was measured in at least one reference image for that filter; the exact value of the sum doesn't matter.

In [36]:
g_stars = np.zeros(len(xmatch.field_index))
r_stars = np.zeros(len(xmatch.field_index))
i_stars = np.zeros(len(xmatch.field_index))
for col in g_cols:
    g_stars += xmatch.field_index[col].data

for col in r_cols:
    r_stars += xmatch.field_index[col].data
    
for col in i_cols:
    i_stars += xmatch.field_index[col].data

i_stars

array([1., 2., 3., ..., 0., 0., 0.])
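
The summed index is only used as a detected/not-detected flag, so the same information could be held as a boolean array. The sketch below illustrates this on toy data (the array `g_index` is made up for the example). It assumes indices are never negative, which is consistent with the rows displayed above but is not verified here.

```python
import numpy as np

# Toy stand-in for the g-band index columns: rows are stars, columns are datasets.
# A value of 0 means the star has no entry in that dataset's reference image.
g_index = np.array([
    [1, 0, 0],
    [0, 0, 0],
    [0, 5, 7],
    [2, 2, 0],
])

summed = g_index.sum(axis=1)            # approach used in this notebook
detected = (g_index > 0).any(axis=1)    # boolean equivalent, True if any index is set

print(summed)
print(detected)
```

For non-negative indices, `summed > 0` and `detected` select the same stars.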

Using the summed star indices for each filter, we can count the number of stars that were detected in all 3 passbands.

In [40]:
# Count the number of stars with non-zero summed indices in all three filters, indicating that the star was 
# detected in at least one reference image in each of g, r and i
#mask = (xmatch.field_index[ref_col_g] > 0) and (xmatch.field_index[ref_col_r] > 0) and (xmatch.field_index[ref_col_i] > 0)
gdx = np.where(g_stars > 0)[0]
rdx = np.where(r_stars > 0)[0]
idx = np.where(i_stars > 0)[0]

mask = list((set(gdx).intersection(set(rdx))).intersection(set(idx)))
n3bands = len(mask)
n3percent = (n3bands/len(xmatch.field_index))*100.0
print('Number of stars measured in all three passbands=' + str(n3bands) + ', percentage of total=' + str(round(n3percent,1)) + '%')

Number of stars measured in all three passbands=149697, percentage of total=37.1%
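
The same count can be sketched with elementwise boolean masks instead of set intersections. Note that Python's `and` cannot combine arrays elementwise; NumPy uses `&` for that. The arrays below are toy values invented for illustration, not the ROME data.

```python
import numpy as np

# Toy summed indices for five stars (0 means not detected in that filter).
g_stars = np.array([0., 3., 0., 8., 0.])
r_stars = np.array([0., 3., 4., 8., 0.])
i_stars = np.array([1., 3., 4., 8., 0.])

# Elementwise AND of the three detection masks.
in_all = (g_stars > 0) & (r_stars > 0) & (i_stars > 0)

n3bands = int(in_all.sum())
n3percent = 100.0 * n3bands / len(in_all)
print(f'{n3bands} of {len(in_all)} stars in all three passbands ({n3percent:.1f}%)')
```

The mask keeps one entry per star, so it can also be used directly to select rows from the table.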


Counting the stars detected in exactly two passbands requires us to handle the three pairs (g,i), (r,g) and (i,r) separately.

In [41]:
# Each tuple lists the two filters where the star is measured, followed by the filter where it is NOT measured.
filter_sets = {
    'g,i': (g_stars, i_stars, r_stars),
    'r,g': (r_stars, g_stars, i_stars),
    'i,r': (i_stars, r_stars, g_stars)
}

# Count the stars measured in exactly the two selected bands, accumulating the total count:
n2bands = 0
for label, fset in filter_sets.items():
    idx1 = np.where(fset[0] > 0)[0]
    idx2 = np.where(fset[1] > 0)[0]
    idx3 = np.where(fset[2] == 0)[0]
    mask = list((set(idx1).intersection(set(idx2))).intersection(set(idx3)))
    nbands = len(mask)
    n2bands += nbands
    npercent = (nbands/len(xmatch.field_index))*100.0
    print('Number of stars measured in passbands ' + label + '=' + str(nbands) + ', percentage of total=' + str(round(npercent,1)) + '%')

n2percent = (n2bands/len(xmatch.field_index))*100.0
print('Total number of stars measured in two passbands=' + str(n2bands) + ', percentage of total=' + str(round(n2percent,1)) + '%')

Number of stars measured in passbands g,i=10927, percentage of total=2.7%
Number of stars measured in passbands r,g=3938, percentage of total=1.0%
Number of stars measured in passbands i,r=124686, percentage of total=30.9%
Total number of stars measured in two passbands=139551, percentage of total=34.6%
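
If only the totals per number of filters are needed, rather than the breakdown by pair, one possible shortcut is to count the filters in which each star is detected and histogram that count. This is a sketch on invented toy arrays, not the method used in this notebook.

```python
import numpy as np

# Toy summed indices for six stars (0 means not detected in that filter).
g_stars = np.array([0., 3., 0., 8., 0., 2.])
r_stars = np.array([0., 3., 4., 8., 0., 0.])
i_stars = np.array([1., 3., 4., 8., 0., 6.])

# One row per filter, one column per star; True where the star is detected.
detected = np.vstack([g_stars > 0, r_stars > 0, i_stars > 0])

# Number of filters with a detection, per star (values 0 to 3).
n_filters = detected.sum(axis=0)

# counts[k] is the number of stars detected in exactly k filters.
counts = np.bincount(n_filters, minlength=4)
for k, n in enumerate(counts):
    print(f'{k} filter(s): {n} stars')
```

Here `counts[2]` plays the role of `n2bands`, and `counts[0]` would reveal any table entries with no detection in any filter.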


Lastly, we count the number of stars with measurements in only one passband.

In [42]:
# Each tuple lists the filter where the star is measured, followed by the two filters where it is NOT measured.
filter_sets = {
    'g only': (g_stars, i_stars, r_stars),
    'r only': (r_stars, g_stars, i_stars),
    'i only': (i_stars, r_stars, g_stars)
}

# Count the stars measured in the first band only, accumulating the total count:
n1band = 0
for label, fset in filter_sets.items():
    idx1 = np.where(fset[0] > 0)[0]
    idx2 = np.where(fset[1] == 0)[0]
    idx3 = np.where(fset[2] == 0)[0]
    mask = list((set(idx1).intersection(set(idx2))).intersection(set(idx3)))
    nbands = len(mask)
    n1band += nbands
    npercent = (nbands/len(xmatch.field_index))*100.0
    print('Number of stars measured in ' + label + '=' + str(nbands) + ', percentage of total=' + str(round(npercent,1)) + '%')

n1percent = (n1band/len(xmatch.field_index))*100.0
print('Total number of stars measured in one passband only=' + str(n1band) + ', percentage of total=' + str(round(n1percent,1)) + '%')

Number of stars measured in g only=6731, percentage of total=1.7%
Number of stars measured in r only=10888, percentage of total=2.7%
Number of stars measured in i only=96499, percentage of total=23.9%
Total number of stars measured in one passband only=114118, percentage of total=28.3%
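
As a consistency check, the counts printed above can be added up by hand. The length of the field index is not printed in this notebook, so the sketch below assumes that every entry is detected in at least one filter and uses the sum of the three categories as the total. That assumption is consistent with the reported percentages, which add up to 100.0%.

```python
# Counts copied from the outputs above.
n3bands = 149697
n2 = {'g,i': 10927, 'r,g': 3938, 'i,r': 124686}
n1 = {'g only': 6731, 'r only': 10888, 'i only': 96499}

n2bands = sum(n2.values())
n1band = sum(n1.values())

# Assumed total: stars detected in at least one filter.
n_any = n3bands + n2bands + n1band

for label, n in [('three', n3bands), ('two', n2bands), ('one', n1band)]:
    print(f'{label} passband(s): {n} ({100.0 * n / n_any:.1f}%)')
print('total:', n_any)
```

The per-pair and per-filter counts reproduce the printed totals, and the percentages recomputed against this assumed total match the rounded values reported above.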
