# Linking Filter Validation

In [1]:
import pandas as pd
import numpy as np
from sorcha.modules.PPLinkingFilter import PPLinkingFilter

This function aims to mimic the effects of the Solar System Processing pipeline in linking objects. More information can be found [here](http://lsst-sssc.github.io/software.html). If we use the SSP defaults, for an object to be linked, it must have:
* At least **2** observations in a night to constitute a valid tracklet.
* These observations must have an angular separation of at least **0.5 arcseconds** in order to be recognised as separate.
* Consecutive observations in a tracklet must be no more than 90 minutes (**0.0625 days**) apart.
* At least **3** tracklets must be observed to form a valid track.
* These tracklets must all be observed in less than **15** days.

We also expect **95%** of objects to be linked. For now, we will set this parameter to 100% in order to test the others.

These six parameters can be changed in the config file and are found in the [LINKINGFILTER] section.
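The criteria above can be sketched in a few lines of NumPy. This is a simplified illustration, not the sorcha implementation: the helper names `tracklet_nights` and `is_linkable` are hypothetical, the angular-separation check is left out, and the rule used to assign observations to nights (shifting the MJD by the night start time before taking the integer part) is an assumption.

```python
import numpy as np


def tracklet_nights(mjd, max_dt=0.0625, night_start_utc=17.0):
    """Return the night numbers that contain a candidate tracklet.

    Illustration only: a tracklet here is any pair of consecutive
    observations on the same night that are at most `max_dt` days apart.
    The angular-separation criterion is ignored.
    """
    mjd = np.sort(np.asarray(mjd, dtype=float))
    # Assumption: nights are labelled by the integer part of the MJD
    # after shifting by the night start time (given in UTC hours).
    nights = np.floor(mjd - night_start_utc / 24.0).astype(int)
    good = []
    for night in np.unique(nights):
        times = mjd[nights == night]
        # np.diff is empty for a single observation, so np.any is False.
        if np.any(np.diff(times) <= max_dt):
            good.append(int(night))
    return good


def is_linkable(mjd, min_tracklets=3, window=15, max_dt=0.0625, night_start_utc=17.0):
    """True if `min_tracklets` tracklet nights span less than `window` days."""
    nights = tracklet_nights(mjd, max_dt=max_dt, night_start_utc=night_start_utc)
    for i in range(len(nights) - min_tracklets + 1):
        if nights[i + min_tracklets - 1] - nights[i] < window:
            return True
    return False


# Three pairs of observations on three nights within 15 days.
example_times = [60000.03, 60000.06, 60005.03, 60005.06, 60008.03, 60008.06]
print(is_linkable(example_times))
```

Dropping one observation from any pair, or widening a pair beyond 0.0625 days, leaves only two tracklet nights and the sketch returns `False`.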

In [2]:
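min_observations = 2  # observations per night needed to form a tracklet
min_angular_separation = 0.5  # arcseconds
max_time_separation = 0.0625  # days (90 minutes)
min_tracklets = 3  # tracklets needed to form a track
min_tracklet_window = 15  # days within which the tracklets must fall
detection_efficiency = 1  # fraction of linkable objects that are linked
night_start_utc = 17.0  # start of the night in UTC hours, passed to the filter alongside the six parameters above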

Let's create an object that should definitely be linked according to these parameters.

In [3]:
obj_id = ["pretend_object"] * 6
field_id = np.arange(1, 7)
times = [60000.03, 60000.06, 60005.03, 60005.06, 60008.03, 60008.06]
ra = [142, 142.1, 143, 143.1, 144, 144.1]
dec = [8, 8.1, 9, 9.1, 10, 10.1]

In [4]:
observations = pd.DataFrame(
    {
        "ObjID": obj_id,
        "FieldID": field_id,
        "fieldMJD_TAI": times,
        "RA_deg": ra,
        "Dec_deg": dec
    }
)

In [5]:
observations

Unnamed: 0,ObjID,FieldID,fieldMJD_TAI,RA_deg,Dec_deg
0,pretend_object,1,60000.03,142.0,8.0
1,pretend_object,2,60000.06,142.1,8.1
2,pretend_object,3,60005.03,143.0,9.0
3,pretend_object,4,60005.06,143.1,9.1
4,pretend_object,5,60008.03,144.0,10.0
5,pretend_object,6,60008.06,144.1,10.1


Now let's run the linking filter. As this object should be linked, we should receive all six observations back, with two extra columns: `object_linked` and `date_linked_MJD`.

In [6]:
linked_observations = PPLinkingFilter(observations, detection_efficiency, min_observations, min_tracklets, min_tracklet_window, min_angular_separation, max_time_separation, night_start_utc)

In [7]:
linked_observations

Unnamed: 0,ObjID,FieldID,fieldMJD_TAI,RA_deg,Dec_deg,object_linked,date_linked_MJD
0,pretend_object,1,60000.03,142.0,8.0,True,60007.0
1,pretend_object,2,60000.06,142.1,8.1,True,60007.0
2,pretend_object,3,60005.03,143.0,9.0,True,60007.0
3,pretend_object,4,60005.06,143.1,9.1,True,60007.0
4,pretend_object,5,60008.03,144.0,10.0,True,60007.0
5,pretend_object,6,60008.06,144.1,10.1,True,60007.0


Success! The object was linked. Now let's play with this dataframe a little. First, let's remove the first observation, so that we only have two complete tracklets.

In [8]:
observations_two_tracklets = observations.iloc[1:].copy()

In [9]:
observations_two_tracklets

Unnamed: 0,ObjID,FieldID,fieldMJD_TAI,RA_deg,Dec_deg,object_linked
1,pretend_object,2,60000.06,142.1,8.1,True
2,pretend_object,3,60005.03,143.0,9.0,True
3,pretend_object,4,60005.06,143.1,9.1,True
4,pretend_object,5,60008.03,144.0,10.0,True
5,pretend_object,6,60008.06,144.1,10.1,True


In [10]:
unlinked_observations = PPLinkingFilter(observations_two_tracklets, detection_efficiency, min_observations, min_tracklets, min_tracklet_window, min_angular_separation, max_time_separation, night_start_utc)

In [11]:
unlinked_observations

Unnamed: 0,ObjID,FieldID,fieldMJD_TAI,RA_deg,Dec_deg,object_linked,date_linked_MJD


As expected, we no longer link the object. Now let's try putting the last two observations outside of the 15-day window.

In [12]:
observations_large_window = observations.copy()
observations_large_window['fieldMJD_TAI'] = [60000.03, 60000.06, 60005.03, 60005.06, 60016.03, 60016.06]
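As a quick sanity check on the arithmetic, the total time span of the observations can be compared with the 15-day window. This is only a rough check on the raw MJD values; it does not reproduce how the filter itself measures the window.

```python
window = 15  # days, the same value as min_tracklet_window

original_times = [60000.03, 60000.06, 60005.03, 60005.06, 60008.03, 60008.06]
shifted_times = [60000.03, 60000.06, 60005.03, 60005.06, 60016.03, 60016.06]

# Span between the first and last observation, in days.
span_original = max(original_times) - min(original_times)
span_shifted = max(shifted_times) - min(shifted_times)

print(f"original span: {span_original:.2f} days")  # about 8.03 days
print(f"shifted span:  {span_shifted:.2f} days")   # about 16.03 days
```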

In [13]:
unlinked_observations = PPLinkingFilter(observations_large_window, detection_efficiency, min_observations, min_tracklets, min_tracklet_window, min_angular_separation, max_time_separation, night_start_utc)

In [14]:
unlinked_observations

Unnamed: 0,ObjID,FieldID,fieldMJD_TAI,RA_deg,Dec_deg,object_linked,date_linked_MJD


Once again, we no longer link the object. What if we move the first two observations much closer to each other on the sky so that they no longer form a valid tracklet?

In [15]:
observations_small_sep = observations.copy()
observations_small_sep["RA_deg"] = [142, 142.00001, 143, 143.1, 144, 144.1]
observations_small_sep["Dec_deg"] = [8, 8.00001, 9, 9.1, 10, 10.1]
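To confirm that the modified positions really do fall below the 0.5 arcsecond threshold, the separation can be computed directly. The sketch below uses the haversine formula; the helper name `angular_separation_arcsec` is hypothetical and this is not necessarily how the filter computes separations internally.

```python
import numpy as np


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation of two sky positions (inputs in degrees)."""
    ra1, dec1, ra2, dec2 = np.radians([ra1, dec1, ra2, dec2])
    sin_half_ddec = np.sin((dec2 - dec1) / 2.0)
    sin_half_dra = np.sin((ra2 - ra1) / 2.0)
    # Haversine formula, which stays numerically stable for small angles.
    a = sin_half_ddec**2 + np.cos(dec1) * np.cos(dec2) * sin_half_dra**2
    return float(np.degrees(2.0 * np.arcsin(np.sqrt(a))) * 3600.0)


# First pair of observations before and after the modification.
sep_original = angular_separation_arcsec(142, 8, 142.1, 8.1)
sep_small = angular_separation_arcsec(142, 8, 142.00001, 8.00001)

print(f"original pair: {sep_original:.1f} arcsec")
print(f"modified pair: {sep_small:.3f} arcsec")
```

The original pair is separated by several hundred arcseconds, while the modified pair is roughly 0.05 arcseconds apart, well under the 0.5 arcsecond minimum.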

In [16]:
unlinked_observations = PPLinkingFilter(observations_small_sep, detection_efficiency, min_observations, min_tracklets, min_tracklet_window, min_angular_separation, max_time_separation, night_start_utc)

In [17]:
unlinked_observations

Unnamed: 0,ObjID,FieldID,fieldMJD_TAI,RA_deg,Dec_deg,object_linked,date_linked_MJD


And the object is no longer linked. Next, let's move the first two observations much further apart in time so that they once again no longer form a valid tracklet.

In [18]:
observations_large_time = observations.copy()
observations_large_time["fieldMJD_TAI"] = [60000.03, 60000.10, 60005.03, 60005.06, 60008.03, 60008.06]
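The arithmetic behind this case is easy to verify: converting the gaps to minutes shows that the original pair sits inside the 90-minute limit and the modified pair does not.

```python
import numpy as np

max_time_separation = 0.0625  # days

original_pair = np.array([60000.03, 60000.06])
modified_pair = np.array([60000.03, 60000.10])

minutes_per_day = 24 * 60
limit_min = max_time_separation * minutes_per_day
gap_original_min = np.diff(original_pair)[0] * minutes_per_day
gap_modified_min = np.diff(modified_pair)[0] * minutes_per_day

print(f"limit:         {limit_min:.1f} min")
print(f"original pair: {gap_original_min:.1f} min")  # about 43.2 min
print(f"modified pair: {gap_modified_min:.1f} min")  # about 100.8 min
```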

In [19]:
unlinked_observations = PPLinkingFilter(observations_large_time, detection_efficiency, min_observations, min_tracklets, min_tracklet_window, min_angular_separation, max_time_separation, night_start_utc)

In [20]:
unlinked_observations

Unnamed: 0,ObjID,FieldID,fieldMJD_TAI,RA_deg,Dec_deg,object_linked,date_linked_MJD


And as expected, we no longer link the object.

Finally, let's check that the detection efficiency works as expected. Let's set it to 0.95.

In [21]:
detection_efficiency = 0.95

Now let's make a dataframe of the same linked object repeated 10000 times.

In [22]:
objs = [["pretend_object_" + str(a)] * 6 for a in range(0, 10000)]
obj_id_long = [item for sublist in objs for item in sublist]
field_id_long = list(np.arange(1, 7)) * 10000
times_long = [60000.03, 60000.06, 60005.03, 60005.06, 60008.03, 60008.06] * 10000
ra_long = [142, 142.1, 143, 143.1, 144, 144.1] * 10000
dec_long = [8, 8.1, 9, 9.1, 10, 10.1] * 10000

In [23]:
observations_long = pd.DataFrame(
    {
        "ObjID": obj_id_long,
        "FieldID": field_id_long,
        "fieldMJD_TAI": times_long,
        "RA_deg": ra_long,
        "Dec_deg": dec_long
    }
)

In [24]:
observations_long

Unnamed: 0,ObjID,FieldID,fieldMJD_TAI,RA_deg,Dec_deg
0,pretend_object_0,1,60000.03,142.0,8.0
1,pretend_object_0,2,60000.06,142.1,8.1
2,pretend_object_0,3,60005.03,143.0,9.0
3,pretend_object_0,4,60005.06,143.1,9.1
4,pretend_object_0,5,60008.03,144.0,10.0
...,...,...,...,...,...
59995,pretend_object_9999,2,60000.06,142.1,8.1
59996,pretend_object_9999,3,60005.03,143.0,9.0
59997,pretend_object_9999,4,60005.06,143.1,9.1
59998,pretend_object_9999,5,60008.03,144.0,10.0


If detection efficiency were perfect, all of these objects would be linked. However, it is not. We have set the detection efficiency to 0.95, so we should expect the linking filter to return roughly 95% of these objects. Let's find out.
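How much scatter around 95% counts as "roughly"? A minimal sketch, assuming each object is kept independently with probability equal to the detection efficiency (an assumption about the filter's behaviour, not a statement of its implementation), gives the expected spread from the binomial distribution and a quick simulation to compare against.

```python
import numpy as np

n_objects = 10000
efficiency = 0.95

# Standard deviation of the linked fraction for independent draws.
sigma = np.sqrt(efficiency * (1 - efficiency) / n_objects)

# Simulate one realisation: keep each object with probability `efficiency`.
rng = np.random.default_rng(42)
kept = rng.random(n_objects) < efficiency
fraction = kept.mean()

print(f"expected fraction: {efficiency:.3f} +/- {sigma:.4f}")
print(f"simulated fraction: {fraction:.4f}")
```

Under this assumption the linked fraction for 10000 objects should typically land within a few tenths of a percent of 0.95.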

In [25]:
long_linked_observations = PPLinkingFilter(observations_long, detection_efficiency, min_observations, min_tracklets, min_tracklet_window, min_angular_separation, max_time_separation, night_start_utc)

In [26]:
long_linked_observations

Unnamed: 0,ObjID,FieldID,fieldMJD_TAI,RA_deg,Dec_deg,object_linked,date_linked_MJD
0,pretend_object_0,1,60000.03,142.0,8.0,True,60007.0
1,pretend_object_1624,1,60000.03,142.0,8.0,True,60007.0
2,pretend_object_5206,1,60000.03,142.0,8.0,True,60007.0
3,pretend_object_5205,1,60000.03,142.0,8.0,True,60007.0
4,pretend_object_1625,1,60000.03,142.0,8.0,True,60007.0
...,...,...,...,...,...,...,...
59995,pretend_object_5720,6,60008.06,144.1,10.1,True,60007.0
59996,pretend_object_5721,6,60008.06,144.1,10.1,True,60007.0
59997,pretend_object_5722,6,60008.06,144.1,10.1,True,60007.0
59998,pretend_object_5708,6,60008.06,144.1,10.1,True,60007.0


In [27]:
len(long_linked_observations["ObjID"].unique())/10000

1.0

The returned fraction should sit close to the requested 0.95. The detection efficiency is applied stochastically, so some run-to-run variation is to be expected.