# WALLABY Search for duplicates

## Problem description

WALLABY sources must be assigned a unique name of the form WALLABY Jhhmmss±ddmmss that follows
the official IAU source naming guidelines. If a source is observed more than once, every subsequent detection of that source must be assigned the name that was created when the source was first detected.
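
As a minimal sketch of how such a name could be built from a position, assuming right ascension and declination are available in decimal degrees: the helper `wallaby_name` is hypothetical and not part of the WALLABY code base. The compact form without a space after `J` mirrors the detection names printed later in this notebook, and the coordinates are truncated rather than rounded, as the IAU recommendations specify.

```python
def wallaby_name(ra_deg, dec_deg):
    """Format a position as 'WALLABY Jhhmmss±ddmmss' (hypothetical helper)."""
    # Right ascension: degrees -> seconds of time (1 degree = 240 s), truncated
    ra_s = int((ra_deg % 360.0) * 240.0)
    hh, mm, ss = ra_s // 3600, (ra_s % 3600) // 60, ra_s % 60

    # Declination: degrees -> arcseconds, truncated; the sign is handled separately
    sign = "-" if dec_deg < 0 else "+"
    dec_as = int(abs(dec_deg) * 3600.0)
    dd, dm, ds = dec_as // 3600, (dec_as % 3600) // 60, dec_as % 60

    return f"WALLABY J{hh:02d}{mm:02d}{ss:02d}{sign}{dd:02d}{dm:02d}{ds:02d}"


# Illustrative position only, not read from the database
print(wallaby_name(252.7560, -60.80375))
```

Because the name is derived from the measured position, two detections of the same galaxy can easily differ by a second or two in either coordinate, which is why the name cannot simply be regenerated for each new detection.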

One way of ensuring name consistency would be to cross-match new sources against the existing
source database to determine whether a source was previously detected. This could be done by checking whether
any sources exist in the database within a certain radius of the measured source position. If
an older source is found, its name must be assigned to the new detection, replacing the name
the new detection was initially given.
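
The name reuse described above could be sketched as follows. This is only an illustration under stated assumptions: sources are represented as plain dictionaries with hypothetical `name`, `ra` and `dec` keys (degrees) rather than Django model instances, the positions are made up, and picking the nearest match within the radius is one possible choice that the text does not prescribe.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def resolve_name(new_source, known_sources, radius_deg):
    """Return the name of the nearest known source within radius_deg,
    or the new source's own name if nothing is close enough."""
    best_name, best_sep = None, radius_deg
    for known in known_sources:
        sep = angular_separation_deg(new_source["ra"], new_source["dec"],
                                     known["ra"], known["dec"])
        if sep < best_sep:
            best_name, best_sep = known["name"], sep
    return best_name if best_name is not None else new_source["name"]


# Made-up positions for illustration
known = [{"name": "WALLABY J165101-604813", "ra": 252.7560, "dec": -60.80375}]
nearby = {"name": "WALLABY J165100-604814", "ra": 252.7550, "dec": -60.80400}
distant = {"name": "WALLABY J165400-604814", "ra": 253.5000, "dec": -60.80400}

radius = 30.0 / 3600.0  # 30 arcsec expressed in degrees (illustrative)
print(resolve_name(nearby, known, radius))
print(resolve_name(distant, known, radius))
```

Working in sky coordinates rather than pixels keeps the comparison meaningful when the old and new detections come from different images.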

Problems can occur if a source is broken up into multiple components by the source finder, or
multiple sources are merged into a single detection. General position cross-matching within 2–3
times the beam size may therefore be desirable to flag such cases and manually resolve their
names. Specific flags could be assigned to detections to mark them as close pairs of galaxies or
components of a single galaxy.
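
One way such flags could be represented is sketched below. The flag names and the `initial_flag` helper are hypothetical; the only behaviour taken from the text is that ambiguous cases are marked for manual resolution, after which a person might set a flag such as "close pair" or "component".

```python
from enum import Enum


class MatchFlag(Enum):
    NEW_SOURCE = "new source"          # no earlier detection within the radius
    SINGLE_MATCH = "single match"      # exactly one candidate: its name can be reused
    NEEDS_REVIEW = "needs review"      # several candidates: resolve manually
    CLOSE_PAIR = "close pair"          # set by a person after inspection
    COMPONENT = "component"            # set by a person after inspection


def initial_flag(n_matches):
    """Automatic flag based only on how many earlier detections lie within the radius."""
    if n_matches == 0:
        return MatchFlag.NEW_SOURCE
    if n_matches == 1:
        return MatchFlag.SINGLE_MATCH
    return MatchFlag.NEEDS_REVIEW


for n in (0, 1, 4):
    print(n, initial_flag(n).value)
```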

In [None]:
# ------- RUN THIS CELL FIRST -------

# Import Python standard libraries and Django
import os
import sys
import json
import django
from datetime import datetime

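# Django setup
# NOTE: django.setup() requires the Django settings to be configured beforehand
# (for example through the DJANGO_SETTINGS_MODULE environment variable)
sys.path.append('src/')
django.setup()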

# Import Django models
from run.models import Run
from instance.models import Instance
from detection.models import Detection
from products.models import Products

from sources.models import Sources
from comments.models import Comments
from tag.models import Tag

## Example workflow

For the purposes of this example we will remove one existing detection (the last one in the list) and treat it as a "new source". We will then cross-match it against the remaining detections within some multiple of the beam size and collect the matches for flagging.

In [4]:
# Create a "new source" from the existing detections

detections = list(Detection.objects.all())
new_source = detections.pop()

In [7]:
print(f"Total detections: {Detection.objects.count()}")
print(f"Mock detections list length: {len(detections)}")

Total detections: 1774
Mock detections list length: 1773


In [13]:
# Get products for detection

new_source_products = Products.objects.get(detection=new_source)

In [20]:
# Get FITS header for beam size

b''.join(new_source_products.cube)[0:2880]

b"SIMPLE  =                    T                                                  BITPIX  =                  -32                                                  NAXIS   =                    3                                                  NAXIS1  =                   52                                                  NAXIS2  =                   57                                                  NAXIS3  =                   76                                                  CTYPE1  = 'RA---SIN'                                                            CRPIX1  =   -5.60000000000E+01                                                  CDELT1  =   -1.66666666667E-03                                                  CRVAL1  =    2.52984450000E+02                                                  CTYPE2  = 'DEC--SIN'                                                            CRPIX2  =    1.11700000000E+03                                                  CDELT2  =    1.66666666667E-03        
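
Rather than reading the raw bytes by eye, the header could be split into its 80-character cards and searched for a keyword. The parser below is a deliberately naive sketch: `parse_fits_cards` and `make_card` are hypothetical helpers, the example header is constructed by hand rather than read from the database, and `BMAJ` is only the keyword commonly used to record the beam major axis, which may or may not be present in these cubes.

```python
def parse_fits_cards(header_bytes):
    """Parse 'KEYWORD = value' cards from raw FITS header bytes into a dict of strings."""
    text = header_bytes.decode("ascii", errors="replace")
    cards = {}
    for i in range(0, len(text), 80):          # every FITS card is 80 characters long
        card = text[i:i + 80]
        keyword = card[:8].strip()
        if keyword == "END":                   # END marks the end of the header
            break
        if card[8:10] != "= ":
            continue                           # skip COMMENT, HISTORY and blank cards
        # Naive: drops the inline comment, but would also cut a '/' inside a quoted string
        value = card[10:].split("/", 1)[0].strip()
        cards[keyword] = value.strip("'").strip()
    return cards


def make_card(text):
    """Pad a card to 80 characters, as in a real FITS header (helper for this example)."""
    return text.ljust(80).encode("ascii")


example = (make_card("SIMPLE  =                    T")
           + make_card("CTYPE1  = 'RA---SIN'")
           + make_card("CDELT2  =    1.66666666667E-03")
           + make_card("END"))

header = parse_fits_cards(example)
print(header.get("CDELT2"))
print(header.get("BMAJ"))   # None when the keyword is absent
```

A header can span more than one 2880-byte block, so in practice the full byte string would need to be passed in rather than only the first block.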

In [35]:
# Get the search radius in pixels from the beam size
# NOTE: Placeholder beam size (in pixels), since the real value could not be found
# in the FITS header or in the data products of the detection

N = 3            # multiple of the beam size to search within
beam_size = 10   # placeholder value, in pixels
search_radius = N * beam_size
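
If the beam size were known as an angle, it could be converted to pixels using the pixel scale from the header. The sketch below assumes a 30 arcsec beam purely for illustration (that value is not taken from the data products) and uses the `CDELT2` value from the header printed earlier; `beam_size_pixels` is a hypothetical helper.

```python
def beam_size_pixels(beam_arcsec, cdelt_deg):
    """Convert an angular beam size to pixels, given the pixel scale in degrees per pixel."""
    pixel_arcsec = abs(cdelt_deg) * 3600.0   # pixel scale in arcsec per pixel
    return beam_arcsec / pixel_arcsec


cdelt2 = 1.66666666667e-03     # CDELT2 from the FITS header (degrees per pixel)
assumed_beam_arcsec = 30.0     # ASSUMPTION: illustrative value, not from the data

beam_px = beam_size_pixels(assumed_beam_arcsec, cdelt2)
search_radius_px = 3 * beam_px
print(f"beam: {beam_px:.2f} px, search radius: {search_radius_px:.2f} px")
```

With a 6 arcsec pixel, a 30 arcsec beam would span about 5 pixels, so the placeholder of 10 pixels used in this notebook would correspond to a larger search radius than 3 beams.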

In [36]:
import numpy as np
from numpy.linalg import norm

In [37]:
# Cross-match with existing detections
# NOTE: this compares pixel coordinates (x, y), which assumes that all
# detections share the same pixel grid as the new source

close_detections = []
source_coords = np.array([new_source.x, new_source.y])

for d in detections:
    detection_coords = np.array([d.x, d.y])
    # Euclidean distance in pixels between the new source and this detection
    dist = norm(source_coords - detection_coords)
    if dist < search_radius:
        close_detections.append(d)

In [38]:
print(close_detections)

[<Detection: WALLABY J165101-604813>, <Detection: WALLABY J165100-604813>, <Detection: WALLABY J165101-604809>, <Detection: WALLABY J165100-604814>]


## Manual resolution...

To determine which of these detections, if any, corresponds to the same source, we can manually inspect each detection in `close_detections` and compare it with `new_source`.
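
To make the manual inspection easier, the candidates could be listed from nearest to farthest. The sketch below uses a hypothetical `Candidate` tuple with made-up pixel positions in place of `Detection` objects; only the names are taken from the output above.

```python
import math
from collections import namedtuple

# Stand-in for a Detection object: only the fields needed here (hypothetical)
Candidate = namedtuple("Candidate", ["name", "x", "y"])


def rank_by_distance(target, candidates):
    """Return (distance, candidate) pairs sorted from nearest to farthest."""
    pairs = [(math.hypot(c.x - target.x, c.y - target.y), c) for c in candidates]
    return sorted(pairs, key=lambda pair: pair[0])


# Made-up pixel positions for illustration
target = Candidate("new source", 100.0, 100.0)
candidates = [
    Candidate("WALLABY J165101-604813", 103.0, 104.0),
    Candidate("WALLABY J165100-604814", 101.0, 100.0),
    Candidate("WALLABY J165101-604809", 100.0, 112.0),
]

ranked = rank_by_distance(target, candidates)
for dist, c in ranked:
    print(f"{dist:6.1f} px  {c.name}")
```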