In [1]:
import pandas as pd
import numpy as np
import time
from haversine import haversine, Unit

In [2]:
# df contains the cleaned original catalog (1177 objects)
df = pd.read_csv("/mnt/c/Users/joerg/OneDrive/Dokumente/UNI/Bachelor of Science/Bachelorarbeit/Jupyter Notebook/blazar_clean.csv")
# df_centers contains the coordinates of the 19 neutrino hotspots obtained for L_min = 4.0
df_centers = pd.read_csv("/mnt/c/Users/joerg/OneDrive/Dokumente/UNI/Bachelor of Science/Bachelorarbeit/Jupyter Notebook/neutrino_hotspots_40.csv").to_numpy()

## Matching algorithm
Loops over all neutrino hotspots, finds every blazar closer than r (in degrees), prints the index of each matched hotspot, and collects the (RA, Dec) coordinates of the corresponding blazars. Duplicates are not filtered out yet.

In [3]:
t0 = time.time()
r = 0.55  # matching radius in degrees
correlated = []
catalog = np.array(df)
# haversine expects (latitude, longitude), so every point is passed as (Dec, RA)
for i in range(df_centers.shape[0]):
    for j in range(catalog.shape[0]):
        if haversine([df_centers[i,2], df_centers[i,1]], [catalog[j,4], catalog[j,3]], unit=Unit.DEGREES, normalize=True) < r:
            # store the matched blazar as (RA, Dec) and print the hotspot index
            correlated.append(np.array((catalog[j,3],catalog[j,4])))
            print(i)
t1 = time.time()
print(f"\n{len(correlated)} Matches were found")
print(f"\nIt took {t1-t0:.3f}s to run the matching algorithm")
correlated

0
1
4
5
7
9
11
12
13

9 Matches were found

It took 0.229s to run the matching algorithm


[array([59.430375, -7.854058]),
 array([300.850458, -32.862528]),
 array([ 25.792208, -32.015747]),
 array([340.7865  ,  -6.150719]),
 array([274.895792, -63.763389]),
 array([ 44.0535  , -21.624747]),
 array([ 59.890292, -26.258689]),
 array([346.222042, -36.416669]),
 array([ 97.747958, -24.112828])]
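The duplicate filtering mentioned above is not implemented in this notebook. A minimal sketch of one way to do it, assuming `correlated` is a list of (RA, Dec) arrays as produced by the loop: the first two entries are taken from the output above, and the third is a hypothetical duplicate added only for illustration.

```python
import numpy as np

# Two real entries from the output above plus one hypothetical duplicate.
correlated = [
    np.array([59.430375, -7.854058]),
    np.array([300.850458, -32.862528]),
    np.array([59.430375, -7.854058]),  # hypothetical repeat of the first blazar
]

# Stack into an (N, 2) array and keep each distinct (RA, Dec) row once.
# Note: np.unique sorts the rows, so the original match order is not preserved.
unique_blazars = np.unique(np.vstack(correlated), axis=0)

print(f"{len(correlated)} matches, {len(unique_blazars)} unique blazars")
```

Exact row comparison works here because the coordinates are copied unchanged from the catalog. If they were recomputed, a tolerance-based comparison would be safer.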

The matching algorithm is expected to return 10 matches for r = 0.55 and L_min = 4.0 (see Buson). 9 of the 10 matches are already identified here. The missing match is caused by a small difference in how the clustering step determines the hotspot centers, which shifts the center of certain hotspots. Compare the two angular distances (in degrees) below: with the center used here the pair lies just outside r, while with Buson's center it lies inside.

In [4]:
# Angular distance (degrees) between the missing hotspot/blazar pair, using the hotspot center from the clustering method applied here.
haversine([df_centers[10,2], df_centers[10,1]], [catalog[920,4], catalog[920,3]], unit=Unit.DEGREES, normalize=True)

0.569873307651041

In [5]:
# Angular distance (degrees) between the missing hotspot/blazar pair, using the hotspot center given by Buson.
haversine([-22.27, 309.38], [-21.776858, 309.213208], unit=Unit.DEGREES, normalize=True)

0.5168138467295209
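As a cross-check that does not depend on the `haversine` package, the same separation can be computed directly from the haversine formula. This is a minimal sketch: the helper name `angular_separation_deg` is hypothetical, and the coordinates are the ones used in the cell above.

```python
import math

def angular_separation_deg(dec1, ra1, dec2, ra2):
    """Great-circle separation in degrees between two (Dec, RA) points given in degrees."""
    dec1, ra1, dec2, ra2 = map(math.radians, (dec1, ra1, dec2, ra2))
    # Haversine formula with Dec as latitude and RA as longitude.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h)))

r = 0.55  # matching radius in degrees
sep = angular_separation_deg(-22.27, 309.38, -21.776858, 309.213208)
print(f"separation = {sep:.4f} deg, inside r: {sep < r}")
```

The result should agree with the package value above to within floating-point precision, since both evaluate the same formula on a unit sphere.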

It turns out that Buson's clustering method does not average the positions within a cluster to obtain its center. Instead, it appears to take the neutrino source with the highest -log(p-value) as the center of each cluster and to disregard all other points in that cluster. Listed below are the 4 neutrino sources that make up the neutrino hotspot that is still missing from the matching above (compare with Table 2 in the Buson paper).

In [6]:
df_neutrinos = pd.read_csv("/mnt/c/Users/joerg/OneDrive/Dokumente/UNI/Bachelor of Science/Bachelorarbeit/Jupyter Notebook/all_neutrinos_L40.csv")
df_neutrinos[81:85]

Unnamed: 0.1,Unnamed: 0,index,RA,DEC,-log[p-value],l,b
81,81,577917,309.287109,-22.346588,4.347145,22.621176,-32.664999
82,82,578942,309.375,-22.427273,4.19052,22.561427,-32.767893
83,83,576894,309.375,-22.26595,4.664159,22.743021,-32.715979
84,84,577918,309.462891,-22.346588,4.497585,22.683378,-32.818931
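The two center definitions can be compared directly on these four sources. This is a minimal sketch, assuming that "averaging" means an unweighted arithmetic mean of RA and DEC. The clustering code actually used for `neutrino_hotspots_40.csv` is not shown here, so the mean computed below need not reproduce its center exactly. The values are copied from the table above.

```python
import pandas as pd

# The four neutrino sources of the missing hotspot, copied from the table above.
cluster = pd.DataFrame(
    {
        "RA": [309.287109, 309.375, 309.375, 309.462891],
        "DEC": [-22.346588, -22.427273, -22.26595, -22.346588],
        "-log[p-value]": [4.347145, 4.19052, 4.664159, 4.497585],
    },
    index=[81, 82, 83, 84],
)

# Definition 1 (assumed): unweighted mean position of all cluster members.
mean_center = cluster[["RA", "DEC"]].mean()

# Definition 2 (Buson-like): position of the member with the highest -log(p-value).
peak_row = cluster["-log[p-value]"].idxmax()
peak_center = cluster.loc[peak_row, ["RA", "DEC"]]

print("mean center:", mean_center.to_dict())
print("peak center (row", peak_row, "):", peak_center.to_dict())
```

The peak-based center is row 83, whose position is close to the (309.38, -22.27) used for the Buson distance in cell In [5].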


Applying the same clustering rules as Buson yields exactly the same 10 matches.

**Now the question: Which clustering method is better / more useful in this correlation study?**

In [7]:
buson_hotspots = pd.read_csv("/mnt/c/Users/joerg/OneDrive/Dokumente/UNI/Bachelor of Science/Bachelorarbeit/Jupyter Notebook/buson_neutrino_hotspots_40.csv").to_numpy()
r = 0.55
correlated = []
catalog = np.array(df)
for i in range(buson_hotspots.shape[0]):
    for j in range(catalog.shape[0]):
        if haversine([buson_hotspots[i,2], buson_hotspots[i,1]], [catalog[j,4], catalog[j,3]], unit=Unit.DEGREES, normalize=True) < r:
            correlated.append(np.array((catalog[j,3],catalog[j,4])))
print(f"\n{len(correlated)} Matches were found")
correlated


10 Matches were found


[array([300.850458, -32.862528]),
 array([59.430375, -7.854058]),
 array([ 25.792208, -32.015747]),
 array([340.7865  ,  -6.150719]),
 array([ 44.0535  , -21.624747]),
 array([274.895792, -63.763389]),
 array([346.222042, -36.416669]),
 array([ 59.890292, -26.258689]),
 array([ 97.747958, -24.112828]),
 array([309.213208, -21.776858])]