<h3 style="background-color:orange;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">Introduction</h3>


<p style="text-align: center;"><span style='font-family: "Times New Roman", Times, serif; font-size: 24px;'>Greetings, fellow Kagglers,</span></p>
<p style="text-align: center;"><span style="font-size: 24px;"><span style="font-family: 'Times New Roman', Times, serif;">Most of us have met graph theory at some point; some know it in depth, and some have only a general idea of this interesting field of mathematics.</span></span></p>
<p style="text-align: center;"><span style="font-size: 24px;"><span style="font-family: 'Times New Roman', Times, serif;">Looking at the structure of the chemical compounds in the images, it is hard not to see a graph. A natural question follows: if the same formula is represented as a graph, can attributes extracted from that graph provide features that improve a model for this problem? For example, suppose that for an image of a chemical compound I know the maximum and minimum degree of the graph that represents the compound, its chromatic number, and the size of its largest clique or independent set. Can including such features in a model improve its performance or accuracy?</span></span></p>
<p style="text-align: center;"><span style="font-size: 24px;"><span style="font-family: 'Times New Roman', Times, serif;"><br></span></span></p>
<p style="text-align: center;"><span style="font-size: 24px;"><span style="font-family: 'Times New Roman', Times, serif;">Well, it is a hard question to answer, and even before we look into it, how can we create a graph out of an image at all? That&apos;s exactly what I was thinking about during the days I spent working on this notebook.</span></span></p>
<p style="text-align: center;"><span style='font-family: "Times New Roman", Times, serif; font-size: 24px;'>First, we need to detect the vertices, the points between which we will connect our edges. The following sections are dedicated to building a pipeline that gets us to that first step: knowing how many vertices the graph has and where they are located.</span></p>
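
To make the question above concrete, here is a minimal sketch of the kind of graph attributes mentioned. It assumes the vertices and edges have already been extracted, which this notebook does not do yet. The edge list is a made-up example (a six-membered ring with one substituent), and the helper names `degree_features` and `greedy_coloring_bound` are illustrative only. Greedy colouring gives an upper bound on the chromatic number, not necessarily its exact value.

```python
from collections import defaultdict

def degree_features(edges):
    """Return (min_degree, max_degree) of an undirected graph given as an edge list."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return min(degree.values()), max(degree.values())

def greedy_coloring_bound(edges):
    """Upper bound on the chromatic number obtained from a simple greedy colouring."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    colour = {}
    for node in sorted(neighbours):
        # Colours already taken by neighbours that were coloured earlier
        used = {colour[n] for n in neighbours[node] if n in colour}
        colour[node] = next(c for c in range(len(neighbours)) if c not in used)
    return max(colour.values()) + 1

# Hypothetical example: a six-membered ring with one substituent attached to vertex 0
ring_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6)]
min_deg, max_deg = degree_features(ring_edges)
colour_bound = greedy_coloring_bound(ring_edges)
print(min_deg, max_deg, colour_bound)
```

Numbers like these could be appended to an image's feature vector once the graph is available.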

<h2><span style="font-family: 'Times New Roman', Times, serif;">Note: for those of you not familiar with graph theory, an introduction to the basics can be found in&nbsp;</span><a href="https://www.kaggle.com/thomaskonstantin/an-introduction-to-graph-theory"><span style="font-family: 'Times New Roman', Times, serif;">this notebook</span></a></h2>

<h3 style="background-color:orange;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">Libraries And Utilities</h3>


In [None]:
import numpy as np
import pandas as pd
from collections import Counter, defaultdict
import cv2, os
import skimage.io as io
from tqdm.auto import tqdm
tqdm.pandas()
import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns
from scipy.spatial import distance_matrix
import itertools
plt.rc('figure',figsize=(18,11))
plt.rc('image',cmap='Blues')
sns.set_context('paper',font_scale=2)


<h3 style="background-color:orange;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">Loading The Data</h3>


In [None]:
labels = pd.read_csv('../input/bms-molecular-translation/train_labels.csv')
ss = pd.read_csv('../input/bms-molecular-translation/sample_submission.csv', index_col = 0)

labels['path'] = labels['image_id'].progress_apply(
    lambda x: "../input/bms-molecular-translation/train/{}/{}/{}/{}.png".format(
        x[0], x[1], x[2], x))
labels.head()

<h3 style="background-color:orange;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">Going Through The Logic</h3>


<h3 style="background-color:darkorange;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">Understanding Harris Corner</h3>


<p style="text-align: center;"><span style='font-size: 24px; font-family: "Times New Roman", Times, serif;'>We will use the Harris corner detection algorithm to extract the corners, i.e. the &quot;significant&quot; pixels, which in our case are the intersections or the endpoints of the straight lines.</span></p>
<p><br></p>
<p style="text-align: center;"><span style='font-size: 24px; font-family: "Times New Roman", Times, serif;'><strong><u>Harris Corners Detector Steps</u></strong></span></p>
<p style="text-align: center;"><span style="font-family: 'Times New Roman', Times, serif;"><span style="font-size: 24px;">1. Compute image gradients: Gx, Gy</span></span></p>
<p style="text-align: center;"><span style="font-family: 'Times New Roman', Times, serif;"><span style="font-size: 24px;">2. Compute products: Gx*Gx, Gx*Gy, Gy*Gy</span></span></p>
<p style="text-align: center;"><span style="font-family: 'Times New Roman', Times, serif;"><span style="font-size: 24px;">3. Filter products with a Gaussian window</span></span></p>
<p style="text-align: center;"><span style="font-family: 'Times New Roman', Times, serif;"><span style="font-size: 24px;">4. For each pixel (i,j) define the matrix M</span></span></p>
<p style="text-align: center;"><span style="font-family: 'Times New Roman', Times, serif;"><span style="font-size: 24px;">5. For each pixel compute the score R</span></span></p>
<p style="text-align: center;"><span style="font-family: 'Times New Roman', Times, serif;"><span style="font-size: 24px;">6. Threshold R, and perform non-maxima suppression</span></span></p>
<p style="text-align: center;"><span style='font-size: 24px; font-family: "Times New Roman", Times, serif;'><br></span></p>
<p><br></p>

<a id="1.1"></a>
<h3 style="background-color:white;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">1) Compute image gradients: Gx, Gy</h3>


In [None]:
def calculate_image_gradients(img):
    #Derivative Kernels
    Kx = np.array([[-1,0,1],[-2,0,2],[-1,0,1]]) 
    Ky = Kx.T
    #Calculate The Derivative with respect to the x axis
    Gx = cv2.filter2D(img,-1,Kx)
    #Calculate The Derivative with respect to the y axis
    Gy = cv2.filter2D(img,-1,Ky)
    
    return Gx,Gy

In [None]:
img = np.zeros((21,21), dtype=np.float32)
img[5:-5,5:-5]=200

Gx,Gy = calculate_image_gradients(img)

plt.subplot(2,3,3)
plt.title("Gx",fontsize=16,fontweight='bold')
plt.imshow(Gx,cmap='jet')
plt.colorbar()
plt.subplot(2,3,2)
plt.title("Gy",fontsize=16,fontweight='bold')
plt.imshow(Gy,cmap='jet')
plt.subplot(2,3,1)
plt.title("Original Image",fontsize=16,fontweight='bold')
plt.imshow(img,cmap='jet')
plt.show()
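
As a quick sanity check of the derivative kernels, the sketch below applies them to a tiny made-up image with a single vertical edge. It uses a hand-written `correlate3x3` helper (a hypothetical name, NumPy only, no border padding) instead of `cv2.filter2D`, which also computes a correlation but pads the border. The x-derivative should respond only around the edge, and the y-derivative should stay at zero.

```python
import numpy as np

def correlate3x3(img, kernel):
    """Plain 'valid' 3x3 correlation (no padding), written out with NumPy slices."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * img[dy:dy + h - 2, dx:dx + w - 2]
    return out

# Same Sobel kernels as in calculate_image_gradients
Kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
Ky = Kx.T

# Hypothetical test image: dark left half, bright right half (one vertical edge)
step = np.zeros((5, 6))
step[:, 3:] = 1.0

Gx_step = correlate3x3(step, Kx)
Gy_step = correlate3x3(step, Ky)
print(Gx_step)
print(Gy_step)
```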

<a id="1.1"></a>
<h3 style="background-color:white;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">2) Compute the second order moments: ($Gx\cdot Gx$, $Gx\cdot Gy$, $Gy\cdot Gy$)</h3>


In [None]:
titles=[r"$Gx^{2}$",r"$Gy^{2}$",r"$Gx\cdot Gy$"]

img = np.zeros((21,21), dtype=np.float32)
img[5:-5,5:-5]=200
Gx,Gy = calculate_image_gradients(img)
#Element-wise products of the gradients, in the same order as the titles
moments = [Gx**2,Gy**2,Gx*Gy]

for i in range(0,3):
    plt.subplot(2,3,i+1)
    plt.title(titles[i],fontsize=16,fontweight='bold')
    plt.imshow(moments[i],cmap='jet')
    
plt.colorbar()
plt.show()

<a id="1.1"></a>
<h3 style="background-color:white;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">3) Filter products with a Gaussian window</h3>


In [None]:
titles=[r"$Gx^{2}$",r"$Gx\cdot Gy$",r"$Gy^{2}$"]

img = np.zeros((21,21), dtype=np.float32)
img[5:-5,5:-5]=200
Gx,Gy = calculate_image_gradients(img)
#Keep the products in the same order as the titles
moments = [Gx**2,Gx*Gy,Gy**2]
#Smooth each product with a 3x3 Gaussian window (sigma=2)
moments = [cv2.GaussianBlur(m,(3,3),2) for m in moments]

for i in range(0,3):
    plt.subplot(2,3,i+1)
    plt.title('Filtered ' +titles[i],fontsize=16,fontweight='bold')
    plt.imshow(moments[i],cmap='jet')

plt.colorbar()
plt.show()
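
To get a feel for the window used above, the sketch below builds a 3x3 Gaussian kernel with sigma = 2 from the textbook sampled-Gaussian formula. It is an assumption that OpenCV builds its kernel the same way for these arguments, and `sampled_gaussian_1d` is an illustrative helper, not a library function. With such a large sigma on such a small window the weights come out almost uniform, so the filter behaves much like a box filter.

```python
import numpy as np

def sampled_gaussian_1d(ksize, sigma):
    """Sample exp(-x^2 / (2 sigma^2)) at integer offsets and normalise to sum 1."""
    centre = (ksize - 1) / 2
    x = np.arange(ksize) - centre
    w = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return w / w.sum()

g1d = sampled_gaussian_1d(3, 2.0)
# A separable 2D Gaussian is the outer product of two 1D Gaussians
g2d = np.outer(g1d, g1d)
print(np.round(g2d, 3))
```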

<a id="1.1"></a>
<h3 style="background-color:white;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">4-5) For each pixel $(i,j)$ define the matrix $M$ and compute the score $R$</h3>


For each pixel $(i,j)$ define a 2x2 moments matrix, where the sums run over the window around the pixel:  $M = \begin{bmatrix}
 \sum{G_x^2}&\sum{G_xG_y} \\ 
 \sum{G_xG_y}&\sum{G_y^2} 
\end{bmatrix}$

We define the Harris score as: $R = \det(M) - \alpha{[\operatorname{tr}(M)]^2}$,
where $\det$ is the determinant operator and $\operatorname{tr}$ is the trace operator.
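
Since the determinant of $M$ is the product of its eigenvalues and the trace is their sum, the score is large and positive only when both eigenvalues are large (a corner), negative when one dominates (an edge), and close to zero in flat regions. The sketch below evaluates the score for three hand-picked matrices. The values are illustrative, not taken from the images, and `harris_score` is a hypothetical helper.

```python
import numpy as np

def harris_score(M, alpha=0.06):
    """R = det(M) - alpha * tr(M)^2 for a single 2x2 moments matrix."""
    det = M[0, 0] * M[1, 1] - M[0, 1] * M[1, 0]
    trace = M[0, 0] + M[1, 1]
    return det - alpha * trace ** 2

# Hand-picked examples: no gradient, gradient in one direction, gradient in both
examples = {
    "flat": np.array([[0.0, 0.0], [0.0, 0.0]]),
    "edge": np.array([[10.0, 0.0], [0.0, 0.0]]),
    "corner": np.array([[10.0, 0.0], [0.0, 10.0]]),
}
scores = {name: harris_score(M) for name, M in examples.items()}
for name, score in scores.items():
    print(name, round(score, 3))
```

With alpha = 0.06 the three scores come out at roughly 0, -6 and 76.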

In [None]:
img = np.zeros((21,21), dtype=np.float32)
img[5:-5,5:-5]=200

plt.subplot(2,2,1)
plt.title('Original Image',weight='bold')
plt.imshow(img,cmap='jet')
plt.subplot(2,2,2)
Gx,Gy =calculate_image_gradients(img)
M = [Gx**2,Gx*Gy,Gy**2]
M = [cv2.GaussianBlur(Prod,(3,3),2) for Prod in M]
detM = M[0]*M[2]-M[1]**2
traceM = M[0]+M[2]
R_scores = detM-0.06*(traceM**2)
plt.title('R Score Matrix',weight='bold')
plt.imshow(R_scores,cmap='jet')
plt.colorbar()
plt.show()

<a id="1.1"></a>
<h3 style="background-color:white;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">6) Threshold $R$, and perform non-maxima suppression</h3>


In [None]:
def get_R_scores(Gx,Gy,Sigma=2,alpha=0.06,tsh=0.35):
    #Smoothed gradient products: the entries of M for every pixel
    M = [Gx**2,Gx*Gy,Gy**2]
    M = [cv2.GaussianBlur(Prod,(3,3),Sigma) for Prod in M]
    detM = M[0]*M[2]-M[1]**2
    traceM = M[0]+M[2]
    R_scores = detM-alpha*(traceM**2)
    #Full 3x3 neighbourhood used to find local maxima
    kernel = np.ones((3,3),np.uint8)
    R_dilate = cv2.dilate(R_scores,kernel)
    R_th = R_scores > R_scores.max()*tsh # threshold
    R_nms = R_scores >= R_dilate # NMS
    R_final = R_th * R_nms # threshold and NMS
    #np.int was removed from recent NumPy releases; the builtin int is equivalent
    return R_final.astype(int)

img = np.zeros((21,21), dtype=np.float32)
img[5:-5,5:-5]=200

plt.subplot(2,2,1)
plt.title('Original Image',weight='bold')
plt.imshow(img,cmap='jet')
plt.subplot(2,2,2)
plt.title('R Score Matrix after NMS',weight='bold')
Gx,Gy =calculate_image_gradients(img)
plt.imshow(get_R_scores(Gx,Gy),cmap='jet')
plt.colorbar()
plt.show()
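
The thresholding and non-maxima suppression step can also be checked on a tiny made-up score array. The sketch below replaces `cv2.dilate` with a hand-written 3x3 maximum filter (`local_max_3x3`, an illustrative NumPy-only helper); padding the border with negative infinity is an assumption made so that border pixels are compared only with real neighbours.

```python
import numpy as np

def local_max_3x3(scores):
    """Maximum over each pixel's 3x3 neighbourhood (what a 3x3 dilation computes)."""
    padded = np.pad(scores, 1, mode="constant", constant_values=-np.inf)
    h, w = scores.shape
    shifted = [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    return np.max(shifted, axis=0)

def threshold_and_nms(scores, rel_threshold=0.35):
    """Keep pixels that are both strong enough and the largest in their neighbourhood."""
    strong = scores > scores.max() * rel_threshold
    is_peak = scores >= local_max_3x3(scores)
    return strong & is_peak

# Hypothetical R scores: one blob of strong responses and one weak isolated response
scores = np.array([
    [0, 0, 0, 0, 0],
    [0, 5, 9, 0, 0],
    [0, 4, 6, 0, 0],
    [0, 0, 0, 0, 2],
    [0, 0, 0, 0, 0],
], dtype=float)
peaks = threshold_and_nms(scores)
print(np.argwhere(peaks))
```

Only the pixel holding the 9 survives: its neighbours lose the local comparison, and the isolated 2 falls below the relative threshold.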

<p style="text-align: center;"><span style="font-family: 'Times New Roman', Times, serif;"><span style="font-size: 24px;">Note that we are left with the corners of our shape, i.e. the vertices.
Unfortunately, the Harris corner detector by itself will not get us far, given the amount of instability, variability, and noise in our images.
But now that we are familiar with the Harris corner detection algorithm, we can extend it and fit it to our needs; that is exactly what the following section explains. </span></span></p>


<h3 style="background-color:darkorange;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">Understanding Our Pipeline</h3>


<p style="text-align: center;"><span style='font-family: "Times New Roman", Times, serif; font-size: 24px;'>Our extension to the Harris corner algorithm has to find corners that may have lower R scores because of noise and the poor overall quality of many of the images.</span></p>
<p style="text-align: center;"><span style="font-size: 24px;"><span style="font-family: 'Times New Roman', Times, serif;">If we use stricter parameters in the Harris detector, i.e. the aperture size, the window size, and the alpha value, we get higher-quality and more stable vertices that represent intersections between straight lines, but we miss a lot of vertices because of the noise and quality issues already mentioned.</span></span></p>
<p style="text-align: center;"><span style="font-size: 24px;"><span style="font-family: 'Times New Roman', Times, serif;">The solution implemented in the following pipeline goes as follows:</span></span></p>
<ol>
    <li style="text-align: center;"><span style="font-size: 24px;"><span style="font-family: 'Times New Roman', Times, serif;">Use Large Harris Detector Parameters To Get A Noisy Set Of Potential Vertices, A Set Intended To Cover All Vertices.</span></span></li>
    <li style="text-align: center;"><span style='font-family: "Times New Roman", Times, serif; font-size: 24px;'>Perform A Modified Version Of KNN: Depending On A Threshold, Every Group Of Vertices That Lie Within A Certain Distance Of Each Other Keeps Only The First Detected Vertex.</span></li>
</ol>
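
The second step can be illustrated on a handful of made-up candidate points. This is a minimal NumPy-only sketch of the "keep the first vertex of each group" idea, not the notebook's exact implementation; `keep_first_in_each_group` and the coordinates are hypothetical.

```python
import numpy as np

def keep_first_in_each_group(points, threshold):
    """Return indices of kept points: each kept point suppresses later points near it."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances between all candidates
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    removed = set()
    kept = []
    for i in range(len(points)):
        if i in removed:
            continue
        kept.append(i)
        close = np.flatnonzero(dist[i] < threshold)
        removed.update(int(j) for j in close if j != i)
    return kept

# Hypothetical detections: two tight groups and one isolated point
candidates = [(10, 10), (11, 10), (10, 12), (40, 40), (41, 41), (80, 5)]
kept = keep_first_in_each_group(candidates, threshold=5)
print([candidates[i] for i in kept])
```

Each tight group collapses to its first member, while the isolated point is left untouched.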

<h3 style="background-color:white;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">1- Use Harris Corner with Robust Parameters</h3>


In [None]:
def clean_noise(image,stepSize=2,windowSize=(3,3)):

    for y in range(0, image.shape[0], stepSize):
        for x in range(0, image.shape[1], stepSize):
            if (windowSize[0]**2 - image[y:y + windowSize[1], x:x + windowSize[0]].flatten().sum()) == 1:
                image[y:y + windowSize[1], x:x + windowSize[0]] = 1
    return image

img = plt.imread(labels.path[55])
img = clean_noise(img)
pimg = cv2.erode(img,np.ones((3,3)),iterations=2)
cimg = cv2.cvtColor(img,cv2.COLOR_GRAY2RGB)

plt.subplot(2,2,1)
plt.title('Eroded Example')
plt.imshow(pimg)
plt.subplot(2,2,2)

R = cv2.cornerHarris(np.float32(pimg),7,3,0.04)

R_dilate = cv2.dilate(R, np.ones((3,3)))
R_nms = R >= R_dilate;# NMS

R_th = R > R.max()*0.1; # threshold

R_final = R_th * R_nms # threshold and NMS
[y,x] = np.nonzero(R_final)

for x,y in zip(x,y):
    cimg = cv2.circle(cimg,(x,y),3,(255,0,0))

plt.title('Detected Vertices')
plt.imshow(cimg)
plt.show()

<h3 style="background-color:white;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">2- Filter Points Using Modified KNN Clustering</h3>


In [None]:

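def threshold_neighbours(distance_matrix,index,threshold):
    #Indices of all vertices closer than the threshold, excluding the vertex itself
    #(the vertex itself is not always the first index below the threshold)
    close = np.where(distance_matrix[index,:] < threshold)[0]
    return close[close != index]

def cluster_vertices(x,y,threshold=5):
    points = np.stack([x,y]).T
    dm = distance_matrix(points,points)
    removed = set()
    #Keep the first vertex of every group and drop its close neighbours
    for row in np.arange(0,len(x)):
        if row not in removed:
            removed = removed | set(threshold_neighbours(dm,row,threshold))

    #Sorted integer indices keep the vertices in detection order
    kept = np.array(sorted(set(np.arange(0,len(x)))-removed),dtype=int)
    xx=np.take(x,kept)
    yy=np.take(y,kept)
    return xx,yy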

def clean_vertices(img,x_nodes,y_nodes):
    xx,yy = [],[]
    for x,y in zip(x_nodes,y_nodes):
        #Test the pixel under the current vertex (not the output lists)
        if img[y,x] < 1:
            xx.append(x)
            yy.append(y)
    return np.array(xx),np.array(yy)


def tag_vertices(pimg,blockSize=7,apertureSize=13,harrisAlpha=0.04,cluster_threshold=7,nms_threshold=0.1):
    R = cv2.cornerHarris(np.float32(pimg),blockSize,apertureSize,harrisAlpha)
    R_dilate = cv2.dilate(R, np.ones((3,3)))
    R_nms = R >= R_dilate;# NMS
    R_th = R > R.max()*nms_threshold; # threshold
    R_final = R_th * R_nms # threshold and NMS
    [y,x] = np.nonzero(R_final)
    xx,yy = cluster_vertices(x,y,cluster_threshold)
    return xx,yy

plt.subplot(2,2,1)
plt.title('Detected Vertices Before Filtration')
plt.imshow(cimg)
plt.subplot(2,2,2)
plt.title('Detected Vertices After Filtration')
x,y = tag_vertices(img,apertureSize=3,cluster_threshold=8)
ccimg = cv2.cvtColor(img,cv2.COLOR_GRAY2RGB)
for x,y in zip(x,y):
    ccimg = cv2.circle(ccimg,(x,y),4,(255,0,0),-1)
plt.imshow(ccimg)

plt.show()

<h3 style="background-color:orange;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">Vertex Detection Pipeline</h3>


In [None]:
def sliding_window(image, stepSize, windowSize):
    for y in range(0, image.shape[0], stepSize):
        for x in range(0, image.shape[1], stepSize):
            yield (x, y, image[y:y + windowSize[1], x:x + windowSize[0]])

        
def clean_noise(image,stepSize=2,windowSize=(3,3)):
    
    """
     
    Parameters
    ----------
    image : np.array
        an image of a chemical compound
    stepSize : int
        The number of pixels the sliding window moves each step
    windowSize : tuple-(int,int)
        The size of the sliding window 

    """
    for y in range(0, image.shape[0], stepSize):
        for x in range(0, image.shape[1], stepSize):
            if (windowSize[0]**2 - image[y:y + windowSize[1], x:x + windowSize[0]].flatten().sum()) == 1:
                image[y:y + windowSize[1], x:x + windowSize[0]] = 1
    return image


def threshold_neighbours(distance_matrix,index,threshold):
    """
    Return the indices of all vertices closer than the threshold to the target vertex.

    Parameters
    ----------
    distance_matrix : np.array
        The distance matrix between every pair of detected vertices
    index : int
        The index of our target vertex
    threshold : float
        Vertices closer than this distance to the target are considered too close, i.e. neighbours

    """
    close = np.where(distance_matrix[index,:] < threshold)[0]
    #Exclude the target itself; it is not always the first index below the threshold
    return close[close != index]

def cluster_vertices(x,y,threshold=5,P=2,clustering_type=1):
    
    """
    Keep one vertex per group of vertices that lie close to each other.

    Parameters
    ----------
    x : np.array
        The list of x coordinates of vertices
    y : np.array
        The list of y coordinates of vertices
    threshold : float
        Vertices closer than this distance to a kept vertex are considered its neighbours and removed
    P : float
         Which Minkowski p-norm to use when calculating distance between vertices
    clustering_type : int
         Which clustering strategy to use; only 1 (keep the first detected vertex) is implemented

    """
    if clustering_type != 1:
        raise ValueError("Only clustering_type=1 is implemented")

    #Calculate Distance Matrix
    points = np.stack([x,y]).T
    dm = distance_matrix(points,points,p=P)
    removed = set()
    #Iterate and accumulate the vertices which are too close to or overlapping others
    for row in np.arange(0,len(x)):
        if row not in removed:
            removed = removed | set(threshold_neighbours(dm,row,threshold))

    #Sorted integer indices keep the vertices in detection order
    kept = np.array(sorted(set(np.arange(0,len(x)))-removed),dtype=int)
    xx=np.take(x,kept)
    yy=np.take(y,kept)
    #Return the clustered vertices
    return xx,yy

def clean_vertices(img,x_nodes,y_nodes):
    xx,yy = [],[]
    for x,y in zip(x_nodes,y_nodes):
        #Test the pixel under the current vertex (not the output lists)
        if img[y,x] < 1:
            xx.append(x)
            yy.append(y)
    return np.array(xx),np.array(yy)


def tag_vertices(pimg,blockSize=7,apertureSize=13,harrisAlpha=0.04,cluster_threshold=7,nms_threshold=0.1,minkowski_p = 2):
    
    """
    Detect Harris corners in an image and merge detections that lie close together.

    Parameters
    ----------
    pimg : np.array
        An image of a chemical compound
    blockSize : int
        The size of the neighbourhood considered for corner detection
    apertureSize : int
        Aperture parameter of the Sobel derivative used
    harrisAlpha : float
        The Harris alpha parameter in the score equation
    cluster_threshold : float
        A threshold for point clustering, i.e. the minimum distance two points need to be apart to be kept as separate vertices
    nms_threshold : float
        A relative threshold (fraction of the maximum R value) applied together with the non-maxima suppression on the R matrix
    minkowski_p : float, 1 <= p <= infinity
        Which Minkowski p-norm to use when calculating distance between vertices     
    """
    
    #Detect Harris Corners According To Given Parameters
    R = cv2.cornerHarris(np.float32(pimg),blockSize,apertureSize,harrisAlpha)
    
    #Perform Non-Maxima Suppression On The Resulting R Matrix To Eliminate Weak Candidates
    R_dilate = cv2.dilate(R, np.ones((3,3)))
    R_nms = R >= R_dilate # NMS
    R_th = R > R.max()*nms_threshold # threshold
    R_final = R_th * R_nms # threshold and NMS
    
    #Coordinates of the surviving candidates, then merge the ones that are too close
    [y,x] = np.nonzero(R_final)
    xx,yy = cluster_vertices(x,y,cluster_threshold,minkowski_p)
    return xx,yy
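
The `minkowski_p` parameter changes which detections count as "close". The sketch below is a small NumPy illustration of the Minkowski distance for p = 1, p = 2 and p = infinity on one made-up pair of points; `minkowski_distance` is a hypothetical helper, whereas the pipeline itself relies on SciPy's `distance_matrix`.

```python
import numpy as np

def minkowski_distance(a, b, p):
    """Minkowski p-norm distance between two points; p may be np.inf."""
    delta = np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float))
    if np.isinf(p):
        return float(delta.max())
    return float((delta ** p).sum() ** (1.0 / p))

# Hypothetical pair of detections, 3 pixels apart in x and 4 in y
a, b = (0, 0), (3, 4)
distances = {p: minkowski_distance(a, b, p) for p in (1, 2, np.inf)}
for p, d in distances.items():
    print(p, d)
```

The same pair of points gets a larger distance under p = 1 than under p = 2, and a smaller one under p = infinity, so the same `cluster_threshold` can merge them under one norm and keep them apart under another.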
            
    

<h3 style="background-color:orange;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">Examples and A Brief Analysis </h3>


In [None]:
imgs = [230,5,55]
fig,axs = plt.subplots(2,3)
for ix,img in enumerate(imgs):
    img = plt.imread(labels.path[img])
    img = clean_noise(img)
    pimg = cv2.erode(img,np.ones((3,3)),iterations=2)

    x,y = tag_vertices(pimg,apertureSize=3,blockSize=7,cluster_threshold=12,minkowski_p=2)
    ccimg = cv2.cvtColor(img,cv2.COLOR_GRAY2RGB)
    for x,y in zip(x,y):
        ccimg = cv2.circle(ccimg,(x,y),3,(255,0,0),-1)
    axs[1,ix].imshow(ccimg)
    axs[0,ix].imshow(img)
    

In [None]:
def count_nodes(img_path):
    #Read An Image
    img = plt.imread(img_path)
    #Remove Salt And Pepper Noise - Not Using MedianBlur
    img = clean_noise(img)
    #Erode To Close Gaps
    img = cv2.erode(img,np.ones((3,3)),iterations=2)
    #Run Our Pipeline
    x,y = tag_vertices(img)
    return len(x)

SCAN_N = 500
#.loc slicing is inclusive, so stop at SCAN_N-1 to scan exactly SCAN_N images
paths = labels.loc[:SCAN_N-1,'path']
vert_df = pd.DataFrame({'path':paths})
vert_df['Num_Of_Vertices'] = paths.progress_apply(count_nodes)
vert_df['Img_Height'] = paths.progress_apply(lambda x: plt.imread(x).shape[0])
vert_df['Img_Width'] = paths.progress_apply(lambda x: plt.imread(x).shape[1])

In [None]:
vert_df

In [None]:
plt.title(f'Distribution of Vertices Count in First {SCAN_N} Images - With Noise')
sns.histplot(vert_df.Num_Of_Vertices)
plt.grid()
plt.show()

In [None]:
from scipy import stats


plt.suptitle("Linear Relationship Between Image Scale And Number of Vertices", fontsize=16,fontweight='bold')
plt.subplot(2,1,1)
slope, intercept, _,_,_ = stats.linregress(vert_df['Num_Of_Vertices'],vert_df['Img_Width'])
plt.title(r'$Width ={} \cdot X+{}$'.format(np.round(slope,2),np.round(intercept,2)))
sns.regplot(x=vert_df.Num_Of_Vertices,y=vert_df.Img_Width,line_kws=dict(color='red',lw='2',ls='-.'))
plt.grid()
plt.subplot(2,1,2)
slope, intercept, _,_,_ = stats.linregress(vert_df['Num_Of_Vertices'],vert_df['Img_Height'])
plt.title(r'$Height ={} \cdot X+{}$'.format(np.round(slope,2),np.round(intercept,2)))
sns.regplot(x=vert_df.Num_Of_Vertices,y=vert_df.Img_Height,line_kws=dict(color='red',lw='2',ls='-.'))
plt.grid()
plt.tight_layout()
plt.subplots_adjust(top=0.88)
plt.show()
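
A fitted slope alone does not say how strong the linear relationship is; the correlation coefficient does. `stats.linregress` returns it as its third value, which the cell above discards. The sketch below shows the same quantities with NumPy only, on a tiny made-up data set (these numbers are not results from the competition images).

```python
import numpy as np

# Made-up illustration data: vertex counts and image widths in pixels
vertices = np.array([10, 20, 30, 40], dtype=float)
width = np.array([110, 205, 310, 395], dtype=float)

# Least-squares line width = slope * vertices + intercept
slope, intercept = np.polyfit(vertices, width, deg=1)
# Pearson correlation coefficient between the two variables
r = np.corrcoef(vertices, width)[0, 1]
print(round(slope, 2), round(intercept, 2), round(r, 4))
```

Reporting the coefficient next to each fitted line would show whether the scatter really supports a linear trend.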

<h3 style="background-color:orange;font-family:newtimeroman;font-size:200%;text-align:center;border-radius: 15px 50px;">Conclusions and Further Directions </h3>


<ul>
    <li style="text-align: center;"><span style='font-family: "Times New Roman", Times, serif; font-size: 24px;'>After looking at a couple of examples, the pipeline detects almost all vertices in many of the images, but there are a few points to keep in mind. Because the letters are not removed first, some vertices are tagged on letters. They are treated as noise in this notebook, but they definitely need to be handled if the pipeline is applied elsewhere.</span></li>
</ul>
<p style="text-align: center;"><span style="font-size: 24px;"><span style="font-family: 'Times New Roman', Times, serif;"><br></span></span></p>
<ul>
    <li style="text-align: center;"><span style="font-size: 24px;"><span style="font-family: 'Times New Roman', Times, serif;">From looking at many examples while working on this pipeline, it appears that a few sets of parameters (one of which is the default) solve the problem with little deviation across different types of images. An interesting approach would be to create a set of hand-tagged images and design a grid-search-like algorithm to find the parameter values that increase the number of image types that are tagged properly while decreasing the overall number of missed vertices and noisy detections.</span></span></li>
</ul>
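
The grid-search idea could be sketched roughly as follows. Everything here is hypothetical: `count_matches`, `grid_search`, the stand-in detector `fake_detect`, and the hand-tagged coordinates are made up so that the sketch runs without images. Counting missed plus spurious vertices is just one possible error measure.

```python
import itertools
import numpy as np

def count_matches(pred, truth, tol=5.0):
    """Greedily match predicted vertices to hand-tagged ones within `tol` pixels."""
    pred = [tuple(p) for p in pred]
    unmatched = [tuple(t) for t in truth]
    hits = 0
    for p in pred:
        if not unmatched:
            break
        d = [np.hypot(p[0] - t[0], p[1] - t[1]) for t in unmatched]
        j = int(np.argmin(d))
        if d[j] <= tol:
            hits += 1
            unmatched.pop(j)
    return hits, len(truth) - hits, len(pred) - hits  # hits, missed, spurious

def grid_search(detect, images, truths, grid, tol=5.0):
    """Try every parameter combination and return the one with the fewest errors."""
    best_params, best_errors = None, None
    names = sorted(grid)
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        errors = 0
        for img, truth in zip(images, truths):
            _, missed, spurious = count_matches(detect(img, **params), truth, tol)
            errors += missed + spurious
        if best_errors is None or errors < best_errors:
            best_params, best_errors = params, errors
    return best_params, best_errors

def fake_detect(img, cluster_threshold, nms_threshold):
    """Stand-in detector: ignores the image and mimics typical failure modes."""
    points = [(10, 10), (50, 50)]
    if cluster_threshold < 8:
        points.append((12, 11))  # a duplicate that a small threshold fails to merge
    if nms_threshold > 0.2:
        points = points[:1]  # a high threshold misses the second vertex
    return points

grid = {"cluster_threshold": [5, 8, 12], "nms_threshold": [0.1, 0.3]}
best_params, best_errors = grid_search(fake_detect, [None], [[(10, 10), (50, 50)]], grid)
print(best_params, best_errors)
```

In practice `detect` would wrap the notebook's `tag_vertices` and return the detected coordinates as (x, y) pairs.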
<p style="text-align: center;"><span style="font-size: 24px;"><span style="font-family: 'Times New Roman', Times, serif;"><br></span></span></p>
<ul>
    <li style="text-align: center;"><span style='font-family: "Times New Roman", Times, serif; font-size: 24px;'>In the current pipeline, clustered vertices are filtered using the Euclidean distance between them, and the vertex that is kept is the first one found. Different distance metrics and different clustering approaches may give better results.</span></li>
</ul>
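
One such alternative, shown here only as a sketch, is to replace each group of nearby detections with its centroid instead of its first member. The helper `merge_to_centroids` and the sample points are hypothetical, and the grouping is the same greedy threshold rule as before rather than a full clustering algorithm.

```python
import numpy as np

def merge_to_centroids(points, threshold):
    """Greedily group points closer than `threshold` and return each group's mean."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances between all candidates
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    assigned = np.zeros(len(points), dtype=bool)
    centroids = []
    for i in range(len(points)):
        if assigned[i]:
            continue
        # Unassigned points near point i (including i itself) form one group
        members = (dist[i] < threshold) & ~assigned
        assigned |= members
        centroids.append(points[members].mean(axis=0))
    return np.array(centroids)

# Hypothetical detections: three points around one corner and one isolated point
candidates = [(10, 10), (12, 10), (10, 12), (40, 40)]
centroids = merge_to_centroids(candidates, threshold=5)
print(np.round(centroids, 2))
```

The centroid tends to sit closer to the true intersection than an arbitrary first detection when the detections scatter around a corner.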