# Euclidean distance & self-organising maps

At every iteration we need to find the neuron closest to the current city (or the city closest to the current neuron). 
We use the <b>Euclidean distance</b> for this, since we are working in a 
two-dimensional system. Let $\vec{a}, \vec{b}$ be 
two vectors in $\mathbb{R}^2$; the Euclidean distance 
between them is $d(\vec{a}, \vec{b}) = ||\vec{a} - \vec{b}|| = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2}$.

In [1]:
import numpy as np 

def euclidean_distance(vec1, vec2):
    
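    # returns the Euclidean distance between corresponding rows of the two arrays;
    # axis=1 treats each row as one vector, so the difference must be a 2-D array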
    
    return np.linalg.norm(vec1-vec2, axis=1)
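
As a quick check of how the `axis=1` argument behaves, here is a minimal sketch (the function is redefined so the block runs on its own, and the variable names are illustrative assumptions). It assumes each point is stored as a row: a single point in $\mathbb{R}^2$ then has shape `(1, 2)`, whereas a `(2, 1)` column is read as two one-dimensional points.

```python
import numpy as np

def euclidean_distance(vec1, vec2):
    # same definition as above: one distance per row
    return np.linalg.norm(vec1 - vec2, axis=1)

a_row = np.array([[1, 1]])   # shape (1, 2): one point in R^2
b_row = np.array([[9, 2]])
d_row = euclidean_distance(a_row, b_row)   # sqrt(8**2 + 1**2) = sqrt(65)

a_col = a_row.reshape(2, 1)  # shape (2, 1): two points in R^1
b_col = b_row.reshape(2, 1)
d_col = euclidean_distance(a_col, b_col)   # absolute difference per row

print(d_row, d_col)
```

Storing points as rows keeps the function consistent with the way `choose_closest` and `calculate_dist` call it.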
    

The next task is to choose the nearest of the possible candidates, i.e. to select the candidate with the minimal Euclidean distance from the current one: $$\vec{b}_m = \underset{\vec{b}}{\operatorname{argmin}} \, ||\vec{a} - \vec{b}||$$ 

In [2]:
""" prototype for the argmin function : 
___________________________________________________
    def argmin(l):
        y = min(range(len(l)), key=lambda x: l[x])
        return y   # the index of the smallest element, not the element itself""" 

# we first worked out how argmin behaves on a plain list, then found that this approach was slow for a matrix, so we use NumPy's built-in argmin instead

def choose_closest(current, neighbours):
    return euclidean_distance(current, neighbours).argmin()
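
To make the relationship between the list prototype and NumPy's built-in `argmin` explicit, the following sketch compares the two on a small set of candidates. The helper name `argmin_list` and the sample points are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def argmin_list(values):
    # index of the smallest element of a plain Python sequence
    return min(range(len(values)), key=lambda i: values[i])

current = np.array([0.0, 0.0])
candidates = np.array([[3.0, 4.0], [1.0, 1.0], [0.0, 2.0]])

# one Euclidean distance per candidate row
distances = np.linalg.norm(candidates - current, axis=1)

idx_list = argmin_list(distances.tolist())   # pure-Python version
idx_numpy = int(distances.argmin())          # NumPy built-in

print(idx_list, idx_numpy)
```

Both versions return the first index at which the minimum occurs, so they also agree when two candidates are equally close.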

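Now that we can compute the distance between two cities, we need a way to calculate the cost of taking a particular route. 
This cost must be sensitive to the order in which the cities are visited, e.g. in general $ \text{cost(w,x,y,z)} \neq \text{cost(w,y,x,z)}$. A cyclic shift such as $(y,z,x)$ of $(x,y,z)$ leaves the cost of a closed route unchanged, since the same legs are traversed.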

In [3]:
def calculate_dist(cities):
    """ calculate the distances while traversing on a route 
    _________________________________________________________
    
    cities : pandas data frame object containing the x, y coordinates of the cities 
    
    returns : 
    
    the total length of the closed route that visits the cities in the given order and returns from the last city to the first
        
    """
    
    # cities will be a data frame of coordinates and city number 
    # we need to first form the array of (x,y) coordinates of cities
    
    city_coordinates = cities[['x', 'y']].values
    
    distances = euclidean_distance(city_coordinates, np.roll(city_coordinates, 1, axis=0))
    
    return np.sum(distances)
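
The effect of the visiting order on the length of a closed route can be checked with a small sketch. It works directly on a NumPy array rather than a data frame, and the helper name `route_length` and the four sample points are assumptions made for illustration.

```python
import numpy as np

def route_length(points):
    # length of the closed route through the rows of `points`, in order;
    # np.roll pairs every point with its predecessor (the first with the last)
    return np.linalg.norm(points - np.roll(points, 1, axis=0), axis=1).sum()

square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])

perimeter = route_length(square)                    # walk around the square
rotated = route_length(np.roll(square, 1, axis=0))  # same cycle, different start
crossed = route_length(square[[0, 2, 1, 3]])        # swap two cities: uses both diagonals

print(perimeter, rotated, crossed)
```

Rotating the starting city leaves the length unchanged, while swapping two cities replaces two sides of the square with its diagonals and makes the route longer.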

In [4]:
# test case 1 : 

a = np.array([1,1]).reshape(2,1)
b = np.array([9,2]).reshape(2,1)

print(f"distance between a and b is {euclidean_distance(a,b)}")

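"""the expected answer is (|1-9|, |1-2|) = (8, 1): with shape (2,1) each row is a separate one-dimensional vector, so this is the per-component difference and not the distance sqrt(65) between the points (1,1) and (9,2)"""

distance between a and b is [8. 1.]


'the expected answer is (|1-9|, |1-2|) = (8, 1): with shape (2,1) each row is a separate one-dimensional vector, so this is the per-component difference and not the distance sqrt(65) between the points (1,1) and (9,2)'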

In [5]:
# test case 2 :

print(choose_closest(np.array([1,1]), np.array([[1,5],[2,1],[2,3]])))

"""expected answer is the index of (2,1)"""

1


'expected answer is the index of (2,1)'

In [6]:
# test case 3 : 

import pandas as pd 

df = pd.DataFrame([[1,0],[1,1],[0,1],[0,0]], columns=['x', 'y'])

print(calculate_dist(df))

"""the expected answer is 4.0"""

4.0


'the expected answer is 4.0'