# Exploring distance metrics

i. Euclidean Distance : A distance metric that measures the straight-line distance between two points.

Euclidean Distance(P,Q) = [(x2-x1)^2 + (y2-y1)^2]^(1/2)<br>
where,<br>
P = (x1,y1)<br>
Q = (x2,y2)

In [49]:
import math

def euclidean_distance_2p(p: tuple = (0, 0), q: tuple = (0, 0)) -> float:
    # Straight-line distance between two 2-D points.
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def euclidean_distance_np(p: list) -> float:
    # Total length of the path that visits the points in list order.
    distance = 0
    for a, b in zip(p, p[1:]):
        if not (isinstance(a, tuple) and isinstance(b, tuple)):
            raise TypeError('Only tuple type allowed in list')
        # Reject the pair if either point is not 2-D.
        if len(a) != 2 or len(b) != 2:
            raise ValueError('All tuples should be of length 2')
        for point in (a, b):
            for coord in point:
                if not isinstance(coord, (int, float)):
                    raise ValueError("Only integers or floats allowed in tuples")
        distance += euclidean_distance_2p(a, b)
    return distance

In [50]:
# Trying euclidean_distance_2p
P = (1,2) 
Q = (5,6) 
distance = euclidean_distance_2p(P,Q) 
distance

5.656854249492381

In [51]:
# Trying euclidean_distance_np
points = [(1,2),(5,6),(7,8)] 
distance = euclidean_distance_np(points) 
distance

8.485281374238571
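
The two-point function above is limited to 2-D points. As a rough sketch of how the same formula extends to any number of dimensions, the helper `euclidean_distance_nd` below (a hypothetical name, not part of the original notebook) sums the squared differences over all coordinates. It is cross-checked against the standard library's `math.dist`, which assumes Python 3.8 or later.

```python
import math

def euclidean_distance_nd(p, q):
    """Euclidean distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("Points must have the same number of coordinates")
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))

# 2-D case, using the same points as in the cells above
d2 = euclidean_distance_nd((1, 2), (5, 6))
# 3-D case: the diagonal of a 3 x 4 x 12 box is 13
d3 = euclidean_distance_nd((0, 0, 0), (3, 4, 12))

print(d2, d3)
# Cross-check the 2-D result against the standard library
print(math.isclose(d2, math.dist((1, 2), (5, 6))))
```

In practice `math.dist` (or a NumPy norm) would usually be preferred over a hand-written loop; the sketch only makes the formula explicit.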

ii. Manhattan Distance : A distance metric that measures the distance between two points as the sum of the absolute differences of their coordinates, as if travelling along a grid of axis-aligned streets rather than in a straight line.<br>

Manhattan Distance(P,Q) = |x1-x2| + |y1-y2|<br>
where,<br>
P = (x1,y1)<br>
Q = (x2,y2)

In [52]:
def manhattan_distance_2p(p: tuple = (0, 0), q: tuple = (0, 0)) -> float:
    # Sum of the absolute coordinate differences of two 2-D points.
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def manhattan_distance_np(p: list) -> float:
    # Total Manhattan length of the path that visits the points in list order.
    distance = 0
    for a, b in zip(p, p[1:]):
        if not (isinstance(a, tuple) and isinstance(b, tuple)):
            raise TypeError('Only tuple type allowed in list')
        # Reject the pair if either point is not 2-D.
        if len(a) != 2 or len(b) != 2:
            raise ValueError('All tuples should be of length 2')
        for point in (a, b):
            for coord in point:
                if not isinstance(coord, (int, float)):
                    raise ValueError("Only integers or floats allowed in tuples")
        distance += manhattan_distance_2p(a, b)
    return distance

In [53]:
# Trying manhattan_distance_2p 
distance = manhattan_distance_2p(P,Q) 
distance

8

In [54]:
# Trying manhattan_distance_np 
distance = manhattan_distance_np(points) 
distance

12
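
As a small illustration of how the two metrics relate, the sketch below defines a hypothetical n-dimensional helper, `manhattan_distance_nd` (not part of the original notebook), and compares it with the Euclidean distance from `math.dist` (Python 3.8+). For any pair of points the Manhattan distance is at least as large as the Euclidean distance, since the straight line is the shortest path.

```python
import math

def manhattan_distance_nd(p, q):
    """Sum of absolute coordinate differences for points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("Points must have the same number of coordinates")
    return sum(abs(a - b) for a, b in zip(p, q))

P, Q = (1, 2), (5, 6)
manhattan = manhattan_distance_nd(P, Q)  # |1-5| + |2-6|
euclidean = math.dist(P, Q)              # straight-line distance

print(manhattan, euclidean)
print(manhattan >= euclidean)  # the grid path is never shorter than the straight line
```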

iii. Minkowski Distance : A generalized form of both the Euclidean and Manhattan distances.<br>

Minkowski Distance(A,B) = (|x1-x2|^p + |y1-y2|^p)^(1/p)<br>
where,<br>
A = (x1,y1)<br>
B = (x2,y2)<br>
p = 1 gives the Manhattan distance and p = 2 gives the Euclidean distance

In [55]:
def minkowski_distance_2p(p: int, a: tuple = (0, 0), b: tuple = (0, 0)) -> float:
    # Dispatch to the two special cases defined earlier.
    if p == 1:
        return manhattan_distance_2p(a, b)
    if p == 2:
        return euclidean_distance_2p(a, b)
    raise ValueError('p must be 1 for manhattan distance and 2 for euclidean distance')

def minkowski_distance_np(p: int, a: list) -> float:
    # Same dispatch for a path through a list of points.
    if p == 1:
        return manhattan_distance_np(a)
    if p == 2:
        return euclidean_distance_np(a)
    raise ValueError('p must be 1 for manhattan distance and 2 for euclidean distance')


In [56]:
# trying minkowski distance with p=2 for euclidean distance for two points
euclidean_distance = minkowski_distance_2p(2,(1,2),(5,6))
print('Euclidean_distance:',euclidean_distance)
# trying minkowski distance with p=1 for manhattan distance for two points
manhattan_distance = minkowski_distance_2p(1,(1,2),(5,6)) 
print('Manhattan distance:',manhattan_distance)

Euclidean_distance: 5.656854249492381
Manhattan distance: 8


In [57]:
l = [(1,2),(5,6),(7,8)]
# trying minkowski distance with p=2 for euclidean distance for three points
euclidean_distance_mp = minkowski_distance_np(2,l) 
print('Euclidean distance: ',euclidean_distance_mp) 
# trying minkowski distance with p=1 for manhattan distance for three points
manhattan_distance_mp = minkowski_distance_np(1,l) 
print('Manhattan distance: ',manhattan_distance_mp) 

Euclidean distance:  8.485281374238571
Manhattan distance:  12
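
The functions above accept only p = 1 or p = 2. As a minimal sketch of the general formula, the hypothetical function `minkowski_distance` below (its name and the parameter name `p_order` are illustrative, not from the original notebook) evaluates the distance for any order p >= 1 and any number of coordinates. Orders below 1 are rejected because the result then no longer satisfies the triangle inequality.

```python
import math

def minkowski_distance(p_order, a, b):
    """Minkowski distance of order p_order between two equal-length points."""
    if p_order < 1:
        raise ValueError("p_order must be >= 1")
    if len(a) != len(b):
        raise ValueError("Points must have the same number of coordinates")
    return sum(abs(x - y) ** p_order for x, y in zip(a, b)) ** (1 / p_order)

A, B = (1, 2), (5, 6)
d1 = minkowski_distance(1, A, B)  # reduces to the Manhattan distance
d2 = minkowski_distance(2, A, B)  # reduces to the Euclidean distance
d3 = minkowski_distance(3, A, B)  # cube root of 4**3 + 4**3

print(d1, d2, d3)
print(math.isclose(d2, math.dist(A, B)))
```

For a fixed pair of points the value does not increase as p grows, which is why the p = 3 result is smaller than the Euclidean one here.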


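iv. Hamming Distance : The number of positions at which two equal-length sequences differ. It is used, among other places, in NLP (Natural Language Processing) to compare strings. Note that the `vectorize=True` option and `hamming_distance_list` below sum the sizes of the differences instead of counting mismatches, so they are weighted variants rather than the standard Hamming distance.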

In [None]:
import string

def hamming_distance_str(p: str, q: str, vectorize: bool = False, equalizer: bool = True) -> float:
    # `equalizer` is accepted but currently unused.
    if p == '' or q == '':
        raise ValueError('Empty strings cannot be accepted')
    if len(p) != len(q):
        raise ValueError('Strings must be of equal lengths')
    # Map each supported character to a 1-based index
    # (letters, digits, punctuation, whitespace).
    letters = string.ascii_letters + string.digits + string.punctuation + string.whitespace
    letter_index = {ch: idx + 1 for idx, ch in enumerate(letters)}
    distance = 0
    for i, j in zip(p, q):
        if i != j:
            if vectorize:
                # Weighted variant: add the gap between the two character indices.
                distance += abs(letter_index[i] - letter_index[j])
            else:
                # Standard Hamming distance: count the mismatch.
                distance += 1
    return distance

def hamming_distance_list(p: list, q: list) -> float:
    # Sums the absolute element-wise differences of two numeric lists.
    if not p or not q:
        raise ValueError('Empty lists cannot be accepted')
    if len(p) != len(q):
        raise ValueError('Lists must be of equal lengths')
    if not all(isinstance(x, (int, float)) for x in p + q):
        raise ValueError('Only int or float datatype supported')
    return sum(abs(i - j) for i, j in zip(p, q))

In [62]:
# trying hamming distance with vectorize as false 
str1 = 'Hello' 
str2 = 'hey@1'
hamming_distance = hamming_distance_str(str1,str2)
hamming_distance

4

In [60]:
# trying hamming distance with vectorize as true 
hamming_distance = hamming_distance_str(str1,str2,True) 
hamming_distance

150

In [61]:
# trying hamming distance with list of integers and floats 
l1 = [1,2,0.5,4,3,7] 
l2 = [0,4,7,8,1.2,3.4] 
hamming_distance = hamming_distance_list(l1,l2) 
round(hamming_distance, 1)

18.9
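
For comparison, here is a minimal sketch of the standard Hamming distance, which only counts differing positions. The names `hamming_distance` and `hamming_distance_int` are hypothetical helpers, not part of the original notebook. The first works on any pair of equal-length sequences (strings, lists, tuples); the second counts differing bits of two non-negative integers using XOR.

```python
def hamming_distance(seq1, seq2):
    """Number of positions at which two equal-length sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("Sequences must be of equal length")
    return sum(a != b for a, b in zip(seq1, seq2))

def hamming_distance_int(x, y):
    """Number of differing bits between two non-negative integers."""
    if x < 0 or y < 0:
        raise ValueError("Only non-negative integers are supported")
    # XOR leaves a 1 bit exactly where the two numbers disagree.
    return bin(x ^ y).count("1")

print(hamming_distance("karolin", "kathrin"))        # 3
print(hamming_distance([1, 0, 1, 1], [1, 1, 1, 0]))  # 2
print(hamming_distance_int(0b1011101, 0b1001001))    # 2
```

Because every mismatch contributes exactly 1, the result is always an integer between 0 and the sequence length, unlike the weighted variants above.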