# Exercise 1.

Although merge sort has a better Big-O than selection sort, selection sort can be faster for smaller inputs.

Rewrite `merge_sort(A, min_size)` such that sub-arrays smaller than an input parameter `min_size` are sorted with our `selection_sort` from the lecture `algorithms intro`.

Time the difference between pure merge sort and this new algorithm. Is it faster? Why or why not?

In [53]:
import numpy as np

In [54]:
#Selection sort

def linear_search(arr):
  """
  Find the index of the minimum element
  AKA argmin
  """
  # initialize current best to +infinity
  # So any element beats it
  current_min = float('inf')
  current_min_idx = 0
  for i in range(len(arr)):
    if arr[i] < current_min:
      current_min = arr[i]
      current_min_idx = i
  return current_min_idx

def selection_sort(arr):
  """Selection sort"""
  n_sorted = 0
  while n_sorted < len(arr):
    # Get the index of the min of remaining elements
    # linear_search returns an index relative to the slice, so we shift it
    # back into the full array with `+ n_sorted`
    min_idx = linear_search(arr[n_sorted:]) + n_sorted
    # Swap minimum element with leftmost remaining element
    arr[n_sorted], arr[min_idx] = arr[min_idx], arr[n_sorted]
    # Increment and restart
    n_sorted += 1
  return arr



In [55]:
#Merge 2 arrays

def merge(left, right):
  res = []
  # Interleave left and right, always taking the smaller head element.
  # Note: list.pop(0) is O(n) and mutates the input lists.
  while len(left)>0 and len(right)>0: 
      if left[0]<right[0]: 
          res.append(left[0]) 
          left.pop(0)
      else: 
          res.append(right[0]) 
          right.pop(0)
  # Copy in remaining elements of left and right
  # (if there are any)
  for i in left: 
      res.append(i) 
  for i in right: 
      res.append(i)
  return res

In [56]:
#Merge sort without selection sort
def merge_sort(A): 
    size = len(A)
    if size > 1:
        m = size // 2
        left = merge_sort(A[:m])
        right = merge_sort(A[m:])
        return merge(left, right)
    else:
        return A


In [57]:
#Merge sort with selection sort
def merge_selection_sort(A, min_size): 
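    size = len(A)

    # Recurse only while the sub-array is larger than min_size.
    # max(min_size, 1) prevents infinite recursion when min_size < 1.
    if size > max(min_size, 1):
        m = size // 2
        left = merge_selection_sort(A[:m], min_size)
        right = merge_selection_sort(A[m:], min_size)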
        return merge(left, right)
    else:
        return selection_sort(A)


In [58]:
a = np.random.randint(1,1000000,100000).tolist()
%timeit -n 1 merge_sort(a)
%timeit -n 1 merge_selection_sort(a,30)

1.41 s ± 43.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.43 s ± 240 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


Question: 

Time the difference between pure merge sort and this new algorithm. Is it faster? Why or why not?

Answer: 

With a large array and a well-tuned `min_size`, merge sort with selection sort is expected to be faster than pure merge sort. Selection sort performs $O(n^2)$ comparisons regardless of how ordered its input is, but on a sub-array of a few dozen elements that cost is small, and it avoids the recursive calls, slicing, and merging that merge sort would spend on the same elements. In the measurement above, however, the two timings (1.41 s ± 43.9 ms versus 1.43 s ± 240 ms) differ by less than their spread, so the hybrid is not measurably faster here, and a speed-up is not guaranteed in all cases.
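Whether the hybrid pays off depends on `min_size`, so one way to investigate is to time several values. The following is a minimal sketch, not the lecture's implementation: `merge_indexed`, `selection_sort_inplace`, and `hybrid_sort` are hypothetical names, and the merge uses indices instead of `list.pop(0)` so that the merge step itself stays linear. Timings depend on the machine, so no expected numbers are given.

```python
import random
import timeit


def merge_indexed(left, right):
    """Merge two sorted lists using indices instead of list.pop(0)."""
    res = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            res.append(left[i])
            i += 1
        else:
            res.append(right[j])
            j += 1
    # Copy whatever remains on either side
    res.extend(left[i:])
    res.extend(right[j:])
    return res


def selection_sort_inplace(arr):
    """Selection sort that scans by index, without slicing."""
    for start in range(len(arr)):
        min_idx = start
        for i in range(start + 1, len(arr)):
            if arr[i] < arr[min_idx]:
                min_idx = i
        arr[start], arr[min_idx] = arr[min_idx], arr[start]
    return arr


def hybrid_sort(A, min_size):
    """Merge sort that hands sub-arrays of at most min_size to selection sort."""
    if len(A) > max(min_size, 1):
        m = len(A) // 2
        return merge_indexed(hybrid_sort(A[:m], min_size),
                             hybrid_sort(A[m:], min_size))
    # Sort a copy so the caller's list is never modified
    return selection_sort_inplace(list(A))


random.seed(0)
data = [random.randint(1, 1_000_000) for _ in range(10_000)]

# min_size=1 behaves like pure merge sort; larger values use more selection sort
for min_size in (1, 8, 16, 32, 64, 128):
    seconds = min(timeit.repeat(lambda: hybrid_sort(data, min_size),
                                number=1, repeat=3))
    print(f"min_size={min_size:>3}: {seconds:.3f} s")
```

Taking the minimum over a few repeats reduces the influence of background noise, which matters here because the differences between neighbouring `min_size` values can be small.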

# Exercise 2. 

Let $A[1 \dots n]$ be an array of $n$ distinct numbers. If $i < j$ and $A[i] > A[j]$, then the pair $(i, j)$ is called an **inversion** of $A$.

In other words, an inversion is a pair of elements that appear out of order in the array.

**1)** List the five inversions of $[2, 3, 8, 6, 1]$.

**2)** Give an algorithm that determines the number of inversions in any permutation on $n$ elements in $O(n \log_2 n)$ worst-case time. (Hint: Modify merge sort.)

In [59]:
#Exercise 2.1
def find_inversion(array):
    '''
    Function which enumerates the inversions in a list
    as pairs (i, j) of 0-based indices
    '''
    a = array.copy()
    result = []
    for i,n1 in enumerate(a):
        for j,n2 in enumerate(a[i+1:]):
            if n1 > n2:
                result.append((i,j+i+1))
    return result
#Test
a2 = [2,3,8,6,1]
find_inversion(a2)


[(0, 4), (1, 4), (2, 3), (2, 4), (3, 4)]

In [60]:
#Exercise 2.2

counter = 0
    
def merge(l, r):
    global counter
    result = []
    l_index = 0
    r_index = 0

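    while (l_index < len(l)) and (r_index < len(r)):
        if l[l_index] <= r[r_index]:
            result.append(l[l_index])
            l_index += 1
        else:
            # r[r_index] is smaller than every remaining element of l,
            # so it forms an inversion with each of them
            result.append(r[r_index])
            r_index += 1
            counter += len(l)-l_index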

    if l_index < len(l):
        result += l[l_index:]
    if r_index < len(r):
        result += r[r_index:]
    return result

def merge_sort(array):
    a = array.copy()
    if len(a) <= 1:  # also stops on an empty list
        return a
    mid = len(a)//2
    left = merge_sort(a[:mid]) 
    right = merge_sort(a[mid:]) 
    result = merge(left, right) 
    return result

merge_sort(a2)
print(f"The number of inversions for the given array is {counter}.")




The number of inversions for the given array is 5.
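The global `counter` has to be reset by hand before each new count. As an alternative, here is a minimal sketch that returns the count alongside the sorted list; `count_inversions` is a hypothetical name, not part of the exercise statement. It uses the same idea as above: whenever an element of the right half is taken before the remaining elements of the left half, it forms one inversion with each of them.

```python
def count_inversions(array):
    """Return (sorted_copy, inversion_count) for a list of numbers."""
    if len(array) <= 1:
        return list(array), 0
    mid = len(array) // 2
    left, left_count = count_inversions(array[:mid])
    right, right_count = count_inversions(array[mid:])

    merged = []
    split_count = 0
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] precedes all len(left) - i remaining left elements
            merged.append(right[j])
            j += 1
            split_count += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, left_count + right_count + split_count


print(count_inversions([2, 3, 8, 6, 1]))  # ([1, 2, 3, 6, 8], 5)
```

Each level of recursion does a linear amount of merging work and there are about $\log_2 n$ levels, which gives the required $O(n \log_2 n)$ bound.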


# 3. Recursive sum

Write a function that uses recursion to compute the sum of an array or list of numbers.

```
recursive_sum([2, 4, 5, 6, 7])

output: 24
```

In [61]:
def recursive_sum(array):
    '''
    Recursive function which computes the sum of an array or list of numbers
    '''
    a = array.copy()
    if len(a) == 0:
        return 0
    return a[0] + recursive_sum(a[1:])

#Test
recursive_sum([2, 4, 5, 6, 7])

24
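The solution above recurses once per element, so its recursion depth equals the length of the list, and each call also copies the remaining elements. For long lists this can exceed Python's default recursion limit. A possible variant, sketched here under the hypothetical name `recursive_sum_halves`, splits the index range in half instead, so the depth grows only with $\log_2 n$ and no slices are created.

```python
def recursive_sum_halves(array, lo=0, hi=None):
    """Sum array[lo:hi] by recursing on two halves (no slicing)."""
    if hi is None:
        hi = len(array)
    if hi - lo == 0:
        return 0
    if hi - lo == 1:
        return array[lo]
    mid = (lo + hi) // 2
    return recursive_sum_halves(array, lo, mid) + recursive_sum_halves(array, mid, hi)


print(recursive_sum_halves([2, 4, 5, 6, 7]))       # 24
print(recursive_sum_halves(list(range(100_000))))  # 4999950000
```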

# 4. Recursive denominators

Write a Python program that uses recursion to find the greatest common divisor (gcd) of two integers.

```
recursive_gcd(12,14)

output : 2
```

In [62]:
def recursive_gcd(n1,n2):
    '''
    Recursive function which finds the greatest common divisor of two integers 
    '''
    if n2==0:
        return n1
    return recursive_gcd(n2, n1%n2)

#Test
recursive_gcd(12,14)

2
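The recursion relies on the identity $\gcd(a, b) = \gcd(b, a \bmod b)$, with $\gcd(a, 0) = a$ as the base case. To make the steps visible, the sketch below repeats the same recursion with a printed trace (`recursive_gcd_trace` and its `depth` parameter are hypothetical additions) and checks the result against `math.gcd` from the standard library.

```python
import math


def recursive_gcd_trace(n1, n2, depth=0):
    """Same recursion as recursive_gcd, printing each call."""
    print("  " * depth + f"gcd({n1}, {n2})")
    if n2 == 0:
        return n1
    return recursive_gcd_trace(n2, n1 % n2, depth + 1)


result = recursive_gcd_trace(12, 14)
# Call chain: gcd(12, 14) -> gcd(14, 12) -> gcd(12, 2) -> gcd(2, 0) -> 2
assert result == math.gcd(12, 14) == 2
```

Note that the first call simply swaps the arguments, because `12 % 14` is `12`; the function therefore works regardless of which argument is larger.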

# 5. Recursive power function

Write a recursive function to calculate the value of 'a' to the power 'b'. 

```
recursive_pow(3, 4)

output: 81
```

In [63]:
def recursive_pow(a,b):
    '''
    Recursive function which calculates the value of 'a' to the power of 'b'
    (assumes 'b' is a non-negative integer)
    '''
    if b == 0:
        return 1
    return a*recursive_pow(a, b-1)

#Test
recursive_pow(3, 4)

81
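The solution above makes `b` recursive calls. A different technique, exponentiation by squaring, needs only about $\log_2 b$ levels because $a^b = (a^{\lfloor b/2 \rfloor})^2$ for even $b$, with one extra factor of $a$ for odd $b$. The sketch below uses the hypothetical name `recursive_pow_fast` and assumes `b` is a non-negative integer.

```python
def recursive_pow_fast(a, b):
    """a ** b for a non-negative integer b, by repeated squaring."""
    if b < 0:
        raise ValueError("b must be a non-negative integer")
    if b == 0:
        return 1
    half = recursive_pow_fast(a, b // 2)
    if b % 2 == 0:
        return half * half
    # Odd exponent: one extra factor of a
    return a * half * half


print(recursive_pow_fast(3, 4))   # 81
print(recursive_pow_fast(2, 10))  # 1024
```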

# 6. (Stretch) K-Nearest Neighbours

Consider a matrix with the following format:

```
[[0.3, 0.8],
 [-0.2, 0.5],
 [1, -1],
 [0.9, 0.5]
]
```

Each row denotes a point, and the numbers in each row are the coordinates. The coordinates in this example are in 2d, but the matrix could be in 3d (3 numbers per row) or even higher dimensions.

Your task is to write a function `knn(m, p, k)` or `k_nearest_neighbors(m, p, k)` which takes in a matrix of points `m`, an integer `p` denoting the index of a point in that matrix, and an integer `k` denoting the number of nearest neighbors to return.

The function returns the indices of the `k` nearest neighbors of the point `p` in the matrix `m`.

```
dataset = [[2.7810836,2.550537003,0],
	[1.465489372,2.362125076,0],
	[3.396561688,4.400293529,0],
	[1.38807019,1.850220317,0],
	[3.06407232,3.005305973,0],
	[7.627531214,2.759262235,1],
	[5.332441248,2.088626775,1],
	[6.922596716,1.77106367,1],
	[8.675418651,-0.242068655,1],
	[7.673756466,3.508563011,1]]

knn(dataset, 0, 2)

output : [4, 1]
```


You can use `from sklearn.neighbors import NearestNeighbors` to test your function

In [64]:
from sklearn.neighbors import NearestNeighbors
np.random.seed(seed=1)
array = np.random.randint(1,10,[11,6])

def distance(pt1, pt2):
    '''
    Finds the distance between two points
    '''
    
    diff_sq = (pt1-pt2)**2
    dist = np.sqrt(np.sum(diff_sq))
    return dist

def knn(m, p, k):
    '''
    Finds the k closest points to a point located at index p in an array.
    '''
    a = np.asarray(m, dtype=float)
    # Distance from every row to the reference row a[p]
    distances = np.apply_along_axis(distance, 1, a, a[p])
    # Exclude the reference point itself
    distances[p] = np.inf
    # Sort so the neighbors come back nearest first
    idx = np.argsort(distances)
    return idx[:k]

#Test
dataset = [[2.7810836,2.550537003,0],
    [1.465489372,2.362125076,0],
    [3.396561688,4.400293529,0],
    [1.38807019,1.850220317,0],
    [3.06407232,3.005305973,0],
    [7.627531214,2.759262235,1],
    [5.332441248,2.088626775,1],
    [6.922596716,1.77106367,1],
    [8.675418651,-0.242068655,1],
    [7.673756466,3.508563011,1]]

#Results from function knn
print(knn(dataset,0,2))

#Results obtained with NearestNeighbors from sklearn.neighbors. The query point is part of
#the fitted data, so it comes back as its own nearest neighbor; we therefore ask for k + 1 = 3 points
neigh = NearestNeighbors(n_neighbors=2, radius=0.4)
neigh.fit(dataset)
print(neigh.kneighbors([dataset[0]], 3, return_distance=False))

[4 1]
[[0 4 1]]
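The distance computation can also be written without a per-row helper by letting NumPy broadcasting subtract the reference row from every row at once. The following is a minimal sketch using Euclidean distance; `knn_with_distances` is a hypothetical name, and it returns the distances as well so the ordering can be inspected. It is run on the small 2d matrix from the exercise statement.

```python
import numpy as np


def knn_with_distances(m, p, k):
    """Return (indices, distances) of the k rows of m closest to row p, nearest first."""
    a = np.asarray(m, dtype=float)
    # Broadcasting subtracts row p from every row in one step
    distances = np.linalg.norm(a - a[p], axis=1)
    distances[p] = np.inf  # never return the reference point itself
    order = np.argsort(distances)[:k]
    return order, distances[order]


points = [[0.3, 0.8], [-0.2, 0.5], [1, -1], [0.9, 0.5]]
idx, dist = knn_with_distances(points, 0, 2)
print(idx)   # [1 3]
print(dist)
```

Sorting with `np.argsort` costs more than `np.argpartition` for large matrices, but it guarantees that the neighbors are returned in order of increasing distance.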
