Dylan Hastings

# Exercise 1.

Although merge sort has a better Big-O than selection sort, selection sort can be faster for smaller inputs.

Rewrite `merge_sort(A, min_size)` such that sub-arrays smaller than an input parameter `min_size` are sorted with our `selection_sort` from the lecture `algorithms intro`.

In [23]:
def linear_search(arr):
  """
  Find the index of the minimum element
  AKA argsort
  """
  # initialize current best to +infinity
  # So any element beats it
  current_min = float('inf')
  current_min_idx = 0
  for i in range(len(arr)):
    if arr[i] < current_min:
      current_min = arr[i]
      current_min_idx = i
  return current_min_idx

In [24]:
def selection_sort(arr):
  """Selection sort"""
  n_sorted = 0
  while n_sorted < len(arr):
    # Get the index of the min of remaining elements
    # Since argsort returns based on array, we correct result
    # with `+ n_sorted`
    min_idx = linear_search(arr[n_sorted:]) + n_sorted
    # Swap minimum element with leftmost remaining element
    to_swap = arr[n_sorted]
    arr[n_sorted] = arr[min_idx]
    arr[min_idx] = to_swap
    # Increment and restart
    n_sorted += 1

In [25]:
def merge(left, right):
  res = []
  # Zip in together left and right parts
  while len(left)>0 and len(right)>0: 
      if left[0]<right[0]: 
          res.append(left[0]) 
          left.pop(0)
      else: 
          res.append(right[0]) 
          right.pop(0)
  # Copy in remaining elements of left and right
  # (if there are any)
  for i in left: 
      res.append(i) 
  for i in right: 
      res.append(i)
  return res

In [26]:
def merge_sort(A): 
    size = len(A)
    if size > 1:
      m = size // 2
      left = merge_sort(A[m:]) 
      right = merge_sort(A[:m])
      return merge(left, right)
    else:
      return A

In [27]:
def hybrid(A, min_size):
    '''
    This function uses merge_sort if the sub-array size is greater
    than min_size.  Othwerwise, it uses selection_sort.
    '''
    size = len(A)
    
    if size > min_size: #merge_sort is used
      m = size // 2
      left = merge_sort(A[m:]) 
      right = merge_sort(A[:m])
      return merge(left, right)
    
    elif min_size > size and size > 1: #selection_sort is used
        selection_sort(A)
    
    else:
      return A

Checks:

In [28]:
hybrid([33, 1, 55, 2343, -232, 344, 2, 53, -4, 923], 2)

[-232, -4, 1, 2, 33, 53, 55, 344, 923, 2343]

Time the difference between pure merge sort and this new algorithm. Is it faster? Why or why not?

In [32]:
import random
test_list = list(range(0, 1000))
random.shuffle(test_list)

In [33]:
%timeit merge_sort(test_list)

7.8 ms ± 1.12 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [37]:
%timeit hybrid(test_list, 10)

5.48 ms ± 365 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


Although merge_sort is O(nlogn) versus selection_sort (O(n^2)), merge_sort is slower for smaller lists because of the overhead involved in splitting and merging the lists.

# Exercise 2. 

Let $A[1...n]$ be an array of $n$ distinct numbers. If $i < j$ and $A[i] > A[j]$, then the pair $(i, j)$ is called an **inversion** of $A$. 

In other words an inversion is a pair of unsorted elements in an array.

**1)** List the five inversions of $[2, 3, 8, 6, 1]$ 

(8,6) , (6,1) , (2,1) , (3,1) , (8,1)

**2)** Give an algorithm that determines the number of inversions in any permutation on $n$ elements in $O(nlog_2(n))$ worst-case time. (Hint: Modify merge sort.)

In [91]:
def merge(left, right, c=[]):
  res = []
  # Zip in together left and right parts
  while len(left)>0 and len(right)>0: 
      if left[0]<right[0]: 
          res.append(left[0]) 
          left.pop(0)
      else: 
          res.append(right[0])
          for e in left:
              c.append((e, right[0]))
          right.pop(0)
  # Copy in remaining elements of left and right
  # (if there are any)
  for i in left: 
      res.append(i) 
  for i in right: 
      res.append(i)
  return res, c

In [92]:
def merge_sort(A):
    size = len(A)
    if size > 1:
      m = size // 2
      left, c_left = merge_sort(A[:m]) 
      right, c_right = merge_sort(A[m:])
      return merge(left, right, c_left+c_right)
    else:
      return A, []

In [101]:
def inversions(A):
    lst, inv = merge_sort(A)
    return inv, len(inv)

In [102]:
inversions([2,3,8,6,1])

([(6, 1), (8, 1), (8, 6), (2, 1), (3, 1)], 5)

# 3. Recursive sum

Write a function that uses recursion to compute the sum of an array or list of numbers

```
recursive_sum([2, 4, 5, 6, 7])

output: 24
```

In [13]:
def recursive_sum(lst):
    '''
    This function uses recursion to compute the sum of an array
    or list of numbers.
    '''
    if len(lst) == 1:
        return lst[0]
    else:
        return lst.pop() + recursive_sum(lst)

Checks:

In [19]:
recursive_sum([2, 4, 5, 6, 7])

24

# 4. Recursive denominators

Write a Python program that uses recursion to find the greatest common divisor (gcd) of two integers.

```
recursive_gcd(12,14)

output : 2
```

In [126]:
def recursive_gcd(a, b):
    '''
    This function uses recursion to find the greatest
    common divisor (gcd) of 2 integers.
    '''    
    if b==0:
        return a
    else:
        return recursive_gcd(b, a % b)

In [130]:
recursive_gcd(15,6)

3

# 5. Recursive power function

Write a recursive function to calculate the value of 'a' to the power 'b'. 

```
recursive_pow(3, 4)

output: 81
```

In [45]:
def recursive_pow(a, b):
    '''
    This function uses recursion to compute the value
    of 'a' to the power 'b'.
    '''
    if b==0:
        return 1
    if b==1:
        return a
    else:
        return a * recursive_pow(a, b-1)

Checks:

In [46]:
recursive_pow(3, 4)

81

# 6. (Stretch) K-Nearest Neighbours

Consider a matrix with the following format:

```
[[0.3, 0.8],
 [-0.2, 0.5],
 [1, -1],
 [0.9, 0.5]
]
```

Each row denotes a point, and the numbers in each row are the coordinates. The coordinates in this example are in 2d, but the matrix could be in 3d (3 numbers per row) or even higher dimensions.

Your task is to write a function `knn(m, p)` or `k_nearest_neighbors(m, p, k)` which takes in a matrix of points `m`, an integer `p` denoting the index of a point in that matrix, and an intger `k` denoting the number of nearest neighbors to return.

The function returns the index of the `k` nearest neighbors of the point `p` in the matrix `m`.

```
dataset = [[2.7810836,2.550537003,0],
	[1.465489372,2.362125076,0],
	[3.396561688,4.400293529,0],
	[1.38807019,1.850220317,0],
	[3.06407232,3.005305973,0],
	[7.627531214,2.759262235,1],
	[5.332441248,2.088626775,1],
	[6.922596716,1.77106367,1],
	[8.675418651,-0.242068655,1],
	[7.673756466,3.508563011,1]]

knn(dataset, 0, 2)

output : [4, 1]
```

You can use `from sklearn.neighbors import NearestNeighbors` to test your function