# Lab 8
## Data Structures & Algorithms

## Today
* [Divide-and-conquer](#divide-and-conquer)
* [Bubble Sort](#motivation-bubble-sort)
* [Improved Bubble Sort](#improved-bubble-sort)
* [Merge Sort](#merge-sort)
* [A Recurrence Relation](#a-recurrency-relation)
* [Exercises](#exercises)

# Divide-and-conquer 

Divide-and-conquer algorithms are a class of algorithms that solve a problem by:
1. **Divide**: Breaking the problem into smaller, more manageable subproblems.
2. **Conquer**: Solving the subproblems recursively.
3. **Combine**: Merging the solutions of the subproblems to solve the original problem.
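
As a minimal sketch of these three steps (not one of the lab's algorithms), here is a recursive search for the largest value in a list. The function name `dc_max` is made up for this illustration.

```python
def dc_max(arr):
    """Return the largest value in a non-empty list via divide-and-conquer."""
    if len(arr) == 0:
        raise ValueError("dc_max needs at least one element")

    # Base case: a one-element list is its own maximum
    if len(arr) == 1:
        return arr[0]

    # Divide: split the list into two halves
    mid = len(arr) // 2

    # Conquer: solve each half recursively
    left_max = dc_max(arr[:mid])
    right_max = dc_max(arr[mid:])

    # Combine: the overall maximum is the larger of the two partial answers
    return left_max if left_max >= right_max else right_max


print(dc_max([3, 9, 2, 7]))  # 9
```

A plain loop would find the maximum just as quickly; the point here is only to show where divide, conquer and combine sit in the code.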

### Examples

Here's a (highly preliminary) selection of common divide-and-conquer algorithms you might find in the wild:

*NB: for this lab we'll focus on Merge Sort, using brute-force algorithms such as Bubble Sort as the point of comparison.*

- **Merge Sort**: A sorting algorithm that 
    1. divides the array into two halves, 
    2. sorts each half recursively, 
    3. and then merges the sorted halves.

*These others are for your general reference:*

- **Quick Sort**: Another sorting algorithm that: 
    1. selects a pivot element, 
    2. partitions the array around the pivot, 
    3. and then sorts the subarrays recursively.
- **Binary Search**: A searching algorithm that:
    1. compares the target with the middle element of a sorted array,
    2. discards the half that cannot contain the target,
    3. and repeats recursively on the remaining half.
- **Strassen's Algorithm**: A matrix multiplication algorithm that: 
    1. divides matrices into smaller submatrices, 
    2. multiplies the submatrices recursively (using seven products rather than eight), 
    3. and combines their products efficiently.
- **Karatsuba Multiplication**: A multiplication algorithm for two $n$-digit numbers which reduces the process to three multiplications of $\frac{n}{2}$-digit numbers.
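
Of the reference algorithms above, Binary Search is the shortest to write down. The following is a rough recursive sketch, assuming the input list is already sorted in ascending order. The function name and the `lo` / `hi` parameters are chosen for this illustration only.

```python
def binary_search(sorted_arr, target, lo=0, hi=None):
    """Return an index of target in sorted_arr, or -1 if it is absent."""
    if hi is None:
        hi = len(sorted_arr) - 1

    # Base case: the search range is empty, so the target is not present
    if lo > hi:
        return -1

    # Divide: look at the middle element of the current range
    mid = (lo + hi) // 2
    if sorted_arr[mid] == target:
        return mid

    # Conquer: recurse into the only half that can still contain the target
    if sorted_arr[mid] < target:
        return binary_search(sorted_arr, target, mid + 1, hi)
    return binary_search(sorted_arr, target, lo, mid - 1)


print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
print(binary_search([1, 3, 5, 7, 9, 11], 4))  # -1
```

There is no real combine step here: only one half is ever explored, so its answer is returned unchanged.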


## Motivation: Bubble Sort

Before we dive into sorting techniques like divide-and-conquer algorithms (e.g., Merge Sort or Quick Sort), it's useful to first understand a simpler, more intuitive sorting method: Bubble Sort.

**Takeaway here**: brute-force sorts like Bubble Sort are inefficient; divide-and-conquer algorithms can do the same task more efficiently.

**Idea of Bubble Sort**: 

*Bubble Sort is a straightforward (i.e. naive / brute force) sorting algorithm that works by repeatedly swapping adjacent elements if they are in the wrong order. The idea is to "bubble up" the largest element to its correct position in each pass.*
1. **Repeated passes**: We iterate through the array $n$ times (where $n$ is the length of the array). 
    - This ensures that every element has had enough chances to move into the right spot.
2. **Pairwise Swapping**: for each pass:
    - Compare each adjacent pair of elements.
    - If the first element is greater than the second, swap them.

Effect of Swapping:
- After the first full pass, the largest element moves to the last position.
- After the second full pass, the second largest element is in its correct position, and so on.
- This continues until the array is fully sorted.

Here's an example implementation of bubble sort in Python:

In [2]:
def bubble_sort_brute_force(arr):
    """
    Bubble sort 
    
    Parameters
    ----------
    arr : a list of numbers

    Returns
    ----------
    The list sorted in ascending order
    """
    
    arr_temp = list(arr)
    n = len(arr_temp)
    
    for i in range(n): # NB: i index not used again - this just ensures we go through the list n times
        for j in range(n - 1):
            # Get the difference between two adjacent numbers
            diff = arr_temp[j] - arr_temp[j + 1]
            if diff > 0:
                # Swap the two numbers
                arr_temp[j], arr_temp[j + 1] = arr_temp[j + 1], arr_temp[j]               

    return arr_temp

In [3]:
arr_1 = []
arr_2 = [3]
arr_3 = [3, 2]
arr_4 = [3, 2, 1, 4]


print(bubble_sort_brute_force(arr_1))
print(bubble_sort_brute_force(arr_2))
print(bubble_sort_brute_force(arr_3))
print(bubble_sort_brute_force(arr_4))

[]
[3]
[2, 3]
[1, 2, 3, 4]


#### Complexity

The time complexity of bubble sort is $O(n^2)$. 
- In the worst-case scenario, where the input array is in reverse order, bubble sort will need to make $n$ passes through the array, with each pass requiring $O(n)$ comparisons and swaps. 
    - *NB: it's a **brute force** implementation, so it keeps checking every adjacent pair over and over, even when parts of the list are already sorted!*
- So, despite its simplicity, bubble sort is not efficient for sorting large arrays due to its quadratic time complexity.

# Improved Bubble Sort

We can save on some operations: 
- We know that in the first round, the largest number will be moved to the last place of the array. 
- So, in the second round, we do not have to consider the last element in the array as it is the largest element that we moved there in the first round. 
- By extension, in the third round, we can ignore the last two elements.
- ...and so on. 
- To formalise this: in round $i$ (counting rounds from 1), we can ignore the last $i-1$ elements:

In [4]:
def bubble_sort_improved(arr):
    """
    Bubble sort 
    
    Parameters
    ----------
    arr : a list of numbers

    Returns
    ----------
    The list sorted in ascending order
    """
    
    arr_temp = list(arr)
    n = len(arr_temp)
    
    for i in range(n):
        # i counts rounds from 0 here, so the last i elements are already in place:
        # the inner loop only needs n - i - 1 comparisons (before: n - 1)
        for j in range(n - i - 1):
            # Get the difference between two adjacent numbers
            diff = arr_temp[j] - arr_temp[j + 1]
            if diff > 0:
                # Swap the two numbers
                arr_temp[j], arr_temp[j + 1] = arr_temp[j + 1], arr_temp[j]               

    return arr_temp

#### Complexity

NB: this still has running time $O(n^2)$, but it does roughly half the work: $n(n-1)/2$ comparisons instead of $n(n-1)$.
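
Both versions run their inner loop a fixed number of times regardless of the data, so the saving can be counted without sorting anything. The helper below is a hypothetical sketch written for this comparison; its `shrink` flag switches between the two inner-loop bounds used above.

```python
def count_bubble_comparisons(n, shrink):
    """Count the adjacent-pair comparisons bubble sort makes on a list of length n.

    shrink=False mirrors the brute-force inner loop, range(n - 1).
    shrink=True mirrors the improved inner loop, range(n - i - 1).
    """
    comparisons = 0
    for i in range(n):
        # Number of inner-loop iterations in round i
        inner = n - i - 1 if shrink else n - 1
        comparisons += inner
    return comparisons


n = 5000
print(count_bubble_comparisons(n, shrink=False))  # n * (n - 1)     = 24995000
print(count_bubble_comparisons(n, shrink=True))   # n * (n - 1) / 2 = 12497500
```

Halving the number of comparisons changes the constant factor but not the quadratic growth, which is why the big-O bound stays the same.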

# Merge Sort

Merge sort is another sorting algorithm, but one that follows the divide-and-conquer approach. 

Remember, all divide-and-conquer approaches follow the same 3-step structure:
1. **Divide**: Breaking the problem into smaller, more manageable subproblems.
2. **Conquer**: Solving the subproblems recursively.
3. **Combine**: Merging the solutions of the subproblems to solve the original problem.

Merge Sort specifically works by:
1. Dividing the array into two halves, 
2. Sorting each half recursively until reaching the base case (i.e. 1 element; a 1-element list is always sorted!), 
3. Merging the sorted halves.

<div>
   <img src="images/mergesort_viz.png" width="500px" title="mergesort visualisation">
</div>

The fundamental idea is quite simple!

The more complex part is in the implementation: how to *merge* the sorted halves into a new sorted array?

Let's take a simple example: 
- `Left  = [1, 4, 6]`
- `Right = [2, 3, 5]`
- We want to combine into a single sorted list: `Merged = [1, 2, 3, 4, 5, 6]`

Basic idea: **compare the smallest unmerged elements** from both lists, one by one, and insert the smaller one into the result.
If we were to write that concept in Python:
- define 3 indexes, one to track our position in each list (left: `i`, right: `j`, merged: `k`)
- at each step:
    - compare the values at the 2 positions given by `i` (left-list index) and `j` (right-list index),
    - copy whichever value is smaller to `k`'s position (merged-list index),
    - bump the index it came from (either `i` or `j`) up by +1 (so it's now at the next element),
    - bump `k` up by +1 (onto the next free position in the merged list).
- repeat!
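
The merge step on its own can be sketched as a small standalone function. The name `merge` is chosen for this illustration. One detail the bullet points leave implicit is that once one list runs out, the leftovers of the other are copied across unchanged.

```python
def merge(left, right):
    """Merge two lists that are each sorted in ascending order."""
    merged = [None] * (len(left) + len(right))
    i = j = k = 0  # positions in left, right and merged

    # Compare the smallest unmerged element of each list and copy the smaller
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged[k] = left[i]
            i += 1
        else:
            merged[k] = right[j]
            j += 1
        k += 1

    # One list is exhausted: copy whatever remains of the other
    while i < len(left):
        merged[k] = left[i]
        i += 1
        k += 1
    while j < len(right):
        merged[k] = right[j]
        j += 1
        k += 1

    return merged


print(merge([1, 4, 6], [2, 3, 5]))  # [1, 2, 3, 4, 5, 6]
```

This sketch takes from the left list on ties (`<=`), which keeps equal elements in their original relative order; for plain numbers the sorted result is the same either way.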

Here's a full example implementation of Merge Sort in Python:

In [5]:
def merge_sort(arr):
    """
    Merge sort 
    
    Parameters
    ----------
    arr : a list of numbers

    Returns
    ----------
    The list sorted in ascending order
    """
    arr_temp = list(arr)
    n = len(arr_temp)    
    
    if n > 1: 
        # STEP 1: DIVIDE
        # Divide the list into two smaller ones
        # The middle of the list
        mid = n // 2 # using floor division (a.k.a integer division)
        # The left sublist
        arr_temp_left = arr_temp[:mid] 
        # The right sublist
        arr_temp_right = arr_temp[mid:]

        # STEP 2: RECURSIVE CALL (UNTIL N=1)
        # Recursively call merge_sort to sort the two smaller lists
        arr_temp_left = merge_sort(arr_temp_left)
        arr_temp_right = merge_sort(arr_temp_right)
        
        # STEP 3: MERGE  
        # Merge the two sorted smaller lists
        i = j = k = 0
        n_left, n_right = len(arr_temp_left), len(arr_temp_right)
          
        while i < n_left and j < n_right: # this while statement says to keep going through the two lists until one of them is exhausted
            if arr_temp_left[i] < arr_temp_right[j]: 
                arr_temp[k] = arr_temp_left[i] 
                i += 1
            else: 
                arr_temp[k] = arr_temp_right[j] 
                j += 1
            k += 1
          
        # If there are elements in arr_temp_left that have not been visited 
        while i < n_left: 
            arr_temp[k] = arr_temp_left[i] 
            i += 1
            k += 1
 
        # If there are elements in arr_temp_right that have not been visited 
        while j < n_right: 
            arr_temp[k] = arr_temp_right[j] 
            j += 1
            k += 1
            
    return arr_temp

In [6]:
arr_1 = []
arr_2 = [3]
arr_3 = [3, 2]
arr_4 = [3, 2, 1, 4]

print(merge_sort(arr_1))
print(merge_sort(arr_2))
print(merge_sort(arr_3))
print(merge_sort(arr_4))

[]
[3]
[2, 3]
[1, 2, 3, 4]


### Time Complexity Analysis

Merge Sort has a time complexity of $O(n \log n)$ in the best, average and worst case, making it asymptotically more efficient than Bubble Sort's $O(n^2)$. 

It achieves this time complexity by dividing the array into halves recursively and merging the sorted halves *efficiently*.

Let's look at this in more detail: remember that for divide-and-conquer algorithms, we use a **recurrence relation** to express the running time.
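
As a sketch of that recurrence, assume for simplicity that $n$ is a power of two and that splitting and merging a list of length $n$ costs at most $cn$ for some constant $c$ (both are simplifying assumptions). Each call does that linear work and makes two recursive calls on lists of half the size:

$$
T(n) \le 2\,T\!\left(\frac{n}{2}\right) + cn, \qquad T(1) \le c
$$

Unrolling the recurrence, level $k$ of the recursion has $2^k$ subproblems of size $n/2^k$, so every level costs $cn$ in total. There are $\log_2 n$ levels above the base case, and the $n$ base cases add a further $cn$:

$$
T(n) \le cn \log_2 n + cn = O(n \log n)
$$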

# Timed example!

In [9]:
import random
import time

random.seed(42)
long_arr = [random.randint(0, 10_000) for _ in range(5000)]

# let's time bubble sort (perf_counter is a monotonic, high-resolution clock intended for timing)
start_bubble = time.perf_counter()
bubble_sort_brute_force(long_arr)
end_bubble = time.perf_counter()
bubble_time = end_bubble - start_bubble

# ...and merge sort
start_merge = time.perf_counter()
merge_sort(long_arr)
end_merge = time.perf_counter()
merge_time = end_merge - start_merge

# Print results
print(f"Bubble Sort Time: {bubble_time:.4f} seconds")
print(f"Merge Sort Time:  {merge_time:.4f} seconds")
print(f"Merge sort is {bubble_time / merge_time:.2f} times faster than bubble sort")

Bubble Sort Time: 1.2219 seconds
Merge Sort Time:  0.0045 seconds
Merge sort is 272.61 times faster than bubble sort


## Exercises

### Exercise 1

Extend the bubble sort algorithm from above to make it more efficient. 

When we pass an already sorted array to the implementation of the algorithm above, it always goes through the array $n$ times, which is unnecessary. In this even further improved version, we want to make sure that after each round of passing through the array, we first check if the array has already been sorted and we terminate the algorithm as soon as we find that it has. Implement this according to the following idea:

* remember that we compare each pair of adjacent elements in the array and do a swap if the first is larger than the second
* this means that if no two elements were swapped in a round, the array is already sorted!
* include a variable in your code that indicates if any swap was made in a round
* then terminate the algorithm once the flag variable has not recorded any swaps

In [6]:
def bubble_sort_optimal(arr):
    """
    Bubble sort 
    
    Parameters
    ----------
    arr : a list of numbers

    Returns
    ----------
    The list sorted in ascending order
    """

    # Implement me
    arr_temp = list(arr)
    n = len(arr_temp)
    
    for i in range(n):
        # If there are numbers that were swapped, False by default
        swapped = False
        
        for j in range(n - i - 1):
            # Get the difference between two adjacent numbers
            diff = arr_temp[j] - arr_temp[j + 1]
            if diff > 0:
                # Swap the two numbers
                arr_temp[j], arr_temp[j + 1] = arr_temp[j + 1], arr_temp[j]    
                
                # There are numbers that were swapped
                swapped = True
        
        # No swaps in this pass means the list is already sorted, so stop early
        if not swapped:
            break

    return arr_temp

In [7]:
arr_1 = []
arr_2 = [3]
arr_3 = [3, 2]
arr_4 = [3, 2, 1, 4]

print(bubble_sort_optimal(arr_1))
print(bubble_sort_optimal(arr_2))
print(bubble_sort_optimal(arr_3))
print(bubble_sort_optimal(arr_4))

[]
[3]
[2, 3]
[1, 2, 3, 4]


### Exercise 2

Modify the merge sort algorithm to sort the elements in descending order.

In [None]:
def merge_sort_desc(arr):
    """
    Merge sort in descending order
    
    Parameters
    ----------
    arr : list
        A list of numbers
    
    Returns
    -------
    list
        The list sorted in descending order
    """
    arr_temp = list(arr)
    n = len(arr_temp)    
    
    if n > 1: 
        # Divide the list into two smaller ones
        mid = n // 2
        arr_temp_left = arr_temp[:mid] 
        arr_temp_right = arr_temp[mid:]
  
        # Recursively call merge_sort_desc to sort the two smaller lists
        arr_temp_left = merge_sort_desc(arr_temp_left)
        arr_temp_right = merge_sort_desc(arr_temp_right)
          
        # Merge the two sorted smaller lists
        i = j = k = 0
        n_left, n_right = len(arr_temp_left), len(arr_temp_right)
          
        while i < n_left and j < n_right: 
            # CHANGED PART OF THE CODE: Now sorting in descending order
            if arr_temp_left[i] > arr_temp_right[j]: 
                arr_temp[k] = arr_temp_left[i] 
                i += 1
            else: 
                arr_temp[k] = arr_temp_right[j] 
                j += 1
            k += 1
          
        # If there are elements in arr_temp_left that have not been visited 
        while i < n_left: 
            arr_temp[k] = arr_temp_left[i] 
            i += 1
            k += 1
 
        # If there are elements in arr_temp_right that have not been visited 
        while j < n_right: 
            arr_temp[k] = arr_temp_right[j] 
            j += 1
            k += 1
            
    return arr_temp

In [9]:
arr_1 = []
arr_2 = [3]
arr_3 = [2, 3]
arr_4 = [3, 2, 1, 4]

print(merge_sort_desc(arr_1))
print(merge_sort_desc(arr_2))
print(merge_sort_desc(arr_3))
print(merge_sort_desc(arr_4))

[]
[3]
[3, 2]
[4, 3, 2, 1]


### Exercise 3

An **inversion** in an array is a pair of elements `array[i]` and `array[j]` with `i < j` and `array[i] > array[j]`. E.g. `array = [3,1,2]` has two inversions: `(3,1)` and `(3,2)`. In other words, an inversion is any pair of elements that violates the ascending order of the elements.

Implement an algorithm for counting inversions in a naive way, where you go through every single pair of elements and check if it is an inversion. If it is, increase a counter by 1.

In [10]:
def count_inversions_brute_force(arr):
    """
    Count inversions in an array using the brute-force approach.
    
    An inversion is a pair of elements (arr[i], arr[j]) where i < j
    and arr[i] > arr[j].
    
    Parameters
    ----------
    arr : list
        A list of numbers.

    Returns
    -------
    int
        The number of inversions in the array.
    """
    inv_count = 0  # Initialize inversion count
    n = len(arr)
    
    # Iterate through each pair of elements
    for i in range(n): 
        for j in range(i + 1, n): 
            # If arr[i] is greater than arr[j], increment inversion count
            if arr[i] > arr[j]: 
                inv_count += 1
  
    return inv_count

In [11]:
arr_1 = [1, 2, 3, 4, 5]
arr_2 = [5, 4, 3, 2, 1]
arr_3 = [2, 3, 8, 6, 1]

print(count_inversions_brute_force(arr_1))
print(count_inversions_brute_force(arr_2))
print(count_inversions_brute_force(arr_3))

0
10
5


### Exercise 4

What is the time and space complexity of this algorithm? 

* Time Complexity: $O(n^2)$ because the two nested loops check every one of the $n(n-1)/2$ pairs of elements.
* Space Complexity: $O(1)$ because only a counter and the loop indices are stored, regardless of the input size.
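
To make the quadratic bound concrete: for each 0-based index $i$, the inner loop visits the $n - 1 - i$ indices that come after it, so the total number of pairs checked is

$$
\sum_{i=0}^{n-1} (n - 1 - i) = \frac{n(n-1)}{2}
$$

For $n = 5$ this gives $10$, which matches the count reported above for `[5, 4, 3, 2, 1]`, where every pair is an inversion.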

### Exercise 5

Now, implement the counting inversions algorithm so that it runs in $O(n \log n)$ using divide-and-conquer. In the end, it should return the total number of inversions and the sorted input array. There's a very helpful section in the Algorithm Design (Kleinberg & Tardos) textbook (in chapter 5). The following hints for the implementations (and the solutions that will be provided) come from [this](https://www.geeksforgeeks.org/python-program-for-count-inversions-in-an-array-set-1-using-merge-sort/) website. Do try this yourself before looking at the implementation!

* the idea of divide-and-conquer is always to recursively divide the array into subarrays
* imagine that we divide an array into two subarrays and manage to find the number of inversions for each
* to find the total number of inversions, we are then only missing the inversions that need to be counted across the two subarrays (i.e. in the 'combination' or 'merge' step of the divide-and-conquer algorithm)
* so the total number of inversions is the number of inversions in the left subarray, plus those in the right subarray, plus those counted in merge().
* to get the number of inversions in merge(): let `i` index the left subarray and `j` the right subarray, and let `mid` be the length of the left subarray. At any step in merge(), if `left[i]` is greater than `right[j]`, then there are `mid - i` inversions involving `right[j]`: because the left subarray is sorted, all of its remaining elements (`left[i]`, `left[i+1]`, …, `left[mid-1]`) are also greater than `right[j]`.

To deal with the last part of the algorithm (counting the inversions in 'merge'), first write a merge-and-count function according to the following pseudocode from the Algorithm Design textbook. Note that this is a **very** similar algorithm to a part of the merge sort algorithm we looked at above (except you also have to keep track of the inversions as you merge the two arrays).

```
    Merge-and-Count(A,B)
        Maintain a Current pointer into each list, initialized to point to the front elements (e.g. use i and j, that are both 0 to start with)
        Maintain a variable Count for the number of inversions, initialized to 0
        While both lists are nonempty:
            Let ai and bj be the elements pointed to by the Current pointers, ai = A[i] and bj = B[j]
            Append the smaller of these two to the output list
            If bj is the smaller element:
                Increment Count by the number of elements remaining in A
            Endif
            Advance the Current pointer in the list from which the smaller element was selected.
        EndWhile
        Once one list is empty, append the remainder of the other list to the output
        Return Count and the merged list
```

In [12]:
def merge_and_count(A, B):
    """
    Merge two sorted lists and count inversions
    
    Parameters
    ----------
    A : list
        A sorted list.
    B : list
        Another sorted list.

    Returns
    ----------
    tuple
        A tuple containing the number of inversions and the merged sorted list.
    """

    # initialise what will be the merged list
    len_C = len(A) + len(B)
    C = [0] * len_C

    # initialise the number of inversions
    inversions = 0  

    # initialise the pointers
    i = j = k = 0
    n_left, n_right = len(A), len(B)
      
    while i < n_left and j < n_right: 
        if A[i] <= B[j]: 
            C[k] = A[i] 
            i += 1
        else: 
            C[k] = B[j] 
            j += 1
            # increment the number of inversions by the number of elements remaining in A
            inversions += len(A) - i 
        k += 1
      
    # If there are elements in A that have not been visited 
    while i < n_left: 
        C[k] = A[i] 
        i += 1
        k += 1

    # If there are elements in B that have not been visited 
    while j < n_right: 
        C[k] = B[j] 
        j += 1
        k += 1
    
    return inversions, C

In [13]:
A = [1, 3, 5, 7]
B = [2, 4, 6, 8]
inversions, merged_list = merge_and_count(A, B)
print("Merged list:", merged_list)
print("Number of inversions:", inversions)

Merged list: [1, 2, 3, 4, 5, 6, 7, 8]
Number of inversions: 6


### Exercise 6

We now use the function written in the last exercise to write the algorithm for counting inversions, which we call `sort_and_count`.

Again, use the pseudocode from the textbook as a helper:

```
Sort-and-Count(L)
If the list has one element:
    there are no inversions
Else
    Divide the list into two halves:
        A contains the first ⌈n/2⌉ elements
        B contains the remaining ⌊n/2⌋ elements
    (rA, A) = Sort-and-Count(A)
    (rB, B) = Sort-and-Count(B)
    (r,L) = Merge-and-Count(A,B)
Endif
Return r = rA + rB + r, and the sorted list L
```

In [14]:
def sort_and_count(L):
    """
    Sort a list and count inversions using divide-and-conquer approach
    
    Parameters
    ----------
    L : list
        A list of elements.

    Returns
    ----------
    tuple
        A tuple containing the number of inversions and the sorted list.
    """
    
    n = len(L)

    # Base case: a list with 0 or 1 elements has no inversions and is already sorted
    # (checking n <= 1 rather than n == 1 avoids infinite recursion on an empty list)
    if n <= 1:
        return 0, L
    
    # Divide the list into two halves
    mid = n // 2
    A = L[:mid]
    B = L[mid:]
    
    # Recursively sort and count inversions in each half 
    rA, A = sort_and_count(A)
    rB, B = sort_and_count(B)
    
    # Merge the sorted halves and count inversions
    inversions, sorted_list = merge_and_count(A, B)
    
    # Return the total count of inversions and the sorted list
    return rA + rB + inversions, sorted_list

In [15]:
arr_1 = [1, 2, 3, 4, 5]
arr_2 = [5, 4, 3, 2, 1]
arr_3 = [2, 3, 8, 6, 1]

print(sort_and_count(arr_1))
print(sort_and_count(arr_2))
print(sort_and_count(arr_3))

(0, [1, 2, 3, 4, 5])
(10, [1, 2, 3, 4, 5])
(5, [1, 2, 3, 6, 8])
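
One way to gain confidence in a divide-and-conquer implementation is to compare it against the brute-force version on many random inputs. The sketch below does this with compact re-implementations of both approaches so that it runs on its own. The names `brute_force_inversions` and `dc_inversions` are made up for this check, and the random sizes and value ranges are arbitrary choices.

```python
import random


def brute_force_inversions(arr):
    """Count inversions by checking every pair (i, j) with i < j."""
    return sum(
        1
        for i in range(len(arr))
        for j in range(i + 1, len(arr))
        if arr[i] > arr[j]
    )


def dc_inversions(arr):
    """Return (inversion count, sorted copy) using a merge-sort-style recursion."""
    # Base case: 0 or 1 elements means no inversions
    if len(arr) <= 1:
        return 0, list(arr)

    mid = len(arr) // 2
    left_count, left = dc_inversions(arr[:mid])
    right_count, right = dc_inversions(arr[mid:])

    merged, split_count = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            # Every element still remaining in left is larger than the one just taken
            split_count += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return left_count + right_count + split_count, merged


random.seed(0)
for _ in range(200):
    size = random.randint(0, 30)
    data = [random.randint(0, 20) for _ in range(size)]
    count, ordered = dc_inversions(data)
    assert count == brute_force_inversions(data)
    assert ordered == sorted(data)
print("all checks passed")
```

Including size 0 and repeated values in the random inputs exercises the two cases that are easiest to get wrong: the empty-list base case and the handling of ties in the merge.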
