# Arrays: Partitions
1. Lomuto's | `Medium`
2. Hoare's  | `Hard`

## Lomuto's
1. Step 1:
    - Pick a random pivot and swap it to `a[high]`, out of the way. Why random? Because it makes the _worst case scenario_ unlikely (see details at end of section).
    - ```python
        rix = random.randint(low, high)
        pivot = a[rix]
        swap(a, rix, high)
    ```
2. Step 2:
    - Define a sliding window
    - ```python
        i = low
        for j in range(low, high):
      ```
    - `i` points to the **lowest** index we can swap into. All elements left of `i` are less than or equal to the pivot value. It is only *manually incremented*.
    - `j` points to the value currently being compared with the pivot. The `for` loop *increments it automatically*.
3. Step 3:
    - Evaluate the n'th value relative to the pivot value
    - ```python
        if a[j] <= pivot:
            if j > i:
                swap(a, j, i)
            i += 1
      ```
    - <img src="https://imgur.com/IKGmgin.png" style="max-width:500px">
    - In the image above `i` = **lower pointer**, and `j` = **upper pointer**
    - <img src="https://imgur.com/vw4r16B.png" style="max-width:500px">
    - The pivot is `8`. The image shows the state just after `4` has been compared with `8`. Since `4` is lte `8`, we increment `i`, and the `for` loop automatically pushes `j` forward. Next, since `9` is gt `8`, `i` will **not** move forward.
    - <img src="https://imgur.com/uIwpXqP.png" style="max-width:500px">
    - Now since `1` is lt `8`, we'll **swap** `1` with `9`. The result is that `1` ends up to the left of `9`, fulfilling the contract of the partition algo:
    > After iterating, all values left of the pivot will be less than or equal to the pivot.
    - <img src="https://imgur.com/ckgxRF3.png" style="max-width:500px">
    - As we can see, `1` has been swapped to the left of `9`, and `j` is now pointing to `7`, which will also be swapped with `9` since it's also lt the pivot.
4. Step 4:
    - Put the pivot value in its final resting place, and return the index of the pivot.
    - ```python
        swap(a, high, i)
      ```
    - `i` has been tracking the left-most index whose value is greater than the pivot (everything left of `i` is lte the pivot), so once we've looked at all values, `i` is exactly where the `pivot` value belongs. The pivot has been sitting at `a[high]`, out of the way, while we worked out where to put it.

#### Key Ideas
1. Pick a random index, and swap its value to the far-right (out of the way).
2. `i` is finding the correct spot for the pivot.
3. `j` is looking at the `n'th` value in the current array iteration.
4. All values between `i` & `j` are greater than the pivot.

In [3]:
import random


def swap(a, l, r):
    print(f'Swapping: {a[l]} w/ {a[r]}')
    a[l], a[r] = a[r], a[l]

def lomutos_partition(a, low, high):
    rix = random.randint(low, high)  # stay inside the window being partitioned
    pivot = a[rix]
    print('Pivot val = ', pivot)
    swap(a, rix, high)
    i = low
    for j in range(low, high):
        if a[j] <= pivot:
            if j > i:
                swap(a, j, i)
            i += 1
    swap(a, high, i)
    return i

a = [3, 4, 9, 1, 7, 0, 5, 2, 6, 8]
print(a)
low = 0
high = len(a) - 1
lomutos_partition(a, low, high)
print(a)


[3, 4, 9, 1, 7, 0, 5, 2, 6, 8]
Pivot val =  8
Swapping: 8 w/ 8
Swapping: 1 w/ 9
Swapping: 7 w/ 9
Swapping: 0 w/ 9
Swapping: 5 w/ 9
Swapping: 2 w/ 9
Swapping: 6 w/ 9
Swapping: 8 w/ 9
[3, 4, 1, 7, 0, 5, 2, 6, 8, 9]
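
The contract from Step 3 can also be checked mechanically. Below is a minimal, self-contained sketch of the same partition without the logging. The helper name `is_partitioned` is hypothetical, and the pivot index is drawn from `low..high` so the function can be called on a sub-range as well.

```python
import random


def lomuto_partition(a, low, high):
    # Pick the pivot inside the current window and park it at a[high].
    rix = random.randint(low, high)
    a[rix], a[high] = a[high], a[rix]
    pivot = a[high]
    i = low  # next slot for a value <= pivot
    for j in range(low, high):
        if a[j] <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[high] = a[high], a[i]  # move the pivot to its final index
    return i


def is_partitioned(a, low, high, p):
    # Hypothetical helper: values left of p are <= a[p], values right of p are > a[p].
    return (all(x <= a[p] for x in a[low:p])
            and all(x > a[p] for x in a[p + 1:high + 1]))


a = [3, 4, 9, 1, 7, 0, 5, 2, 6, 8]
p = lomuto_partition(a, 0, len(a) - 1)
print(is_partitioned(a, 0, len(a) - 1, p))  # True, whichever pivot was drawn
```

The check passes for any pivot choice, which is the point: randomness changes where the pivot lands, not whether the contract holds.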


### Worst Case Runtime Scenario
Naive versions of Quick Sort use the leftmost (or rightmost) element as the pivot. With that choice, the worst case occurs in the following situations.

1. Array is already sorted in the same order.
2. Array is already sorted in reverse order.
3. All elements are the same (a special case of cases 1 and 2)

The first two cases are easily avoided by choosing one of the following as the pivot:

1. A **random index** for the pivot
2. The middle index of the partition
    - A useful technique in problems where we don't want to sort, but rather, we want to eliminate batches of elements we don't care about while looking for a particular element. E.g. keep the right half, throw away the left half.
3. The median of the first, middle and last element of the partition
    - (especially for longer partitions)

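With these modifications, the worst case of Quicksort is much less likely, but it can still occur if the input array is such that the maximum (or minimum) element is always chosen as the pivot. Case 3 is one such input for the `<=` comparison used above: when all elements are equal, every element moves left of the pivot no matter which index is picked.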
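
As an illustration of the third option, here is a minimal sketch of median-of-three pivot selection. The function name `median_of_three_index` is an assumption made for this example; it only picks the index, and the caller would then swap that index to `high` exactly as in Step 1.

```python
def median_of_three_index(a, low, high):
    """Return whichever of low, mid, high holds the median of those three values."""
    mid = (low + high) // 2
    # Sort the three (value, index) pairs and keep the index of the middle one.
    candidates = sorted([(a[low], low), (a[mid], mid), (a[high], high)])
    return candidates[1][1]


already_sorted = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(median_of_three_index(already_sorted, 0, 9))  # 4
```

On an already sorted (or reverse sorted) window this picks the middle index, so the partition splits roughly in half instead of peeling off one element at a time.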

## Hoare's | [Video](https://www.youtube.com/watch?v=NuQYFXmLUrM&ab_channel=BukanCaraCepat)

1. Step 1:
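    - Assign the `pivot` as the middle element
    - ```python
        pivot = a[(left + right) // 2]
      ```
    - With `left = 0` and `right = len(a) - 1` (the last index of the window, not its length) this is the same as `a[right // 2]`. Adding `left` keeps the pivot in the middle when the window does not start at index 0.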
2. Step 2:
    - Iterate as long as the left & right pointers have NOT crossed swords.
    - ```python
        while left <= right:
      ```
3. Step 3:
    - Check values on the left of the pivot...
    - If the left value is less than the pivot, it means the pivot is in the right spot in relation to that left value, so move the left pointer forward
    - ```python
        if pivot > a[left]:
            left += 1
      ```
4. Step 4:
    - Check the values on the right of the pivot...
    - Otherwise, we know we need to swap `a[left]` with some value right of the pivot, but we don't know which `a[right]` to use until we find one that's lte the pivot.
    - ```python
        elif pivot <= a[left]:
            while pivot < a[right]:
                right -= 1
      ```
    - The code above shows that we check the right value, and if it's greater than the pivot, then it's in the right spot in relation to the pivot, so we move the right pointer back (to the left). Once we find a value that's lte the pivot, we know we can swap that value with the value at the left pointer.
    - ```python
        swap(a, left, right)
        left += 1
        right -= 1
       ```
5. Step 5:
    - Verify that the value to the left of the pivot is in fact not greater than the pivot. If it is, make one last swap. This can happen because of the outermost `while` loop.
    - ```python
        if pivot < a[right]:
            swap(a, left, right)
            return right
        return left
      ```
    - Take note that `a[right]` in the above snippet is actually pointing to the value **left** of the pivot's index (when the pointers end up adjacent, `right == left - 1`). We could write it more literally as ...
    ```python
        if pivot < a[left - 1]:
            swap(a, left, left - 1)
            return left - 1
        return left
    ```
    but that's also a bit hard to read & think about.

In [40]:
def swap(a, l, r):
    print(f'Swapping: {a[l]} w/ {a[r]}')
    a[l], a[r] = a[r], a[l]

def hoares_partition(a, left, right):
    pivot = a[(left + right) // 2]  # middle of the window; same as a[right // 2] when left == 0
    while left <= right:
        if pivot > a[left]:
            left += 1
        elif pivot <= a[left]:
            while pivot < a[right]:
                right -= 1
            swap(a, left, right)
            left += 1
            right -= 1
    if pivot < a[right]:
        swap(a, left, right)
        return right
    return left
    # Performed 2 swaps & 7 is in final spot

a = [3, 4, 9, 1, 7, 0, 5, 2, 6, 8]
left = 0
right = len(a) - 1
hoares_partition(a, left, right)
print(a)  # [3, 4, 6, 1, 2, 0, 5, 7, 9, 8]


Swapping: 9 w/ 6
Swapping: 7 w/ 2
[3, 4, 6, 1, 2, 0, 5, 7, 9, 8]
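
The variant in the cell above is useful for seeing how the pointers move, but it is not guaranteed to return an index whose value is in its final sorted position. For example, `hoares_partition([3, 1, 2], 0, 2)` returns `1` and leaves the list as `[1, 3, 2]`. For comparison, here is a minimal sketch of the textbook form of Hoare's scheme. The names `hoare_partition_classic` and `quick_sort_classic` are assumptions made for this illustration. Note that it returns a split point rather than the pivot's final index, so the recursion keeps that index in the left half.

```python
def hoare_partition_classic(a, low, high):
    # Textbook Hoare scheme (not the walkthrough's variant). Returns a split
    # index s with max(a[low..s]) <= min(a[s+1..high]); the pivot value itself
    # may sit on either side of the split.
    pivot = a[(low + high) // 2]
    i, j = low - 1, high + 1
    while True:
        i += 1
        while a[i] < pivot:
            i += 1
        j -= 1
        while a[j] > pivot:
            j -= 1
        if i >= j:  # pointers met or crossed: stop before swapping
            return j
        a[i], a[j] = a[j], a[i]


def quick_sort_classic(a, low, high):
    # Recurse on [low, s] and [s + 1, high]; s stays in the left half.
    if low < high:
        s = hoare_partition_classic(a, low, high)
        quick_sort_classic(a, low, s)
        quick_sort_classic(a, s + 1, high)


a = [3, 1, 2]
quick_sort_classic(a, 0, len(a) - 1)
print(a)  # [1, 2, 3]
```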


### Return the Index
- The purpose of partition algorithms is to divide a set of values into upper & lower halves. The pivot index is the index that divides the two halves.
- In `QuickSort` we recursively partition smaller and smaller halves to the left and right of the last pivot, to eventually sort the entire array.
- In batching problems, where we want the top `k` elements of a list, we can use the partition algo to split the elements into a batch of the `k` largest and a batch of everything else, without fully sorting either.
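
The batching idea in the last bullet can be sketched as follows. This is a minimal illustration that compares plain values, assuming `1 <= k <= len(a)`; the names `partition_by_value` and `k_largest_values` are hypothetical.

```python
import random


def partition_by_value(a, low, high):
    # Lomuto partition with a random pivot; returns the pivot's final index.
    r = random.randint(low, high)
    a[r], a[high] = a[high], a[r]
    pivot, i = a[high], low
    for j in range(low, high):
        if a[j] <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[high] = a[high], a[i]
    return i


def k_largest_values(values, k):
    # Work on a copy and narrow the window until the pivot lands on `target`.
    a = list(values)
    target = len(a) - k
    low, high = 0, len(a) - 1
    while low < high:
        p = partition_by_value(a, low, high)
        if p == target:
            break
        if p < target:
            low = p + 1   # discard the left batch
        else:
            high = p - 1  # discard the right batch
    return a[target:]     # the k largest values, in no particular order


print(sorted(k_largest_values([3, 4, 9, 1, 7, 0, 5, 2, 6, 8], 3)))  # [7, 8, 9]
```

Only the side of the pivot that still contains index `target` is partitioned again, which is why the returned index matters more here than the ordering inside each batch.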

# Practice Problems
1. QuickSort
2. LeetCode # 75. Sort Colors
    - Dutch National Flag
3. Top K Largest Elements


#### QuickSort
- Shown are both partitions
- I chose `hoares` as I think it's the easiest to read, and the most terse.

In [41]:
import random

def swap(a, l, r):
    a[l], a[r] = a[r], a[l]

def lomutos_partition(a, left, right):
    r = random.randint(left, right)
    swap(a, r, right)
    i, pivot = left, a[right]  # start at the window's left edge, not index 0
    for j in range(left, right):
        if a[j] <= pivot:
            if j > i: swap(a, i, j)
            i += 1
    swap(a, right, i)
    return i

def hoares_partition(a, left, right):
    pivot = a[left]
    while left < right:
        while a[left] < pivot: left += 1
        while a[right] > pivot: right -= 1
        if a[left] == a[right]:
            left += 1  # both equal the pivot: step past the duplicate instead of swapping forever
        else:
            swap(a, left, right)
    return right

def quick_sort(arr, start=None, end=None):
    if start is None and end is None:
        start, end = 0, len(arr) - 1
    if start < end:
        pivot = hoares_partition(arr, start, end)
        quick_sort(arr, start, pivot - 1)
        quick_sort(arr, pivot + 1, end)
    return arr

quick_sort([0, 6, 1, 4, 2, 3, 5, 7, 9, 8])

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

#### Dutch National Flag
1. The solution doesn't use either Hoare's or Lomuto's verbatim; however, we can borrow the idea behind Hoare's partition, which turns out to be rather straightforward.
2. We keep 3 pointers to subdivide the input into three chunks, much like Hoare's algorithm subdivides it into `left` and `right`.
3. We let the middle section's pointer, `igreen`, be the main pointer. This is a distinct difference from the QuickSort implementation. We compare the value at `igreen` with each colour to decide where it belongs.
4. Whenever the current element needs to be moved to the left (it is red), we swap it with the left-most green element at `ired`, and then move both `ired` and `igreen` forward. Whenever it needs to be moved to the right (it is blue), we swap it with the element at `iblue` and move only `iblue` back, because the value swapped in has not been examined yet.

In [43]:
def swap(a, l, r):
    a[l], a[r] = a[r], a[l]

def dutch_flag_sort(balls):
    red, green, blue = 'R', 'G', 'B'
    ired, igreen, iblue = 0, 0, len(balls) - 1
    while igreen <= iblue:
        if balls[igreen] == red:
            swap(balls, igreen, ired)
            ired += 1
            igreen += 1
        elif balls[igreen] == green:
            igreen += 1
        elif balls[igreen] == blue:
            swap(balls, igreen, iblue)
            iblue -= 1
    return balls

dutch_flag_sort(["G", "B", "G", "G", "R", "B", "R", "G"])

['R', 'R', 'G', 'G', 'G', 'G', 'B', 'B']
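
The same three-pointer loop carries over to LeetCode 75 (Sort Colors), where the colours are encoded as the integers `0`, `1` and `2`. A minimal sketch, with the pointer names `low`, `mid` and `high` chosen for this example:

```python
def sort_colors(nums):
    # low: next slot for a 0, mid: element under inspection, high: next slot for a 2.
    low, mid, high = 0, 0, len(nums) - 1
    while mid <= high:
        if nums[mid] == 0:
            nums[low], nums[mid] = nums[mid], nums[low]
            low += 1
            mid += 1
        elif nums[mid] == 1:
            mid += 1
        else:
            # The value swapped in from `high` is unexamined, so `mid` stays put.
            nums[mid], nums[high] = nums[high], nums[mid]
            high -= 1
    return nums


print(sort_colors([2, 0, 2, 1, 1, 0]))  # [0, 0, 1, 1, 2, 2]
```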

#### Top K Largest Elements
- In quicksort, we pick a pivot element and rearrange the array around it such that the pivot element comes at its correct position, all the elements less than or equal to the pivot move to its left and all the elements greater than the pivot move to its right. We perform this operation till all the elements in the array get sorted.
- But note that, here we do not have to sort the elements based on their values, rather we have to sort all the unique elements present in the array based on their frequencies. So, any two elements **will be compared based on their frequencies and not on their values**.
- We will first store the frequencies of all the distinct elements present in the array in a hashmap and using this hashmap, we will build an array of unique elements present in the array. Let us call this array `unique`.
- Note that we do not actually need to sort `unique` completely. We only need the top `k` most frequent elements. So if, after a partition, the pivot ends up at index `unique_size - k`, then two things are guaranteed:
    1. The pivot element is at its correct position
    2. All the elements on the right of the pivot have frequencies greater than the pivot.
- So the pivot element along with all elements on its right are the top `k` most frequent elements. So, we repeat until the partitioning gets us the `k`-th most frequent elements. If the pivot does not end up at index `unique_size - k`, we may have two more possibilities as follows:
    - `pivot_index > unique_size - k`: In this case, the frequency of the `k`-th most frequent element will be less than or equal to that of the pivot element. Since all the elements with frequency less than or equal to the frequency of the pivot element lie towards the left of it, we will discard the right subarray and recurse through the left subarray.
    - `pivot_index < unique_size - k`: In this case, the frequency of the `k`-th most frequent element will be greater than that of the pivot element. Since all the elements with a frequency greater than the frequency of the pivot element lie towards the right of it, we will discard the left subarray and recurse through the right subarray.
We will perform the above steps till the pivot ends up at index `unique_size - k`.
- We will use **Lomuto's algorithm** to partition the array. Lomuto's partitioning algorithm picks a random element from the array and partitions the array around it.
- **Time Complexity**
    - O(n^2).
    - This algorithm will have a worst-case time complexity of O(n^2). For example, if all of the elements in the array are unique and we have k = n, then `pivot_index` will always end up at a value equal to `high`. Therefore in this case, our solution will perform the partitioning O(n) number of times. Since a single partitioning process takes O(n) amount of time, the overall worst-case time complexity will become equal to O(n^2).
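
The worst case described above can be made concrete by counting the work done. The sketch below is an instrumented illustration, not the solution cell itself: `partition_counting`, `count_work` and the `stats` dictionary are hypothetical names added for the measurement.

```python
import random
from collections import Counter


def partition_counting(arr, low, high, freq, stats):
    # Lomuto partition on frequencies that also counts comparisons.
    r = random.randint(low, high)
    arr[r], arr[high] = arr[high], arr[r]
    pivot_freq, i = freq[arr[high]], low
    for j in range(low, high):
        stats["comparisons"] += 1
        if freq[arr[j]] <= pivot_freq:
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
    arr[i], arr[high] = arr[high], arr[i]
    return i


def count_work(values, k):
    # Run the selection loop and report how many partitions and comparisons it took.
    freq = Counter(values)
    unique = list(freq)
    stats = {"partitions": 0, "comparisons": 0}
    low, high, target = 0, len(unique) - 1, len(unique) - k
    while low <= high:
        stats["partitions"] += 1
        p = partition_counting(unique, low, high, freq, stats)
        if p == target:
            break
        if p > target:
            high = p - 1
        else:
            low = p + 1
    return stats


n = 100
print(count_work(range(n), k=n))  # {'partitions': 100, 'comparisons': 4950}
```

With every frequency equal to 1 the pivot always lands on `high`, so the window shrinks by one element per partition and the comparisons add up to 99 + 98 + ... + 0 = n(n - 1)/2.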

In [52]:
from collections import Counter
import random

def swap(arr, l, r):
    arr[l], arr[r] = arr[r], arr[l]

def partition(arr, low, high, freq):
    # Lomuto partition on frequencies: returns the pivot's final index.
    random_pivot = random.randint(low, high)
    pivot_freq = freq[arr[random_pivot]]
    swap(arr, random_pivot, high)
    i = low
    for j in range(low, high):  # visit every slot before the parked pivot
        if freq[arr[j]] <= pivot_freq:
            swap(arr, i, j)
            i += 1
    swap(arr, i, high)
    return i

def quick_select(arr, k, freq):
    low, high = 0, len(arr) - 1
    while low <= high:
        pivot = partition(arr, low, high, freq)
        if pivot == len(arr) - k:
            return
        if pivot > len(arr) - k:
            high = pivot - 1
        else:
            low = pivot + 1

def k_largest(arr, k):
    freq = Counter(arr)
    unique = list(freq.keys())
    quick_select(unique, k, freq)
    result = []
    # The last k slots hold the k most frequent values; walk them right to left.
    for i in range(len(unique) - 1, len(unique) - k - 1, -1):
        result.append(unique[i])
    return result


# 1 appears three times, 2 twice and 3 once, so the answer has no ties.
k_largest([1, 1, 1, 2, 2, 3], 2)


[1, 2]

In [46]:
from collections import Counter

def k_largest_elements(arr, k):
    freq = sorted(Counter(arr).items(), key=lambda item: item[1], reverse=True)
    k_largest = freq[0: k]
    return [num for num, freq in k_largest]

k_largest_elements([1, 2, 3, 2, 4, 3, 1], 2)


[1, 2]