### Comb Sort

- Comb sort is essentially bubble sort, except that it compares elements a gap apart rather than only adjacent elements

- Recall that in bubble sort:
    - We pass over the $N$-element input array up to $N-1$ times
    - In each pass, going from left to right, we compare the value at index `i` with its neighbour at `i+1`. If `arr[i]` exceeds `arr[i+1]`, we swap them.
    - So in the k-th pass, the k-th largest value gets "bubbled" into its final position at the end of the array
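
As a reference point, the bubble sort recap above can be sketched as follows. The shrinking inner range and the early exit on a swap-free pass are optional optimisations assumed here, and `bubble_sort` is just an illustrative name.

```python
def bubble_sort(arr: list[int]) -> list[int]:
    """Sort arr in place with plain bubble sort and return it."""
    n = len(arr)
    for k in range(n - 1):
        swapped = False
        # After k passes, the last k slots already hold their final values.
        for i in range(n - 1 - k):
            if arr[i] > arr[i + 1]:
                arr[i], arr[i + 1] = arr[i + 1], arr[i]
                swapped = True
        if not swapped:  # a pass with no swaps means the array is sorted
            break
    return arr


print(bubble_sort([1, 6, 2, 7, 3]))  # [1, 2, 3, 6, 7]
```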

- In comb sort, the only change is the comparison distance: instead of comparing neighbours at an offset of 1, we start by comparing elements further apart, then shrink the gap on each pass until it reaches 1, which is the bubble sort case
    - The idea is that, by the time the gap reaches 1, the array is much closer to being sorted and will require fewer passes with gap = 1

- Empirical testing suggests shrinking the gap by a factor of about 1.3 on each pass, starting from `len(arr)` (so the first pass uses a gap of `len(arr) // 1.3`)

- The final stage, where gap = 1, is plain bubble sort. By this point the array should be nearly sorted, so only a few passes are needed
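
To make the shrinking gap concrete, here is a small sketch that lists the gaps used for an array of a given length. `gap_sequence` is a hypothetical helper written for these notes, and it assumes the same floor-division rule as the worked example and the implementation in these notes.

```python
def gap_sequence(n: int, shrink: float = 1.3) -> list[int]:
    """Return the gaps comb sort would use for an array of length n."""
    gaps = []
    gap = int(n // shrink)
    while gap > 1:
        gaps.append(gap)
        gap = int(gap // shrink)
    gaps.append(1)  # the final stage always runs with gap = 1
    return gaps


print(gap_sequence(5))   # [3, 2, 1]
print(gap_sequence(20))  # [15, 11, 8, 6, 4, 3, 2, 1]
```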

### Example

- `arr = [1,6,2,7,3]`

- Gap = 5//1.3 = 3
    - Compare 1 and 7, no swap
    - Compare 6 and 3, swap
    - `arr = [1,3,2,7,6]`

- Gap = 3//1.3 = 2
    - Compare 1 and 2, no swap
    - Compare 3 and 7, no swap
    - Compare 2 and 6, no swap
    - `arr = [1,3,2,7,6]`

- Gap = 2//1.3 = 1
    - Compare 1 and 3, no swap
    - Compare 3 and 2, swap
    - `arr = [1,2,3,7,6]`
    - Compare 3 and 7, no swap
    - Compare 7 and 6, swap
    - `arr = [1,2,3,6,7]`

- Run through array once more; no swaps, so array is sorted. Return
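
The walkthrough above can be reproduced with a small tracing sketch that records the array after every pass. `comb_sort_trace` is a hypothetical helper, not the implementation from these notes; it assumes the same 1.3 shrink factor with floor division.

```python
def comb_sort_trace(arr: list[int], shrink: float = 1.3) -> list[tuple[int, list[int]]]:
    """Sort arr in place and return a (gap, snapshot) pair for every pass."""
    history = []
    gap = max(1, int(len(arr) // shrink))
    while True:
        swapped = False
        for i in range(len(arr) - gap):
            if arr[i] > arr[i + gap]:
                arr[i], arr[i + gap] = arr[i + gap], arr[i]
                swapped = True
        history.append((gap, arr.copy()))
        if gap == 1 and not swapped:  # a clean pass at gap 1 means sorted
            break
        gap = max(1, int(gap // shrink))
    return history


for gap, snapshot in comb_sort_trace([1, 6, 2, 7, 3]):
    print(gap, snapshot)
# 3 [1, 3, 2, 7, 6]
# 2 [1, 3, 2, 7, 6]
# 1 [1, 2, 3, 6, 7]
# 1 [1, 2, 3, 6, 7]
```

The last line is the extra swap-free pass that confirms the array is sorted.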


### Code Implementation

In [12]:
import random
arr=random.sample(range(500), k=20)

In [17]:
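def comb_sort(arr: list[int]) -> None:
    """Sort arr in place using comb sort."""
    scaling_factor = 1.3
    gap = int(len(arr) // scaling_factor)
    # Phase 1: passes with a shrinking gap, stopping before the gap reaches 1
    while gap > 1:
        for i in range(len(arr) - gap):
            if arr[i] > arr[i + gap]:
                arr[i], arr[i + gap] = arr[i + gap], arr[i]
        gap = int(gap // scaling_factor)

    # Phase 2: gap = 1, i.e. bubble sort passes until a pass makes no swaps
    is_sorted = False  # must start as False, otherwise this loop never runs
    while not is_sorted:
        is_sorted = True
        for i in range(len(arr) - 1):
            if arr[i] > arr[i + 1]:
                arr[i], arr[i + 1] = arr[i + 1], arr[i]
                is_sorted = False

comb_sort(arr)
# arr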
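
One way to sanity-check a comb sort implementation is to compare it against Python's built-in `sorted` on random inputs. The sketch below uses a compact single-loop variant written only for this check (`comb_sort_compact` is a hypothetical name, not the function from the cell above); the seed and input sizes are arbitrary choices.

```python
import random


def comb_sort_compact(arr: list[int], shrink: float = 1.3) -> None:
    """Single-loop comb sort: shrink the gap, then keep passing at gap 1."""
    gap = len(arr)
    swapped = True
    while gap > 1 or swapped:
        gap = max(1, int(gap // shrink))
        swapped = False
        for i in range(len(arr) - gap):
            if arr[i] > arr[i + gap]:
                arr[i], arr[i + gap] = arr[i + gap], arr[i]
                swapped = True


rng = random.Random(0)  # fixed seed so the check is repeatable
for _ in range(200):
    data = [rng.randrange(500) for _ in range(rng.randrange(0, 30))]
    expected = sorted(data)
    comb_sort_compact(data)
    assert data == expected
print("all random checks passed")
```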

### Time Complexity

- Time complexity
    - In the worst case
        - The sorting algorithm behaves like bubble sort, and so will take $O(N^2)$
    - In the best case
        - The gap starts from the array size $N$ and shrinks by a factor of 1.3 on each pass
        - This means that about $\log_{1.3} N$ passes are made over the array
        - Each pass iterates through the array, making $O(N)$ comparisons
        - Assuming the array is already sorted by the time the gap reaches 1, the $O(N)$ pass has been performed $O(\log_{1.3} N)$ times
        - This leads to an overall time complexity of $O(N \log_{1.3} N)$
            - By change of base, $\log_{1.3} N = \frac{\log N}{\log 1.3}$
            - Therefore, $O(N \log_{1.3} N) = O\left(\frac{N \log N}{\log 1.3}\right) = O(N \log N)$, since $\frac{1}{\log 1.3}$ is a constant

    - In the average case
        - This is a little handwavy, so treat it as such
        - A random array contains $\frac{N(N-1)}{4}$ inversions on average, i.e. $\Theta(N^2)$
        - The distance between the two elements of an inversion matters a lot. Long-distance inversions contribute disproportionately to the $O(N^2)$ comparisons that bubble sort needs
            - For example, in `[5,4,3,2,1]`, swapping 5 and 1 resolves 7 of the 10 inversions, while swapping 2 and 1 resolves only 1
            - Comb sort's efficiency comes from targeting these long-distance inversions early on
        - So the early, large-gap passes perform fewer swaps per pass, but each swap resolves many inversions
        - By the same argument, the later, small-gap passes perform more swaps, but each swap resolves fewer inversions
        - It has been observed empirically that the inversion count decays roughly exponentially, with about half of all inversions resolved on average in each pass

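        - Hence, the average-case cost is usually quoted as $\Omega\left(\frac{N^2}{2^{p}}\right)$, where $p \approx \log_{1.3} N$ is the number of passes. Note that this is a lower bound rather than a tight estimate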

- Space complexity
    - The sort is in-place, so only $O(1)$ extra space is needed
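
The inversion argument in the average case can be explored empirically. The sketch below counts inversions by brute force after every pass; `count_inversions` and `inversions_per_pass` are hypothetical helpers, and the seed and array size are arbitrary. It does not prove the halving claim, it only gives a way to look at the decay on a sample input.

```python
import random


def count_inversions(arr: list[int]) -> int:
    """Brute-force O(N^2) count of pairs i < j with arr[i] > arr[j]."""
    n = len(arr)
    return sum(1 for i in range(n) for j in range(i + 1, n) if arr[i] > arr[j])


def inversions_per_pass(arr: list[int], shrink: float = 1.3) -> list[int]:
    """Comb-sort arr in place; return the inversion count before and after each pass."""
    counts = [count_inversions(arr)]
    gap = max(1, int(len(arr) // shrink))
    while True:
        swapped = False
        for i in range(len(arr) - gap):
            if arr[i] > arr[i + gap]:
                arr[i], arr[i + gap] = arr[i + gap], arr[i]
                swapped = True
        counts.append(count_inversions(arr))
        if gap == 1 and not swapped:
            break
        gap = max(1, int(gap // shrink))
    return counts


# Effect of a single swap on a fully reversed array
print(count_inversions([5, 4, 3, 2, 1]))  # 10
print(count_inversions([1, 4, 3, 2, 5]))  # 3 (after swapping 5 and 1)
print(count_inversions([5, 4, 3, 1, 2]))  # 9 (after swapping 2 and 1)

rng = random.Random(0)
data = rng.sample(range(500), k=200)
print(inversions_per_pass(data))  # inversion counts, ending at 0
```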