### Bucket Sort

- Idea
    - Given some array of $N$ values, we create $K$ ordered buckets (in these notes $K = N$), and put each value into its corresponding bucket using some logic (e.g. bucket based on first digit). The logic must preserve order: every value in an earlier bucket is smaller than every value in a later bucket

    - Assume placing a value into its bucket can be done in $O(1)$ time

    - Since there are $N$ values and $K = N$ buckets, there should be only ~1 value per bucket on average, provided the values are spread roughly evenly. If a bucket holds multiple values, sort that bucket. Sorting can be done however you want

    - Traverse all buckets in order and add the value in the bucket to the result

### Example

- Let arr = [0.9, 2.6, 0.1, 1.2, 4.2]

- Since len(arr) = 5, create 5 buckets

- Place values of arr into buckets according to their first digit (the integer part)
    - [[0.9,0.1], [1.2], [2.6], [], [4.2]]

- Iterate through all buckets, sorting any bucket that holds more than one value
    - [[0.1,0.9], [1.2], [2.6], [], [4.2]]

- Recombine
    - [0.1, 0.9, 1.2, 2.6, 4.2]
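The walk-through above can be reproduced with a few lines of Python. This is only a minimal sketch of this one example: it assumes the bucket index is simply the integer part of each value, which works here because every value lies in $[0, 5)$ and there are 5 buckets. The name `placed` is a hypothetical helper variable used to snapshot the buckets before sorting.

```python
# Sketch of the worked example; assumes bucket index = integer part of the value.
arr = [0.9, 2.6, 0.1, 1.2, 4.2]

# One bucket per value in arr
buckets = [[] for _ in range(len(arr))]
for val in arr:
    buckets[int(val)].append(val)

# Snapshot of the buckets right after placement, before any sorting
placed = [list(bucket) for bucket in buckets]

# Sort only the buckets that hold more than one value
for bucket in buckets:
    if len(bucket) > 1:
        bucket.sort()

# Recombine by walking the buckets in order
result = [val for bucket in buckets for val in bucket]

print(placed)   # buckets after placement
print(buckets)  # buckets after sorting
print(result)   # recombined, sorted array
```

The three printed lists correspond to the placement, sorting, and recombination steps listed above.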

### Code Implementation

In [6]:
test = [5,4,3,2,1]
test.sort()
test

[1, 2, 3, 4, 5]

In [10]:
## Assume the elements are float values in the range [0, 100)
import numpy as np

def bucket_sort(arr: list[float]):
    buckets = [[] for _ in range(len(arr))]
    
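    ## O(N): scale each value from [0, 100) to a bucket index in [0, len(arr)),
    ## so that a larger value never lands in an earlier bucket
    for val in arr:
        val_bucket = min(int(val / 100 * len(arr)), len(arr) - 1)
        buckets[val_bucket].append(val)

    ## O(K) to visit every bucket, plus the cost of sorting any bucket holding 2+ values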
    res = []
    for bucket in buckets:
        if len(bucket) >= 2:
            bucket.sort()
        res += bucket
    return res

arr = [float(x) for x in np.random.uniform(0, 100, size=200)]
sorted_arr = bucket_sort(arr)
sorted_arr;

### Time Complexity

- The time complexity of bucket sort is entirely dependent on how evenly the array is split into the ordered buckets
    - If the $N$ values are split relatively evenly into the $K$ buckets, then the split takes $O(N)$ and the recombination takes $O(K)$, giving us $O(N+K)$ time complexity

    - However, if the split is skewed, most of the work shifts to sorting within the buckets. In the worst case every value lands in one bucket, and the time complexity becomes that of the sort used inside the bucket: $O(N \log N)$ for a comparison sort such as merge sort, or $O(N^2)$ for insertion sort

- In terms of space complexity, we have $O(K)$ extra space from using the buckets, and $O(N)$ extra space from the recombination array, giving us $O(N+K)$ space complexity
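To make the effect of a skewed split concrete, the sketch below counts how many values fall into the fullest bucket for two inputs. It is an illustration under stated assumptions, not part of the implementation above: the helper `largest_bucket`, the range $[0, 100)$, the seed, and the choice of one bucket per value with evenly sized bucket ranges are all assumptions made for this example.

```python
import random

def largest_bucket(arr, max_val=100.0):
    """Size of the fullest bucket when len(arr) equal-width buckets split [0, max_val)."""
    n = len(arr)
    counts = [0] * n
    for val in arr:
        # Scale the value to a bucket index; clamp to guard the upper edge
        idx = min(int(val / max_val * n), n - 1)
        counts[idx] += 1
    return max(counts)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 1000

# Evenly spread values: each bucket should end up with only a few values
uniform = [rng.uniform(0, 100) for _ in range(n)]

# Skewed values: everything is tiny, so every value maps to bucket 0
skewed = [rng.uniform(0, 0.05) for _ in range(n)]

uniform_max = largest_bucket(uniform)
skewed_max = largest_bucket(skewed)
print(uniform_max, skewed_max)
```

With the skewed input, the fullest bucket holds all `n` values, so the whole cost is the cost of sorting that single bucket. With the uniform input, the fullest bucket stays small and the per-bucket sorts are cheap.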