# Assignment 1.5: Bucket Sort

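Below is an implementation of the Bucket Sort algorithm for sorting integers (the code assumes they are non-negative).  
Generic bucket sort has a worst-case complexity of $\mathcal{O}(n^2)$. The variant implemented here distributes the values over ten buckets by one decimal digit per pass, which makes it effectively a least-significant-digit radix sort with a cost of $\mathcal{O}(d \cdot n)$ for values of at most $d$ digits.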
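
As a rough cost estimate for the implementation below, assume $n$ values, $d$ decimal digits in the largest value, and $b = 10$ buckets (the symbols $d$ and $b$ are introduced here for illustration only). Each pass touches every value once while distributing and once while gathering, and visits every bucket:

$$
T(n) \approx \sum_{k=1}^{d} (2n + b) = d\,(2n + b) \in \mathcal{O}\big(d\,(n + b)\big)
$$

For the measurements further down, where the largest value is 30000, this gives $d = 5$ passes.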

In [1]:
from typing import List
import numpy as np
import time

In [2]:
class BucketSort:
    def __init__(self, data: List[int]):
        self.data = data
    
    def distribution_pass(self, j: int):
        """Place every value in the bucket for its digit at place value j (1, 10, 100, ...)"""
        # build the buckets
        self.buckets = {x:[] for x in range(0,10)}
        
        # put array elements in different buckets
        for x in self.data:
            # integer division avoids float rounding errors on very large values
            index_b = (x // j) % 10
            self.buckets[index_b].append(x)
        
        
    def gathering_pass(self):
        """Gathers all buckets and copies them into the original array"""
        # empty the array
        self.data = []
        
        # gather and append all buckets
        for key in self.buckets:
            for x in self.buckets[key]:
                self.data.append(x)
    
    def sort(self):
        """Sorts the given array"""
        # number of decimal digits in the largest value
        max_length = len(str(max(self.data)))
        
        # one distribution + gathering pass per digit, least significant first
        for j in [10**x for x in range(0, max_length)]:
            self.distribution_pass(j)
            self.gathering_pass()
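
The distribution and gathering passes above can be traced on a small input. The sketch below is a standalone illustration, not part of the class: `lsd_passes` is a hypothetical helper name, it assumes non-negative integers, and it records the list after every pass so the effect of each digit is visible.

```python
from typing import List


def lsd_passes(values: List[int]) -> List[List[int]]:
    """Return a snapshot of the list after each distribution + gathering pass.

    Hypothetical helper for illustration; assumes non-negative integers.
    """
    if not values:
        return []
    snapshots = []
    current = list(values)
    num_digits = len(str(max(current)))
    for exp in range(num_digits):
        divisor = 10 ** exp
        # distribution: one bucket per decimal digit 0..9
        buckets = [[] for _ in range(10)]
        for v in current:
            buckets[(v // divisor) % 10].append(v)
        # gathering: concatenate the buckets in digit order
        current = [v for bucket in buckets for v in bucket]
        snapshots.append(current)
    return snapshots


passes = lsd_passes([170, 45, 75, 90, 2, 802, 24, 66])
for i, snapshot in enumerate(passes, start=1):
    print(f"after pass {i}: {snapshot}")
# after pass 1: [170, 90, 2, 802, 24, 45, 75, 66]
# after pass 2: [2, 802, 24, 45, 66, 170, 75, 90]
# after pass 3: [2, 24, 45, 66, 75, 90, 170, 802]
```

Because values are appended to and read from each bucket in order, every pass is stable, which is why sorting by the least significant digit first ends with a fully sorted list.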

Below, the running time of bucket sort is measured for a varying number of values, using random and sorted lists.

In [3]:
n = [1000, 10000, 30000]

for x in n:
    # note: the timer starts before the data is generated, so the reported time includes generation
    time1 = time.time()
    data = np.random.randint(low=0, high=x+1, size=x)
    buc = BucketSort(data)
    buc.sort()
    time2 = time.time()
    print('{:s} function with {} took {:.3f} sec'.format('bucket sort', x, (time2-time1)))

bucket sort function with 1000 took 0.006 sec
bucket sort function with 10000 took 0.041 sec
bucket sort function with 30000 took 0.121 sec


In [4]:
algorithm = 'bucket sort'
n = 30000

time1 = time.time()
BucketSort(np.arange(0, n+1, 1)).sort()
time2 = time.time()
print('{:s} function with {} (sorted) took {:.3f} sec'.format(algorithm, n, (time2-time1)))

time1 = time.time()
BucketSort(np.arange(n, -1, -1)).sort()
time2 = time.time()
print('{:s} function with {} (reversed) took {:.3f} sec'.format(algorithm, n, (time2-time1)))

bucket sort function with 30000 (sorted) took 0.133 sec
bucket sort function with 30000 (reversed) took 0.096 sec
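
The measurements above include the time spent building the input and do not check the result. A minimal sketch of a stricter measurement follows, under several assumptions: `digit_bucket_sort` is a compact, hypothetical stand-in for `BucketSort.sort`, `time_sort` is a hypothetical helper, the input is generated with the standard `random` module instead of NumPy, and `time.perf_counter` is used because it is intended for measuring short durations. No timings are shown, since they depend on the machine.

```python
import random
import time
from typing import List


def digit_bucket_sort(values: List[int]) -> List[int]:
    """Compact stand-in for BucketSort.sort (hypothetical; non-negative ints only)."""
    current = list(values)
    if not current:
        return current
    for exp in range(len(str(max(current)))):
        buckets = [[] for _ in range(10)]
        for v in current:
            buckets[(v // 10 ** exp) % 10].append(v)
        current = [v for bucket in buckets for v in bucket]
    return current


def time_sort(values: List[int]) -> float:
    """Time only the sort call, then confirm the result matches sorted()."""
    start = time.perf_counter()
    result = digit_bucket_sort(values)
    elapsed = time.perf_counter() - start
    assert result == sorted(values), "sort result differs from sorted()"
    return elapsed


random.seed(0)  # fixed seed so repeated runs use the same input
for n in (1000, 10000, 30000):
    data = [random.randint(0, n) for _ in range(n)]  # built before the timer starts
    print(f"n={n}: {time_sort(data):.3f} sec")
```

Comparing against the built-in `sorted` is a cheap correctness check that runs outside the timed region, so it does not affect the reported numbers.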
