# Introduction

In computer science, a sorting algorithm is a set of instructions that takes an array as input, manipulates it using specified operations, and outputs a sorted array, generally in numerical order (1). The output is a permutation of the input array arranged in order. Sorting algorithms are important because other algorithms, such as merge and search algorithms, rely on sorted input to work efficiently (2).

Betty Holberton was among the pioneers of sorting algorithms around 1951, and bubble sort was analyzed as early as 1956 (2). Early computer scientists recognized that stable and reusable sorting algorithms would play an important role in the future of computer science (2).

Sorting algorithms can be divided into comparison sorts and integer sorts. Comparison sorts compare elements at each step of the algorithm to decide whether one element should be to the left or right of another (3). Comparison sorts are generally easier to implement than integer sorts, but they are limited by a lower bound of Ω(n log n) comparisons in the worst case. A comparison sort can be modeled as a large binary tree called a decision tree, where each node represents a single comparison (3). Each comparison can at best halve the set of orderings that are still possible, at a constant cost per comparison, so the number of comparisons needed is logarithmic in the number of possible orderings. More generally, O(log n) behavior arises when the running time grows linearly while the input size grows exponentially (4).
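The lower bound for comparison sorts can be derived by counting the leaves of the decision tree. The following is a sketch of the standard argument; the symbol $h$ for the height of the tree is introduced here for illustration and does not appear in the cited sources.

```latex
\begin{align*}
2^{h} &\ge n!
  && \text{a binary tree of height } h \text{ has at most } 2^{h} \text{ leaves, one per permutation} \\
h &\ge \log_2(n!) \\
  &\ge \log_2\!\left(\left(\tfrac{n}{2}\right)^{n/2}\right)
  && \text{the largest } n/2 \text{ factors of } n! \text{ are each at least } n/2 \\
  &= \tfrac{n}{2}\,\log_2\tfrac{n}{2} \\
  &= \Omega(n \log n)
\end{align*}
```

Since $h$ is the number of comparisons on the longest path from the root to a leaf, any comparison sort needs Ω(n log n) comparisons in the worst case.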

Integer sorts work differently: they do not compare elements with each other, so they are not bounded by Ω(n log n). Instead, an integer sort such as counting sort determines, for each element x, how many elements are less than x (3). This information is used to place each element directly into its correct slot, with no need to rearrange the list. The ability to perform integer arithmetic on the keys allows integer sorting algorithms to be faster than comparison sorting algorithms in many cases (5).
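The counting idea described above can be sketched with counting sort. This is a minimal illustration, assuming non-negative integer keys with a known upper limit; the function name `counting_sort` and the parameter `max_value` are chosen here and do not come from the cited sources.

```python
def counting_sort(values, max_value):
    """Return a new sorted list of integers in the range 0..max_value."""
    # counts[v] starts as the number of occurrences of v
    counts = [0] * (max_value + 1)
    for v in values:
        counts[v] += 1

    # Prefix sums: counts[v] becomes the number of elements <= v
    for v in range(1, max_value + 1):
        counts[v] += counts[v - 1]

    output = [0] * len(values)
    # Walk backwards so that equal keys keep their input order (stable)
    for v in reversed(values):
        counts[v] -= 1
        output[counts[v]] = v
    return output


print(counting_sort([4, 1, 3, 4, 0, 1], max_value=4))
# [0, 1, 1, 3, 4, 4]
```

No two elements are ever compared with each other. The running time depends on the number of elements and on `max_value`, which is why this approach only pays off when the key range is reasonably small.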

Sorting algorithms can be characterized by their time complexity and space complexity. Time complexity describes how the running time of an algorithm grows with the input size. It is estimated by counting the number of elementary operations the algorithm performs, assuming that each operation takes a fixed amount of time (6). The running time is usually analyzed for three cases. The worst case is the maximum time the algorithm needs to complete, and it is the most common way to analyze an algorithm. The best case is the minimum time the algorithm needs. The average case is the expected time over all inputs of a given size, and it is analyzed less often (13). The running time of a sorting algorithm is usually stated in big O notation, less often in Omega, and only rarely in Theta. Big O notation expresses an upper bound on the growth of the running time, and it is most often used to describe the worst case (13). For example, if an algorithm has a worst-case running time of O(n log n), its running time never grows faster than proportionally to n log n, and if an algorithm has an average-case running time of O(n^2), then on average its running time grows no faster than proportionally to n^2 (7). In general, the better the time complexity of an algorithm, the faster it runs in practice on large inputs.
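To make the difference between best case and worst case concrete, the sketch below counts one kind of elementary operation, key comparisons, for insertion sort. The function name `insertion_sort_comparisons` is introduced here for illustration only.

```python
def insertion_sort_comparisons(values):
    """Sort a copy of values and return the number of key comparisons made."""
    data = list(values)
    comparisons = 0
    for i in range(1, len(data)):
        key = data[i]
        j = i - 1
        while j >= 0:
            comparisons += 1          # one comparison of key against data[j]
            if data[j] <= key:
                break                 # key is already in the right place
            data[j + 1] = data[j]     # shift the larger element right
            j -= 1
        data[j + 1] = key
    return comparisons


n = 100
best = insertion_sort_comparisons(range(n))           # already sorted input
worst = insertion_sort_comparisons(range(n, 0, -1))   # reverse sorted input
print(best, worst)
# 99 4950
```

For sorted input the count grows linearly (n - 1 comparisons), while for reverse sorted input it grows quadratically (n(n - 1)/2 comparisons), which matches the O(n) best case and O(n^2) worst case of insertion sort.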

Space complexity is the number of memory cells an algorithm needs to perform its operation. A good algorithm keeps this number as small as possible. There is often a time-space tradeoff in choosing an algorithm, because many problems cannot be solved with both low computing time and low memory consumption (8). One then has to decide whether to trade computing time for memory consumption or vice versa. For example, insertion sort has a space complexity of O(1) because it needs no extra memory allocation to sort the provided collection. In this case we say that the sort is done in place (8). In-place sorting modifies the given list by changing only the order of the elements in the existing list, which makes it very space efficient.
Merge sort is different: it has a space complexity of O(n). Merge sort recursively divides the array in two and merges the sorted halves into newly allocated arrays, so it needs additional memory proportional to the size of the input (9).
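The contrast between the two can be sketched as follows. This is a simplified illustration rather than a tuned implementation; the function names are chosen here, and the merge sort shown is the straightforward top-down variant that returns a new list.

```python
def insertion_sort_in_place(data):
    """Sort data in place using only a constant number of extra variables."""
    for i in range(1, len(data)):
        key = data[i]
        j = i - 1
        while j >= 0 and data[j] > key:
            data[j + 1] = data[j]
            j -= 1
        data[j + 1] = key


def merge_sort(data):
    """Return a new sorted list; the merge step allocates O(n) extra memory."""
    if len(data) <= 1:
        return list(data)
    mid = len(data) // 2
    left = merge_sort(data[:mid])
    right = merge_sort(data[mid:])
    merged = []                       # newly allocated output list
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


original = [5, 2, 4, 1]
result = merge_sort(original)
print(result, result is original)     # a different list object is returned
insertion_sort_in_place(original)
print(original)                       # the same list object, now sorted
```

The in-place version rearranges the caller's list, while the merge sort leaves the input untouched and builds its result in additional lists.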

Another important characteristic of a sorting algorithm is its stability. A sort is stable if records with the same key retain their relative order before and after the sort (10). Some sorting algorithms are stable by nature, such as insertion sort, merge sort, and bubble sort. Others are not, such as heapsort and quicksort. Stability generally matters only if the problem requires elements with equal keys to stay in the order provided. Depending on how important stability is, very different sorting algorithms may suit the problem. If you don't need stability, you can use a fast algorithm with low space complexity, such as heapsort or quicksort. By contrast, stable algorithms often have higher CPU or memory usage than unstable ones (11). This again reflects the tradeoff between time and space complexity when choosing an appropriate sorting algorithm.
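A small example may help to show what stability means in practice. The sketch below sorts records by their numeric key, once with Python's built-in `sorted` (which is documented to be stable) and once with a simple selection sort, which is not stable. The record values and the helper `selection_sort` are made up for this illustration.

```python
def selection_sort(items, key):
    """Return a sorted copy of items; swapping makes this sort unstable."""
    items = list(items)
    for i in range(len(items)):
        smallest = i
        for j in range(i + 1, len(items)):
            if key(items[j]) < key(items[smallest]):
                smallest = j
        # The swap can move an element past another one with an equal key
        items[i], items[smallest] = items[smallest], items[i]
    return items


records = [("a", 2), ("b", 2), ("c", 1)]

print(sorted(records, key=lambda r: r[1]))
# [('c', 1), ('a', 2), ('b', 2)]   "a" still comes before "b"

print(selection_sort(records, key=lambda r: r[1]))
# [('c', 1), ('b', 2), ('a', 2)]   "a" and "b" have swapped order
```

Both results are correctly sorted by key, but only the stable sort preserves the original order of the two records that share the key 2.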

The future of sorting algorithms appears to lie in hybrid algorithms like Timsort, which was developed in 2002 and implemented for the Python programming language (12). Timsort combines the well-established merge sort and insertion sort. Computer scientists realised the importance of sorting very early on, and it is considered unlikely that sorting algorithms will become much more efficient than they already are.
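The idea of combining insertion sort and merge sort can be sketched as below. This is a heavily simplified illustration and not the real Timsort, which detects naturally ordered runs and uses a more elaborate merge strategy. The names `hybrid_sort` and `RUN`, and the fixed run length of 4, are assumptions made for this example.

```python
RUN = 4  # hypothetical fixed run length; the real Timsort chooses runs adaptively


def _insertion_sort(data, lo, hi):
    """Sort the slice data[lo:hi] in place."""
    for i in range(lo + 1, hi):
        key = data[i]
        j = i - 1
        while j >= lo and data[j] > key:
            data[j + 1] = data[j]
            j -= 1
        data[j + 1] = key


def _merge(left, right):
    """Merge two sorted lists into a new sorted list (stable)."""
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def hybrid_sort(values, run=RUN):
    """Return a sorted copy: insertion sort on short runs, then merging."""
    data = list(values)
    n = len(data)
    # Step 1: insertion sort is fast on short slices, so sort each run with it
    for lo in range(0, n, run):
        _insertion_sort(data, lo, min(lo + run, n))
    # Step 2: merge neighbouring runs, doubling the run width each round
    width = run
    while width < n:
        for lo in range(0, n, 2 * width):
            mid = min(lo + width, n)
            hi = min(lo + 2 * width, n)
            data[lo:hi] = _merge(data[lo:mid], data[mid:hi])
        width *= 2
    return data


print(hybrid_sort([9, 3, 7, 1, 8, 2, 6, 4, 5, 0, 3]))
```

The design choice behind such hybrids is that insertion sort has very little overhead on small or nearly sorted slices, while merging keeps the overall running time at O(n log n).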

# Sorting Algorithms

## Bucket Sort

Bucket sort is a sorting algorithm that distributes elements into a number of buckets and then sorts each bucket individually, either with a separate sorting algorithm or by applying bucket sort recursively. It is mainly useful when the input is uniformly distributed over a range (14). It is a distribution sort, a generalization of pigeonhole sort, and a cousin of radix sort in the most-to-least significant digit flavor. Because the buckets can be sorted with comparisons, bucket sort can also be considered a comparison sort algorithm. Its computational complexity depends on the algorithm used to sort each bucket, the number of buckets, and whether the input is uniformly distributed.

Bucket sort works as follows:

1. Set up an array of initially empty "buckets".
2. Scatter: Go over the original array, putting each object in its bucket.
3. Sort each non-empty bucket.
4. Gather: Visit the buckets in order and put all elements back into the original array (15).

A drawback of bucket sort is that the maximum value of the elements must be known in advance (16).
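One way to deal with this requirement is to scan the input for its minimum and maximum first and derive each bucket index from that range. The sketch below illustrates this under stated assumptions: the function names `bucket_index` and `bucket_sort_range` and the default of 5 buckets are chosen here, and the built-in `sorted` stands in for whatever algorithm is used to sort each bucket.

```python
def bucket_index(value, lo, hi, num_buckets):
    """Map a value in [lo, hi] to a bucket number in 0..num_buckets-1."""
    if hi == lo:
        return 0                      # all values are equal
    idx = int((value - lo) / (hi - lo) * num_buckets)
    return min(idx, num_buckets - 1)  # value == hi goes into the last bucket


def bucket_sort_range(values, num_buckets=5):
    """Return a sorted copy of values using bucket sort over [min, max]."""
    if not values:
        return []
    lo, hi = min(values), max(values)  # the extra scan that finds the range

    # Scatter: each value goes into the bucket covering its part of the range
    buckets = [[] for _ in range(num_buckets)]
    for v in values:
        buckets[bucket_index(v, lo, hi, num_buckets)].append(v)

    # Sort each bucket and gather the buckets in order
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))
    return result


print(bucket_sort_range([29, 25, 3, 49, 9, 37, 21, 43]))
```

The additional pass over the input costs O(n) time, so it does not change the overall complexity of the algorithm.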


With uniformly distributed input and a number of buckets proportional to n, the average-case and best-case time complexity of bucket sort is O(n) (16). The worst case, however, is O(n^2).

### Worst-case analysis

When the input contains several keys that are close to each other (clustering), those elements are likely to be placed in the same bucket, which results in some buckets containing more elements than average. The worst-case scenario occurs when all the elements are placed in a single bucket. The overall performance is then dominated by the algorithm used to sort each bucket, which is typically O(n^2) insertion sort, making bucket sort less optimal than O(n log n) comparison sort algorithms like quicksort.

### Optimizations

A common optimization is to put the unsorted elements of the buckets back in the original array first and then run insertion sort over the complete array. Because insertion sort's runtime depends on how far each element is from its final position, the number of comparisons remains relatively small, and the memory hierarchy is better exploited by storing the list contiguously in memory (15).
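The optimization described above can be sketched as follows. This is a minimal illustration, assuming float inputs in the range [0, 1); the function name `bucket_sort_single_pass` and the default of 10 buckets are chosen here.

```python
def bucket_sort_single_pass(values, num_buckets=10):
    """Return a sorted copy of floats in [0, 1) (assumed input range)."""
    # Scatter into buckets, but do not sort the buckets individually
    buckets = [[] for _ in range(num_buckets)]
    for v in values:
        index = min(int(v * num_buckets), num_buckets - 1)
        buckets[index].append(v)

    # Gather first: elements are now close to their final positions
    data = [v for bucket in buckets for v in bucket]

    # One insertion sort over the whole contiguous list; each element only
    # has to move within the span of its own bucket
    for i in range(1, len(data)):
        key = data[i]
        j = i - 1
        while j >= 0 and data[j] > key:
            data[j + 1] = data[j]
            j -= 1
        data[j + 1] = key
    return data


print(bucket_sort_single_pass([0.897, 0.565, 0.656, 0.1234, 0.665, 0.3434]))
```

After the gather step every element of an earlier bucket already precedes every element of a later bucket, so the final insertion sort never shifts an element across a bucket boundary.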

![Bucket_sort_1](http://localhost:8888/view/Bucket_sort_1.svg.png)


# References

1. https://brilliant.org/wiki/sorting-algorithms/ Visited on 17/4/19
2. https://en.wikipedia.org/wiki/Sorting_algorithm Visited on 17/4/19
3. https://betterexplained.com/articles/sorting-algorithms/ Visited on 18/4/19
4. https://stackoverflow.com/questions/2307283/what-does-olog-n-mean-exactly Visited on 18/4/19
5. https://en.wikipedia.org/wiki/Integer_sorting Visited on 18/4/19
6. https://en.wikipedia.org/wiki/Time_complexity Visited on 18/4/19
7. http://www.leda-tutorial.org/en/official/ch02s02s03.html Visited on 18/4/19
8. https://brilliant.org/wiki/space-complexity/ Visited on 18/4/19
9. https://stackoverflow.com/questions/16585507/sorting-in-place 
10. https://stackoverflow.com/questions/1517793/what-is-stability-in-sorting-algorithms-and-why-is-it-importan
11. https://brilliant.org/wiki/sorting-algorithms/
12. https://en.wikipedia.org/wiki/Timsort
13. https://www.datacamp.com/community/tutorials/analyzing-complexity-code-python
14. https://www.hackerearth.com/practice/algorithms/sorting/bucket-sort/tutorial/
15. https://en.wikipedia.org/wiki/Bucket_sort
16. https://www.includehelp.com/algorithms/bucket-sort-algorithm.aspx 

```python
# Python 3 program to sort an array using bucket sort
import time


def insertionSort(b):
    """Sort list b in place with insertion sort and return it."""
    for i in range(1, len(b)):
        up = b[i]
        j = i - 1
        # Shift larger elements one slot to the right
        while j >= 0 and b[j] > up:
            b[j + 1] = b[j]
            j -= 1
        b[j + 1] = up
    return b


def bucketSort(x):
    """Sort a list of floats in the range [0, 1] in place and return it."""
    slot_num = 10  # 10 slots, each covering a range of width 0.1
    arr = [[] for _ in range(slot_num)]

    # Scatter: put array elements in different buckets
    for j in x:
        # Clamp the index so that j == 1.0 lands in the last bucket
        # instead of raising an IndexError
        index_b = min(int(slot_num * j), slot_num - 1)
        arr[index_b].append(j)

    # Sort individual buckets
    for i in range(slot_num):
        arr[i] = insertionSort(arr[i])

    # Gather: concatenate the buckets back into x
    k = 0
    for i in range(slot_num):
        for j in range(len(arr[i])):
            x[k] = arr[i][j]
            k += 1
    return x


# Driver code
x = [0.897, 0.565, 0.656,
     0.1234, 0.665, 0.3434]

# Time only the sort itself, not the function definitions or the printing
start_time = time.perf_counter()
result = bucketSort(x)
time_elapsed = time.perf_counter() - start_time

print("Sorted Array is")
print(result)

# This code is contributed by
# Oneil Hsiao

print("Benchmark time equals", time_elapsed)
```
