Sorting Algorithms

This repository contains my own implementations of sorting algorithms in languages such as Python and Java, together with an overall approximation of their efficiency using big-O notation and other parameters, as well as definitions and other information about each algorithm.

Enjoy!


Efficiency Evaluation

| Sorting Algorithm | Best Case Time / Swaps | Average Case Time / Swaps | Worst Case Time / Swaps | Auxiliary Space | Stable | In Place | Algorithm Paradigm |
|---|---|---|---|---|---|---|---|
| Bubble Sort | O(n) / O(1) | O(n²) / O(n²) | O(n²) / O(n²) | O(1) | Yes | Yes | -- |
| Bucket Sort | O(n+k) / -- | O(n+k) / -- | O(n²) / -- | O(n+k) | Yes | No | SG |
| Counting Sort | O(n+k) / -- | O(n+k) / -- | O(n+k) / -- | O(n+k) | Yes | No | -- |
| Heap Sort | O(n) / -- | O(n log(n)) / -- | O(n log(n)) / -- | O(1) | No | Yes | -- |
| Insertion Sort | O(n) / O(1) | O(n²) / O(n²) | O(n²) / O(n²) | O(1) | Yes | Yes | Inc |
| Merge Sort | O(n log(n)) / -- | O(n log(n)) / -- | O(n log(n)) / -- | O(n) | Yes | No | D&C |
| Quick Sort | O(n log(n)) / O(1) | O(n log(n)) / O(n) | O(n²) / O(n) | O(log(n)) | No | Yes | D&C |
| Radix Sort | O(n·k) / -- | O(n·k) / -- | O(n·k) / -- | O(n+k) | Yes | No | -- |
| Selection Sort | O(n²) / O(1) | O(n²) / O(n) | O(n²) / O(n) | O(1) | No | Yes | -- |
Glossary

D&C - Divide & Conquer Approach. In algorithmics, it means breaking a problem down into smaller, separate parts until each is simple enough to be solved directly.

Inc - Incremental Approach. Given an input, a solution is built incrementally (step by step), adapting the input with each change.

In Place - An algorithm that does not need extra space and produces its output in the same space (memory) that contains the data.

SG - Scatter-Gather Approach. The elements of x are sparsely copied to their corresponding locations in y. After performing the necessary operations, the elements are gathered back from y into x.

Stable - For sorting algorithms, it means that elements that compare equal keep their relative input order in the output.

Implemented Algorithms

Bubble Sort

Can also be called sinking sort. It repeatedly compares adjacent elements and swaps them if they are wrongly ordered.

Performs poorly on real-world problems due to the number of swaps and the time it takes to order a list of elements.
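A minimal Python sketch of the idea (the repository's own implementation may differ):

```python
def bubble_sort(items):
    """Repeatedly swap adjacent out-of-order elements until sorted."""
    data = list(items)                   # work on a copy
    n = len(data)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):       # the last i elements are already in place
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
                swapped = True
        if not swapped:                  # early exit: no swaps means already sorted
            break
    return data

print(bubble_sort([5, 1, 4, 2, 8]))      # → [1, 2, 4, 5, 8]
```

The early-exit flag is what gives the O(n) best case on an already sorted list.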

Comparison

When implementing or selecting a sorting algorithm, bubble sort is almost never used in practice.

The reason behind this discrimination is the number of swaps done by the algorithm. In insertion sort, the best-case (an almost ordered list) time complexity is the same as in bubble sort, with the difference that the number of swaps done by bubble sort is vastly greater.

When to use it?

For its simplicity, it is a good starting point when learning sorting algorithms.

According to some sources (GeeksforGeeks, Wikipedia), it is also used in computer graphics for its ability to detect small errors in almost-sorted arrays and fix them in linear time, O(n).

Bucket Sort

Also known as bin sort, it is a sorting algorithm that works by distributing elements into buckets and then sorting each individual bucket. The time complexity of the algorithm depends on the sorting algorithm applied to the buckets.

For this implementation, insertion sort has been selected as the default.
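A minimal sketch of the scatter-gather idea in Python, assuming the input values lie in [0, 1) so that the bucket index can be computed directly from the value (the repository's own version may handle other ranges):

```python
def insertion_sort(bucket):
    """Sort a single bucket in place with insertion sort."""
    for i in range(1, len(bucket)):
        key = bucket[i]
        j = i - 1
        while j >= 0 and bucket[j] > key:
            bucket[j + 1] = bucket[j]
            j -= 1
        bucket[j + 1] = key
    return bucket

def bucket_sort(values, num_buckets=10):
    """Scatter values in [0, 1) into buckets, sort each, then gather."""
    buckets = [[] for _ in range(num_buckets)]
    for v in values:
        buckets[int(v * num_buckets)].append(v)   # scatter
    result = []
    for b in buckets:
        result.extend(insertion_sort(b))          # sort and gather
    return result

print(bucket_sort([0.42, 0.32, 0.23, 0.52, 0.25]))  # → [0.23, 0.25, 0.32, 0.42, 0.52]
```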

Comparison

Bucket sort is a generalization of counting sort; it degenerates into that algorithm when the bucket size approaches 1.

Counting Sort

This technique sorts a collection by counting the number of elements that have each distinct key. Then, arithmetic is used to calculate the position of each key value in the output sequence.
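The count-then-place steps can be sketched in Python for non-negative integer keys (a simplification; real implementations usually extract a key from each record):

```python
def counting_sort(values):
    """Stable counting sort for non-negative integers."""
    if not values:
        return []
    k = max(values)
    counts = [0] * (k + 1)
    for v in values:                  # count occurrences of each key
        counts[v] += 1
    for v in range(1, k + 1):         # prefix sums: counts[v] = #elements <= v
        counts[v] += counts[v - 1]
    output = [0] * len(values)
    for v in reversed(values):        # reverse pass keeps the sort stable
        counts[v] -= 1
        output[counts[v]] = v
    return output

print(counting_sort([4, 2, 2, 8, 3, 3, 1]))  # → [1, 2, 2, 3, 3, 4, 8]
```

The prefix-sum pass is the "arithmetic" that turns counts into output positions.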

Where to use it?

It is not generally used on its own. Normally, it is used as a subroutine in another sorting algorithm, such as radix sort or bucket sort.

Heap Sort

This sorting algorithm is an optimization of selection sort. It shrinks the unsorted region by extracting the largest element and placing it in the sorted region. By maintaining the unsorted region in a heap, the largest element can be found faster.
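A minimal sketch using an explicit max-heap with a sift-down helper (one possible implementation; the repository's version may differ):

```python
def _sift_down(data, root, end):
    """Restore the max-heap property for the subtree rooted at `root`,
    considering only indices below `end`."""
    while 2 * root + 1 < end:
        child = 2 * root + 1
        if child + 1 < end and data[child + 1] > data[child]:
            child += 1                            # pick the larger child
        if data[root] >= data[child]:
            return
        data[root], data[child] = data[child], data[root]
        root = child

def heap_sort(items):
    data = list(items)
    n = len(data)
    for i in range(n // 2 - 1, -1, -1):           # build a max-heap bottom-up, O(n)
        _sift_down(data, i, n)
    for end in range(n - 1, 0, -1):               # move the max into the sorted suffix
        data[0], data[end] = data[end], data[0]
        _sift_down(data, 0, end)
    return data

print(heap_sort([5, 1, 4, 2, 8]))                 # → [1, 2, 4, 5, 8]
```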

Comparison

SelectionSort divides the list into sorted and unsorted regions and spends linear time scanning the unsorted region at every step. HeapSort, on the other hand, finds the largest element in O(log(n)) time per extraction thanks to the heap.

Insertion Sort

This algorithm follows an incremental approach, building a sorted array/list one step at a time.

It works by consuming one element at a time, finding its location in the ordered sub-array and inserting the element at its rightful position. This process is repeated until no elements are left.
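The consume-and-reinsert loop can be sketched in Python as follows (the repository's own version may differ):

```python
def insertion_sort(items):
    data = list(items)
    for i in range(1, len(data)):
        key = data[i]                      # next element to place
        j = i - 1
        while j >= 0 and data[j] > key:    # shift larger sorted elements right
            data[j + 1] = data[j]
            j -= 1
        data[j + 1] = key                  # drop the element into its slot
    return data

print(insertion_sort([12, 11, 13, 5, 6]))  # → [5, 6, 11, 12, 13]
```

On an almost-sorted input the inner while loop barely runs, which is why the best case is O(n).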

When to use it?

This sorting algorithm is used when the number of elements to order is small. Another situation where this approach can be useful is when the array is already almost sorted, as not many elements need to be moved.

Merge Sort

It is a Divide and Conquer approach that divides a given list or array into N separate sub-lists, each containing one element. Afterwards, sub-lists are repeatedly merged into new sorted sub-lists until only one ordered list is left.
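A top-down sketch in Python (the classic recursive formulation; the repository's version may be bottom-up instead):

```python
def merge_sort(items):
    if len(items) <= 1:                  # a single element is already sorted
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])       # divide
    right = merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:          # `<=` keeps equal elements stable
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])              # one side may have leftovers
    merged.extend(right[j:])
    return merged

print(merge_sort([38, 27, 43, 3, 9, 82, 10]))  # → [3, 9, 10, 27, 38, 43, 82]
```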

Comparison

MergeSort can sort linked lists in O(n log(n)) time and is more efficient there than, say, QuickSort. Unlike with arrays, inserting an element in the middle of a linked list takes O(1) extra space and time, which is very useful for the merge operation (sequential data access).

On the other hand, arbitrary access to a given position is more expensive on a linked list than on an array. That is why QuickSort is more efficient with arrays, as it performs many such accesses.

When to use it?

For the aforementioned reasons, MergeSort is used when dealing with LinkedLists. It also comes in handy for the Count Inversions problem, and it is used in External Sorting.

Quick Sort

Divide and Conquer approach. In this algorithm a pivot element is used to divide the array into two separate sub-arrays, depending on whether the elements are smaller or bigger than the pivot. Then, the sub-arrays are sorted recursively.

This can be done in place to reduce space complexity, although the recursion also takes some space. That is why it uses O(log(n)) of auxiliary space.

Regarding pivot selection, there are different approaches:

  1. Always pick the first element.
  2. Always pick the last element.
  3. Select a random number.
  4. Select the median of the first element, the last element, and the element in the middle of these two (median-of-three).

According to Jon L. Bentley [1], using the median seems to be the best approach.
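An in-place sketch in Python using the median-of-three pivot (option 4) and a Lomuto-style partition (one of several partition schemes; the repository's version may use another):

```python
def quick_sort(data, low=0, high=None):
    """Sort `data` in place; returns it for convenience."""
    if high is None:
        high = len(data) - 1
    if low >= high:
        return data
    mid = (low + high) // 2
    # median-of-three: move the median of first/middle/last to the end
    candidates = sorted([(data[low], low), (data[mid], mid), (data[high], high)])
    median_idx = candidates[1][1]
    data[median_idx], data[high] = data[high], data[median_idx]
    pivot = data[high]
    # Lomuto partition: elements smaller than the pivot go to the left
    i = low
    for j in range(low, high):
        if data[j] < pivot:
            data[i], data[j] = data[j], data[i]
            i += 1
    data[i], data[high] = data[high], data[i]  # pivot lands at its final index
    quick_sort(data, low, i - 1)
    quick_sort(data, i + 1, high)
    return data

print(quick_sort([3, 1, 2, 9, 0, 5]))  # → [0, 1, 2, 3, 5, 9]
```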

Comparison

When sorting arrays, it is preferred over MergeSort as it does not need as much auxiliary space. Both algorithms are recursive, but MergeSort needs O(n) auxiliary space while QuickSort only needs O(log(n)).

When to use it?

When sorting arrays and having to decide between MergeSort and QuickSort.

QuickSort is also tail recursive, so tail call optimization can be applied to it.

Radix Sort

This algorithm, also known as DigitalSort, is a non-comparative sorting algorithm that distributes elements into buckets according to their radix (or base), typically using CountSort as the per-digit subroutine. When the elements have more than one significant digit, this process is repeated for every digit.
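A least-significant-digit (LSD) sketch in Python for non-negative integers, running a stable counting sort on each digit (one common variant; the repository's version may differ):

```python
def radix_sort(values, base=10):
    """LSD radix sort for non-negative integers."""
    if not values:
        return []
    data = list(values)
    max_value = max(data)
    exp = 1
    while max_value // exp > 0:            # one pass per digit
        counts = [0] * base
        for v in data:                     # count digit occurrences
            counts[(v // exp) % base] += 1
        for d in range(1, base):           # prefix sums give output positions
            counts[d] += counts[d - 1]
        output = [0] * len(data)
        for v in reversed(data):           # reverse pass keeps stability
            d = (v // exp) % base
            counts[d] -= 1
            output[counts[d]] = v
        data = output
        exp *= base
    return data

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))  # → [2, 24, 45, 66, 75, 90, 170, 802]
```

Stability of the per-digit pass is essential: it preserves the ordering established by earlier, less significant digits.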

Where to use it?

This implementation only allows sorting integers (with one or more digits). Other implementations also allow sorting strings and other data types. This means that it can be used in the same situations as CountSort, but with keys of more than one digit.

Comparison

When every digit fits in log2(n) bits, the running time of RadixSort is better than QuickSort's for a wide range of input numbers. However, the constant factors hidden in the asymptotic notation are higher for RadixSort, and QuickSort uses hardware caches more efficiently. It should also be taken into account that RadixSort uses CountSort underneath, which takes extra space.

Selection Sort

This algorithm splits the array to be ordered into two separate parts: sorted (usually the left) and unsorted (usually the right). At each step, the algorithm finds the largest/smallest (descending/ascending) element in the unsorted part and appends it to the end of the sorted sub-array.
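An ascending-order sketch in Python (the repository's own version may differ):

```python
def selection_sort(items):
    data = list(items)
    n = len(data)
    for i in range(n - 1):
        smallest = i
        for j in range(i + 1, n):          # scan the unsorted region
            if data[j] < data[smallest]:
                smallest = j
        if smallest != i:                  # at most one swap per pass: O(n) writes total
            data[i], data[smallest] = data[smallest], data[i]
    return data

print(selection_sort([64, 25, 12, 22, 11]))  # → [11, 12, 22, 25, 64]
```

Note that each pass performs at most one swap, which is the property the next section relies on.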

When to use it?

It never performs more than O(n) swaps, so it can be useful when memory write operations are costly.

References

[1] Jon L. Bentley, M. Douglas McIlroy, "Engineering a Sort Function". Software: Practice and Experience, 23(11):1249-1265, Nov 1993.