# Sort Algorithms
In this notebook we will cover and implement three of the main sorting algorithms:
* Bubble Sort
* Merge Sort
* Quick Sort

## Bubble Sort
---
Bubble sort is a simple and straightforward sorting algorithm that repeatedly steps through the list, compares adjacent elements, and swaps them if they are in the wrong order. The process is repeated until the list is sorted. Here’s a breakdown of the main ideas behind bubble sort:

### Main Ideas of Bubble Sort

#### Comparison and Swap
   - Compare each pair of adjacent elements.
   - If the elements are in the wrong order (e.g., the first is greater than the second for ascending order), swap them.

#### Multiple Passes:
   - The process of comparison and swapping is repeated for multiple passes.
   - Each pass through the list places the next largest (or smallest, depending on the order) element in its correct position.
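
To make the passes visible, here is a minimal sketch that records the list after every pass. The helper name `bubble_passes` is hypothetical (it is not one of the notebook's functions), and it sorts a copy so the input stays untouched.

```python
def bubble_passes(values):
    """Return the state of the list after each pass of an ascending bubble sort."""
    data = list(values)  # work on a copy so the caller's list is not modified
    snapshots = []
    for i in range(len(data) - 1):
        # Compare adjacent pairs in the part that is not yet settled
        for j in range(len(data) - i - 1):
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
        snapshots.append(list(data))
    return snapshots


for number, state in enumerate(bubble_passes([5, 1, 4, 2, 8]), start=1):
    print(f"pass {number}: {state}")
# pass 1: [1, 4, 2, 5, 8]
# pass 2: [1, 2, 4, 5, 8]
# pass 3: [1, 2, 4, 5, 8]
# pass 4: [1, 2, 4, 5, 8]
```

After the first pass the largest value (8) is in its final position, and each later pass settles the next largest one. Passes 3 and 4 change nothing, which is the wasted work an early-exit check can avoid.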


<br><br>

### Characteristics of Bubble Sort

#### Time Complexity
  - Worst and average case: $O(n^2)$ (the worst case occurs when the array is in reverse order).
  - Best case: $O(n)$ (when the array is already sorted, with the optimized version).
  - Bubble sort is generally not used for large datasets due to its inefficiency.

#### Space Complexity
  - $O(1)$ (in-place sorting).

#### Stability
  - Bubble sort is a stable sorting algorithm because it does not change the relative order of elements with equal keys.
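
Stability is easiest to see with records that share a key. The sketch below is illustrative only: `bubble_sort_by_key` is a hypothetical helper that compares just the first field of each tuple, so the second field acts as a label showing the original order.

```python
def bubble_sort_by_key(records):
    """Ascending bubble sort that compares only the first field of each record."""
    data = list(records)  # sort a copy
    for i in range(len(data) - 1):
        for j in range(len(data) - i - 1):
            # Strict '>' means records with equal keys are never swapped
            if data[j][0] > data[j + 1][0]:
                data[j], data[j + 1] = data[j + 1], data[j]
    return data


records = [(2, "a"), (1, "b"), (2, "c"), (1, "d")]
print(bubble_sort_by_key(records))
# [(1, 'b'), (1, 'd'), (2, 'a'), (2, 'c')]
```

Within each key the labels keep their input order (`b` before `d`, `a` before `c`). Replacing `>` with `>=` would swap equal keys and lose that guarantee.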

In [1]:
### Implement the code here!

def bubble_sort(data):
    # One pass per element in the list
    for i in range(len(data)):
        # After i passes, the last i elements are already in their final
        # positions, so only the first (N - i) elements need checking. The
        # extra (-1) keeps data[j+1] in range and avoids an IndexError.
        for j in range(len(data) - i - 1):
            
            # If the left value is greater than the right value, swap them.
            if data[j] > data[j+1]:
                data[j], data[j+1] = data[j+1], data[j]
    return data

Testing the above function

In [2]:
### Test the code here!
import random
import time

randomlist = random.sample(range(40), 40)

timenow = time.time()
print(f"Random List: {randomlist}")
print(f"Sorted Random List: {bubble_sort(randomlist)}")
print(f"Execution time: {time.time() - timenow:.8f} s")

Random List: [22, 13, 14, 24, 33, 11, 7, 10, 8, 31, 9, 39, 30, 6, 12, 21, 15, 0, 36, 1, 5, 23, 25, 28, 29, 27, 32, 2, 38, 4, 35, 37, 34, 26, 3, 17, 18, 20, 19, 16]
Sorted Random List: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39]
Execution time: 0.00100732 s


#### Optimization for Bubble Sort
   - An optimized version of bubble sort can stop early if, during a pass, no swaps are made, indicating that the list is already sorted.

In [3]:
### Implement the code here!

def opt_bubble_sort(data):
    for i in range(len(data)):
        # Track whether this pass swapped anything
        swap = False
        for j in range(len(data) - i - 1):
            if data[j] > data[j+1]:
                data[j], data[j+1] = data[j+1], data[j]
                swap = True
        # A pass with no swaps means the list is sorted, so stop early
        if not swap:
            return data
    return data

In [4]:
### Test the code here!
import random
import time

randomlist = random.sample(range(40), 40)

timenow = time.time()
print(f"Random List: {randomlist}")
print(f"Sorted Random List: {opt_bubble_sort(randomlist)}")
print(f"Execution time: {time.time() - timenow:.8f} s")

Random List: [16, 13, 30, 36, 29, 27, 22, 23, 15, 37, 5, 12, 28, 34, 8, 21, 7, 38, 32, 9, 4, 33, 3, 18, 26, 25, 24, 10, 6, 17, 35, 19, 2, 31, 39, 20, 1, 11, 0, 14]
Sorted Random List: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39]
Execution time: 0.00000000 s
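
On a random list the early exit rarely triggers before the last few passes, so its benefit is clearer when counting comparisons on an already sorted input. The sketch below uses a hypothetical helper, `count_comparisons`, written only for this illustration.

```python
def count_comparisons(values, early_exit):
    """Bubble-sort a copy of values and return how many comparisons were made."""
    data = list(values)
    comparisons = 0
    for i in range(len(data)):
        swapped = False
        for j in range(len(data) - i - 1):
            comparisons += 1
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
                swapped = True
        # Optional optimization: stop as soon as a pass makes no swaps
        if early_exit and not swapped:
            break
    return comparisons


already_sorted = list(range(100))
print(count_comparisons(already_sorted, early_exit=False))  # 4950, i.e. n(n-1)/2
print(count_comparisons(already_sorted, early_exit=True))   # 99, i.e. n-1
```

With the check enabled, a sorted list of $n$ elements needs a single pass of $n - 1$ comparisons, which is the $O(n)$ best case mentioned earlier.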


## Merge Sort
Merge sort is a more advanced sorting algorithm that follows the divide-and-conquer paradigm. It divides the input array into two halves, recursively sorts each half, and then merges the two sorted halves back together.

### Main Ideas of Merge Sort

#### Divide:
   - Divide the unsorted list into two approximately equal halves.

#### Conquer (Recursion):
   - Recursively sort both halves. If the list has only one element, it is already sorted.

#### Combine (Merge):
   - Merge the two sorted halves into a single sorted list.
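
The merge step can be sketched on its own, separately from the recursion. The function name `merge_sorted` is hypothetical, and unlike an in-place variant it builds a new list.

```python
def merge_sorted(left, right):
    """Merge two lists that are each already sorted into one sorted list."""
    merged = []
    i = j = 0
    # Repeatedly take the smaller of the two front elements
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # '<=' prefers the left list on ties
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these slices is non-empty; append whatever remains
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sorted([1, 4, 9], [2, 3, 10, 11]))
# [1, 2, 3, 4, 9, 10, 11]
```

Each element is appended exactly once, so merging two halves with $n$ elements in total takes $O(n)$ steps.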

<br><br>

### Characteristics of Merge Sort

#### Time Complexity:
  - Merge sort has a time complexity of $O(n \log n)$ in all cases (worst, average, and best).
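
One way to see where the $O(n \log n)$ bound comes from, as a sketch that assumes $n$ is a power of two and that merging $n$ elements costs at most $cn$ steps for some constant $c$ (both are simplifying assumptions):

$$
T(n) = 2\,T(n/2) + cn, \qquad T(1) = c
$$

Unrolling the recurrence $k$ times gives

$$
T(n) = 2^k\,T(n/2^k) + k\,cn
$$

and setting $k = \log_2 n$ yields $T(n) = cn + cn \log_2 n = O(n \log n)$. Nothing in this argument depends on the order of the input, which is why the best, average, and worst cases coincide.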

#### Space Complexity:
  - The space complexity is $O(n)$ because of the extra space used for the temporary arrays.

#### Stability:
  - Merge sort is a stable sorting algorithm, maintaining the relative order of equal elements.

### Advantages and Disadvantages

#### Advantages:
  - Efficient for large datasets.
  - Guarantees $O(n \log n)$ time complexity.
  - Stable sort.

#### Disadvantages:
  - Requires additional memory for the temporary arrays.
  - Not an in-place sort; the extra space complexity is $O(n)$.

Merge sort is widely used for sorting linked lists and external sorting (e.g., when data is too large to fit into memory). It is also a preferred sorting algorithm in situations where stability is required.
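
The external-sorting idea can be sketched in miniature: sort small runs independently, then merge the sorted runs. In the sketch below the runs live in memory, whereas a real external sort would keep them in files; the function name `sort_in_runs` and the `run_size` parameter are hypothetical. The standard library's `heapq.merge` performs the k-way merge lazily.

```python
import heapq


def sort_in_runs(values, run_size):
    """Sort fixed-size runs independently, then k-way merge the sorted runs."""
    # Each run stands in for a chunk small enough to sort in memory
    runs = [
        sorted(values[start:start + run_size])
        for start in range(0, len(values), run_size)
    ]
    # heapq.merge yields items one at a time from already-sorted inputs
    return list(heapq.merge(*runs))


print(sort_in_runs([9, 3, 7, 1, 8, 2, 6, 4, 5], run_size=3))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Because the merge only ever looks at the front element of each run, it never needs all the data in memory at once, which is what makes the approach suitable for data larger than memory.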

In [5]:
def mergeSort(data):
    # A list with 0 or 1 elements is already sorted; otherwise split it at the middle index.
    if len(data) > 1:
        mid = len(data) // 2
        
        # Divide the elements in 2 halves:
        left = data[:mid]
        right = data[mid:]
        
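        # Recursively sort the first half (in place)
        mergeSort(left)

        # Recursively sort the second half (in place)
        mergeSort(right)

        # The recursion bottoms out when a half has a single element.

        # i indexes left, j indexes right, k indexes the merged output in data
        i, j, k = 0, 0, 0

        # Merge: repeatedly copy the smaller front element back into data.
        # Using <= takes from the left half on ties, which keeps the sort stable.
        while i < len(left) and j < len(right):
            if left[i] <= right[j]: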
                data[k] = left[i]
                i += 1
            else:
                data[k] = right[j]
                j += 1
            k += 1
        
        # Copy any elements remaining in the left half:
        while i < len(left):
            data[k] = left[i]
            i += 1
            k += 1

        # Copy any elements remaining in the right half:
        while j < len(right):
            data[k] = right[j]
            j += 1
            k += 1
    return data

Testing the above function

In [6]:
### Test the code here!
import random
import time

randomlist = random.sample(range(40), 40)

timenow = time.time()
print(f"Random List: {randomlist}")
print(f"Sorted Random List: {mergeSort(randomlist)}")
print(f"Execution time: {time.time() - timenow:.8f} s")

Random List: [14, 0, 6, 28, 24, 20, 8, 7, 18, 32, 19, 11, 36, 16, 3, 33, 27, 26, 39, 25, 37, 12, 34, 23, 22, 21, 4, 29, 5, 15, 30, 10, 13, 38, 17, 31, 35, 2, 1, 9]
Sorted Random List: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39]
Execution time: 0.00000000 s


### Bubble Sort vs Merge Sort:

In [7]:
import random
import time

randomlist = random.sample(range(7000), 7000)
randomlist2 = randomlist.copy()

timenow = time.time()
randomlist = bubble_sort(randomlist)
print(f"Execution time: {time.time() - timenow:.8f} s")

print(end="\n\n")

timenow = time.time()
randomlist2 = mergeSort(randomlist2)
print(f"Execution time: {time.time() - timenow:.8f} s")

Execution time: 2.09803677 s


Execution time: 0.01251078 s


## Quick Sort
Quick Sort is a popular and efficient sorting algorithm that uses the divide-and-conquer strategy to sort elements. Here are the main ideas behind Quick Sort, along with a Python implementation:

### Main Ideas

#### Divide-and-Conquer:
* Quick Sort divides the array into subarrays and sorts each subarray independently.
#### Pivot Selection:
* A pivot element is chosen from the array. The pivot can be any element, but common choices are the first element, the last element, the middle element, or a random element.
#### Partitioning:
* The array is rearranged so that all elements less than the pivot come before it, and all elements greater than the pivot come after it (elements equal to the pivot may go on either side, depending on the implementation). This places the pivot in its correct sorted position.
#### Recursion:
* The same process is recursively applied to the subarrays formed by dividing the array at the pivot.

<br><br>

### Steps of Quick Sort

* Choose a Pivot:
    * Select an element from the array as the pivot.
* Partition:
    * Rearrange the elements in the array so that elements less than the pivot are on the left, elements greater than the pivot are on the right.
* Recursively Apply:
    * Apply the same process to the left and right subarrays.
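
A single partition step can be sketched in isolation. The helper below, `partition_last_pivot`, is hypothetical and assumes the last element is used as the pivot; it works on a copy and returns the rearranged list together with the pivot's final index.

```python
def partition_last_pivot(values):
    """Partition a copy of values around its last element.

    Returns (rearranged_list, pivot_index).
    """
    data = list(values)
    pivot = data[-1]
    boundary = -1  # index of the last element known to be <= pivot
    for j in range(len(data) - 1):
        if data[j] <= pivot:
            boundary += 1
            data[boundary], data[j] = data[j], data[boundary]
    # Place the pivot just after the "<= pivot" region
    data[boundary + 1], data[-1] = data[-1], data[boundary + 1]
    return data, boundary + 1


print(partition_last_pivot([7, 2, 9, 1, 8, 4, 5]))
# ([2, 1, 4, 5, 8, 9, 7], 3)
```

The pivot (5) ends up at index 3 with smaller values to its left and larger values to its right. Neither side is sorted yet; that is the job of the recursive calls.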


In [8]:
def quickSort(data):
    def partitioning(first, last):
        
        # Always select the last element as the pivot
        pivot = data[last]
        
        # i marks the end of the "<= pivot" region; it starts at first - 1
        # because that region is empty and i is incremented before each swap
        i = first - 1
        # Compare every element from first to last - 1 with the pivot
        for j in range(first, last):
            if data[j] <= pivot:
                i += 1
                # Move data[j] into the "<= pivot" region
                data[i], data[j] = data[j], data[i]

        # After the loop, swap the pivot into position i + 1, its final sorted position
        data[i + 1], data[last] = data[last], data[i + 1]
        
        # Return the position for the pivot
        return i + 1
    
    def quickSortRecursive(first, last):
        if first < last:
            pivot_index = partitioning(first, last)
            
            # Recursively call the quickSortRecursive without passing the pivot as an element.
            quickSortRecursive(first, pivot_index - 1)
            quickSortRecursive(pivot_index + 1, last)
    
    # Enter the recursion
    quickSortRecursive(0, len(data) - 1)
    
    # Return sorted data
    return data

Testing:

In [14]:
data = random.sample(range(50), 50)
print(data)
data = quickSort(data)
print(data)

[42, 17, 28, 21, 11, 6, 43, 14, 18, 13, 31, 26, 41, 35, 7, 27, 3, 0, 1, 9, 12, 24, 45, 5, 36, 40, 33, 34, 49, 16, 10, 38, 20, 30, 8, 2, 19, 32, 48, 47, 22, 29, 46, 15, 39, 44, 23, 37, 4, 25]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49]


Implementing Quick Sort again, but now it always selects the first element as the pivot:

In [9]:
def quickSortFirstElementAsPivot(data):
    def partitioning(first, last):
        
        # Always select the first element as the pivot
        pivot = data[first]
        
        # i marks the start of the ">= pivot" region; it starts at last + 1
        # because that region is empty and i is decremented before each swap
        i = last + 1
        # Compare every element from last down to first + 1 with the pivot
        for j in range(last, first, -1):
            if data[j] >= pivot:
                i -= 1
                # Move data[j] into the ">= pivot" region
                data[i], data[j] = data[j], data[i]

        # After the loop, swap the pivot into position i - 1, its final sorted position
        data[i - 1], data[first] = data[first], data[i - 1]
        
        # Return the position for the pivot
        return i - 1
    
    def quickSortRecursive(first, last):
        if first < last:
            pivot_index = partitioning(first, last)
            
            # Recursively call the quickSortRecursive without passing the pivot as an element.
            quickSortRecursive(first, pivot_index - 1)
            quickSortRecursive(pivot_index + 1, last)
    
    # Enter the recursion
    quickSortRecursive(0, len(data) - 1)
    
    # Return sorted data
    return data

Testing:

In [12]:
data = random.sample(range(50), 50)
print(data)
data = quickSortFirstElementAsPivot(data)
print(data)

[19, 29, 25, 28, 45, 13, 41, 21, 16, 40, 47, 4, 42, 11, 37, 27, 5, 46, 34, 9, 23, 24, 26, 17, 33, 44, 8, 35, 18, 3, 22, 10, 12, 38, 30, 20, 49, 6, 2, 43, 36, 32, 1, 7, 0, 48, 31, 15, 39, 14]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49]
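
The Pivot Selection notes list a random element as another common choice. A minimal sketch of that variant follows; the name `quick_sort_random_pivot` is hypothetical, and for simplicity it sorts and returns a copy rather than sorting in place. It picks a random index, moves that element to the end, and then partitions exactly as in the last-element version.

```python
import random


def quick_sort_random_pivot(values):
    """Quick Sort on a copy of values, choosing each pivot uniformly at random."""
    data = list(values)

    def partition(first, last):
        # Move a randomly chosen element to the end, then partition around it
        chosen = random.randint(first, last)  # both ends inclusive
        data[chosen], data[last] = data[last], data[chosen]
        pivot = data[last]
        i = first - 1
        for j in range(first, last):
            if data[j] <= pivot:
                i += 1
                data[i], data[j] = data[j], data[i]
        data[i + 1], data[last] = data[last], data[i + 1]
        return i + 1

    def sort_range(first, last):
        if first < last:
            pivot_index = partition(first, last)
            sort_range(first, pivot_index - 1)
            sort_range(pivot_index + 1, last)

    sort_range(0, len(data) - 1)
    return data


print(quick_sort_random_pivot([3, 1, 2, 3, 0]))
# [0, 1, 2, 3, 3]
```

With a fixed first or last pivot, an already sorted input produces maximally unbalanced partitions at every level. A random pivot makes that outcome unlikely for any particular input.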


### Quick Sort vs Bubble Sort vs Merge Sort
---

In [13]:
import random
import time

randomlist = random.sample(range(15000), 15000)
randomlist2 = randomlist.copy()
randomlist3 = randomlist.copy()

timenow = time.time()
randomlist = bubble_sort(randomlist)
print(f"Bubble Sort execution time: {time.time() - timenow:.8f} s")

timenow = time.time()
randomlist2 = mergeSort(randomlist2)
print(f"Merge Sort execution time: {time.time() - timenow:.8f} s")

timenow = time.time()
randomlist3 = quickSort(randomlist3)
print(f"Quick Sort execution time: {time.time() - timenow:.8f} s")

Bubble Sort execution time: 9.58008265 s
Merge Sort execution time: 0.03053498 s
Quick Sort execution time: 0.02152944 s
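
The `0.00000000 s` readings in some of the earlier test cells are probably an artifact of the clock resolution of `time.time()` rather than a true zero. A hedged alternative for short measurements is `time.perf_counter()`, repeated a few times on fresh copies of the input. The helper `time_sort` below is a hypothetical sketch, not part of the notebook's code.

```python
import random
import time


def time_sort(sort_function, values, repeats=3):
    """Return the best wall-clock time, in seconds, over several runs.

    Each run sorts a fresh copy, so in-place sorts never see sorted input.
    """
    best = float("inf")
    for _ in range(repeats):
        sample = list(values)
        start = time.perf_counter()
        sort_function(sample)
        best = min(best, time.perf_counter() - start)
    return best


values = random.sample(range(2000), 2000)
print(f"built-in sorted: {time_sort(sorted, values):.6f} s")
```

Any of the sorting functions defined in this notebook can be passed in place of `sorted`, since they all take a list as their only argument.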


## Conclusion:
---

As shown above, both Merge Sort and Quick Sort handle large lists far better than Bubble Sort, but they come at a price in memory: Merge Sort allocates $O(n)$ temporary lists, and the recursive Quick Sort uses stack space for its recursive calls. Since this study is purely educational (you will probably never need to implement a sorting algorithm yourself, because most languages already provide one), I'll leave the implementations above as they are; I don't see the need for further optimization in this Python notebook.
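
For reference, this is roughly what relying on the language looks like in Python: the built-in `sorted` returns a new list, `list.sort` sorts in place, and both accept a `key` function and are stable. The sample data below is made up for illustration.

```python
records = [("pear", 3), ("apple", 3), ("fig", 1)]

# sorted() returns a new list; key selects the field to compare
by_count = sorted(records, key=lambda item: item[1])
print(by_count)
# [('fig', 1), ('pear', 3), ('apple', 3)]

# list.sort() sorts in place and returns None
numbers = [5, 2, 9]
numbers.sort(reverse=True)
print(numbers)
# [9, 5, 2]
```

Because the built-in sort is stable, `pear` stays ahead of `apple`: both have a count of 3 and keep their input order.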