# Sorting Algorithms

Welcome to this Jupyter Notebook on Sorting Algorithms! This notebook aims to provide a comprehensive overview of various sorting algorithms, which are essential for arranging data in a specified order. Understanding sorting algorithms is crucial for optimizing data processing, enhancing performance, and solving many computational problems efficiently.

### What Are Sorting Algorithms?

Sorting algorithms are methods used to reorder elements in a data structure, such as an array or list, according to a specified sorting criterion (e.g., ascending or descending order). Different sorting algorithms have varying characteristics and performance trade-offs, making them suitable for different types of data and applications.

### Key Concepts

In this notebook, we will explore the following key sorting algorithms:

1. **Bubble Sort**: A simple algorithm that repeatedly steps through the list, compares adjacent elements, and swaps them if they are in the wrong order.

2. **Selection Sort**: An algorithm that repeatedly selects the smallest (or largest) element from the unsorted portion and moves it to the end of the sorted portion.

3. **Insertion Sort**: A method that builds the sorted array one item at a time by inserting each new element into its correct position among the already sorted elements.

4. **Merge Sort**: A divide-and-conquer algorithm that divides the array into smaller subarrays, sorts them, and then merges them back together in a sorted order.

5. **QuickSort**: A divide-and-conquer algorithm that selects a pivot element, partitions the array into subarrays around the pivot, and recursively sorts the subarrays.

6. **Heap Sort**: An algorithm that converts the array into a heap structure (typically a max heap), repeatedly extracts the maximum element from the heap, places it at the end of the sorted portion, and restores the heap property.

7. **Counting Sort**: A non-comparative sorting algorithm that counts the occurrences of each distinct element and uses this count to place elements into their correct positions.

8. **Radix Sort**: A non-comparative algorithm that sorts numbers by processing individual digits, starting from the least significant digit to the most significant.

9. **Bucket Sort**: An algorithm that distributes elements into a number of buckets, sorts each bucket individually, and then concatenates the sorted buckets.

10. **Tim Sort**: A hybrid sorting algorithm derived from Merge Sort and Insertion Sort. It divides the array into smaller runs, sorts them using Insertion Sort, and then merges the sorted runs using a modified Merge Sort. It is particularly efficient on real-world data and is used in Python’s built-in sorting functions.

### Learning Objectives

By the end of this notebook, you will:
- Understand the basic concepts and operations of each sorting algorithm.
- Learn how to implement these sorting algorithms in Python.


In the following cell, we create an unsorted array and a dictionary that stores the execution time of each algorithm. You can modify the array length, the minimum value, and the maximum value; the same array is used for all the algorithms.
The execution time of each algorithm is measured in seconds for comparison purposes.
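
The timing pattern used throughout this notebook (copy the array, record a start time, sort, record an end time) can also be wrapped in a small helper. The sketch below is only an illustration: `time_sort` is a hypothetical name that does not appear elsewhere in the notebook, and it uses `time.perf_counter()`, a standard-library clock intended for measuring short durations.

```python
import time
from typing import Callable


def time_sort(sort_fn: Callable[[list[int]], object], data: list[int]) -> float:
    """Return the seconds sort_fn needs on a private copy of data.

    time_sort is a hypothetical helper written for this illustration.
    """
    working_copy = data[:]        # keep the shared input untouched
    start = time.perf_counter()   # high-resolution clock for short intervals
    sort_fn(working_copy)
    return time.perf_counter() - start


sample = [5, 3, 8, 1]
elapsed = time_sort(sorted, sample)
print(f"sorted() took {elapsed:.6f} seconds")
```

Because the helper sorts a copy, the same input can be reused for every algorithm, which keeps the comparison fair.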

In [None]:
import random as rand
import time  # Used to measure execution time

len_array = 50000
min_value = 0
max_value = 1000

array = []

time_dict = {}

for index in range(len_array):
    array.append(rand.randint(min_value, max_value))

print(array)

## Selection Sort
Selection Sort is a simple and straightforward algorithm for sorting an array. It works by repeatedly selecting the smallest (or largest, depending on the desired order) element from the unsorted portion of the array and moving it to its correct position in the sorted portion. Initially, the entire array is considered unsorted.

The steps of Selection Sort are as follows:
1. Find the minimum element in the unsorted portion of the array.
2. Swap this minimum element with the first element of the unsorted portion, thereby marking it as sorted.
3. Move the boundary between the sorted and unsorted portions and repeat the process for the rest of the array.

This process continues until the entire array is sorted.

Selection Sort is known for its simplicity and ease of implementation, making it a good choice for small arrays or when simplicity is prioritized over performance. It performs at most one swap per pass, so at most `n - 1` swaps for an array of `n` elements, which can be beneficial when swap operations are costly, such as when sorting data stored in flash memory or databases. Selection Sort is an in-place algorithm, meaning it requires only a constant amount of extra space beyond the original array.
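
To make the claim about swaps concrete, the following sketch counts the swaps made by Selection Sort and by Bubble Sort on the same input. The function names `count_selection_swaps` and `count_bubble_swaps` are hypothetical helpers written for this illustration; they work on a copy and are not the notebook's own implementations.

```python
def count_selection_swaps(values: list[int]) -> int:
    """Count the swaps Selection Sort performs (illustrative helper)."""
    data = values[:]
    swaps = 0
    for i in range(len(data) - 1):
        min_index = i
        for j in range(i + 1, len(data)):
            if data[j] < data[min_index]:
                min_index = j
        # Only count a swap when the minimum is not already in place
        if min_index != i:
            data[i], data[min_index] = data[min_index], data[i]
            swaps += 1
    return swaps


def count_bubble_swaps(values: list[int]) -> int:
    """Count the swaps Bubble Sort performs (illustrative helper)."""
    data = values[:]
    swaps = 0
    for i in range(len(data)):
        for j in range(len(data) - i - 1):
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
                swaps += 1
    return swaps


reversed_input = [5, 4, 3, 2, 1]
print(count_selection_swaps(reversed_input))  # 2
print(count_bubble_swaps(reversed_input))     # 10
```

On this reversed input, Bubble Sort swaps once per out-of-order pair, while Selection Sort moves each misplaced element directly to its final position.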

A visualization of how the Selection Sort algorithm works is available at the following link: https://visualgo.net/en/sorting

In [2]:
def selection_sort(array: list[int]) -> list[int]:
    # Iterate over each element in the array except the last one
    for i in range(len(array) - 1):
        # Assume the current element is the smallest
        min_index = i
        # Iterate through the remaining elements to find the actual smallest element
        for j in range(i + 1, len(array)):
            # Update min_index if a smaller element is found
            if array[j] < array[min_index]:
                min_index = j
        # Swap the smallest element found with the current element
        array[i], array[min_index] = array[min_index], array[i]
    # Return the sorted array
    return array

Execution of the Selection Sort algorithm:

In [None]:
selection_sort_array = array[:]

start_time = time.time()  # Record start time
selection_sort(selection_sort_array)
end_time = time.time()  # Record end time

selection_sort_execution_time = end_time - start_time

time_dict["Selection Sort"] = selection_sort_execution_time

print(selection_sort_array)

print("Execution time:", selection_sort_execution_time, "seconds")

## Bubble Sort
Bubble Sort is a basic and easy-to-understand sorting algorithm. It works by repeatedly stepping through the list to be sorted, comparing each pair of adjacent elements and swapping them if they are in the wrong order. This process is repeated until no more swaps are needed, which means the list is sorted.

The steps of Bubble Sort can be outlined as follows:
1. Start at the beginning of the array and compare the first two elements.
2. If the first element is greater than the second, swap them.
3. Move to the next pair of elements and repeat the comparison and swap if necessary.
4. Continue this process until the end of the array is reached. After each pass, the largest unsorted element "bubbles" up to its correct position in the sorted portion of the array.

The algorithm repeats these passes through the array, excluding the last sorted elements, until no swaps are needed in a pass, indicating that the array is fully sorted.

Bubble Sort is known for its simplicity and is often used for educational purposes to demonstrate basic sorting concepts. It has the advantage of being easy to implement and understand. Additionally, it is a stable sorting algorithm, meaning that it preserves the relative order of equal elements. Bubble Sort also works in place, requiring only a small constant amount of additional memory. However, it is generally not suitable for large datasets due to its inefficiency compared to more advanced sorting algorithms.
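
Stability is easiest to see with records that share a key. The sketch below is an assumption-laden illustration, not the notebook's implementation: `bubble_sort_by_key` is a hypothetical variant that sorts `(key, label)` tuples by the key only, so the labels reveal whether equal keys kept their original order.

```python
def bubble_sort_by_key(records: list[tuple[int, str]]) -> list[tuple[int, str]]:
    """Bubble Sort on (key, label) pairs, comparing keys only (illustrative)."""
    data = records[:]
    for i in range(len(data)):
        for j in range(len(data) - i - 1):
            # Strict > never swaps equal keys, which is what makes the sort stable
            if data[j][0] > data[j + 1][0]:
                data[j], data[j + 1] = data[j + 1], data[j]
    return data


records = [(2, "a"), (1, "b"), (2, "c"), (1, "d")]
print(bubble_sort_by_key(records))
# [(1, 'b'), (1, 'd'), (2, 'a'), (2, 'c')]
```

Among the records with key 1, `b` still precedes `d`, and among those with key 2, `a` still precedes `c`.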

A visualization of how the Bubble Sort algorithm works is available at the following link: https://visualgo.net/en/sorting

In [None]:
def bubble_sort(array: list[int]) -> list[int]:
    # Make at most len(array) passes over the array
    for i in range(len(array)):
        # Track whether this pass swapped anything
        swapped = False
        # Iterate over the array, ignoring the last i elements which are already sorted
        for j in range(0, len(array) - i - 1):
            # Compare adjacent elements
            if array[j] > array[j + 1]:
                # Swap elements if they are in the wrong order
                array[j], array[j + 1] = array[j + 1], array[j]
                swapped = True
        # Stop early: a pass with no swaps means the array is sorted
        if not swapped:
            break
    # Return the sorted array
    return array

Execution of the Bubble Sort algorithm:

In [None]:
bubble_sort_array = array[:]

start_time = time.time()  # Record start time
bubble_sort(bubble_sort_array)
end_time = time.time()  # Record end time

bubble_sort_execution_time = end_time - start_time

time_dict["Bubble Sort"] = bubble_sort_execution_time

print(bubble_sort_array)

print("Execution time:", bubble_sort_execution_time, "seconds")

## Insertion Sort
Insertion Sort is a straightforward and intuitive sorting algorithm. It works by building a sorted portion of the array one element at a time. The algorithm iterates through the array, taking one element from the unsorted portion and inserting it into its correct position in the sorted portion.

Here’s how Insertion Sort works:
1. Start with the second element of the array (considering the first element as the initial sorted portion).
2. Compare this element with the elements in the sorted portion (to its left).
3. Shift all elements in the sorted portion that are greater than the current element one position to the right.
4. Insert the current element into its correct position in the sorted portion.
5. Move to the next element and repeat the process until the entire array is sorted.

The algorithm continues this process until all elements from the unsorted portion have been inserted into the sorted portion.

Insertion Sort is appreciated for its simplicity and ease of implementation. It works well for small or partially sorted datasets. Additionally, it is a stable sorting algorithm, meaning it maintains the relative order of equal elements. It also sorts the array in place, requiring only a small amount of additional memory. Despite its advantages, Insertion Sort is less efficient for larger datasets compared to more advanced sorting algorithms.
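
The sensitivity to existing order can be illustrated by counting how many element shifts Insertion Sort performs. This is a minimal sketch; `count_insertion_shifts` is a hypothetical helper introduced only for this example and works on a copy of its input.

```python
def count_insertion_shifts(values: list[int]) -> int:
    """Count how many elements Insertion Sort shifts to the right (illustrative)."""
    data = values[:]
    shifts = 0
    for i in range(1, len(data)):
        key = data[i]
        j = i - 1
        # Each loop iteration moves one larger element one slot to the right
        while j >= 0 and key < data[j]:
            data[j + 1] = data[j]
            shifts += 1
            j -= 1
        data[j + 1] = key
    return shifts


nearly_sorted = [1, 2, 4, 3, 5, 6]
reversed_input = [6, 5, 4, 3, 2, 1]
print(count_insertion_shifts(nearly_sorted))   # 1
print(count_insertion_shifts(reversed_input))  # 15
```

The nearly sorted input needs a single shift, while the reversed input needs one shift for every pair of elements.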

A visualization of how the Insertion Sort algorithm works is available at the following link: https://visualgo.net/en/sorting

In [None]:
def insertion_sort(array: list[int]) -> list[int]:
    # Iterate through the array starting from the second element
    for i in range(1, len(array)):
        # Store the current element (key) that needs to be inserted in the sorted part
        key = array[i]
        # Initialize j to the index of the element just before i
        j = i - 1
        # Shift elements of the sorted segment that are greater than key to the right
        while j >= 0 and key < array[j]:
            array[j + 1] = array[j]
            j -= 1
        # Place the key in its correct position within the sorted segment
        array[j + 1] = key
    # Return the sorted array
    return array

Execution of the Insertion Sort Algorithm:

In [None]:
insertion_sort_array = array[:]

start_time = time.time()  # Record start time
insertion_sort(insertion_sort_array)
end_time = time.time()  # Record end time

insertion_sort_execution_time = end_time - start_time

time_dict["Insertion Sort"] = insertion_sort_execution_time

print(insertion_sort_array)

print("Execution time:", insertion_sort_execution_time, "seconds")

## Merge Sort
Merge Sort is a highly efficient and widely used sorting algorithm based on the divide-and-conquer strategy. It works by dividing the array into smaller subarrays, sorting those subarrays, and then merging them back together to form a sorted array.

Here’s how Merge Sort operates:
1. **Divide:** Split the array into two roughly equal halves.
2. **Conquer:** Recursively apply Merge Sort to each half, breaking them down into even smaller subarrays until each subarray contains a single element or is empty (which means they are inherently sorted).
3. **Merge:** Combine the sorted subarrays back into a single sorted array. This is done by comparing the elements of each subarray and merging them in sorted order.

The merging process involves comparing the elements from the two subarrays and arranging them in the correct order to form a new, sorted array.

Merge Sort is known for its efficiency and stability. Its running time is proportional to n log n regardless of the initial arrangement of the data, which makes it particularly effective for sorting large datasets. It is stable, meaning it preserves the relative order of equal elements, provided the merge step takes from the left subarray on ties. Merge Sort requires additional memory for the temporary arrays used during the merging process, but it is well-suited for applications where stability and predictable performance are critical.
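
The merge step is where stability is decided. As a small sketch under the assumption that records are `(key, label)` tuples compared by key only, the hypothetical helper `merge_by_key` below takes from the left list whenever two keys are equal.

```python
def merge_by_key(
    left: list[tuple[int, str]], right: list[tuple[int, str]]
) -> list[tuple[int, str]]:
    """Merge two key-sorted lists, preferring the left list on ties (illustrative)."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        # <= keeps the left element first when keys are equal
        if left[i][0] <= right[j][0]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # At most one of these slices is non-empty
    result.extend(left[i:])
    result.extend(right[j:])
    return result


left = [(1, "L1"), (3, "L3")]
right = [(1, "R1"), (2, "R2")]
print(merge_by_key(left, right))
# [(1, 'L1'), (1, 'R1'), (2, 'R2'), (3, 'L3')]
```

With a strict `<` comparison instead, `R1` would be emitted before `L1`, and equal elements from the left half could end up after those from the right half.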

A visualization of how the Merge Sort algorithm works is available at the following link: https://visualgo.net/en/sorting

In [None]:
def merge_sort(array: list[int]) -> list[int]:
    # Base case: If the array has 1 or no elements, it is already sorted
    if len(array) <= 1:
        return array
    # Find the middle index of the array
    mid = len(array) // 2
    # Recursively split and sort the left half
    left_array = merge_sort(array[:mid])
    # Recursively split and sort the right half
    right_array = merge_sort(array[mid:])
    # Merge the two sorted halves
    merged_array = merge(left_array, right_array)
    # Return the merged and sorted array
    return merged_array

def merge(left_array: list[int], right_array: list[int]) -> list[int]:
    result = []
    i, j = 0, 0
    # Merge the two arrays while there are elements in both
    while i < len(left_array) and j < len(right_array):
        # Use <= so that equal elements keep their original order (stability)
        if left_array[i] <= right_array[j]:
            result.append(left_array[i])
            i += 1
        else:
            result.append(right_array[j])
            j += 1
    # Append any remaining elements from the left array
    while i < len(left_array):
        result.append(left_array[i])
        i += 1
    # Append any remaining elements from the right array
    while j < len(right_array):
        result.append(right_array[j])
        j += 1
    # Return the merged and sorted array
    return result

Execution of the Merge Sort Algorithm:

In [None]:
start_time = time.time()  # Record start time
merge_sort_result = merge_sort(array)
end_time = time.time()  # Record end time

merge_sort_execution_time = end_time - start_time

time_dict["Merge Sort"] = merge_sort_execution_time

print(merge_sort_result)

print("Execution time:", merge_sort_execution_time, "seconds")

## QuickSort
QuickSort is a widely used and efficient sorting algorithm based on the divide-and-conquer strategy. It works by selecting a "pivot" element from the array and partitioning the other elements into two groups: those less than the pivot and those greater than the pivot. The algorithm then recursively applies the same process to the subarrays formed by the partitioning.

Here’s how QuickSort operates:
1. **Choose a Pivot:** Select an element from the array to act as the pivot. The choice of pivot can vary, but common methods include choosing the first element, the last element, or a random element.
2. **Partition:** Rearrange the array so that elements less than the pivot come before it and elements greater than the pivot come after it. The pivot is placed in its correct position in the sorted array.
3. **Recursively Sort Subarrays:** Apply QuickSort recursively to the subarrays of elements less than the pivot and greater than the pivot. This step continues until each subarray is sorted.

The process involves repeatedly choosing pivots, partitioning the array, and sorting the resulting subarrays. The recursive approach continues until the base case is reached, where subarrays are either empty or contain a single element, making them inherently sorted.

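QuickSort is appreciated for its efficiency and is commonly used in practice for large datasets. It sorts in place: partitioning only swaps elements within the original array, so the extra memory is limited to the recursion stack. That stack is logarithmic in the array length on average but can grow linearly in the worst case, for example on an already sorted array when the last element is always chosen as the pivot.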
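
How much the pivot choice matters can be shown by measuring the recursion depth. The sketch below is an illustration under stated assumptions: `quick_sort_depth` is a hypothetical helper that always picks the last element as the pivot and reports the deepest level of recursion that actually partitions a segment.

```python
import random


def quick_sort_depth(values: list[int]) -> int:
    """Sort a copy with last-element pivots and return the deepest recursion level."""
    data = values[:]
    max_depth = 0

    def sort(start: int, end: int, depth: int) -> None:
        nonlocal max_depth
        if start >= end:
            return
        max_depth = max(max_depth, depth)
        # Partition around the last element of the segment
        pivot = data[end]
        i = start - 1
        for j in range(start, end):
            if data[j] <= pivot:
                i += 1
                data[i], data[j] = data[j], data[i]
        data[i + 1], data[end] = data[end], data[i + 1]
        sort(start, i, depth + 1)
        sort(i + 2, end, depth + 1)

    sort(0, len(data) - 1, 1)
    return max_depth


sorted_input = list(range(200))
shuffled_input = sorted_input[:]
random.Random(0).shuffle(shuffled_input)  # fixed seed for a repeatable example

print(quick_sort_depth(sorted_input))    # 199
print(quick_sort_depth(shuffled_input))  # much smaller than 199
```

On sorted input every partition is maximally unbalanced, so the depth grows with the array length; for much larger sorted arrays this would exceed Python's default recursion limit.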

A visualization of how the Quick Sort algorithm works is available at the following link: https://visualgo.net/en/sorting

In [None]:
def partition(array: list[int], start: int, end: int) -> int:
    # Choose the last element as the pivot
    pivot = array[end]
    # Initialize the index of the smaller element
    i = start - 1
    # Iterate through the array from start to end - 1
    for j in range(start, end):
        # If the current element is less than or equal to the pivot
        if array[j] <= pivot:
            # Increment the index of the smaller element
            i = i + 1
            # Swap the current element with the element at index i
            array[i], array[j] = array[j], array[i]

    # Place the pivot in its correct position by swapping with the element at i + 1
    array[i + 1], array[end] = array[end], array[i + 1]
    # Return the index of the pivot
    return i + 1

def quick_sort(array: list[int], start: int, end: int) -> None:
    # If there is more than one element in the segment
    if start < end:
        # Partition the array and get the index of the pivot
        pivot = partition(array, start, end)
        # Recursively apply quickSort to the left segment
        quick_sort(array, start, pivot - 1)
        # Recursively apply quickSort to the right segment
        quick_sort(array, pivot + 1, end)

Execution of the Quick Sort Algorithm:

In [None]:
quick_sort_array = array[:]

start_time = time.time()  # Record start time
quick_sort(quick_sort_array, 0, len(quick_sort_array) - 1)
end_time = time.time()  # Record end time

quick_sort_execution_time = end_time - start_time

time_dict["Quick Sort"] = quick_sort_execution_time

print(quick_sort_array)

print("Execution time:", quick_sort_execution_time, "seconds")

### Randomized Quick Sort

Randomized QuickSort is a variant of the traditional QuickSort algorithm that incorporates randomness to enhance performance and reduce the likelihood of encountering worst-case scenarios. It operates using the same divide-and-conquer strategy as QuickSort, but with a randomized approach for selecting the pivot element.

Here’s how Randomized QuickSort works:

1. **Randomly Choose a Pivot:** Select a pivot element from the array randomly. This randomness helps to ensure that the algorithm performs well on a variety of input distributions and reduces the chance of consistently poor performance.
2. **Partition:** Rearrange the array so that elements less than the pivot are placed before it and elements greater than the pivot are placed after it. The pivot is then positioned in its correct place in the sorted array.
3. **Recursively Sort Subarrays:** Apply Randomized QuickSort recursively to the subarrays formed by elements less than the pivot and those greater than the pivot. This process is repeated until each subarray is sorted.

By introducing randomness in the pivot selection process, Randomized QuickSort minimizes the risk of encountering the worst-case performance scenarios typical of deterministic pivot selection methods, such as always picking the first or last element. This makes it more robust and often improves its performance on average.

Randomized QuickSort retains the benefits of the original QuickSort, including being an in-place sorting algorithm that requires only a small amount of additional memory. It is effective for large datasets and can be easily adapted to sort arrays in different orders. The randomness in pivot selection helps ensure more balanced partitions and generally leads to efficient sorting performance.

A visualization of how the Random Quick Sort algorithm works is available at the following link: https://visualgo.net/en/sorting

In [None]:
def randomized_partition(array: list[int], start: int, end: int) -> int:
    # Randomly choose an index between start and end to be the pivot
    pivot_index = rand.randint(start, end)
    # Swap the pivot element with the last element
    array[pivot_index], array[end] = array[end], array[pivot_index]
    # Choose the last element as the pivot (which is now the previously randomly chosen pivot)
    pivot = array[end]
    # Initialize the index of the smaller element
    i = start - 1
    # Iterate through the array from start to end - 1
    for j in range(start, end):
        # If the current element is less than or equal to the pivot
        if array[j] <= pivot:
            # Increment the index of the smaller element
            i += 1
            # Swap the current element with the element at index i
            array[i], array[j] = array[j], array[i]
    # Place the pivot in its correct position by swapping with the element at i + 1
    array[i + 1], array[end] = array[end], array[i + 1]
    # Return the index of the pivot
    return i + 1

def randomized_quick_sort(array: list[int], start: int, end: int) -> None:
    # If there is more than one element in the segment
    if start < end:
        # Partition the array and get the index of the pivot
        pivot = randomized_partition(array, start, end)
        # Recursively apply quickSort to the left segment
        randomized_quick_sort(array, start, pivot - 1)
        # Recursively apply quickSort to the right segment
        randomized_quick_sort(array, pivot + 1, end)


Execution of the Randomized Quick Sort Algorithm:

In [None]:
rand_quick_sort_array = array[:]

start_time = time.time()  # Record start time
randomized_quick_sort(rand_quick_sort_array, 0, len(rand_quick_sort_array) - 1)
end_time = time.time()  # Record end time

rand_quick_sort_execution_time = end_time - start_time

time_dict["Randomized Quick Sort"] = rand_quick_sort_execution_time

print(rand_quick_sort_array)

print("Execution time:", rand_quick_sort_execution_time, "seconds")

## Counting Sort
Counting Sort is a non-comparative sorting algorithm that is particularly effective for sorting integers within a known range. It works by counting the occurrences of each distinct element in the input array and then using this count to determine the position of each element in the sorted output.

Here’s how Counting Sort operates:
1. **Determine the Range:** Identify the range of the input values, which is the difference between the maximum and minimum values in the array.
2. **Count Occurrences:** Create a count array where each index corresponds to a possible value in the input array. Iterate through the input array and populate this count array with the frequency of each value.
3. **Accumulate Counts:** Modify the count array by accumulating the counts. This step transforms the count array into a position array, which indicates the final position of each element in the sorted output.
4. **Build the Sorted Array:** Create an output array of the same length as the input array. Use the position array to place each element from the input array into its correct position in the output array based on the accumulated counts.

Counting Sort is efficient for sorting integers within a limited range and is particularly useful when the range of input values is not significantly larger than the number of elements to be sorted. It is a stable sorting algorithm, meaning it preserves the relative order of equal elements, and its running time grows linearly with the number of elements plus the size of the value range.

This algorithm is best suited for applications where the range of possible values is known and relatively small compared to the size of the input array. Counting Sort is not an in-place algorithm: it needs an output array as long as the input, plus a count array whose size is proportional to the range of the input values.
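
The four steps above can be traced on a small input. The following is a minimal sketch, not the notebook's implementation: `counting_sort_trace` is a hypothetical helper that prints the count array before and after accumulation, and it offsets every value by the minimum so that the count array starts at index 0.

```python
def counting_sort_trace(values: list[int]) -> list[int]:
    """Counting Sort that prints its intermediate arrays (illustrative helper)."""
    low = min(values)
    counts = [0] * (max(values) - low + 1)

    # Step 2: count occurrences, shifted so that `low` maps to index 0
    for value in values:
        counts[value - low] += 1
    print("counts:   ", counts)

    # Step 3: prefix sums turn counts into end positions
    for index in range(1, len(counts)):
        counts[index] += counts[index - 1]
    print("positions:", counts)

    # Step 4: walk the input backwards so equal values keep their order
    output = [0] * len(values)
    for value in reversed(values):
        counts[value - low] -= 1
        output[counts[value - low]] = value
    return output


print(counting_sort_trace([7, 5, 6, 5, 9]))
# counts:    [2, 1, 1, 0, 1]
# positions: [2, 3, 4, 4, 5]
# [5, 5, 6, 7, 9]
```

The count array has five entries (for the values 5 through 9) even though the largest value is 9, which is the point of subtracting the minimum.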

A visualization of how the Counting Sort algorithm works is available at the following link: https://visualgo.net/en/sorting

In [None]:
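def counting_sort(array: list[int]) -> list[int]:
    # An empty array is already sorted (and max/min would fail on it)
    if not array:
        return []
    max_value, min_value = max(array), min(array)
    range_value = max_value - min_value + 1

    result = [0] * len(array)
    count_array = [0] * range_value

    # Count occurrences, offsetting by min_value so indices start at 0
    for value in array:
        count_array[value - min_value] += 1

    # Accumulate counts so each entry holds the end position of its value
    for index in range(1, range_value):
        count_array[index] += count_array[index - 1]

    # Walk the input backwards to keep equal elements in their original order
    for value in reversed(array):
        result[count_array[value - min_value] - 1] = value
        count_array[value - min_value] -= 1

    return result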

Execution of the Counting Sort Algorithm:

In [None]:
start_time = time.time()  # Record start time
counting_sort_result = counting_sort(array)
end_time = time.time()  # Record end time

counting_sort_execution_time = end_time - start_time

time_dict["Counting Sort"] = counting_sort_execution_time

print(counting_sort_result)

print("Execution time:", counting_sort_execution_time, "seconds")

## Radix Sort
Radix Sort is a non-comparative sorting algorithm that sorts integers by processing individual digits. It is particularly effective for sorting large sets of numbers where each number can be broken down into digits. Radix Sort works by sorting the elements based on each digit, starting from the least significant digit to the most significant digit.

Here’s how Radix Sort operates:
1. **Determine the Maximum Digits:** Identify the maximum number of digits in the largest number of the array to determine the number of passes required.
2. **Sort by Each Digit:** Perform a stable sort (such as Counting Sort) on each digit, starting with the least significant digit and moving to the most significant digit. This process is repeated for each digit position.
3. **Stable Sorting:** For each digit position, use a stable sorting algorithm to ensure that the relative order of elements with the same digit value remains unchanged from the previous pass.

In each pass, Radix Sort uses a stable sort to arrange the numbers according to the current digit being processed. By sorting on each digit position from least significant to most significant, Radix Sort effectively places each number into its correct position in the overall sorted order.

Radix Sort is efficient for sorting large datasets where the numbers have a fixed or limited number of digits. It avoids the need for comparison operations and can handle a large number of elements effectively. However, it requires additional space for sorting each digit and is best suited for scenarios where the number of digits is relatively small compared to the number of elements. Radix Sort is also adaptable to other types of keys beyond integers, such as strings, by processing characters or other units of information.
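
The digit-by-digit passes can be observed on a small list. In the sketch below, `radix_passes` is a hypothetical helper, and Python's built-in `sorted()` (which is stable) stands in for the per-digit Counting Sort purely to keep the example short. It assumes a non-empty list of non-negative integers.

```python
def radix_passes(values: list[int]) -> list[list[int]]:
    """Return the list after each digit pass (non-empty, non-negative ints assumed)."""
    data = values[:]
    passes = []
    exp = 1  # 1 = units, 10 = tens, 100 = hundreds, ...
    while max(data) // exp > 0:
        # sorted() is stable, so ties keep the order from the previous pass
        data = sorted(data, key=lambda value: (value // exp) % 10)
        passes.append(data)
        exp *= 10
    return passes


for step in radix_passes([170, 45, 75, 90, 802, 24, 2, 66]):
    print(step)
# [170, 90, 802, 2, 24, 45, 75, 66]
# [802, 2, 24, 45, 66, 170, 75, 90]
# [2, 24, 45, 66, 75, 90, 170, 802]
```

After the tens pass, 170 still precedes 75 because both have a tens digit of 7 and the stable sort preserved the order left by the units pass; the hundreds pass then separates them.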

A visualization of how the Radix Sort algorithm works is available at the following link: https://visualgo.net/en/sorting

In [None]:
def counting_sort_for_radix(array: list[int], exp: int) -> list[int]:
    # Initialize the count array
    count_array = [0] * 10
    # Initialize the result array
    result = [0] * len(array)
    
    # Count occurrences of each digit at the current place value
    for value in array:
        index = (value // exp) % 10
        count_array[index] += 1
    
    # Compute the cumulative count
    for index in range(1, 10):
        count_array[index] += count_array[index - 1]
    
    # Place the elements in their sorted position based on the current digit
    for value in reversed(array):
        index = (value // exp) % 10
        result[count_array[index] - 1] = value
        count_array[index] -= 1
    
    return result

def radix_sort(array: list[int]) -> list[int]:
    # Find the maximum value to determine the number of digits
    max_value = max(array)
    exp = 1  # Start with the least significant digit
    
    # Perform counting sort for each digit
    while max_value // exp > 0:
        array = counting_sort_for_radix(array, exp)
        exp *= 10  # Move to the next digit place
    
    return array

Execution of the Radix Sort Algorithm:

In [None]:
start_time = time.time()  # Record start time
radix_sort_result = radix_sort(array)
end_time = time.time()  # Record end time

radix_sort_execution_time = end_time - start_time

time_dict["Radix Sort"] = radix_sort_execution_time

print(radix_sort_result)

print("Execution time:", radix_sort_execution_time, "seconds")

## Bucket Sort
Bucket Sort is a distribution-based sorting algorithm that works by distributing elements into a number of buckets, sorting each bucket individually, and then concatenating the results. It is particularly effective for sorting floating-point numbers or uniformly distributed data.

Here’s how Bucket Sort operates:
1. **Create Buckets:** Determine the number of buckets and create an empty array of buckets. Each bucket corresponds to a specific range of values.
2. **Distribute Elements:** Iterate through the input array and distribute each element into the appropriate bucket based on its value. The bucket is chosen by an order-preserving mapping function that scales the value into a bucket index, so every element in one bucket is less than or equal to every element in the next bucket.
3. **Sort Buckets:** Sort each bucket individually using a suitable sorting algorithm, such as Insertion Sort (particularly efficient for small and partially sorted arrays) or another efficient sorting method.
4. **Concatenate Buckets:** Concatenate the sorted buckets to form the final sorted array.

The effectiveness of Bucket Sort depends on the distribution of elements and the choice of the mapping function. Ideally, the elements should be uniformly distributed across the buckets to achieve optimal performance.

Bucket Sort is known for its efficiency in specific scenarios, particularly when dealing with a uniform distribution of elements. It is stable when the sorting algorithm used for individual buckets is stable, and it sorts the array in linear time under optimal conditions. Bucket Sort requires additional space for the buckets, making it most suitable for applications where the number of buckets and the range of input values are well-defined and manageable.

This algorithm is particularly useful for sorting floating-point numbers and other types of data where a uniform distribution across buckets can be expected. It leverages the strengths of other sorting algorithms within each bucket to achieve efficient overall sorting.
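
For the floating-point case mentioned above, a minimal sketch might look as follows. It assumes every value lies in the half-open interval from 0.0 to 1.0, and both the name `bucket_sort_floats` and the default of five buckets are choices made only for this illustration. The built-in `sorted()` is used inside each bucket for brevity.

```python
def bucket_sort_floats(values: list[float], bucket_count: int = 5) -> list[float]:
    """Bucket Sort for floats in [0.0, 1.0) (illustrative helper)."""
    buckets: list[list[float]] = [[] for _ in range(bucket_count)]
    for value in values:
        # Scaling by bucket_count maps [0.0, 1.0) onto indices 0..bucket_count - 1
        buckets[int(value * bucket_count)].append(value)

    result: list[float] = []
    for bucket in buckets:
        # Buckets are already ordered relative to each other; sort within each one
        result.extend(sorted(bucket))
    return result


print(bucket_sort_floats([0.42, 0.32, 0.87, 0.05, 0.61, 0.39]))
# [0.05, 0.32, 0.39, 0.42, 0.61, 0.87]
```

With uniformly distributed input, each bucket holds only a few elements, so the per-bucket sorting cost stays small.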

A visualization of how the Bucket Sort algorithm works is available at the following link: https://www.cs.usfca.edu/~galles/visualization/BucketSort.html

In [None]:
def bucket_sort(array: list[int]) -> list[int]:
    # Find the maximum and minimum values in the array
    max_value, min_value = max(array), min(array)
    range_value = max_value - min_value

    # Create one empty bucket per element
    bucket_count = len(array)
    buckets = [[] for _ in range(bucket_count)]

    # Distribute the array elements into the appropriate buckets
    for value in array:
        # Scale the value into [0, bucket_count); the + 1 keeps max_value in the last bucket
        index = int((value - min_value) / (range_value + 1) * bucket_count)
        buckets[index].append(value)

    # Sort each bucket using insertion sort and concatenate the results
    result = []
    for bucket in buckets:
        result.extend(insertion_sort(bucket))

    return result


Execution of the Bucket Sort Algorithm:

In [None]:
start_time = time.time()  # Record start time
bucket_sort_result = bucket_sort(array)
end_time = time.time()  # Record end time

bucket_sort_execution_time = end_time - start_time

time_dict["Bucket Sort"] = bucket_sort_execution_time

print(bucket_sort_result)

print("Execution time:", bucket_sort_execution_time, "seconds")

## Tim Sort
Tim Sort is an efficient hybrid sorting algorithm derived from Merge Sort and Insertion Sort. It is designed to perform well on real-world data, taking advantage of natural runs (already sorted subsequences) in the data to optimize performance. Tim Sort is the default sorting algorithm used in Python's `sorted()` function and Java's `Arrays.sort()` for objects.

Here’s how Tim Sort operates:
1. **Divide into Runs:** The algorithm starts by dividing the input array into small chunks called "runs." These runs are either identified as naturally occurring sorted sequences in the data or created by sorting smaller subarrays using Insertion Sort.
2. **Sort Runs:** If necessary, use Insertion Sort to ensure that each run is sorted. This step efficiently handles small arrays and maintains order within runs.
3. **Merge Runs:** Merge the sorted runs together using a modified Merge Sort. The merging process is optimized by taking advantage of the already sorted nature of the runs, which reduces the amount of merging needed.

Tim Sort adapts to the characteristics of the data by efficiently handling both small and large arrays. It is known for its stability, which means it maintains the relative order of equal elements. The merging phase leverages the existing order within runs to minimize the amount of additional sorting required.

Tim Sort is highly effective for real-world data where patterns of sorted sequences are common. It is particularly well-suited for sorting large datasets with varying levels of order and is used in many modern programming languages and libraries due to its practical performance and stability.
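
The idea of a "natural run" from the description above can be illustrated with a short sketch. The helper name `find_natural_runs` is hypothetical and is not part of this notebook's implementation. It only reports maximal non-decreasing runs, whereas full Tim Sort also detects strictly descending runs (which it reverses) and extends short runs to a minimum length.

```python
def find_natural_runs(array: list[int]) -> list[tuple[int, int]]:
    """Return inclusive (start, end) index pairs of maximal non-decreasing runs."""
    runs: list[tuple[int, int]] = []
    if not array:
        return runs
    start = 0
    for i in range(1, len(array)):
        # A run ends where the next element is smaller than the previous one
        if array[i] < array[i - 1]:
            runs.append((start, i - 1))
            start = i
    # Close the final run
    runs.append((start, len(array) - 1))
    return runs


print(find_natural_runs([1, 2, 5, 3, 4, 9, 0]))  # [(0, 2), (3, 5), (6, 6)]
```

The fewer and longer the runs, the less merging work remains, which is why partially ordered data is sorted faster.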

In [None]:
def modified_insertion_sort(array: list[int], start: int, end: int) -> None:
    # Perform insertion sort on the segment of the array defined by the start and end indices (inclusive)
    for i in range(start + 1, end + 1):
        # Key element to be inserted into the sorted portion
        key = array[i]
        j = i - 1
        # Shift elements of the sorted segment that are greater than key to the right
        while j >= start and array[j] > key:
            array[j + 1] = array[j]
            j -= 1
        # Insert the key into its correct position
        array[j + 1] = key

def modified_merge(array: list[int], start: int, mid: int, end: int) -> None:
    # Create temporary arrays to hold the two halves of the segment
    len_left, len_right = mid - start + 1, end - mid
    left_array = array[start:mid + 1]
    right_array = array[mid + 1:end + 1]

    i, j, k = 0, 0, start
    # merge the two halves back into the original array
    while i < len_left and j < len_right:
        if left_array[i] <= right_array[j]:
            array[k] = left_array[i]
            i += 1
        else:
            array[k] = right_array[j]
            j += 1
        k += 1

    # Copy any remaining elements from left_array, if any
    while i < len_left:
        array[k] = left_array[i]
        i += 1
        k += 1

    # Copy any remaining elements from right_array, if any
    while j < len_right:
        array[k] = right_array[j]
        j += 1
        k += 1

def tim_sort(array: list[int]) -> None:
    run = 32

    # Simplification: use fixed-size segments of length `run` instead of detecting
    # natural runs, and sort each segment with insertion sort
    for i in range(0, len(array), run):
        end = min(i + run - 1, len(array) - 1)
        modified_insertion_sort(array, i, end)

    # merge sorted segments using a merge approach with increasing segment sizes
    size = run
    while size < len(array):
        for left in range(0, len(array), 2 * size):
            mid = min(left + size - 1, len(array) - 1)
            right = min(left + 2 * size - 1, len(array) - 1)
            if mid < right:
                modified_merge(array, left, mid, right)
        size *= 2


In [None]:
tim_sort_array = array[:]

start_time = time.time()  # Record start time
tim_sort(tim_sort_array)
end_time = time.time()  # Record end time

tim_sort_execution_time = end_time - start_time

time_dict["Tim Sort"] = tim_sort_execution_time

print(tim_sort_array)

print("Execution time:", tim_sort_execution_time, "seconds")

## Bogo Sort
Bogo Sort, also known as "stupid sort", is an intentionally inefficient and impractical sorting algorithm that is primarily used for educational purposes or as a joke. It works by repeatedly generating random permutations of the input array until it finds a permutation that is sorted.

Here’s how Bogo Sort operates:
1. **Check if Sorted:** Verify if the current permutation of the array is sorted.
2. **Generate Random Permutation:** If the array is not sorted, randomly shuffle the elements to create a new permutation.
3. **Repeat:** Repeat the check-and-shuffle process until the array is sorted.

The algorithm continues to shuffle the elements randomly and check if they are in the correct order. Once it finds a permutation where the array is sorted, the algorithm stops.

Bogo Sort is notable for its simplicity but is highly inefficient. Its primary purpose is to demonstrate how not to design a sorting algorithm. Its running time is unpredictable and unbounded in the worst case, since nothing guarantees that a sorted permutation is ever drawn. For $n$ distinct elements, each shuffle produces the sorted order with probability $1/n!$, so the expected number of shuffles is $n!$ and the expected running time is on the order of $n \cdot n!$, making it infeasible for anything but trivial cases.
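
Because the number of shuffles is unbounded, it can be safer to experiment with a capped variant. The sketch below is an illustration only: `bounded_bogo_sort` and its `max_shuffles` parameter are hypothetical names, and it uses the standard `random` module directly rather than whatever alias the notebook imports.

```python
import random


def bounded_bogo_sort(array: list[int], max_shuffles: int = 100_000) -> tuple[list[int], int]:
    """Shuffle a copy until it is sorted or max_shuffles is reached.

    Returns the (possibly still unsorted) list and the number of shuffles used.
    """
    result = array[:]
    shuffles = 0
    # Keep shuffling while some adjacent pair is out of order
    while any(result[i] > result[i + 1] for i in range(len(result) - 1)):
        if shuffles >= max_shuffles:
            break  # give up instead of looping indefinitely
        random.shuffle(result)
        shuffles += 1
    return result, shuffles


sorted_list, shuffles_used = bounded_bogo_sort([3, 1, 4, 2])
print(sorted_list, "after", shuffles_used, "shuffles")
```

With four distinct elements there are only 24 permutations, so the cap is practically never reached; with a dozen elements it almost always is.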

A visualization of how the Bogo Sort algorithm works is available at the following link: https://www.sortvisualizer.com/bogosort/

In [None]:
def is_not_sorted(array: list[int]) -> bool:
    for index in range(len(array) - 1):
        if array[index] > array[index + 1]:
            return True
    return False
    
def bogo_sort(array: list[int]) -> list[int]:
    while is_not_sorted(array):
        rand.shuffle(array)
    return array

At your own risk...

In [None]:
bogo_sort_array = array[:]

start_time = time.time()  # Record start time
bogo_sort_result = bogo_sort(bogo_sort_array)
end_time = time.time()  # Record end time

bogo_sort_execution_time = end_time - start_time

print(bogo_sort_result)

print("Execution time:", bogo_sort_execution_time, "seconds")

Now we can see the time results.

In [None]:
print(f"The created array has {len_array} elements between {min_value} and {max_value}.")

sorted_algorithms = sorted(time_dict.items(), key=lambda item:item[1])


for algorithm_name, execution_time in sorted_algorithms:
    print(f"{algorithm_name} - {execution_time}")

## HeapSort
*The heap Data Structure is better explained in the following chapter.*

HeapSort is a comparison-based sorting algorithm that utilizes the properties of a heap data structure to sort elements efficiently. It transforms the input array into a heap and then repeatedly extracts the maximum (or minimum) element to produce a sorted array. This algorithm is particularly notable for its ability to sort data in-place, making it memory efficient.

Here's how HeapSort operates:

1. **Heap Construction (Build Heap):** The first step in HeapSort is to create a heap from the input array. This is typically done by calling the **Build Heap** operation, which rearranges the elements of the array to satisfy the heap property (either max heap or min heap). In a max heap, each parent node is greater than or equal to its children, ensuring that the largest element is at the root.

2. **Sorting Process:**
   - **Extraction:** After the heap is built, the next step is to sort the elements. This is achieved by repeatedly extracting the root element (the maximum in a max heap, or the minimum in a min heap) and placing it at the end of the array.
   - **Heapify:** After extracting the root, the last element in the heap is moved to the root position, the heap size is reduced by one, and heapify is applied to the root to restore the heap property.

3. **Repeat Extraction:** The extraction and heapify process is repeated until the heap size is reduced to zero. At this point, the array will be sorted in ascending order (for max heaps) or descending order (for min heaps). 

Steps of HeapSort:

1. **Build Heap:** Convert the input array into a max heap using the Build Heap operation.
  
2. **Sort the Array:**
   - For `i` from the last index of the array down to 1:
     - Swap the root element (maximum) with the last element in the heap.
     - Reduce the size of the heap by one.
     - Apply heapify on the root element to restore the heap property.

HeapSort has a time complexity of O(n log n) for the worst, average, and best cases. The Build Heap operation takes O(n) time, and each of the n - 1 extractions is followed by a heapify call that takes O(log n) time, so the extraction phase dominates the total cost.

HeapSort is an efficient, in-place sorting algorithm that leverages the properties of heaps to sort data. By building a max heap and then repeatedly extracting the maximum element, HeapSort achieves a sorted array. It is particularly useful for scenarios where memory usage is a concern and provides a consistent performance across various input conditions.

In [1]:
# This code is for creating a heap structure, 
# this structure is better explained in the following chapter. 

class MaxHeap:
    def __init__(self):
        self.heap = []
        
    def leftchild(self, index: int) -> int:
        return 2 * index + 1
    
    def rightchild(self, index: int) -> int:
        return 2 * index + 2

    def maxHeapify(self, index: int, end: int) -> None:
        left = self.leftchild(index)
        right = self.rightchild(index)
        largest = index

        if left < end and self.heap[left] > self.heap[largest]:
            largest = left

        if right < end and self.heap[right] > self.heap[largest]:
            largest = right

        if largest != index:
            self.heap[index], self.heap[largest] = self.heap[largest], self.heap[index]
            self.maxHeapify(largest, end)

    def buildHeap(self, array: list[int]) -> None:
        self.heap = array
        for i in range(len(self.heap) // 2 - 1, -1, -1):
            self.maxHeapify(i, len(array))
            

class MinHeap:
    def __init__(self):
        self.heap = []

    def leftchild(self, index: int) -> int:
        return 2 * index + 1

    def rightchild(self, index: int) -> int:
        return 2 * index + 2

    def minHeapify(self, index: int, end: int) -> None:
        left = self.leftchild(index)
        right = self.rightchild(index)
        smallest = index

        if left < end and self.heap[left] < self.heap[smallest]:
            smallest = left

        if right < end and self.heap[right] < self.heap[smallest]:
            smallest = right

        if smallest != index:
            self.heap[index], self.heap[smallest] = self.heap[smallest], self.heap[index]
            self.minHeapify(smallest, end)

    def buildHeap(self, array: list[int]) -> None:
        self.heap = array
        for i in range(len(self.heap) // 2 - 1, -1, -1):
            self.minHeapify(i, len(array))


HeapSort Code:

In [2]:
def heapsort_ascending(heap: MaxHeap) -> list[int]:
    # Get the index of the last element in the heap
    end = len(heap.heap) - 1 
    # Iterate over the elements, starting from the last element towards the first
    for i in range(end, 0, -1):
        # Swap the first (largest) element with the current element at index i
        heap.heap[i], heap.heap[0] = heap.heap[0], heap.heap[i]
        # Restore the max heap property for the reduced heap (exclude the last sorted elements)
        heap.maxHeapify(0, i)
    # Return the sorted heap (ascending order)
    return heap.heap


def heapsort_descending(heap: MinHeap) -> list[int]:
    # Get the index of the last element in the heap
    end = len(heap.heap) - 1 
    # Iterate over the elements, starting from the last element towards the first
    for i in range(end, 0, -1):
        # Swap the first (smallest) element with the current element at index i
        heap.heap[i], heap.heap[0] = heap.heap[0], heap.heap[i]
        # Restore the min heap property for the reduced heap (exclude the last sorted elements)
        heap.minHeapify(0, i)
    # Return the sorted heap (descending order)
    return heap.heap


Example of use:

In [None]:
array = [10, 22, 3, 6, 13, 4, 7, 18, 1, 25]
print("Start array ->", array)

max_heap = MaxHeap()
max_heap.buildHeap(array.copy())  # Use array.copy() to avoid modifying the original array
ascending_array = heapsort_ascending(max_heap)
print("Ascending Order ->", ascending_array)

min_heap = MinHeap()
min_heap.buildHeap(array.copy())  # Use array.copy() to avoid modifying the original array
descending_array = heapsort_descending(min_heap)
print("Descending Order ->", descending_array)

## Time Complexity of Sorting Algorithms

### Selection Sort
**Time Complexity: $O(n^2)$**

Selection Sort repeatedly scans the entire array to find the minimum (or maximum) element and swaps it with the first unsorted element. For each element in the array, it performs a linear scan of the remaining unsorted elements to find the minimum. With $ n $ elements, this results in $ n-1 $ comparisons for the first element, $ n-2 $ for the second, and so on, leading to a total of approximately $\frac{n(n-1)}{2}$ comparisons. Thus, the time complexity is $O(n^2)$, which reflects the quadratic growth in the number of operations as the number of elements increases.

**Auxiliary Space:** $O(1)$

Selection Sort is an in-place sorting algorithm. It requires only a constant amount of extra space beyond the input array itself for temporary variables used during swapping.
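
The $\frac{n(n-1)}{2}$ comparison count can be checked empirically. The following is a minimal sketch with a hypothetical helper, `selection_sort_comparisons`, that sorts a copy of the input and counts every comparison; it is not the notebook's own Selection Sort cell.

```python
def selection_sort_comparisons(array: list[int]) -> tuple[list[int], int]:
    """Selection Sort on a copy of the input; returns (sorted list, comparison count)."""
    result = array[:]
    comparisons = 0
    n = len(result)
    for i in range(n - 1):
        min_index = i
        # Linear scan of the unsorted part to find the minimum
        for j in range(i + 1, n):
            comparisons += 1
            if result[j] < result[min_index]:
                min_index = j
        result[i], result[min_index] = result[min_index], result[i]
    return result, comparisons


for n in (5, 10, 100):
    _, count = selection_sort_comparisons(list(range(n, 0, -1)))
    print(n, count, n * (n - 1) // 2)  # the last two columns match
```

The count does not depend on the input order, which is why Selection Sort has no faster best case.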

### Bubble Sort
**Time Complexity: $O(n^2)$**

Bubble Sort compares each pair of adjacent elements and swaps them if they are in the wrong order. This process is repeated until no more swaps are needed. In the worst case, every element must be compared with every other element, resulting in roughly $\frac{n(n-1)}{2}$ comparisons. If the implementation uses a flag to detect a pass with no swaps, the best case is $O(n)$: on an already sorted array it stops after a single pass.

**Auxiliary Space:** $O(1)$

Bubble Sort operates in-place, meaning it only needs a small, constant amount of extra space for temporary variables during swapping.
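
The gap between the best and worst case can be seen by counting comparisons in a variant with an early-exit flag. This is a sketch under the assumption that such a flag is used; `bubble_sort_with_flag` is a hypothetical name and may differ from the notebook's Bubble Sort cell.

```python
def bubble_sort_with_flag(array: list[int]) -> tuple[list[int], int]:
    """Bubble Sort on a copy of the input; returns (sorted list, comparison count)."""
    result = array[:]
    comparisons = 0
    n = len(result)
    for pass_index in range(n - 1):
        swapped = False
        # After each pass the largest remaining element is in its final place
        for j in range(n - 1 - pass_index):
            comparisons += 1
            if result[j] > result[j + 1]:
                result[j], result[j + 1] = result[j + 1], result[j]
                swapped = True
        if not swapped:
            break  # no swaps in this pass: the array is sorted
    return result, comparisons


print(bubble_sort_with_flag([1, 2, 3, 4, 5, 6])[1])  # 5 comparisons (one pass)
print(bubble_sort_with_flag([6, 5, 4, 3, 2, 1])[1])  # 15 comparisons, i.e. 6 * 5 / 2
```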

### Insertion Sort
**Time Complexity: $O(n^2)$**

Insertion Sort builds the final sorted array one element at a time by inserting each new element into its correct position within the already sorted portion. In the worst case, inserting an element may require comparing it with every previously sorted element, leading to $\frac{n(n-1)}{2}$ comparisons in total. However, it can achieve $O(n)$ complexity when the array is already sorted or nearly sorted.

**Auxiliary Space:** $O(1)$

Insertion Sort is an in-place sorting algorithm, requiring only a small amount of extra space for temporary storage.
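
To make the "nearly sorted" case concrete, the sketch below counts comparisons for three inputs of the same length. The helper `insertion_sort_comparisons` is hypothetical and works on a copy of the input.

```python
def insertion_sort_comparisons(array: list[int]) -> tuple[list[int], int]:
    """Insertion Sort on a copy of the input; returns (sorted list, comparison count)."""
    result = array[:]
    comparisons = 0
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if result[j] <= key:
                break  # key is already in position relative to result[j]
            result[j + 1] = result[j]  # shift the larger element right
            j -= 1
        result[j + 1] = key
    return result, comparisons


print(insertion_sort_comparisons([1, 2, 3, 4, 5, 6])[1])  # 5  (already sorted)
print(insertion_sort_comparisons([1, 2, 3, 4, 6, 5])[1])  # 6  (one pair out of place)
print(insertion_sort_comparisons([6, 5, 4, 3, 2, 1])[1])  # 15 (reversed)
```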

### Merge Sort
**Time Complexity: $O(n \log n)$**

Merge Sort divides the array into two halves, recursively sorts each half, and then merges the sorted halves back together. The divide step takes $\log n $ levels of recursion, and each level involves merging $ n $ elements. Since each merge operation takes linear time and there are $\log n $ levels, the overall time complexity is $O(n \log n)$.

**Auxiliary Space:** $O(n)$

Merge Sort requires additional space proportional to the size of the input array to hold the temporary arrays used during merging.
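
The $O(n \log n)$ bound can also be derived from a recurrence. As a sketch, assume $n$ is a power of two and that merging $n$ elements costs $cn$ for some constant $c$ (both are simplifying assumptions):

$$
T(n) = 2\,T\!\left(\frac{n}{2}\right) + cn, \qquad T(1) = c
$$

Unrolling the recurrence $k$ times gives

$$
T(n) = 2^{k}\,T\!\left(\frac{n}{2^{k}}\right) + k\,cn
$$

and setting $k = \log_2 n$ yields

$$
T(n) = cn + cn \log_2 n = O(n \log n)
$$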

### Quick Sort
**Time Complexity: $O(n^2)$ (worst case), $O(n \log n)$ (average and best case)**

Quick Sort selects a pivot and partitions the array into elements less than and greater than the pivot. When the pivot splits the array into roughly equal halves, the recursion has about $\log n$ levels with $O(n)$ partitioning work per level, giving $O(n \log n)$; this is also what happens on average. In the worst case, for example when the first or last element is always chosen as the pivot and the array is already sorted, every partition is maximally unbalanced and the complexity degrades to $O(n^2)$.

**Auxiliary Space:** $O(\log n)$ (average), $O(n)$ (worst case)

The space complexity arises from the recursion stack. On average, the recursion depth is $O(\log n)$. In the worst case, the recursion depth can be $ n $, particularly with poor pivot choices.
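
The effect of pivot quality can be illustrated by counting comparisons. The sketch below is a simplified, non-in-place variant that always uses the first element as the pivot and charges one comparison per non-pivot element in each partition step; `quick_sort_comparisons` is a hypothetical helper, not the notebook's Quick Sort.

```python
def quick_sort_comparisons(array: list[int]) -> tuple[list[int], int]:
    """Quick Sort with the first element as pivot; returns (sorted list, comparisons)."""
    if len(array) <= 1:
        return array[:], 0
    pivot, rest = array[0], array[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    left, left_count = quick_sort_comparisons(smaller)
    right, right_count = quick_sort_comparisons(larger)
    # Each element of `rest` is compared with the pivot once per partition step
    return left + [pivot] + right, len(rest) + left_count + right_count


print(quick_sort_comparisons([4, 2, 6, 1, 3, 5, 7])[1])  # 10 (balanced splits)
print(quick_sort_comparisons([1, 2, 3, 4, 5, 6, 7])[1])  # 21 (sorted input, 7 * 6 / 2)
```

On sorted input every partition leaves one side empty, so the recursion depth equals the array length, which is also the source of the $O(n)$ worst-case stack usage.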

### Randomized Quick Sort
**Time Complexity: $O(n^2)$ (worst case), $O(n \log n)$ (average case)**

Randomized Quick Sort randomly selects a pivot, which generally avoids the worst-case scenarios associated with deterministic pivot choices. By randomizing the pivot selection, it ensures that the partitioning is more balanced on average, leading to an expected time complexity of $O(n \log n)$. The worst-case remains $O(n^2)$ but is less likely to occur compared to deterministic Quick Sort.

**Auxiliary Space:** $O(\log n)$ (average), $O(n)$ (worst case)

Similar to Quick Sort, the auxiliary space is primarily due to the recursion stack, which averages $O(\log n)$ but can reach $O(n)$ in the worst case.
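
A minimal sketch of random pivot selection follows. It assumes a three-way split (smaller, equal, larger) built with list comprehensions, which is easy to read but not in-place, so it uses more memory than the bounds stated above; `randomized_quick_sort` is an illustrative name.

```python
import random


def randomized_quick_sort(array: list[int]) -> list[int]:
    """Return a sorted copy of the input, choosing each pivot uniformly at random."""
    if len(array) <= 1:
        return array[:]
    pivot = random.choice(array)
    smaller = [x for x in array if x < pivot]
    equal = [x for x in array if x == pivot]
    larger = [x for x in array if x > pivot]
    # Elements equal to the pivot are already in their final position
    return randomized_quick_sort(smaller) + equal + randomized_quick_sort(larger)


print(randomized_quick_sort([3, 1, 3, 2, 1]))  # [1, 1, 2, 3, 3]
```

Because the pivot no longer depends on the input order, an already sorted array is no more likely to trigger the quadratic case than any other input.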

### Counting Sort
**Time Complexity: $O(n + k)$**

Counting Sort counts the occurrences of each distinct element in the input array, where $ k $ is the range of input values. The time complexity is $ O(n + k) $ because it requires a linear scan of the $ n $ input elements to count them, a pass over the $ k $ counts, and another linear scan to build the sorted output array.

**Auxiliary Space:** $O(n + k)$

Counting Sort requires additional space for two arrays: the `countArray` and the `outputArray`. The `countArray` stores the frequency of each distinct element in the input array, while the `outputArray` is used to build the final sorted array. Here, $ n $ represents the number of elements in the input array, and $ k $ represents the range of the input values (i.e., the size of the `countArray`). Therefore, the total auxiliary space required is $ O(n + k) $.
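
The roles of the two arrays can be sketched as follows. This is an illustrative version (`counting_sort_sketch` is a hypothetical name) that offsets values by the minimum so that negative integers also work; `count_array` and `output_array` correspond to the `countArray` and `outputArray` described above.

```python
def counting_sort_sketch(array: list[int]) -> list[int]:
    """Stable Counting Sort for integers; returns a new sorted list."""
    if not array:
        return []
    min_value, max_value = min(array), max(array)
    k = max_value - min_value + 1  # size of the value range

    # count_array[v - min_value] = number of occurrences of v   -> O(n)
    count_array = [0] * k
    for value in array:
        count_array[value - min_value] += 1

    # Prefix sums: count_array[i] = number of elements <= i + min_value   -> O(k)
    for i in range(1, k):
        count_array[i] += count_array[i - 1]

    # Fill the output from right to left to keep equal elements in order   -> O(n)
    output_array = [0] * len(array)
    for value in reversed(array):
        count_array[value - min_value] -= 1
        output_array[count_array[value - min_value]] = value
    return output_array


print(counting_sort_sketch([4, 2, 2, 8, 3, 3, 1]))  # [1, 2, 2, 3, 3, 4, 8]
```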

### Radix Sort
**Time Complexity: $O(n \cdot k)$**

Radix Sort processes the input array digit by digit. If $ k $ represents the number of digits in the largest number, then each pass through the digits involves counting and sorting operations. Since there are $ k $ digits and each digit requires linear time to sort, the time complexity is $ O(n \cdot k) $.

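**Auxiliary Space:** $O(n + b)$

Radix Sort requires space for the counting array, which has one entry per possible digit value, and for the temporary array used while sorting on each digit. With $ b $ denoting the base (for example, 10 for decimal digits) and $ n $ the number of elements, the space complexity is $ O(n + b) $. Note that $ b $ is not the digit count $ k $ used in the time bound.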
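
A least-significant-digit pass structure can be sketched as below. This is an assumption-laden illustration: it handles non-negative integers only, uses one list per digit value instead of a counting array, and `radix_sort_sketch` is a hypothetical name. Each loop iteration is one linear pass over the array, and there is one iteration per digit of the largest value.

```python
def radix_sort_sketch(array: list[int], base: int = 10) -> list[int]:
    """LSD Radix Sort for non-negative integers; returns a new sorted list."""
    if not array:
        return []
    result = array[:]
    max_value = max(result)
    exponent = 1  # base ** (index of the digit currently being processed)
    while max_value // exponent > 0:
        # One bucket per possible digit value; appending keeps the pass stable
        buckets: list[list[int]] = [[] for _ in range(base)]
        for value in result:
            buckets[(value // exponent) % base].append(value)
        result = [value for bucket in buckets for value in bucket]
        exponent *= base
    return result


print(radix_sort_sketch([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```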

### Bucket Sort
**Time Complexity: $O(n^2)$ (worst case), $O(n + k)$ (average case)**

Bucket Sort distributes elements into buckets and sorts each bucket individually. If the buckets are well-distributed and sorted efficiently, the time complexity is $O(n + k)$. However, in the worst case, if many elements fall into a single bucket, sorting that bucket can be time-consuming, leading to $O(n^2)$ complexity.

**Auxiliary Space:** $O(n + k)$

Bucket Sort requires space for the buckets and the final sorted output array. The space complexity is proportional to the number of elements plus the number of buckets.
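
How the input distribution drives the two cases can be seen by looking only at the bucket indices. The sketch below reuses the index formula from the `bucket_sort` function earlier in this notebook (one bucket per element); `bucket_indices` itself is a hypothetical helper added for illustration.

```python
def bucket_indices(array: list[int]) -> list[int]:
    """Bucket index assigned to each value when using one bucket per element."""
    min_value, max_value = min(array), max(array)
    bucket_count = len(array)
    return [
        int((value - min_value) / (max_value - min_value + 1) * bucket_count)
        for value in array
    ]


print(bucket_indices([10, 20, 30, 40, 50]))  # [0, 1, 2, 3, 4]  evenly spread
print(bucket_indices([1, 2, 3, 4, 1000]))    # [0, 0, 0, 0, 4]  one outlier skews it
```

In the second case almost all elements land in bucket 0, so the per-bucket insertion sort does nearly all the work and the quadratic bound applies.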

### Tim Sort
**Time Complexity: $O(n \log n)$**

Tim Sort is a hybrid sorting algorithm that combines Merge Sort and Insertion Sort. It first divides the array into small runs and sorts them using Insertion Sort. Then it merges the runs using a modified Merge Sort. The divide-and-conquer approach of merging sorted runs ensures that the time complexity is $O(n \log n)$.

**Auxiliary Space:** $O(n)$

Tim Sort requires additional space proportional to the size of the input array for temporary storage during the merging process.

### HeapSort
**Time Complexity: $O(n \log n)$**

HeapSort is a comparison-based sorting algorithm that utilizes the heap data structure to sort elements. The process begins with constructing a max heap from the input array, which takes $O(n)$ time. Once the heap is built, the algorithm repeatedly extracts the maximum element (the root of the heap) and moves it to the end of the array. This extraction process involves swapping the root with the last element of the heap and then applying the heapify down operation to restore the heap property. Since there are $ n $ elements to extract and each extraction requires $O(\log n)$ time to maintain the heap structure, the overall time complexity of HeapSort is $O(n \log n)$.

**Auxiliary Space:** $O(1)$

HeapSort is an in-place sorting algorithm, which means it requires a constant amount of extra space beyond the input array itself. It does not require any additional data structures, making it efficient in terms of memory usage. The only extra space needed is for a few temporary variables used during the swapping of elements.
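
The $O(n)$ cost of building the heap may look surprising, since heapify alone costs $O(\log n)$. A sketch of the standard argument: a heap with $n$ elements has at most $\lceil n / 2^{h+1} \rceil$ nodes of height $h$, and heapify on a node of height $h$ costs $O(h)$, so

$$
\sum_{h=0}^{\lfloor \log_2 n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h)
= O\!\left(n \sum_{h=0}^{\infty} \frac{h}{2^{h}}\right)
= O(n)
$$

because $\sum_{h=0}^{\infty} \frac{h}{2^{h}} = 2$. Most nodes sit near the bottom of the heap, where heapify is cheap.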