# Sorting

Sorting is a fundamental operation that significantly impacts the efficiency and effectiveness of various algorithms, data structures, and applications. Choosing the right sorting algorithm depends on the specific requirements of the task at hand, such as the size of the dataset, the nature of the data, and the desired performance characteristics.

**Search Operations:**

Sorted data allows for more efficient searching. Binary search, for example, can only be performed on a sorted list. This search algorithm has a time complexity of O(log n), making it significantly faster than linear search for large datasets.
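
As a minimal sketch of why the search is logarithmic, the hypothetical `binary_search` helper below halves the remaining range on every step and reports how many steps it took. The function name and the sample data are assumptions made for illustration.

```python
def binary_search(sorted_items, target):
    """Return (index, steps) for target in sorted_items, or (-1, steps) if absent."""
    low, high = 0, len(sorted_items) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1, steps


data = list(range(0, 2000, 2))  # 1,000 even numbers, already sorted
index, steps = binary_search(data, 1998)
print("index:", index, "steps:", steps)
```

With 1,000 elements the loop needs at most 10 steps, whereas a linear search for the last element would need 1,000 comparisons.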

<br>

**Data Retrieval:**

Databases often use sorting to retrieve information more quickly. For instance, when a query selects or orders records by a specific attribute (e.g., employees by salary), having the data pre-sorted on that attribute can significantly speed up the retrieval process.

<br>

**Data Analysis:**

Sorting is fundamental in data analysis and statistics. It facilitates tasks such as finding the median, quartiles, and other statistical measures. These operations are more efficient on sorted data.
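
As a small illustration, once the data is sorted the median can be read directly from the middle of the list. The `median_of_sorted` helper and the salary figures are made up for this sketch.

```python
def median_of_sorted(values):
    """Return the median of a list that is already sorted in ascending order."""
    n = len(values)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        return values[mid]  # odd length: the single middle element
    return (values[mid - 1] + values[mid]) / 2  # even length: mean of the two middle elements


salaries = sorted([5200, 3100, 4700, 3900, 6100])
print("Median salary:", median_of_sorted(salaries))
```

The lookup itself takes constant time; the cost is in sorting the data first.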

<br>

**User Experience:**

In user interfaces, sorted data provides a better user experience. For example, sorted lists or tables make it easier for users to find the information they are looking for, improving the overall usability of applications.

<br>

**Algorithm Design:**

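Many algorithms and data structures depend on ordering for optimal performance. For instance, heaps and priority queues maintain a partial order so that the smallest (or largest) item is always available, and certain graph algorithms, such as Kruskal's minimum spanning tree algorithm, begin by sorting edges by weight.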
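
One concrete case, sketched with Python's standard-library `heapq` module: a list sorted in ascending order already satisfies the min-heap invariant, so it can serve as a priority queue without a separate `heapify` step. The task names and priorities are invented for the example.

```python
import heapq

# (priority, task) pairs, sorted ascending by priority
tasks = sorted([(3, "write report"), (1, "fix outage"), (2, "review PR")])

# A sorted list is already a valid min-heap, so heapq functions work on it directly.
heapq.heappush(tasks, (0, "page on-call"))

order = []
while tasks:
    priority, name = heapq.heappop(tasks)  # always removes the smallest priority
    order.append(name)

print(order)
```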

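Sorting algorithms are usually compared by their time complexity, which is written in Big O notation. With Big O, you express complexity in terms of how quickly your algorithm’s runtime grows relative to the size of the input.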

Assuming that n is the size of the input to an algorithm, the Big O notation represents the relationship between n and the number of steps the algorithm takes to find a solution. 

O(1): Constant time complexity. The algorithm's runtime or space usage does not depend on the size of the input. Looking up an element in a hash table is an example of an operation that takes constant time on average.

O(n): Linear time complexity. The runtime or space usage grows linearly with the size of the input. A function that checks a condition on every item of a list is an example of an O(n) algorithm.

O(n^2): Quadratic time complexity. Common in algorithms with nested iterations. A naive implementation of finding duplicate values in a list, in which each item is compared with every other item, is an example of a quadratic algorithm.

O(log n): Logarithmic time complexity. The runtime grows linearly while the size of the input grows exponentially. For example, if it takes three seconds to process one thousand elements, then it will take about four seconds to process ten thousand, five seconds to process one hundred thousand, and so on. Binary search is an example of a logarithmic runtime algorithm.

O(2^n): Exponential time complexity. The runtime grows exponentially with the size of the input. These algorithms are considered extremely inefficient. An example of such an algorithm is the naive recursive implementation of the Fibonacci sequence. Example below:



In [11]:
def fibonacci(n):
    if n <= 1:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)

# Example usage
result = fibonacci(5)
print(result)

5
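
To make the exponential growth visible, the sketch below counts how many calls the naive recursion makes and compares it with a cached version built on `functools.lru_cache`. The helper names `fibonacci_calls` and `fibonacci_cached` are assumptions for this illustration, not part of the code above.

```python
from functools import lru_cache


def fibonacci_calls(n):
    """Return (fib(n), number of calls made) for the naive recursion."""
    if n <= 1:
        return n, 1
    a, calls_a = fibonacci_calls(n - 1)
    b, calls_b = fibonacci_calls(n - 2)
    return a + b, calls_a + calls_b + 1


@lru_cache(maxsize=None)
def fibonacci_cached(n):
    """Same recurrence, but each value of n is computed only once."""
    if n <= 1:
        return n
    return fibonacci_cached(n - 1) + fibonacci_cached(n - 2)


for n in (10, 20):
    value, calls = fibonacci_calls(n)
    print(f"n={n}: value={value}, calls={calls}")

print("cached:", fibonacci_cached(20))
```

Doubling n from 10 to 20 raises the call count from 177 to 21,891, while the cached version evaluates each n only once.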


**Bubble sort**

Bubble Sort is one of the most straightforward sorting algorithms. It makes multiple passes through a list, compares adjacent elements, and swaps them if they are in the wrong order.

Its name comes from the way the algorithm works: with every new pass, the largest remaining element in the list “bubbles up” toward its correct position.

<br>

**Pros**:

Simple to understand and implement.

Space complexity is minimal as it only requires a constant amount of additional memory.

**Cons:**

Inefficient for large datasets.

Time complexity is O(n^2) in the worst and average cases.


In [17]:
def bubble_sort(arr):
    
    # We have to go through all the elements in the array
    n = len(arr)
    
    for i in range(n):
        # Last i elements are already sorted, so we don't need to check them
        for j in range(0, n-i-1):
            # Swap if the element found is greater than the next element.
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j]

# Example usage:
arr = [64, 34, 25, 12, 22, 11, 90]
bubble_sort(arr)
print("Sorted array:", arr)

Sorted array: [11, 12, 22, 25, 34, 64, 90]


Consider the array [5, 3, 8, 2, 1], where n = 5.

Pass 1:

[3, 5, 2, 1, 8]

The largest element (8) reaches its final position. Along the way, 5 moves one position to the right because it is greater than 3. In this pass the inner loop compares every adjacent pair, covering indices 0 to n-1. The last element is now sorted, so later passes do not need to compare it.

Pass 2:

[3, 2, 1, 5, 8]

The second-largest element (5) reaches its final position. The inner loop covers indices 0 to n-2 because the last element is already sorted.

Pass 3:

[2, 1, 3, 5, 8]

The third-largest element (3) reaches its final position. The inner loop covers indices 0 to n-3.

Pass 4:

[1, 2, 3, 5, 8]

All elements are in their correct positions. The inner loop covers indices 0 to n-4, which is a single comparison.
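
The passes traced above can be reproduced with a small variant of the algorithm that records the list after every pass. This is only a sketch: `bubble_sort_passes` is a hypothetical helper, and it adds an early exit (stop when a pass makes no swaps) that the version above does not have.

```python
def bubble_sort_passes(items):
    """Sort a copy of items and return the list state after each pass that swapped."""
    arr = list(items)
    n = len(arr)
    states = []
    for i in range(n - 1):
        swapped = False
        # The last i elements are already in place, so skip them.
        for j in range(n - 1 - i):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:
            break  # a pass without swaps means the list is sorted
        states.append(list(arr))
    return states


for number, state in enumerate(bubble_sort_passes([5, 3, 8, 2, 1]), start=1):
    print(f"Pass {number}: {state}")
```

With the early exit, an already sorted list needs a single pass, which is why bubble sort's best case is O(n) when this check is included.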

**Selection sort**

Selection Sort is a simple sorting algorithm that divides the input list into two parts: a sorted region and an unsorted region. The algorithm repeatedly selects the smallest (or largest) element from the unsorted region and swaps it with the first element of the unsorted region. This process is repeated until the entire array is sorted.

<br>

**Pros:**

Simple to implement.

In-place sorting algorithm.

**Cons:**

Inefficient for large datasets.

Time complexity is O(n^2) in the worst and average cases.

In [29]:
def selection_sort(arr):
    n = len(arr)
    
    # Visit every position in the array
    for i in range(n):
        # Assume the first element of the unsorted region is the smallest
        min_index = i
        # Scan the rest of the unsorted region (from i+1 onward) for a smaller element
        for j in range(i+1, n):
            if arr[j] < arr[min_index]:
                min_index = j
        # Swap the smallest element found with the first element of the unsorted region
        arr[i], arr[min_index] = arr[min_index], arr[i]

# Example usage:
arr = [64, 34, 25, 12, 22, 11, 90]
selection_sort(arr)
print("Sorted array:", arr)

Sorted array: [11, 12, 22, 25, 34, 64, 90]


The algorithm starts with the entire array considered as the unsorted region.

In each pass through the unsorted region, the algorithm identifies the minimum element.

The minimum element found in the unsorted region is then swapped with the first element of the unsorted region.

The sorted region grows by one element, and the unsorted region shrinks by one element.

Consider the array [64, 25, 12, 22, 11].

Pass 1:

Find the minimum element (11) in the unsorted region and swap it with the first element.

Array: [11, 25, 12, 22, 64]


Pass 2:

Find the minimum element (12) in the remaining unsorted region and swap it with the first element of the unsorted region.

Array: [11, 12, 25, 22, 64]


Pass 3:

Find the minimum element (22) in the remaining unsorted region and swap it with the first element of the unsorted region.

Array: [11, 12, 22, 25, 64]


Pass 4:

Find the minimum element (25) in the remaining unsorted region. It is already the first element of the unsorted region, so the swap leaves the array unchanged.

Array: [11, 12, 22, 25, 64]


Pass 5:

Only 64 remains in the unsorted region, so the array is now fully sorted.
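
The quadratic cost can be checked by counting comparisons. The sketch below uses a hypothetical `selection_sort_comparisons` helper that sorts a copy of the input and returns the count alongside the result.

```python
def selection_sort_comparisons(items):
    """Selection-sort a copy of items; return (sorted_list, comparison_count)."""
    arr = list(items)
    n = len(arr)
    comparisons = 0
    for i in range(n - 1):
        min_index = i
        for j in range(i + 1, n):
            comparisons += 1
            if arr[j] < arr[min_index]:
                min_index = j
        arr[i], arr[min_index] = arr[min_index], arr[i]
    return arr, comparisons


for data in ([64, 25, 12, 22, 11], [11, 12, 22, 25, 64]):
    print(selection_sort_comparisons(data))
```

Both inputs need n(n-1)/2 = 10 comparisons for n = 5: selection sort does the same amount of scanning even when the input is already sorted.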


**Insertion Sort**

Insertion Sort is a simple sorting algorithm that builds the final sorted array one element at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort. However, insertion sort has advantages for small lists or partially ordered lists, and it is often used as the building block for more complex sorting algorithms.

Like bubble sort, the insertion sort algorithm is straightforward to implement and understand. But unlike bubble sort, it builds the sorted list one element at a time by comparing each item with the items already sorted and inserting it into its correct position among them. This “insertion” procedure gives the algorithm its name.

<br>

**Pros:**

Efficient for small datasets or nearly sorted data.

Adaptive, i.e., it performs well when the data is partially ordered.

**Cons:**

Inefficient for large datasets.

Time complexity is O(n^2) in the worst and average cases.

1. At the start, the first element of the array is treated as the sorted region and the rest as the unsorted region.

2. The first value in the unsorted region (the key) is compared with the single value in the sorted region. If the key is smaller, the sorted value shifts one position to the right and the key is placed before it. The sorted region now holds two items.

3. In the next iteration, the key is the third value in the list. It is compared with the sorted values from right to left, starting with the largest. Each sorted value greater than the key shifts one position to the right, and the key is inserted where the shifting stops.

4. This repeats until every element of the unsorted region has been inserted.


In [21]:
def insertion_sort(arr):
    # Go through all array elements starting from the second element
    for i in range(1, len(arr)):
        key = arr[i]

        # Shift sorted elements greater than key one position to the right to make room for key.
        j = i - 1
        while j >= 0 and key < arr[j]:
            arr[j + 1] = arr[j]
            j -= 1

        arr[j + 1] = key

# Example usage:
arr = [64, 25, 12, 22, 11]
insertion_sort(arr)
print("Sorted array:", arr)

Sorted array: [11, 12, 22, 25, 64]


Let's visualize the Insertion Sort algorithm with a step-by-step example using the array [64, 25, 12, 22, 11]

Initial State:

Sorted region: [64]

Unsorted region: [25, 12, 22, 11]

<br>

Pass 1:

Key Element: 25

Compare 25 with 64 in the sorted region. Since 25 is smaller, move 64 one position to the right.

Sorted region: [25, 64]

Unsorted region: [12, 22, 11]

Update array: [25, 64, 12, 22, 11]

<br>

Pass 2:

Key Element: 12

Compare 12 with 64 and 25 in the sorted region. Move 64 and 25 one position to the right.

Insert 12 in its correct position.

Sorted region: [12, 25, 64]

Unsorted region: [22, 11]

Update array: [12, 25, 64, 22, 11]

<br>

Pass 3:

Key Element: 22

Compare 22 with 64, 25, and 12 in the sorted region. Move 64 and 25 one position to the right. Since 22 is greater than 12, 12 stays in place and the comparison stops.

Insert 22 in its correct position.

Sorted region: [12, 22, 25, 64]

Unsorted region: [11]

Update array: [12, 22, 25, 64, 11]

<br>

Pass 4:

Key Element: 11

Compare 11 with 64, 25, 22, and 12 in the sorted region. Move 64, 25, 22, and 12 one position to the right.

Insert 11 in its correct position.

Sorted region: [11, 12, 22, 25, 64]

Unsorted region: []

Update array: [11, 12, 22, 25, 64]
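
The adaptive behaviour listed among the pros can be seen by counting how many elements get shifted. This is a sketch with a hypothetical `insertion_sort_shifts` helper that works on a copy of the input.

```python
def insertion_sort_shifts(items):
    """Insertion-sort a copy of items; return (sorted_list, shift_count)."""
    arr = list(items)
    shifts = 0
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        # Shift larger sorted elements to the right until key's slot is found.
        while j >= 0 and key < arr[j]:
            arr[j + 1] = arr[j]
            shifts += 1
            j -= 1
        arr[j + 1] = key
    return arr, shifts


for data in ([11, 12, 22, 25, 64], [64, 25, 12, 22, 11], [64, 25, 22, 12, 11]):
    print(data, "->", insertion_sort_shifts(data)[1], "shifts")
```

An already sorted list needs no shifts, the example array from the walk-through needs 9, and a fully reversed list of five elements needs the maximum of 10.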

**Merge Sort**

Merge Sort is a divide-and-conquer algorithm that divides the input list into two halves, recursively sorts each half, and then merges the sorted halves.

Divide-and-conquer algorithms typically follow the same structure:

The original input is broken into several parts, each one representing a subproblem that’s similar to the original but simpler. Each subproblem is solved recursively.

The solutions to all the subproblems are combined into a single overall solution.

**Pros:**

Efficient for large datasets.

Stable sort (maintains the relative order of equal elements).

Time complexity is O(n log n) in the worst, average, and best cases.

**Cons:**

Requires additional memory for the merge step.

In [31]:
def merge_sort(arr):
    # Check if the array has more than one element
    if len(arr) > 1:
        
        # Find the middle of the array
        mid = len(arr) // 2

        # Divide the array into two halves
        left_half = arr[:mid]
        right_half = arr[mid:]

        # Recursively call merge_sort on each half
        merge_sort(left_half)
        merge_sort(right_half)

        # Merge the sorted halves
        merge(arr, left_half, right_half)

def merge(arr, left, right):
    i = j = k = 0
    
    # Compare elements from left and right halves and merge them in sorted order.
    # Using <= takes the element from the left half on ties, which keeps the sort stable.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            arr[k] = left[i]
            i += 1
        else:
            arr[k] = right[j]
            j += 1
        k += 1

    # Check for any remaining elements in left and right halves
    while i < len(left):
        arr[k] = left[i]
        i += 1
        k += 1

    while j < len(right):
        arr[k] = right[j]
        j += 1
        k += 1

# Example usage:
arr = [64, 25, 12, 22, 11]
merge_sort(arr)
print("Sorted array:", arr)

Sorted array: [11, 12, 22, 25, 64]


**Initial State:**

Unsorted array: [64, 25, 12, 22, 11]

<br>

**Step 1: Splitting the Array**

Split the array into two halves: [64, 25] and [12, 22, 11].

<br>

**Step 2: Recursive Calls on Left Half [64, 25]**

Split into [64] and [25]. After the split, they are single-element lists, so no further recursive step is required.

The merge() function compares these elements and, as the name suggests, merges them into [25, 64].

<br>

**Step 3: Recursive Calls on Right Half**

(Recursive Call 1):

Splits them into [12] and [22, 11].

Recursively apply Merge Sort on both halves.

Since the [12] is a single element list, no actions are taken on this side.

But for [22, 11] an additional recursive call is needed to separate them.

<br>

(Recursive Call 2):

Split the right half of the already split array into [22] and [11].

The merge() function sorts and merges them into [11, 22]

<br>

Now we have [12] and [11, 22]

The merge() function sorts and merges them together into [11, 12, 22]

<br>

**Step 4: Merging Two Halves**

The last step is to merge the two sorted arrays, [25, 64] and [11, 12, 22], together.

The entire array is now sorted: [11, 12, 22, 25, 64]
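
Stability matters when records are sorted by one field and ties should keep their original order. The sketch below is a variant that returns a new list and accepts a `key` function. The name `merge_sorted`, the `key` parameter, and the sample records are assumptions for illustration.

```python
def merge_sorted(items, key=lambda item: item):
    """Return a new sorted list; items with equal keys keep their original order."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sorted(items[:mid], key)
    right = merge_sorted(items[mid:], key)

    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # "<=" takes from the left half on ties, which preserves stability.
        if key(left[i]) <= key(right[j]):
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two
    merged.extend(right[j:])  # still has elements left
    return merged


records = [("dana", 25), ("eli", 12), ("fay", 25), ("gus", 12)]
print(merge_sorted(records, key=lambda record: record[1]))
```

In the result, "eli" stays ahead of "gus" and "dana" stays ahead of "fay", matching their order in the input.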

**Quick Sort**

Quick Sort is a divide-and-conquer algorithm that selects a 'pivot' element and partitions the other elements into two sub-arrays according to whether they are less than or greater than the pivot. The sub-arrays are then sorted recursively.

<br>

**Pros:**

Efficient for large datasets.

Can be implemented in place, requiring only a small amount of additional memory. The simple implementation shown in this section builds new lists instead.

**Cons:**

Not stable.

Time complexity is O(n^2) in the worst case, but usually O(n log n) on average.

<br>

The O(n log n) best case happens when each selected pivot is close to the median of its sub-array, so every partition splits the data roughly in half. The O(n^2) worst case happens when each pivot is the smallest or largest value of its sub-array. When the first element is always chosen as the pivot, an already sorted array triggers this worst case.

Theoretically, if the algorithm first finds the median and uses it as the pivot, the worst-case complexity comes down to O(n log n). The median of an array can be found in linear time, so each partitioning step stays O(n) while the recursion depth stays logarithmic.

In [47]:
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[0]
        print("Pivot:", pivot)
        
        # Elements less than or equal to the pivot
        less = [x for x in arr[1:] if x <= pivot]
        print("Less than or equal to pivot:", less)
        
        # Elements greater than the pivot
        greater = [x for x in arr[1:] if x > pivot]
        print("Greater than pivot:", greater)
        
        # Recursive calls and combining results
        sorted_result = quick_sort(less) + [pivot] + quick_sort(greater)
        print("Sorted Result:", sorted_result)
        
        return sorted_result

# Example usage:
arr = [25, 12, 22, 11, 90]
result = quick_sort(arr)
print("Final Sorted array:", result)

Pivot: 25
Less than or equal to pivot: [12, 22, 11]
Greater than pivot: [90]
Pivot: 12
Less than or equal to pivot: [11]
Greater than pivot: [22]
Sorted Result: [11, 12, 22]
Sorted Result: [11, 12, 22, 25, 90]
Final Sorted array: [11, 12, 22, 25, 90]
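
The version above builds new lists on every call. As a sketch of the in-place approach, the hypothetical `quick_sort_in_place` below partitions within the original list (Lomuto scheme) and picks the pivot at random, which makes the O(n^2) case unlikely for any particular input order. The function name and the random pivot choice are assumptions, not part of the code above.

```python
import random


def quick_sort_in_place(arr, low=0, high=None):
    """Sort arr[low..high] in place using a randomly chosen pivot."""
    if high is None:
        high = len(arr) - 1
    if low >= high:
        return  # zero or one element: nothing to sort

    # Move a random pivot to the end, then partition around it.
    pivot_index = random.randint(low, high)
    arr[pivot_index], arr[high] = arr[high], arr[pivot_index]
    pivot = arr[high]

    boundary = low  # arr[low:boundary] holds the elements <= pivot
    for k in range(low, high):
        if arr[k] <= pivot:
            arr[k], arr[boundary] = arr[boundary], arr[k]
            boundary += 1
    arr[boundary], arr[high] = arr[high], arr[boundary]  # pivot into its final slot

    quick_sort_in_place(arr, low, boundary - 1)
    quick_sort_in_place(arr, boundary + 1, high)


data = [25, 12, 22, 11, 90]
quick_sort_in_place(data)
print("Sorted array:", data)
```

Apart from the recursion stack, this variant needs no extra memory, at the cost of being harder to follow than the list-building version.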
