# Sorting algorithms

Almost every action you take on the web relies on sorted data. Most programming languages provide their own standard implementation. In Python, we use `sorted()`. Example:

In [2]:
class Influencer:
    def __init__(self, num_selfies, num_bio_links):
        self.num_selfies = num_selfies
        self.num_bio_links = num_bio_links

def vanity(influencer):
    vanity_score = (influencer.num_bio_links * 5) + influencer.num_selfies
    return vanity_score


def vanity_sort(influencers):
    return sorted(influencers, key=vanity)

theprimeagen = Influencer(100, 1)
pokimane = Influencer(800, 2)
spambot = Influencer(0, 200)
lane = Influencer(10, 2)
badcop = Influencer(1, 2)

print(vanity(lane))
print(vanity_sort([pokimane, theprimeagen, spambot, badcop, lane]))

20
[<__main__.Influencer object at 0x72ef6bddc470>, <__main__.Influencer object at 0x72ef4b78b0b0>, <__main__.Influencer object at 0x72ef4b78af90>, <__main__.Influencer object at 0x72ef4b78b800>, <__main__.Influencer object at 0x72ef4b78af30>]
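
The output above shows default object representations, so the resulting order is hard to read. A minimal sketch of one way around that, assuming a hypothetical `NamedInfluencer` class with a `name` attribute (neither exists in the original code), prints the names in sorted order:

```python
class NamedInfluencer:
    """Hypothetical variant of Influencer that also stores a name."""

    def __init__(self, name, num_selfies, num_bio_links):
        self.name = name
        self.num_selfies = num_selfies
        self.num_bio_links = num_bio_links

    def __repr__(self):
        return f"NamedInfluencer({self.name!r})"


def vanity(influencer):
    # Same scoring rule as above: a bio link counts five times as much as a selfie.
    return influencer.num_bio_links * 5 + influencer.num_selfies


people = [
    NamedInfluencer("pokimane", 800, 2),
    NamedInfluencer("theprimeagen", 100, 1),
    NamedInfluencer("spambot", 0, 200),
    NamedInfluencer("badcop", 1, 2),
    NamedInfluencer("lane", 10, 2),
]

# sorted() returns a new list and leaves `people` untouched.
ranked_names = [p.name for p in sorted(people, key=vanity)]
print(ranked_names)
# ['badcop', 'lane', 'theprimeagen', 'pokimane', 'spambot']
```

Passing `reverse=True` to `sorted()` would give the same ranking from highest to lowest score.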


# Bubble Sort `O(n^2)`

A basic sorting algorithm, named for the way elements "bubble up" to the top of the list. It repeatedly steps through the list and compares adjacent elements, swapping them if they are out of order. It continues until the whole list is sorted.

### **Bubble sort**

In [10]:
import time
def bubble_sort(nums):
    swapping = True
    end = len(nums)
    while swapping:
        swapping = False
        for i in range(1, end):
            if nums[i - 1] > nums[i]:
                nums[i - 1], nums[i] = nums[i], nums[i - 1]
                swapping = True
        end -= 1
    return nums
lst = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
start = time.time()
print(bubble_sort(lst))
end = time.time()
print(f"The function took {end - start} seconds")

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
The function took 0.0002396106719970703 seconds


> In the best case, the input is already sorted, so **bubble sort** makes a single pass, performs no swaps, and stops. The work grows linearly with the amount of data, so it is **O(n)**
>
> In the worst case, the data is in reverse order, so every pass of the inner loop performs swaps and about `n` passes are needed, so the complexity becomes **O(n^2)**
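
The two cases can be observed by counting passes and swaps. The sketch below is an instrumented copy of the bubble sort above; the name `bubble_sort_passes` and the counters are additions for illustration only.

```python
def bubble_sort_passes(nums):
    """Bubble sort a copy of nums and return (sorted_list, passes, swaps)."""
    nums = list(nums)  # work on a copy so the caller's list is unchanged
    passes = 0
    swaps = 0
    end = len(nums)
    swapping = True
    while swapping:
        swapping = False
        passes += 1
        for i in range(1, end):
            if nums[i - 1] > nums[i]:
                nums[i - 1], nums[i] = nums[i], nums[i - 1]
                swaps += 1
                swapping = True
        end -= 1
    return nums, passes, swaps


already_sorted = list(range(10))
reversed_input = list(range(9, -1, -1))

_, best_passes, best_swaps = bubble_sort_passes(already_sorted)
_, worst_passes, worst_swaps = bubble_sort_passes(reversed_input)

print(best_passes, best_swaps)    # 1 0
print(worst_passes, worst_swaps)  # 10 45
```

For ten items in reverse order, the 45 swaps are exactly `n * (n - 1) / 2`, which is where the quadratic growth comes from.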

___



# Merge Sort `O(n*log(n))`

Merge sort is a recursive sorting algorithm that is faster than **bubble sort** on large lists. It is a **divide and conquer algorithm**:

* Divide the large problem into smaller ones, and recursively solve the smaller problems
* Combine the results of the smaller problems to solve the larger

In this algorithm:

- Divide the array into two halves
- Recursively sort the two halves
- Merge the two halves to form a sorted array

This algorithm consists of two functions:

> **`merge_sort()`** divides the input into two halves, calls itself on each half, and then merges the two sorted halves back together. This is where the dividing happens
>
> **`merge()`** merges two already sorted lists into a single sorted list. At the lowest level, each list has only one element, and the two elements are compared and combined into a sorted list of two. This is where the sorting happens
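
One way to see where the `log(n)` factor comes from is to count how many times a list can be halved before only single items remain. The helper `split_depth` below is hypothetical and only mirrors the halving that `merge_sort()` performs; it does not sort anything.

```python
import math


def split_depth(n):
    """Number of halving levels needed to reduce a list of length n to single items."""
    if n < 2:
        return 0
    half = n // 2
    # The larger half (n - half) determines how deep the recursion goes.
    return 1 + split_depth(n - half)


for n in (2, 8, 10, 1000):
    print(n, split_depth(n), math.ceil(math.log2(n)))
# 2 1 1
# 8 3 3
# 10 4 4
# 1000 10 10
```

Each level merges all `n` items once, so roughly `log(n)` levels of `O(n)` merging give `O(n*log(n))` overall.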

### **Merge sort**

In [9]:
import time
def merge_sort(nums):
    #Base case: a list with fewer than two items is already sorted
    if len(nums) < 2:
        return nums

    #Both halves share the same split point
    middle = len(nums) // 2

    sorted_left_side = merge_sort(nums[:middle])
    sorted_right_side = merge_sort(nums[middle:])
    
    return merge(sorted_left_side, sorted_right_side)

def merge(first, second):
    final = []
    
    #Used to keep track of the current index in each input list
    i = 0
    j = 0

    #The loop continues until all items from one of the lists have been added
    while i < len(first) and j < len(second):
        if first[i] <= second[j]:
            final.append(first[i])
            i += 1
        else:
            final.append(second[j])
            j += 1
    #Append the items left over in whichever list was not exhausted
    final.extend(first[i:])
    final.extend(second[j:])
    return final
    
start = time.time()
print(merge_sort([9, 8, 7, 6, 5, 4, 3, 2, 1, 0]))
end = time.time()
print(f"The function took {end - start} seconds")

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
The function took 0.0002923011779785156 seconds
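
As a sanity check on the merge step, its result can be compared with `heapq.merge` from the standard library, which also combines already sorted inputs. The sketch re-declares a `merge()` equivalent to the one above so that it runs on its own; the sample lists are made up.

```python
import heapq


def merge(first, second):
    """Merge two sorted lists into one sorted list."""
    final = []
    i = j = 0
    while i < len(first) and j < len(second):
        # <= keeps equal items from `first` ahead of those from `second`.
        if first[i] <= second[j]:
            final.append(first[i])
            i += 1
        else:
            final.append(second[j])
            j += 1
    final.extend(first[i:])
    final.extend(second[j:])
    return final


left = [1, 4, 4, 9]
right = [2, 4, 7, 10, 11]

merged = merge(left, right)
print(merged)                                    # [1, 2, 4, 4, 4, 7, 9, 10, 11]
print(merged == list(heapq.merge(left, right)))  # True
```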


# Insertion Sort

This algorithm builds a sorted list one item at a time. It is less efficient on large lists than `merge sort` because it is **O(n^2)**, but it is often faster on small lists.


- Fast: for very small data sets
- Adaptive: faster for partially sorted data sets
- Stable: does not change the relative order of elements with equal keys
- In-place: only requires a constant amount of additional memory
- Online: can sort a list as it receives it
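
The last property in the list means the data can be kept sorted while items are still arriving. A minimal sketch of that idea, using `bisect.insort` from the standard library rather than the `insertion_sort()` function below; the `incoming` values are made up:

```python
import bisect

incoming = [5, 2, 9, 1, 5]  # made-up values arriving one at a time
sorted_so_far = []

for value in incoming:
    # insort locates the insertion point with binary search and inserts there,
    # so sorted_so_far is in order after every single arrival.
    bisect.insort(sorted_so_far, value)
    print(sorted_so_far)
# [5]
# [2, 5]
# [2, 5, 9]
# [1, 2, 5, 9]
# [1, 2, 5, 5, 9]
```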


### **Insertion sort**

In [27]:
def insertion_sort(nums):
    for i in range(1, len(nums)):
        j = i
        while j > 0 and nums[j - 1] > nums[j]:
            nums[j], nums[j - 1] = nums[j - 1], nums[j]
            j -= 1
    return nums

start = time.time()
print(insertion_sort([9, 8, 7, 6, 5, 4, 3, 2, 1, 0]))
end = time.time()
print(f"The function took {end - start} seconds")

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
The function took 0.0003261566162109375 seconds
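
The adaptive behaviour mentioned earlier can be observed by counting swaps. The sketch below is an instrumented copy of `insertion_sort()`; the name `insertion_sort_swaps` and the sample inputs are assumptions made for illustration.

```python
def insertion_sort_swaps(nums):
    """Insertion sort a copy of nums and return (sorted_list, swaps)."""
    nums = list(nums)  # work on a copy so the caller's list is unchanged
    swaps = 0
    for i in range(1, len(nums)):
        j = i
        # Walk the new item left until the item before it is not larger.
        while j > 0 and nums[j - 1] > nums[j]:
            nums[j], nums[j - 1] = nums[j - 1], nums[j]
            swaps += 1
            j -= 1
    return nums, swaps


nearly_sorted = [0, 1, 2, 4, 3, 5, 6, 7, 9, 8]
reversed_input = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

print(insertion_sort_swaps(nearly_sorted)[1])   # 2
print(insertion_sort_swaps(reversed_input)[1])  # 45
```

The nearly sorted list needs only two swaps because only two pairs are out of order, while the reversed list needs one swap for every pair.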


# Quick Sort

Quick sort is an efficient sorting algorithm that is widely used in production sorting implementations. It is a **divide and conquer algorithm**:

**Divide**
- Select a pivot
- Move everything smaller than the pivot to its left and everything greater to its right
- The pivot is now in its final position
- Recursively repeat the operation on both sides

**Conquer**
- The array is sorted after all elements have been through the pivot operation

On average, `quicksort` is **`O(n*log(n))`**. In the worst case it degrades to **`O(n^2)`**; with the last element as the pivot, as in the version below, an already sorted list is such a case

**`partition()`** has a single for-loop that runs from the lowest to the highest index, so it is an **`O(n)`** function

The overall complexity of **`quicksort`** depends on how many times `partition()` is called and on how large each sublist is

In the best case, the pivot is the middle value of each sublist, so every call splits its sublist in half and the recursion is only about **`log(n)`** levels deep. Each level does **`O(n)`** work in `partition()`, so quicksort is **`O(n*log(n))`**
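
The gap between the worst case and a typical case can be measured by counting comparisons. The sketch below is an instrumented quicksort with a last-element pivot; the `stats` dictionary, the `measure` helper, and the input size of 200 are assumptions made for illustration.

```python
import random


def partition(nums, low, high, stats):
    stats["partition_calls"] += 1
    pivot = nums[high]  # last element as pivot
    i = low - 1
    for j in range(low, high):
        stats["comparisons"] += 1
        if nums[j] < pivot:
            i += 1
            nums[i], nums[j] = nums[j], nums[i]
    nums[i + 1], nums[high] = nums[high], nums[i + 1]
    return i + 1


def quick_sort(nums, low, high, stats):
    if low < high:
        p = partition(nums, low, high, stats)
        quick_sort(nums, low, p - 1, stats)
        quick_sort(nums, p + 1, high, stats)


def measure(nums):
    """Sort nums in place and return the collected counters."""
    stats = {"partition_calls": 0, "comparisons": 0}
    quick_sort(nums, 0, len(nums) - 1, stats)
    return stats


n = 200
sorted_stats = measure(list(range(n)))

shuffled = list(range(n))
random.Random(42).shuffle(shuffled)  # fixed seed so the run is repeatable
shuffled_stats = measure(shuffled)

print("sorted:  ", sorted_stats)  # 199 partition calls, 19900 comparisons
print("shuffled:", shuffled_stats)
```

On the sorted input every pivot is the largest item, so each call removes only one element and the comparison count is `n * (n - 1) / 2`. The shuffled input needs far fewer comparisons.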

The following version of **`Quicksort`** performs at **`O(n*log(n))`** on most inputs, but its worst case is still **`O(n^2)`**. Two common ways to make the worst case unlikely are:

> *Random Approach*

- The function shuffles the list before sorting, which is an `O(n)` operation

> *Median of Three Approach*

- Three elements of each partition are chosen, and the median is found between them. That item is the pivot
- This solution makes the function remain deterministic and pure
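
A minimal sketch of the median of three approach follows. It assumes that the three candidates are the first, middle, and last items of the partition, which the description above does not specify, and the helper name `median_of_three_index` is hypothetical.

```python
def median_of_three_index(nums, low, high):
    """Return the index of the median of nums[low], nums[mid], nums[high]."""
    mid = (low + high) // 2
    candidates = [(nums[low], low), (nums[mid], mid), (nums[high], high)]
    candidates.sort()
    return candidates[1][1]


def partition(nums, low, high):
    # Move the chosen pivot to the end, then partition around the last element.
    pivot_index = median_of_three_index(nums, low, high)
    nums[pivot_index], nums[high] = nums[high], nums[pivot_index]
    pivot = nums[high]
    i = low - 1
    for j in range(low, high):
        if nums[j] < pivot:
            i += 1
            nums[i], nums[j] = nums[j], nums[i]
    nums[i + 1], nums[high] = nums[high], nums[i + 1]
    return i + 1


def quick_sort(nums, low, high):
    if low < high:
        p = partition(nums, low, high)
        quick_sort(nums, low, p - 1)
        quick_sort(nums, p + 1, high)


# Already sorted input: the bad case for a plain last-element pivot.
data = list(range(2000))
quick_sort(data, 0, len(data) - 1)
print(data == list(range(2000)))  # True
```

With a plain last-element pivot, a sorted list of 2000 items would recurse about 2000 levels deep, which exceeds Python's default recursion limit. Here the median candidate splits each sorted partition near the middle, so the recursion stays shallow.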

### **Quicksort**

In [15]:
import time
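def quick_sort(nums, low, high):
    #Sorts nums[low..high] in place and returns None
    if low < high:

        #After partition(), the pivot sits at its final position
        pivot_index = partition(nums, low, high)

        #Sort the items to the left of the pivot
        quick_sort(nums, low, pivot_index - 1)

        #Sort the items to the right of the pivot; the pivot itself is
        #skipped, otherwise a list of equal items would recurse forever
        quick_sort(nums, pivot_index + 1, high)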


def partition(nums, low, high):

    #Pivot is the reference to compare
    pivot = nums[high]
    i = low - 1

    #j is the index number that compares values in nums to pivot
    for j in range(low, high):

        if nums[j] < pivot:
            i += 1
            nums[i], nums[j] = nums[j], nums[i]

    #The pivot belongs right after the last smaller value, so it is placed
    #there, with lower values on the left and higher ones on the right
    nums[i+1], nums[high] = nums[high], nums[i+1]

    #Return the final index of the pivot
    return i + 1

start = time.time()
input1 = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
quick_sort(input1, 0, len(input1) - 1)
print(input1)
end = time.time()
print(f"The function took {end - start} seconds")

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
The function took 0.00039577484130859375 seconds


# Selection Sort

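This algorithm is similar to bubble sort in that it works by repeatedly swapping items in a list. It is slightly more efficient in practice, because it makes at most one swap per pass of the outer loop, although it is still **O(n^2)** because of the comparisons in the inner loop.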


In [16]:
import time
def selection_sort(nums):
    for i in range(len(nums)):
        smallest_idx = i
        for j in range(i + 1, len(nums)):
            #Scan the rest of the list to find the lowest number
            if nums[j] < nums[smallest_idx]:
                smallest_idx = j
        nums[i], nums[smallest_idx] = nums[smallest_idx], nums[i]
    return nums

start = time.time()
print(selection_sort([9, 8, 7, 6, 5, 4, 3, 2, 1, 0]))
end = time.time()
print(f"The function took {end - start} seconds")

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
The function took 0.00029754638671875 seconds
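
The difference in swap counts can be made concrete with instrumented copies of both algorithms. The function names and counters below are additions for illustration, and the selection sort variant skips swapping an item with itself so that only real moves are counted.

```python
def selection_sort_swaps(nums):
    """Selection sort a copy of nums and return (sorted_list, swaps)."""
    nums = list(nums)
    swaps = 0
    for i in range(len(nums)):
        smallest_idx = i
        for j in range(i + 1, len(nums)):
            if nums[j] < nums[smallest_idx]:
                smallest_idx = j
        if smallest_idx != i:  # skip swapping an item with itself
            nums[i], nums[smallest_idx] = nums[smallest_idx], nums[i]
            swaps += 1
    return nums, swaps


def bubble_sort_swaps(nums):
    """Bubble sort a copy of nums and return (sorted_list, swaps)."""
    nums = list(nums)
    swaps = 0
    end = len(nums)
    swapping = True
    while swapping:
        swapping = False
        for i in range(1, end):
            if nums[i - 1] > nums[i]:
                nums[i - 1], nums[i] = nums[i], nums[i - 1]
                swaps += 1
                swapping = True
        end -= 1
    return nums, swaps


data = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(selection_sort_swaps(data)[1])  # 5
print(bubble_sort_swaps(data)[1])     # 45
```

Both functions still compare every pair of positions in the worst case, so the saving is in writes to the list, not in the overall growth rate.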
