Bubble Sort, Merge Sort, Quicksort and Divide-n-Conquer Algorithms in Python

Let's check out our problem from lesson 3 of the Jovian course:

## Problem 


In this notebook, we'll focus on solving the following problem:

> **QUESTION 1**: You're working on a new feature on Jovian called "Top Notebooks of the Week". Write a function to sort a list of notebooks in decreasing order of likes. Keep in mind that up to millions of notebooks  can be created every week, so your function needs to be as efficient as possible.


The problem of sorting a list of objects comes up over and over in computer science and software development, and it's important to understand common approaches for sorting, and the trade-offs they offer. Before we solve the above problem, we'll solve a simplified version of the problem:

> **QUESTION 2**: Write a program to sort a list of numbers.


"Sorting" usually refers to "sorting in ascending order", unless specified otherwise.

Remember our method of attack:

1. State the problem clearly, in your own words. Identify the input and output.
2. Come up with some examples for test cases using the inputs and outputs. Cover your edge cases.
3. Find a simple solution to the problem in English (or whatever language you prefer).
4. Implement your solution and test it with the example inputs. Squash your bugs.
5. Analyze your algorithm's time and space complexity.
6. Optimize your code for any inefficiencies.

Sorting solutions are essential to solving common problems in computer science.

Generally, we'll have inputs and outputs like these:

**Input**

1. `nums`: A list of numbers, e.g. `[4, 2, 6, 3, 4, 6, 2, 1]`

**Output**

1. `sorted_nums`: The sorted version of `nums`, e.g. `[1, 2, 2, 3, 4, 4, 6, 6]`

Here are some cases that we need to test:

1. Lists of numbers in random order.
2. Lists that are already sorted.
3. Lists sorted in descending order.
4. Lists with repeating elements.
5. Empty lists.
6. Lists with one element.
7. Lists with one element that is repeated many times.
8. Long lists.

In [1]:
def sort(nums):
    pass

In [26]:
#Random list with no repeated elements
test0 = {
    'input': {
        'nums': [4, 6, 3, 8, 5, 7, 2, 1]
    },
    'output': [1, 2, 3, 4, 5, 6, 7, 8]
}

In [5]:
# Random list with negative elements
test1 = {
    'input': {
        'nums': [5, 2, 6, 1, 23, 7, -12, 12, -243, 0]
    },
    'output': [-243, -12, 0, 1, 2, 5, 6, 7, 12, 23]
}

In [6]:
# Sorted list
test2 = {
    'input': {
        'nums': [3, 5, 6, 8, 9, 10, 99]
    },
    'output': [3, 5, 6, 8, 9, 10, 99]
}

In [19]:
# Descending order lists
test3 = {
    'input': {
        'nums': [99, 10, 9, 8, 6, 5, 3]
    },
    'output': [3, 5, 6, 8, 9, 10, 99]
}

In [21]:
# Random list with repeating elements
test4 = {
    'input': {
        'nums': [5, -12, 2, 6, 1, 23, 7, 7, -12, 6, 12, 1, -243, 1, 0]
    },
    'output': [-243, -12, -12, 0, 1, 1, 1, 2, 5, 6, 6, 7, 7, 12, 23]
}

In [10]:
#Empty list
test5 = {
    'input': {
        'nums': []
    },
    'output' : []
}

In [12]:
#List with one element
test6 = {
    'input': {
        'nums': [23]
    },
    'output': [23]
}

In [17]:
#List with one element and many repeats
test7 = {
    'input': {
        'nums' : [23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23]
    },
    'output': [23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23]
}


In [27]:
# Really long list
# Use helpers from the random module to build it, so you don't have to type it by hand:
import random

in_list = list(range(10000))
out_list = list(range(10000))

random.shuffle(in_list)

test8 = {
    'input':{
        'nums': in_list
    },
    'output': out_list

}

In [28]:
tests = [test0, test1, test2, test3, test4, test5, test6, test7, test8]
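
With the test cases collected in one list, a small helper can run any sorting function against all of them. The sketch below is one possible way to do this. `run_tests` and `builtin_sort` are hypothetical names introduced here (they are not part of the course material), and Python's built-in `sorted` stands in for a sorting function until we write our own. The block is self-contained, so it defines two short sample tests instead of reusing the `tests` list above.

```python
def run_tests(sort_fn, tests):
    """Run sort_fn on every test case and return a list of pass/fail booleans."""
    results = []
    for i, test in enumerate(tests):
        # Each test stores its arguments by name, so unpack them as keywords
        actual = sort_fn(**test['input'])
        passed = actual == test['output']
        print('Test', i, 'passed' if passed else 'FAILED')
        results.append(passed)
    return results

# Two small sample tests in the same format as test0 ... test8
sample_tests = [
    {'input': {'nums': [4, 6, 3, 8, 5, 7, 2, 1]}, 'output': [1, 2, 3, 4, 5, 6, 7, 8]},
    {'input': {'nums': []}, 'output': []},
]

def builtin_sort(nums):
    # Stand-in implementation (assumption): delegate to Python's built-in sorted
    return sorted(nums)

sample_results = run_tests(builtin_sort, sample_tests)
```

Passing a different function as `sort_fn` checks that function against the same cases.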

First, we should come up with a simple solution in English.

1. Iterate over a given list.
2. Compare one number with the next.
3. Swap the two numbers if the first is greater than the second.
4. Repeat steps 1-3 until the list is sorted.

Time complexity: each pass over the list moves the largest remaining number to its final position at the end, so in the worst case we repeat the first three steps up to n-1 times before the list is sorted.

Bubble sort: this method is called bubble sort, because the larger elements bubble to the top while the others sink to the bottom.

See the following for a visual representation:

![](https://upload.wikimedia.org/wikipedia/commons/c/c8/Bubble-sort-example-300px.gif)

Next, we will implement a solution:

In [29]:
def bubble_sort(nums):
    
    #copy list
    nums = list(nums)

    #iterate n-1 times
    for _ in range(len(nums) - 1):

        #for each element in the array except the last
        for i in range(len(nums) - 1):

            #compare one number to the next number:
            if nums[i] > nums[i + 1]:

                #Swap the numbers. (We can do both at once, because Python is awesome.)
                nums[i], nums[i + 1] = nums[i + 1], nums[i]

    #return the sorted list
    return nums

Let's test it on the first test case:

In [30]:
nums0, output0 = test0['input']['nums'], test0['output']

print('Input:', nums0)
print('Expected output:', output0)
result0 = bubble_sort(nums0)
print('Actual output:', result0)
print('Match:', result0 == output0)

Input: [4, 6, 3, 8, 5, 7, 2, 1]
Expected output: [1, 2, 3, 4, 5, 6, 7, 8]
Actual output: [1, 2, 3, 4, 5, 6, 7, 8]
Match: True
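
Step 4 of the plain-English solution says to repeat "until the list is sorted", but the implementation above always makes n-1 passes. A common variant, sketched below under the hypothetical name `bubble_sort_early_exit`, stops as soon as a full pass makes no swaps. This does not improve the worst case, but an already sorted list then needs only a single pass.

```python
def bubble_sort_early_exit(nums):
    # Copy the list so the caller's list is not modified
    nums = list(nums)

    # Make at most n-1 passes
    for _ in range(len(nums) - 1):
        swapped = False
        for i in range(len(nums) - 1):
            if nums[i] > nums[i + 1]:
                nums[i], nums[i + 1] = nums[i + 1], nums[i]
                swapped = True
        # A pass without swaps means every adjacent pair is in order
        if not swapped:
            break

    return nums

print(bubble_sort_early_exit([4, 6, 3, 8, 5, 7, 2, 1]))
```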


Let's think about time complexity.

We have two nested loops in bubble sort.

Each of them iterates up to n-1 times.

Therefore, sorting a list of n elements takes up to (n-1)*(n-1) comparisons.

(n-1)*(n-1) = n^2 - 2n + 1

Remember, we drop constants and lower-order terms, which leaves n^2 as the dominant term. Bubble sort therefore has a time complexity of O(N^2), also called quadratic complexity.
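
To make the quadratic growth concrete, the sketch below counts the comparisons made by the two nested loops of `bubble_sort` for a few list lengths. `count_comparisons` is a hypothetical helper written only for this illustration; it repeats the loop structure of `bubble_sort` without sorting anything.

```python
def count_comparisons(n):
    """Count the comparisons the two nested loops make for a list of length n."""
    count = 0
    # Outer loop: n-1 passes
    for _ in range(n - 1):
        # Inner loop: n-1 adjacent pairs per pass
        for i in range(n - 1):
            count += 1
    return count

for n in [8, 80, 800]:
    print(n, count_comparisons(n))
```

For n = 8 this gives 49, which matches (n-1)*(n-1). As n grows, making the list 10 times longer multiplies the number of comparisons by close to 100.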

What about space complexity?

Bubble sort itself needs only a constant amount of extra space for swapping, but our implementation first copies the input list so that the original is not modified. The copy takes space proportional to the length of nums, so the space complexity is O(N).

Where is the inefficiency?

The running time grows quadratically with the length of the list, so large lists take far longer than with more efficient algorithms. The reason is that only adjacent elements are compared, and each swap moves an element just one position at a time.

Let's look at another algorithm of the same complexity, insertion sort, and improve our efficiency after that.



In [31]:
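def insertion_sort(nums):
    # Copy the list so the caller's list is not modified
    nums = list(nums)
    for i in range(len(nums)):
        # Take out the current element; nums[:i] is already sorted
        cur = nums.pop(i)
        # Walk left past every element that is larger than cur
        j = i - 1
        while j >= 0 and nums[j] > cur:
            j -= 1
        # Put cur back just after the first element that is not larger
        nums.insert(j + 1, cur)
    return nums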

In [33]:
nums0, output0 = test0['input']['nums'], test0['output']

print('input', nums0)
print('expected output', output0)
result0 = insertion_sort(nums0)
print('Actual output', result0)
print('Do we have a match?:', result0 == output0)

input [4, 6, 3, 8, 5, 7, 2, 1]
expected output [1, 2, 3, 4, 5, 6, 7, 8]
Actual output [1, 2, 3, 4, 5, 6, 7, 8]
Do we have a match?: True


To sort more efficiently, we'll apply a strategy called **Divide and Conquer**, which has the following general steps:

1. Divide the inputs into two roughly equal parts.
2. Recursively solve the two problems individually for each of the parts.
3. Combine the results to solve the problem for the original inputs.
4. Include terminating conditions for small or indivisible inputs.

Here's a visual representation:


![](https://www.educative.io/api/edpresso/shot/5327356208087040/image/6475288173084672)

### Merge Sort

Applying divide and conquer to sorting gives an algorithm known as merge sort. Here's an example:

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/e/e6/Merge_sort_algorithm_diagram.svg/2560px-Merge_sort_algorithm_diagram.svg.png" width="480">
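
The four divide-and-conquer steps can be sketched for sorting as follows. This is a minimal illustration, not a tuned implementation. The helper name `merge` and the use of list slicing to split the input are choices made here, not taken from the text above.

```python
def merge(left, right):
    """Combine two sorted lists into one sorted list."""
    merged = []
    i, j = 0, 0
    while i < len(left) and j < len(right):
        # Take the smaller front element; <= keeps equal elements in their original order
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of the two lists still has elements left
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

def merge_sort(nums):
    # Terminating condition: lists of length 0 or 1 are already sorted
    if len(nums) <= 1:
        return list(nums)
    # Divide the input into two roughly equal parts
    mid = len(nums) // 2
    # Recursively sort each part, then combine the results
    return merge(merge_sort(nums[:mid]), merge_sort(nums[mid:]))

print(merge_sort([5, -12, 2, 6, 1, 23, 7, 7, -12, 6, 12, 1, -243, 1, 0]))
```

Like `bubble_sort`, this sketch returns a new list and leaves its input unchanged.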



