Chapter 13 Sorting with Divide and Conquer<br>

Divide and Conquer is a paradigm for algorithm design. It usually consists of 3 (plus one) parts. The first part is to divide the problem into 2 or more pieces. The second part is the conquer step, where one solves the problem on the pieces. The third part is the combine step where one combines the solutions on the parts into a solution on the whole.<br>

The description of these parts leads pretty directly to recursive algorithms, i.e. using recursion for the conquer part. The other part that appears in many such algorithms is a base case, as one might expect in a recursive algorithm. This is where you deal with inputs so small that they cannot be divided.<br>

In short, you divide the problem until it cannot be divided any further, solve those small pieces directly, and then combine their solutions.<br>
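
As a warm-up before sorting, the sketch below applies the same divide, conquer, combine, and base-case structure to a simpler problem: finding the largest item of a non-empty list. The function name `dc_max` is hypothetical and is used only for this illustration.

```python
def dc_max(L):
    # Base case: a single item cannot be divided further.
    if len(L) == 1:
        return L[0]
    # Divide: split the list into two halves.
    mid = len(L) // 2
    # Conquer: solve the problem recursively on each half.
    left_max = dc_max(L[:mid])
    right_max = dc_max(L[mid:])
    # Combine: the larger of the two answers is the answer for the whole list.
    return left_max if left_max >= right_max else right_max

print(dc_max([3, 9, 2, 7]))
```

Slicing is not the most efficient way to find a maximum; the point here is only the shape of the recursion.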

13.1 Mergesort<br>

The most direct application of the Divide and Conquer paradigm to the sorting problem is the mergesort algorithm. In this algorithm, all the difficult work is in the merge step.

In [3]:
def mergesort(L):
    if len(L) < 2:
        return
    
    mid = len(L) // 2
    A = L[:mid]
    B = L[mid:]
    
    mergesort(A)
    mergesort(B)

    merge(A, B, L)

def merge(A, B, L):
    i = 0
    j = 0
    while i < len(A) and j < len(B):
        if A[i] < B[j]:
            L[i + j] = A[i]
            i += 1
        else:
            L[i + j] = B[j]
            j += 1
        
    L[i + j:] = A[i:] + B[j:]

l=[3,2,1]
mergesort(l)
print(l)

[1, 2, 3]


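The last line of merge builds new lists by slicing and concatenating. To avoid that extra work, we can use the following updated function, which handles the leftover items inside the main loop.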

In [4]:
def merge(A, B, L):
    i, j = 0, 0
    while i < len(A) or j < len(B):
        if j == len(B) or (i < len(A) and A[i] < B[j]):
            L[i+j] = A[i]
            i = i + 1
        else:
            L[i+j] = B[j]
            j = j + 1

The complex if statement above relies heavily on something called short-circuited evaluation of boolean expressions. If we have a boolean operation like or, and the first operand is True, then we don’t have to evaluate the second operand to find out that the overall result will be True. Using this fact, Python will not even evaluate the second operand. Similarly, if we have an and expression and the first operand is False, then the second operand is never evaluated. The key to remember is that the order does matter here. The expression (i < len(A) and A[i] < B[j]) or j == len(B) is logically equivalent, but if we use this expression instead, it will raise an IndexError when j == len(B).
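
The effect of operand order can be checked on a tiny case. The sketch below uses made-up lists `A` and `B` with `B` already exhausted, and evaluates both orderings of the condition.

```python
A = [1, 5]
B = [2]
i, j = 1, 1  # B is exhausted (j == len(B)); A still has one item left

# Safe ordering: j == len(B) is True, so the rest is never evaluated.
safe = j == len(B) or (i < len(A) and A[i] < B[j])

# Unsafe ordering: i < len(A) is True, so A[i] < B[j] is evaluated
# and B[1] raises an IndexError.
try:
    unsafe = (i < len(A) and A[i] < B[j]) or j == len(B)
    raised = False
except IndexError:
    raised = True

print(safe, raised)
```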

13.1.1 An Analysis<br>

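The merge function for two lists whose lengths add up to n takes O(n) time.<br>
Each level of the recursion halves the list length, so there are about log n levels. The merges on any one level handle n items in total, so each level costs O(n), and thus, the total cost is ***O(n log n).***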
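
One way to sanity-check this count is to add up the work done by the merge calls without sorting anything. The helper below is a hypothetical sketch: it assumes that merging into a list of n items costs exactly n writes and follows the same recursive splitting as mergesort.

```python
def merge_writes(n):
    """Total items written by all merge calls when mergesort runs on n items."""
    if n < 2:
        return 0
    half = n // 2
    # Two recursive calls on the halves, plus n writes for this merge.
    return merge_writes(half) + merge_writes(n - half) + n

for k in (3, 10):
    n = 2 ** k
    print(n, merge_writes(n), n * k)
```

When n is a power of two, the count comes out to exactly n times log n (base 2), matching the analysis above.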

13.1.2 Merging Iterators<br>

We will implement mergesort using iterators.

In [4]:
# A simple example of how an iterator works

class SimpleIterator:
    def __init__(self):
        self._count = 0
    
    def __iter__(self):
        return self
    
    def __next__(self):
        if self._count < 10:
            self._count += 1
            return self._count
        else:
            raise StopIteration

iterator1 = SimpleIterator()
for x in iterator1:
    print(x)

1
2
3
4
5
6
7
8
9
10


In [5]:
iterator2 = SimpleIterator()
L = [2 * x for x in iterator2]
L

[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
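
The for loop and the list comprehension above drive the iterator for us. As a rough sketch of what happens underneath, the same protocol can be stepped through by hand with the built-in `iter` and `next` functions, shown here on an ordinary list.

```python
it = iter([10, 20])   # ask the list for an iterator
first = next(it)      # calls it.__next__() and returns 10
second = next(it)     # returns 20

# Once the items run out, next raises StopIteration,
# which is the signal a for loop uses to stop.
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, exhausted)
```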

In [8]:
class BufferedIterator:
    def __init__(self, i):
        self._i = iter(i)
        self._hasnext = True
        self._buffer = None
        self._advance()
    
    def peek(self):
        return self._buffer
    
    def hasnext(self):
        return self._hasnext
    
    def _advance(self):
        try:
            self._buffer = next(self._i)
        except StopIteration:
            self._buffer = None
            self._hasnext = False
    
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.hasnext():
            output = self.peek()
            self._advance()
            return output
        else:
            raise StopIteration

The BufferedIterator class is built from any iterator. It stays one step of the iteration ahead of the user and stores the next item in a buffer.

In [9]:
def merge(A, B):
    a = BufferedIterator(A)
    b = BufferedIterator(B)
    while a.hasnext() or b.hasnext():
        if not a.hasnext() or (b.hasnext() and b.peek() < a.peek()):
            yield next(b)
        else:
            yield next(a)

We can use this new merge iterator to write a new version of mergesort.

In [10]:
def mergesort(L):
    if len(L) > 1:
        m = len(L) // 2
        A, B = L[:m], L[m:]
        mergesort(A)
        mergesort(B)
        L[:] = merge(A, B)
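
For comparison, Python's standard library has a lazy merge of sorted iterables, `heapq.merge`. The sketch below swaps it in for the merge generator above; the name `mergesort_heapq` is hypothetical and chosen only to avoid clashing with the functions in this chapter.

```python
from heapq import merge as heap_merge  # standard-library merge of sorted iterables

def mergesort_heapq(L):
    if len(L) > 1:
        m = len(L) // 2
        A, B = L[:m], L[m:]
        mergesort_heapq(A)
        mergesort_heapq(B)
        # heap_merge returns an iterator; slice assignment consumes it into L.
        L[:] = heap_merge(A, B)

data = [5, 2, 4, 1, 3]
mergesort_heapq(data)
print(data)
```

The structure is the same as before: only the merge step changes.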

13.2 Quicksort<br>

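The mergesort code does a lot of slicing. Each slice creates a copy, which costs extra space. Quicksort is another divide and conquer sorting algorithm, and it can be written to sort the items **in place**. We start with a simple version, quicksorted, that returns a new sorted list, and then give an in-place version.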

In [12]:
def quicksorted(L):
    #base case
    if len(L) < 2:
        return L[:]
    # Divide
    pivot = L[-1]
    LT = [e for e in L if e < pivot]
    ET = [e for e in L if e == pivot]
    GT = [e for e in L if e > pivot]
    # Conquer
    A = quicksorted(LT)
    B = quicksorted(GT)
    # Combine
    return A + ET + B

In [13]:
# inplace version

def quicksort(L, left = 0, right = None):
    if right is None:
        right = len(L)
    
    if right - left > 1:
        mid = partition(L, left, right)

        quicksort(L, left, mid)
        quicksort(L, mid + 1, right)

def partition(L, left, right):
    pivot = right - 1
    i = left
    j = pivot - 1

    while i < j:
        while L[i] < L[pivot]:
            i += 1
        
        while i < j and L[j] >= L[pivot]:
            j -= 1
        
        if i < j:
            L[i], L[j] = L[j], L[i]
    
    if L[pivot] <= L[i]:
        L[pivot], L[i] = L[i], L[pivot]
        pivot = i
    
    return pivot

L = [5,2,3,1,4]
quicksort(L)
print(L)

[1, 2, 3, 4, 5]


Below is a second implementation. It uses a (private) helper function rather than using default parameters to handle the initial call.<br>
The main difference in the code below is that it uses a random pivot element instead of always choosing the last element. Notice that random is not the same as arbitrary.

In [16]:
from random import randrange

def quicksort(L):
    _quicksort(L, 0, len(L))

def _quicksort(L, left, right):
    if right - left > 1:
        mid = partition(L, left, right)
        _quicksort(L, left, mid)
        _quicksort(L, mid+1, right)

def partition(L, left, right):
    pivot = randrange(left, right)
    L[pivot], L[right -1] = L[right -1], L[pivot]
    i, j, pivot = left, right - 2, right - 1
    
    while i < j:
        while L[i] < L[pivot]:
            i += 1
    
        while i < j and L[j] >= L[pivot]:
            j -= 1
    
        if i < j:
            L[i], L[j] = L[j], L[i]
    
    if L[pivot] <= L[i]:
        L[pivot], L[i] = L[i], L[pivot]
        pivot = i
    
    return pivot

L = list(reversed(range(1000)))
quicksort(L)
print(L)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221,
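
To see why the pivot choice matters, the sketch below measures how deep the recursion goes on an already sorted list. It is only an illustration under stated assumptions: `depth` is a hypothetical helper built on the list-based quicksorted idea rather than the in-place code, and the fixed seed is there only to make the run repeatable.

```python
from random import Random

def depth(L, choose):
    """Recursion depth of a list-based quicksort whose pivot index is choose(len(L))."""
    if len(L) < 2:
        return 0
    pivot = L[choose(len(L))]
    LT = [e for e in L if e < pivot]
    GT = [e for e in L if e > pivot]
    return 1 + max(depth(LT, choose), depth(GT, choose))

rng = Random(0)            # fixed seed, an assumption made for repeatability
data = list(range(200))    # already sorted input

last_depth = depth(data, lambda n: n - 1)             # always pick the last item
random_depth = depth(data, lambda n: rng.randrange(n))  # pick a random item

print(last_depth, random_depth)
```

With the last item as the pivot, every split on sorted input is as lopsided as possible, so the depth grows linearly with the list length. A random pivot makes such a run of bad splits very unlikely.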

You can visualize the mergesort algorithm [here](https://www.hackerearth.com/practice/algorithms/sorting/merge-sort/visualize/).