### Parallel Quicksort in Python [4 bonus points]

Consider the following sequential algorithm for Quicksort, taken from Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein. Let `a: 0 .. N – 1 → integer` for constant `N ≥ 0` be a global variable. In algorithms, a call to procedure `p` with value parameters `E`, `F` and results `x`, `y` is written as `x, y ← p(E, F)`.

```algorithm
procedure partition(p, r: integer) → (q: integer)
    x, i := a(r), p – 1
    for j = p to r – 1 do
        if a(j) ≤ x then
            i := i + 1 ; a(i), a(j) := a(j), a(i)
    a(i + 1), a(r) := a(r), a(i + 1)
    q := i + 1

procedure quicksort(p, r: integer)
    if p < r then
        q ← partition(p, r)
        quicksort(p, q – 1)
        quicksort(q + 1, r)
```


Implement Quicksort in Python by sorting in parallel after partitioning; for this, the parent thread can continue sorting one segment while a child thread is created for sorting the other segment. However, create a new thread only if both segments contain more than `S` elements; otherwise, sort both segments sequentially.

In [14]:
from threading import Thread
from random import randint
from time import time

def partition(p, r):
    # Partition a[p .. r] around the pivot a[r]; return the pivot's final index.
    x, i = a[r], p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1; a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1
    

def sequentialsort(p, r):
    if p < r:
        q = partition(p, r)
        sequentialsort(p, q - 1)
        sequentialsort(q + 1, r)
        
def parallelsort(p, r):
    if p < r:
        q = partition(p, r)

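        # The segments a[p .. q-1] and a[q+1 .. r] hold q - p and r - q elements.
        if q - p <= S or r - q <= S:
            # At least one segment has at most S elements: sort both sequentially.
            sequentialsort(p, q - 1)
            sequentialsort(q + 1, r)

        else:
            #source: https://stackoverflow.com/questions/35244577/is-it-possible-to-use-an-inline-function-in-a-thread-call
            # A child thread sorts the right segment while the parent continues with the left one.
            rt = Thread(target=lambda : parallelsort(q + 1, r))

            rt.start()
            parallelsort(p, q - 1)
            rt.join()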

def quicksort(N, step):
    global a, S
    a, S = [randint(0, 10000) for _ in range(N)], step

    start = time()    
    parallelsort(0, N - 1)
    end = time()

    print(str(int((end - start) * 1000)) + " ms")
    for i in range(1, N): assert a[i - 1] <= a[i]    

Use the cell below to test your implementation.

In [15]:
quicksort(100000, 1)

5968 ms


In [16]:
quicksort(100000, 10)

2017 ms


In [10]:
quicksort(100000, 100)

671 ms


In [11]:
quicksort(100000, 1000)

471 ms


In [12]:
quicksort(100000, 10000)

431 ms


In [13]:
quicksort(100000, 100000)

284 ms


Now summarize your observations about the running time for varying values of `S` and explain!

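I notice that as the value of `S` increases, the running time decreases, from 5968 ms for `S = 1` down to 284 ms for `S = 100000`. A larger `S` means that fewer threads are created, not more: work is handed to a thread only if both segments contain more than `S` elements, so with `S = 100000 = N` no thread is created at all and the sort runs entirely sequentially. Creating, starting, and joining a thread costs far more than sorting a short segment, so for small `S` the running time is dominated by thread overhead. In addition, assuming the notebook runs on standard CPython, the global interpreter lock lets only one thread execute Python bytecode at a time, so the threads cannot speed up this CPU-bound sort, and the purely sequential run is the fastest.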
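
To make the link between `S` and the amount of threading concrete, the sketch below counts how often the threaded branch of `parallelsort` would be taken for several values of `S`. The helper `count_spawns` is hypothetical and not part of the assignment; it repeats the same partitioning on a private copy of the data and only counts, without starting any threads. The input size of 2000 and the fixed seed are assumptions chosen to keep the run short and reproducible.

```python
from random import Random

def count_spawns(data, S):
    """Count how often the sort would split work between threads.

    Hypothetical helper: follows the same recursion as parallelsort on a
    copy of data, but counts the threaded branch instead of starting threads.
    """
    a = list(data)  # work on a copy so the caller's list is left unchanged

    def partition(p, r):
        x, i = a[r], p - 1
        for j in range(p, r):
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[r] = a[r], a[i + 1]
        return i + 1

    def visit(p, r):
        if p >= r:
            return 0
        q = partition(p, r)
        if q - p <= S or r - q <= S:
            # Sequential branch: nothing below this point uses threads.
            return 0
        # Threaded branch: one split here plus the splits in both segments.
        return 1 + visit(p, q - 1) + visit(q + 1, r)

    return visit(0, len(a) - 1)

rng = Random(0)  # fixed seed so that the counts are reproducible
data = [rng.randint(0, 10000) for _ in range(2000)]
counts = {S: count_spawns(data, S) for S in (1, 10, 100, 1000, 2000)}
for S, c in counts.items():
    print(f"S = {S:4d}: {c} splits")
```

Because the partitioning is deterministic for a given input, every split that happens for a larger `S` also happens for a smaller one, so the counts can only shrink as `S` grows, and they reach zero once `S` is at least the length of the list.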