<table border="0" align="left" width="700" height="144">
<tbody>
<tr>
<td width="120"><img width="100" src="https://static1.squarespace.com/static/5992c2c7a803bb8283297efe/t/59c803110abd04d34ca9a1f0/1530629279239/" /></td>
<td style="width: 600px; height: 67px;">
<h1 style="text-align: left;">Algorithms: Quicksort</h1>
<p><em>with excerpts from Grokking Algorithms, by Aditya Y. Bhargava</em></p>
<p><a href="https://colab.research.google.com/github/KenzieAcademy/python-notebooks/blob/master/demo_quicksort.ipynb"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab" align="left" width="188" height="32" /> </a></p>
</td>
</tr>
</tbody>
</table>

In order to arrive at Quicksort, let's start with the simpler **selection sort**.

### Selection Sort
Suppose you have a bunch of music on your computer, and, for each artist, you have a play count. You want to sort this list from most to least played, so that you can rank your favorite artists.

One way to accomplish this would be to go through the list and find the most-played artist and add that artist to a new list. Then, do it again to find the next-most-played artist. Keep doing this and you'll end up with a sorted list.

To illustrate this idea using a simple list of numbers, sorting from greatest to least, that would look like this:

```python
[3, 7, 2, 9, 4, 13, 1, 8]
                ^-highest -> [13]
[3, 7, 2, 9, 4, 1, 8]
          ^-highest -------> [13, 9]
[3, 7, 2, 4, 1, 8]
                ^----------> [13, 9, 8]
[3, 7, 2, 4, 1]
    ^----------------------> [13, 9, 8, 7]
[3, 2, 4, 1]
       ^-------------------> [13, 9, 8, 7, 4]
[3, 2, 1]
 ^-------------------------> [13, 9, 8, 7, 4, 3]
[2, 1]
 ^-------------------------> [13, 9, 8, 7, 4, 3, 2]
[1]
 ^-------------------------> [13, 9, 8, 7, 4, 3, 2, 1]
[]
DONE
```

In order to find the highest number, you have to check each item in the list. This represents **O(n)** time, or *linear time*. However, you also have to do that *n* times, once for each item in the list. That means that the time cost of a selection sort is **O(n x n)**, commonly known as **O(n<sup>2</sup>)**.
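
The list shrinks by one item on every pass, so the number of comparisons is really (n - 1) + (n - 2) + ... + 0 rather than exactly n x n. That sum is n(n - 1)/2, which still grows as O(n<sup>2</sup>). A minimal sketch that counts the comparisons (the helper name `count_selection_comparisons` is made up for this illustration):

```python
# Hypothetical helper, not part of the original notebook.
def count_selection_comparisons(n):
    """Count the comparisons made when selection-sorting n items."""
    comparisons = 0
    remaining = n
    while remaining > 0:
        # Finding the highest of `remaining` items takes remaining - 1 comparisons.
        comparisons += remaining - 1
        remaining -= 1
    return comparisons

# Doubling n roughly quadruples the work, which is the signature of O(n^2).
for n in (8, 16, 32):
    print(n, count_selection_comparisons(n))
```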

In [None]:
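# Selection sort (greatest to least, matching the walkthrough above)
def selection_sort(list_):
  remaining = list(list_)  # work on a copy so the caller's list is not emptied
  new_list = []
  while remaining:
    # scan the remaining items for the highest one
    highest_index = 0
    for j in range(1, len(remaining)):
      if remaining[j] > remaining[highest_index]:
        highest_index = j
    # move the highest remaining item to the end of the new list
    new_list.append(remaining.pop(highest_index))
  return new_list

selection_sort([3, 7, 2, 9, 4, 13, 1, 8])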

Selection sort is a neat algorithm, but it's not very fast.

**Quicksort** is a sorting algorithm that is much faster than selection sort. Before we dive into the details of it, let's take a look at the strategy behind it.

### Divide and Conquer
Quicksort uses a technique called **divide and conquer**, which is a recursive technique for solving problems. Divide and conquer is not a simple algorithm that you can apply to a problem. Instead, it is a way to think about a problem.

To see how it works, start with a simpler problem than sorting. You're given a list of numbers.

```python
[2, 4, 6]
```

You have to add up all the numbers and return the total. It's pretty easy to do this with a loop.

In [None]:
# iterative sum function
def get_sum(list_):
    total = 0
    for x in list_:
        total += x
    return total

get_sum([2, 4, 6])

But, how would you do this with a recursive function?

1. Figure out the base case.
  * What's the simplest list you could get? A list with 0 or 1 element is pretty easy to sum. Let's use an empty list as the base case.
2. You need to move closer to an empty list with every recursive call.
  * How do you reduce your problem size? The following two approaches are the same, but in the second version, you're passing a smaller list into the `sum()` function. That is, you *decrease the size of your problem*.
    * sum([2, 4, 6]) = 12
    * 2 + sum([4, 6]) = 2 + 10 = 12

Now, our `sum()` function could work like this:
  1. Get a list.
  2. If the list is empty, return zero.
  3. Otherwise, the total sum is the first number in the list plus the sum of the rest of the list.

This ends up looking like this:
* `sum([2, 4, 6])`
* `2 + sum([4, 6])`
* `2 + 4 + sum([6])`
* `2 + 4 + 6 + sum([])`
* `2 + 4 + 6 + 0`  # base case! `sum([])` returns 0
* `12`

In [None]:
# recursive sum function
def r_sum(list_):
  if not list_:
    # base case -- list is empty
    return 0
  # recursive case -- take the first number and
  # add it to the sum of the rest of the numbers
  # (slicing leaves the caller's list unchanged)
  return list_[0] + r_sum(list_[1:])

r_sum([2, 4, 6])

Now apply the same thinking to sorting. What's the simplest list a sorting algorithm could be given? Well, some lists don't need to be sorted at all (e.g., `[]`, `[20]`).

Empty lists and lists with just one element will be the base case. You can just return those lists as is &mdash; there's nothing to sort.

Let's use the following list as an example:
* `[33, 15, 10]`

Now, remember you're using divide and conquer, so you want to break down this list until you arrive at the base case.

Here's how **quicksort** works.
1. Pick an element from the list &mdash; this is called the *pivot*.
  * Let's use the first element of the list, `33`, as the pivot.
2. Find the elements smaller than the pivot and the elements larger than the pivot &mdash; this is called partitioning.
  * smaller than pivot: `[15, 10]`
  * pivot: `33`
  * greater than pivot: `[]`
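
The partitioning step on its own can be sketched as follows. The function name `partition` is made up for this illustration, and putting elements equal to the pivot in the "smaller" list is an assumption; the steps above do not say where ties go.

```python
# Hypothetical helper, not part of the original notebook.
def partition(list_):
    """Split a non-empty list into (smaller, pivot, greater),
    using the first element as the pivot."""
    pivot = list_[0]
    # Assumption: elements equal to the pivot go in the "smaller" list.
    smaller = [x for x in list_[1:] if x <= pivot]
    greater = [x for x in list_[1:] if x > pivot]
    return smaller, pivot, greater

print(partition([33, 15, 10]))  # ([15, 10], 33, [])
```

Note that `smaller` comes back as `[15, 10]`: partitioned, but not yet sorted.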

At this point, the two sub-lists, "smaller than" and "greater than", are not sorted. They're just partitioned. But if, by chance, they *were* sorted, then sorting the whole list would be pretty easy:

```python
# "smaller than" list already sorted
[10, 15] + [pivot] + []
```

So, how do you sort the sub-lists? Call quicksort on them! The right sub-list, `[]`, is a base case and comes back as is. The left sub-list, `[15, 10]`, has two elements, so quicksort partitions it again: with `15` as the pivot, the smaller-than list is `[10]` and the greater-than list is `[]`, both of which are base cases. Combine the results at each level and you get a sorted list!

```python
quicksort([15, 10]) + [33] + quicksort([])
[10, 15, 33]
```


In [33]:
# quicksort function
def quicksort(list_):
  if len(list_) < 2:
    # base case: lists with 0 or 1 element are already "sorted"
    return list_
  else:
    # recursive case
    pivot = list_[0]
    less = [i for i in list_[1:] if i <= pivot]  # sub-list of all elements less than or equal to the pivot
    greater = [i for i in list_[1:] if i > pivot]  # sub-list of all elements greater than the pivot
    return quicksort(less) + [pivot] + quicksort(greater)

In [None]:
print(quicksort([33, 15, 10]))
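
To watch the recursion unfold, here is a sketch of the same first-element-pivot approach with a print statement at every call. The name `quicksort_traced` and its `depth` parameter are made up for this illustration; the indentation shows how deep each call sits.

```python
# Hypothetical traced variant, for illustration only.
def quicksort_traced(list_, depth=0):
    indent = "  " * depth
    if len(list_) < 2:
        # base case: nothing to sort
        print(f"{indent}quicksort({list_}) -> base case")
        return list_
    pivot = list_[0]
    less = [i for i in list_[1:] if i <= pivot]
    greater = [i for i in list_[1:] if i > pivot]
    print(f"{indent}quicksort({list_}) -> pivot {pivot}, less {less}, greater {greater}")
    # sort each side one level deeper, then combine
    return (quicksort_traced(less, depth + 1)
            + [pivot]
            + quicksort_traced(greater, depth + 1))

result = quicksort_traced([33, 15, 10])
print(result)  # [10, 15, 33]
```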

### Big O
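With quicksort, the best case for time cost is also the average case! If you choose a random element from the list as the pivot, quicksort will complete in **O(*n* log *n*)** time on average. The worst case is **O(n<sup>2</sup>)**, which happens when the pivot is always the smallest or largest element (for example, when the first element is used as the pivot on a list that is already sorted).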
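
A random pivot could be sketched like this. This is one possible variant, not the notebook's own function: the name `quicksort_random` is made up, and it uses `random.randrange` from the standard library to pick the pivot's index.

```python
import random

# Hypothetical variant with a randomly chosen pivot, for illustration only.
def quicksort_random(list_):
    if len(list_) < 2:
        # base case: lists with 0 or 1 element are already sorted
        return list_
    pivot_index = random.randrange(len(list_))
    pivot = list_[pivot_index]
    # everything except the pivot itself
    rest = list_[:pivot_index] + list_[pivot_index + 1:]
    less = [x for x in rest if x <= pivot]
    greater = [x for x in rest if x > pivot]
    return quicksort_random(less) + [pivot] + quicksort_random(greater)

# Quick check against Python's built-in sorted().
data = [random.randint(0, 100) for _ in range(50)]
print(quicksort_random(data) == sorted(data))
```

Because the pivot is picked at random, an already-sorted input is no more likely to produce lopsided partitions than any other input.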

In practice, quicksort is one of the fastest sorting algorithms!