## Understanding radix sort

The first building block we will need is the `is_sorted` function.

In [5]:
def is_sorted(array):
    """Takes a sequence and returns True if and only if the sequence is sorted."""
    # check all the n-1 pairs of adjacent elements for 
    # order violation
    for i in range(1, len(array)):
        if array[i-1] > array[i]:
            return False
    # if no violations, then by transitivity of <= the sequence is sorted.
    return True

In [7]:
# Verify the implementation on a few test cases
example1 = [1,4,6,7,8]
example2 = [1,4,7,6,8]
print(example1, is_sorted(example1))
print(example2, is_sorted(example2))

([1, 4, 6, 7, 8], True)
([1, 4, 7, 6, 8], False)


## Stable sorting

Python's built-in sort is stable: if two elements have the same *sorting key*, they appear in the output in the same order in which they appeared in the input.

Let's see an example: we have a list of pairs $(a,b)$ and we want to sort them in nondecreasing order by $a$, breaking ties by nondecreasing $b$.

In [49]:
example = [ (3,1),(3,2),(1,1),(1,2), (2,2),(2,1)]
example

[(3, 1), (3, 2), (1, 1), (1, 2), (2, 2), (2, 1)]

We can achieve that by sorting first by $b$ and then **stable-sorting** by $a$.

In [50]:
sorted_idx2 = sorted(example, key=lambda x: x[1])
sorted_idx2

[(3, 1), (1, 1), (2, 1), (3, 2), (1, 2), (2, 2)]

In [51]:
sorted_idx12 = sorted(sorted_idx2, key=lambda x: x[0])
sorted_idx12

[(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)]
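As a quick check of the two-pass trick, here is a minimal sketch that compares it with a single sort on the whole pair. Python compares tuples lexicographically, which is exactly the ordering we asked for. The names `two_pass` and `one_pass` are only illustrative.

```python
example = [(3, 1), (3, 2), (1, 1), (1, 2), (2, 2), (2, 1)]

# sort by the secondary key b first, then stable-sort by the primary key a
two_pass = sorted(sorted(example, key=lambda x: x[1]), key=lambda x: x[0])

# tuples compare lexicographically: first by a, then by b
one_pass = sorted(example)

print(two_pass == one_pass)
```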

### Unstable sort example

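A sort does not have to be stable. For example, the merge sort implementation below is not: when two keys are equal, its `merge` step takes the element from the second list first. (Merge sort can be made stable by preferring the first list on ties.)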

In [58]:
def merge(list1, list2, key=lambda x:x):
    res = []
    ptr1, ptr2 = 0, 0
    while len(res) < len(list1) + len(list2):
        if ptr2 == len(list2) or (ptr1 < len(list1) and key(list1[ptr1]) < key(list2[ptr2])):
            res.append(list1[ptr1])
            ptr1 += 1
        else:
            res.append(list2[ptr2])
            ptr2 += 1
    return res

def merge_sort(array, key=lambda x: x):
    if len(array) in [0,1]:
        return array
    else:
        sorted_left  = merge_sort(array[:len(array) // 2], key)
        sorted_right = merge_sort(array[len(array) // 2:], key)
    return merge(sorted_left, sorted_right, key)

In [59]:
sorted_idx2 = merge_sort(example, key=lambda x: x[1])
sorted_idx2

[(2, 1), (1, 1), (3, 1), (2, 2), (1, 2), (3, 2)]

In [114]:
sorted_idx12 = merge_sort(sorted_idx2, key=lambda x: x[0])
print(sorted_idx12)
print("Notice that secondary sorting criterion is violated.")

[(1, 2), (1, 1), (2, 2), (2, 1), (3, 2), (3, 1)]
Notice that secondary sorting criterion is violated.
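For contrast, here is a sketch of how the merge step could be made stable. The only real change is that ties go to the left list (`<=` instead of `<`). The names `stable_merge` and `stable_merge_sort` are made up for this illustration.

```python
def stable_merge(list1, list2, key=lambda x: x):
    res = []
    ptr1, ptr2 = 0, 0
    while len(res) < len(list1) + len(list2):
        # on ties (<=) take from list1, whose elements came first in the input
        if ptr2 == len(list2) or (ptr1 < len(list1) and
                                  key(list1[ptr1]) <= key(list2[ptr2])):
            res.append(list1[ptr1])
            ptr1 += 1
        else:
            res.append(list2[ptr2])
            ptr2 += 1
    return res

def stable_merge_sort(array, key=lambda x: x):
    if len(array) <= 1:
        return array
    mid = len(array) // 2
    return stable_merge(stable_merge_sort(array[:mid], key),
                        stable_merge_sort(array[mid:], key),
                        key)

example = [(3, 1), (3, 2), (1, 1), (1, 2), (2, 2), (2, 1)]
by_b = stable_merge_sort(example, key=lambda x: x[1])
by_a_then_b = stable_merge_sort(by_b, key=lambda x: x[0])
print(by_a_then_b)
```

With this variant the secondary criterion should survive the second pass, just as it did with the built-in `sorted`.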


## Counting Sort

In order to keep complexity at $O(n)$, we will need to devise a procedure that sorts without using comparisons.

Assume the array only contains the elements $0, 1, \ldots, k-1$. We know that all zeros come before all ones, and so on. We can therefore put the numbers into $k$ different buckets and later read them off in order.

In [221]:
def count_sort(array, k, key=lambda x: x):
    """Stable sorts array by using key to determine ordering of elements.
    
    Assumes key(element) is in range(0, k) for every element."""
    # initialize array 
    buckets = [[] for _ in range(k)]
    # for every key store all the elements
    # with that key
    for element in array:
        buckets[key(element)].append(element)
    output = []
    # read numbers from buckets in order
    for bucket in buckets:
        for element in bucket:
            output.append(element)
    return output

In [222]:
count_sort([4,3,2,5,5,1,2], 10)

[1, 2, 2, 3, 4, 5, 5]
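The `key` argument matters once the elements carry more than a number. Below is a small sketch that repeats a compact version of `count_sort` so that it runs on its own. The `(grade, name)` records are invented for illustration.

```python
def count_sort(array, k, key=lambda x: x):
    # same bucket-based procedure as above, written compactly
    buckets = [[] for _ in range(k)]
    for element in array:
        buckets[key(element)].append(element)
    return [element for bucket in buckets for element in bucket]

# hypothetical records: (grade, name), with grades in range(0, 4)
records = [(2, "ann"), (0, "bob"), (2, "cat"), (1, "dan"), (0, "eve")]
by_grade = count_sort(records, 4, key=lambda r: r[0])
print(by_grade)
# records with equal grades keep their input order
# (bob stays before eve, ann stays before cat)
```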

### Count sort complexity analysis

We have the following steps:
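- allocate space for $b$ buckets (the parameter `k` in the code): $O(b)$
- loop through all the elements in the input array and put them in buckets: $O(n)$
- read the elements back out of the buckets: $O(n + b)$, since every bucket is visited even when it is empty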

Therefore the total complexity is $O(n+b)$


## Radix sort idea

Imagine that you want to compare two long numbers, for example 85823421348134214 and 85823421348452456. The algorithm you would use is to compare the first digits and, if they are the same, compare the next digits, and so on. We can say that the first digit is the primary comparison criterion, the second digit is the secondary criterion, etc. This is almost correct, but we need to pad the shorter number with extra zeros at the beginning so that both numbers have the same length (because shorter numbers come before longer numbers).
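The zero-padding remark can be checked with a small sketch: once two non-negative integers are written as equally long digit strings, comparing them digit by digit from the left (which is what string comparison does) agrees with comparing the numbers. The helper name `padded` is hypothetical.

```python
def padded(number, width):
    """Decimal digits of a non-negative integer, left-padded with zeros."""
    return str(number).zfill(width)

a, b = 85823421348134214, 85823421348452456
width = max(len(str(a)), len(str(b)))
# digit-by-digit comparison from the left agrees with numeric comparison
print(padded(a, width) < padded(b, width), a < b)

# without padding, the shorter number can look bigger: compare "9" with "10"
print(str(9) < str(10), padded(9, 2) < padded(10, 2))
```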

Radix sort uses this idea directly for sorting. It first sorts the numbers by the last digit. Then it *stable-sorts* them by the second-to-last digit (making the second-to-last digit the primary sorting criterion and the last digit the secondary one), and so on. At the end of that process the numbers are sorted in exactly the order discussed above.

To implement that idea let's first look at how we would obtain the digits. 

In [223]:
def ith_digit(number, i):
    """Returns the i-th digit from the end. 
    
    i=0 returns the very last digit."""
    for _ in range(i):
        number //= 10
    return number % 10

In [224]:
print(ith_digit(123, 0))
print(ith_digit(123, 1))
print(ith_digit(123, 2))
print(ith_digit(123, 3))
print(ith_digit(123, 4))

3
2
1
0
0


Sweet! We have a function that returns the i-th digit, and it even yields additional zeros at the front, just what we needed.
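The same digit can also be obtained without a loop by dividing by a power of ten. This is only an alternative sketch (the name `ith_digit_pow` is made up), assuming non-negative integers:

```python
def ith_digit_pow(number, i):
    """i-th decimal digit from the end of a non-negative integer (i=0 is the last)."""
    return (number // 10 ** i) % 10

print([ith_digit_pow(123, i) for i in range(5)])
```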


## Radix Sort using digit $i$ from the end

The idea is to use count sort with the digit being the key. 

For example, if we sort `[123, 42, 73]` by the last digit, bucket number two will hold one number, `[42]`, bucket number three will hold two numbers, `[123, 73]`, and the remaining eight buckets will be empty. It is important that bucket number three holds `[123, 73]` and not `[73, 123]`: if we read out the numbers in the order in which they appear in the buckets, we get a stable sort.

In [225]:
def radix_sort_by_ith_digit(array, i):
    return count_sort(array, 
                      10,     # we have 10 different digits.
                      key=lambda number: ith_digit(number, i)) # use i-th digit as a key.

In [226]:
# sort by the last digit
pass1 = radix_sort_by_ith_digit([123,42,73], 0)
pass1

[42, 123, 73]

In [227]:
# sort result of previous pass by the second to last digit
pass2 = radix_sort_by_ith_digit(pass1, 1)
pass2

[123, 42, 73]

In [228]:
# sort result of previous pass by the third to last digit
# none of the numbers has more than three digits, so we are done.
pass3 = radix_sort_by_ith_digit(pass2, 2)
pass3

[42, 73, 123]

What happened above is exactly radix sort! Sort iteratively by digits further and further from the end until the sequence ends up sorted.

In [229]:
def radix_sort(array):
    """Returns array sorted in nondecreasing order.
    
    The sorting procedure is stable."""
    i = 0
    while True:
        if is_sorted(array):
            # we stop once the array is sorted
            # the latest this can happen is when 
            # we run the number of passes equal to
            # the length of the longest number
            break
        # stable sort by i-th digit.
        array = radix_sort_by_ith_digit(array, i)
        i += 1
    return array

In [230]:
radix_sort([123,42,73])

[42, 73, 123]

In [231]:
# harder example
radix_sort([123,42,73, 123123, 142124, 524, 512, 5214])

[42, 73, 123, 512, 524, 5214, 123123, 142124]
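The `is_sorted` check is one way to decide when to stop. Another common choice, sketched here under the assumption that all numbers are non-negative integers, is to run exactly as many passes as the largest number has digits. The function name is hypothetical, and the bucket pass is inlined so the snippet runs on its own.

```python
def radix_sort_fixed_passes(array):
    """Sorts non-negative integers using one stable bucket pass per decimal digit."""
    if not array:
        return []
    largest = max(array)
    divisor = 1  # equals 10**i for the digit handled in the current pass
    result = list(array)
    while largest // divisor > 0:
        buckets = [[] for _ in range(10)]
        for number in result:
            buckets[(number // divisor) % 10].append(number)
        # reading buckets in order keeps the pass stable
        result = [number for bucket in buckets for number in bucket]
        divisor *= 10
    return result

print(radix_sort_fixed_passes([123, 42, 73, 123123, 142124, 524, 512, 5214]))
```

This variant never stops early, but it also avoids scanning the array with `is_sorted` before every pass.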

### Radix sort with different numeric base. 

Let's try to improve our algorithm slightly. Notice that the fact that we use digits in base $10$ is kind of arbitrary. How hard would it be to use any $b \geq 2$? In theory all we should be required to do is to change the digit extraction procedure and the number of buckets. 

Let's start with the digits.

In [232]:
def ith_digit(number, b, i):
    """Returns the i-th digit from the end (base b).
    
    i=0 returns the very last digit."""
    for _ in range(i):
        number //= b     # changed 10 to b
    return number % b    # changed 10 to b

In [233]:
print("7 in base 2")
print(ith_digit(7, 2, 0))
print(ith_digit(7, 2, 1))
print(ith_digit(7, 2, 2))
print(ith_digit(7, 2, 3))
print(ith_digit(7, 2, 4))

7 in base 2
1
1
1
0
0


In [234]:
print("7 in base 3")
print(ith_digit(7, 3, 0))
print(ith_digit(7, 3, 1))
print(ith_digit(7, 3, 2))
print(ith_digit(7, 3, 3))
print(ith_digit(7, 3, 4))

7 in base 3
1
2
0
0
0


Now we are ready to extend `radix_sort` to an arbitrary base.

In [235]:
def radix_sort_by_ith_digit(array, b, i):
    """Returns array sorted by i-th digit from the end (base b).
    
    The sorting procedure is stable."""
    return count_sort(array, b, key=lambda number: ith_digit(number, b, i))

In [236]:
def radix_sort(array, b):
    """Returns array sorted in nondecreasing order (using base-b digits).
    
    The sorting procedure is stable."""
    i = 0
    while True:
        if is_sorted(array):
            # we stop once the array is sorted
            # the latest this can happen is when 
            # we run the number of passes equal to
            # the length of the longest number
            break
        print("Iteration %d" % (i,))
        # stable sort by i-th digit.
        array = radix_sort_by_ith_digit(array, b, i)
        i += 1
    return array

Let's try sorting in base $b=2$

In [237]:
radix_sort([123,42,73], 2)

Iteration 0
Iteration 1
Iteration 2
Iteration 3
Iteration 4
Iteration 5
Iteration 6


[42, 73, 123]

Whoa! 7 iterations? That is a lot for sorting just 3 numbers. What if we increase the base, maybe to $b=1000$?

In [238]:
radix_sort([123,42,73], 1000)

Iteration 0


[42, 73, 123]

Much better: we only have one iteration. Notice, however, that we have many more buckets than numbers. Even though in theory we decrease the number of iterations, every iteration is now dominated by looping through every bucket. In this example, $1000$ buckets visited in one iteration are much worse than two buckets visited in each of $7$ iterations (a total of $14$ accesses). Array accesses actually contribute another $3$ operations per iteration ($7 \cdot 3 = 21$), adding up to a total of $35$ operations, but this is still much less than $1000$.

In [239]:
# much healthier choice
radix_sort([123,42,73], 5)

Iteration 0
Iteration 1
Iteration 2


[42, 73, 123]

## Radix sort complexity analysis

Let $b$ be the base and $n$ the size of the array. Moreover, let's assume that all the numbers in the array are less than or equal to $a$.


Single iteration of count sort is $O(n + b)$.

How many iterations are there? At most as many as the number of digits in the longest number: $O(\log_b a)$.

Therefore the total complexity of the algorithm is $O((n+b) \log_b a)$.

In theory we often assume that both $b$ and $a$ are constants: they are, after all, independent of $n$, so they do not influence how the run time grows with $n$. That's why some theorists say that radix sort is $O(n)$.
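To make the trade-off between bases concrete, the bound can be evaluated for the small example used earlier. This is a rough cost model, not a measurement: it simply multiplies the per-pass cost $n + b$ by the number of base-$b$ digits of $a$. The function names are made up for illustration.

```python
def num_digits(a, b):
    """Number of base-b digits of a positive integer a."""
    digits = 0
    while a > 0:
        a //= b
        digits += 1
    return digits

def estimated_operations(n, b, a):
    """(n + b) operations per pass, one pass per base-b digit of a."""
    return (n + b) * num_digits(a, b)

# n = 3 numbers, the largest being a = 123, as in radix_sort([123, 42, 73], b)
for b in [2, 5, 10, 1000]:
    print(b, num_digits(123, b), estimated_operations(3, b, 123))
```

For $b=2$ the model gives $7$ passes and $35$ operations, the same count as in the base-2 discussion, while $b=1000$ gives a single pass costing $1003$.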

## Exercises

1. We said that the best possible sorting algorithm has complexity $O(n \lg n)$. How is it possible that radix sort takes only $O(n)$ time?

2. Can you come up with a sorting problem where it would be hard to use Radix Sort?

# Be sure to check out the Performance of Radix Sort notebook!

# Aside: implementation of count sort from the lectures

This implementation has the same time and space complexity, but is faster in practice.

In [241]:
def count_sort_from_the_lecture(array, k, key=lambda x: x):
    # initialize array 
    count = [0 for _ in range(k)]
    # for every key count the number of times
    # it occurs
    for element in array:
        count[key(element)] += 1
    # compute cumulative count of occurrences
    for i in range(1, k):
        count[i] += count[i-1]
    # create output array
    output = [None for _ in range(len(array))]
    # fill in output array computing slots using
    # counts array
    for i in range(len(array) - 1, -1, -1):
        output[count[key(array[i])] - 1] = array[i]
        count[key(array[i])] -= 1
    return output

In [243]:
count_sort_from_the_lecture([4,3,2,5,5,1,2], 10)

[1, 2, 2, 3, 4, 5, 5]
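To see what the cumulative counts mean, here is a small trace for the same input, using `itertools.accumulate` from the standard library (Python 3). After the prefix sum, `cumulative[v]` is the number of elements with key at most `v`, so the last element with key `v` belongs in slot `cumulative[v] - 1`.

```python
from itertools import accumulate

array = [4, 3, 2, 5, 5, 1, 2]
k = 10

# count how many times each key occurs
count = [0] * k
for element in array:
    count[element] += 1

# prefix sums: cumulative[v] = number of elements with key <= v
cumulative = list(accumulate(count))

print(count)
print(cumulative)
# slot of the last 2 from the input in the sorted output
print(cumulative[2] - 1)
```

Walking the input from the end and decrementing the count after each placement is what keeps equal keys in their original order.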