# Sorting
There are two main approaches to sorting: comparison sorts and specialized sorts.

### Comparison sorts
Comparison sorts are general-purpose: they can be applied to any data type for which an ordering can be defined. They abstract the logic of 'what goes before what?' into a comparator function.
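As a small illustration of that separation, the sketch below plugs a hand-written comparator into Python's built-in sort via `functools.cmp_to_key`. The comparator `by_length_then_alpha` and the sample words are hypothetical, chosen only to show that the ordering rule lives entirely in the comparator.

```python
from functools import cmp_to_key

def by_length_then_alpha(a, b):
    """Hypothetical comparator: shorter strings first, ties broken alphabetically.

    Returns a negative number if a goes first, positive if b goes first, 0 if equal.
    """
    if len(a) != len(b):
        return len(a) - len(b)
    return (a > b) - (a < b)

words = ["pear", "fig", "banana", "kiwi", "apple"]
# cmp_to_key adapts a two-argument comparator to the key= interface
sorted_words = sorted(words, key=cmp_to_key(by_length_then_alpha))
print(sorted_words)  # ['fig', 'kiwi', 'pear', 'apple', 'banana']
```

Swapping in a different comparator changes the ordering without touching the sorting algorithm itself.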

#### Merge sort
A recursive algorithm that splits the array in half, sorts each half recursively, then merges the two sorted halves. Time complexity: $O(n \log n)$

In [4]:
def cmp(x, y):
    if x < y: return -1
    elif x == y: return 0
    else: return 1

def merge(arr1, arr2):
    i, j = 0, 0
    merged_arr = []
    while i < len(arr1) and j < len(arr2):
        if cmp(arr1[i],arr2[j]) == 1:
            merged_arr.append(arr2[j])
            j += 1
        else:
            merged_arr.append(arr1[i])
            i += 1
    return merged_arr + arr1[i:] + arr2[j:]

def merge_sort(arr):
    n = len(arr)
    if n <= 1:
        return arr
    left = merge_sort(arr[:n//2])
    right = merge_sort(arr[n//2:])
    return merge(left, right)

In [5]:
import random
arr = random.sample(range(100),20)
print(f"Initial array: {arr}")

print(f"Sorted array: {merge_sort(arr)}")

Initial array: [58, 74, 9, 28, 55, 73, 78, 0, 56, 22, 5, 32, 6, 40, 13, 27, 8, 30, 42, 64]
Sorted array: [0, 5, 6, 8, 9, 13, 22, 27, 28, 30, 32, 40, 42, 55, 56, 58, 64, 73, 74, 78]


#### Quicksort
Quicksort picks an element at random to use as a pivot, partitions the array into three parts (lower than, equal to and higher than the pivot), then recursively sorts the lower and higher parts.
Performance depends on the luck of the pivots drawn. The worst case is $O(n^{2})$, though the probability of hitting it is negligible for a large array. The expected running time is $O(n \log n)$, and this bound also holds with high probability.

In [7]:
def quicksort(arr):
    n = len(arr)
    if n <= 1: return arr
    pivot = random.choice(arr)
    lower, equal, higher = [], [], []
    for x in arr:
        if x < pivot:
            lower.append(x)
        elif x == pivot:
            equal.append(x)
        else:
            higher.append(x)
    return quicksort(lower) + equal + quicksort(higher)

In [8]:
quicksort(arr)

[0, 5, 6, 8, 9, 13, 22, 27, 28, 30, 32, 40, 42, 55, 56, 58, 64, 73, 74, 78]
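To make the effect of pivot choice concrete, the sketch below measures how deep the recursion goes on an already sorted input. It is an illustration only: `recursion_depth` is a hypothetical helper that mirrors the three-way partition above but tracks depth instead of sorting, and the first-element pivot stands in for an unlucky sequence of draws.

```python
import random

def recursion_depth(arr, choose_pivot, depth=1):
    """Return the deepest recursion level a three-way quicksort would reach."""
    if len(arr) <= 1:
        return depth
    pivot = choose_pivot(arr)
    lower = [x for x in arr if x < pivot]
    higher = [x for x in arr if x > pivot]
    return max(recursion_depth(lower, choose_pivot, depth + 1),
               recursion_depth(higher, choose_pivot, depth + 1))

already_sorted = list(range(200))

# Always taking the first element peels off one item per level: depth grows linearly.
first_pivot_depth = recursion_depth(already_sorted, lambda a: a[0])
# A random pivot usually splits the array far more evenly.
random_pivot_depth = recursion_depth(already_sorted, random.choice)

print(f"First-element pivot: depth {first_pivot_depth}")
print(f"Random pivot: depth {random_pivot_depth}")
```

With the first-element pivot the depth equals the array length, which is the $O(n^{2})$ behaviour; the random pivot's depth varies from run to run but is typically a small multiple of $\log_2 n$.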

### Specialized sorts
Some problems have extra structure, such as a known small range of values, that allows a more efficient approach than a general comparison sort.

#### Counting sort
This approach is useful when the values fall within a small range. It involves a single pass through the list to count the occurrences of each value, followed by the construction of a new list from the counts. Time complexity is $O(n + k)$, where $k$ is the size of the value range; this is $O(n)$ when $k$ is small relative to $n$.

In [10]:
def counting_sort(arr, lower_bound, upper_bound):
    counts = [0] * (upper_bound + 1 - lower_bound)
    for entry in arr:
        counts[entry - lower_bound] += 1
    result = []
    for entry, count in enumerate(counts):
        result += [entry + lower_bound] * count
    return result

In [11]:
arr = [random.randint(50,55) for i in range(20)]
print(f"Initial array: {arr}")

print(f"Count sorted: {counting_sort(arr, 50, 55)}")

Initial array: [51, 54, 51, 51, 53, 54, 50, 55, 50, 55, 50, 50, 52, 55, 55, 51, 51, 51, 50, 53]
Count sorted: [50, 50, 50, 50, 50, 51, 51, 51, 51, 51, 51, 52, 53, 53, 54, 54, 55, 55, 55, 55]


### Built-in sort
Python has a `sorted()` function that takes any iterable and returns a new, sorted list. Lists also have a `sort()` method that sorts the list in place. Both accept an optional `key` function and a `reverse` flag. Note that even though `sort()` sorts in place, it can still require $O(n)$ extra space in the worst case.
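The difference between the two is easy to miss, so here is a minimal sketch using a made-up list of names. It shows that `sorted()` leaves its input alone, that `list.sort()` returns `None`, and how a `key` function changes the ordering.

```python
names = ["delta", "alpha", "Charlie", "Bravo"]

# sorted() returns a new list; the default string ordering puts
# uppercase letters before lowercase ones.
by_default = sorted(names)
print(by_default)  # ['Bravo', 'Charlie', 'alpha', 'delta']
print(names)       # ['delta', 'alpha', 'Charlie', 'Bravo'] (unchanged)

# key is applied to each element to produce the value that is compared.
by_length = sorted(names, key=len)
print(by_length)   # ['delta', 'alpha', 'Bravo', 'Charlie']

# list.sort() reorders the list itself and returns None.
result = names.sort(key=str.lower)
print(result)      # None
print(names)       # ['alpha', 'Bravo', 'Charlie', 'delta']
```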

#### Case-insensitive sort
Sort a given array of strings lexicographically, ignoring case, in descending order.

In [14]:
def case_insensitive_sort(arr):
    arr.sort(key=lambda s: s.lower(), reverse=True)
    return arr

In [15]:
arr = ['apple', 'Banana', '3', 'Cherry', '24', 'GRAPE', '30']
print(f"Initial array: {arr}")

print(f"After case-insensitive sort: {case_insensitive_sort(arr)}")

Initial array: ['apple', 'Banana', '3', 'Cherry', '24', 'GRAPE', '30']
After case-insensitive sort: ['GRAPE', 'Cherry', 'Banana', 'apple', '30', '3', '24']


#### Sort by element at index
Given an array of intervals $[x,y]$, sort the array by the second value, $y$.

In [17]:
def sort_by_index(arr, idx):
    # Avoid naming the lambda parameter `int`, which would shadow the built-in
    arr.sort(key=lambda interval: interval[idx])
    return arr

In [18]:
arr = [[3,9],[1,4],[4,7],[2,3]]
print(f"Initial array: {arr}")

print(f"After sort by index: {sort_by_index(arr,idx=1)}")

Initial array: [[3, 9], [1, 4], [4, 7], [2, 3]]
After sort by index: [[2, 3], [1, 4], [4, 7], [3, 9]]


#### Sort by field
Create an array of shuffled Card objects, representing a deck of playing cards. Sort the cards by value, with Ace, Jack, Queen and King valued at 1, 11, 12 and 13, breaking ties by suit in the order Clubs < Hearts < Spades < Diamonds.

In [20]:
# First, create a card object
class Card:
    def __init__(self, value, suit):
        if not isinstance(value, (str, int)):
            raise TypeError(f'Invalid value: {value!r}')
        if isinstance(value, int):
            # Integer values cover the number cards 2-10
            if not 2 <= value <= 10:
                raise ValueError(f'Invalid value: {value!r}')
            self.value = str(value)
            self.rank = value
        elif len(value) == 1:
            # Single-character strings cover '2'-'9'
            if not '2' <= value <= '9':
                raise ValueError(f'Invalid value: {value!r}')
            self.value = value
            self.rank = int(value)
        else:
            # Longer strings cover '10' and the named cards
            special_values = {'ace':1, '10':10, 'jack':11, 'queen':12, 'king':13}
            if value.lower() not in special_values:
                raise ValueError(f'Invalid value: {value!r}')
            self.value = value.lower()
            self.rank = special_values[value.lower()]
        if suit.lower() not in ['clubs', 'hearts', 'spades', 'diamonds']:
            raise ValueError(f'Invalid suit: {suit!r}')
        self.suit = suit.lower()

In [21]:
# Now create a randomly ordered hand of cards
def deal(n):
    if n > 52:
        raise ValueError('Cannot deal more than 52 cards')
    hand = []
    suits = ['clubs', 'hearts', 'spades', 'diamonds']
    values = ['ace', 'jack', 'queen', 'king'] + [str(x) for x in range(2,11)]
    potential_deck = [[v,s] for v in values for s in suits]
    for value, suit in random.sample(potential_deck, n):
        hand.append(Card(value, suit))
    return hand

In [22]:
# Test the deal() function
hand = deal(3)
for card in hand:
    print(f"Suit: {card.suit}, value: {card.value}, rank: {card.rank}")

Suit: diamonds, value: 5, rank: 5
Suit: clubs, value: ace, rank: 1
Suit: diamonds, value: king, rank: 13


In [23]:
# Now create the sort function
def sort_by_field(hand):
    suit_order = {'clubs':0, 'hearts':1, 'spades':2, 'diamonds':3}
    # Tuple key: compare by rank first, then by suit order to break ties
    hand.sort(key=lambda card: (card.rank, suit_order[card.suit]))
    return hand

In [24]:
n = 12
print(f"Verifying sort order for randomly dealt selection of {n} cards...:")
for count, card in enumerate(sort_by_field(deal(n))):
    print(f"Card {count+1}: {card.value} of {card.suit}")

Verifying sort order for randomly dealt selection of 12 cards...:
Card 1: 2 of spades
Card 2: 3 of clubs
Card 3: 3 of spades
Card 4: 3 of diamonds
Card 5: 7 of spades
Card 6: 7 of diamonds
Card 7: 8 of hearts
Card 8: 10 of diamonds
Card 9: queen of spades
Card 10: queen of diamonds
Card 11: king of hearts
Card 12: king of spades


#### New deck order
Sort a given deck into 'new deck order', with suits separated, in the order Hearts, Clubs, Diamonds, Spades. Ace is low, king is high.

In [28]:
def new_deck_order(hand):
    suit_order = {'hearts':0, 'clubs':1, 'diamonds':2, 'spades':3}
    # Tuple key: group by suit first, then order by rank within each suit
    hand.sort(key=lambda card: (suit_order[card.suit], card.rank))
    return hand

In [30]:
n = 12
print(f"Verifying new deck order sort for randomly dealt selection of {n} cards...:")
for count, card in enumerate(new_deck_order(deal(n))):
    print(f"Card {count+1}: {card.value} of {card.suit}")

Verifying new deck order sort for randomly dealt selection of 12 cards...:
Card 1: 7 of hearts
Card 2: king of hearts
Card 3: 4 of clubs
Card 4: 6 of clubs
Card 5: 7 of clubs
Card 6: queen of clubs
Card 7: 7 of diamonds
Card 8: 3 of spades
Card 9: 4 of spades
Card 10: 5 of spades
Card 11: 6 of spades
Card 12: jack of spades


#### Stable sorting
Now sort by value while preserving the given order of suits among cards of equal value. <i>Stable</i> sorting algorithms break ties by keeping elements in their input order.

In [33]:
# Python's sort is stable by default, so we only need a single sort key.
def stable_sort(hand):
    hand.sort(key=lambda card: card.rank)
    return hand
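The cell above has no demonstration, so here is a minimal, self-contained sketch of what stability means in practice. To keep it independent of the `Card` class, each card is represented by a hypothetical `(rank, suit)` tuple.

```python
# Hypothetical hand of (rank, suit) tuples, arriving in the order hearts, clubs, spades.
hand = [(7, "hearts"), (3, "hearts"), (7, "clubs"), (3, "clubs"), (7, "spades")]

# Sorting on rank alone: cards of equal rank keep their input order.
by_rank = sorted(hand, key=lambda card: card[0])
print(by_rank)
# [(3, 'hearts'), (3, 'clubs'), (7, 'hearts'), (7, 'clubs'), (7, 'spades')]

# Stability also allows multi-key sorting in passes:
# sort by the secondary key first, then by the primary key.
by_suit = sorted(hand, key=lambda card: card[1])
two_pass = sorted(by_suit, key=lambda card: card[0])
print(two_pass)
# [(3, 'clubs'), (3, 'hearts'), (7, 'clubs'), (7, 'hearts'), (7, 'spades')]
```

In the first result the threes and sevens each stay in hearts, clubs, spades order, exactly as they were dealt. The two-pass result matches a single sort on the tuple `(rank, suit)`.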

#### Sorting by frequency
Given a string of lowercase letters, list its distinct letters ordered by frequency, high to low, breaking ties by alphabetical order.

In [45]:
def freq_sort(word):
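    # Count the occurrences of each letter
    letter_freq = dict()
    for letter in word:
        if letter not in letter_freq:
            letter_freq[letter] = 0
        letter_freq[letter] += 1
    # Sort alphabetically first; the frequency sort is stable (even with
    # reverse=True), so letters with equal counts stay in alphabetical order.
    res = ''.join(sorted(sorted(letter_freq), key=lambda letter: letter_freq[letter], reverse=True))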
    return res

In [49]:
word = 'hello'
print(f"Applying frequency sort to '{word}': {freq_sort(word)}")
word = 'supercalifragilisticexpialidocious'
print(f"Applying frequency sort to '{word}': {freq_sort(word)}")

Applying frequency sort to 'hello': leho
Applying frequency sort to 'supercalifragilisticexpialidocious': iaclseoprudfgtx
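An alternative sketch, assuming the same goal as `freq_sort`, uses `collections.Counter` for the counting and a single tuple key in place of two sorts. The name `freq_sort_counter` is hypothetical; negating the count gives descending frequency while the letter itself breaks ties in ascending alphabetical order.

```python
from collections import Counter

def freq_sort_counter(word):
    """Return the distinct letters of word, most frequent first, ties alphabetical."""
    counts = Counter(word)
    # -counts[letter] sorts high counts first; letter breaks ties alphabetically
    return ''.join(sorted(counts, key=lambda letter: (-counts[letter], letter)))

print(freq_sort_counter('hello'))  # leho
```

The tuple key avoids relying on stability across two passes, at the cost of only working when the primary key can be negated.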
