## 6.3 – More Sorting
### Comparing Objects
Remember in the last section we mentioned the fact that our implementation of selection sort was *unstable* – for objects with equal value they are not guaranteed to remain in the same order. 

In this section we'll introduce *insertion sort*, which has a stable implementation. But first, the issue of stability highlights the fact that there can be more to an object than just a simple value. This makes sense if we think about it within the object-oriented programming framework: objects have many attributes and methods, so for any two given objects it might not be obvious how to *order* them. In fact, it might not even be obvious whether two objects are *equal*!

Python has a nice collection of features that let you define both of these things for your own classes. By default, if you try to use `<` with two instances of a class you have written, you get an error:

In [1]:
class PlayingCard:
    def __init__(self, number, suit):
        self.number = number
        self.suit = suit
        
nine_of_hearts = PlayingCard(9, "♥")
nine_of_spades = PlayingCard(9, "♠")

nine_of_hearts < nine_of_spades

TypeError: '<' not supported between instances of 'PlayingCard' and 'PlayingCard'

You *can* use `==` on these objects, but by default it checks whether the two objects are *literally* the same object. Remember objects are mutable, so if we create two instances which happen to have the same contents, they are still two separate instances in memory. They must be, because we might change one and the other should remain the same. But the code below looks a bit odd:

In [2]:
nine_of_hearts1 = PlayingCard(9, "♥")
nine_of_hearts2 = PlayingCard(9, "♥")

nine_of_hearts1 == nine_of_hearts2

False

Thankfully Python has an easy way to support custom behaviour for these operators. Perhaps we are playing a card game where suit does not matter, so the cards should just be compared on value.

First of all, we can override the `__eq__` method to define our own notion of equality. This will change the behaviour of `==`, but also affect code that uses keywords like `in`, e.g. `if card in hand_of_cards`. 

It's worth noting that if you do choose to override `__eq__` you *should* also override `__hash__`. We'll come back to *hash functions* in next week's material to learn why, but the simple answer is that a class which overrides `__eq__` without `__hash__` becomes unhashable, so its objects can no longer go in sets or be used as dictionary keys. For now you can [read more here](https://docs.python.org/3/reference/datamodel.html#object.__hash__), but I am going to be lazy and only override `__eq__` until we learn more about what hash functions do next week.

In addition, we can implement a method called `__lt__` (less than) to enable the `<` operator:

In [3]:
class PlayingCard:
    def __init__(self, number, suit):
        self.number = number
        self.suit = suit
        
    def __eq__(self, other):
        if type(other) is type(self):
            return self.number == other.number
        else:
            return NotImplemented
        
    def __lt__(self, other):
        if type(other) is type(self):
            return self.number < other.number
        else:
            return NotImplemented
        
    def __str__(self):
        return f"{self.number}{self.suit}"
    
    def __repr__(self):
        return f"{self.number}{self.suit}"
        
        
nine_of_hearts = PlayingCard(9, "♥")
nine_of_spades = PlayingCard(9, "♠")

print(nine_of_hearts < nine_of_spades)
print(nine_of_spades < nine_of_hearts)

False
False


In [4]:
nine_of_hearts1 = PlayingCard(9, "♥")
nine_of_hearts2 = PlayingCard(9, "♥")

nine_of_hearts1 == nine_of_hearts2

True

Notice if we want the old behaviour which compares whether the two objects are identical, we can do that with the `is` operator:

In [5]:
nine_of_hearts1 is nine_of_hearts2

False
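
Going back to the earlier notes about `in` and `__hash__`, here is a small sketch of both effects. The `Card` class is a hypothetical cut-down stand-in for `PlayingCard` (same `__eq__`, nothing else), so the snippet runs on its own.

```python
class Card:
    """Hypothetical stand-in for PlayingCard: defines __eq__ but not __hash__."""

    def __init__(self, number, suit):
        self.number = number
        self.suit = suit

    def __eq__(self, other):
        if type(other) is type(self):
            return self.number == other.number
        return NotImplemented


hand = [Card(4, "♣"), Card(9, "♠")]

# `in` on a list uses ==, so a 9 of any suit counts as being "in" the hand.
found_in_hand = Card(9, "♥") in hand
print(found_in_hand)

# Defining __eq__ without __hash__ sets __hash__ to None, so instances
# cannot be hashed (and so cannot go in a set or be a dictionary key).
try:
    hash(Card(9, "♥"))
    hashable = True
except TypeError:
    hashable = False
print(hashable)
```

The first line printed is `True` and the second is `False`, which is why a real class that overrides `__eq__` usually wants a matching `__hash__` too.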

It's also worth explaining `__str__` and `__repr__`. These are called when Python wants a string representation of the object. `__str__` gives a human-readable version, and is called if you print the object. `__repr__` is supposed to give an unambiguous representation. It will be called by debuggers, for example, but it also gets called when we put multiple items in a list and print the list, so it's helpful for us to define both (here they return the same thing).

In [6]:
print(PlayingCard(9, "♥"))
print([PlayingCard(9, "♠"), PlayingCard(9, "♥")])

9♥
[9♠, 9♥]


And as a final note, we obviously haven't implemented any of the other operators, and there are corresponding methods like `__gt__` for `>` or `__le__` for `<=`. It is quite a bore to go through and implement them all, so you can use a decorator from the `functools` module to help [if you want](https://docs.python.org/3/library/functools.html#functools.total_ordering). 
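
As a rough sketch of what that decorator does, here it is on a different class. `Height` is a hypothetical example class made up for this illustration, and only `__eq__` and `__lt__` are written by hand.

```python
from functools import total_ordering


@total_ordering
class Height:
    """Hypothetical example class, compared on a single numeric attribute."""

    def __init__(self, cm):
        self.cm = cm

    def __eq__(self, other):
        if type(other) is type(self):
            return self.cm == other.cm
        return NotImplemented

    def __lt__(self, other):
        if type(other) is type(self):
            return self.cm < other.cm
        return NotImplemented


short, tall = Height(150), Height(180)

# total_ordering fills in <=, > and >= from the two methods defined above.
print(short <= tall)
print(short > tall)
print(tall >= Height(180))
```

This prints `True`, `False`, `True`. The generated methods are a little slower than hand-written ones, which rarely matters for code like ours.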

### Insertion Sort
Now we can finally get around to the stable sort implementation. Insertion sort is actually similar to selection sort in the sense that it builds up the sorted list one element at a time. And like selection sort, while we could implement it by creating a new list, it is more efficient to have it work in-place.

Here is the basic idea: at the start of iteration $k$ we assume that the sub-list of just the first $k$ elements is already sorted. Obviously this is okay because on iteration number 1 we only need to consider the first element as its own list, and any list of length 1 must be considered sorted.

Then you take the next element from the list (the item at position `k` in the list, assuming zero-indexing, so the item directly after your sorted sub-list) and you store this item in a temporary variable. Now you go backwards through the array, checking the item at position `i = k-1`, `k-2`, `k-3`, and so on. Each time, you compare the item to the temporary variable: if it is less than or equal, you stop and insert the temporary variable value into position `i+1`. If the item is bigger, you move it one position to the right in the array, and continue moving down. If you run off the front of the list without stopping, the temporary variable is the smallest item so far and goes into position 0.

We will need to test for either “less than or equal” or “greater than” in our code, but we did not implement these for the `PlayingCard` class. But we did implement `__lt__` and `__eq__`, so we can combine both in the code below to avoid having to redefine the class with new methods.

Have a look at the code below to ensure you understand!

In [7]:
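def insertion_sort(my_list):
    # Items before index k are already sorted; insert my_list[k] among them
    for k in range(1, len(my_list)):
        temp = my_list[k]
        
        # Walk backwards from k-1; i == -1 means we ran off the front
        for i in reversed(range(-1, k)):
            if i == -1 or (my_list[i] < temp or my_list[i] == temp):
                my_list[i+1] = temp
                break
            else:
                # my_list[i] is bigger than temp, so shift it one place right
                my_list[i+1] = my_list[i]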
    

my_list = [37, 42, 9, 19, 35, 4, 53, 22]
insertion_sort(my_list)
print(my_list)

[4, 9, 19, 22, 35, 37, 42, 53]
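
If you want to watch the sorted sub-list grow, here is a sketch that records the list after every pass. Note that `insertion_sort_passes` is a hypothetical helper written for this illustration, and it uses a `while` loop rather than the `for`/`break` structure above, so treat it as one possible variant and not the same code.

```python
def insertion_sort_passes(items):
    """Insertion sort that records a snapshot of the list after each pass.

    Hypothetical helper for illustration: sorts `items` in place and
    returns the list of snapshots.
    """
    snapshots = []
    for k in range(1, len(items)):
        temp = items[k]
        i = k - 1
        # Shift bigger items one place right until temp's slot is found.
        while i >= 0 and temp < items[i]:
            items[i + 1] = items[i]
            i -= 1
        items[i + 1] = temp
        snapshots.append(list(items))
    return snapshots


for k, state in enumerate(insertion_sort_passes([37, 42, 9, 19]), start=1):
    print(f"after pass {k}: {state}")
```

After pass `k` the first `k+1` items are in sorted order, even though the rest of the list has not been looked at yet.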


In [8]:
my_cards = [PlayingCard(9, "♠"), PlayingCard(9, "♥"), PlayingCard(4, "♣")]
insertion_sort(my_cards)
print(my_cards)

[4♣, 9♠, 9♥]


Notice the list has been sorted successfully, and the order is stable: the two 9s are in the same relative order they were in at the start (9♠ is still before 9♥).
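
The stability comes from stopping on "less than *or equal*". Here is a sketch that shows what happens if you stop only on strict "less than". To keep it self-contained it sorts plain `(number, suit)` tuples, and both the function name and the `stop_on_equal` flag are made up for this illustration.

```python
def insertion_sort_by_number(cards, stop_on_equal=True):
    """Sort (number, suit) tuples in place, comparing on number only.

    Hypothetical variant for illustration: `stop_on_equal` switches the
    stopping test between "less than or equal" and strict "less than".
    """
    for k in range(1, len(cards)):
        temp = cards[k]
        i = k - 1
        while i >= 0:
            if stop_on_equal:
                stop = cards[i][0] <= temp[0]
            else:
                stop = cards[i][0] < temp[0]
            if stop:
                break
            cards[i + 1] = cards[i]  # shift one place right
            i -= 1
        cards[i + 1] = temp


stable = [(9, "♠"), (9, "♥"), (4, "♣")]
insertion_sort_by_number(stable)
print(stable)

unstable = [(9, "♠"), (9, "♥"), (4, "♣")]
insertion_sort_by_number(unstable, stop_on_equal=False)
print(unstable)
```

The first list keeps 9♠ before 9♥. In the second, each 9 is shifted past its equal, so the two 9s come out swapped even though the list is still sorted by number.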

***Exercise:*** go and modify the code for the PlayingCard class, using the `total_ordering` [decorator](https://docs.python.org/3/library/functools.html#functools.total_ordering) from `functools` to enable the use of `<=`, then modify the `insertion_sort` function to use that instead of the combination of `<` and `==`.

### Insertion Sort Complexity
***Have a think about the time complexity of insertion sort before continuing.***

In the best case, the complexity is actually $O(n)$, which is when the list is already sorted. In this case the inner for loop will never perform more than one operation, so does not scale with the length of the list at all, so we are just left with the outer for loop for $O(n)$.

In the average and worst case, the complexity is $O(n^2)$ again. The worst case is easiest to demonstrate, and occurs when the list is in exactly the reverse order. In that case, both for loops iterate the maximum number of times. On pass $k$ the inner for loop has to move past all $k$ items of the sorted sub-list, so the total work is $1 + 2 + \dots + (n-1) = \frac{n(n-1)}{2}$, which is $O(n^2)$. The average case is better, but that inner for loop still scales with the length of the list, and we get the same quadratic complexity class.
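
One way to check the best and worst cases is to count comparisons directly. This is a sketch using a hypothetical instrumented function, `count_comparisons`, which sorts a copy of its input with a `while`-loop version of insertion sort.

```python
def count_comparisons(items):
    """Insertion-sort a copy of `items`; return how many comparisons it made.

    Hypothetical instrumented version, written for illustration.
    """
    items = list(items)
    comparisons = 0
    for k in range(1, len(items)):
        temp = items[k]
        i = k - 1
        while i >= 0:
            comparisons += 1
            if items[i] <= temp:
                break
            items[i + 1] = items[i]  # shift one place right
            i -= 1
        items[i + 1] = temp
    return comparisons


for n in (10, 100, 1000):
    best = count_comparisons(range(n))          # already sorted
    worst = count_comparisons(range(n, 0, -1))  # exactly reversed
    print(n, best, worst)
```

For a sorted list the count is $n - 1$ (9, 99, 999), and for a reversed list it is $n(n-1)/2$ (45, 4950, 499500), so multiplying $n$ by 10 multiplies the worst case by roughly 100.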

Insertion sort outperforms selection sort in the number of comparisons, and the closer the list is to already being sorted, the better it performs.

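It might actually be that the *swapping* of elements is the time-sensitive bottleneck in our system. In that case, selection sort actually performs better: it makes at most one swap per pass, so at most $n - 1$ swaps in total. Insertion sort moves its items around more, because every item it passes over has to be shifted one position.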
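
To get a feel for the difference, here is a sketch that counts list writes for both algorithms on a reversed list. Both functions are hypothetical instrumented versions written for this illustration. In particular, the selection sort here is a standard swap-based one, and is only assumed to behave like the one from the last section.

```python
def selection_sort_writes(items):
    """Selection-sort a copy of `items`; return the number of list writes."""
    items = list(items)
    writes = 0
    for start in range(len(items) - 1):
        smallest = start
        for i in range(start + 1, len(items)):
            if items[i] < items[smallest]:
                smallest = i
        if smallest != start:
            items[start], items[smallest] = items[smallest], items[start]
            writes += 2  # one swap writes two slots
    return writes


def insertion_sort_writes(items):
    """Insertion-sort a copy of `items`; return the number of list writes."""
    items = list(items)
    writes = 0
    for k in range(1, len(items)):
        temp = items[k]
        i = k - 1
        while i >= 0 and temp < items[i]:
            items[i + 1] = items[i]  # shift one place right
            writes += 1
            i -= 1
        items[i + 1] = temp
        writes += 1
    return writes


reversed_input = list(range(100, 0, -1))
print(selection_sort_writes(reversed_input))
print(insertion_sort_writes(reversed_input))
```

On this input selection sort needs far fewer writes than insertion sort, even though both are $O(n^2)$ overall once comparisons are included.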

## What Next?
Once you are done, go back to Engage to move onto the next section.