# Sorting

Sorting means arranging data in a particular order.
A sorting algorithm specifies how the data is arranged in that order.
The most common orders are numerical and lexicographical.

Sorting matters because searching becomes much faster when data is stored in sorted order; binary search, for example, needs only a logarithmic number of comparisons on a sorted list.
Sorting is also used to present data in more readable formats. Below we see five such implementations of sorting in Python.

![](resources/sort.png)

Python's built-in sort uses an algorithm called **Timsort**:
```
Timsort is a hybrid sorting algorithm, derived from merge sort and insertion sort,
designed to perform well on many kinds of real-world data. It was invented by Tim 
Peters in 2002 for use in the Python programming language. The algorithm finds subsets 
of the data that are already ordered, and uses the subsets to sort the data more 
efficiently. This is done by merging an identified subset, called a run, with existing 
runs until certain criteria are fulfilled. Timsort has been Python's standard sorting 
algorithm since version 2.3. It is now also used to sort arrays in Java SE 7, and on 
the Android platform.
```
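
To make the idea of runs concrete, here is a minimal sketch of a natural merge sort. It is not Timsort itself: the real algorithm also reverses descending runs, extends short runs with insertion sort, and uses a more careful merge policy. The function names `find_runs`, `merge`, and `natural_merge_sort` are made up for this illustration.

```python
def find_runs(data):
    """Split data into maximal non-decreasing runs (each run is a list)."""
    runs = []
    start = 0
    for i in range(1, len(data) + 1):
        # A run ends at the end of the data or where the order breaks.
        if i == len(data) or data[i] < data[i - 1]:
            runs.append(data[start:i])
            start = i
    return runs


def merge(left, right):
    """Merge two sorted lists, preferring `left` on ties to stay stable."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if right[j] < left[i]:
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def natural_merge_sort(data):
    """Sort by merging adjacent runs pairwise until a single run remains."""
    runs = find_runs(list(data))
    while len(runs) > 1:
        merged_runs = []
        for k in range(0, len(runs), 2):
            if k + 1 < len(runs):
                merged_runs.append(merge(runs[k], runs[k + 1]))
            else:
                merged_runs.append(runs[k])  # odd run out, carried over
        runs = merged_runs
    return runs[0] if runs else []


sample = [1, 2, 3, 9, 4, 5, 0, 7]
print(find_runs(sample))           # [[1, 2, 3, 9], [4, 5], [0, 7]]
print(natural_merge_sort(sample))  # [0, 1, 2, 3, 4, 5, 7, 9]
```

The more ordered the input already is, the fewer runs there are to merge, which is why this family of algorithms does well on real-world data.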

A simple ascending sort is easy: just call the sorted() function. It returns a new sorted list:

In [1]:
sorted([5, 2, 3, 1, 4])

[1, 2, 3, 4, 5]

You can also use the **list.sort()** method of a list.

It modifies the list in place and returns None, so the result cannot be mistaken for a new list.
It is usually less convenient than sorted(), but if you don't need the original list, it is slightly more efficient because no copy is made.

In [2]:
a = [5, 2, 3, 1, 4]
a.sort()
print(a)

[1, 2, 3, 4, 5]
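
The difference between the two calls is easiest to see side by side. The short sketch below uses illustrative variable names (`numbers`, `ordered`, `result`) and shows that sorted() leaves its input alone while list.sort() changes it and returns None.

```python
numbers = [5, 2, 3, 1, 4]

ordered = sorted(numbers)   # builds a new list
print(numbers)              # [5, 2, 3, 1, 4]  (unchanged)
print(ordered)              # [1, 2, 3, 4, 5]

result = numbers.sort()     # sorts the existing list in place
print(numbers)              # [1, 2, 3, 4, 5]
print(result)               # None
```

A common slip is writing `numbers = numbers.sort()`, which replaces the list with None.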


Another difference is that the list.sort() method is only defined for lists. In contrast, the sorted() function accepts any iterable. For a dictionary, it iterates over the keys and returns them as a sorted list:

In [3]:
sorted({1: 'D', 2: 'B', 3: 'B', 4: 'E', 5: 'A'})

[1, 2, 3, 4, 5]
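
A few more iterables, as a quick sketch of what "any iterable" covers. The variable names are illustrative; in every case the result is a new list, whatever the input type was.

```python
grades = {1: 'D', 2: 'B', 3: 'B', 4: 'E', 5: 'A'}

print(sorted(grades))                        # dict keys: [1, 2, 3, 4, 5]
print(sorted(grades.values()))               # dict values: ['A', 'B', 'B', 'D', 'E']
print(sorted((3, 1, 2)))                     # tuple: [1, 2, 3]
print(sorted({3, 1, 2}))                     # set: [1, 2, 3]
print(sorted("python"))                      # string: ['h', 'n', 'o', 'p', 't', 'y']
print(sorted(n * n for n in [3, -1, 2]))     # generator: [1, 4, 9]
```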

## Key Functions

Starting with Python 2.4, both list.sort() and sorted() accept a key parameter that specifies a function to be called on each element before comparisons are made.

For example, here's a case-insensitive string comparison:

In [4]:
sorted("This is a test string from Andrew".split(), key=str.lower)

['a', 'Andrew', 'from', 'is', 'string', 'test', 'This']

The value of the key parameter should be a function that takes a single argument and returns a key to use for sorting purposes. This technique is fast because the key function is called exactly once for each input record.
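
One way to check the "called exactly once" claim is to wrap the key function so that it records each call. This is only a sketch; `lowered` and `calls` are hypothetical names introduced for the demonstration.

```python
calls = []

def lowered(word):
    """Key function that also records every word it is asked about."""
    calls.append(word)
    return word.lower()

words = "This is a test string from Andrew".split()
result = sorted(words, key=lowered)

print(result)                    # ['a', 'Andrew', 'from', 'is', 'string', 'test', 'This']
print(len(calls) == len(words))  # True: one key call per element
```

The keys are computed up front and reused for every comparison, so an expensive key function costs one call per element rather than one per comparison.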

A common pattern is to sort complex objects using some of the object's indices as a key. For example: 

In [5]:
student_tuples = [
    ('john', 'A', 15),
    ('jane', 'B', 12),
    ('dave', 'B', 10)]

sorted(student_tuples, key=lambda student: student[2])   # sort by age

[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
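
The key function can also return a tuple, which gives a sort on several fields at once, because tuples compare element by element. A small sketch using the same data (the name `by_grade_then_age` is illustrative):

```python
student_tuples = [
    ('john', 'A', 15),
    ('jane', 'B', 12),
    ('dave', 'B', 10),
]

# Sort by grade first; ties on grade are broken by age.
by_grade_then_age = sorted(student_tuples, key=lambda student: (student[1], student[2]))
print(by_grade_then_age)
# [('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]
```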

The same technique works for objects with named attributes.

For example:

In [6]:
class Student:
    def __init__(self, name, grade, age):
        self.name = name
        self.grade = grade
        self.age = age
    def __repr__(self):
        return repr((self.name, self.grade, self.age))
    def weighted_grade(self):
        return 'CBA'.index(self.grade) / float(self.age)

student_objects = [
        Student('john', 'A', 15),
        Student('jane', 'B', 12),
        Student('dave', 'B', 10)]

sorted(student_objects, key=lambda student: student.age)   # sort by age

[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
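
For plain objects the key function is usually not optional. Assuming Python 3 and a class that defines no ordering methods such as `__lt__`, sorting without a key raises a TypeError. The sketch below uses a trimmed-down copy of the Student class so that it runs on its own.

```python
class Student:
    def __init__(self, name, grade, age):
        self.name = name
        self.grade = grade
        self.age = age

    def __repr__(self):
        return repr((self.name, self.grade, self.age))


students = [Student('john', 'A', 15), Student('jane', 'B', 12)]

try:
    sorted(students)          # no key: Python cannot compare two Students
    failed = False
except TypeError as error:
    failed = True
    print("TypeError:", error)

# With a key, only the extracted values (here, ints) are compared.
names_by_age = [s.name for s in sorted(students, key=lambda s: s.age)]
print(names_by_age)           # ['jane', 'john']
```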

## Ascending and Descending

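Both list.sort() and sorted() accept a reverse parameter with a boolean value. This is used to flag descending sorts. For example, the following sorts the student data by age in ascending order (reverse=False, the default) and then in descending order (reverse=True), using itemgetter and attrgetter from the operator module as key functions: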

In [9]:
from operator import itemgetter, attrgetter, methodcaller

l1 = sorted(student_tuples, key=itemgetter(2), reverse=False)
print(l1)

l2 = sorted(student_objects, key=attrgetter('age'), reverse=True)
print(l2)

[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
[('john', 'A', 15), ('jane', 'B', 12), ('dave', 'B', 10)]
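
Note that reverse=True is not the same as sorting and then reversing the result. The reverse parameter keeps the sort stable, so records with equal keys stay in their original relative order. A short sketch with the same data (variable names are illustrative):

```python
from operator import itemgetter

student_tuples = [
    ('john', 'A', 15),
    ('jane', 'B', 12),
    ('dave', 'B', 10),
]

# Descending by grade: jane and dave tie on 'B' and keep their input order.
by_grade_desc = sorted(student_tuples, key=itemgetter(1), reverse=True)
print(by_grade_desc)
# [('jane', 'B', 12), ('dave', 'B', 10), ('john', 'A', 15)]

# Sorting ascending and then reversing flips the tied records as well.
reversed_after = list(reversed(sorted(student_tuples, key=itemgetter(1))))
print(reversed_after)
# [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
```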
