# Sorting Arrays

The previous two notebooks dealt with tools to access and operate on array data with NumPy. This section deals with sorting values in arrays.

There are many algorithms for sorting lists and arrays. Some of them are:
  * Insertion sort
  * Selection sort
  * Merge sort
  * Quick sort
  * Bubble sort
 
For example, a simple selection sort can be implemented in Python in a few lines. The algorithm repeatedly finds the minimum value in the unsorted part of the list and swaps it into place until the list is sorted.

In [1]:
import numpy as np

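def selection_sort(x):
    # Sorts x in place and returns it
    for i in range(len(x)):
        # Index of the smallest value in the unsorted tail x[i:]
        swap = i + np.argmin(x[i:])
        # Move that value to position i
        (x[i], x[swap]) = (x[swap], x[i])
    return x

x = np.array([2, 4, 1, 8, 6, 5])
selection_sort(x)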

array([1, 2, 4, 5, 6, 8])

Any computer science student will tell you that while the simplest of these algorithms are easy to implement, they are by no means the fastest or most efficient. Selection sort, for instance, needs on the order of N² comparisons to sort N values.
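The quadratic cost of selection sort can be checked directly by counting comparisons. The following is a minimal sketch in pure Python; the helper name count_selection_sort_comparisons is hypothetical and exists only for this illustration.

```python
def count_selection_sort_comparisons(values):
    """Sort a copy of `values` with selection sort and count comparisons.

    Hypothetical helper, for illustration only.
    """
    items = list(values)  # work on a copy so the input is untouched
    comparisons = 0
    n = len(items)
    for i in range(n):
        smallest = i
        # Scan the unsorted tail for the smallest remaining value
        for j in range(i + 1, n):
            comparisons += 1
            if items[j] < items[smallest]:
                smallest = j
        items[i], items[smallest] = items[smallest], items[i]
    return items, comparisons


for n in (10, 100, 1000):
    _, count = count_selection_sort_comparisons(range(n, 0, -1))
    print(n, count)  # count equals n * (n - 1) // 2
```

Growing the input by a factor of 10 multiplies the number of comparisons by roughly 100, which is why this approach becomes impractical for large arrays.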

## Fast sorting in NumPy: np.sort and np.argsort

As usual, NumPy's sorting routines are typically much faster on arrays than Python's built-in sort and sorted functions.

To get a sorted copy of an array, use np.sort:

In [2]:
x = np.array([2,1,5,7,9,15,11])
np.sort(x)

array([ 1,  2,  5,  7,  9, 11, 15])
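Note that np.sort leaves its input untouched. If the array should be sorted in place instead, the array's own sort method can be used. The short sketch below contrasts the two; the variable name sorted_copy is only illustrative.

```python
import numpy as np

x = np.array([2, 1, 5, 7, 9, 15, 11])

sorted_copy = np.sort(x)   # returns a new, sorted array
print(x)                   # x still holds the original order
print(sorted_copy)

result = x.sort()          # sorts x in place and returns None
print(result)
print(x)                   # x is now sorted
```

The in-place form avoids allocating a second array, at the price of losing the original order.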

A related function is np.argsort, which returns the **indices** of the sorted elements rather than the sorted values themselves.

In [3]:
x = np.array([2,4,1,7,5])
i = np.argsort(x)
print(i)

[2 0 1 4 3]


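The first element of this result is the index of the smallest element, the second is the index of the second smallest, and so on. To construct the sorted array, use these indices to index the original array.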

In [4]:
x[i]

array([1, 2, 4, 5, 7])
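Because argsort returns indices rather than values, the same index array can also reorder a second array that is aligned with the first. The sketch below uses made-up names and scores arrays, which are not part of the original example, to show the idea.

```python
import numpy as np

# Hypothetical, aligned data: scores[k] belongs to names[k]
names = np.array(["Ana", "Ben", "Cy", "Dee", "Eli"])
scores = np.array([2, 4, 1, 7, 5])

order = np.argsort(scores)   # indices that sort scores in ascending order
print(scores[order])         # scores, lowest first
print(names[order])          # names reordered to match

# Reversing the index array gives descending order
print(names[order[::-1]])
```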

### Sorting along rows or columns

A useful feature of NumPy's sorting functions is the ability to sort along specific rows or columns of a multidimensional array by using the *axis* argument.

In [5]:
rand = np.random.RandomState(14)
X = rand.randint(0, 10, (5,5))
X

array([[8, 6, 7, 9, 6],
       [0, 8, 9, 7, 6],
       [0, 7, 4, 0, 6],
       [4, 5, 5, 8, 5],
       [9, 8, 7, 3, 4]])

In [6]:
# Sort each column 
np.sort(X, axis=0)

array([[0, 5, 4, 0, 4],
       [0, 6, 5, 3, 5],
       [4, 7, 7, 7, 6],
       [8, 8, 7, 8, 6],
       [9, 8, 9, 9, 6]])

In [7]:
# Sort each row
np.sort(X, axis=1)

array([[6, 6, 7, 8, 9],
       [0, 6, 7, 8, 9],
       [0, 0, 4, 6, 7],
       [4, 5, 5, 5, 8],
       [3, 4, 7, 8, 9]])

Keep in mind that this treats each row or column as a separate array, so any relationships between the values in a row or column will be lost.
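When rows must stay intact, for example when each row is one record, one option is to compute a row order with np.argsort on a single column and then index the array with it. The sketch below writes out the values of X explicitly and passes kind="stable" so that rows with equal keys keep their original order. The names row_order and X_by_first_column are illustrative.

```python
import numpy as np

# Same values as the X shown above, written out explicitly
X = np.array([[8, 6, 7, 9, 6],
              [0, 8, 9, 7, 6],
              [0, 7, 4, 0, 6],
              [4, 5, 5, 8, 5],
              [9, 8, 7, 3, 4]])

# Indices that order the rows by their first column;
# kind="stable" keeps tied rows in their original order
row_order = np.argsort(X[:, 0], kind="stable")

# Indexing with row_order moves whole rows together
X_by_first_column = X[row_order]
print(X_by_first_column)
```

Here the first column ends up sorted, while every row still contains the same values it started with.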