In [15]:
%matplotlib inline
import numpy as np

## 1. numpy.argsort
`numpy.argsort` returns the indices that would sort an array. We can use those indices to reorder a second array whose entries correspond, position by position, to the entries of the first.
For example, say we want to order people by age and list their weights in that same order.

In [18]:
ages = np.random.randint(low=20, high=40, size = 10)
weights = np.random.randint(low=150, high=200, size=10)

print(ages)
print(weights)

[27 33 27 22 38 32 29 30 30 24]
[168 172 179 155 162 150 159 160 150 150]


In [33]:
# Now we will sort the ages array first and use that sorter to sort the corresponding weights array
sorter = np.argsort(ages)
print(ages[sorter])
print(weights[sorter])

[22 24 27 27 29 30 30 32 33 38]
[155 150 168 179 159 160 150 150 172 162]


As we can see, the indices that sort the ages are also used to reorder the corresponding weights array. Although the relationship between age and weight in this random data may look a little odd, it is a reasonable example for understanding argsort.
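Because the arrays above are random, the output changes on every run. The sketch below repeats the same idea on small fixed arrays (the values are made up for illustration) and passes `kind="stable"`, so that people with the same age keep their original relative order.

```python
import numpy as np

# Hypothetical fixed data so the result is reproducible.
ages = np.array([27, 33, 27, 22, 38])
weights = np.array([168, 172, 179, 155, 162])

# kind="stable" keeps the two people aged 27 in their original order.
sorter = np.argsort(ages, kind="stable")

sorted_ages = ages[sorter]
sorted_weights = weights[sorter]

print(sorter)          # [3 0 2 1 4]
print(sorted_ages)     # [22 27 27 33 38]
print(sorted_weights)  # [155 168 179 172 162]
```

The same `sorter` can be reused on any other array that is aligned with `ages`.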

### Permutation In Numpy
Permutations in numpy are simply arrays.

NumPy doc: Randomly permute a sequence, or return a permuted range.
If x is a multi-dimensional array, it is only shuffled along its first index (rows).
-- From the official NumPy documentation

In [45]:
## For example
ex_1 = np.arange(36).reshape((6, 6))
ex_1

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])

In [46]:
# Applying permutation shuffles along its first index

np.random.permutation(ex_1)

array([[30, 31, 32, 33, 34, 35],
       [24, 25, 26, 27, 28, 29],
       [18, 19, 20, 21, 22, 23],
       [ 6,  7,  8,  9, 10, 11],
       [ 0,  1,  2,  3,  4,  5],
       [12, 13, 14, 15, 16, 17]])
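As a small check of the behaviour described above, the sketch below uses a seeded `np.random.default_rng` generator (the seed and the array shape are arbitrary choices for illustration) and confirms that only the row order changes: each row stays intact and the input array is left untouched.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily for repeatability
grid = np.arange(12).reshape((3, 4))
grid_before = grid.copy()

# permutation() returns a shuffled copy; rows move as whole units.
shuffled = rng.permutation(grid)

# Every original row is still present, just possibly in a different position.
same_rows = sorted(map(tuple, shuffled)) == sorted(map(tuple, grid))
# The input array itself was not modified.
input_unchanged = np.array_equal(grid, grid_before)

print(shuffled)
print(same_rows, input_unchanged)
```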

In [47]:
# For a single dimensional array,
permutation = np.random.permutation(10)

In [42]:
permutation

array([0, 4, 6, 8, 1, 5, 3, 7, 9, 2])

### Computing inverse of permutation
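A permutation stored as an index array can be applied quickly and efficiently, and it can be reused on several arrays.

To sort one array by another, programmers frequently use sorted(zip(array_1, array_2)), which is much slower on NumPy arrays. Indexing with a permutation can be roughly 10-20 times faster than the zip() approach, though the exact speedup depends on the array size and dtype.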
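To make the comparison concrete, here is a minimal sketch of both approaches on made-up data with unique keys. It only checks that the two give the same ordering and does not measure speed.

```python
import numpy as np

# Hypothetical data; keys are unique so both approaches agree exactly.
keys = np.array([30, 10, 20, 40])
values = np.array(["c", "a", "b", "d"])

# Pure-Python approach: pair the items, sort the pairs, then unzip.
pairs = sorted(zip(keys.tolist(), values.tolist()))
values_via_zip = [v for _, v in pairs]

# NumPy approach: compute the sorting indices once, then index with them.
order = np.argsort(keys)
values_via_argsort = values[order]

print(values_via_zip)       # ['a', 'b', 'c', 'd']
print(values_via_argsort)   # ['a' 'b' 'c' 'd']
```

Note that with duplicate keys the two can differ: `sorted(zip(...))` breaks ties by comparing the second element, while `argsort` only looks at the keys.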

In [51]:
permutation = np.random.permutation(10)
original = np.array(list('abcdefghij'))
permuted = original[permutation]
print(permutation)
print(original)
print(permuted)

[3 0 1 2 6 8 7 9 5 4]
['a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j']
['d' 'a' 'b' 'c' 'g' 'i' 'h' 'j' 'f' 'e']


__Inverse permutation__ can be computed by using numpy.argsort (again!)

In [67]:
inverse_permutation = np.argsort(permutation)

print(permuted[inverse_permutation])

['a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j']
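As an alternative sketch, the inverse can also be built without sorting: place position `i` at index `permutation[i]`. The permutation below is copied from the output above; the variable names are illustrative.

```python
import numpy as np

# Permutation taken from the printed output above.
permutation = np.array([3, 0, 1, 2, 6, 8, 7, 9, 5, 4])

# Inverse via argsort, as in the cell above.
inverse_by_argsort = np.argsort(permutation)

# Inverse via scatter assignment: inverse[permutation[i]] = i.
inverse_by_scatter = np.empty_like(permutation)
inverse_by_scatter[permutation] = np.arange(permutation.size)

original = np.array(list("abcdefghij"))
permuted = original[permutation]
restored = permuted[inverse_by_scatter]

print(np.array_equal(inverse_by_argsort, inverse_by_scatter))  # True
print(restored)
```

The scatter version does a single linear pass instead of a sort, which may matter for very large permutations.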


### Computing order of elements in array

Very often, we want to compute the rank of each value in an array, that is, where it stands relative to the other elements by magnitude.

In other words, for each element in the array we want to find the number of elements smaller than it.

In [69]:
ex_3 = np.random.random(10)

print(ex_3)
print(np.argsort(np.argsort(ex_3)))

[ 0.40203798  0.59398841  0.43402664  0.33076928  0.39004246  0.70140826
  0.44770018  0.16685312  0.51172183  0.24314027]
[4 8 5 2 3 9 6 0 7 1]


In the result above, 0.16685312 is the smallest value, so no elements are smaller than it and the double argsort assigns it 0. The next smallest value, 0.24314027, has only 0.16685312 below it, so it gets 1, and so on.
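The sketch below spells out why applying argsort twice yields ranks, using a short made-up array of distinct values, and checks the result against a direct count of smaller elements.

```python
import numpy as np

# Hypothetical distinct values.
values = np.array([0.40, 0.59, 0.43, 0.33, 0.39])

# order[k] is the index of the k-th smallest value.
order = np.argsort(values)
# ranks[i] is the position of values[i] in sorted order (the inverse of order).
ranks = np.argsort(order)

# Direct (quadratic) definition: how many elements are smaller than each value.
smaller_counts = np.array([(values < v).sum() for v in values])

print(order)           # [3 4 0 2 1]
print(ranks)           # [2 4 3 0 1]
print(smaller_counts)  # [2 4 3 0 1]
```

The second argsort is simply the inverse permutation of the first, which is why it maps each element back to its sorted position.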

SciPy also provides a function, `rankdata`, that does the same job. It is more general (for example, it handles ties) and can be faster, so prefer using it:

In [72]:
from scipy.stats import rankdata
rankdata(ex_3)

array([  5.,   9.,   6.,   3.,   4.,  10.,   7.,   1.,   8.,   2.])

As you can see, the ranking starts from 1 and not 0, and the ranks are returned as floats. Hence, we subtract 1 from the result to reproduce the same ordering.

In [73]:
rankdata(ex_3) - 1

array([ 4.,  8.,  5.,  2.,  3.,  9.,  6.,  0.,  7.,  1.])
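The two approaches only agree when all values are distinct. The sketch below, on a made-up array with one tie, shows how `rankdata` averages tied ranks by default and how `method="ordinal"` gives each tied value its own rank in order of appearance, which corresponds to a stable double argsort.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical data with one tie (the two 20s).
tied = np.array([10, 20, 20, 30])

# Default method="average": tied values share the mean of their ranks.
average_ranks = rankdata(tied)
# method="ordinal": ties get distinct ranks in order of appearance.
ordinal_ranks = rankdata(tied, method="ordinal")
# Stable double argsort: zero-based counterpart of the ordinal ranks.
double_argsort = np.argsort(np.argsort(tied, kind="stable"))

print(average_ranks)   # ranks 1, 2.5, 2.5, 4
print(ordinal_ranks)   # ranks 1, 2, 3, 4
print(double_argsort)  # [0 1 2 3]
```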

In [None]:
import numpy as np
# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
#  [6 7]]
b = a[:2, 1:3]
# A slice of an array is a view into the same data, so modifying it
# will modify the original array.
print(a[0, 1]) # Prints "2"
b[0, 0] = 77 # b[0, 0] is the same piece of data as a[0, 1]
print(a[0, 1]) # Prints "77"
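If the original array must stay unchanged, one option is to take an explicit copy of the slice. The sketch below contrasts a view with a copy and uses `np.shares_memory` to confirm which one is tied to `a`; the names `view` and `independent` are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = a[:2, 1:3]                # basic slicing: shares memory with a
independent = a[:2, 1:3].copy()  # explicit copy: owns its own data

# Writing to the copy leaves a untouched.
independent[0, 0] = 77
value_after_copy_write = a[0, 1]

# Writing to the view changes a as well.
view[0, 0] = 99
value_after_view_write = a[0, 1]

print(np.shares_memory(a, view))         # True
print(np.shares_memory(a, independent))  # False
print(value_after_copy_write)            # 2
print(value_after_view_write)            # 99
```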