# Fancy Indexing


Fancy indexing is the name for using an array or list in place of an index:

In [1]:
import numpy as np

In [2]:
A = np.array([[n+m*10 for n in range(5)] for m in range(5)])
A

array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])

As we've seen, NumPy extends the regular Python slicing syntax to make it easy to pull out rectangular blocks from arrays:

In [3]:
A[2:4, 1:3]

array([[21, 22],
       [31, 32]])

But what if you want a subset of the array that is not a contiguous rectangular block? Enter Fancy Indexing.

There are two types of fancy indexing: boolean arrays, and index arrays.

### Boolean arrays

Boolean arrays let you pull out a subset of an array with a boolean "mask":

In [8]:
a = np.random.randint(0, 21, 7)  # 7 random integers from 0 to 20 inclusive
a

array([16, 19, 16,  6,  9,  8,  0])

In [9]:
# create a boolean mask
mask = np.array([True, False, True, False, False, False, True])

# index with that mask -- you get the elements where the mask is true
a[mask]

array([16, 16,  0])

Why is this useful? Because we can create the mask with logical operations:

In [10]:
mask = a > 8
mask

array([ True,  True,  True, False,  True, False, False], dtype=bool)
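
Masks built from comparisons can also be combined. The following is a minimal sketch that reuses the values of `a` shown above; the names `in_range`, `extremes`, and `not_sixteen` are illustrative only.

```python
import numpy as np

a = np.array([16, 19, 16, 6, 9, 8, 0])

# Combine element-wise conditions with & (and), | (or), and ~ (not).
# The parentheses are required: & and | bind tighter than comparisons.
in_range = (a > 5) & (a < 17)
print(in_range)        # boolean mask with the same length as a
print(a[in_range])     # elements strictly between 5 and 17

extremes = a[(a < 5) | (a > 17)]   # elements below 5 or above 17
not_sixteen = a[~(a == 16)]        # everything except the 16s
print(extremes)
print(not_sixteen)
```

Python's `and`, `or`, and `not` keywords do not work element-wise on arrays, which is why the bitwise operators are used here.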

And you can assign to the sub-array, too -- very handy:

Set all the values greater than 8 to 8:

In [11]:
a[a > 8] = 8
a

array([8, 8, 8, 6, 8, 8, 0])
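
One detail worth keeping in mind: reading with a mask produces a copy, while assigning through a mask changes the original array. A small sketch using the same starting values for `a` (the names `big` and `unchanged` are illustrative):

```python
import numpy as np

a = np.array([16, 19, 16, 6, 9, 8, 0])

# Reading with a mask returns a new array (a copy), not a view.
big = a[a > 8]
big[:] = -1               # modifies only the copy
unchanged = a.copy()      # snapshot to show that a was not touched
print(unchanged)

# A mask on the left-hand side of = assigns into the original array.
a[a > 8] = 8
print(a)
```

This differs from slicing such as `a[1:3]`, which returns a view onto the original data.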

### Index arrays

But what if you know the indexes of the elements you want?

You can pass those in:

In [12]:
a[[1, 3, 2, 3]]  # note that you can repeat the same index...

array([8, 6, 8, 6])

In [15]:
# and it can work on higher rank arrays, also:
row_indices = [0, 2, 3]
A[row_indices]

array([[ 0,  1,  2,  3,  4],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34]])

In [16]:
col_indices = [1, 2, -1] # remember, index -1 means the last element
A[row_indices, col_indices]

array([ 1, 22, 34])
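
Note that the two index lists are paired element by element, which is why the result is one-dimensional. If the goal is instead the block formed by every selected row combined with every selected column, `np.ix_` is one way to get it. A sketch, assuming the 5x5 array `A` defined at the top of this section (the names `paired` and `block` are illustrative):

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])
row_indices = [0, 2, 3]
col_indices = [1, 2, -1]

# Two index lists are paired up: elements (0, 1), (2, 2), and (3, -1).
paired = A[row_indices, col_indices]
print(paired)

# np.ix_ builds an open mesh, so each row is combined with each column.
block = A[np.ix_(row_indices, col_indices)]
print(block)
```

Here `paired` has shape `(3,)` and `block` has shape `(3, 3)`.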

## Functions and methods for extracting data from arrays

### where
A boolean mask can be converted to an array of position indices using the `where` function:

In [17]:
indices = np.where(mask)
indices

(array([0, 1, 2, 4]),)

In [21]:
x = np.random.rand(5)
x[indices] # equivalent to indexing x with a boolean mask that is True at those positions

array([ 0.85854926,  0.56887552,  0.91253532,  0.61428927])
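
The equivalence between a mask and its positions can be checked directly. A minimal sketch using the values of `a` from the start of the boolean-array example (the names `positions` and `same` are illustrative):

```python
import numpy as np

a = np.array([16, 19, 16, 6, 9, 8, 0])
mask = a > 8

# With a single argument, np.where returns a tuple holding one index
# array per dimension of the mask (the same result as np.nonzero).
indices = np.where(mask)
positions = indices[0]
print(positions)

# Indexing with the positions selects the same elements as the mask.
same = np.array_equal(a[indices], a[mask])
print(same)
```

For a 2-D mask the tuple would contain two arrays, one of row indices and one of column indices.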

But `where` does more:

`where(condition, [x, y])`

Return elements, either from `x` or `y`, depending on `condition`.

In [22]:
x

array([ 0.85854926,  0.56887552,  0.91253532,  0.20302485,  0.61428927])

In [26]:
np.where(x < 0.7, 0, [1,2,3,4,5])

array([1, 0, 3, 0, 0])
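
A common use of the three-argument form is to replace some values while keeping the rest. The sketch below hard-codes the values of `x` printed above so that it is reproducible; the name `clipped` is illustrative.

```python
import numpy as np

x = np.array([0.85854926, 0.56887552, 0.91253532, 0.20302485, 0.61428927])

# Where the condition is True take 0.0, otherwise keep the value from x.
# The scalar 0.0 is broadcast to the shape of x.
clipped = np.where(x < 0.7, 0.0, x)
print(clipped)

# np.where returns a new array; x itself is left unchanged.
print(x)
```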

In [40]:
B = np.arange(20).reshape((4,5))  # a new name, so the 5x5 A above stays available
print(B)
mask1 = (B > 10).astype(np.uint8)
mask2 = (B < 12).astype(np.uint8)
print(mask1)
print(mask2)
print(mask1 + mask2)

print(np.where(B > 10, 1, 0))

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
[[0 0 0 0 0]
 [0 0 0 0 0]
 [0 1 1 1 1]
 [1 1 1 1 1]]
[[1 1 1 1 1]
 [1 1 1 1 1]
 [1 1 0 0 0]
 [0 0 0 0 0]]
[[1 1 1 1 1]
 [1 1 1 1 1]
 [1 2 1 1 1]
 [1 1 1 1 1]]
[[0 0 0 0 0]
 [0 0 0 0 0]
 [0 1 1 1 1]
 [1 1 1 1 1]]


### diag
With the diag function we can also extract the diagonal and subdiagonals of an array:

In [None]:
A

In [None]:
np.diag(A)

In [None]:
# a nonzero second argument selects an off-diagonal:
# negative values are below the main diagonal, positive values above it
np.diag(A, -1)

In [None]:
np.diag(A, 1)
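
Since the cells above are shown without output, here is a sketch of what they produce, assuming `A` is the 5x5 array defined at the top of this section (the names `main`, `below`, and `above` are illustrative):

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])

main = np.diag(A)        # main diagonal: A[0, 0], A[1, 1], ...
below = np.diag(A, -1)   # first diagonal below the main one
above = np.diag(A, 1)    # first diagonal above the main one
print(main)
print(below)
print(above)

# Given a 1-D input, np.diag goes the other way and builds a 2-D array
# with those values on the diagonal.
print(np.diag([1, 2, 3]))
```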

### take
The `take` function is similar to the fancy indexing described above.

`np.take(a, indices, axis=None, out=None, mode='raise')`

Take elements from an array along an axis.

This function does the same thing as "fancy" indexing (indexing arrays
using arrays); however, it can be easier to use if you need elements
along a given axis.

In [None]:
A

Let's say we want rows 1, 2, and 4.
That is easy with "fancy indexing":

In [None]:
A[ [1,2,4] ]

But what if we want columns 1, 2, and 4?

`take` makes that easy:

In [None]:
A.take([1,2,4], axis=1)

In [None]:
A.take([1,2,4], axis=0) # axis=0 selects rows; the default, axis=None, indexes the flattened array

In [None]:
A[ [1,2,4] ]

`take` is also available as a function (`np.take`), not only as a method, so it works on lists and other objects:

In [None]:
np.take([-3, -2, -1,  0,  1,  2], row_indices)
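
The effect of the `axis` argument is easiest to see side by side. A sketch, assuming `A` is the 5x5 array defined at the top of this section (the names `rows`, `cols`, `flat`, and `from_list` are illustrative):

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])

rows = A.take([1, 2, 4], axis=0)   # same result as A[[1, 2, 4]]
cols = A.take([1, 2, 4], axis=1)   # same result as A[:, [1, 2, 4]]
flat = A.take([1, 2, 4])           # axis=None: indexes the flattened array
print(rows)
print(cols)
print(flat)

# The function form also accepts plain Python lists.
from_list = np.take([-3, -2, -1, 0, 1, 2], [0, 2, 3])
print(from_list)
```

Leaving out `axis` is an easy mistake to make, because the result is still a valid array, just not the rows you may have expected.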

### choose
Constructs an array by picking elements from several arrays:

In [None]:
which = [1, 0, 2, 1]
# we are choosing from three different arrays:
#  0, 1, 2
# so you need to pass in three arrays
choices = [[-2, -2, -2, -2],
           [ 5,  5,  5,  5],
           [ 7,  7,  7,  7]]

np.choose(which, choices)
# what shape will the result be?
# hint: all arrays must be the same size

This one is pretty tricky -- but handy when you need it.
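
To make the selection rule easier to follow, here is a sketch with choice arrays whose values differ in every position, so each picked element can be traced to its source. These values and the names `result` and `manual` are illustrative and not part of the exercise above.

```python
import numpy as np

which = [1, 0, 2, 1]
choices = [[ 0,  1,  2,  3],
           [10, 11, 12, 13],
           [20, 21, 22, 23]]

# Position i of the result is taken from choices[which[i]][i].
result = np.choose(which, choices)
print(result)

# The same selection written with paired index arrays:
# row indices come from which, column indices are 0, 1, 2, 3.
manual = np.array(choices)[which, np.arange(len(which))]
print(manual)
```

The second form shows that, for this simple case, `choose` is a shorthand for the paired index arrays described earlier.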

In [44]:
np.random.rand(3,4)


array([[ 0.40375903,  0.84254145,  0.70356773,  0.29033187],
       [ 0.08898717,  0.11998772,  0.26383434,  0.40832226],
       [ 0.52167695,  0.7661762 ,  0.77218552,  0.51308202]])

In [45]:
np.__version__

'1.9.2'
