# Fancy Indexing

We have now seen two ways to access parts of an array:

1. Slicing
2. Boolean masks

There is another way to access parts of an array, known as *fancy indexing*. Fancy indexing works like simple indexing and slicing, except that we pass an array of indices in place of a single index or slice, which lets us extract several elements at once.

Let's start by loading NumPy and setting a seed for reproducibility:


In [None]:
import numpy as np
rand = np.random.RandomState(1234567890)

## The Basics

Consider a simple random array:

In [None]:
x = rand.randint(100, size=10)
print(x)

If we want to get three elements from the array, we could do this:

In [None]:
[x[9], x[2], x[5]]


Fancy indexing gives us a simpler way. We write a list or array of the indices we want and pass it in place of a single index:

In [None]:
# equivalent to [x[9], x[2], x[5]], but returns a NumPy array
ind = [9, 2, 5]
x[ind]

The shape of the output array follows the shape of the index array, not the shape of the array being indexed:

In [None]:
ind = np.array([[9, 2],
                [5, 5]])
x[ind]
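To make the shape rule concrete, here is a minimal sketch. It uses a stand-in array of known values (an assumption, chosen so the result is easy to check by eye) in place of the random `x` above:

```python
import numpy as np

# Stand-in array with predictable values: element i holds 10 * i.
x = np.arange(10) * 10

ind = np.array([[9, 2],
                [5, 5]])
result = x[ind]

# The result has the shape of the index array, not of x.
print(result.shape == ind.shape)
print(result)
```

Note that an index may appear more than once (here, 5), in which case the corresponding value is simply repeated in the output.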

Naturally, fancy indexing extends to multidimensional arrays:

In [None]:
# a 3x5 array to index along both dimensions
X = np.arange(15).reshape((3, 5))
X

In [None]:
row = np.array([0, 1, 2])
col = np.array([2, 1, 3])
X[row, col]

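The index arrays are paired element by element, so this picks out `X[0, 2]`, `X[1, 1]`, and `X[2, 3]`. To get a two-dimensional result instead, we can combine a column vector of row indices with a row vector of column indices: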

In [None]:
X[row[:, np.newaxis], col]

In [None]:
row[:, np.newaxis]

This last example shows both the power of fancy indexing and the need to think about the output you want: the index arrays are *broadcast* against each other, and the result takes the broadcast shape of the indices rather than the shape of the array being indexed.
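The broadcasting step can be spelled out explicitly. The following sketch, which uses `np.broadcast_arrays` purely for illustration, expands the two index arrays to their common shape and checks that each output element is looked up from the matching pair of indices:

```python
import numpy as np

X = np.arange(15).reshape((3, 5))
row = np.array([0, 1, 2])
col = np.array([2, 1, 3])

# A (3, 1) column of row indices and a (3,) array of column indices
# broadcast to a common (3, 3) shape.
rows_b, cols_b = np.broadcast_arrays(row[:, np.newaxis], col)
print(rows_b.shape, cols_b.shape)

result = X[row[:, np.newaxis], col]

# Build the same result one element at a time:
# result[i, j] == X[rows_b[i, j], cols_b[i, j]]
manual = np.array([[X[rows_b[i, j], cols_b[i, j]] for j in range(3)]
                   for i in range(3)])
print(np.array_equal(result, manual))
```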

## Combined Indexing

We can combine fancy indexing with any of the other indexing schemes we have seen:

In [None]:
print(X)

With simple indices:

In [None]:
X[2, [2, 0, 1]]

With slicing:

In [None]:
X[1:, [2, 0, 1]]

And with Boolean masks:

In [None]:
mask = np.array([1, 0, 1, 0, 0], dtype=bool)  # one entry per column of X
X[row[:, np.newaxis], mask]  # for each row, get the first and third columns
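As a rough summary of the combinations above, the sketch below shows the shape each one produces on a 3x5 array. It also illustrates that a Boolean mask used next to an integer index array behaves like the integer positions of its `True` entries; `np.flatnonzero` is used here only to make that equivalence visible.

```python
import numpy as np

X = np.arange(15).reshape((3, 5))
row = np.array([0, 1, 2])
cols = [2, 0, 1]

a = X[2, cols]    # simple index + fancy index -> shape (3,)
b = X[1:, cols]   # slice + fancy index        -> shape (2, 3)

# The mask needs one entry per column of X (five here).
mask = np.array([True, False, True, False, False])
c = X[row[:, np.newaxis], mask]                   # keeps columns 0 and 2
d = X[row[:, np.newaxis], np.flatnonzero(mask)]   # same selection by position

print(a.shape, b.shape, c.shape)
print(np.array_equal(c, d))
```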

## Example: Selecting Random Points

A common use of fancy indexing is to extract random subsets of rows or columns from a data set. As an example, consider 100 points drawn from a two-dimensional normal distribution:

In [None]:
mean = [0, 0]
cov = [[1, 2],
       [2, 5]]
X = rand.multivariate_normal(mean, cov, 100)
X.shape

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

plt.scatter(X[:, 0], X[:, 1])

We can select 50 random points by first choosing 50 distinct row indices:

In [None]:
indices = rand.choice(X.shape[0], 50, replace=False)  # seeded generator, so the sample is reproducible
indices

In [None]:
selection = X[indices]  # fancy indexing here
selection.shape
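Two properties of this selection are worth checking. The sketch below rebuilds a small version of the example with its own generator (the seed `0` and the names `rng` and `data` are assumptions made for illustration) and confirms that `replace=False` yields distinct rows, and that fancy indexing returns a copy of the data and not a view:

```python
import numpy as np

rng = np.random.RandomState(0)  # hypothetical seed, for illustration only
data = rng.multivariate_normal([0, 0], [[1, 2], [2, 5]], 100)

indices = rng.choice(data.shape[0], 50, replace=False)
selection = data[indices]

# With replace=False, no row index is drawn twice.
print(len(np.unique(indices)) == len(indices))

# Fancy indexing copies the selected rows, so editing the
# selection leaves the original data untouched.
original_row = data[indices[0]].copy()
selection[0] = 0.0
print(np.array_equal(data[indices[0]], original_row))
```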

In [None]:
plt.scatter(X[:, 0], X[:, 1])
plt.scatter(selection[:, 0], selection[:, 1], color='red');

## Modifying Values with Fancy Indexing


We can use fancy indexing to modify parts of an array:

In [None]:
x = np.arange(11)
x

In [None]:
idx = np.arange(0, 11, 2)  # the even indices 0, 2, ..., 10
x[idx] = -x[idx]
print(x)

In [None]:
x[idx] *= 10
print(x)
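One caveat applies when the index array contains repeated entries. An in-place operation such as `x[idx] += 1` computes `x[idx] + 1` once and then assigns the result, so a repeated index is updated only once. If accumulation is what you want, `np.add.at` applies the operation once per occurrence. A minimal sketch:

```python
import numpy as np

idx = [0, 0, 2]  # index 0 appears twice

x = np.zeros(5, dtype=int)
x[idx] += 1          # index 0 is incremented only once
print(x)             # [1 0 1 0 0]

y = np.zeros(5, dtype=int)
np.add.at(y, idx, 1)  # unbuffered: index 0 is incremented twice
print(y)              # [2 0 1 0 0]
```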

## Challenge

1. Load the Alabama unemployment data, randomly sample 50 points, and compute summary statistics. Do they look similar to those of the complete data?
2. Repeat the computation for sample sizes of 20 and 70.
